Frontiers in Massive Data Analysis

Posted By: exLib
Committee on Applied and Theoretical Statistics; Board on Mathematical Sciences and Their Applications; Division on Engineering and Physical Sciences
National Academies Press | 2013 | ISBN: 0309287782, 9780309287784 | 191 pages | PDF | 15 MB

This book presents the work of the Committee on the Analysis of Massive Data to assess the current state of data analysis for mining massive data sets, to identify gaps in current practice, and to develop methods to fill these gaps. The report includes the committee's recommendations, details on the types of data that make up massive data, and information on the seven computational giants of massive data analysis.


From Facebook to Google searches to bookmarking a webpage in our browsers, today's society generates and relies on an enormous amount of data. Some internet-based companies, such as Yahoo!, even store exabytes (10^18 bytes) of data.
Like these companies and the rest of the world, scientific communities are also generating large amounts of data (mostly terabytes, and in some cases nearly petabytes) from experiments, observations, and numerical simulation.
In fact, the scientific community, along with the defense enterprise, has been a leader in generating and using large data sets for many years.
The issue that arises with this new type of large data is how to handle it: sharing the data, enabling data security, working with different data formats and structures, dealing with highly distributed data sources, and more.

Contents
SUMMARY
1 INTRODUCTION
The Challenge
What Has Changed in Recent Years?
Organization of This Report
References
2 MASSIVE DATA IN SCIENCE, TECHNOLOGY, COMMERCE, NATIONAL DEFENSE, TELECOMMUNICATIONS, AND OTHER ENDEAVORS
Where Are Massive Data Appearing?
Challenges to the Analysis of Massive Data
Trends in Massive Data Analysis
Examples
References
3 SCALING THE INFRASTRUCTURE FOR DATA MANAGEMENT
Scaling the Number of Data Sets
Scaling Computing Technology through Distributed and Parallel Systems
Trends and Future Research
References
4 TEMPORAL DATA AND REAL-TIME ALGORITHMS
Introduction
Data Acquisition
Data Processing, Representation, and Inference
System and Hardware for Temporal Data Sets
Challenges
References
5 LARGE-SCALE DATA REPRESENTATIONS
Overview
Goals of Data Representation
Challenges and Future Directions
References
6 RESOURCES, TRADE-OFFS, AND LIMITATIONS
Introduction
Relevant Aspects of Theoretical Computer Science
Gaps and Opportunities
References
7 BUILDING MODELS FROM MASSIVE DATA
Introduction to Statistical Models
Data Cleaning
Classes of Models
Model Tuning and Evaluation
Challenges
References
8 SAMPLING AND MASSIVE DATA
Common Techniques of Statistical Sampling
Challenges When Sampling from Massive Data
References
9 HUMAN INTERACTION WITH DATA
Introduction
State of the Art
Hybrid Human/Computer Data Analysis
Opportunities, Challenges, and Directions
10 THE SEVEN COMPUTATIONAL GIANTS OF MASSIVE DATA ANALYSIS
Basic Statistics
Generalized N-Body Problems
Graph-Theoretic Computations
Linear Algebraic Computations
Optimizations
Integration
Alignment Problems
Discussion
References
11 CONCLUSION
APPENDIXES
A Acronyms
B Biographical Sketches of Committee Members
PDF with table-of-contents bookmark links