Handbook of Big Data provides a state-of-the-art overview of the analysis of large-scale datasets. Featuring contributions from well-known experts in statistics and computer science, this handbook presents a carefully curated collection of techniques from both industry and academia. Thus, the text instills a working understanding of key statistical and computing ideas that can be readily applied in research and practice.

Offering balanced coverage of methodology, theory, and applications, this handbook:

* Describes modern, scalable approaches for analyzing increasingly large datasets
* Defines the underlying concepts of the available analytical tools and techniques
* Details intercommunity advances in computational statistics and machine learning

Handbook of Big Data also identifies areas in need of further development, encouraging greater communication and collaboration between researchers in big data sub-specialties such as genomics, computational biology, and finance.
GENERAL PERSPECTIVES ON BIG DATA

The Advent of Data Science: Some Considerations on the Unreasonable Effectiveness of Data
    Richard Starmans
Big n versus Big p in Big Data
    Norman Matloff

DATA-CENTRIC, EXPLORATORY METHODS

Divide and Recombine: Approach for Detailed Analysis and Visualization of Large Complex Data
    Ryan Hafen
Integrate Big Data for Better Operation, Control, and Protection of Power Systems
    Guang Lin
Interactive Visual Analysis of Big Data
    Carlos Scheidegger
A Visualization Tool for Mining Large Correlation Tables: The Association Navigator
    Andreas Buja, Abba M. Krieger, and Edward I. George

EFFICIENT ALGORITHMS

High-Dimensional Computational Geometry
    Alexandr Andoni
IRLBA: Fast Partial SVD Method
    James Baglama
Structural Properties Underlying High-Quality Randomized Numerical Linear Algebra Algorithms
    Michael W. Mahoney and Petros Drineas
Something for (Almost) Nothing: New Advances in Sublinear-Time Algorithms
    Ronitt Rubinfeld and Eric Blais

GRAPH APPROACHES

Networks
    Elizabeth L. Ogburn and Alexander Volfovsky
Mining Large Graphs
    David F. Gleich and Michael W. Mahoney

MODEL FITTING AND REGULARIZATION

Estimator and Model Selection Using Cross-Validation
    Ivan Diaz
Stochastic Gradient Methods for Principled Estimation with Large Datasets
    Panos Toulis and Edoardo M. Airoldi
Learning Structured Distributions
    Ilias Diakonikolas
Penalized Estimation in Complex Models
    Jacob Bien and Daniela Witten
High-Dimensional Regression and Inference
    Lukas Meier

ENSEMBLE METHODS

Divide and Recombine Subsemble, Exploiting the Power of Cross-Validation
    Stephanie Sapp and Erin LeDell
Scalable Super Learning
    Erin LeDell

CAUSAL INFERENCE

Tutorial for Causal Inference
    Laura Balzer, Maya Petersen, and Mark van der Laan
A Review of Some Recent Advances in Causal Inference
    Marloes H. Maathuis and Preetam Nandy

TARGETED LEARNING

Targeted Learning for Variable Importance
    Sherri Rose
Online Estimation of the Average Treatment Effect
    Sam Lendle
Mining with Inference: Data-Adaptive Target Parameters
    Alan Hubbard and Mark van der Laan