Robust Methods for Data Reduction gives a non-technical overview of robust data reduction techniques, encouraging the use of these important and useful methods in practical applications. The main areas covered include principal components analysis, sparse principal component analysis, canonical correlation analysis, factor analysis, clustering, double clustering, and discriminant analysis.

The first part of the book illustrates how dimension reduction techniques synthesize available information by reducing the dimensionality of the data. The second part focuses on cluster and discriminant analysis. The authors explain how to perform sample reduction by finding groups in the data.

Despite considerable theoretical achievements, robust methods are not often used in practice. This book fills the gap between theoretical robust techniques and the analysis of real data sets in the area of data reduction. Using real examples, the authors show how to implement the procedures in R. The code and data for the examples are available on the book's CRC Press web page.

Introduction and Overview
  What is contamination
  Evaluating robustness
  What is data reduction
  An overview of robust dimension reduction
  An overview of robust sample reduction
  Example datasets
Multivariate Estimation Methods
  Robust univariate methods
  Classical multivariate estimation
  Robust multivariate estimation
  Identification of multivariate outliers
  Examples

Dimension Reduction

Principal Component Analysis
  Classical PCA
  PCA based on robust covariance estimation
  PCA based on projection pursuit
  Spherical PCA
  PCA in high dimensions
  Outlier identification using principal components
  Examples
Sparse Robust PCA
  Basic concepts and sPCA
  Robust sPCA
  Choice of the degree of sparsity
  Sparse projection pursuit
  Examples
Canonical Correlation Analysis
  Classical canonical correlation analysis
  CCA based on robust covariance estimation
  Other methods
  Examples
Factor Analysis
  The FA model
  Robust factor analysis
  Examples

Sample Reduction

k-Means and Model-Based Clustering
  A brief overview of applications of cluster analysis
  Basic concepts
  k-means
  Model-based clustering
  Choosing the number of clusters
Robust Clustering
  Partitioning around medoids
  Trimmed k-means
  Snipped k-means
  Choosing the trimming and snipping levels
  Examples
Robust Model-Based Clustering
  Robust heterogeneous clustering based on trimming
  Robust heterogeneous clustering based on snipping
  Examples
Double Clustering
  Double k-means
  Trimmed double k-means
  Snipped double k-means
  Robustness properties
Discriminant Analysis
  Classical discriminant analysis
  Robust discriminant analysis

Appendix: Use of the Software R for Data Reduction
Bibliography
Index