Data Mining with R: Learning with Case Studies, Second Edition uses practical examples to illustrate the power of R and data mining. Providing an extensive update to the best-selling first edition, this new edition is divided into two parts. The first part features introductory material, including a new chapter that introduces data mining, to complement the existing introduction to R. The second part contains the case studies, whose R code has been substantially revised to bring it up to date with recent R packages.
The book does not assume any prior knowledge of R. Readers who are new to R and data mining should be able to follow the case studies, which are designed to be self-contained so that the reader can start anywhere in the book.
The book is accompanied by a set of freely available R source files that can be obtained at the book's web site. These files include all the code used in the case studies, and they facilitate the "do-it-yourself" approach followed in the book.
Designed for users of data analysis tools, as well as researchers and developers, the book should be useful for anyone interested in entering the "world" of R and data mining.
Introduction
I R AND DATA MINING
Introduction to R
Starting with R
Basic Interaction with the R Console
R Objects and Variables
R Functions
Vectors
Vectorization
Factors
Generating Sequences
Sub-Setting
Matrices and Arrays
Lists
Data Frames
Useful Extensions to Data Frames
Objects, Classes, and Methods
Managing Your Sessions
Introduction to Data Mining
A Bird's Eye View on Data Mining
Data Collection and Business Understanding
Data Pre-Processing
Modeling
Evaluation
Reporting and Deployment
II CASE STUDIES
Predicting Algae Blooms
Problem Description and Objectives
Data Description
Loading the Data into R
Data Visualization and Summarization
Unknown Values
Obtaining Prediction Models
Model Evaluation and Selection
Predictions for the Seven Algae
Summary
Predicting Stock Market Returns
Problem Description and Objectives
The Available Data
The Prediction Models
From Predictions into Actions
Model Evaluation and Selection
The Trading System
Summary
Detecting Fraudulent Transactions
Problem Description and Objectives
The Available Data
Defining the Data Mining Tasks
Obtaining Outlier Rankings
Summary
Classifying Microarray Samples
Problem Description and Objectives
The Available Data
Gene (Feature) Selection
Predicting Cytogenetic Abnormalities
Summary