This textbook provides future data analysts with the tools, methods, and skills needed to answer data-focused, real-life questions; to carry out data analysis; and to visualize and interpret results to support better decisions in business, economics, and public policy. It comprehensively covers data wrangling and exploration, regression analysis, machine learning, and causal analysis, and explains when, why, and how the methods work and how they relate to each other. As the most effective way to communicate data analysis, running case studies play a central role in this textbook. Each case starts with an industry-relevant question and answers it by using real-world data and applying the tools and methods covered in the textbook. Learning is then consolidated by 360 practice questions and 120 data exercises. Extensive online resources, including raw and cleaned data and code for all analyses in Stata, R, and Python, can be found at www.gabors-data-analysis.com.
Provides students with a clear explanation of data analysis: one third of the book consists of running case studies that develop the data analysis process logically, using real-world scenarios and data
Fills an important and growing niche between technical econometrics books and more basic business analytics texts
Ideal for students who do not want to take more econometrics courses but would rather gain hands-on experience of working with real data. Suitable for non-PhD track students in economics and business
Coding language neutral. The text does not include code in any language, and hence, may be used in a variety of settings
Uses R, Stata, and Python code, provided online, to teach methods, a far more useful and industry-relevant approach than the spreadsheet programs used by most business analytics books
Full suite of ancillaries, including code and data used in case studies that have been carefully curated to match the printed text
Part I. Data Exploration:
1. Origins of data
2. Preparing data for analysis
3. Exploratory data analysis
4. Comparison and correlation
5. Generalizing from data
6. Testing hypotheses
Part II. Regression Analysis:
7. Simple regression
8. Complicated patterns and messy data
9. Generalizing results of a regression
10. Multiple linear regression
11. Modeling probabilities
12. Regression with time series data
Part III. Prediction:
13. A framework for prediction
14. Model building for prediction
15. Regression trees
16. Random forest and boosting
17. Probability prediction and classification
18. Forecasting from time series data
Part IV. Causal Analysis:
19. A framework for causal analysis
20. Designing and analyzing experiments
21. Regression and matching with observational data
22. Difference-in-differences
23. Methods for panel data
24. Appropriate control groups for panel data
Bibliography
Index