Table of Contents
1. Data Visualization
1.1 What is data?
1.2 What is data visualization?
1.3 R for data visualization
1.4 Data frames
1.5 Bar charts
1.6 Pie charts
1.7 Scatter plots
1.8 Line charts
1.9 Data visualization example
2. Descriptive Statistics
2.1 Survey sampling
2.2 Measures of center
2.3 Measures of variability
2.4 Box plots
2.5 Histograms
2.6 Violin plots
3. Probability and Counting
3.1 Introduction to probability
3.2 Addition rule and complements
3.3 Multiplication rule and independence
3.4 Conditional probability
3.5 Bayes’ Theorem
3.6 Combinations and permutations
4. Probability Distributions
4.1 Introduction to random variables
4.2 Properties of discrete probability distributions
4.3 Binomial distribution
4.4 Hypergeometric distribution
4.5 Poisson distribution
4.6 Properties of continuous probability distributions
4.7 Normal distribution
4.8 Student’s t-Distribution
4.9 F-distribution
4.10 Chi-square distribution
5. Inferential Statistics
5.1 Confidence intervals
5.2 Confidence intervals for population means
5.3 Confidence intervals for population proportions
5.4 Hypothesis testing
5.5 Hypothesis test for a population mean
5.6 Hypothesis test for a population proportion
5.7 Hypothesis test for the difference between two population means
5.8 Hypothesis test for the difference between two population proportions
5.9 One-way analysis of variance (one-way ANOVA)
6. Linear Regression
6.1 Introduction to simple linear regression (SLR)
6.2 SLR assumptions
6.3 Correlation and coefficient of determination
6.4 Interpreting SLR models
6.5 Confidence and prediction intervals for SLR models
6.6 Testing SLR parameters
6.7 Multiple regression
6.8 Categorical predictor variables
6.9 Interaction terms
6.10 Linear regression example
7. Chi-square Tests for Categorical Data
7.1 Categorical data
7.2 Fisher’s exact test
7.3 Introduction to chi-square tests
7.4 Chi-square test for homogeneity and independence
7.5 Relative risk and odds ratios
8. Time Series
8.1 What is a time series?
8.2 Time series patterns and stationarity
8.3 Moving average and exponential smoothing forecasting
8.4 Forecasting using regression
9. Monte Carlo Methods
9.1 What is a Monte Carlo simulation?
9.2 Building simulations
9.3 Optimization and forecasting
9.4 What-if analysis
9.5 Advanced simulations
10. Data Mining
10.1 What is data mining?
10.2 Data preparation
10.3 Analyzing results
10.4 Supervised learning
10.5 Unsupervised learning
11. Decision Tree Learning
11.1 Introduction to decision trees
11.2 Classification and regression trees (CART)
11.3 ID3 and C4.5 algorithms
11.4 Classification tree example
11.5 Regression tree example
11.6 Random forests
12. Ethics
12.1 Misleading statistics
12.2 Abuse of the p-value
12.3 Data privacy
12.4 Ethical guidelines
13. Appendix A: Distribution Tables
13.1 t-distribution table
13.2 z-distribution table
13.3 Chi-squared distribution table
14. Appendix B: CSV Files
14.1 Data sets
What You’ll Find In This zyBook:
More action with less text.
- An exceptionally student-focused introduction to applied statistics.
- Traditionally difficult topics are made easier using animations and learning questions.
- Several chapters on data analytics are included.
- R coding environments are provided throughout to allow students to experiment.
- Commonly combined with “Applied Regression Analysis”; numerous configurations are possible.
The zyBooks Approach
Less text doesn’t mean less learning.
This zyBook provides a concise introduction to bivariate and multivariate statistics using an applied approach with real-world data. Equations for common statistical quantities are provided, but most concepts are explained using animations rather than rigorous mathematical proof. This content is recommended for STEM majors who may not have a solid foundation in statistics but want a friendly introduction to data analytics. R coding environments are provided that allow students to experiment with datasets that are both interesting and relevant to their day-to-day lives.
Senior Contributors
Joel Berrier
Assistant Professor, Dept. of Physics and Astronomy, Univ. of Nebraska, Kearney, Ph.D. Physics and Astronomy, UC Irvine
Chris Chan
Content lead: Mathematics, zyBooks, M.A. Mathematics, San Francisco State Univ.
Scott Nestler
Associate Teaching Professor, Mendoza College of Business, Univ. of Notre Dame, Ph.D. Management Science, Univ. of Maryland, College Park
Iain Pardoe
Mathematics and Statistics Instructor, Thompson Rivers Univ., Pennsylvania State Univ., and Statistics.com, Ph.D. Statistics, Univ. of Minnesota
Ron Siu
Content developer, zyBooks, M.S. Biomedical Engineering, UCLA; M.S. Developmental Biology, Stanford
Rodney X. Sturdivant
Professor, Dept. of Mathematics and Physics, Azusa Pacific Univ., Ph.D. Biostatistics, U Mass Amherst
Krista Watts
Assistant Professor, Director—Center for Data Analysis and Statistics, United States Military Academy, West Point, Ph.D. Biostatistics, Harvard