Statistics for Data Analytics
zyBooks 2017

1. Statistics Background
1.1 Measures of center
1.3 Histograms
1.4 Experiments and events
1.5 Discrete random variables and their distributions
1.6 Properties of discrete probability distributions
1.7 Continuous probability distributions and properties
1.8 Specific distributions
1.9 Hypotheses
1.10 Hypothesis testing
1.11 Comparing two population means
1.12 Confidence intervals

2. Parametric Analysis
2.1 Parameterized population models
2.2 Student’s t-test
2.3 Comparing 2 samples: 2-sample t-test
2.4 Comparing 3+ samples: ANOVA
2.5 Linear regression

3. Nonparametric Analysis
3.1 Parametric vs. nonparametric statistics
3.2 Resampling: Randomization and bootstrapping
3.3 Wilcoxon rank-sum test
3.4 Kruskal-Wallis test
3.5 Multiple tests

4. Categorical Analysis
4.1 Comparing samples having categorical data
4.2 2 samples, 2 categories: Fisher’s exact test
4.3 2 samples, 2 categories, large sample: Chi-square test
4.4 3+ samples and/or 3+ categories: Chi-square test
4.5 2 samples, 2 categories: Relative risk and odds ratios

5. Principal Component Analysis
5.1 Introduction to PCA
5.2 Calculating principal components for two variables
5.3 Extending PCA to more variables
5.4 Determining the number of components
5.5 Interpreting principal components

6. Multiple Regression
6.1 The multiple linear regression model
6.2 Model parameter estimation and testing
6.3 Regression diagnostics
6.4 Model interpretation
6.5 Categorical predictors
6.6 Transformations and interactions
6.7 Multiple linear regression example

7. Logistic Regression
7.1 Introduction to the logistic regression model
7.2 Parameter estimation
7.3 Probability estimates
7.4 Inference 1: Wald tests
7.5 Inference 2: LR tests and AIC
7.6 Interpretation
7.7 Assessing fit