## Table of Contents

1. Data Visualization

1.1 What is data?

1.2 What is data visualization?

1.3 R for data visualization

1.4 Data frames

1.5 Bar charts

1.6 Pie charts

1.7 Scatter plots

1.8 Line charts

2. Descriptive Statistics

2.1 Survey sampling

2.2 Measures of center

2.3 Measures of spread

2.4 Box plots

2.5 Histograms

2.6 Violin plots

3. Probability Distributions

3.1 Experiments and events

3.2 Random variables

3.3 Discrete random variables and their distributions

3.4 Properties of discrete probability distributions

3.5 Properties of continuous probability distributions

3.6 The normal distribution

3.7 The Student’s t-distribution

3.8 The F-distribution

4. Inferential Statistics

4.1 Confidence intervals

4.2 Confidence intervals for population means

4.3 Confidence intervals for population proportions

4.4 Hypothesis tests

4.5 One-sample hypothesis tests for population means

4.6 One-sample z-test for population proportions

4.7 Two-sample hypothesis tests for population means

4.8 Two-sample z-test for population proportions

4.9 Analysis of variance (ANOVA)

5. Linear Regression

5.1 Simple linear regression

5.2 Least squares method

5.3 Simple linear regression assumptions

5.4 Interpreting linear models

5.5 Correlation

5.6 Model assessment

5.7 Multiple regression

5.8 Categorical predictors and non-linear relationships

6. Time Series Analysis

6.1 What is a time series?

6.2 Time series patterns and stationarity

6.3 Moving average and exponential smoothing forecasting

6.4 Forecasting using regression

7. Monte Carlo Methods

7.1 What is a Monte Carlo simulation?

7.2 Building simulations

7.3 Optimization and forecasting

8. Data Mining

8.1 What is data mining?

8.2 Data preparation

8.3 Model evaluation

8.4 Supervised learning

8.5 Unsupervised learning

9. Ethics

9.1 Misleading statistics

9.2 Abuse of the p-value

9.3 Data privacy

9.4 Ethical guidelines

10. Additional Material

10.1 Tables

10.2 Spreadsheets

10.3 Spreadsheet plotting

10.4 Dot plots

10.5 Animations

10.6 Data visualization: Case study

10.7 Dashboards

10.8 Specific distributions

10.9 Combinations and permutations

10.10 Linear regression example

10.11 What-if analysis

10.12 Advanced simulations

## What You’ll Find In This zyBook:

### More action with less text.

- An exceptionally student-focused introduction to data analytics
- Traditionally hard topics are made learnable through hundreds of animations and learning questions
- Included statistics/probability background enables all students to succeed
- Commonly combined with “Statistics for Data Analytics”; numerous configurations possible

## The zyBooks Approach

### Less text doesn’t mean less learning.

Data analytics is one of the fastest-growing subjects today. Data analysis techniques can help solve a range of problems, such as identifying new opportunities to generate profit or improving health outcomes in hospitals. Because the subject relies heavily on statistics, it often poses difficulties for students. This zyBook is entirely new material created specifically to help students master the subject. Written natively for the modern web, the zyBook uses less text and teaches through hundreds of animations and learning questions.

The zyBook provides the background in probability and statistics needed to understand and apply the techniques covered in later chapters, such as time series analysis, Monte Carlo simulation, and data mining. A chapter on ethics provides real examples and encourages professionalism and safety.

In recent years, R has become popular among data analysts, researchers, and statisticians because of the wide variety of statistical computing and graphical modeling packages available in the language. Links to a live coding environment let students practice writing R functions for data visualization, inferential statistics, linear regression, and other algorithms.