## Table of Contents

1. Data Visualization

1.1 What is data?

1.2 What is data visualization?

1.3 Python for data visualization

1.4 Data frames

1.5 Bar charts

1.6 Pie charts

1.7 Scatter plots

1.8 Line charts

2. Descriptive Statistics

2.1 Survey sampling

2.2 Measures of center

2.3 Measures of spread

2.4 Box plots

2.5 Histograms

2.6 Violin plots

3. Probability Distributions

3.1 Experiments and events

3.2 Random variables

3.3 Discrete random variables and their distributions

3.4 Properties of discrete probability distributions

3.5 Properties of continuous probability distributions

3.6 The normal distribution

3.7 The Student’s t-distribution

3.8 The F-distribution

4. Inferential Statistics

4.1 Confidence intervals

4.2 Confidence intervals for population means

4.3 Confidence intervals for population proportions

4.4 Hypothesis tests

4.5 One-sample hypothesis tests for population means

4.6 One-sample z-test for population proportions

4.7 Two-sample hypothesis tests for population means

4.8 Two-sample z-test for population proportions

4.9 Analysis of variance (ANOVA)

5. Linear Regression

5.1 Simple linear regression

5.2 Least squares method

5.3 Simple linear regression assumptions

5.4 Interpreting linear models

5.5 Correlation

5.6 Model assessment

5.7 Multiple regression

5.8 Categorical predictors and non-linear relationships

6. Time Series Analysis

6.1 What is a time series?

6.2 Time series patterns and stationarity

6.3 Moving average and exponential smoothing forecasting

6.4 Forecasting using regression

7. Monte Carlo Methods

7.1 What is a Monte Carlo simulation?

7.2 Building simulations

7.3 Optimization and forecasting

8. Data Mining

8.1 What is data mining?

8.2 Data preparation

8.3 Model evaluation

8.4 Supervised learning

8.5 Unsupervised learning

9. Ethics

9.1 Misleading statistics

9.2 Abuse of the p-value

9.3 Data privacy

9.4 Ethical guidelines

## What You’ll Find In This zyBook:

### More action with less text.

- An exceptionally student-focused introduction to data analytics
- Traditionally hard topics are made learnable via hundreds of animations and learning questions
- Included statistics/probability background enables all students to succeed
- Python coding practice is provided throughout to allow students to experiment
- Commonly combined with “Statistics for Data Analytics”; numerous configurations possible

## The zyBooks Approach

### Less text doesn’t mean less learning.

Data analytics is one of the fastest-growing subjects today. Data analysis techniques can help solve a range of problems, such as identifying new opportunities to generate profit or improving health outcomes in hospitals. Because the subject relies heavily on statistics, it often poses difficulties for students. This zyBook represents entirely new material created specifically to help students master the subject. Written natively for the modern web, the zyBook uses less text and teaches through hundreds of animations and learning questions.

The zyBook provides a solid background in probability and statistics needed to understand and apply techniques covered in later chapters such as time series analysis, Monte Carlo simulation, and data mining. A chapter on ethics provides real examples and encourages professionalism and safety.

In recent years, Python has gained ground as a popular language among data analysts, researchers, and statisticians because of its clean syntax and widespread use among software developers. Links to a live coding environment are provided so students can practice writing Python functions for data visualization, inferential statistics, linear regression, and other algorithms.