Table of Contents

1. Data Visualization

1.1 What is data?
1.2 What is data visualization?
1.3 R for data visualization
1.4 Data frames
1.5 Bar charts
1.6 Pie charts
1.7 Scatter plots
1.8 Line charts

2. Descriptive Statistics

2.1 Survey sampling
2.2 Measures of center
2.3 Measures of spread
2.4 Box plots
2.5 Histograms
2.6 Violin plots

3. Probability and Counting

3.1 Introduction to probability
3.2 Addition rule and complements
3.3 Multiplication rule and independence
3.4 Conditional probability and Bayes’ Theorem
3.5 Combinations and permutations

4. Probability Distributions

4.1 Introduction to random variables
4.2 Properties of discrete probability distributions
4.3 Binomial distribution
4.4 Hypergeometric distribution
4.5 Poisson distribution
4.6 Properties of continuous probability distributions
4.7 Normal distribution
4.8 Student’s t-distribution
4.9 F-distribution
4.10 Chi-square distribution

5. Inferential Statistics

5.1 Confidence intervals
5.2 Confidence intervals for population means
5.3 Confidence intervals for population proportions
5.4 Hypothesis tests
5.5 One-sample hypothesis tests for population means
5.6 One-sample z-test for population proportions
5.7 Two-sample hypothesis tests for population means
5.8 Two-sample z-test for population proportions
5.9 Analysis of variance (ANOVA)
5.10 Chi-square tests for categorical variables

6. Linear Regression

6.1 Simple linear regression (SLR)
6.2 SLR assumptions
6.3 Correlation coefficient and coefficient of determination
6.4 Interpreting linear models
6.5 Testing SLR parameters
6.6 Multiple regression
6.7 Categorical predictors and non-linear relationships

7. Time Series Analysis

7.1 What is a time series?
7.2 Time series patterns and stationarity
7.3 Moving average and exponential smoothing forecasting
7.4 Forecasting using regression

8. Monte Carlo Methods

8.1 What is a Monte Carlo simulation?
8.2 Building simulations
8.3 Optimization and forecasting

9. Data Mining

9.1 What is data mining?
9.2 Data preparation
9.3 Model evaluation
9.4 Supervised learning
9.5 Unsupervised learning

10. Ethics

10.1 Misleading statistics
10.2 Abuse of the p-value
10.3 Data privacy
10.4 Ethical guidelines

11. Appendix

11.1 z-distribution table
11.2 t-distribution table
11.3 Chi-square distribution table

12. Additional Material

12.1 Tables
12.2 Spreadsheets
12.3 Spreadsheet plotting
12.4 Dot plots
12.5 Animations
12.6 Data visualization: Case study
12.7 Dashboards
12.8 Linear regression example
12.9 What-if analysis
12.10 Advanced simulations

What You’ll Find In This zyBook:

More action with less text.

  • An exceptionally student-focused introduction to data analytics
  • Traditionally hard topics are made learnable via hundreds of animations and learning questions
  • Included statistics/probability background enables all students to succeed
  • Commonly combined with “Statistics for Data Analytics”; numerous configurations possible

Instructors: Interested in evaluating this zyBook for your class? Sign up for a Free Trial and check out the first chapter of any zyBook today!

The zyBooks Approach

Less text doesn’t mean less learning.

Data analytics is one of the fastest-growing subjects today. Techniques in data analysis can help solve a variety of problems, such as identifying new opportunities to generate profit or improving health outcomes in hospitals. Because the subject relies heavily on statistics, it often poses difficulties for students. This zyBook represents entirely new material created specifically to help students master the subject. Written natively for the modern web, the zyBook uses less text and teaches through hundreds of animations and learning questions.

The zyBook provides the background in probability and statistics needed to understand and apply the techniques covered in later chapters, such as time series analysis, Monte Carlo simulation, and data mining. A chapter on ethics provides real examples and encourages professionalism and safety.

In recent years, R has become popular among data analysts, researchers, and statisticians because of the wide variety of statistical computing and graphical modeling packages available in the language. Links to a live coding environment are provided to allow students to practice writing R functions for data visualization, inferential statistics, linear regression, and other algorithms.

“I have now been asked to teach Discrete Mathematics again … because of my past experience with zyBooks I agreed to teach this topic again only if I could use the zyBook again.”

Timothy Stanley, Lecturer, Utah Valley University