Data Science Foundations

1. Introduction to Data Science

1.1 Historical overview
1.2 Why data science?
1.3 Careers in data science
1.4 Data science lifecycle
1.5 Ethics in data science
1.6 Case study: Netflix

2. Probability and Statistics

2.1 Data collection
2.2 Descriptive statistics
2.3 Probability
2.4 Probability distributions
2.5 Inferential statistics
2.6 Inference for proportions and means
2.7 Case study: Flight delays

3. Data Wrangling

3.1 Data wrangling
3.2 Exploring data
3.3 Structuring data
3.4 Cleaning data
3.5 Enriching data
3.6 Data engineering
3.7 Case study: Diamond prices
3.8 Case study: App reviews

4. Data Visualization

4.1 Visualizing data with one feature
4.2 Visualizing data with multiple features
4.3 Best practices for visualizing data
4.4 Tools for visualizing data
4.5 Performing exploratory data analysis
4.6 Detecting outliers
4.7 Case study: Palmer penguins
4.8 Case study: World stock market indices

5. Regression

5.1 Introduction to regression
5.2 Simple linear regression
5.3 Linear regression assumptions
5.4 Multiple linear regression
5.5 Logistic regression
5.6 Case study: Energy consumption
5.7 Case study: Customer churn

6. Evaluating Model Performance

6.1 Model error
6.2 Training, validation, and test sets
6.3 Loss functions for regression
6.4 Loss functions for classification
6.5 Binary classification metrics
6.6 Cross-validation
6.7 Bootstrap method
6.8 Comparing models
6.9 Case study: Home prices
6.10 Case study: Credit risk

7. Supervised Learning

7.1 Introduction to supervised learning
7.2 K-nearest neighbors
7.3 Naive Bayes classification
7.4 Support vector machines
7.5 Case study: Classifying cells
7.6 Case study: Estimating annual precipitation

8. Unsupervised Learning

8.1 Introduction to unsupervised learning
8.2 K-means clustering
8.3 Hierarchical clustering
8.4 Detecting outliers using DBSCAN
8.5 Analyzing factors
8.6 Analyzing factors using PCA
8.7 Case study: Travel reviews
8.8 Case study: Cardiovascular health

9. Decision Trees

9.1 Introduction to decision trees
9.2 Regression trees
9.3 Classification trees
9.4 Random forests
9.5 Case study: Marijuana legalization

10. Artificial Neural Networks

10.1 Introduction to artificial neural networks
10.2 Single-layer perceptron
10.3 Nonlinear activation functions
10.4 Multilayer perceptron
10.5 Case study: Bike share demand

11. Ensemble Techniques

11.1 Introduction to ensemble models
11.2 Boosting
11.3 Bagging
11.4 Stacking
11.5 Case study: Bob Ross

12. Artificial Intelligence

12.1 Artificial intelligence
12.2 Machine learning
12.3 Computer vision
12.4 Natural language processing
12.5 Risks and ethics in AI
12.6 Generative AI
12.7 Prompt engineering

Teach with Case Studies: Enhancing Data Science and Statistics Education

At zyBooks, we recognize the value of real-world applications in fostering a deeper understanding of complex concepts. Our case studies serve as a bridge between theoretical knowledge and practical application.

How to Teach Data Science

These best practices help students master data science, and you can put them into action in your own classroom right away.

AI in the Classroom

Stay informed with our comprehensive AI in education resources and expert insights.

Teach data science with the only interactive introduction to foundational algorithms and techniques

Data Science Foundations is the first complete, interactive introduction that develops an applied understanding of topics in data science.

Covers data science through a conceptual lens without assuming a background in programming or statistics
Includes data preprocessing, regression techniques, supervised and unsupervised learning algorithms, decision trees, neural networks, ensemble methods, model evaluation techniques, and artificial intelligence
Continuously updated with the latest advances in data science
Case studies illustrate the data science lifecycle from start to finish with real data
Adopters have access to a test bank with over 250 questions
New AI chapter covers essential artificial intelligence concepts and current AI applications, with participation and challenge activities

Co-author Dr. Schwab-McCoy explains the benefits of zyBooks for both data science instructors and students:

What is a zyBook?

Data Science Foundations is an interactive learning solution that helps students visualize concepts, enabling them to learn more effectively than with a traditional textbook.

zyBooks benefits students and instructors:

Instructor benefits
Customize your course by reorganizing existing content or adding your own
Continuous publication model updates your course with the latest content and technologies
Gain insight into students’ progress, reading and participation with robust reporting
Build quizzes and exams with over 250 included test questions

Student benefits
Learning questions and other content serve as an interactive form of reading
Instant feedback on labs and homework
Concepts come to life through extensive animations embedded into the interactive content
Save chapters as PDFs to reference the material at any time

Authors

Aimee Schwab-McCoy
Senior Manager, Content Development, Data Science, Mathematics and Statistics / PhD in Statistics, University of Nebraska–Lincoln

Chris Chan
MA in Mathematics, San Francisco State University

Matt Rissler
PhD in Mathematics, University of Notre Dame
