Table of Contents

1.1 Historical overview
1.2 Why data science?
1.3 Careers in data science
1.4 Data science lifecycle
1.5 Ethics in data science
1.6 Case study: Netflix

2.1 Programming with Python and Jupyter
2.2 Python data types
2.3 Python functions
2.4 Data science packages
2.5 NumPy package
2.6 pandas package
2.7 matplotlib package
2.8 Case study: Hawks

3.1 Data collection
3.2 Descriptive statistics
3.3 Probability
3.4 Probability distributions
3.5 Inferential statistics
3.6 Inference for proportions and means
3.7 Case study: Flight delays

4.1 Relational databases
4.2 Simple queries
4.3 Special operators and clauses
4.4 Aggregate functions
4.5 Join queries
4.6 Subqueries
4.7 Queries in Python
4.8 Case study: Queries in SQL and pandas

5.1 Data wrangling
5.2 Manipulating data
5.3 Structuring data
5.4 Cleaning data
5.5 Enriching data
5.6 Case study: Diamond prices

6.1 Visualizing data with one feature
6.2 Visualizing data with multiple features
6.3 Best practices for visualizing data
6.4 Tools for visualizing data
6.5 Performing exploratory data analysis
6.6 Detecting outliers
6.7 Case study: Palmer penguins

7.1 Introduction to regression
7.2 Simple linear regression
7.3 Linear regression assumptions
7.4 Multiple linear regression
7.5 Logistic regression
7.6 Case study: Energy consumption
7.7 Case study: Customer churn

8.1 Model error
8.2 Training, validation, and test sets
8.3 Loss functions for regression
8.4 Loss functions for classification
8.5 Binary classification metrics
8.6 Cross-validation
8.7 Bootstrap method
8.8 Comparing models
8.9 Case study: Home prices

9.1 Introduction to supervised learning
9.2 K-nearest neighbors
9.3 Naive Bayes classification
9.4 Support vector machines
9.5 Case study: Classifying cells

10.1 Introduction to unsupervised learning
10.2 K-means clustering
10.3 Hierarchical clustering
10.4 Detecting outliers using DBSCAN
10.5 Analyzing factors
10.6 Analyzing factors using PCA
10.7 Case study: Travel reviews

11.1 Introduction to decision trees
11.2 Regression trees
11.3 Classification trees
11.4 Random forests
11.5 Case study: Marijuana legalization

12.1 Introduction to artificial neural networks
12.2 Single-layer perceptron
12.3 Nonlinear activation functions
12.4 Multilayer perceptron
12.5 Case study: Bike share demand

13.1 Introduction to ensemble models
13.2 Boosting
13.3 Bagging
13.4 Stacking
13.5 Case study: Bob Ross

14.1 Datasets: CSV files

Teach data science in Python with the only interactive introduction that’s fully integrated with Jupyter Notebooks

Data Science Foundations with Python is the first complete, interactive introduction to the foundational algorithms and techniques for Python in data science.

  • Students can write and edit live code, create data visualizations, and output images right in the zyBook
  • Includes data preprocessing, regression techniques, supervised and unsupervised learning algorithms, decision trees, neural networks, ensemble methods, and model evaluation techniques
  • Jupyter Notebooks are embedded in the zyBook, so students work with real-world, professional tools
  • Continuously updated with the latest advances in data science

“Data science is interactive; it requires coding and live investigations of data sets. To do all that within a digital zyBook is really powerful.”
– Co-author Dr. Aimee Schwab-McCoy

What is a zyBook?


Data Science Foundations with Python is a web-native, interactive zyBook that helps students visualize concepts to learn faster and more effectively than with a traditional textbook.

Since 2012, over 1,700 academic institutions have adopted web-native zyBooks to transform their STEM education.

zyBooks benefit students and instructors:

  • Instructor benefits
      • Customize your course by reorganizing existing content or adding your own
      • Continuous publication model updates your course with the latest content and technologies
      • Gain insight into students’ progress, reading, and participation with robust reporting
      • Save time with auto-graded labs and challenge activities that seamlessly integrate with your LMS gradebook
      • Build quizzes and exams with over 300 included test questions
  • Student benefits
      • Learning questions and other content serve as an interactive form of reading
      • Instant feedback on labs and homework
      • Concepts come to life through extensive animations embedded into the interactive content
      • Save chapters as PDFs to reference the material at any time
      • Gain real-life, professional experience working with industry-standard Jupyter Notebooks

Embedded Jupyter Notebooks

The Data Science Foundations with Python zyBook is fully integrated with Jupyter Notebooks, the industry-standard web-based computing platform. Students gain real-life experience with a professional application: writing and editing live code, creating data visualizations, and experimenting with the parameters of different models to evaluate their performance.

Jupyter Notebooks can also be downloaded for offline use.

Authors

Chris Chan
Senior Manager, Content Development in Math, Stats, and Data Science / zyBooks / MA in Mathematics / San Francisco State University

Matt Rissler
Data Science Content Developer / PhD in Mathematics / University of Notre Dame

Aimee Schwab-McCoy
Data Science Content Developer / PhD in Statistics / University of Nebraska–Lincoln
