Table of Contents

1.1 Historical overview
1.2 Why data science?
1.3 Careers in data science
1.4 Data science lifecycle
1.5 Ethics in data science
1.6 Case study: Netflix

2.1 Data collection
2.2 Descriptive statistics
2.3 Probability
2.4 Probability distributions
2.5 Inferential statistics
2.6 Inference for proportions and means

3.1 Data wrangling
3.2 Structuring data
3.3 Cleaning data
3.4 Enriching data

4.1 Visualizing data with one feature
4.2 Visualizing data with multiple features
4.3 Best practices for visualizing data
4.4 Tools for visualizing data
4.5 Performing exploratory data analysis
4.6 Detecting outliers

5.1 Introduction to regression
5.2 Simple linear regression
5.3 Linear regression assumptions
5.4 Multiple linear regression
5.5 Logistic regression

6.1 Model error
6.2 Binary classification metrics
6.3 Regression metrics
6.4 Training, validation, and test sets
6.5 Cross-validation
6.6 Bootstrap method
6.7 Comparing models

7.1 Introduction to supervised learning
7.2 K-nearest neighbors
7.3 Naive Bayes classification
7.4 Support vector machines

8.1 Introduction to unsupervised learning
8.2 K-means clustering
8.3 Hierarchical clustering
8.4 Detecting outliers using DBSCAN
8.5 Analyzing factors
8.6 Analyzing factors using PCA

9.1 Introduction to decision trees
9.2 Regression trees
9.3 Classification trees
9.4 Random forests

10.1 Introduction to artificial neural networks
10.2 Single-layer perceptron
10.3 Nonlinear activation functions
10.4 Multilayer perceptron

11.1 Introduction to ensemble models
11.2 Boosting
11.3 Bagging
11.4 Stacking

What You’ll Find In This zyBook:

More action with less text.

  • Builds student understanding and confidence through learning questions and coding activities
  • Teaches the skills required for the more quantitative and technical aspects of data science and machine learning
  • Covers each topic from a conceptual standpoint without assuming prerequisite knowledge in statistics and programming
  • Includes a test bank with more than 260 questions


The zyBooks Approach

Less text doesn’t mean less learning.

Data Science Foundations provides an interactive introduction to common algorithms and techniques in data science. This zyBook covers data preprocessing, regression techniques, supervised and unsupervised learning algorithms, decision trees, neural networks, ensemble methods, and model evaluation techniques.

“It is already clear that this represents the future of programming text books. Its basic expository content is the equal of any paper text, but it really shines in using the natural advantages of online vs. static teaching material – animation and interactivity – to excellent effect, giving the student an additional dimension of insight.”

Authors

Chris Chan
Senior Manager, Content Development in Math, Stats, and Data Science / zyBooks / M.A. in Mathematics / San Francisco State University

Matt Rissler
Data Science Content Developer / Ph.D. in Mathematics / University of Notre Dame

Aimee Schwab-McCoy
Data Science Content Developer / Ph.D. in Statistics / University of Nebraska–Lincoln