Data Science with R | Jupyter Notebooks

1. Introduction to Data Science

1.1 Historical overview
1.2 Why data science?
1.3 Careers in data science
1.4 Data science lifecycle
1.5 Ethics in data science
1.6 Case study: Netflix

2. R for Data Science

2.1 Introduction to R and Jupyter
2.2 Basics of base R
2.3 Working with dataframes
2.4 Using R packages
2.5 Base R graphics
2.6 Efficient coding in R
2.7 Case study: Hawks

3. Probability and Statistics

3.1 Data collection
3.2 Descriptive statistics
3.3 Probability
3.4 Probability distributions
3.5 Inferential statistics
3.6 Inference for proportions and means
3.7 Case study: Flight delays

4. SQL for Data Science

4.1 Relational databases
4.2 Simple queries
4.3 Special operators and clauses
4.4 Aggregate functions
4.5 Join queries
4.6 Subqueries
4.7 Case study: Queries in SQL and R

5. Data Wrangling

5.1 Data wrangling
5.2 Exploring data
5.3 Structuring data
5.4 Cleaning data
5.5 Enriching data
5.6 Data engineering
5.7 Case study: Diamond prices
5.8 Case study: App reviews

6. Data Visualization

6.1 Visualizing data with one feature
6.2 Visualizing data with multiple features
6.3 Best practices for visualizing data
6.4 Tools for visualizing data
6.5 Performing exploratory data analysis
6.6 Detecting outliers
6.7 Case study: Palmer penguins
6.8 Case study: World stock market indices

7. Regression

7.1 Introduction to regression
7.2 Simple linear regression
7.3 Linear regression assumptions
7.4 Multiple linear regression
7.5 Logistic regression
7.6 Case study: Energy consumption
7.7 Case study: Customer churn

8. Evaluating Model Performance

8.1 Model error
8.2 Training, validation, and test sets
8.3 Loss functions for regression
8.4 Loss functions for classification
8.5 Binary classification metrics
8.6 Cross-validation
8.7 Bootstrap method
8.8 Comparing models
8.9 Case study: Home prices
8.10 Case study: Credit risk

9. Supervised Learning

9.1 Introduction to supervised learning
9.2 K-nearest neighbors
9.3 Naive Bayes classification
9.4 Support vector machines
9.5 Case study: Classifying cells
9.6 Case study: Estimating annual precipitation

10. Unsupervised Learning

10.1 Introduction to unsupervised learning
10.2 K-means clustering
10.3 Hierarchical clustering
10.4 Detecting outliers using DBSCAN
10.5 Analyzing factors
10.6 Analyzing factors using PCA
10.7 Case study: Travel reviews
10.8 Case study: Cardiovascular health

11. Decision Trees

11.1 Introduction to decision trees
11.2 Regression trees
11.3 Classification trees
11.4 Random forests
11.5 Case study: Marijuana legalization

12. Artificial Neural Networks

12.1 Introduction to artificial neural networks
12.2 Single-layer perceptron
12.3 Nonlinear activation functions
12.4 Multilayer perceptron
12.5 Case study: Bike share demand

13. Ensemble Techniques

13.1 Introduction to ensemble models
13.2 Boosting
13.3 Bagging
13.4 Stacking
13.5 Case study: Bob Ross

14. Artificial Intelligence

14.1 Artificial intelligence
14.2 Machine learning
14.3 Computer vision
14.4 Natural language processing
14.5 Risks and ethics in AI
14.6 Generative AI
14.7 Prompt engineering

15. Appendix

15.1 Datasets: CSV files

Six Steps to Building Confidence in R

This zyBooks guide presents six steps to help your students get comfortable with this important programming language.

How to Teach Data Science

These best practices help students master data science, and you can put them into action in your own classroom right away.

AI in the Classroom

Stay informed with our comprehensive AI in education resources and expert insights.

Teach data science with R using the only interactive introduction that’s fully integrated with Jupyter Notebooks

Data Science Foundations with R is the first complete, interactive introduction to the foundational algorithms and techniques for R in data science.

The “Concepts, then computing” approach builds and reinforces student learning before introducing R examples
Embedded Jupyter Notebooks give students real-world practice programming in R
Includes data preprocessing, regression techniques, supervised and unsupervised learning algorithms, decision trees, neural networks, ensemble methods, model evaluation techniques, and artificial intelligence
Continuously updated with the latest advances in data science
Case studies illustrate the data science lifecycle from start to finish with real data
Adopters have access to a test bank with over 350 questions
zyLabs users can add their own Jupyter Notebooks via custom content
The new AI chapter includes essential artificial intelligence concepts, an overview of current AI applications, participation activities, and challenge activities

“Data science is interactive; it requires coding and live investigations of data sets. To do all that within a digital zyBook is really powerful.”
– Co-author Dr. Aimee Schwab-McCoy

What is a zyBook?

Data Science Foundations with R is an interactive learning solution that helps students visualize concepts, enabling them to learn more effectively than with a traditional textbook. Check out our research.

zyBooks benefits students and instructors:

Instructor benefits
Customize your course by reorganizing existing content or adding your own
Continuous publication model updates your course with the latest content and technologies
Gain insight into students’ progress, reading and participation with robust reporting
Save time with auto-graded labs and challenge activities that seamlessly integrate with your LMS gradebook
Build quizzes and exams with over 300 included test questions

Student benefits
Learning questions and other content serve as an interactive form of reading
Instant feedback on labs and homework
Concepts come to life through extensive animations embedded into the interactive content
Save chapters as PDFs to reference the material at any time
Gain real-life, professional experience working with industry-standard Jupyter Notebooks

Embedded Jupyter Notebooks

The Data Science Foundations with R zyBook is fully integrated with Jupyter Notebooks, the industry-standard web-based computing platform. Students gain real-life experience with a professional application: writing and editing live code, creating data visualizations, and changing the parameters of different models to evaluate their performance.

Jupyter Notebooks can also be downloaded for offline use.

Authors

Aimee Schwab-McCoy
Senior Manager, Content Development, Data Science, Mathematics and Statistics / PhD in Statistics, University of Nebraska–Lincoln

Chris Chan
MA in Mathematics, San Francisco State University

Matt Rissler
PhD in Mathematics, University of Notre Dame
