Python or R for Data Science?

Dr. Aimee Schwab-McCoy

A question we hear from instructors is: how do you choose the right language for your data science courses?

The two primary languages used in data science are Python and R. In fact, zyBooks publishes two programming versions of our foundational introduction to data science – one for each language. (And another one without coding.)

To give you some perspective, here’s a quick chart comparing Python to R:

| Python | R |
| --- | --- |
| General-purpose language widely used in industry and intro CS courses | Used by statisticians, data analysts, and research scientists |
| Very flexible for multiple applications | Specifically written for data and statistical analysis, displaying graphics, and statistical modeling |
| Open source and supports object-oriented programming | Open source and supports object-oriented programming |
| Commonly used data science packages are pandas, seaborn, and scikit-learn | Commonly used data science packages are tidyr, ggplot2, and dplyr |
| Cleaner syntax; students find Python easier to learn | More difficult syntax, but commonly used data science packages within the tidyverse ecosystem are designed to work together |
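
To make the chart a little more concrete, here is a minimal sketch of a grouped summary in Python using pandas. The data and column names are made up for illustration, and the R line in the comment is only a rough dplyr counterpart, included to give a feel for how the two styles compare.

```python
import pandas as pd

# Hypothetical toy data: exam scores for two course sections.
scores = pd.DataFrame({
    "section": ["A", "A", "B", "B"],
    "score": [82, 90, 75, 85],
})

# Mean score per section.
# A rough dplyr counterpart in R might look like:
#   scores %>% group_by(section) %>% summarise(mean_score = mean(score))
mean_by_section = scores.groupby("section")["score"].mean()
print(mean_by_section)
```

Both versions read as "group, then summarize," so the choice between them may come down to what your students already know.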

In this short video, data science professor and zyBooks co-author Dr. Aimee Schwab-McCoy walks you through how to pick the right language for your students: 

Author Bio

Dr. Aimee Schwab-McCoy

Before joining zyBooks, Aimee was a statistics professor at Creighton University, where she created a Data Science program. Aimee is an experienced statistics and data science education researcher and is passionate about developing engaging resources for data science learners.