Ultra-Lightweight Early Prediction of At-Risk Students in CS1


Early prediction of students at risk of doing poorly in CS1 can
enable early interventions or class adjustments. Preferably,
prediction methods would be lightweight, not requiring much extra
activity or data-collection work from instructors beyond what they
already do. Previous methods included giving surveys, collecting
(potentially sensitive) demographic data, introducing clicker
questions into lectures, or using locally-developed systems that
analyze programming behavior, each requiring some effort by
instructors. Today, a widely used textbook / learning system in CS1
classes is zyBooks, used by several hundred thousand students
annually. The system automatically collects data related to reading,
homework, and programming assignments. For a 300+ student CS1
class, we found that three data metrics, auto-collected by that
system in early weeks (1-4), were good at predicting performance
on the week-6 midterm exam: non-earnest completion of the
assigned readings, struggle on the coding homework, and low
scores on the programming assignments, with correlation
magnitudes of 0.44, 0.58, and 0.72, respectively. We combined those
metrics in a decision tree model to predict students at risk of failing
the midterm exam (<70%, meaning D or F), and achieved 85%
prediction accuracy with 82% sensitivity and 89% specificity, which
is higher than that of previously published early-prediction approaches.
The approach may mean that thousands of instructors already using
zyBooks or a similar system can get a more accurate early
prediction of at-risk students, without requiring extra effort or
activities, and avoiding collection of sensitive demographic data.
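
As a rough illustration of how three early-week metrics could feed a small decision tree, and of how accuracy, sensitivity, and specificity are computed from its predictions, here is a minimal sketch. Everything specific in it is an assumption made for illustration: the field names, the 0-1 struggle scale, the split thresholds, and the tree shape are hypothetical, not the model or values from the study, and the sample students are made up.

```python
from dataclasses import dataclass


@dataclass
class EarlyMetrics:
    """Hypothetical per-student metrics from weeks 1-4 (names are assumptions)."""
    reading_earnest_pct: float    # % of assigned reading completed earnestly (0-100)
    homework_struggle: float      # assumed struggle score on coding homework (0-1)
    programming_score_pct: float  # average programming-assignment score (0-100)


def predict_at_risk(m: EarlyMetrics) -> bool:
    """Hand-written toy tree; every threshold here is illustrative, not fitted."""
    if m.programming_score_pct < 70:
        return True
    if m.homework_struggle > 0.5:
        return m.reading_earnest_pct < 80
    return False


def evaluate(predicted: list[bool], actual: list[bool]) -> dict[str, float]:
    """Compute accuracy, sensitivity, and specificity; 'True' means at risk."""
    pairs = list(zip(predicted, actual))
    tp = sum(1 for p, a in pairs if p and a)
    tn = sum(1 for p, a in pairs if not p and not a)
    fp = sum(1 for p, a in pairs if p and not a)
    fn = sum(1 for p, a in pairs if not p and a)

    def ratio(num: int, den: int) -> float:
        # Return 0.0 rather than dividing by zero when a class is absent.
        return num / den if den else 0.0

    return {
        "accuracy": ratio(tp + tn, len(pairs)),
        "sensitivity": ratio(tp, tp + fn),  # at-risk students correctly flagged
        "specificity": ratio(tn, tn + fp),  # not-at-risk students correctly cleared
    }


# Made-up students paired with whether they scored below 70% on the midterm.
sample = [
    (EarlyMetrics(95, 0.1, 92), False),
    (EarlyMetrics(60, 0.8, 85), True),
    (EarlyMetrics(90, 0.7, 88), False),
    (EarlyMetrics(85, 0.3, 55), True),
    (EarlyMetrics(92, 0.2, 75), True),
    (EarlyMetrics(70, 0.4, 65), False),
]

predictions = [predict_at_risk(m) for m, _ in sample]
outcomes = [at_risk for _, at_risk in sample]
print(evaluate(predictions, outcomes))
```

In practice the tree would be fitted to a class's own data rather than written by hand. The sketch only shows the shape of the inputs and how the three reported measures relate to true and false positives and negatives.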
