Data Science for Ecology & Conservation

The course provides a practical foundation for data-driven inference and prediction in biodiversity science. The first part of the course builds a unified foundation for statistical analysis in ecology and aims to complement more general introductory statistics courses. Specific foci include an introduction to and practice with General Linear Models (GLMs) and their use for hypothesis testing, prediction, and forecasting, as well as maximum likelihood and non-parametric approaches, all applied to biodiversity. The course then examines the promise and practice of emerging machine learning approaches, specifically for prediction. With these foundations in place, the course tackles several case studies addressing biogeography, community ecology, local conservation, and large-scale conservation decision-making. Students will be expected to work through case studies individually or in small groups and present their findings in class. The course will familiarize students with R, the leading analysis and visualization software in much of the life sciences, through in-class guidance, homework, and problem sets.

The course assumes that students are familiar with the core concepts of probability and statistical analysis and have, at minimum, completed an introductory statistics course such as S&DS 100 or a similar course, by approval.

1 credit for Yale College students
Course Number: EEB 2212
Professor (Faculty Member):
Day / time: W 1:30pm-3:20pm
Course Type: Undergraduate
Course term: Spring
Year: 2026