Databases and Data Science Practicum
Department of Applied Statistics, Social Science, and Humanities
APSTA-GE 2017
Spring 2026
2 units
Course description
This course provides a hands-on introduction to extracting, transforming, and visualizing data using real-world datasets.
Students learn to query databases and join datasets using SQL, and to summarize, visualize, and map data in R using the tidyverse.
Students also gain experience with git, enhancing their ability to work in modern collaborative environments.
Alongside these modules, the course continually emphasizes practicing how to be a curious, skeptical, and articulate data scientist.
Student learning outcomes
Upon completion of the course, students will be able to:
- Perform file management functions on the command line
- Use git to pull code from and push code to GitHub
- Conduct SQL queries on a database server, including joins
- Transform and summarize data in the tidyverse in R
- Use ggplot to make data visualizations
Prerequisites
APSTA-GE 2352 (Practicum in Applied Statistics: Statistical Computing) or equivalent.
Course schedule
Introduction
- Week 1:
- Tue., Jan. 20: Course syllabus + motivation
- Fri., Jan. 23: Assignment 0 due at 11:59pm ET
Command line interfaces
- Week 2:
- Tue., Jan. 27: Introduction + syntax
- Week 3:
- Tue., Feb. 3: Git and GitHub
- Fri., Feb. 6: Assignment 1 due at 11:59pm ET
Structured Query Language (SQL)
- Week 4:
- Tue., Feb. 10: CLI wrap-up / SQL introduction
- (Optional) NYU Libraries’ “Love Data Week 2026” events, including:
- Mon., Feb. 9, 1–3pm: NYC Open Data, Scavenger Hunt, and Intro to Python and APIs
- Wed., Feb. 11, 12–1:30pm: Careers in Data Alumni Panel
- — Week off for President’s Day —
- Week 5:
- Tue., Feb. 24: Syntax + summaries
- Week 6:
- Tue., Mar. 3: Joins
- Fri., Mar. 6: Assignment 2 due at 11:59pm ET
Tidyverse
- Week 7:
- Tue., Mar. 10: Introduction + syntax
- — Week off for Spring Break —
- Week 8:
- Tue., Mar. 24: Messy data
- Week 9:
- Tue., Mar. 31: Pivoting + summarization
- Week 10:
- Tue., Apr. 7: Functions + programming
- Fri., Apr. 10: Assignment 3 due at 11:59pm ET
Visualization
- Week 11:
- Tue., Apr. 14: Introduction + grammar of graphics
- Week 12:
- Tue., Apr. 21: Best practices with visualization
- Week 13:
- Tue., Apr. 28: Geospatial mapping
- Wed., Apr. 29: Assignment 4 due at 11:59pm ET
Final project
- Week 14:
- Tue., May 5: No class: Office hours for final projects
- Week 15:
- Sun., May 10: Final project due at 11:59pm ET