Generative AI for Social Science

Department of Applied Statistics, Social Science, and Humanities
APSTA-GE 2048

Spring 2026
3 units

Course description

This course provides a hands-on introduction to using generative AI for social science research and impact. Students use generative AI to classify text corpora, parse unstructured documents, create synthetic data and simulations, and code with agents. The course also reviews ethical questions that arise with the use of generative AI in social science research and impact applications. Assignments focus on using generative AI to address social and policy questions with real-world datasets.

Student learning outcomes

Upon completion of the course, students will be able to:

  • Label unstructured text using a large language model (LLM)
  • Simulate social science experiments by generating data from LLMs
  • Validate AI-based research methods, such as LLM-based labeling
  • Use agentic AI to create software prototypes
  • Describe the main limitations of using generative AI in social science research

Prerequisites

Prior coursework in statistical computing and working knowledge of R for data analysis (equivalent to both APSTA-GE 2352 Practicum in Applied Statistics: Statistical Computing and APSTA-GE 2017 Databases and Data Science Practicum). Familiarity with Python is a plus, but not required.

Course schedule

Introduction

  • Week 1:
    • Fri., Jan. 23: Course motivation + syllabus
    • Asynchronous Instruction:
      • Software Carpentry: The Unix Shell
        (~3hrs, including exercises)
      • “Missing Semester” lecture: The Shell
        (~1hr, including exercises)

Language modeling

  • Week 2:
    • Wed., Jan. 28: Assignment 0 due at 11:59pm ET
    • Fri., Jan. 30: Small language models
    • Asynchronous Instruction: Git Essential Training
      (~2hrs, including exercises)
  • Week 3:
    • Fri., Feb. 6: Large language models
    • Asynchronous Instruction: Introduction to Python
      (~5hrs, including exercises)

Using LLMs

  • Week 4:
    • Fri., Feb. 13: LLM APIs
    • Asynchronous Instruction: Programming Foundations: APIs and Web Services
      (~3hrs, including exercises)
  • Week 5:
    • Fri., Feb. 20: Prompt engineering
  • Week 6:
    • Fri., Feb. 27: Information retrieval and RAG
    • Sat., Feb. 28: Assignment 1 due at 11:59pm ET

Evaluating LLMs

  • Week 7:
    • Fri., Mar. 6: Benchmarks and evaluations
  • Week 8:
    • Fri., Mar. 13: LLM-as-a-judge and consistency
  • -- Week off for Spring Break --

Synthetic data / simulation

  • Week 9:
    • Fri., Mar. 27: Simulations of human responses
  • Week 10:
    • Wed., Apr. 1: Assignment 2 due at 11:59pm ET
    • Fri., Apr. 3: Simulations of human experimental interactions

Agentic programming

  • Week 11:
    • Fri., Apr. 10: Building interfaces
  • Week 12:
    • Fri., Apr. 17: Deploying software

Course wrap-up

  • Week 13:
    • Fri., Apr. 24: Images, audio, and video
  • Week 14:
    • Fri., May 1: No class: Office hours for final projects
    • Fri., May 1: Assignment 3 due at 11:59pm ET
  • Week 15:
    • Mon., May 11: Final project due at 11:59pm ET