Lab 4: Writing Regression Models
R scripts
- Download the student practice script (right-click → Save As)
- Download the instructor script with extensions (right-click → Save As)
Last week asked you to learn R and causal inference at the same time. That is a heavy lift. This week slows down and focuses on one skill: writing regression models in R and seeing how the results change when you change the formula.
Start with the student practice script. It is shorter, repeats the same workflow, and gives you clear places to edit the model. The instructor script comes second. It adds extra annotation, more examples, and an optional extension that returns to the Week 4 question about samples and populations.
What you will learn
- How to write a simple regression formula in R.
- How adding a second predictor can change a coefficient.
- How an interaction changes fitted lines.
- How to rerun a model after changing the formula.
- Optional: how factor terms, curved relationships, and sample-to-population differences extend the same ideas.
Packages
The student script uses tidyverse. The instructor script also uses parameters.
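# Packages the student script needs.
required_packages <- c("tidyverse")

# Keep only the packages that are not installed yet.
missing_packages <- required_packages[
  !vapply(required_packages, \(pkg) requireNamespace(pkg, quietly = TRUE), logical(1))
]

# Install anything that is missing, then load tidyverse.
if (length(missing_packages) > 0) {
  install.packages(missing_packages)
}

library(tidyverse)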
How to use this lab
- Open the student practice script first.
- Run one exercise at a time.
- Change only the formula after ~.
- Rerun the model and the lines immediately below it.
- Write down what changed before moving to the next exercise.
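The edit-and-rerun loop described above can be sketched with a small simulated data set. This is only an illustration: the data frame lab_data and its numbers are invented here and are not the data used in the lab scripts. Only the variable names exam_score and study_hours come from the exercises.

```r
# Hypothetical data, made up for this sketch only.
set.seed(1)
n <- 200
study_hours <- runif(n, min = 0, max = 10)
exam_score <- 50 + 3 * study_hours + rnorm(n, sd = 5)
lab_data <- data.frame(exam_score, study_hours)

# Run 1: fit the starting formula and look at the coefficients.
fit <- lm(exam_score ~ study_hours, data = lab_data)
print(coef(fit))

# Run 2: edit only the part after ~, then rerun the model line and the
# lines below it so the printed results belong to the new formula.
fit <- lm(exam_score ~ 1, data = lab_data)
print(coef(fit))
```

Reusing the name fit mirrors rerunning the same lines in a script. If you want to compare two models side by side, give each one its own name instead.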
Student practice script
The student script has three core exercises.
- One predictor. Fit exam_score ~ study_hours, then change it to exam_score ~ 1 and see what the fitted line becomes.
- Add a second predictor. Start with exam_score ~ study_hours, then add motivation and see how the coefficient for study_hours changes.
- Add an interaction. Start with exam_score ~ study_hours + workshop, then change it to exam_score ~ study_hours * workshop and compare the fitted lines.
The aim is not to memorise syntax. The aim is to notice what each change in the formula does.
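As a rough sketch of exercises 2 and 3, the code below fits the formulas named above to simulated data. The data-generating numbers are assumptions made for this example, so the direction and size of the changes you see here may differ from what the student script produces.

```r
# Hypothetical simulation: all numbers below are assumptions for illustration.
set.seed(42)
n <- 500
motivation <- rnorm(n)
study_hours <- 5 + 1.5 * motivation + rnorm(n)   # motivated students study more
workshop <- rbinom(n, size = 1, prob = 0.5)      # 0 = no workshop, 1 = workshop
exam_score <- 40 + 2 * study_hours + 4 * motivation +
  3 * workshop + 1.5 * study_hours * workshop + rnorm(n, sd = 4)
sim <- data.frame(exam_score, study_hours, motivation, workshop)

# Exercise 2: the same coefficient, with and without motivation in the formula.
m_one <- lm(exam_score ~ study_hours, data = sim)
m_two <- lm(exam_score ~ study_hours + motivation, data = sim)
coef_compare <- c(
  without_motivation = unname(coef(m_one)["study_hours"]),
  with_motivation    = unname(coef(m_two)["study_hours"])
)
print(coef_compare)

# Exercise 3: + gives one shared slope, * lets the slope differ by group.
m_add <- lm(exam_score ~ study_hours + workshop, data = sim)
m_int <- lm(exam_score ~ study_hours * workshop, data = sim)

b_add <- coef(m_add)
b_int <- coef(m_int)

# Slope of study_hours within each workshop group under each model.
slopes_add <- c(
  no_workshop = unname(b_add["study_hours"]),
  workshop    = unname(b_add["study_hours"])
)
slopes_int <- c(
  no_workshop = unname(b_int["study_hours"]),
  workshop    = unname(b_int["study_hours"] + b_int["study_hours:workshop"])
)
print(rbind(additive = slopes_add, interaction = slopes_int))
```

Printing the group-specific slopes is one way to compare fitted lines without a plot; the lab scripts may compare them differently.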
Instructor script with extensions
Use the instructor script after you have worked through the student version.
It includes:
- More annotation around the simulation code.
- Extra exercises with a factor predictor and a curved relationship.
- Cleaner comparison tables.
- An optional extension on sample versus population estimands.
That final section reconnects the lab to the Week 4 theme. It is useful, but it is not the place to start if you are still getting comfortable with R syntax.
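For orientation, here is one common way to write a factor predictor and a curved relationship in an R formula. It is a sketch under assumptions: the variable year_group, the simulated numbers, and the choice of a squared term for the curve are all invented for this example, and the instructor script may set these up differently.

```r
# Hypothetical data: year_group and all numbers are assumptions for this sketch.
set.seed(7)
n <- 300
study_hours <- runif(n, min = 0, max = 10)
year_group <- sample(c("first", "second", "third"), n, replace = TRUE)
year_effect <- c(first = 0, second = 3, third = 6)
exam_score <- 45 + 6 * study_hours - 0.4 * study_hours^2 +
  unname(year_effect[year_group]) + rnorm(n, sd = 4)
sim <- data.frame(exam_score, study_hours, year_group)

# Factor predictor: factor() makes lm() estimate one coefficient for each
# level other than the reference level (here "first", the first alphabetically).
m_factor <- lm(exam_score ~ study_hours + factor(year_group), data = sim)
print(names(coef(m_factor)))

# Curved relationship: I() protects ^ so it means arithmetic squaring
# rather than formula shorthand.
m_curve <- lm(exam_score ~ study_hours + I(study_hours^2), data = sim)
print(coef(m_curve))
```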
Questions to answer
- In exercise 1, what happens to the fitted line when you change exam_score ~ study_hours to exam_score ~ 1?
- In exercise 2, does the coefficient for study_hours get larger or smaller after adding motivation?
- In exercise 3, what changes when you replace + with * in the formula?
- Which formula felt easiest to interpret, and which felt hardest?
Optional extension
If you finish early, open the instructor script and run the optional section at the end. In one short paragraph, explain why the conditional coefficients can look similar even when the average treatment effect changes across populations.
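If you want something to experiment with before writing your paragraph, the sketch below simulates two populations and fits the same interaction model in each. Everything in it is an assumption made for illustration: the variable high_motivation, the helper functions simulate_population() and average_effect(), and the numbers are not taken from the instructor script.

```r
# Hypothetical setup: two populations that differ only in how common
# high_motivation is. All names and numbers here are assumptions.
set.seed(2024)

simulate_population <- function(n, share_high_motivation) {
  high_motivation <- rbinom(n, size = 1, prob = share_high_motivation)
  workshop <- rbinom(n, size = 1, prob = 0.5)   # randomly assigned treatment
  exam_score <- 50 + 2 * workshop + 5 * high_motivation +
    6 * workshop * high_motivation + rnorm(n, sd = 5)
  data.frame(exam_score, workshop, high_motivation)
}

pop_a <- simulate_population(20000, share_high_motivation = 0.2)
pop_b <- simulate_population(20000, share_high_motivation = 0.8)

# Same formula in both populations.
fit_a <- lm(exam_score ~ workshop * high_motivation, data = pop_a)
fit_b <- lm(exam_score ~ workshop * high_motivation, data = pop_b)

# Conditional coefficients from each fit, side by side.
print(round(rbind(pop_a = coef(fit_a), pop_b = coef(fit_b)), 2))

# Average effect of workshop implied by a fit: the conditional effect,
# averaged over that population's own mix of high_motivation.
average_effect <- function(fit, data) {
  b <- coef(fit)
  mean(b[["workshop"]] + b[["workshop:high_motivation"]] * data$high_motivation)
}

ate <- c(
  pop_a = average_effect(fit_a, pop_a),
  pop_b = average_effect(fit_b, pop_b)
)
print(round(ate, 2))
```

Try changing share_high_motivation and rerunning to see which of the two printed summaries moves.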