Causal Inference: Average Treatment Effects

Author
Affiliation

Joseph Bulbulia

Victoria University of Wellington, New Zealand

Published

March 25, 2025

Note

Required - (Hernan and Robins 2024) Chapters 1-3 link

Optional - (Neal 2020) Chapter 1-2 link

Neal, Brady. 2020. “Introduction to Causal Inference from a Machine Learning Perspective.” Course Lecture Notes (Draft). https://www.bradyneal.com/Introduction_to_Causal_Inference-Dec17_2020-Neal.pdf.
Hernan, M. A., and J. M. Robins. 2024. Causal Inference: What If? Chapman & Hall/CRC Monographs on Statistics & Applied Probability. Taylor & Francis. https://www.hsph.harvard.edu/miguel-hernan/causal-inference-book/.
Key concepts for the test(s):
  • The Fundamental Problem of Causal Inference
  • Causal Inference in Randomised Experiments
  • Causal Inference in Observational Studies - Average (Marginal) Treatment Effects
  • Three Fundamental Assumptions for Causal Inference
For the lab, copy and paste code chunks following the “LAB 6” section.

Seminar

Learning Outcomes

  • You will understand why causation is never directly observed.

  • You will understand how experiments address this “causal gap.”

  • You will understand how applying three principles from experimental research allows scientists to close this “causal gap” when making inferences about a population as a whole – that is, inferences about “marginal effects.”

Opening

Robert Frost writes,

Two roads diverged in a yellow wood, And sorry I could not travel both And be one traveler, long I stood And looked down one as far as I could To where it bent in the undergrowth;

Then took the other, as just as fair, And having perhaps the better claim, Because it was grassy and wanted wear; Though as for that the passing there Had worn them really about the same,

And both that morning equally lay In leaves no step had trodden black. Oh, I kept the first for another day! Yet knowing how way leads on to way, I doubted if I should ever come back.

I shall be telling this with a sigh Somewhere ages and ages hence: Two roads diverged in a wood, and I— I took the one less traveled by, And that has made all the difference. – The Road Not Taken

Introduction: Motivating Example

Consider the following cross-cultural question:

Does bilingualism improve cognitive abilities in children?

There is evidence that bilingual children perform better at cognitive tasks, but does learning more than one language cause this advantage, or is the association explained by confounding factors?

How can we know? Each child might respond as the traveller in Frost’s poem:

“And sorry I could not travel both And be one traveller…”

Part 1: The Fundamental Problem of Causal Inference as a Missing Data Problem

Humans think in images, words, and songs – sometimes dance. However, a little mathematical notation helps to clarify the key concepts in causal inference.

To understand the fundamental problem of causal inference, we first define two potential outcomes for each individual in our study:

Let Y and A denote random variables.

Mathematically, we formulate a causal question by asking whether assignment to treatment A = a leads to a difference in the outcome Y.

Let A = 1 denote receiving the “bilingual” treatment, and A = 0 denote receiving the “monolingual” treatment. Assume these are the only two treatments of interest.

  • Y_i(a = 1): the cognitive ability of child i if they were bilingual. This is the counterfactual outcome under treatment A = 1.

  • Y_i(a = 0): the cognitive ability of child i if they were monolingual. This is the counterfactual outcome under treatment A = 0.

Using this notation, we may define a quantitative causal effect of bilingualism on cognitive ability for individual i as the difference between these potential outcomes:

\text{Causal Effect}_i = Y_i(1) - Y_i(0)

We say there is a causal effect if:

Y_i(1) - Y_i(0) \neq 0

Writing out our contrast of interest this way makes it clear that, because individuals experience only one treatment condition, we cannot compute this contrast directly from any data we may observe. We call the missing observation “counterfactual.”

Y_i|A_i = 1 \implies Y_i(0)|A_i = 1~ \text{is counterfactual}

Furthermore:

Y_i|A_i = 0 \implies Y_i(1)|A_i = 0~ \text{is counterfactual}

And sorry I could not travel both And be one traveller, long I stood …
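To make this missing-data structure concrete, here is a minimal R sketch (the data and variable names, such as y1, y0, and a, are illustrative assumptions, not the lab data). We simulate both potential outcomes for every child, then reveal only the outcome corresponding to the treatment actually received:

# minimal sketch: simulate both potential outcomes, then reveal only one per child
set.seed(2025)
n <- 10

y0 <- rnorm(n, mean = 100, sd = 10)  # cognitive score if monolingual, Y(0)
y1 <- y0 + 5                         # cognitive score if bilingual, Y(1); true individual effect = 5
a  <- rbinom(n, 1, 0.5)              # treatment actually received

y_obs <- ifelse(a == 1, y1, y0)      # we observe only one potential outcome per child

# the counterfactual column is always missing, so y1 - y0 cannot be computed from observed data
data.frame(
  a,
  y_obs,
  y1_if_observed = ifelse(a == 1, y1, NA),
  y0_if_observed = ifelse(a == 0, y0, NA)
)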

Average Treatment Effects in randomised controlled experiments rest on assumptions

Consider inherently missing observations in an experiment:

\text{ATE} = \left[ \begin{aligned} &\left( \underbrace{\mathbb{E}[Y(1)|A = 1]}_{\text{observed}}\Pr(A=1) + \textcolor{red}{\underbrace{\mathbb{E}[Y(1)|A = 0]}_{\text{unobserved}}}\Pr(A=0) \right) \\ &- \left( \underbrace{\mathbb{E}[Y(0)|A = 0]}_{\text{observed}}\Pr(A=0) + \textcolor{red}{\underbrace{\mathbb{E}[Y(0)|A = 1]}_{\text{unobserved}}}\Pr(A=1) \right) \end{aligned} \right]

In the expression above, \mathbb{E}[Y(1)|A = 1] represents the average outcome among those who received the treatment, which is observable. However, \mathbb{E}[Y(1)|A = 0] represents the average outcome if the treatment had been given to those who were in fact untreated, which remains unobservable. Similarly, the quantity \mathbb{E}[Y(0)|A = 1] remains unobservable. Note that the problem is not merely one of statistical analysis of the data we have: the data required to identify individual causal effects are missing. Thus, the fundamental problem of causal inference is an ever-present concern even in experiments!

All the data we require for causal inference at the individual level are missing. Nevertheless, because the treatments have been randomised, the data we observe from randomised controlled experiments allow us to infer what would happen, on average, if the treatment were applied to the population from which the participants were sampled.

We call this causal effect the “Average Treatment Effect” or, equivalently, the “Marginal Effect of the Treatment.”
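As an illustration (again with simulated data rather than the lab data), the sketch below shows that, under randomisation, the simple difference in observed group means recovers the true average treatment effect even though every individual causal effect remains missing:

# minimal sketch: with randomised treatment, the difference in group means estimates the ATE
set.seed(2025)
n <- 1e5

y0 <- rnorm(n, mean = 100, sd = 10)    # potential outcome under control
y1 <- y0 + rnorm(n, mean = 5, sd = 2)  # heterogeneous individual effects; true ATE = 5
a  <- rbinom(n, 1, 0.5)                # randomised treatment assignment
y  <- ifelse(a == 1, y1, y0)           # observed outcome

mean(y1 - y0)                          # true ATE (knowable only because we simulated both outcomes)
mean(y[a == 1]) - mean(y[a == 0])      # observed difference in means, approximately the true ATE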

How do we obtain Average Treatment Effects from non-randomised observational data?

Under stronger assumptions, we may sometimes infer causal effects from associations in data, even if the treatment variable A has not been randomised.

\text{ATE} = \sum_{l} \left( \mathbb{E}[Y|A=1, \textcolor{blue}{L=l}] - \mathbb{E}[Y|A=0, \textcolor{blue}{L=l}] \right) \times \textcolor{blue}{\Pr(L=l)}

Here L denotes a set of measured covariates. We can express this principle of “no unmeasured confounding” mathematically in two complementary ways. Recall our notation: A \coprod B signifies that A is independent of B, and vice versa:

  1. Potential Outcomes Independent of Treatment (given L): Y(a) \coprod A \mid L
  2. Treatment Assignment Independent of Potential Outcomes (given L): A \coprod Y(a) \mid L

These formulations are crucial when working with causal diagrams, which visually encode these principles. The key idea is straightforward: ensuring a balance of confounders across treatment groups is fundamental to experimental and observational causal inference strategies.

With sufficiently large numbers of individuals, randomisation facilitates this balance, achieving A \coprod Y(a) without the need to condition on L.
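The standardisation formula above can be computed directly. The following sketch, with a single simulated binary confounder l (all names and the data-generating process are illustrative assumptions), shows the naive contrast being biased and the standardised contrast recovering the true effect:

# minimal sketch: standardisation (the g-formula) with one binary measured confounder
set.seed(2025)
n <- 1e5

l <- rbinom(n, 1, 0.4)                 # measured confounder L
a <- rbinom(n, 1, plogis(-1 + 2 * l))  # treatment is more likely when L = 1
y <- rnorm(n, mean = 1 * a + 3 * l)    # outcome depends on A and L; true effect of A = 1

# naive contrast, confounded by L
mean(y[a == 1]) - mean(y[a == 0])

# standardise: stratum-specific contrasts weighted by Pr(L = l); close to the true effect of 1
sum(sapply(c(0, 1), function(s) {
  (mean(y[a == 1 & l == s]) - mean(y[a == 0 & l == s])) * mean(l == s)
}))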

The Three Fundamental Assumptions of Causal Inference

Reviewing causal inference in experimental settings highlights three core assumptions essential for causal analysis.

Fundamental Assumption 1: Conditional Exchangeability

We say that conditional exchangeability holds if the potential outcomes are independent of treatment assignment, conditional on the measured confounders. This principle enables us to attribute observed group differences directly to the treatment. Randomisation provides unconditional exchangeability, simplifying the analytical process.

Many experiments, however, cannot be performed: we cannot, for example, randomise children to a bilingual childhood.

Challenges in Satisfying Conditional Exchangeability in Observational Settings

Achieving conditional exchangeability is challenging in observational studies. This condition requires the groups being compared to be similar in every respect except for the treatment. Consider bilingualism: in real-world data, bilingual children may differ from monolingual children in several ways:

  • Socioeconomic status: the economic capacity of families often shapes opportunities for education and language learning, thereby affecting cognitive development.
  • Age demographics: language development is not even throughout childhood; when a child learns a second language may make a difference.
  • Lifestyle choices: children who learn a second language may also differ in other habits and activities that affect cognitive development.
  • Personal values and social connections: family values and community ties may influence both whether a child is raised bilingually and the child’s wider cognitive environment.

These and other unmeasured factors can introduce biases, complicating the interpretation of causal relationships in observational studies.

Fundamental Assumption 2: Causal Consistency

We say that causal consistency holds if there is no heterogeneity in the treatments that would prevent us from assuming that observed outcomes under a treatment correspond to the potential outcomes under that treatment. For an individual i, we must be able to assume:

\begin{aligned} Y_{i}(1) &= (Y_{i}|A_{i} = 1) \quad \text{(Potential outcome if treated)} \\ Y_{i}(0) &= (Y_{i}|A_{i} = 0) \quad \text{(Potential outcome if untreated)} \end{aligned}

If this assumption holds, as well as the assumptions of conditional exchangeability and positivity (reviewed below), we can calculate the Average Treatment Effect (ATE) from observed data as:

\begin{aligned} \text{ATE} &= \mathbb{E}[Y(1)] - \mathbb{E}[Y(0)] \\ &= \mathbb{E}(Y|A=1) - \mathbb{E}(Y|A=0) \end{aligned}

This contrast assumes that the potential outcome under treatment is observable when the treatment is administered, setting Y_i(a) to Y_i|A_i=a.
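To see how these assumptions work together, consider the standard identification argument for \mathbb{E}[Y(a)]; each step is licensed by one assumption, and positivity guarantees that the conditional expectations in the final line are defined within every stratum of L:

\begin{aligned} \mathbb{E}[Y(a)] &= \sum_{l} \mathbb{E}[Y(a)|L=l]\Pr(L=l) && \text{(law of total expectation)} \\ &= \sum_{l} \mathbb{E}[Y(a)|A=a, L=l]\Pr(L=l) && \text{(conditional exchangeability)} \\ &= \sum_{l} \mathbb{E}[Y|A=a, L=l]\Pr(L=l) && \text{(causal consistency)} \end{aligned}

Taking the difference of this expression at a = 1 and a = 0 recovers the standardisation formula for the ATE given earlier.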

Challenges in Satisfying the Causal Consistency Assumption in Observational Settings

The standardisation of treatments in randomised controlled experiments generally ensures the validity of the causal consistency assumption, which is seldom disputed. However, in observational settings, we cannot typically control the treatments that people receive. This fact imposes considerable challenges for satisfying this assumption. For example:

  • Cultures: the language one starts with, the language one learns next, and the order of learning may affect cognitive development.
  • Parents: the way in which one learns languages, whether at school or in the home environment, may affect cognitive development.
  • Languages: languages impose different demands; consider the demands of a sign language (body and brain), of a highly tonal language, of different writing systems, or the cognitive demands that arise in cultures with no written script…

Put another way, causal inference does not merely require randomisation; it also requires control over the treatments that are delivered.

Fundamental Assumption 3: Positivity

We must assume that there is a non-zero probability of receiving each treatment level within covariate-defined subgroups.

\Pr(A = a \mid L = l) > 0

This assumption is also met by the control that experimentalists exert over randomised controlled experiments and is rarely stated explicitly. However, in observational settings, this condition must be verified to avoid extrapolating results beyond observed data.
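In practice, a first check of positivity is descriptive: within each covariate stratum, confirm that both treatment levels actually occur. A minimal sketch with simulated data (the stratum labels and probabilities are illustrative assumptions):

# minimal sketch: a descriptive positivity check within covariate strata
set.seed(2025)
l <- sample(c("low_ses", "mid_ses", "high_ses"), size = 500, replace = TRUE)
a <- rbinom(500, 1, prob = ifelse(l == "low_ses", 0.02, 0.5))  # treatment nearly absent in one stratum

# cells at or near zero flag (near-)violations of positivity
table(l, a)

# estimated probability of each treatment level within each stratum
prop.table(table(l, a), margin = 1)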

Challenges in Satisfying The Positivity Assumption in Observational Settings: The Relevant Treatments Do Not Exist in The Data

Positivity demands that each individual has some possibility of experiencing every level of the treatment to be compared. However, real-world constraints, such as the availability of bilingual education within different cultural settings, may preclude some groups from accessing bilingual learning. Where the treatments of interest are absent or scarcely represented in our dataset, the resulting causal inferences will lack empirical support. Consequently, any coefficients derived will represent extrapolations from statistical models, challenging the validity of our causal inferences (Westreich and Cole 2010; Hernan and Robins 2023).

Westreich, Daniel, and Stephen R. Cole. 2010. “Invited commentary: positivity in practice.” American Journal of Epidemiology 171 (6). https://doi.org/10.1093/aje/kwp436.
Hernan, M. A., and J. M. Robins. 2023. Causal Inference. Chapman & Hall/CRC Monographs on Statistics & Applied Probability. Taylor & Francis. https://books.google.co.nz/books?id=_KnHIAAACAAJ.

Summary

We described the fundamental problem of causal inference, focusing on the critical distinction between correlation (associations in the data) and causation (a contrast between potential outcomes, at most one of which can be observed). We discussed how controlled experiments facilitate the estimation of average treatment effects (ATE) by systematically manipulating the variable of interest, which balances the distribution of variables that might affect the outcome across the treatment conditions. Our discussion then shifted to observational data, emphasising the challenges inherent in extracting causal relationships without the benefit of controlled interventions. We underscored the need to satisfy three key assumptions for inferring average treatment effects from observational data: conditional exchangeability, causal consistency, and positivity. These assumptions ensure, respectively, that the treatment groups are comparable, that observed outcomes under a treatment correspond to the potential outcomes under that treatment, and that every individual has a non-zero probability of receiving each treatment level.

A note on causal diagrams: we have seen that causal diagrams are graphical tools that offer a systematic approach to identifying and controlling for confounding variables; they help us understand the conditions under which the assumption of “no unmeasured confounding”, or equivalently of “balance in the confounders across treatments”, may be satisfied. Causal diagrams, however, do not obviate the need for these assumptions; instead, they provide a framework for evaluating whether and how the first of the three assumptions – conditional exchangeability – may be satisfied in a given study.
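As a pointer forward, the sketch below shows one way causal diagrams can be interrogated in R using the dagitty package (not used in this lab; the diagram and variable names are illustrative assumptions):

# minimal sketch, assuming the dagitty package is installed
library(dagitty)

# encode a simple diagram in which L confounds the effect of A on Y
dag <- dagitty("dag { L -> A; L -> Y; A -> Y }")

# ask which covariate sets satisfy the back-door criterion (conditional exchangeability)
adjustmentSets(dag, exposure = "A", outcome = "Y")
# expected adjustment set: { L }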

Part 2: Lab

Focus

  • Application of regression and simulation in R for ATE estimation

Setting up your R script

First, reinstall the margot package: https://go-bayes.github.io/margot/

# functions explained here: https://go-bayes.github.io/margot/

# installation
if (!require(devtools, quietly = TRUE)) {
  install.packages("devtools")
  library(devtools)
}

# reinstall the `margot` package with updates
# devtools::install_github("go-bayes/margot", quietly = TRUE)

# call package
library("margot")
library("tidyverse")
library("parameters")
library("skimr")
library("haven")
library("stdReg")
library('mice')
library("clarify")

# uncomment and check simulated data
# head(df_nz)

Download a copy of the data directory from the NZAVS OSF website

Find the data directory here: https://osf.io/75snb/

The variables in the simulated data df_nz correspond to a subset of the variables in this directory.

Check N

# check total n in data
# total nzavs participants
n_total <- skimr::n_unique(df_nz$id)
print(n_total)
[1] 20000

Get data into shape for analysis

#### Data wrangling: select columns and filter waves (we convert from long to wide in a later step)
library(tidyverse)

# filter the original dataset for these IDs three waves
df_nz <- as.data.frame(df_nz)
df_nz <- haven::zap_formats(df_nz)
df_nz <- haven::zap_label(df_nz)
df_nz <- haven::zap_widths(df_nz)

name_exposure <-  "perfectionism"

# obtain ids for individuals who participated in 2018 and have no missing baseline exposure
ids_2018 <- df_nz %>%
   dplyr::filter(year_measured == 1, wave == 2018) %>%
   dplyr::filter(!is.na(!!sym(name_exposure))) |> # criteria, no missing
  pull(id)

# obtain ids for individuals who participated in 2019
ids_2019 <- df_nz %>%
   dplyr::filter(year_measured == 1, wave == 2019) %>%
   dplyr::filter(!is.na(!!sym(name_exposure))) |> # criteria, no missing
  pull(id)

# intersect IDs from 2018 and 2019 to ensure participation in both years
ids_2018_2019 <- intersect(ids_2018, ids_2019)


# data wrangling
dat_long <- df_nz |>
  dplyr::filter(id %in% ids_2018_2019 &
                  wave %in% c(2018, 2019, 2020)) |>
  arrange(id, wave) |>
  select(
    "id",
    "wave",
    "year_measured",
    "age",
    "male",
    "born_nz",
    "eth_cat",
    #factor(EthCat, labels = c("Euro", "Maori", "Pacific", "Asian")),
    "employed",
    # Are you currently employed? (this includes self-employment or casual work)

    "edu",
    "kessler6_sum",
    # "gen_cohort",
    "household_inc",
    "partner",
    # 0 = no, 1 = yes
    "parent",
    # 0 = no, 1 = yes
    "political_conservative",
    "hours_exercise",
    "agreeableness",
    # Mini-IPIP6 Agreeableness (also modelled as empathy facet)
    # Sympathize with others' feelings.
    # Am not interested in other people's problems.
    # Feel others' emotions.
    # Am not really interested in others.
    "conscientiousness",
    # see mini ipip6
    # Get chores done right away.
    # Like order.
    # Make a mess of things.
    # Often forget to put things back in their proper place.
    "extraversion",
    # Mini-IPIP6 Extraversion
    # Am the life of the party.
    # Don't talk a lot.
    # Keep in the background.
    # Talk to a lot of different people at parties.
    "honesty_humility",
    # see mini ipip6
    # Would like to be seen driving around in a very expensive car.
    # Would get a lot of pleasure from owning expensive luxury goods.
    # Feel entitled to more of everything.
    # Deserve more things in life.
    "openness",
    # see mini ipip6
    # Have a vivid imagination.
    # Have difficulty understanding abstract ideas.
    # Do not have a good imagination.
    # Am not interested in abstract ideas.
    "neuroticism",
    # see mini ipip6
    # Have frequent mood swings.
    # Am relaxed most of the time.
    # Get upset easily.
    # Seldom feel blue.
    "modesty",
    # # see mini ipip6
    # # I want people to know that I am an important person of high status,
    # # I am an ordinary person who is no better than others.
    # # I wouldn’t want people to treat me as though I were superior to them.
    # # I think that I am entitled to more respect than the average person is
    #"w_gend_age_ethnic",
    "sample_weights",
    "neighbourhood_community",
    # #I feel a sense of community with others in my local neighbourhood.
    "belong",
    "rural_gch_2018_l",
    "support",
    # "support_help",
    # # 'There are people I can depend on to help me if I really need it.
    # "support_turnto",
    # # There is no one I can turn to for guidance in times of stress.
    # "support_rnoguidance",
    #There is no one I can turn to for guidance in times of stress.
    "perfectionism",
    "kessler6_sum"
  ) |>
  mutate(
    #initialize 'censored'
    censored = ifelse(lead(year_measured) == 1, 1, 0),
    
    # if 'censored' is NA (no later wave in the data) but the person was measured this wave, set 'censored' to 1
    censored =  ifelse(is.na(censored) &
                         year_measured == 1, 1, censored)
    
    # # Apply the case_when condition for setting 'censored' based on 'wave' and the dynamic column specified by 'nzavs_exposure'
    # censored = case_when(
    #   # Add this condition to keep previous modifications unless the specific condition is met!is.na(censored) ~ censored,
    #
    #   # Then check if 'wave' is 2019 and the specified exposure is NA, adjusting the condition to reflect the accurate logic
    #   wave == 2019 & !is.na(!!sym(nzavs_exposure)) ~ 1,
    #
    #   # Default case if none of the above apply; might not be necessary if all possibilities are covered
    #   TRUE ~ 0
    # )
  ) |>
  select(-year_measured) |>
  dplyr::mutate(
    household_inc_log = log(household_inc + 1),
    hours_exercise_log = log(hours_exercise + 1)  ) |>
  dplyr::select(
    -c(
      household_inc,
      hours_exercise
    )
  ) |>
  droplevels() |>
  # dplyr::rename(sample_weights = w_gend_age_ethnic,
  #               sample_origin =  sample_origin_names_combined) |>
  arrange(id, wave) |>
  droplevels() |>
  data.frame() |>
  droplevels() |>
  arrange(id, wave) |>
  mutate(
  rural_gch_2018_l = as.numeric(as.character(rural_gch_2018_l)),
  #   parent = as.numeric(as.character(parent)),
  partner = as.numeric(as.character(partner)),
  born_nz = as.numeric(as.character(born_nz)),
  censored = as.numeric(as.character(censored)),
  employed = as.numeric(as.character(employed))
  ) |>
  droplevels() |>
  arrange(id, wave) |>
  data.frame()

# check n in this sample
skimr::n_unique(dat_long$id)
[1] 14439
#check
#head(dat_long)

Convert data from long to wide and impute baseline values

# baseline variables
baseline_vars = c("age", "male", "edu", "partner", "employed")
exposure_var = c("perfectionism", "censored") # we will use the censored variable later
outcome_vars = c("kessler6_sum")


# function will add exposure and outcome to baseline
dat_long_wide <- margot::margot_wide_impute_baseline(dat_long, baseline_vars = baseline_vars, exposure_var = exposure_var, outcome_vars =outcome_vars)

 iter imp variable
  1   1  t0_employed  t0_edu  t0_kessler6_sum  t0_partner
  2   1  t0_employed  t0_edu  t0_kessler6_sum  t0_partner
  3   1  t0_employed  t0_edu  t0_kessler6_sum  t0_partner
  4   1  t0_employed  t0_edu  t0_kessler6_sum  t0_partner
  5   1  t0_employed  t0_edu  t0_kessler6_sum  t0_partner
# check
head(dat_long_wide)
  id   t0_age t0_male t0_edu t0_partner t0_employed t0_perfectionism
1  1 40.31341       1      8          1           1         4.303349
2  3 47.14449       1      3          1           0         2.660515
3  4 74.60987       0      6          1           0         4.297503
4  5 57.46461       0      6          1           1         4.046052
5  6 50.32225       0      7          1           0         4.030903
6  8 53.36900       1      6          1           1         1.967654
  t0_censored t0_kessler6_sum t1_perfectionism t1_censored t2_kessler6_sum
1           1        3.044502         3.670650           1        5.005375
2           1        2.962828         3.333177           0              NA
3           1        1.006293         4.350912           1        3.972263
4           1        1.986856         2.045632           1        2.038406
5           1        6.018567         4.350877           1        4.978472
6           1        2.987620         2.991798           1        2.029845
# standardise vars
dt <- dat_long_wide |> 
  dplyr::mutate(
    t0_age_z = scale(t0_age),
    t0_edu_z = scale(t0_edu),
    t0_kessler6_sum_z =  scale(t0_kessler6_sum),
    t0_perfectionism_z = scale(t0_perfectionism),
    t1_perfectionism_z = scale(t1_perfectionism),
    t2_kessler6_sum_z = scale(t2_kessler6_sum)) |> 
  data.frame()

# we are not handling missing data well, and this will cause problems downstream
# to avoid this problem we remove attributes
dt <- margot::remove_numeric_attributes(dt)

Use regression to obtain conditional average treatment effect

# note that we are not handling missing data appropriately here

# take interaction of treatment and baseline vars
fit_1 <- glm(t2_kessler6_sum_z  ~ t1_perfectionism_z * (t0_age_z +  t0_edu_z + t0_kessler6_sum_z + t0_perfectionism_z), data = dt)

# summarise
regression_table  <- parameters::model_parameters(fit_1)

Under our causal assumptions, this regression coefficient has a causal interpretation: it is the effect of the exposure when all covariates in the model are set to zero. Because the covariates are standardised (mean zero) and the model is linear, this conditional effect coincides here with the marginal effect estimated below.

regression_table[2, ]
Parameter          | Coefficient |       SE |       95% CI | t(11539) |      p
------------------------------------------------------------------------------
t1 perfectionism z |        0.22 | 9.36e-03 | [0.20, 0.24] |    23.32 | < .001

Average Treatment Effect

First we’ll use the stdReg package:

# standardise (g-compute) over the covariate distribution at exposure values 0 and 1
fit_ate <- stdReg::stdGlm(fit_1, data = dt, X = "t1_perfectionism_z", x = seq(0, 1))
print(summary(fit_ate))

# summarise the ate as a difference relative to exposure = 0
ate_stdreg_method <- summary(fit_ate, contrast = "difference", reference = 0)

print(ate_stdreg_method)
# graph the ate using the stdReg method
plot(fit_ate, contrast = "difference", reference = 0)
Obtain ATE by hand using g-computation
# step 1: refit the outcome model (for simplicity, without the interactions)
fit_1 <- glm(t2_kessler6_sum_z ~ t1_perfectionism_z + t0_age_z + t0_edu_z + t0_kessler6_sum_z + t0_perfectionism_z, data = dt, family = gaussian())

# step 2: create a new dataset where everyone receives the treatment
# (here, perfectionism set to one standard deviation above the sample mean)
dt_treated <- dt
dt_treated$t1_perfectionism_z <- 1

# step 3: predict outcomes under the treatment scenario
dt_treated$predicted_outcome <- predict(fit_1, newdata = dt_treated, type = "response")

# step 4: calculate the mean outcome under this treatment scenario
mean_outcome_treated <- mean(dt_treated$predicted_outcome)

# repeat steps 2-4 for the control scenario
# (perfectionism set to the sample mean, which is zero on the standardised scale)
dt_control <- dt
dt_control$t1_perfectionism_z <- 0

dt_control$predicted_outcome <- predict(fit_1, newdata = dt_control, type = "response")
mean_outcome_control <- mean(dt_control$predicted_outcome)

# step 5: calculate the average treatment effect
ate_gcomp <- round(mean_outcome_treated - mean_outcome_control, 2)
print(paste("Average Treatment Effect (ATE) of t1_perfectionism_z using g-computation: ", ate_gcomp))
[1] "Average Treatment Effect (ATE) of t1_perfectionism_z using g-computation:  0.22"
Next obtain the ATE and standard errors by simulation using clarify
library("clarify")

# setup simulation with clarify to manipulate the treatment variable and predict outcomes
set.seed(123)
sim_coefs <- sim(fit_1)

# compute ate
sim_est <- sim_ame(sim_coefs, contrast = "diff", var =  "t1_perfectionism_z")

# summarise
summary(sim_est)
                            Estimate 2.5 % 97.5 %
E[dY/d(t1_perfectionism_z)]    0.221 0.202  0.239

Appendix A

Consider that in the causal inference literature, we may write the contrast between the two potential outcomes under treatment as:

\text{Causal Effect}_i = Y_i^{a=1} - Y_i^{a=0}

or

\text{Causal Effect}_i = Y_i^{1} - Y_i^{0}

or:

\text{Causal Effect} = Y_{1} - Y_{0}

where the individual subscript i is dropped and the treatment level is written as a subscript. You will soon encounter many such notational variants in the causal inference literature.

Packages

report::cite_packages()
  - Bulbulia J (2024). _margot: MARGinal Observational Treatment-effects_. doi:10.5281/zenodo.10907724 <https://doi.org/10.5281/zenodo.10907724>, R package version 0.3.1.0 Functions to obtain MARGinal Observational Treatment-effects from observational data., <https://go-bayes.github.io/margot/>.
  - Chang W (2023). _extrafont: Tools for Using Fonts_. R package version 0.19, <https://CRAN.R-project.org/package=extrafont>.
  - Greifer N, Worthington S, Iacus S, King G (2024). _clarify: Simulation-Based Inference for Regression Models_. R package version 0.2.1, <https://CRAN.R-project.org/package=clarify>.
  - Grolemund G, Wickham H (2011). "Dates and Times Made Easy with lubridate." _Journal of Statistical Software_, *40*(3), 1-25. <https://www.jstatsoft.org/v40/i03/>.
  - Lüdecke D, Ben-Shachar M, Patil I, Makowski D (2020). "Extracting, Computing and Exploring the Parameters of Statistical Models using R." _Journal of Open Source Software_, *5*(53), 2445. doi:10.21105/joss.02445 <https://doi.org/10.21105/joss.02445>.
  - Müller K, Wickham H (2023). _tibble: Simple Data Frames_. R package version 3.2.1, <https://CRAN.R-project.org/package=tibble>.
  - R Core Team (2024). _R: A Language and Environment for Statistical Computing_. R Foundation for Statistical Computing, Vienna, Austria. <https://www.R-project.org/>.
  - Sjolander A, Dahlqwist E (2021). _stdReg: Regression Standardization_. R package version 3.4.1, <https://CRAN.R-project.org/package=stdReg>.
  - van Buuren S, Groothuis-Oudshoorn K (2011). "mice: Multivariate Imputation by Chained Equations in R." _Journal of Statistical Software_, *45*(3), 1-67. doi:10.18637/jss.v045.i03 <https://doi.org/10.18637/jss.v045.i03>.
  - Waring E, Quinn M, McNamara A, Arino de la Rubia E, Zhu H, Ellis S (2022). _skimr: Compact and Flexible Summaries of Data_. R package version 2.1.5, <https://CRAN.R-project.org/package=skimr>.
  - Wickham H (2016). _ggplot2: Elegant Graphics for Data Analysis_. Springer-Verlag New York. ISBN 978-3-319-24277-4, <https://ggplot2.tidyverse.org>.
  - Wickham H (2023). _forcats: Tools for Working with Categorical Variables (Factors)_. R package version 1.0.0, <https://CRAN.R-project.org/package=forcats>.
  - Wickham H (2023). _stringr: Simple, Consistent Wrappers for Common String Operations_. R package version 1.5.1, <https://CRAN.R-project.org/package=stringr>.
  - Wickham H, Averick M, Bryan J, Chang W, McGowan LD, François R, Grolemund G, Hayes A, Henry L, Hester J, Kuhn M, Pedersen TL, Miller E, Bache SM, Müller K, Ooms J, Robinson D, Seidel DP, Spinu V, Takahashi K, Vaughan D, Wilke C, Woo K, Yutani H (2019). "Welcome to the tidyverse." _Journal of Open Source Software_, *4*(43), 1686. doi:10.21105/joss.01686 <https://doi.org/10.21105/joss.01686>.
  - Wickham H, Bryan J, Barrett M, Teucher A (2024). _usethis: Automate Package and Project Setup_. R package version 3.1.0, <https://CRAN.R-project.org/package=usethis>.
  - Wickham H, François R, Henry L, Müller K, Vaughan D (2023). _dplyr: A Grammar of Data Manipulation_. R package version 1.1.4, <https://CRAN.R-project.org/package=dplyr>.
  - Wickham H, Henry L (2023). _purrr: Functional Programming Tools_. R package version 1.0.2, <https://CRAN.R-project.org/package=purrr>.
  - Wickham H, Hester J, Bryan J (2024). _readr: Read Rectangular Text Data_. R package version 2.1.5, <https://CRAN.R-project.org/package=readr>.
  - Wickham H, Hester J, Chang W, Bryan J (2022). _devtools: Tools to Make Developing R Packages Easier_. R package version 2.4.5, <https://CRAN.R-project.org/package=devtools>.
  - Wickham H, Miller E, Smith D (2023). _haven: Import and Export 'SPSS', 'Stata' and 'SAS' Files_. R package version 2.5.4, <https://CRAN.R-project.org/package=haven>.
  - Wickham H, Vaughan D, Girlich M (2024). _tidyr: Tidy Messy Data_. R package version 1.3.1, <https://CRAN.R-project.org/package=tidyr>.
  - Xie Y (2024). _tinytex: Helper Functions to Install and Maintain TeX Live, and Compile LaTeX Documents_. R package version 0.54, <https://github.com/rstudio/tinytex>. Xie Y (2019). "TinyTeX: A lightweight, cross-platform, and easy-to-maintain LaTeX distribution based on TeX Live." _TUGboat_, *40*(1), 30-32. <https://tug.org/TUGboat/Contents/contents40-1.html>.