Causal Inference: Average Treatment Effects

PSYC 434 — Week 5

Opening example

In the 1980s and 1990s, observational studies suggested that postmenopausal oestrogen therapy protected against cardiovascular disease.

Large randomised trials later reported harm for key outcomes under trial protocols. Same clinical question. Different design. Different answer.

What this week delivers

Weeks 2–4 gave us tools to represent and diagnose bias. This week provides the formal machinery to move from structural assumptions to numerical estimates.

We define the average treatment effect (ATE), state the identification assumptions, and derive the standardisation formula.

Step 1: state the causal question

Let A = 1 denote a clearly defined exposure and A = 0 a clearly defined contrast. Let Y be an outcome measured at a fixed follow-up time.

For person i, the individual causal effect is:

Y_i(1) - Y_i(0).

We never observe both terms for one person at one time.
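The fundamental problem can be made concrete with a tiny simulation (hypothetical potential outcomes, not real data): each person carries both Y(1) and Y(0), but the analyst's table shows only one of them.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 5

# Hypothetical potential outcomes for five people (simulated for illustration).
y1 = rng.binomial(1, 0.3, n)  # Y_i(1): outcome if treated
y0 = rng.binomial(1, 0.5, n)  # Y_i(0): outcome if untreated
a = rng.binomial(1, 0.5, n)   # A_i: treatment actually received

# Each person reveals only the potential outcome matching their treatment.
y_obs = np.where(a == 1, y1, y0)

for i in range(n):
    print(f"person {i}: A={a[i]}, observed Y({a[i]})={y_obs[i]}, "
          f"counterfactual Y({1 - a[i]}) = ?")
```

The individual effect Y_i(1) - Y_i(0) needs both columns, so it can never be computed from the observed row alone.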

Step 2: move from individuals to populations

Because individual effects are unobservable, we target a population causal estimand:

\text{ATE} = \mathbb{E}[Y(1) - Y(0)] = \mathbb{E}[Y(1)] - \mathbb{E}[Y(0)].

This is the mean contrast if everyone in the target population received A = 1 versus everyone received A = 0.

Assumption 1: causal consistency

The switching equation: each person carries two potential outcomes, but we observe only the one selected by their actual treatment.

Y_i^{obs} = A_i \cdot Y_i(1) + (1 - A_i) \cdot Y_i(0)

This is what links counterfactual quantities to data.

Consistency: the two cases

For A_i = 1: \quad Y_i^{obs} = 1 \cdot Y_i(1) + 0 \cdot Y_i(0) = Y_i(1).

For A_i = 0: \quad Y_i^{obs} = 0 \cdot Y_i(1) + 1 \cdot Y_i(0) = Y_i(0).
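The two cases can be checked mechanically. A minimal sketch (the function name is illustrative, not from the source):

```python
def observed_outcome(a, y1, y0):
    """Switching equation: Y_obs = a * Y(1) + (1 - a) * Y(0)."""
    return a * y1 + (1 - a) * y0

# A = 1 reveals Y(1); A = 0 reveals Y(0).
assert observed_outcome(1, y1=7, y0=3) == 7
assert observed_outcome(0, y1=7, y0=3) == 3
```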

Consistency fails when treatment versions are mixed under one label. If “oestrogen therapy” includes different formulations, dosages, and initiation times, Y(1) does not correspond to a single intervention.

Assumption 2: exchangeability

In a randomised trial (unconditional):

Y(a) \perp\!\!\!\perp A.

In observational data (conditional):

Y(a) \perp\!\!\!\perp A \mid L,

where L is a sufficient measured confounder set satisfying the backdoor criterion.

Assumption 3: positivity

For each covariate pattern used in adjustment:

0 < P(A = a \mid L = l) < 1.

Every covariate stratum must contain both treated and untreated individuals.
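Positivity is the one assumption with a direct empirical analogue: in the sample, every stratum of L should contain both treated and untreated units. A minimal sketch on simulated data:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 10_000
L = rng.integers(0, 3, n)                          # a three-level covariate
A = rng.binomial(1, np.array([0.1, 0.5, 0.9])[L])  # treatment depends on L

# Check 0 < P(A = 1 | L = l) < 1 in every observed stratum.
for l in np.unique(L):
    p = A[L == l].mean()
    print(f"L = {l}: P(A = 1 | L) ≈ {p:.2f}")
    assert 0 < p < 1, f"positivity violated in stratum L = {l}"
```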

Identification: the standardisation formula

Assume consistency, exchangeability given L, and positivity. Then:

\mathbb{E}[Y(a)] = \sum_l \mathbb{E}[Y \mid A = a, L = l]\, P(L = l).
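Each equality in the identification argument uses exactly one assumption; spelled out:

```latex
\begin{align*}
\mathbb{E}[Y(a)]
  &= \sum_l \mathbb{E}[Y(a) \mid L = l]\, P(L = l)
     && \text{(law of total expectation)} \\
  &= \sum_l \mathbb{E}[Y(a) \mid A = a, L = l]\, P(L = l)
     && \text{(conditional exchangeability)} \\
  &= \sum_l \mathbb{E}[Y \mid A = a, L = l]\, P(L = l)
     && \text{(consistency)}.
\end{align*}
```

Positivity guarantees that the conditional expectations in the final line are defined in every stratum with P(L = l) > 0.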

The ATE is identified by standardisation:

\text{ATE} = \sum_l \Big(\mathbb{E}[Y \mid A=1, L=l] - \mathbb{E}[Y \mid A=0, L=l]\Big) P(L=l).
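A plug-in standardisation estimator is a few lines of code. The sketch below uses simulated data with a known effect (risk difference 0.2) and a binary confounder L that raises both treatment uptake and the outcome, so the crude contrast is biased upward while standardisation recovers the truth up to sampling error:

```python
import numpy as np

rng = np.random.default_rng(2)
n = 100_000

# Simulated data: L confounds the A -> Y relationship.
L = rng.binomial(1, 0.4, n)
A = rng.binomial(1, 0.2 + 0.6 * L)            # treatment more likely when L = 1
Y = rng.binomial(1, 0.1 + 0.2 * A + 0.3 * L)  # true causal risk difference: 0.2

def standardised_mean(a):
    """Plug-in estimate of sum_l E[Y | A = a, L = l] * P(L = l)."""
    return sum(Y[(A == a) & (L == l)].mean() * (L == l).mean() for l in (0, 1))

ate = standardised_mean(1) - standardised_mean(0)
crude = Y[A == 1].mean() - Y[A == 0].mean()
print(f"standardised ATE ≈ {ate:.3f} (truth 0.2); crude contrast ≈ {crude:.3f}")
```

Because L = 1 raises both treatment probability and outcome risk, the crude contrast mixes the causal effect with confounding; standardisation removes it by averaging within-stratum contrasts over the distribution of L.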

What we can check, and what we cannot

Empirical overlap (positivity): checkable, via propensity score distributions.
Practical positivity problems: checkable, via trimming diagnostics.
Exchangeability (no unmeasured confounding): not checkable from data; requires subject-matter knowledge.
Consistency (well-defined treatment): not checkable from data; requires a precise intervention definition.

Design and subject-matter knowledge are central.
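The two feasible checks above can be sketched in a few lines. Below, a nonparametric propensity score P(A = 1 | L) is estimated within each stratum, and units whose score falls outside an overlap region such as [0.05, 0.95] are flagged for trimming (the cutoffs are illustrative conventions, not fixed rules):

```python
import numpy as np

rng = np.random.default_rng(3)
n = 5_000
L = rng.integers(0, 4, n)
A = rng.binomial(1, np.array([0.02, 0.30, 0.70, 0.98])[L])

# Nonparametric propensity score: P(A = 1 | L) estimated within each stratum.
ps = np.array([A[L == l].mean() for l in range(4)])[L]

# Trimming diagnostic: flag units with estimated propensity outside [0.05, 0.95].
keep = (ps >= 0.05) & (ps <= 0.95)
print(f"kept {keep.sum()} of {n} units; "
      f"strata flagged for trimming: {sorted(np.unique(L[~keep]).tolist())}")
```

Trimming restricts inference to the region of covariate space where both treatment options actually occur, which changes the target population and should be reported.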

Return to the opening example

The oestrogen paradox is a design lesson. Observational investigators did not precisely define the timing of treatment initiation.

For women who initiated therapy many years after menopause, oestrogens were harmful. A population-level association that appeared protective was driven by selection: women who chose therapy were systematically healthier.

This illustrates the importance of establishing “time zero” and a limitation of cross-sectional data.

The causal workflow

0. Time zero: when does treatment assignment begin?
1. Treatment: what precisely is the intervention?
2. Outcome: what is measured, when, and how?
3. Population: who do we aim to inform?
4. Exchangeability: are groups comparable given L?
5. Consistency: is the treatment well-defined?
6. Positivity: does overlap exist in every stratum?
7. Measurement: does error distort the contrast?
8. Representativeness: does attrition induce bias?
9. Documentation: are assumptions transparent?

TARGET: reporting a target trial emulation

The TARGET statement (Cashin et al., 2025) provides a structured checklist for reporting studies that emulate a target trial. Use it when writing up results.

Title/Abstract: identify the study as a target trial emulation; name the data source; summarise key assumptions and findings.
Methods: eligibility criteria, treatment strategies, assignment, follow-up from time zero, outcomes, causal contrasts, identifying assumptions, and analysis plan.
Results: participant flow, baseline characteristics, follow-up, missing data, effect estimates with precision, and sensitivity analyses.
Discussion: interpretation, limitations, and differences between the target trial and its emulation.

Readings

Required and optional readings for each week are listed on the course readings page.

Required: Cashin et al. (2025) (TARGET statement).

Appendix: notation variants

Equivalent notations for the individual contrast include superscript and explicit argument forms:

Y_i^{1} - Y_i^{0} \quad = \quad Y_i(a = 1) - Y_i(a = 0).

All three notations (Y_i(1) - Y_i(0), superscript, explicit argument) refer to the same contrast. This course uses the parenthetical form.

Cashin, Aidan G., Hopin J. Hansford, Miguel A. Hernán, et al. 2025. “Transparent Reporting of Observational Studies Emulating a Target Trial—the TARGET Statement.” JAMA 334 (12): 1084–93. https://doi.org/10.1001/jama.2025.13350.