In the 1980s and 1990s, observational studies suggested that postmenopausal oestrogen therapy protected against cardiovascular disease.
Large randomised trials later reported harm for key outcomes under trial protocols. Same clinical question. Different design. Different answer.
What this week delivers
Weeks 2–4 gave us tools to represent and diagnose bias. This week provides the formal machinery to move from structural assumptions to numerical estimates.
We define the average treatment effect (ATE), state the identification assumptions, and derive the standardisation formula.
Step 1: state the causal question
Let A = 1 denote a clearly defined exposure and A = 0 a clearly defined contrast. Let Y be an outcome measured at a fixed follow-up time.
For person i, the individual causal effect is:
Y_i(1) - Y_i(0).
We never observe both terms for one person at one time.
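A minimal Python sketch of this point, using hypothetical values. Each simulated person carries both potential outcomes, but the analyst's view keeps only the one that matches the exposure received. The `Person` class and the `observed_row` helper are illustrative names, not part of any library.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class Person:
    y1: int  # potential outcome under A = 1 (hypothetical value)
    y0: int  # potential outcome under A = 0 (hypothetical value)
    a: int   # exposure actually received (0 or 1)


def individual_effect(p: Person) -> int:
    """Y_i(1) - Y_i(0): computable only because this is a simulation."""
    return p.y1 - p.y0


def observed_row(p: Person) -> Tuple[int, int, Optional[int], Optional[int]]:
    """Return (A, Y, visible Y(1), visible Y(0)) as an analyst would see it.

    The potential outcome under the exposure not received is None.
    """
    y = p.y1 if p.a == 1 else p.y0
    seen_y1 = p.y1 if p.a == 1 else None
    seen_y0 = p.y0 if p.a == 0 else None
    return (p.a, y, seen_y1, seen_y0)


if __name__ == "__main__":
    people = [Person(1, 0, 1), Person(0, 0, 0), Person(1, 1, 1), Person(0, 1, 0)]
    for p in people:
        print(observed_row(p))
```

In every row one of the two potential outcomes is missing, so the individual difference cannot be computed from observed data alone.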
Step 2: move from individuals to populations
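Because individual effects are unobservable, we target a population causal estimand, the average treatment effect (ATE):
\text{ATE} = \mathbb{E}[Y(1)] - \mathbb{E}[Y(0)].
Assumption 1: consistency
If A_i = a, then Y_i = Y_i(a): the outcome observed under the exposure received equals the potential outcome under that exposure.
Consistency fails when treatment versions are mixed under one label. If “oestrogen therapy” includes different formulations, dosages, and initiation times, Y(1) does not correspond to a single intervention.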
Assumption 2: exchangeability
In a randomised trial (unconditional):
Y(a) \coprod A.
In observational data (conditional):
Y(a) \coprod A \mid L,
where L is a set of measured covariates sufficient to control confounding, that is, a set satisfying the backdoor criterion.
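The contrast between unconditional and conditional exchangeability can be sketched with a small simulation. Everything here is an assumption made for illustration: L is a binary "healthier" indicator, Y is an adverse event, the event probabilities are invented, and the true ATE is +0.05 by construction. The function names are hypothetical.

```python
import random


def simulate(n, confounded, seed=0):
    """Return a list of (l, a, y) rows.

    Potential outcomes are drawn before assignment, so Y(a) is independent
    of A given L in both designs. When confounded=True, healthier people
    (l = 1) are more likely to be treated.
    """
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        l = 1 if rng.random() < 0.5 else 0
        # Hypothetical event risks: treatment adds 0.05 in each stratum.
        y0 = 1 if rng.random() < (0.10 if l else 0.30) else 0
        y1 = 1 if rng.random() < (0.15 if l else 0.35) else 0
        if confounded:
            p_treat = 0.8 if l else 0.2  # assignment depends on L
        else:
            p_treat = 0.5                # randomised assignment
        a = 1 if rng.random() < p_treat else 0
        y = y1 if a == 1 else y0         # observed outcome
        rows.append((l, a, y))
    return rows


def naive_difference(rows):
    """Difference in observed event proportions, treated minus untreated."""
    treated = [y for _, a, y in rows if a == 1]
    untreated = [y for _, a, y in rows if a == 0]
    return sum(treated) / len(treated) - sum(untreated) / len(untreated)


def difference_within(rows, l):
    """The same contrast restricted to one stratum of L."""
    return naive_difference([r for r in rows if r[0] == l])


if __name__ == "__main__":
    rct = simulate(100_000, confounded=False)
    obs = simulate(100_000, confounded=True)
    print("randomised, crude:      ", round(naive_difference(rct), 3))
    print("observational, crude:   ", round(naive_difference(obs), 3))
    print("observational, L = 1:   ", round(difference_within(obs, 1), 3))
    print("observational, L = 0:   ", round(difference_within(obs, 0), 3))
```

Under these assumed probabilities the crude observational contrast should come out negative (apparently protective), while the randomised contrast and both within-stratum contrasts should sit near the true +0.05.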
Assumption 3: positivity
For each exposure level a and each covariate pattern l used in adjustment that occurs in the population (P(L = l) > 0):
0 < P(A = a \mid L = l) < 1.
Every covariate stratum must contain both treated and untreated individuals.
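An empirical check of this condition might look like the sketch below. It only reports strata in the sample that lack one exposure level; it cannot tell whether the gap is structural or due to chance. The function name `positivity_violations` and the example strata are hypothetical.

```python
from collections import defaultdict


def positivity_violations(rows):
    """rows: iterable of (stratum, a) pairs with a in {0, 1}.

    Returns a dict mapping each stratum that lacks an exposure level
    to the set of exposure levels missing from it.
    """
    seen = defaultdict(set)
    for stratum, a in rows:
        seen[stratum].add(a)
    return {
        stratum: {0, 1} - levels
        for stratum, levels in seen.items()
        if levels != {0, 1}
    }


if __name__ == "__main__":
    sample = [("young", 1), ("young", 0), ("old", 0), ("old", 0)]
    print(positivity_violations(sample))
```

Here the "old" stratum contains no treated individuals, so any estimate of the treated outcome in that stratum would rest on extrapolation rather than data.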
Identification: the standardisation formula
Assume consistency, exchangeability given L, and positivity. Then:
\mathbb{E}[Y(a)] = \sum_l \mathbb{E}[Y \mid A = a, L = l]\, P(L = l).
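One way to see where this formula comes from is to apply the three assumptions in turn. The derivation below is a sketch for discrete L, and the annotations on the right name the step used on each line.

```latex
\begin{align*}
\mathbb{E}[Y(a)]
  &= \sum_l \mathbb{E}[Y(a) \mid L = l]\, P(L = l)
     && \text{(law of total expectation)} \\
  &= \sum_l \mathbb{E}[Y(a) \mid A = a, L = l]\, P(L = l)
     && \text{(exchangeability given } L\text{; positivity keeps the conditioning well defined)} \\
  &= \sum_l \mathbb{E}[Y \mid A = a, L = l]\, P(L = l)
     && \text{(consistency)}
\end{align*}

% Taking the difference between a = 1 and a = 0 gives the ATE:
\begin{equation*}
\mathbb{E}[Y(1)] - \mathbb{E}[Y(0)]
  = \sum_l \bigl( \mathbb{E}[Y \mid A = 1, L = l] - \mathbb{E}[Y \mid A = 0, L = l] \bigr)\, P(L = l)
\end{equation*}
```

The right-hand side contains only observed quantities, which is what it means for the estimand to be identified.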
The oestrogen paradox is a design lesson. Observational investigators did not precisely define the timing of treatment initiation.
For women who initiated therapy many years after menopause, oestrogens were harmful. A population-level association that appeared protective was driven by selection: women who chose therapy were systematically healthier.
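This illustrates the importance of establishing “time zero”, the point at which eligibility is assessed, treatment is assigned, and follow-up begins. It also shows a limitation of cross-sectional data, which do not record when treatment started.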
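The selection pattern described here can be made concrete by applying the standardisation formula to a small table of counts. The numbers below are invented for illustration and are not taken from any oestrogen study. L = 1 marks a hypothetical "healthier" stratum, and `crude_risk` and `standardised_risk` are illustrative helper names.

```python
# counts[(l, a)] = (number of people, number with the outcome); hypothetical data
COUNTS = {
    (1, 1): (400, 60),   # healthier, treated
    (1, 0): (100, 10),   # healthier, untreated
    (0, 1): (100, 35),   # less healthy, treated
    (0, 0): (400, 120),  # less healthy, untreated
}


def crude_risk(counts, a):
    """Observed risk in exposure group a, ignoring L."""
    n = sum(size for (_, arm), (size, _) in counts.items() if arm == a)
    events = sum(ev for (_, arm), (_, ev) in counts.items() if arm == a)
    return events / n


def standardised_risk(counts, a):
    """sum_l E[Y | A = a, L = l] * P(L = l), with P(L = l) from the full sample."""
    total = sum(size for size, _ in counts.values())
    strata = {l for l, _ in counts}
    risk = 0.0
    for l in strata:
        # A missing cell or a size of zero here would signal a positivity problem.
        size, events = counts[(l, a)]
        p_l = sum(counts[(l, arm)][0] for arm in (0, 1) if (l, arm) in counts) / total
        risk += (events / size) * p_l
    return risk


if __name__ == "__main__":
    crude = crude_risk(COUNTS, 1) - crude_risk(COUNTS, 0)
    std = standardised_risk(COUNTS, 1) - standardised_risk(COUNTS, 0)
    print("crude risk difference:       ", round(crude, 3))
    print("standardised risk difference:", round(std, 3))
```

With these counts the crude risk difference is about -0.07, which looks protective, while the standardised difference is about +0.05. The reversal arises only because treated people are concentrated in the healthier stratum, and it relies on L being sufficient for exchangeability.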
The causal workflow
| Step | What it asks |
|---|---|
| 0. Time zero | When does treatment assignment begin? |
| 1. Treatment | What precisely is the intervention? |
| 2. Outcome | What, when, and how is it measured? |
| 3. Population | Who do we aim to inform? |
| 4. Exchangeability | Are groups comparable given L? |
| 5. Consistency | Is the treatment well-defined? |
| 6. Positivity | Does overlap exist in every stratum? |
| 7. Measurement | Does error distort the contrast? |
| 8. Representativeness | Does attrition induce bias? |
| 9. Documentation | Are assumptions transparent? |
TARGET: reporting a target trial emulation
The TARGET statement (Cashin et al., 2025) provides a structured checklist for reporting studies that emulate a target trial. Use it when writing up results.
| Section | What to report |
|---|---|
| Title/Abstract | Identify target trial emulation; data source; key assumptions and findings |
| Methods | Eligibility, treatment strategies, assignment, follow-up from time zero, outcomes, causal contrasts, identifying assumptions, analysis |