The Causal Workflow: Nine Steps

This page summarises a nine-step framework for conducting causal inference with observational data. Each step addresses a potential threat to valid causal interpretation. The framework draws on Hernán and Robins (What If, 2024), VanderWeele (2022), and Bulbulia (2024).

How to use this page

This is a reference resource. Use it as a checklist when planning your research report. Each step links back to the lecture where it was introduced.

Step 1: State a well-defined treatment

Specify the hypothetical intervention precisely enough that every member of the target population could, in principle, receive it. "Weight loss" is too vague because people lose weight via exercise, diet, depression, surgery, and more. A clearer intervention is: "engage in vigorous physical activity for at least 30 minutes per day."

Precision here underwrites the causal consistency assumption (Step 5). If the treatment is vaguely defined, different people effectively receive different interventions, and the potential outcome $Y(a)$ is not well-defined.

See Week 5 for the formal definition of causal consistency.

Step 2: State a well-defined outcome

Define the outcome so the causal contrast is meaningful and temporally anchored. "Wellbeing" is underspecified; "psychological distress one year post-intervention, measured with the Kessler-6" is interpretable and reproducible. Include timing, scale, and instrument.

See Week 10 for how measurement choices affect causal identification.

Step 3: Clarify the target population

Say exactly who you aim to inform. Eligibility rules define the source population, but sampling and participation can yield a study population with a different distribution of effect modifiers. If you intend to generalise beyond the source population (transportability), articulate the additional conditions required.

See Week 6 for how effect modification interacts with population composition.
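
To see why the distribution of effect modifiers matters, the following minimal sketch averages the same stratum-specific effects over two populations. The numbers, the stratum labels, and the function name `average_effect` are all hypothetical. The sketch also assumes that the stratum-specific effects are identical in both populations, which is itself a transportability assumption that would need to be defended.

```python
def average_effect(stratum_effects, modifier_distribution):
    """Weight stratum-specific effects by a population's modifier proportions."""
    total = sum(modifier_distribution.values())
    if abs(total - 1.0) > 1e-9:
        raise ValueError("modifier proportions must sum to 1")
    return sum(stratum_effects[g] * p for g, p in modifier_distribution.items())


# Hypothetical effects within levels of a single effect modifier (illustrative only).
stratum_effects = {"younger": 2.0, "older": 0.5}

# Hypothetical proportions of each stratum in the study and target populations.
study_population = {"younger": 0.8, "older": 0.2}
target_population = {"younger": 0.4, "older": 0.6}

study_ate = average_effect(stratum_effects, study_population)
target_ate = average_effect(stratum_effects, target_population)
print(f"study population: {study_ate:.2f}, target population: {target_ate:.2f}")
```

The stratum-specific effects are the same in both calls; only the mix of strata changes, and that alone is enough to change the average effect.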

Step 4: Evaluate exchangeability

Make the case that potential outcomes are independent of treatment conditional on measured covariates $L$: $Y(a) \perp\!\!\!\perp A \mid L$. Use DAGs, subject-matter arguments, pre-treatment covariate balance checks, and overlap diagnostics. If exchangeability is doubtful, redesign (e.g., stronger measurement, alternative identification strategies) rather than rely solely on modelling.

See Week 5 for the formal definition of exchangeability.
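
One form the pre-treatment covariate balance check in Step 4 could take is a table of standardised mean differences (SMDs) between treated and untreated groups. The sketch below is a minimal illustration using only the Python standard library. The data, the column names (`a`, `age`, `baseline_k6`), and the function names are hypothetical. Balance on measured covariates does not establish exchangeability, because unmeasured confounders may still differ between groups.

```python
from math import sqrt
from statistics import mean, variance


def standardised_mean_difference(treated, control):
    """Difference in group means divided by the pooled standard deviation."""
    pooled_sd = sqrt((variance(treated) + variance(control)) / 2)
    if pooled_sd == 0:
        raise ValueError("covariate has no variation in either group")
    return (mean(treated) - mean(control)) / pooled_sd


def balance_table(rows, treatment_key, covariate_keys):
    """Return {covariate: SMD} comparing treated (1) with untreated (0) rows."""
    treated = [r for r in rows if r[treatment_key] == 1]
    control = [r for r in rows if r[treatment_key] == 0]
    return {
        k: standardised_mean_difference(
            [r[k] for r in treated], [r[k] for r in control]
        )
        for k in covariate_keys
    }


# Hypothetical data: `a` is treatment; the other columns are pre-treatment covariates.
rows = [
    {"a": 1, "age": 30, "baseline_k6": 2},
    {"a": 1, "age": 40, "baseline_k6": 4},
    {"a": 1, "age": 50, "baseline_k6": 6},
    {"a": 0, "age": 30, "baseline_k6": 1},
    {"a": 0, "age": 40, "baseline_k6": 3},
    {"a": 0, "age": 50, "baseline_k6": 5},
]
smd = balance_table(rows, "a", ["age", "baseline_k6"])
for name, value in smd.items():
    print(f"{name}: SMD = {value:.2f}")
```

An absolute SMD above roughly 0.1 is often treated as a sign of imbalance, but that threshold is a convention rather than a test of exchangeability.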

Step 5: Ensure causal consistency

Consistency requires that, for individuals receiving a treatment version compatible with level $a$, the observed outcome equals $Y(a)$. It also presumes well-defined versions and no interference between units. When multiple versions exist, either refine the intervention so versions are irrelevant to $Y(a)$, or condition on version-defining covariates.

See Week 5 for examples of consistency violations.
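
In symbols, a minimal statement of the consistency condition for a single version of treatment is given below. The unit subscript $i$ is notation assumed here rather than taken from this page, and the second line applies only when treatment is binary.

$$A_i = a \implies Y_i = Y_i(a)$$

$$Y_i = A_i \, Y_i(1) + (1 - A_i) \, Y_i(0)$$

The second line makes the role of consistency explicit: each person's observed outcome reveals exactly one of their potential outcomes, the one corresponding to the treatment they actually received.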

Step 6: Check positivity (overlap)

Each treatment level must occur with non-zero probability at every covariate profile needed for exchangeability:

$P(A = a \mid L = l) > 0$

Diagnose limited overlap using propensity score distributions or extreme weights. Consider design-stage remedies (trimming, restriction, adaptive sampling) before estimation.

See Week 5 for the formal positivity assumption.
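
With a small number of discrete covariates, a first-pass overlap check is to count how often each treatment level occurs in each covariate stratum. The sketch below is a minimal illustration using only the Python standard library. The data, the column names, and the function name `positivity_violations` are hypothetical. An empty cell in a sample may reflect chance rather than a structural violation, so flagged strata call for subject-matter judgement. Continuous or high-dimensional covariates would need the propensity score diagnostics mentioned in Step 6 instead.

```python
from collections import defaultdict


def positivity_violations(rows, treatment_key, covariate_keys, treatment_levels):
    """List (stratum, treatment level, stratum size) where the level never occurs."""
    counts = defaultdict(lambda: defaultdict(int))
    for r in rows:
        stratum = tuple(r[k] for k in covariate_keys)
        counts[stratum][r[treatment_key]] += 1

    violations = []
    for stratum in sorted(counts):
        by_level = counts[stratum]
        n = sum(by_level.values())
        for a in treatment_levels:
            if by_level.get(a, 0) == 0:
                violations.append((stratum, a, n))
    return violations


# Hypothetical data: nobody in the "high" income stratum is untreated (a = 0).
rows = [
    {"a": 1, "income": "low"},
    {"a": 0, "income": "low"},
    {"a": 0, "income": "low"},
    {"a": 1, "income": "high"},
    {"a": 1, "income": "high"},
]
violations = positivity_violations(rows, "a", ["income"], treatment_levels=[0, 1])
for stratum, level, n in violations:
    print(f"stratum {stratum}: no units with a = {level} among {n}")
```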

Step 7: Ensure measurement aligns with the scientific question

Verify that constructs are captured by instruments whose error structures do not distort the causal contrast of interest. Be explicit about forms of measurement error (classical, Berkson, differential, misclassification) and their structural implications for bias.

See Week 10 for how measurement error creates collider bias in DAGs.

Step 8: Preserve representativeness

End-of-study analyses should reflect the target population's distribution of effect modifiers. Differential attrition, non-response, or measurement processes tied to treatment and outcomes can induce selection bias. Plan strategies such as inverse probability weighting for censoring, multiple imputation, and sensitivity analyses for missing-not-at-random data.

See Week 4 for selection bias structures.
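
The idea behind inverse probability weighting for censoring can be sketched with discrete strata: each person who remains in the study is weighted by the inverse of the estimated probability of remaining, given their stratum. The example below is a minimal illustration using only the Python standard library. The data, the column names, and the function name `censoring_weights` are hypothetical. It assumes that dropout depends only on the listed stratum variables, which is a strong assumption that sensitivity analyses should probe.

```python
from collections import defaultdict


def censoring_weights(rows, stratum_keys, observed_key="observed"):
    """Weight each retained row by 1 / P(retained | stratum); censored rows get 0."""
    totals = defaultdict(int)
    retained = defaultdict(int)
    for r in rows:
        stratum = tuple(r[k] for k in stratum_keys)
        totals[stratum] += 1
        if r[observed_key]:
            retained[stratum] += 1

    weights = []
    for r in rows:
        stratum = tuple(r[k] for k in stratum_keys)
        if r[observed_key]:
            # Inverse of the stratum's empirical retention proportion.
            weights.append(totals[stratum] / retained[stratum])
        else:
            weights.append(0.0)
    return weights


# Hypothetical data: half of the treated group (a = 1) is lost to follow-up.
rows = [
    {"a": 1, "observed": True},
    {"a": 1, "observed": True},
    {"a": 1, "observed": False},
    {"a": 1, "observed": False},
    {"a": 0, "observed": True},
    {"a": 0, "observed": True},
]
weights = censoring_weights(rows, ["a"])
print(weights)
```

In this sketch the weights of the retained rows sum to the original sample size, so the weighted end-of-study sample has the same stratum composition as the baseline sample. A stratum in which everyone is lost cannot be recovered by weighting, which is a positivity problem for the censoring mechanism.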

Step 9: Document transparently

Make assumptions, disagreements, and judgement calls legible. Register or timestamp your analytic plan. Include identification arguments (DAGs), code, and data where possible. Report robustness and sensitivity analyses. Transparent reasoning is a scientific result in its own right.

Summary table

Step	Requirement	Core assumption
1	Well-defined treatment	Consistency
2	Well-defined outcome	Interpretability
3	Target population	Generalisability
4	Exchangeability	Conditional independence
5	Causal consistency	No interference, well-defined versions
6	Positivity	Overlap
7	Measurement validity	No differential error
8	Representativeness	No selection bias
9	Transparent documentation	Reproducibility

PSYC 434: Conducting Research Across Cultures