Week 2: Causal Diagrams: Five Elementary Structures

Slides

Optional Readings

Key concepts for the test(s)

  • Internal validity
  • External validity
  • Causal directed acyclic graph (causal DAG)
  • Five elementary causal structures
  • Confounding
  • d-separation
  • Backdoor path
  • Conditioning
  • Fork
  • Chain
  • Collider bias
  • Mediator bias
  • Four rules of confounding control

Lab 2 setup

Use Lab 2: Install R and Set Up Your IDE for this week's practical work. The optional script is here: Download the R script for Lab 02.


Seminar

Motivating example: the Salk vaccine

The 1954 field trial of the Salk polio vaccine was a multi-site study conducted across many communities in the United States (Dublin, 1955; Francis Jr., 1955). Two protocols were used in different participating areas. In the observed-control protocol, second-grade children whose parents consented received the vaccine. Children in the first and third grades served as controls. In the placebo-controlled protocol, children were randomised to vaccine or placebo under double-blind conditions.

The placebo-controlled protocol supported a causal conclusion. The vaccine reduced paralytic polio (Francis Jr., 1955). The observed-control protocol did not support the same conclusion, because parental consent shaped vaccine assignment.

Parents who consented tended to have higher socioeconomic status, and socioeconomic status also predicted baseline susceptibility to paralytic polio. The observed-control comparison was therefore confounded: vaccination status and polio risk differed for reasons other than the vaccine itself.

Same question. Different assignment mechanism. Different estimate, different reliability.

Salk later reported on the 1956 vaccination campaign (Salk, 1957).

Why this week matters

In Week 1 we defined causal questions. This week we learn how to represent structural assumptions using a causal directed acyclic graph (causal DAG): a diagram in which nodes represent variables and arrows represent assumed causal directions, with no cycles. A causal DAG does not create causal knowledge. It makes assumptions explicit, checkable, and discussable.

A simple map for Week 2

For test purposes, most Week 2 questions reduce to three steps.

How to read a DAG

  1. Identify the path structure: fork, chain, or collider.
  2. Ask whether the path is open or blocked as drawn.
  3. Ask what conditioning would do: close the path or open it.

If you can do those three things, you can usually explain the bias logic in words.

Independence language

Do not let the notation do more work than it should. It is just shorthand.

  • $A \coprod Y(a)$: $A$ and $Y(a)$ are independent.
  • $A \cancel\coprod Y(a)$: $A$ and $Y(a)$ are dependent.
  • $A \coprod Y(a)\mid L$: $A$ and $Y(a)$ are independent once we condition on $L$.

Conditioning means restricting attention to observations that share the same value of a variable (or, equivalently, adjusting for that variable in an analysis). In a causal DAG, a conditioned variable is drawn inside a box.

Randomisation and exchangeability

Under random assignment,

$$ Y(a) \coprod A. $$

This condition is unconditional exchangeability. Under this condition, a difference in means identifies the average treatment effect (ATE):

$$ \widehat{\text{ATE}}=\hat{\mathbb{E}}[Y\mid A=1]-\hat{\mathbb{E}}[Y\mid A=0]. $$

In observational studies, this condition usually fails without adjustment.
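The logic above can be checked with a quick simulation (sketched in Python here, though the labs use R; all numbers are illustrative). When $A$ is randomised, it is independent of every background cause of $Y$, so the simple difference in means lands on the true ATE:

```python
# Minimal sketch, assuming a linear outcome model with true ATE = 2.0.
import numpy as np

rng = np.random.default_rng(0)
n = 100_000

u = rng.normal(size=n)                 # background cause of Y (never measured)
a = rng.integers(0, 2, size=n)         # randomised treatment: independent of u
y = 2.0 * a + u + rng.normal(size=n)   # true ATE = 2.0

ate_hat = y[a == 1].mean() - y[a == 0].mean()
print(round(ate_hat, 2))               # close to 2.0
```

Note that `u` affects `y` but not `a`; that independence, guaranteed by randomisation, is exactly what the difference in means needs.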

Working definitions

Internal validity concerns whether the study contrast estimates the target causal contrast in the study population.

External validity concerns whether that causal contrast transports to the target population.

These are design properties, not instrument properties. Measurement validity (Week 1) is a precondition, not a synonym.

Confounding bias exists when treatment groups differ systematically in ways that affect the outcome, so that the observed association between $A$ and $Y$ does not equal the causal effect. In graph terms, confounding corresponds to an open non-causal path from $A$ to $Y$ (a "backdoor path," formalised below in the four rules of confounding control).

Causal DAG notation and elements

Our variable naming conventions (adapted from Bulbulia, 2023)

  • $A$: treatment or exposure.
  • $Y$: outcome.
  • $Y(a)$: potential outcome under intervention level $a$.
  • $L$: measured confounder set.
  • $U$: unmeasured cause.
  • $M$: mediator.
  • $\bar{X}$: sequence of variables.
  • $\mathcal{R}$: chance mechanism, including randomisation.

Nodes, edges, and conditioning conventions (adapted from Bulbulia, 2023)

  • Arrows encode assumed causal direction.
  • Boxes indicate conditioning.
  • Open red paths indicate biasing pathways.

Five elementary structures (adapted from Bulbulia, 2023)

  1. No causal relation: $A \coprod B$. The variables are statistically independent.
  2. Direct causation: $A\to B$. The variables are statistically dependent: $A \cancel\coprod B$.
  3. Fork: $A\to B$ and $A\to C$. Because $A$ causes both $B$ and $C$, they are associated. Conditioning on $A$ removes that association: $B \coprod C \mid A$.
  4. Chain: $A\to B\to C$. Because $B$ mediates the effect of $A$ on $C$, they are associated. Conditioning on $B$ blocks the path: $A \coprod C \mid B$.
  5. Collider: $A\to C\leftarrow B$. Because $A$ and $B$ both cause $C$ but do not cause each other, they are marginally independent. Conditioning on $C$ opens a spurious association: $A \cancel\coprod B \mid C$.

These five structures generate all patterns of conditional independence and dependence in a causal DAG. Understanding which structures block and which transmit association is the basis for confounding control.
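The transmitting structures can be verified with a small simulation (Python here, though the labs use R; the linear models and thresholds are illustrative). The fork and chain are marginally associated and blocked by conditioning; the collider is marginally independent and opened by conditioning:

```python
# Sketch: simulate a fork, a chain, and a collider, then check which
# conditioning operations remove or create association.
import numpy as np

rng = np.random.default_rng(1)
n = 200_000

# Fork: B <- A -> C (A is the common cause here)
a = rng.normal(size=n)
b = a + rng.normal(size=n)
c = a + rng.normal(size=n)
corr_fork = np.corrcoef(b, c)[0, 1]                  # ~0.5: associated
near_zero_a = np.abs(a) < 0.1                        # crude conditioning on A
corr_fork_given_a = np.corrcoef(b[near_zero_a], c[near_zero_a])[0, 1]  # ~0

# Chain: A -> B -> C (B mediates)
a3 = rng.normal(size=n)
b3 = a3 + rng.normal(size=n)
c3 = b3 + rng.normal(size=n)
corr_chain = np.corrcoef(a3, c3)[0, 1]               # ~0.58: associated
mid_b = np.abs(b3) < 0.1                             # crude conditioning on B
corr_chain_given_b = np.corrcoef(a3[mid_b], c3[mid_b])[0, 1]  # ~0

# Collider: A -> C <- B
a2, b2 = rng.normal(size=n), rng.normal(size=n)
c2 = a2 + b2 + rng.normal(size=n)
corr_collider = np.corrcoef(a2, b2)[0, 1]            # ~0: independent
selected = c2 > 1.0                                  # conditioning on the collider
corr_collider_given_c = np.corrcoef(a2[selected], b2[selected])[0, 1]  # negative

print(round(corr_fork, 2), round(corr_fork_given_a, 2))
print(round(corr_chain, 2), round(corr_chain_given_b, 2))
print(round(corr_collider, 2), round(corr_collider_given_c, 2))
```

The collider case is the one that surprises students: two causes that never interact become negatively associated once we select on their common effect.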

Three questions for any path

  • Is this path causal or non-causal?
  • Is it open or blocked right now?
  • What would conditioning on the middle variable do?

Pair exercise: naming the structure

For each scenario, name the elementary structure, state whether the two end variables are marginally associated, and predict what conditioning does.

  1. Socioeconomic status (SES) causes both neighbourhood quality and health outcomes. What structure links neighbourhood and health? What happens if you condition on SES?
  2. A drug reduces inflammation, and inflammation causes pain. What structure links the drug to pain? What happens if you condition on inflammation?
  3. Genetics affects blood pressure and diet affects blood pressure, but genetics and diet do not cause each other. What structure links genetics and diet through blood pressure? What happens if you condition on blood pressure?

Three identification assumptions

Assumption 1: Causal consistency

If person $i$ receives $A_i=a$, then $Y_i=Y_i(a)$.

Assumption 2: Conditional exchangeability

After conditioning on an adequate set $L$,

$$ Y(a) \coprod A \mid L. $$

Assumption 3: Positivity

For all treatment levels and covariate strata used for inference,

$$ P(A=a\mid L=l)>0. $$
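When $L$ is discrete, a positivity violation is easy to detect empirically: count treated and untreated units in each stratum. A small sketch with simulated data (Python here; the strata and probabilities are illustrative):

```python
# Sketch: a crude empirical positivity check over discrete strata of L.
import numpy as np

rng = np.random.default_rng(3)
n = 10_000
l = rng.integers(0, 3, size=n)            # three covariate strata
# Stratum 2 never receives treatment: a structural positivity violation.
p_treat = np.array([0.5, 0.3, 0.0])[l]
a = rng.binomial(1, p_treat)

counts = {s: (int(a[l == s].sum()), int(((l == s) & (a == 0)).sum()))
          for s in range(3)}
for s, (n_treated, n_control) in counts.items():
    print(f"L={s}: treated={n_treated}, control={n_control}")
# Stratum L=2 has zero treated units, so P(A=1 | L=2) = 0 and the
# stratum-specific contrast is undefined.
```

An empty cell in such a table is not a small-sample nuisance to smooth over; if it reflects the assignment mechanism, the contrast in that stratum has no data support at all.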

Pair exercise: checking assumptions against a causal DAG

  1. Draw a causal DAG for the Salk vaccine example. Include: parental consent ($L$), vaccine assignment ($A$), polio outcome ($Y$), and socioeconomic status ($U$) as an unmeasured common cause of $L$ and $Y$.
  2. In the observational design (assignment by parental consent), check exchangeability: is $Y(a) \coprod A$? Trace the open backdoor path.
  3. In the randomised design, check exchangeability: is $Y(a) \coprod A$? Explain why the path is now blocked.
  4. Check positivity in each design. In which design is a positivity violation more probable, and why?

Four rules of confounding control

  1. Condition on common causes (or defensible proxies). If $L$ causes both $A$ and $Y$, the fork $A \leftarrow L \to Y$ opens a backdoor path. Conditioning on $L$ closes it. When $L$ is unmeasured, conditioning on a measured proxy can reduce, though not eliminate, confounding.
  2. Do not condition on mediators when estimating total effects. If $A \to M \to Y$, conditioning on $M$ blocks part of the causal path we want to estimate.
  3. Do not condition on colliders. If $A \to C \leftarrow Y$, conditioning on $C$ opens a spurious path between $A$ and $Y$. The "control for everything" instinct is unsafe for this reason.
  4. Treat descendants carefully. Conditioning on a descendant of a variable is akin to partially conditioning on that variable. A descendant of a collider can transmit collider bias; a descendant of a confounder can partially reduce confounding.

A short rulebook for the test

  • Common cause: usually condition.
  • Mediator: do not condition if you want the total effect.
  • Collider: do not condition.
  • Descendant: ask what it is downstream of before you adjust for it.
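Rule 1 can be made concrete with a small simulation (Python here, though the labs use R; the numbers are illustrative). A binary $L$ causes both $A$ and $Y$, so the naive difference in means is biased; standardising the stratum-specific contrasts over the distribution of $L$ recovers the true effect:

```python
# Sketch: confounding by a binary L, and adjustment by standardisation.
import numpy as np

rng = np.random.default_rng(2)
n = 200_000

l = rng.integers(0, 2, size=n)                   # confounder
a = rng.binomial(1, np.where(l == 1, 0.8, 0.2))  # L -> A
y = 1.0 * a + 3.0 * l + rng.normal(size=n)       # true effect of A is 1.0

naive = y[a == 1].mean() - y[a == 0].mean()

# Standardise: stratum-specific contrasts weighted by P(L = l)
adjusted = sum(
    (y[(a == 1) & (l == s)].mean() - y[(a == 0) & (l == s)].mean())
    * (l == s).mean()
    for s in (0, 1)
)
print(round(naive, 2), round(adjusted, 2))  # naive ≈ 2.8, adjusted ≈ 1.0
```

Within each stratum of $L$ the fork is closed, so the contrast is unbiased; the weighted average of those contrasts is one simple way of "conditioning on the common cause".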

A note on the generality of d-separation

Two variables are d-separated ("directionally separated") in a causal DAG when every path between them is blocked. In practice, this means that the DAG implies conditional independence once the relevant conditioning set is stated. The rules above focus on confounding, but d-separation is more general than confounding control. It is the reason the same DAG logic can later be used for collider bias, mediator bias, and measurement problems.

Return to the opening example

The Salk example is a structural lesson about assignment. The observed-control design produced a biased effect estimate because socioeconomic status confounded the comparison. The randomised design blocked that path. Causal DAGs help us state this lesson before analysis. First we define the question. Then we draw assumptions. Then we choose an adjustment set. Then we estimate.

Where do causal assumptions come from?

A causal DAG encodes assumptions. Those assumptions do not come from the data. They come from prior knowledge: theory, mechanism, previous studies, and domain expertise. This dependence on existing knowledge might seem circular. If we need knowledge to draw a causal DAG, and a causal DAG is required for causal inference, where do we start?

Otto Neurath's metaphor of the ship at sea captures the answer:

We are like sailors who on the open sea must reconstruct their ship but are never able to start afresh from the bottom. Where a beam is taken away a new one must at once be put there, and for this the rest of the ship is used as support. In this way, by using the old beams and driftwood, the ship can be shaped entirely anew, but only by gradual reconstruction. (Neurath, 1973, p. 199)

Causal diagrams are planks in Neurath's boat. We build them from the best available knowledge, test their implications, and revise when evidence warrants. The alternative, letting data alone determine causal structure, is not available. Data reveal associations. Associations are compatible with many causal structures. Without assumptions, the data do not tell us which structure generated them.

Next week we apply these structures to the specific problem of confounding bias: the patterns that open backdoor paths and the strategies for closing them.

Epilogue: avoid "within-person" and "between-person"

Students often describe designs as "within-person" or "between-person". These labels feel intuitive, but they hide the causal object. "Between-person" in particular can mislead because it sounds like we compare two different populations. In an experiment we have one population, which we project into two potential states under two intervention levels. Randomisation lets two groups stand in for those two projected states.

In this course we instead name a target population, two intervention regimes, an outcome, and the time that we measure that outcome.

  1. This framing works even when the target population contains one unit. Let the population be Alice. Define two regimes over time: $a=1$ means Alice doom-scrolls for two hours nightly for four weeks, and $a=0$ means she studies for two hours nightly for four weeks. Let the outcome be her life satisfaction, measured after each four-week period. The causal contrast for Alice is $\delta_{\text{Alice}}=Y_{\text{Alice}}(1)-Y_{\text{Alice}}(0)$.

  2. Week 1 shows why this contrast is inaccessible from observation alone. Alice can follow regime $a=1$ or regime $a=0$, but not both in the same period. We therefore observe only one of $Y_{\text{Alice}}(1)$ or $Y_{\text{Alice}}(0)$. This missing counterfactual is not a statistical inconvenience. It is a logical constraint.

  3. Week 2 adds a second lesson. Even when we target a population-level average, we still need a defensible assignment story. Causal DAGs let us state, and critique, the assumptions that connect the observed data to the causal contrast. They do not rescue imprecise language. They force us to say what we compare, for whom, and why.

Pair exercise: Neurath's ship and your own causal DAG

  1. Draw a causal DAG from your own discipline or research interest with at least four variables.
  2. Identify one fork and one chain in your causal DAG.
  3. Swap with your partner. Your partner plays sceptic: challenge one arrow by proposing an alternative causal direction or an omitted common cause.
  4. Revise your causal DAG in response. State what changed and why.

Further reading

The identification assumptions and randomisation framework are treated in Hernán & Robins (2025) (chapters 1-2) and Bulbulia (2024a). See also Bulbulia (2024b).


Lab materials: Lab 2: Install R and Set Up Your IDE

Barrett, M. (2023). ggdag: Analyze and create elegant directed acyclic graphs. https://github.com/malcolmbarrett/ggdag

Bulbulia, J. A. (2024a). Methods in causal inference part 1: Causal diagrams and confounding. Evolutionary Human Sciences, 6, e40. https://doi.org/10.1017/ehs.2024.35

Bulbulia, J. A. (2024b). Methods in causal inference part 4: Confounding in experiments. Evolutionary Human Sciences, 6, e43. https://doi.org/10.1017/ehs.2024.34

Dublin, T. D. (1955). 1954 poliomyelitis vaccine field trial: Plan, field operations, and follow-up observations. JAMA, 158(14), 1258–1265. https://doi.org/10.1001/jama.1955.02960140020003

Francis Jr., T. (1955). Evaluation of the 1954 poliomyelitis vaccine field trial: Further studies of results determining the effectiveness of poliomyelitis vaccine (Salk) in preventing paralytic poliomyelitis. JAMA, 158(14), 1266–1270. https://doi.org/10.1001/jama.1955.02960140028004

Hernán, M. A., & Robins, J. M. (2025). Causal inference: What if. Chapman & Hall/CRC. https://www.hsph.harvard.edu/miguel-hernan/causal-inference-book/

Neurath, O. (1973). Anti-spengler. In M. Neurath & R. S. Cohen (Eds.), Empiricism and sociology (pp. 158–213). Springer Netherlands. https://doi.org/10.1007/978-94-010-2525-6_6

Salk, J. E. (1957). Poliomyelitis vaccination in the fall of 1956. American Journal of Public Health and the Nation’s Health, 47(1), 1–18. https://doi.org/10.2105/AJPH.47.1.1