
Lab 9: Policy Trees

R script

Download the R script for this lab (right-click → Save As)

This lab moves from evaluating heterogeneity (Lab 8) to making decisions. Policy trees learn simple, interpretable treatment assignment rules from causal forest predictions. You will fit policy trees of different depths, evaluate their performance, and discuss the ethical implications of algorithmic treatment assignment.

What you will learn

  1. How to construct a reward matrix from causal forest predictions
  2. How to fit and interpret depth-1 and depth-2 policy trees
  3. How to evaluate policy performance against random assignment
  4. How to express learned rules in plain language

Connection to previous labs

This lab uses the causal forest and individual treatment effect predictions from Labs 5-8. The progression is: estimate effects (Lab 5) → discover heterogeneity (Lab 6) → evaluate targeting (Lab 8) → learn assignment rules (this lab).

Setup

library(causalworkshop)
library(grf)
library(policytree)
library(tidyverse)

Install policytree

If you haven't installed policytree, run install.packages("policytree") first.

Re-fit the causal forest (or re-use from previous labs):

# simulate data
d <- simulate_nzavs_data(n = 5000, seed = 2026)
d0 <- d |> filter(wave == 0)
d1 <- d |> filter(wave == 1)
d2 <- d |> filter(wave == 2)

# construct matrices
covariate_cols <- c(
  "age", "male", "nz_european", "education", "partner", "employed",
  "log_income", "nz_dep", "agreeableness", "conscientiousness",
  "extraversion", "neuroticism", "openness",
  "community_group", "wellbeing"
)

X <- as.matrix(d0[, covariate_cols])
Y <- d2$wellbeing
W <- d1$community_group

cf <- causal_forest(
  X, Y, W,
  num.trees = 1000,
  honesty = TRUE,
  tune.parameters = "all",
  seed = 2026
)

tau_hat <- predict(cf)$predictions

The gamma matrix

A policy tree needs a reward matrix (called the "gamma matrix"). Each row is an individual; each column is an action. The entry gives the expected reward for assigning that individual to that action.

With two actions (treat vs not treat), the gamma matrix has two columns:

# construct gamma matrix
# column 1: reward if not treated (control) = 0 (baseline)
# column 2: reward if treated = predicted treatment effect
gamma_matrix <- cbind(
  control = rep(0, length(tau_hat)),
  treatment = tau_hat
)

head(gamma_matrix)

Why is the control reward zero?

We normalise the control reward to zero so that the treatment column represents the gain from treating. A positive value means treatment helps; a negative value means treatment harms. The policy tree then simply needs to decide: for which individuals is the gain positive enough to justify treatment?
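To see why this normalisation is harmless, note that adding the same per-person constant to both columns never changes which column has the larger reward. A minimal base-R sketch (with made-up effect values, not lab output) demonstrates this:

```r
# toy predicted effects for five individuals (hypothetical values)
tau_toy <- c(0.4, -0.2, 0.1, -0.5, 0.3)

# gamma with the control reward normalised to zero, as in the lab
gamma_a <- cbind(control = 0, treatment = tau_toy)

# the same gamma with an arbitrary per-person baseline added to BOTH columns
baseline <- c(1.0, 2.0, -1.0, 0.5, 0.0)
gamma_b  <- gamma_a + baseline  # recycles baseline down each column

# the reward-maximising action per row is identical either way:
# only the difference between columns matters
max.col(gamma_a, ties.method = "first")
max.col(gamma_b, ties.method = "first")
```

Because only the column difference matters, some workflows put the doubly robust score for each arm in the columns instead; the zero-baseline form used here is simply the easiest to read.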

Fit a depth-1 policy tree

A depth-1 tree makes a single split, dividing the population into two groups based on one covariate:

# subsample for speed (policy tree fitting can be slow on large datasets)
set.seed(2026)
n_sample <- 500
idx <- sample(seq_len(nrow(X)), n_sample)

X_sample <- as.data.frame(X[idx, ])
gamma_sample <- gamma_matrix[idx, ]

# fit depth-1 policy tree
pt_depth1 <- policy_tree(X_sample, gamma_sample, depth = 1)

# print the tree
print(pt_depth1)

Visualise the tree:

plot(pt_depth1)

Reading the tree

The tree shows one splitting variable and a threshold. Individuals above the threshold go one way; individuals below go the other. The leaf labels (1 or 2) correspond to the columns of the gamma matrix: 1 = control, 2 = treatment.
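Since predicting from a policy tree returns these column indices, a quick way to make assignments readable is to index into the gamma-matrix column names. A small sketch, using a hypothetical action vector in place of the `predict(pt_depth1, X_full)` output:

```r
# hypothetical action indices; in the lab these come from predict() on the tree
actions <- c(1, 2, 2, 1, 2)

# map indices back to the gamma-matrix column names
action_labels <- c("control", "treatment")[actions]

# tabulate how many individuals each action covers
table(action_labels)
```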

Fit a depth-2 policy tree

A depth-2 tree makes two sequential splits, creating four groups:

# fit depth-2 policy tree
pt_depth2 <- policy_tree(X_sample, gamma_sample, depth = 2)

# print and plot
print(pt_depth2)
plot(pt_depth2)

Evaluate policies

Predict treatment assignments on the full dataset and compare with random assignment:

# predict actions for full dataset
X_full <- as.data.frame(X)
actions_depth1 <- predict(pt_depth1, X_full)
actions_depth2 <- predict(pt_depth2, X_full)

# compute expected reward under each policy
# action = 1 means control, action = 2 means treatment
reward_depth1 <- ifelse(actions_depth1 == 1, gamma_matrix[, 1], gamma_matrix[, 2])
reward_depth2 <- ifelse(actions_depth2 == 1, gamma_matrix[, 1], gamma_matrix[, 2])
reward_random <- 0.5 * gamma_matrix[, 1] + 0.5 * gamma_matrix[, 2]

# compare policies
policy_comparison <- tibble(
  policy = c("Random assignment", "Depth-1 tree", "Depth-2 tree", "Treat everyone"),
  expected_reward = c(
    mean(reward_random),
    mean(reward_depth1),
    mean(reward_depth2),
    mean(tau_hat)
  ),
  treat_rate = c(
    0.50,
    mean(actions_depth1 == 2),
    mean(actions_depth2 == 2),
    1.00
  )
)

print(policy_comparison |> mutate(across(where(is.numeric), \(x) round(x, 3))))

Is the improvement worth the complexity?

A depth-2 tree is harder to explain than a depth-1 tree but may assign treatments more efficiently. Compare the expected rewards: if depth-2 is only marginally better, the simpler depth-1 rule may be preferable for transparency.
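One way to make that judgment concrete is to compare the marginal gain of the second split against the gain of the first. The numbers below are hypothetical placeholders; substitute the `expected_reward` values from your own `policy_comparison` table:

```r
# hypothetical expected rewards in the style of the policy_comparison table
reward <- c(random = 0.051, depth1 = 0.078, depth2 = 0.083)

# gain of each tree over random assignment
gain_depth1 <- reward[["depth1"]] - reward[["random"]]
gain_depth2 <- reward[["depth2"]] - reward[["random"]]

# marginal gain of the second split: if it is small relative to
# gain_depth1, the simpler depth-1 rule may be the better choice
marginal_gain <- reward[["depth2"]] - reward[["depth1"]]

round(c(gain_depth1, gain_depth2, marginal_gain), 3)
```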

Interpret the rules in plain language

Read the tree output and translate it into a decision rule:

# what variables and thresholds does the depth-2 tree use?
print(pt_depth2)

# example interpretation (your values will differ):
# "treat individuals with extraversion > 0.3 and baseline wellbeing < 0.1"

Exercise

Write out the depth-2 policy tree as a set of plain-language if-then rules. For example: "If extraversion is above X, then treat. Otherwise, if neuroticism is below Y, treat; otherwise, do not treat."

Compare policy assignments with actual treatment

How do the policy-recommended assignments compare with who actually received treatment in the data?

# agreement between policy and observed treatment
agreement <- tibble(
  actual = W,
  policy_depth2 = ifelse(actions_depth2 == 2, 1, 0)
) |>
  mutate(agree = actual == policy_depth2)

cat("Agreement rate:", round(mean(agreement$agree), 3), "\n")
cat("Policy treats: ", round(mean(agreement$policy_depth2), 3), "\n")
cat("Actual treated:", round(mean(agreement$actual), 3), "\n")

Ethical considerations

Statistical optimality is not social optimality

A policy tree maximises expected treatment benefit, but it does not account for:

  • Fairness. The tree may split on variables correlated with protected characteristics (ethnicity, gender, socioeconomic status). Even if a variable is not in the covariate set, proxy variables may reproduce discriminatory patterns.
  • Equity. Targeting resources to those who benefit most may mean those who benefit somewhat receive nothing. A policy that is statistically optimal may be socially unjust.
  • Transparency. A depth-2 tree is interpretable; a depth-5 tree is not. Policy-makers and the public need to understand the rule.
  • Override. A clinician or social worker should always be able to override an algorithmic recommendation based on individual context the model cannot see.

When presenting policy tree results, always discuss these trade-offs.

Exercises

Lab diary

Complete at least two of the following exercises for your lab diary.

  1. Different outcome. Fit a policy tree for religious_service on belonging. Do the splitting variables change? What does this suggest about which covariates drive effect modification for different outcomes?

  2. Depth-3 tree. Fit a depth = 3 policy tree. Does the expected reward improve substantially over depth-2? Is the tree still interpretable?

  3. Discuss override. In one paragraph, describe a scenario where a clinician should override a policy tree recommendation. What information would the clinician have that the model does not?