Lab 8: RATE and QINI Curves

R script

Download the R script for this lab (right-click → Save As)

This lab evaluates whether targeting treatment to those predicted to benefit most improves outcomes compared with treating everyone. You will compute RATE curves and QINI curves from causal forest predictions and assess targeting efficiency at different population percentiles.

What you will learn

  1. How to rank individuals by predicted treatment benefit
  2. How to compute and interpret RATE curves (gain over random assignment)
  3. How to compute and interpret QINI curves (cumulative targeting gain)
  4. How to characterise the covariate profile of high-benefit individuals

Connection to previous labs

This lab builds directly on Labs 5 and 6. You will use the causal forest fitted in those labs to evaluate whether targeting resources to the most responsive individuals is worthwhile.

Setup

library(causalworkshop)
library(grf)
library(tidyverse)

Re-fit the causal forest from Labs 5-6 (or copy the code from Lab 5):

# simulate data
d <- simulate_nzavs_data(n = 5000, seed = 2026)
d0 <- d |> filter(wave == 0)
d1 <- d |> filter(wave == 1)
d2 <- d |> filter(wave == 2)

# construct matrices
covariate_cols <- c(
  "age", "male", "nz_european", "education", "partner", "employed",
  "log_income", "nz_dep", "agreeableness", "conscientiousness",
  "extraversion", "neuroticism", "openness",
  "community_group", "wellbeing"
)

X <- as.matrix(d0[, covariate_cols])
Y <- d2$wellbeing
W <- d1$community_group

# fit causal forest
cf <- causal_forest(
  X, Y, W,
  num.trees = 1000,
  honesty = TRUE,
  tune.parameters = "all",
  seed = 2026
)

# extract predicted individual treatment effects
tau_hat <- predict(cf)$predictions

Rank individuals by predicted benefit

The first step in any targeting analysis is to sort individuals from highest to lowest predicted treatment effect:

# sort by predicted benefit (descending)
n <- length(tau_hat)
tau_order <- order(tau_hat, decreasing = TRUE)
tau_sorted <- tau_hat[tau_order]

# what does the top of the distribution look like?
cat("Top 5 predicted effects:   ", round(head(tau_sorted, 5), 3), "\n")
cat("Bottom 5 predicted effects:", round(tail(tau_sorted, 5), 3), "\n")
cat("Overall mean:              ", round(mean(tau_hat), 3), "\n")
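
As a quick check on what `order()` returns, the sketch below uses a made-up vector of five predicted effects (`tau_toy` is hypothetical, not output from the forest). The result holds row positions rather than values, which is why the same index can later be used to pull covariates for the top-ranked individuals.

```r
# toy predicted effects for five hypothetical individuals (illustrative values)
tau_toy <- c(0.10, 0.45, -0.05, 0.30, 0.20)

# positions of individuals, from largest to smallest predicted effect
toy_order <- order(tau_toy, decreasing = TRUE)
print(toy_order)

# the predicted effects themselves, in ranked order
toy_sorted <- tau_toy[toy_order]
print(toy_sorted)

# positions of the two highest-ranked individuals (the top 40% of five)
toy_top <- toy_order[seq_len(2)]
print(toy_top)
```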

RATE curve

The RATE (Rank-Weighted Average Treatment Effect) curve shows how much we gain by targeting treatment to the individuals with the highest predicted benefit, compared with random assignment.

For each targeting rate r, we compute the average predicted effect among the top r share of individuals, minus the overall average:

# compute RATE curve
rates <- seq(0.05, 1.00, by = 0.05)
rate_results <- tibble(
  rate = numeric(),
  avg_tau_targeted = numeric(),
  gain_over_random = numeric()
)

for (r in rates) {
  n_targeted <- floor(r * n)
  targeted_idx <- tau_order[seq_len(n_targeted)]

  avg_targeted <- mean(tau_hat[targeted_idx])
  gain <- avg_targeted - mean(tau_hat)

  rate_results <- bind_rows(
    rate_results,
    tibble(rate = r, avg_tau_targeted = avg_targeted, gain_over_random = gain)
  )
}

print(rate_results |> mutate(across(where(is.numeric), \(x) round(x, 3))))
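
The same curve can be computed without a loop by taking a running mean over the sorted effects. This is a minimal sketch on simulated stand-in values (`tau_sim` and the other `_sim` names are hypothetical, not forest output). It uses `round()` rather than `floor()` so that a product such as `rate * n` landing just below a whole number does not drop an individual.

```r
# simulated stand-in for predicted effects (hypothetical; not forest output)
set.seed(1)
tau_sim <- rnorm(1000, mean = 0.2, sd = 0.1)
n_sim <- length(tau_sim)

# running mean of predicted effects, from the highest-ranked individual down
tau_desc <- sort(tau_sim, decreasing = TRUE)
running_mean <- cumsum(tau_desc) / seq_len(n_sim)

# read the curve off at each targeting rate
rates_sim <- seq(0.05, 1.00, by = 0.05)
n_targeted_sim <- round(rates_sim * n_sim)  # guards against floating-point error
rate_sim <- data.frame(
  rate = rates_sim,
  avg_tau_targeted = running_mean[n_targeted_sim],
  gain_over_random = running_mean[n_targeted_sim] - mean(tau_sim)
)

print(round(rate_sim, 3))
```

Because the running mean of a descending sequence can only fall, the gain column never increases as the targeting rate grows, and it ends at zero when everyone is treated.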

Plot the RATE curve:

ggplot(rate_results, aes(x = rate, y = gain_over_random)) +
  geom_line(colour = "#E69F00", linewidth = 1) +
  geom_point(colour = "#E69F00", size = 2) +
  scale_x_continuous(labels = scales::percent_format()) +
  labs(
    title = "RATE curve: gain from targeting",
    x = "Targeting rate (proportion treated)",
    y = "Gain over random assignment"
  ) +
  theme_minimal()

Reading the RATE curve

A steep curve at low targeting rates means a small group benefits substantially more than average. A flat curve means everyone benefits similarly, and targeting adds no value. The curve always reaches zero at 100% (treating everyone is the same as random).

QINI curve

The QINI curve measures the cumulative gain from targeting. For each percentile p, it computes the total predicted benefit from targeting the top p share of individuals, minus the proportional share of the total benefit they would receive under random assignment:

# compute QINI curve
qini_results <- tibble(
  percentile = numeric(),
  cumulative_gain = numeric()
)

for (p in rates) {
  n_top <- floor(p * n)
  top_idx <- tau_order[seq_len(n_top)]

  # cumulative gain: total effect for targeted minus proportional share
  cum_gain <- sum(tau_hat[top_idx]) - p * sum(tau_hat)

  qini_results <- bind_rows(
    qini_results,
    tibble(percentile = p, cumulative_gain = cum_gain)
  )
}

print(qini_results |> mutate(across(where(is.numeric), \(x) round(x, 3))))
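
The RATE and QINI loops compute closely related quantities. Writing $\hat\tau_{(i)}$ for the $i$-th largest predicted effect, $\bar\tau$ for the overall mean, and $k = \lfloor pn \rfloor$ for the number of individuals targeted (notation introduced here for illustration, not taken from the lab), the definitions in the code give:

```latex
\begin{aligned}
\mathrm{gain}(p) &= \frac{1}{k}\sum_{i=1}^{k}\hat\tau_{(i)} - \bar\tau,
  \qquad \bar\tau = \frac{1}{n}\sum_{i=1}^{n}\hat\tau_i \\[4pt]
\mathrm{cum\_gain}(p) &= \sum_{i=1}^{k}\hat\tau_{(i)} - p\,n\,\bar\tau \\
  &= k\,\mathrm{gain}(p) + (k - pn)\,\bar\tau
\end{aligned}
```

When $pn$ is a whole number, $k = pn$ and the last term vanishes, so the QINI value is the RATE gain multiplied by the number of individuals targeted. The RATE gain is therefore largest at small targeting rates, while the QINI value starts at zero, rises, and returns to zero at 100%.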

Plot the QINI curve:

ggplot(qini_results, aes(x = percentile, y = cumulative_gain)) +
  geom_line(colour = "#56B4E9", linewidth = 1) +
  geom_point(colour = "#56B4E9", size = 2) +
  scale_x_continuous(labels = scales::percent_format()) +
  labs(
    title = "QINI curve: cumulative targeting gain",
    x = "Population percentile",
    y = "Cumulative gain over random"
  ) +
  theme_minimal()

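Compute the area under the QINI curve (AUQC). The code below is a right-endpoint Riemann sum on the 5% grid; because the curve is zero at both 0% and 100%, it gives the same value as the trapezoidal rule:

# area under QINI curve (grid step = 0.05)
auqc <- sum(qini_results$cumulative_gain * 0.05)
cat("Area Under QINI Curve (AUQC):", round(auqc, 3), "\n")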
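
The sum above multiplies every grid value by the step width. A sketch of an explicit trapezoidal rule that includes the origin shows that the two approaches agree when the curve is zero at both ends. `trapz` is a hypothetical helper and the gain values are made up.

```r
# trapezoidal rule for points (x, y); hypothetical helper, not from any package
trapz <- function(x, y) {
  sum(diff(x) * (head(y, -1) + tail(y, -1)) / 2)
}

# made-up curve on the lab's 5% grid, zero at 0% and at 100%
p_grid <- seq(0, 1, by = 0.05)
gain_toy <- 100 * p_grid * (1 - p_grid)

auqc_trapz <- trapz(p_grid, gain_toy)
auqc_riemann <- sum(gain_toy[-1] * 0.05)  # same form as the lab's sum; omits the point at 0

cat("Trapezoid:  ", round(auqc_trapz, 3), "\n")
cat("Riemann sum:", round(auqc_riemann, 3), "\n")
```

If the grid were changed so that it no longer ended at 100%, the two numbers would differ, and the explicit trapezoid would be the safer choice.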

AUQC interpretation

A larger AUQC means targeting is more valuable. An AUQC near zero means there is little heterogeneity to exploit, and random assignment performs nearly as well as targeted assignment.
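
The hand-built curves rank and evaluate with the same predictions, so their gains are non-negative by construction. The grf package also provides `rank_average_treatment_effect()`, which estimates the area under the targeting curve from doubly robust scores and reports a standard error. The sketch below is one possible way to use it, not the lab's method: the simulated data, the 50/50 split, and all `_sim` names are assumptions made for illustration.

```r
library(grf)

# simulated data (hypothetical; stands in for the lab's X, Y, W)
set.seed(2026)
n_obs <- 2000
X_sim <- matrix(rnorm(n_obs * 5), n_obs, 5)
W_sim <- rbinom(n_obs, 1, 0.5)
Y_sim <- pmax(X_sim[, 1], 0) * W_sim + X_sim[, 2] + rnorm(n_obs)

# one half learns the ranking, the other half evaluates it
train <- sample(n_obs, n_obs / 2)
cf_train <- causal_forest(X_sim[train, ], Y_sim[train], W_sim[train], num.trees = 500)
cf_eval  <- causal_forest(X_sim[-train, ], Y_sim[-train], W_sim[-train], num.trees = 500)

# priority scores for the evaluation half, predicted by the training forest
priority <- predict(cf_train, X_sim[-train, ])$predictions

# estimate the targeting metric under each weighting
rate_autoc <- rank_average_treatment_effect(cf_eval, priority, target = "AUTOC")
rate_qini  <- rank_average_treatment_effect(cf_eval, priority, target = "QINI")

cat("AUTOC:", round(rate_autoc$estimate, 3), "SE:", round(rate_autoc$std.err, 3), "\n")
cat("QINI: ", round(rate_qini$estimate, 3), "SE:", round(rate_qini$std.err, 3), "\n")
```

An estimate that is small relative to its standard error suggests the ranking does not reliably separate high-benefit from low-benefit individuals, even when the hand-built curve looks steep.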

Targeting efficiency

Create a summary table comparing the top 10%, 20%, and 50% of predicted beneficiaries:

# targeting efficiency at key percentiles
top_10_idx <- tau_order[seq_len(floor(0.10 * n))]
top_20_idx <- tau_order[seq_len(floor(0.20 * n))]
top_50_idx <- tau_order[seq_len(floor(0.50 * n))]

efficiency <- tibble(
  group = c("Top 10%", "Top 20%", "Top 50%", "Everyone"),
  avg_effect = c(
    mean(tau_hat[top_10_idx]),
    mean(tau_hat[top_20_idx]),
    mean(tau_hat[top_50_idx]),
    mean(tau_hat)
  ),
  lift_vs_random = c(
    mean(tau_hat[top_10_idx]) / mean(tau_hat),
    mean(tau_hat[top_20_idx]) / mean(tau_hat),
    mean(tau_hat[top_50_idx]) / mean(tau_hat),
    1
  )
) |>
  mutate(efficiency_gain_pct = round((lift_vs_random - 1) * 100, 1))

print(efficiency |> mutate(across(where(is.numeric), \(x) round(x, 3))))
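
The ratio `lift_vs_random` is easiest to read when the overall mean effect is clearly positive. The sketch below uses made-up effects with the same spread but two different overall means (all names are hypothetical). The difference between the targeted and overall mean stays the same, while the ratio grows sharply as the overall mean approaches zero.

```r
# made-up predicted effects: identical spread, different overall means
set.seed(42)
base <- rnorm(1000, sd = 0.1)
tau_large_mean <- base - mean(base) + 0.20  # overall mean 0.20
tau_small_mean <- base - mean(base) + 0.01  # overall mean 0.01

# ratio and difference of the top-share mean against the overall mean
top_share_summary <- function(tau, share = 0.10) {
  k <- round(share * length(tau))
  top_mean <- mean(sort(tau, decreasing = TRUE)[seq_len(k)])
  c(ratio = top_mean / mean(tau), difference = top_mean - mean(tau))
}

s_large <- top_share_summary(tau_large_mean)
s_small <- top_share_summary(tau_small_mean)

print(round(rbind(large_mean = s_large, small_mean = s_small), 3))
```

Reporting the difference alongside the ratio keeps the table interpretable when the average effect is small or negative.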

Characterise the covariate profile of high-benefit individuals:

# who are the top 10%?
top_10_data <- d0[tau_order[seq_len(floor(0.10 * n))], ]
everyone <- d0

cat("Top 10% vs everyone:\n")
cat("  Extraversion:    ", round(mean(top_10_data$extraversion), 2),
    "vs", round(mean(everyone$extraversion), 2), "\n")
cat("  Neuroticism:     ", round(mean(top_10_data$neuroticism), 2),
    "vs", round(mean(everyone$neuroticism), 2), "\n")
cat("  Partner (prop):  ", round(mean(top_10_data$partner), 2),
    "vs", round(mean(everyone$partner), 2), "\n")
cat("  Agreeableness:   ", round(mean(top_10_data$agreeableness), 2),
    "vs", round(mean(everyone$agreeableness), 2), "\n")

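Comparing a few hand-picked covariates can miss differences elsewhere. One option is to compute a standardised mean difference for every covariate at once. The sketch below does this on a made-up data frame; `toy`, `toy_tau`, and `profile_table` are hypothetical names, and the effects are simulated so that they rise with extraversion.

```r
# made-up covariates for 500 hypothetical individuals
set.seed(7)
toy <- data.frame(
  extraversion = rnorm(500),
  neuroticism  = rnorm(500),
  partner      = rbinom(500, 1, 0.6)
)

# made-up predicted effects that increase with extraversion
toy_tau <- 0.2 + 0.1 * toy$extraversion + rnorm(500, sd = 0.02)

# positions of the top 10% (50 of 500) by predicted effect
top_idx_toy <- order(toy_tau, decreasing = TRUE)[seq_len(50)]

# mean in the top group, mean overall, and the gap in standard-deviation units
profile_table <- data.frame(
  covariate = names(toy),
  mean_top  = sapply(toy[top_idx_toy, ], mean),
  mean_all  = sapply(toy, mean),
  sd_all    = sapply(toy, sd)
)
profile_table$std_diff <-
  (profile_table$mean_top - profile_table$mean_all) / profile_table$sd_all

# largest absolute differences first
print(profile_table[order(-abs(profile_table$std_diff)), ], digits = 2, row.names = FALSE)
```

Dividing by the standard deviation puts binary and continuous covariates on a comparable scale, so the covariates that most distinguish the top group appear first.
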
Key takeaway

RATE and QINI curves translate heterogeneous treatment effects into actionable targeting decisions. If the curves are steep, concentrating resources on high-benefit individuals improves overall outcomes. If the curves are flat, treating everyone equally is just as effective. In Lab 9, we will learn how to express these targeting decisions as simple, interpretable rules using policy trees.

Exercises

Lab diary

Complete at least two of the following exercises for your lab diary.

  1. Different outcome. Compute RATE and QINI curves for a different outcome (e.g., belonging or life_satisfaction). Is the AUQC larger or smaller? What does this imply about targeting?

  2. Negative effects. Some individuals may have tau_hat < 0, meaning the treatment is predicted to harm them. How many individuals in your sample have negative predicted effects? What are the implications for resource allocation?

  3. AUTOC vs QINI weighting. The RATE curve (AUTOC weighting) emphasises the top of the ranking, while the QINI curve weights all percentiles equally. In one paragraph, explain when each metric would be more useful for a policy-maker.