Lab 8: RATE and QINI Curves
This lab evaluates whether targeting treatment to those predicted to benefit most improves outcomes compared with treating everyone. You will compute RATE curves and QINI curves from causal forest predictions and assess targeting efficiency at different population percentiles.
- How to rank individuals by predicted treatment benefit
- How to compute and interpret RATE curves (gain over random assignment)
- How to compute and interpret QINI curves (cumulative targeting gain)
- How to characterise the covariate profile of high-benefit individuals
This lab builds directly on Labs 5 and 6. You will use the causal forest fitted in those labs to evaluate whether targeting resources to the most responsive individuals is worthwhile.
Setup
library(causalworkshop)
library(grf)
library(tidyverse)
Re-fit the causal forest from Labs 5-6 (or copy the code from Lab 5):
# simulate data
d <- simulate_nzavs_data(n = 5000, seed = 2026)
d0 <- d |> filter(wave == 0)
d1 <- d |> filter(wave == 1)
d2 <- d |> filter(wave == 2)
# construct matrices
covariate_cols <- c(
"age", "male", "nz_european", "education", "partner", "employed",
"log_income", "nz_dep", "agreeableness", "conscientiousness",
"extraversion", "neuroticism", "openness",
"community_group", "wellbeing"
)
X <- as.matrix(d0[, covariate_cols])
Y <- d2$wellbeing
W <- d1$community_group
# fit causal forest
cf <- causal_forest(
X, Y, W,
num.trees = 1000,
honesty = TRUE,
tune.parameters = "all",
seed = 2026
)
# extract predicted individual treatment effects
tau_hat <- predict(cf)$predictions
Rank individuals by predicted benefit
The first step in any targeting analysis is to sort individuals from highest to lowest predicted treatment effect:
# sort by predicted benefit (descending)
n <- length(tau_hat)
tau_order <- order(tau_hat, decreasing = TRUE)
tau_sorted <- tau_hat[tau_order]
# what does the top of the distribution look like?
cat("Top 5 predicted effects: ", round(head(tau_sorted, 5), 3), "\n")
cat("Bottom 5 predicted effects:", round(tail(tau_sorted, 5), 3), "\n")
cat("Overall mean: ", round(mean(tau_hat), 3), "\n")
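As a quick check on what the ranking step returns, here is a minimal sketch on four made-up values (the names `tau_toy` and `ord` are illustrative, not part of the lab data). `order()` returns row positions rather than sorted values, which is why the same index can later be used to subset both `tau_hat` and `d0`.

```r
# toy predicted effects for four hypothetical individuals
tau_toy <- c(a = 0.10, b = 0.45, c = -0.05, d = 0.30)

# positions of the individuals, from largest to smallest predicted effect
ord <- order(tau_toy, decreasing = TRUE)

print(ord)                 # positions in the original vector
print(names(tau_toy)[ord]) # who comes first, second, ...
print(tau_toy[ord])        # the sorted effects themselves
```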
RATE curve
The RATE (Rank-Weighted Average Treatment Effect) curve shows how much we gain by targeting treatment to the top fraction $r$ of predicted beneficiaries, compared with random assignment. (In grf, this curve is called the Targeting Operator Characteristic, or TOC; the RATE itself is a weighted area under it.)
For each targeting rate $r$, we compute the average predicted effect among the top $r$ of individuals, minus the overall average:
# compute RATE curve
rates <- seq(0.05, 1.00, by = 0.05)
rate_results <- tibble(
rate = numeric(),
avg_tau_targeted = numeric(),
gain_over_random = numeric()
)
for (r in rates) {
n_targeted <- floor(r * n)
targeted_idx <- tau_order[seq_len(n_targeted)]
avg_targeted <- mean(tau_hat[targeted_idx])
gain <- avg_targeted - mean(tau_hat)
rate_results <- bind_rows(
rate_results,
tibble(rate = r, avg_tau_targeted = avg_targeted, gain_over_random = gain)
)
}
print(rate_results |> mutate(across(where(is.numeric), \(x) round(x, 3))))
Plot the RATE curve:
ggplot(rate_results, aes(x = rate, y = gain_over_random)) +
geom_line(colour = "#E69F00", linewidth = 1) +
geom_point(colour = "#E69F00", size = 2) +
scale_x_continuous(labels = scales::percent_format()) +
labs(
title = "RATE curve: gain from targeting",
x = "Targeting rate (proportion treated)",
y = "Gain over random assignment"
) +
theme_minimal()
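A steep curve at low targeting rates means a small group is predicted to benefit substantially more than average. A flat curve means predicted benefits are similar for everyone, so targeting adds little value. The curve always reaches zero at 100%, because treating everyone yields the overall average effect, the same as random assignment. Note that this curve is computed from the predictions themselves, so it is non-negative by construction: it describes the spread of the predicted effects and does not by itself validate them.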
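To make the steep-versus-flat contrast concrete, the sketch below compares two simulated sets of predicted effects with the same mean but different spread. The numbers and the helper `toc_gain()` are made up for illustration and are not part of the lab data or of grf.

```r
# Toy illustration (simulated numbers, not the lab data)
set.seed(1)
n_toy <- 1000
tau_het <- rnorm(n_toy, mean = 0.2, sd = 0.30)  # strong heterogeneity
tau_hom <- rnorm(n_toy, mean = 0.2, sd = 0.02)  # almost no heterogeneity

# hypothetical helper: gain over random assignment when treating the top `rate`
toc_gain <- function(tau, rate) {
  n_top <- round(rate * length(tau))
  top <- sort(tau, decreasing = TRUE)[seq_len(n_top)]
  mean(top) - mean(tau)
}

toy_rates <- c(0.1, 0.5, 1.0)
gain_het <- sapply(toy_rates, toc_gain, tau = tau_het)
gain_hom <- sapply(toy_rates, toc_gain, tau = tau_hom)

print(data.frame(
  rate = toy_rates,
  gain_het = round(gain_het, 3),
  gain_hom = round(gain_hom, 3)
))
```

With the wide distribution the gain at 10% is large and falls quickly; with the narrow one it is close to zero at every rate. Both reach zero at 100%.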
QINI curve
The QINI curve measures the cumulative gain from targeting. For each percentile $p$, it computes the total predicted benefit from targeting the top $p$ of individuals, minus the proportional share they would get under random assignment:
# compute QINI curve
qini_results <- tibble(
percentile = numeric(),
cumulative_gain = numeric()
)
for (p in rates) {
n_top <- floor(p * n)
top_idx <- tau_order[seq_len(n_top)]
# cumulative gain: total effect for targeted minus proportional share
cum_gain <- sum(tau_hat[top_idx]) - p * sum(tau_hat)
qini_results <- bind_rows(
qini_results,
tibble(percentile = p, cumulative_gain = cum_gain)
)
}
print(qini_results |> mutate(across(where(is.numeric), \(x) round(x, 3))))
Plot the QINI curve:
ggplot(qini_results, aes(x = percentile, y = cumulative_gain)) +
geom_line(colour = "#56B4E9", linewidth = 1) +
geom_point(colour = "#56B4E9", size = 2) +
scale_x_continuous(labels = scales::percent_format()) +
labs(
title = "QINI curve: cumulative targeting gain",
x = "Population percentile",
y = "Cumulative gain over random"
) +
theme_minimal()
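Compute the area under the QINI curve (AUQC). Because the curve is zero at both 0% and 100%, the trapezoidal rule on the evenly spaced 5% grid reduces to the sum of the gains multiplied by the grid width:
# area under QINI curve (trapezoidal rule with zero endpoints)
grid_width <- 0.05  # spacing of the `rates` grid
auqc <- sum(qini_results$cumulative_gain) * grid_width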
cat("Area Under QINI Curve (AUQC):", round(auqc, 3), "\n")
A larger AUQC means targeting is more valuable. An AUQC near zero means there is little heterogeneity to exploit, and random assignment performs nearly as well as targeted assignment.
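The curves in this lab are built from `tau_hat` alone. Recent versions of grf also provide `rank_average_treatment_effect()`, which estimates the AUTOC or QINI summary from doubly robust scores and reports a standard error. The sketch below uses small simulated data rather than the lab data; the names `X_sim`, `W_sim`, `Y_sim`, and the train/evaluation split are assumptions made for illustration, not the lab's required workflow.

```r
library(grf)

# simulated data (illustrative only): the effect depends on the first covariate
set.seed(2026)
n_sim <- 2000
p_sim <- 5
X_sim <- matrix(rnorm(n_sim * p_sim), n_sim, p_sim)
W_sim <- rbinom(n_sim, 1, 0.5)
tau_true <- pmax(X_sim[, 1], 0)
Y_sim <- tau_true * W_sim + X_sim[, 2] + rnorm(n_sim)

# split: one half learns the ranking, the other half evaluates it
train <- sample(n_sim, n_sim / 2)
cf_train <- causal_forest(X_sim[train, ], Y_sim[train], W_sim[train],
                          num.trees = 500, seed = 2026)
cf_eval <- causal_forest(X_sim[-train, ], Y_sim[-train], W_sim[-train],
                         num.trees = 500, seed = 2026)

# priorities: predicted effects for the evaluation half, from the training forest
priorities <- predict(cf_train, X_sim[-train, ])$predictions

rate_autoc <- rank_average_treatment_effect(cf_eval, priorities, target = "AUTOC")
rate_qini <- rank_average_treatment_effect(cf_eval, priorities, target = "QINI")

cat("AUTOC:", round(rate_autoc$estimate, 3), "SE:", round(rate_autoc$std.err, 3), "\n")
cat("QINI: ", round(rate_qini$estimate, 3), "SE:", round(rate_qini$std.err, 3), "\n")
```

Learning the ranking on one half and evaluating it on the other keeps the evaluation from rewarding noise in the predictions. An estimate that is large relative to its standard error suggests the ranking captures real heterogeneity.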
Targeting efficiency
Create a summary table comparing the top 10%, 20%, and 50% of predicted beneficiaries:
# targeting efficiency at key percentiles
top_10_idx <- tau_order[seq_len(floor(0.10 * n))]
top_20_idx <- tau_order[seq_len(floor(0.20 * n))]
top_50_idx <- tau_order[seq_len(floor(0.50 * n))]
efficiency <- tibble(
group = c("Top 10%", "Top 20%", "Top 50%", "Everyone"),
avg_effect = c(
mean(tau_hat[top_10_idx]),
mean(tau_hat[top_20_idx]),
mean(tau_hat[top_50_idx]),
mean(tau_hat)
),
lift_vs_random = c(
mean(tau_hat[top_10_idx]) / mean(tau_hat),
mean(tau_hat[top_20_idx]) / mean(tau_hat),
mean(tau_hat[top_50_idx]) / mean(tau_hat),
1
)
) |>
mutate(efficiency_gain_pct = round((lift_vs_random - 1) * 100, 1))
print(efficiency |> mutate(across(where(is.numeric), \(x) round(x, 3))))
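One caution when reading `lift_vs_random`: it is a ratio, so it becomes unstable when the overall average effect is close to zero and changes sign when that average is negative. The made-up numbers below (not lab results) hold the top-group average fixed and vary the overall average to show this; the difference stays interpretable throughout.

```r
# hypothetical averages: top-group effect fixed, overall average varies
avg_top <- 0.30
avg_all <- c(0.20, 0.05, 0.01, -0.01)

lift_table <- data.frame(
  avg_all = avg_all,
  difference = avg_top - avg_all,       # gain on the outcome scale
  ratio = round(avg_top / avg_all, 1)   # lift, as computed in the table above
)
print(lift_table)
```

If the overall mean of `tau_hat` in your data is near zero, the difference (as in the RATE curve) may be the safer quantity to report.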
Characterise the covariate profile of high-benefit individuals:
# who are the top 10%?
top_10_data <- d0[tau_order[seq_len(floor(0.10 * n))], ]
everyone <- d0
cat("Top 10% vs everyone:\n")
cat(" Extraversion: ", round(mean(top_10_data$extraversion), 2),
"vs", round(mean(everyone$extraversion), 2), "\n")
cat(" Neuroticism: ", round(mean(top_10_data$neuroticism), 2),
"vs", round(mean(everyone$neuroticism), 2), "\n")
cat(" Partner (prop): ", round(mean(top_10_data$partner), 2),
"vs", round(mean(everyone$partner), 2), "\n")
cat(" Agreeableness: ", round(mean(top_10_data$agreeableness), 2),
"vs", round(mean(everyone$agreeableness), 2), "\n")
RATE and QINI curves translate heterogeneous treatment effects into actionable targeting decisions. If the curves are steep, concentrating resources on high-benefit individuals improves overall outcomes. If the curves are flat, treating everyone equally is just as effective. In Lab 9, we will learn how to express these targeting decisions as simple, interpretable rules using policy trees.
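As an optional extension of the covariate profile comparison above, raw means can be hard to compare across variables with different scales. One common option is a standardised difference: the gap between the top-group mean and the overall mean, divided by the overall standard deviation. The sketch below uses a simulated data frame (`toy`, `toy_tau`, and `smd` are illustrative names, and the link between extraversion and the effect is built in by assumption).

```r
# simulated covariates (illustrative only)
set.seed(3)
toy <- data.frame(
  extraversion = rnorm(500),
  neuroticism = rnorm(500),
  partner = rbinom(500, 1, 0.6)
)

# hypothetical predicted effects that increase with extraversion
toy_tau <- 0.2 + 0.3 * toy$extraversion + rnorm(500, sd = 0.1)

# top 10% by predicted effect
top_idx <- order(toy_tau, decreasing = TRUE)[seq_len(50)]

# standardised difference for every covariate at once
smd <- sapply(toy, function(x) (mean(x[top_idx]) - mean(x)) / sd(x))
print(round(sort(smd, decreasing = TRUE), 2))
```

In the lab data, the same `sapply()` call could be applied to `d0[, covariate_cols]` with `top_10_idx`, which profiles all covariates rather than the four printed above.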
Exercises
- Different outcome. Compute RATE and QINI curves for a different outcome (e.g., belonging or life_satisfaction). Is the AUQC larger or smaller? What does this imply about targeting?
- Negative effects. Some individuals may have $\hat{\tau}_i < 0$, meaning the treatment is predicted to harm them. How many individuals in your sample have negative predicted effects? What are the implications for resource allocation?
- AUTOC vs QINI weighting. The RATE curve (AUTOC weighting) emphasises the top of the ranking, while the QINI curve weights all percentiles equally. In one paragraph, explain when each metric would be more useful for a policy-maker.