Run Multiple Generalized Random Forest (GRF) Causal Forest Models with Enhanced Qini Cross-Validation
Source: R/margot_causal_forest.R (margot_causal_forest.Rd)
This function runs GRF causal forest models for multiple outcome variables. In addition to estimating causal effects, it can compute the Rank-Weighted Average Treatment Effect (RATE) for each model. It can also train a separate "Qini forest" on a subset of the data and compute Qini curves on held-out data, which avoids in-sample optimism in the Qini plots.
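The held-out evaluation idea described above can be sketched directly with the grf package. This is a minimal illustration under stated assumptions, not the internal code of this function: the simulated data, the variable names, and the 50/50 split are all invented for the example, and the block only runs the forest step if grf is installed.

```r
# Illustrative sketch only: not the internal code of margot_causal_forest().
set.seed(123)
n <- 400
p <- 5

# Simulated data (hypothetical): binary treatment, effect varies with X[, 1]
X <- matrix(rnorm(n * p), n, p)
W <- rbinom(n, 1, 0.5)
Y <- pmax(X[, 1], 0) * W + X[, 2] + rnorm(n)

# Hold out a share of rows for evaluation (mirrors qini_test_prop = 0.5)
qini_test_prop <- 0.5
test_idx  <- sample(seq_len(n), size = floor(qini_test_prop * n))
train_idx <- setdiff(seq_len(n), test_idx)

if (requireNamespace("grf", quietly = TRUE)) {
  # Forest fitted on the training rows; used only to rank held-out units
  train_forest <- grf::causal_forest(X[train_idx, ], Y[train_idx], W[train_idx])
  priorities   <- predict(train_forest, X[test_idx, ])$predictions

  # A separate forest on the held-out rows supplies the evaluation scores
  eval_forest <- grf::causal_forest(X[test_idx, ], Y[test_idx], W[test_idx])
  qini <- grf::rank_average_treatment_effect(
    eval_forest, priorities, target = "QINI"
  )
  print(qini)
}
```

Because the ranking comes from a forest that never saw the held-out rows, the resulting Qini estimate is not inflated by in-sample fit.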
Usage
margot_causal_forest(
  data,
  outcome_vars,
  covariates,
  W,
  weights,
  grf_defaults = list(),
  save_data = FALSE,
  compute_rate = TRUE,
  top_n_vars = 15,
  save_models = TRUE,
  train_proportion = 0.7,
  qini_split = TRUE,
  qini_test_prop = 0.5,
  verbose = TRUE
)
Arguments
- data
A data frame containing all necessary variables.
- outcome_vars
A character vector of outcome variable names to be modelled.
- covariates
A matrix of covariates to be used in the GRF models.
- W
A vector of binary treatment assignments.
- weights
A vector of weights for the observations.
- grf_defaults
A list of default parameters for the GRF models.
- save_data
Logical indicating whether to save data, covariates, and weights. Default is FALSE.
- compute_rate
Logical indicating whether to compute RATE for each model. Default is TRUE.
- top_n_vars
Integer specifying the number of top variables to use for additional computations. Default is 15.
- save_models
Logical indicating whether to save the full GRF model objects. Default is TRUE.
- train_proportion
Numeric value between 0 and 1 indicating the proportion of non-missing data to use for training policy trees. Default is 0.7.
- qini_split
Logical indicating whether to do a separate train/test split exclusively for the Qini calculation. Default is TRUE, as shown in the usage signature; if FALSE, Qini is computed in-sample.
- qini_test_prop
Proportion of data to use for the Qini test set (if qini_split = TRUE). Default is 0.5.
- verbose
Logical indicating whether to display detailed messages during execution. Default is TRUE.
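Example (illustrative sketch)

The call below shows how the arguments fit together. It is a sketch under stated assumptions: the data are simulated, the column names (outcome_a, outcome_b, x1 to x4) are invented, the package is assumed to be installed under the name margot, and grf_defaults is assumed to be forwarded to the underlying grf forest. The structure of the returned object is not described in this section, so the example only inspects it with str().

```r
set.seed(2024)
n <- 500

# Hypothetical simulated inputs; all names are invented for illustration
covariates <- matrix(rnorm(n * 4), n, 4,
                     dimnames = list(NULL, paste0("x", 1:4)))
W <- rbinom(n, 1, 0.5)   # binary treatment assignment
weights <- rep(1, n)     # equal observation weights
data <- data.frame(
  outcome_a = covariates[, 1] * W + rnorm(n),
  outcome_b = 0.5 * W + covariates[, 2] + rnorm(n)
)

# Run only if the package is available (package name is an assumption)
if (requireNamespace("margot", quietly = TRUE)) {
  result <- margot::margot_causal_forest(
    data = data,
    outcome_vars = c("outcome_a", "outcome_b"),
    covariates = covariates,
    W = W,
    weights = weights,
    grf_defaults = list(num.trees = 500),  # assumed to be passed on to grf
    compute_rate = TRUE,
    top_n_vars = 4,          # only four covariates in this toy example
    train_proportion = 0.7,
    qini_split = TRUE,       # evaluate Qini on held-out data
    qini_test_prop = 0.5,
    verbose = FALSE
  )
  str(result, max.level = 1)
}
```

Setting save_models = FALSE may be worth considering when many outcomes are modelled, since full GRF model objects can be large.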