Run Multiple Generalised Random Forest (GRF) Causal Forest Models in Parallel

Parallelised, diagnostic‑rich variant of `margot_causal_forest()`. Each outcome‑specific forest is estimated in its own R worker via **future**. All the `cli` messages and checks from the sequential original are preserved, so you still get the same granular reporting (dimension checks, Qini status, warnings, etc.). Live progress bars are emitted with **progressr** using a `cli` handler.

Usage

margot_causal_forest_parallel(
  data,
  outcome_vars,
  covariates,
  W,
  weights,
  grf_defaults = list(),
  save_data = FALSE,
  compute_rate = TRUE,
  top_n_vars = 15,
  save_models = TRUE,
  train_proportion = 0.7,
  qini_split = TRUE,
  qini_train_prop = 0.7,
  n_cores = future::availableCores() - 1,
  verbose = TRUE
)

Arguments

data: A data frame containing all necessary variables.
outcome_vars: A character vector of outcome variable names to be modelled.
covariates: A matrix of covariates to be used in the GRF models.
W: A vector of binary treatment assignments.
weights: A vector of weights for the observations.
grf_defaults: A list of default parameters for the GRF models.
save_data: Logical indicating whether to save data, covariates, and weights. Default is FALSE.
compute_rate: Logical indicating whether to compute RATE for each model. Default is TRUE.
top_n_vars: Integer specifying the number of top variables to use for additional computations. Default is 15.
save_models: Logical indicating whether to save the full GRF model objects. Default is TRUE.
train_proportion: Numeric value between 0 and 1 indicating the proportion of non-missing data to use for training policy trees. Default is 0.7.
qini_split: Logical indicating whether to do a separate train/test split exclusively for the Qini calculation. Default is TRUE (i.e., Qini is computed out-of-sample).
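qini_train_prop: Numeric value between 0 and 1 giving the proportion of data to use for the Qini training set (only used when `qini_split = TRUE`). Default is 0.7.
n_cores: Integer specifying the number of parallel workers. Default is `future::availableCores() - 1` (all available cores minus one).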
verbose: Logical indicating whether to display detailed messages during execution. Default is TRUE.
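
A minimal sketch of a call on simulated data may help show how the arguments fit together. The column names (`x1`..`x5`, `y1`, `y2`), the sample size, and the use of `num.trees` inside `grf_defaults` are assumptions made for illustration; the sketch assumes `grf_defaults` is forwarded to the underlying forest fit, which the argument description suggests but does not state. The call itself is guarded so that the data-preparation part runs even when the margot package is not installed.

```r
set.seed(123)
n <- 500

# Hypothetical simulated inputs: five covariates, a binary treatment, two outcomes
X <- matrix(rnorm(n * 5), nrow = n, dimnames = list(NULL, paste0("x", 1:5)))
W <- rbinom(n, size = 1, prob = 0.5)
df <- data.frame(
  y1 = X[, 1] + W * (1 + X[, 2]) + rnorm(n),  # effect varies with x2
  y2 = X[, 3] + 0.5 * W + rnorm(n)            # constant effect
)

# Only attempt the fit when margot is available
if (requireNamespace("margot", quietly = TRUE)) {
  res <- margot::margot_causal_forest_parallel(
    data = df,
    outcome_vars = c("y1", "y2"),
    covariates = X,
    W = W,
    weights = rep(1, n),                   # equal weights for every row
    grf_defaults = list(num.trees = 500),  # assumed to reach the forest fit
    n_cores = 2                            # keep the worker count small for a demo
  )
  print(res$outcome_vars)
}
```

Setting `n_cores` explicitly, as here, avoids relying on the default when the machine is shared or has few cores.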

Value

A list with elements:

* `results` – per‑outcome diagnostics and objects
* `combined_table` – rbind‑ed e‑value table across outcomes
* `outcome_vars` – vector of (successful) outcome names
* `not_missing` – indices of complete‑case rows
* (`data`, `covariates`, `weights`) when `save_data = TRUE`
* `full_models` when `save_models = TRUE`
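
Because some elements are only returned under certain settings, it can be useful to check which ones are present before using them. The helper below, `describe_result()`, is hypothetical and not part of the package; it only compares names against the elements listed above. The `mock` object is a stand-in that mimics the documented top-level names, and its contents are not real output.

```r
# Hypothetical helper: flags which documented elements a result list contains
describe_result <- function(res) {
  documented <- c("results", "combined_table", "outcome_vars", "not_missing",
                  "data", "covariates", "weights", "full_models")
  stats::setNames(documented %in% names(res), documented)
}

# Mock object shaped like a call with save_data = FALSE and save_models = TRUE;
# the element contents are placeholders only
mock <- list(
  results = list(),
  combined_table = data.frame(),
  outcome_vars = c("y1", "y2"),
  not_missing = 1:10,
  full_models = list()
)

present <- describe_result(mock)
print(present)
```

With a real result, the same check could be run before reading `full_models` or the saved data.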

Details

Messages produced inside workers are captured by **future** and relayed to the main R session. Progress bars update in real time. To silence progress, call `progressr::handlers("void")` before running.
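
As a sketch of how the progress handler could be chosen before a run, the snippet below selects the `cli` handler and then the silent `void` handler, both of which ship with progressr. The `squares()` function is a toy stand-in for a long-running job and is not part of the package. Whether `margot_causal_forest_parallel()` needs to be wrapped in `with_progress()` is not stated in this page, so the wrapper here applies to the toy function only.

```r
# Toy stand-in for a long-running job that signals one progress step per item
squares <- function(xs) {
  p <- progressr::progressor(along = xs)
  vapply(xs, function(x) {
    p()   # signal one completed step
    x^2
  }, numeric(1))
}

if (requireNamespace("progressr", quietly = TRUE)) {
  # Show progress through cli when that package is available
  if (requireNamespace("cli", quietly = TRUE)) {
    progressr::handlers("cli")
  }
  out_visible <- progressr::with_progress(squares(1:3))

  # Switch to the silent handler: progress is still signalled but not displayed
  progressr::handlers("void")
  out_silent <- progressr::with_progress(squares(1:3))
}
```

Both calls return the same values; only the reporting differs.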