Parallelised, diagnostic‑rich variant of `margot_causal_forest()`. Each outcome‑specific forest is estimated in its own R worker via **future**. All the `cli` messages and checks from the sequential original are preserved, so you still get the same granular reporting (dimension checks, Qini status, warnings, etc.). Live progress bars are emitted with **progressr** using a `cli` handler.

Usage

margot_causal_forest_parallel(
  data,
  outcome_vars,
  covariates,
  W,
  weights,
  grf_defaults = list(),
  save_data = FALSE,
  compute_rate = TRUE,
  top_n_vars = 15,
  save_models = TRUE,
  train_proportion = 0.7,
  qini_split = TRUE,
  qini_train_prop = 0.7,
  n_cores = future::availableCores() - 1,
  verbose = TRUE
)

Arguments

data

A data frame containing all necessary variables.

outcome_vars

A character vector of outcome variable names to be modelled.

covariates

A matrix of covariates to be used in the GRF models.

W

A vector of binary treatment assignments.

weights

A vector of weights for the observations.

grf_defaults

A list of default parameters for the GRF models.

save_data

Logical indicating whether to save data, covariates, and weights. Default is FALSE.

compute_rate

Logical indicating whether to compute RATE for each model. Default is TRUE.

top_n_vars

Integer specifying the number of top variables to use for additional computations. Default is 15.

save_models

Logical indicating whether to save the full GRF model objects. Default is TRUE.

train_proportion

Numeric value between 0 and 1 indicating the proportion of non-missing data to use for training policy trees. Default is 0.7.

qini_split

Logical indicating whether to do a separate train/test split exclusively for the Qini calculation. Default is TRUE (i.e., Qini is computed out-of-sample).

qini_train_prop

Numeric value between 0 and 1 giving the proportion of data to use for the Qini training set when `qini_split = TRUE`. Default is 0.7.

n_cores

Integer specifying the number of parallel workers. Default is `future::availableCores() - 1` (all available cores minus one).

verbose

Logical indicating whether to display detailed messages during execution. Default is TRUE.
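
As a rough illustration of how the arguments above fit together, the sketch below simulates a small data set and passes it to the function. The sample size, column names (`x1` to `x5`, `y1`, `y2`), effect sizes, and `n_cores = 2` are all illustrative assumptions rather than values from the package; the call itself runs only if **margot** is installed.

```r
# Simulated inputs; every name and value here is an illustrative assumption.
set.seed(123)
n <- 500

# Covariate matrix with named columns, matching the `covariates` argument.
covariates <- matrix(
  rnorm(n * 5),
  nrow = n,
  dimnames = list(NULL, paste0("x", 1:5))
)

# Binary treatment vector and unit observation weights.
W <- rbinom(n, size = 1, prob = 0.5)
weights <- rep(1, n)

# Two outcomes; the effect on y1 varies with x1, the effect on y2 is constant.
data <- data.frame(
  y1 = W * (1 + covariates[, "x1"]) + rnorm(n),
  y2 = 0.5 * W + covariates[, "x2"] + rnorm(n)
)
outcome_vars <- c("y1", "y2")

# Fit one forest per outcome, but only when the package is available.
if (requireNamespace("margot", quietly = TRUE)) {
  fit <- margot::margot_causal_forest_parallel(
    data         = data,
    outcome_vars = outcome_vars,
    covariates   = covariates,
    W            = W,
    weights      = weights,
    n_cores      = 2,
    verbose      = TRUE
  )
  # Inspect the top-level structure of the returned list.
  str(fit, max.level = 1)
}
```

Because each outcome is fitted in its own worker, setting `n_cores` higher than the number of outcomes is unlikely to help.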

Value

A list with elements:

* `results` – per‑outcome diagnostics and objects
* `combined_table` – rbind‑ed e‑value table across outcomes
* `outcome_vars` – vector of (successful) outcome names
* `not_missing` – indices of complete‑case rows
* (`data`, `covariates`, `weights`) when `save_data = TRUE`
* `full_models` when `save_models = TRUE`

Details

Messages produced inside workers are captured by **future** and dispatched to the master session. Progress bars update in real time. To silence progress, call `progressr::handlers("off")` before running.
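
The general pattern described here, one future per outcome with progress and messages relayed to the calling session, can be sketched independently of this package. The example below is not the function's internal code: `fit_one()` is a hypothetical stand-in for fitting a single forest, and the worker count and outcome names are assumptions. It needs the **future** and **progressr** packages, plus **cli** for the handler.

```r
library(future)
library(progressr)

# Two background R sessions; the worker count is an arbitrary choice.
plan(multisession, workers = 2)

# Report progress through the cli handler.
handlers("cli")

outcomes <- c("y1", "y2", "y3")

# Hypothetical stand-in for fitting one outcome-specific model.
fit_one <- function(outcome) {
  message("fitting ", outcome)  # emitted in the worker, relayed by future
  Sys.sleep(0.1)
  toupper(outcome)
}

with_progress({
  p <- progressor(along = outcomes)
  # Launch one future per outcome; each signals progress when it finishes.
  futures <- lapply(outcomes, function(o) {
    future({
      res <- fit_one(o)
      p(o)
      res
    })
  })
  # Collect results; captured worker messages are re-emitted here.
  results <- vapply(futures, value, character(1))
})

# Return to sequential processing and shut down the workers.
plan(sequential)
```

In a non-interactive session `with_progress()` does not display progress by default, so the bar appears only when the code is run interactively.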