Perform Naive Cross-Sectional Regressions
Source: R/margot_naive_regressions.R
This function performs naive cross-sectional linear regressions of multiple outcome variables on a single exposure variable, without accounting for confounding. It produces output compatible with margot_plot() to demonstrate what happens when proper causal inference methods are not used. The results should be interpreted as estimates from misspecified models that do not account for confounding.
Usage
margot_naive_regressions(
data,
exposure_var,
outcome_vars,
baseline_vars = NULL,
scale = c("RD", "RR"),
delta = 1,
sd = 1,
coefficient_scale = 1,
save_output = FALSE,
save_path = here::here("push_mods"),
base_filename = "naive_regressions_output",
use_timestamp = FALSE,
prefix = NULL,
flip_outcomes = NULL,
flip_method = "zscore",
flip_scale_bounds = NULL
)
Arguments
- data
A data frame containing all necessary variables.
- exposure_var
A character string specifying the exposure variable name.
- outcome_vars
A character vector of outcome variable names to be modeled.
- baseline_vars
Optional character vector of baseline variables to include as covariates in the regression models. Default is NULL (no additional covariates).
- scale
Character string specifying the scale for E-value calculation. Options are "RD" (risk difference, default) or "RR" (risk ratio).
- delta
The hypothesised increase in outcome for RD scale E-value calculations. Default value is 1.
- sd
The standard deviation of the outcome for RD scale E-value calculations. Default value is 1.
- coefficient_scale
Numeric value to scale coefficients by. Default is 1 (no scaling). Use this to interpret effects for multi-unit changes (e.g., set to 4 to get effects for a 4-unit change in the exposure variable).
- save_output
Logical, whether to save the complete output. Default is FALSE.
- save_path
The directory path to save the output. Default is here::here("push_mods"), that is, a "push_mods" directory under the project root as located by the here package.
- base_filename
The base filename for saving the output. Default is "naive_regressions_output".
- use_timestamp
Logical, whether to include a timestamp in the filename. Default is FALSE.
- prefix
Optional prefix to add to the saved output filename. Default is NULL.
- flip_outcomes
Optional character vector or named list specifying outcomes to reverse-score. Behaves like the `flip_outcomes` argument in `margot_causal_forest()` and appends an `_r` suffix to flipped outcomes.
- flip_method
Default inversion method when `flip_outcomes` is supplied. One of "zscore" (simple negation, default) or "ordinal" (bounded scale inversion).
- flip_scale_bounds
Numeric vector of length 2 [min, max] or named list supplying bounds for ordinal flipping. Required when `flip_method = "ordinal"` unless bounds should be inferred from the data.
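The two flip methods can be illustrated in base R. This is a minimal sketch, not margot's implementation: `flip_sketch()` is a hypothetical helper, "zscore" is taken to mean negation as described above, and "ordinal" is assumed to reflect a value within its bounds as min + max - x.

```r
# hypothetical helper, not part of margot: illustrates the two inversion ideas
flip_sketch <- function(x, method = c("zscore", "ordinal"), bounds = NULL) {
  method <- match.arg(method)
  if (method == "zscore") {
    # simple negation: a standardised score of 1.2 becomes -1.2
    return(-x)
  }
  # bounded inversion (assumed formula): reflect x within [min, max]
  if (is.null(bounds)) {
    # infer bounds from the observed data when none are supplied
    bounds <- range(x, na.rm = TRUE)
  }
  stopifnot(length(bounds) == 2, bounds[1] < bounds[2])
  bounds[1] + bounds[2] - x
}

z <- c(-1.5, 0, 2)      # standardised scores
likert <- c(1, 2, 7)    # responses on an assumed 1-7 scale

flipped_z <- flip_sketch(z, "zscore")
flipped_likert <- flip_sketch(likert, "ordinal", bounds = c(1, 7))
```

When bounds are inferred from the data, the result differs from the intended inversion whenever the observed range is narrower than the theoretical range of the scale, so supplying explicit bounds is the safer choice for ordinal items.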
Value
A list containing:
- models
A list of lm() model objects for each outcome.
- combined_table
A data frame with columns including E[Y|A], the confidence limits (2.5 %, 97.5 %), and E_Val_bound, compatible with margot_plot().
- individual_results
A list of individual regression summaries for each outcome.
Details
This function fits simple linear regressions of the form: outcome ~ exposure + baseline_vars. It calculates confidence intervals and E-values for each regression coefficient. The output uses "E[Y|A]" notation to indicate that these are conditional expectations from naive regressions, not causal effects. The E-values are technically incorrect, because an E-value assumes a causal interpretation of the coefficient.
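The steps described above can be sketched in base R. This is an illustration under stated assumptions, not the package's code: `naive_fit_sketch()` and `evalue_sketch()` are hypothetical helpers, the data are simulated, the confidence limits are assumed to be scaled together with the coefficient, and the E-value uses the VanderWeele-Ding approximation for continuous outcomes (RR approximately exp(0.91 * d), where d is the standardised effect).

```r
set.seed(123)
n <- 500
# simulated data; all variable names are illustrative
sim <- data.frame(age = rnorm(n))
sim$treatment <- 0.5 * sim$age + rnorm(n)
sim$outcome1_z <- 0.3 * sim$treatment + 0.4 * sim$age + rnorm(n)
sim$outcome2_z <- 0.6 * sim$age + rnorm(n)

# hypothetical helper: fit outcome ~ exposure + baseline_vars for one outcome
naive_fit_sketch <- function(data, exposure_var, outcome_var,
                             baseline_vars = NULL, coefficient_scale = 1) {
  form <- reformulate(c(exposure_var, baseline_vars), response = outcome_var)
  fit <- lm(form, data = data)
  est <- unname(coef(fit)[exposure_var])
  ci <- confint(fit, parm = exposure_var, level = 0.95)
  data.frame(
    outcome = outcome_var,
    estimate = est * coefficient_scale,   # effect for a multi-unit change
    lower = ci[1, 1] * coefficient_scale,
    upper = ci[1, 2] * coefficient_scale
  )
}

# hypothetical helper: approximate E-value for a linear coefficient
evalue_sketch <- function(est, sd = 1, delta = 1) {
  d <- est * delta / sd            # standardised effect size
  rr <- exp(0.91 * abs(d))         # approximate risk ratio, always >= 1
  rr + sqrt(rr * (rr - 1))
}

outcomes <- c("outcome1_z", "outcome2_z")
naive_table <- do.call(
  rbind,
  lapply(outcomes, function(y) naive_fit_sketch(sim, "treatment", y))
)
naive_table$e_value <- evalue_sketch(naive_table$estimate)
print(naive_table)
```

Passing `coefficient_scale = 4` to the helper multiplies the estimate and both limits by 4, which corresponds to reading the coefficient as the association for a 4-unit difference in the exposure.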
This function is intended for educational purposes to demonstrate the difference between naive associations and properly estimated causal effects.
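The gap between a naive association and the underlying effect can be seen in a small simulation. The sketch below uses only base R and invented variable names; the true effect of the exposure is set to 0.3, and a confounder `u` drives both exposure and outcome. The adjusted model recovers the effect here only because the confounder is measured and the linear model is correctly specified, which cannot be assumed for real data.

```r
set.seed(2024)
n <- 5000
true_effect <- 0.3

# u confounds the exposure-outcome relationship (illustrative names)
u <- rnorm(n)
exposure <- 0.8 * u + rnorm(n)
outcome <- true_effect * exposure + 0.7 * u + rnorm(n)
dat <- data.frame(u, exposure, outcome)

# naive regression: ignores the confounder
naive_est <- unname(coef(lm(outcome ~ exposure, data = dat))["exposure"])

# adjusted regression: conditions on the measured confounder
adjusted_est <- unname(coef(lm(outcome ~ exposure + u, data = dat))["exposure"])

round(c(true = true_effect, naive = naive_est, adjusted = adjusted_est), 2)
```

In this setup the naive coefficient is biased upward, because part of the association between exposure and outcome runs through `u` rather than through the exposure itself.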
Examples
if (FALSE) { # \dontrun{
# perform naive regressions
naive_results <- margot_naive_regressions(
data = my_data,
exposure_var = "treatment",
outcome_vars = c("outcome1_z", "outcome2_z", "outcome3_z")
)
# perform naive regressions with baseline covariates
naive_results_adjusted <- margot_naive_regressions(
data = my_data,
exposure_var = "treatment",
outcome_vars = c("outcome1_z", "outcome2_z", "outcome3_z"),
baseline_vars = c("age", "gender", "baseline_outcome")
)
# perform naive regressions scaled for 4-unit change
naive_results_scaled <- margot_naive_regressions(
data = my_data,
exposure_var = "treatment",
outcome_vars = c("outcome1_z", "outcome2_z", "outcome3_z"),
coefficient_scale = 4
)
# plot results with misspecified label
margot_plot(naive_results$combined_table, rename_ate = "Naive Association")
} # }