Perform Naive Cross-Sectional Regressions
Source: R/margot_naive_regressions.R
margot_naive_regressions.Rd
This function performs naive cross-sectional linear regressions of multiple outcome variables on a single exposure variable, without principled adjustment for confounding. It produces output compatible with margot_plot() to demonstrate what happens when proper causal inference methods are not used. The results should be interpreted as estimates from misspecified models that do not account for confounding.
Arguments
- data
A data frame containing all necessary variables.
- exposure_var
A character string specifying the exposure variable name.
- outcome_vars
A character vector of outcome variable names to be modeled.
- baseline_vars
Optional character vector of baseline variables to include as covariates in the regression models. Default is NULL (no additional covariates).
- scale
Character string specifying the scale for E-value calculation. Options are "RD" (risk difference, default) or "RR" (risk ratio).
- delta
The hypothesised increase in outcome for RD scale E-value calculations. Default value is 1.
- sd
The standard deviation of the outcome for RD scale E-value calculations. Default value is 1.
- coefficient_scale
Numeric value to scale coefficients by. Default is 1 (no scaling). Use this to interpret effects for multi-unit changes (e.g., set to 4 to get effects for a 4-unit change in the exposure variable).
- save_output
Logical, whether to save the complete output. Default is FALSE.
- save_path
The directory path to save the output. Default is "push_mods" in the current working directory.
- base_filename
The base filename for saving the output. Default is "naive_regressions_output".
- use_timestamp
Logical, whether to include a timestamp in the filename. Default is FALSE.
- prefix
Optional prefix to add to the saved output filename. Default is NULL.
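To make the roles of delta and sd more concrete, here is a minimal base R sketch of the commonly used E-value approximation for a continuous outcome (the VanderWeele and Ding conversion of a standardised difference to an approximate risk ratio). The helper name evalue_rd_sketch is hypothetical and not part of margot, and the package's internal calculation may differ in detail.

```r
# Hypothetical helper (not part of margot): approximate E-value for a
# linear-regression coefficient on the difference scale.
evalue_rd_sketch <- function(est, delta = 1, sd = 1) {
  d  <- est * delta / sd        # standardised effect size
  rr <- exp(0.91 * abs(d))      # approximate risk ratio, always >= 1
  rr + sqrt(rr * (rr - 1))      # E-value formula for a risk ratio >= 1
}

evalue_rd_sketch(0)      # 1: a null estimate needs no confounding to explain it
evalue_rd_sketch(0.2)
evalue_rd_sketch(-0.2)   # same value as for 0.2: only the magnitude matters
```

With the defaults delta = 1 and sd = 1, the coefficient is treated as already being on a standardised scale, which matches outcomes that have been z-scored.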
Value
A list containing:
- models
A list of lm() model objects for each outcome.
- combined_table
A data frame containing the point estimate (E[Y|A]), the confidence limits (2.5 % and 97.5 %), and the E-value columns (including E_Val_bound), compatible with margot_plot().
- individual_results
A list of individual regression summaries for each outcome.
Details
This function fits one simple linear regression per outcome, of the form outcome ~ exposure + baseline_vars. It calculates a confidence interval and an E-value for each estimated exposure coefficient. The output uses "E[Y|A]" notation to indicate that these are conditional expectations from naive regressions, not causal effects. The E-values are technically incorrect, because an E-value presupposes a causal interpretation of the coefficient.
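The fitting step described above can be approximated in base R as follows. This is only a sketch under stated assumptions: the data are simulated, the variable names are illustrative, and sketch_table is a hypothetical stand-in that does not reproduce the column names or E-value columns of the real combined_table.

```r
# Simulated data; all variable names here are illustrative assumptions.
set.seed(1)
n <- 200
my_data <- data.frame(age = rnorm(n), treatment = rnorm(n))
my_data$outcome1_z <- 0.3 * my_data$treatment + rnorm(n)
my_data$outcome2_z <- -0.1 * my_data$treatment + 0.2 * my_data$age + rnorm(n)

exposure_var      <- "treatment"
outcome_vars      <- c("outcome1_z", "outcome2_z")
baseline_vars     <- "age"
coefficient_scale <- 1   # e.g. 4 would report a 4-unit change in exposure

# Fit one model per outcome: outcome ~ exposure + baseline_vars
models <- lapply(outcome_vars, function(y) {
  f <- reformulate(c(exposure_var, baseline_vars), response = y)
  lm(f, data = my_data)
})
names(models) <- outcome_vars

# Collect the exposure coefficient and its 95% confidence interval,
# multiplied by coefficient_scale
rows <- lapply(models, function(fit) {
  ci <- confint(fit, parm = exposure_var, level = 0.95)
  c(estimate = unname(coef(fit)[exposure_var]),
    lower    = ci[1, 1],
    upper    = ci[1, 2]) * coefficient_scale
})
sketch_table <- as.data.frame(do.call(rbind, rows))
sketch_table
```

Building the formula with reformulate() avoids pasting strings by hand and handles an empty baseline_vars (NULL) without special cases.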
This function is intended for educational purposes to demonstrate the difference between naive associations and properly estimated causal effects.
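A small simulation illustrates the gap between a naive association and the underlying causal effect. The data-generating process below is an assumption made purely for illustration: a confounder u drives both the exposure a and the outcome y, and the true effect of a on y is zero.

```r
# Simulated example: u confounds the exposure-outcome relationship,
# while the true effect of the exposure on the outcome is zero.
set.seed(123)
n <- 5000
u <- rnorm(n)                 # confounder
a <- 0.8 * u + rnorm(n)       # exposure depends on u
y <- 0.8 * u + rnorm(n)       # outcome depends on u only (no effect of a)

naive    <- lm(y ~ a)         # omits the confounder
adjusted <- lm(y ~ a + u)     # conditions on the confounder

coef(naive)["a"]      # clearly away from zero despite no causal effect
coef(adjusted)["a"]   # close to zero
```

In real data the confounders are rarely all measured, so adjusting for baseline_vars narrows this gap only to the extent that those variables capture the confounding.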
Examples
if (FALSE) { # \dontrun{
  # perform naive regressions
  naive_results <- margot_naive_regressions(
    data = my_data,
    exposure_var = "treatment",
    outcome_vars = c("outcome1_z", "outcome2_z", "outcome3_z")
  )

  # perform naive regressions with baseline covariates
  naive_results_adjusted <- margot_naive_regressions(
    data = my_data,
    exposure_var = "treatment",
    outcome_vars = c("outcome1_z", "outcome2_z", "outcome3_z"),
    baseline_vars = c("age", "gender", "baseline_outcome")
  )

  # perform naive regressions scaled for 4-unit change
  naive_results_scaled <- margot_naive_regressions(
    data = my_data,
    exposure_var = "treatment",
    outcome_vars = c("outcome1_z", "outcome2_z", "outcome3_z"),
    coefficient_scale = 4
  )

  # plot results with misspecified label
  margot_plot(naive_results$combined_table, rename_ate = "Naive Association")
} # }