This function processes wide-format longitudinal data across multiple waves and handles dyadic censoring: if one partner in a dyad is lost at wave \(t\), the entire dyad is marked as lost at wave \(t\). All data in subsequent waves are then set to NA.

Usage

margot_process_longitudinal_data_wider(
  df_wide,
  relationship_id = "NULL",
  ordinal_columns = NULL,
  continuous_columns_keep = NULL,
  exposure_vars = NULL,
  scale_exposure = FALSE,
  not_lost_in_following_wave = "not_lost_following_wave",
  lost_in_following_wave = NULL,
  remove_selected_columns = TRUE,
  time_point_prefixes = NULL,
  time_point_regex = NULL,
  save_observed_y = FALSE,
  censored_if_any_lost = TRUE
)

Arguments

df_wide

A wide-format dataframe containing longitudinal data.

relationship_id

A string naming the column that identifies dyads. Defaults to "NULL".

ordinal_columns

A character vector of column names to be treated as ordinal and dummy-coded.

continuous_columns_keep

A character vector of continuous column names to keep without scaling.

exposure_vars

A character vector of exposure variable names that determine attrition.

scale_exposure

Logical. If TRUE, scales the exposure variable(s). Default is FALSE.

not_lost_in_following_wave

Character string with the suffix for the "not lost" indicator. Default is "not_lost_following_wave".

lost_in_following_wave

Character string with the suffix for the "lost" indicator. If NULL, no "lost" indicator is created.

remove_selected_columns

Logical. If TRUE, removes selected columns after encoding. Default is TRUE.

time_point_prefixes

A character vector of time point prefixes. If NULL, inferred from the data.

time_point_regex

A regex pattern for identifying time points. Used if time_point_prefixes is NULL.

save_observed_y

Logical. If TRUE, retains observed outcome values in the final wave even if the participant is lost. Default is FALSE.

censored_if_any_lost

Logical. If TRUE, sets the "not lost" indicator (named by not_lost_in_following_wave) to 0 if any data are NA in wave t+1.
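
The rule described for censored_if_any_lost can be illustrated with a minimal base-R sketch. It does not reproduce the function's internals: the toy data, the "t0_"/"t1_" column prefixes, and the column names are assumptions made for illustration only.

```r
# Toy wide-format data; the "t0_"/"t1_" prefixes are assumed, not taken from the package.
df <- data.frame(
  t0_exposure = c(1, 2, 3),
  t1_exposure = c(1, NA, 3),
  t1_outcome  = c(5, 6, NA)
)

# Select every column belonging to the following wave (t1).
next_wave <- df[, grep("^t1_", names(df)), drop = FALSE]

# A row counts as "not lost" (1) only if all of its t1 values are observed;
# any NA in the following wave gives 0.
df$t0_not_lost_following_wave <- as.integer(complete.cases(next_wave))
```

In this sketch the second and third rows each have one missing value at t1, so both receive an indicator of 0, while the first row receives 1.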

Value

A processed dataframe suitable for longitudinal analyses.

Details

The dyadic logic occurs after computing each wave's "not_lost" indicator. If any person in a dyad is flagged lost (0), then all partners in that dyad also get flagged lost at the same wave.
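
As a rough illustration of this propagation step, the following base-R sketch takes the within-dyad minimum of the indicator, so that a single 0 forces every partner to 0, and then blanks the next wave for anyone flagged lost. It is not the package's implementation; the data, the "t0_"/"t1_" prefixes, and the names dyad_id and t1_outcome are hypothetical.

```r
# Toy data: three dyads, with one indicator column already computed per person.
# All column names here are illustrative assumptions.
df <- data.frame(
  id = 1:6,
  dyad_id = c("a", "a", "b", "b", "c", "c"),
  t0_not_lost_following_wave = c(1, 0, 1, 1, 0, 0),
  t1_outcome = c(2.5, NA, 3.1, 4.0, NA, NA),
  stringsAsFactors = FALSE
)

# Within each dyad, the minimum of the indicator is 0 whenever any partner is lost.
dyad_not_lost <- ave(df$t0_not_lost_following_wave, df$dyad_id, FUN = min)

# Dyads where at least one partner was individually retained but is now forced lost.
was_forced <- df$t0_not_lost_following_wave == 1 & dyad_not_lost == 0
forced_lost_dyads <- unique(df$dyad_id[was_forced])

# Apply the dyad-level indicator, then set following-wave data to NA for lost rows.
df$t0_not_lost_following_wave <- dyad_not_lost
df$t1_outcome[df$t0_not_lost_following_wave == 0] <- NA
```

Here dyad "a" is forced lost because one partner has an indicator of 0, dyad "b" is retained, and dyad "c" was already lost for both partners, so it does not appear in forced_lost_dyads.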

The function prints CLI messages summarising how many dyads and how many participants are lost at each wave, and which dyad IDs were forced lost.

Examples

# See the tests in previous examples, or adapt your own.