Outcome-wide Science in NZAVS Studies

Causal Inference
Outcome-wide Science
Joseph Bulbulia

Victoria University of Wellington, New Zealand



Figure 1: Confounding control using three waves of data & multiple imputation of responses lost-to-follow-up in which an unmeasured cause affects both selection and the outcome, introducing bias.

Confounding occurs when the statistical association between indicators in the data does not reflect a causal association between parameters in a target population. A causal diagram is a qualitative tool that enables us to quickly inspect sources of confounding.

We may represent our strategy for confounding control using the causal graph presented in Figure 1.

Figure 2 presents another causal graph in which selection may introduce bias.

Figure 2: Confounding control using three waves of data & multiple imputation of responses lost-to-follow-up in which the outcome affects selection, introducing bias.

We denote an exposure or treatment by \(A\), and we define \(\boldsymbol{Y}\) as the set of all well-being outcomes contained in the NZAVS. Our interest is in consistently estimating the causal effect of congregation size on each of these outcomes. To consistently estimate the causal effect of an exposure \(A=1\) on the vector of well-being outcomes \(\boldsymbol{Y}\), we must adjust for a set of confounders sufficient to prevent any statistical association between \(A\) and \(Y\) in the absence of causation. A statistical association in the absence of a causal association may arise in one of three ways: (1) \(A\) and \(Y\) share a common cause. (2) \(A\) and \(Y\) share a common effect on which we condition. (3) There is confounding by descent, such that conditioning on a descendant of a common effect introduces a statistical association between \(A\) and \(Y\) in the absence of a causal association (collider bias).
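As a sketch of the target quantity, assuming a binary exposure and potential-outcomes notation (the symbols \(\tau_k\), \(Y_k(a)\), and \(K\) are introduced here for illustration and are not part of the original text), the outcome-wide estimand can be written as one average causal contrast per outcome:

\[
\tau_k = \mathbb{E}\left[ Y_k(1) - Y_k(0) \right], \qquad k = 1, \dots, K,
\]

where \(Y_k(a)\) denotes the value that the \(k\)-th outcome in \(\boldsymbol{Y}\) would take if the exposure were set to \(A = a\). Each of the three sources of non-causal association listed above is a way in which the observed difference \(\mathbb{E}[Y_k \mid A=1] - \mathbb{E}[Y_k \mid A=0]\) can differ from \(\tau_k\).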

To address challenge (1), we may condition or stratify on all measured common causes of \(A\) and \(Y\). To address challenges (2) and (3), we must not condition on any common effect or on any descendant of a common effect. Note, however, that by conditioning on a descendant of a common cause we may at least partially address confounding by that common cause. This is because conditioning on the effect of a common cause may itself partially block the effect of that common cause.
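The three points above can be illustrated with a small simulation. This is a minimal sketch under assumed linear relationships with standard normal noise; the variable names (`L`, `P`, `C`), the coefficients, and the helper `ols` are hypothetical choices made for illustration and are not drawn from NZAVS data. In every scenario the true effect of `A` on `Y` is zero, so any non-zero coefficient on `A` reflects bias.

```python
import numpy as np

def ols(y, *columns):
    """Least-squares coefficients of y on the given columns (intercept dropped)."""
    design = np.column_stack([np.ones(len(y)), *columns])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coef[1:]

rng = np.random.default_rng(2022)
n = 100_000

# Scenario 1: common cause. L -> A and L -> Y; A does not affect Y.
L = rng.normal(size=n)
A = L + rng.normal(size=n)
Y = L + rng.normal(size=n)
naive_confounded = ols(Y, A)[0]        # biased: L is left open
adjusted_for_cause = ols(Y, A, L)[0]   # conditioning on L removes the bias

# Conditioning on a descendant (proxy) of the common cause only partly helps.
P = L + rng.normal(size=n)             # P is an effect of L
adjusted_for_proxy = ols(Y, A, P)[0]   # bias shrinks but does not vanish

# Scenario 2: common effect. A -> C <- Y; A and Y are independent.
A2 = rng.normal(size=n)
Y2 = rng.normal(size=n)
C = A2 + Y2 + rng.normal(size=n)
unconditioned = ols(Y2, A2)[0]         # unbiased: collider left alone
conditioned_on_collider = ols(Y2, A2, C)[0]  # conditioning on C induces bias

print(f"common cause, unadjusted:      {naive_confounded:.2f}")
print(f"common cause, adjusted for L:  {adjusted_for_cause:.2f}")
print(f"common cause, adjusted for P:  {adjusted_for_proxy:.2f}")
print(f"common effect, unadjusted:     {unconditioned:.2f}")
print(f"common effect, adjusted for C: {conditioned_on_collider:.2f}")
```

Under these assumed coefficients the unadjusted common-cause estimate should sit near 0.5, the proxy-adjusted estimate near 1/3, and the collider-adjusted estimate near -0.5, while the two correctly specified analyses should sit near zero.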

In NZAVS outcome-wide science studies we use a modified version of VanderWeele et al.'s approach for estimating outcome-wide causal effects (VanderWeele, Mathur, and Chen 2020).

VanderWeele's approach uses three waves of panel data collected from the same participants over time.

Call \(t-1\) the baseline wave, \(t_0\) the exposure wave, and \(t+1\) the outcome measurement wave. First, we include at baseline indicators of all past outcomes \(\boldsymbol{Y}_{t-1}\). Second, we include at baseline an indicator for the past exposure \(A_{t-1}\). Third, we include a rich set of baseline confounders \(\boldsymbol{L}_{t-1}\). Of course, we cannot guarantee that this strategy is sufficient to eliminate unmeasured confounding. However, on VanderWeele's strategy, for any unmeasured confounder to affect both the exposure and the outcome, it would need to do so independently of its effects on the baseline measurements of the exposure and (all) outcomes, as well as independently of all measured baseline confounders.

Note also that by controlling for confounders measured in the wave prior to exposure, we avoid unwittingly conditioning on an effect of the exposure that subsequently affects the outcome (mediator bias). Note furthermore that by measuring the outcome in the year following the exposure, \(Y_{t+1}\), we ensure that the exposure temporally precedes the outcome and is not, for example, an effect of the outcome.

Finally, loss-to-follow-up or attrition, as well as survey non-response, may introduce selection bias (represented by the boxed \(S_{t+1}\)). There are several ways in which such selection bias may introduce confounding. In Figure 1, we note the case in which an unmeasured cause of both selection and \(Y\), called \(U_s\), may lead to a spurious association between \(A\) and \(Y\) if \(S\) is a collider of \(A\) and \(U_s\). To address selection bias we multiply impute the missing responses.
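The baseline-adjustment logic of the three-wave design can be sketched with simulated data. This is an illustration under stated assumptions, not the NZAVS analysis pipeline: all relationships are linear, the coefficients and the true effect of 0.3 are invented, the helper `ols` is hypothetical, and the unmeasured variable `U` is assumed to influence the exposure and outcome only through the baseline measurements. Selection and multiple imputation are not modelled here.

```python
import numpy as np

def ols(y, *columns):
    """Least-squares coefficients of y on the given columns (intercept dropped)."""
    design = np.column_stack([np.ones(len(y)), *columns])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coef[1:]

rng = np.random.default_rng(5)
n = 100_000
TRUE_EFFECT = 0.3  # assumed causal effect of the exposure on the outcome

# Wave t-1 (baseline): U is unmeasured; L_prev, A_prev, and Y_prev are measured.
U = rng.normal(size=n)
L_prev = rng.normal(size=n)
A_prev = 0.8 * U + 0.5 * L_prev + rng.normal(size=n)
Y_prev = 0.8 * U + 0.5 * L_prev + rng.normal(size=n)

# Exposure wave: exposure depends on baseline exposure, outcome, and confounders.
A_t = 0.6 * A_prev + 0.3 * Y_prev + 0.4 * L_prev + rng.normal(size=n)

# Wave t+1 (outcome): outcome depends on the exposure and on baseline measures.
Y_next = (TRUE_EFFECT * A_t + 0.6 * Y_prev + 0.2 * A_prev
          + 0.4 * L_prev + rng.normal(size=n))

# Unadjusted association between exposure and later outcome.
naive = ols(Y_next, A_t)[0]

# Outcome-wide style adjustment: baseline outcome, baseline exposure, confounders.
adjusted = ols(Y_next, A_t, A_prev, Y_prev, L_prev)[0]

print(f"true effect: {TRUE_EFFECT:.2f}")
print(f"unadjusted:  {naive:.2f}")
print(f"adjusted:    {adjusted:.2f}")
```

In this setup the adjusted estimate should recover the assumed effect even though `U` is never observed, because `U` reaches the exposure and the outcome only through `A_prev` and `Y_prev`. If `U` were given a direct path to `A_t` or `Y_next`, some bias would remain after adjustment, which is the residual risk acknowledged above.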


VanderWeele, Tyler J, Maya B Mathur, and Ying Chen. 2020. “Outcome-Wide Longitudinal Designs for Causal Inference: A New Template for Empirical Studies.” Statistical Science 35 (3): 437–466.




BibTeX citation:
  author = {Joseph Bulbulia},
  title = {Outcome-Wide {Science} in {NZAVS} {Studies}},
  date = {2022-11-05},
  url = {https://go-bayes.github.io/b-causal-lab/},
  langid = {en}
For attribution, please cite this work as:
Joseph Bulbulia. 2022. “Outcome-Wide Science in NZAVS Studies.” November 5, 2022. https://go-bayes.github.io/b-causal-lab/.