Reporting Guide

Published: August 12, 2025

Note: Readings for Workshop

Important: Key concepts:
  • Confounding
  • Causal Directed Acyclic Graph
  • Five Elementary Causal Structures
  • d-separation
  • Back-door path
  • Conditioning
  • Fork bias
  • Collider bias
  • Mediator bias
  • Four Rules of Confounding Control

A Practical Checklist for Your Sparrc Study

  1. State a well-defined treatment.
    Specify the hypothetical intervention precisely enough that every member of the target population could, in principle, receive it. For example, ‘weight loss’ is too vague—people lose weight via exercise, diet, depression, cancer, amputation, and more (Hernan and Robins 2020). A clearer intervention is: “engage in vigorous physical activity for ≥30 minutes per day” (Miguel A. Hernán et al. 2008). Precision here underwrites consistency (see step 5) and interpretability downstream.

  2. State a well-defined outcome.
    Define the outcome so the causal contrast is meaningful and temporally anchored. ‘Well-being’ is underspecified; ‘psychological distress one year post-intervention measured with the Kessler-6’ is interpretable and reproducible (Kessler et al. 2002). Include timing, scale, and instrument.

  3. Clarify the target population.
    Say exactly who you aim to inform. Eligibility rules define the source population, but sampling and participation can yield a study population with a different distribution of effect modifiers (Issa J. Dahabreh and Hernán 2019; Issa J. Dahabreh et al. 2019; Stuart, Ackerman, and Westreich 2018; Bulbulia 2024). If you intend to generalise beyond the source population (transport), articulate the additional conditions and knowledge required (Deffner, Rohrer, and McElreath 2022; Bareinboim and Pearl 2013; Westreich et al. 2017; Issa J. Dahabreh and Hernán 2019; Pearl and Bareinboim 2022).
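
If transport is part of the aim, inverse odds of sampling weights (Westreich et al. 2017) are one concrete device. A minimal sketch in base R; the selection indicator S, the effect modifiers X1 and X2, and the data-generating step are all hypothetical:

Code
set.seed(1)
n <- 2000
dat <- data.frame(X1 = rnorm(n), X2 = rbinom(n, 1, 0.4))  # effect modifiers
dat$S <- rbinom(n, 1, plogis(-0.5 + 0.8 * dat$X1))        # 1 = in the trial

# Model selection into the trial given the effect modifiers
p_S <- fitted(glm(S ~ X1 + X2, family = binomial, data = dat))

# Inverse odds of sampling: reweight trial units toward the target population
dat$w <- ifelse(dat$S == 1, (1 - p_S) / p_S, NA)
summary(dat$w[dat$S == 1])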

  4. Evaluate whether treatment groups are exchangeable given measured covariates.
    Make the case that potential outcomes are independent of treatment conditional on covariates, i.e., $Y(a) \perp\!\!\!\perp A \mid X$ (Neal 2020; Morgan and Winship 2014; Angrist and Pischke 2009; Hernan and Robins 2020). Use design and diagnostics (design diagrams/DAGs, subject-matter arguments, pre-treatment covariate balance, overlap checks). If exchangeability is doubtful, redesign (e.g., stronger measurement, alternative identification strategies) rather than rely solely on modelling.
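
One way to make the exchangeability argument legible is to encode the assumed structure in a DAG and derive the implied adjustment sets, then check balance on the covariates the DAG nominates. A minimal sketch, assuming the dagitty R package; the three-node graph is a toy example, not a template for your study:

Code
library(dagitty)  # assumed available; install.packages("dagitty") if not

# Hypothetical example: L confounds the A -> Y relation
g <- dagitty("dag {
  L -> A
  L -> Y
  A -> Y
}")
adjustmentSets(g, exposure = "A", outcome = "Y")  # prints: { L }

# Crude pre-treatment balance diagnostic: standardized mean difference of a
# measured covariate x across treatment arms a (both hypothetical vectors)
smd <- function(x, a) {
  (mean(x[a == 1]) - mean(x[a == 0])) /
    sqrt((var(x[a == 1]) + var(x[a == 0])) / 2)
}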

  5. Ensure treatments to be compared satisfy causal consistency.
    Consistency requires that, for units receiving a treatment version compatible with level a, the observed outcome equals Y(a); it also presumes well-defined versions and no interference between units (Tyler J. VanderWeele and Hernan 2013; Hernan and Robins 2020). When multiple versions exist, either refine the intervention so versions are irrelevant to Y(a), or condition on version-defining covariates in ways that preserve your estimand.

  6. Check the positivity (overlap) assumption.
    Each treatment level must occur with non-zero probability at every covariate profile needed for exchangeability—and, when versions exist, for consistency as well (Westreich and Cole 2010). Diagnose limited overlap (propensity score distributions, extreme weights), and consider design-stage remedies (trimming, restriction, adaptive sampling) before estimation.
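
A minimal sketch of these overlap diagnostics in base R, on simulated data: compare estimated propensity score distributions across arms, inspect the tail of the implied weights, and see what a (hypothetical) trimming rule would retain:

Code
set.seed(2)
n <- 1000
X <- rnorm(n)
A <- rbinom(n, 1, plogis(1.5 * X))            # treatment depends strongly on X
ps <- fitted(glm(A ~ X, family = binomial))   # estimated propensity score

tapply(ps, A, summary)                        # overlap of PS across arms

w <- ifelse(A == 1, 1 / ps, 1 / (1 - ps))     # inverse probability weights
quantile(w, c(0.5, 0.9, 0.99, 1))             # heavy tails flag near-violations

keep <- ps > 0.05 & ps < 0.95                 # one possible trimming rule
mean(keep)                                    # share retained on common support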

  7. Ensure measurement aligns with the scientific question.
    Verify that constructs are captured by instruments whose error structures won’t distort the causal contrast of interest. Be explicit about likely forms of measurement error (classical, Berkson, differential, misclassification) and their structural implications for bias (Hernan and Robins 2020; Tyler J. VanderWeele and Hernán 2012; Bulbulia 2024). Where feasible, incorporate validation studies, multiple indicators, or calibration models.
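
A short simulation can make the structural point concrete: classical, non-differential error in a continuous exposure attenuates a linear effect toward the null (regression dilution). All quantities below are simulated and hypothetical:

Code
set.seed(3)
n <- 5000
A_true <- rnorm(n)
Y <- 0.5 * A_true + rnorm(n)         # true linear effect = 0.5
A_obs <- A_true + rnorm(n, sd = 1)   # classical, non-differential error

coef(lm(Y ~ A_true))["A_true"]  # recovers ~0.5
coef(lm(Y ~ A_obs))["A_obs"]    # attenuated toward ~0.25 = 0.5 * 1/(1 + 1)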

  8. Preserve representativeness from start to finish.
    End-of-study analyses should reflect the target population’s distribution of effect modifiers. Differential attrition, non-response, or measurement processes tied to treatment and outcomes can induce selection bias in the presence of true effects (M. A. Hernán 2017; Miguel A. Hernán and Robins 2017; Miguel A. Hernán, Hernández-Díaz, and Robins 2004; Bulbulia 2024). Plan and justify strategies such as inverse probability weighting for censoring, multiple imputation under defensible mechanisms, sensitivity analyses for missing not at random, and careful timing of measurements.
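
A minimal sketch of stabilized inverse probability of censoring weights in base R; the dropout model and variable names are hypothetical placeholders:

Code
set.seed(4)
n <- 2000
X <- rnorm(n)                                       # prognostic covariate
A <- rbinom(n, 1, 0.5)                              # treatment
C <- rbinom(n, 1, plogis(-1 + 0.8 * X + 0.5 * A))   # 1 = dropped out

# Model the probability of remaining in the study given A and X
p_stay <- fitted(glm(I(C == 0) ~ A + X, family = binomial))

# Stabilized weights: marginal over conditional probability of remaining
sw <- mean(C == 0) / p_stay
summary(sw[C == 0])  # applied to completers; should average near one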

  9. Document the reasoning that supports steps 1–8.
    Make assumptions, disagreements, and judgement calls legible: register or otherwise time-stamp your analytic plan; include identification arguments (e.g., DAGs), code, and data where possible; report robustness and sensitivity analyses; and explain decisions about design restrictions, modelling choices, and transportability (Ogburn and Shpitser 2021). Transparent reasoning is a scientific result in its own right.

For Projects with Applied Interests

  1. Translate effects into absolute, population-level impact.

Report absolute risk differences for the target population, and always show the baseline risk so the differences are interpretable.
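
A minimal sketch of one way to produce these quantities, via parametric g-computation (standardization) on simulated data; the outcome model and variable names are illustrative only:

Code
set.seed(5)
n <- 5000
X <- rnorm(n)                                  # measured confounder
A <- rbinom(n, 1, plogis(0.5 * X))             # treatment
Y <- rbinom(n, 1, plogis(-1 + 0.7 * A + 0.6 * X))

fit <- glm(Y ~ A + X, family = binomial)

# Standardize: average predicted risk with everyone set to A = 1, then A = 0
risk1 <- mean(predict(fit, data.frame(A = 1, X = X), type = "response"))
risk0 <- mean(predict(fit, data.frame(A = 0, X = X), type = "response"))

c(baseline_risk = risk0, risk_difference = risk1 - risk0)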

  2. Show heterogeneity and targetability.

Where possible, identify who is affected, who is unaffected, and who may be harmed. If appropriate, provide a simple, auditable policy rule for targeting (and a plain-language rationale).
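
A minimal sketch of subgroup-specific risk differences and a threshold-based targeting rule; G is a hypothetical effect modifier, treatment is randomized for simplicity, and the 0.05 threshold stands in for a pre-registered minimum important benefit:

Code
set.seed(6)
n <- 4000
G <- rbinom(n, 1, 0.5)                          # hypothetical effect modifier
A <- rbinom(n, 1, 0.5)                          # randomized for simplicity
Y <- rbinom(n, 1, plogis(-1 + 0.9 * G * A))     # benefit only when G = 1

fit <- glm(Y ~ A * G, family = binomial)
rd <- sapply(c(`G = 0` = 0, `G = 1` = 1), function(g) {
  predict(fit, data.frame(A = 1, G = g), type = "response") -
    predict(fit, data.frame(A = 0, G = g), type = "response")
})
rd          # subgroup-specific absolute risk differences

# Auditable, plain-language rule: treat a subgroup only if its estimated
# benefit clears the pre-specified threshold
rd > 0.05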

  3. Express uncertainty in decision terms.

Go beyond confidence intervals: report the probability that each option is optimal, the expected net benefit, the expected regret, and, when helpful, the value of additional information. Use simulation to make uncertainty tangible.
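
A minimal sketch, assuming you already have posterior (or bootstrap) draws of the mean outcome under each option; the draws here are simulated so the snippet runs on its own:

Code
set.seed(7)
draws <- cbind(                     # 10,000 simulated draws per option
  option_A = rnorm(10000, 0.10, 0.04),
  option_B = rnorm(10000, 0.12, 0.05),
  option_C = rnorm(10000, 0.08, 0.03)
)

best <- max.col(draws)              # which option wins in each draw
table(factor(best, 1:3, colnames(draws))) / nrow(draws)  # P(option is optimal)

regret <- apply(draws, 1, max) - draws  # shortfall versus the draw-wise best
colMeans(regret)                        # expected regret per option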

Packages

Code
report::cite_packages()
  - Chang W (2023). _extrafont: Tools for Using Fonts_. doi:10.32614/CRAN.package.extrafont <https://doi.org/10.32614/CRAN.package.extrafont>, R package version 0.19, <https://CRAN.R-project.org/package=extrafont>.
  - R Core Team (2025). _R: A Language and Environment for Statistical Computing_. R Foundation for Statistical Computing, Vienna, Austria. <https://www.R-project.org/>.
  - Xie Y (2025). _tinytex: Helper Functions to Install and Maintain TeX Live, and Compile LaTeX Documents_. R package version 0.57, <https://github.com/rstudio/tinytex>.
  - Xie Y (2019). "TinyTeX: A lightweight, cross-platform, and easy-to-maintain LaTeX distribution based on TeX Live." _TUGboat_, *40*(1), 30-32. <https://tug.org/TUGboat/Contents/contents40-1.html>.

References

Angrist, Joshua D., and Jörn-Steffen Pischke. 2009. Mostly Harmless Econometrics: An Empiricist’s Companion. Princeton University Press.
Bareinboim, Elias, and Judea Pearl. 2013. “A General Algorithm for Deciding Transportability of Experimental Results.” Journal of Causal Inference 1 (1): 107–34.
Bulbulia, J. A. 2024. “Methods in Causal Inference Part 3: Measurement Error and External Validity Threats.” Evolutionary Human Sciences 6: e42. https://doi.org/10.1017/ehs.2024.33.
Dahabreh, Issa J., and Miguel A. Hernán. 2019. “Extending Inferences from a Randomized Trial to a Target Population.” European Journal of Epidemiology 34 (8): 719–22. https://doi.org/10.1007/s10654-019-00533-2.
Dahabreh, Issa J., James M. Robins, Sebastien J. P. Haneuse, and Miguel A. Hernán. 2019. “Generalizing Causal Inferences from Randomized Trials: Counterfactual and Graphical Identification.” arXiv preprint arXiv:1906.10792.
Deffner, Dominik, Julia M. Rohrer, and Richard McElreath. 2022. “A Causal Framework for Cross-Cultural Generalizability.” Advances in Methods and Practices in Psychological Science 5 (3): 25152459221106366. https://doi.org/10.1177/25152459221106366.
Hernan, M. A., and J. M. Robins. 2020. Causal Inference: What If? Chapman & Hall/CRC Monographs on Statistics & Applied Probability. Taylor & Francis. https://www.hsph.harvard.edu/miguel-hernan/causal-inference-book/.
Hernán, M. A. 2017. “Invited Commentary: Selection Bias Without Colliders.” American Journal of Epidemiology 185 (11): 1048–50. https://doi.org/10.1093/aje/kwx077.
Hernán, Miguel A., Alvaro Alonso, Roger Logan, Francine Grodstein, Karin B. Michels, Walter C. Willett, JoAnn E. Manson, and James M. Robins. 2008. “Observational Studies Analyzed Like Randomized Experiments: An Application to Postmenopausal Hormone Therapy and Coronary Heart Disease.” Epidemiology 19 (6): 766. https://doi.org/10.1097/EDE.0b013e3181875e61.
Hernán, Miguel A., Sonia Hernández-Díaz, and James M. Robins. 2004. “A Structural Approach to Selection Bias.” Epidemiology 15 (5): 615–25. https://www.jstor.org/stable/20485961.
Hernán, Miguel A., and James M. Robins. 2017. “Per-Protocol Analyses of Pragmatic Trials.” New England Journal of Medicine 377 (14): 1391–98.
Kessler, R. C., G. Andrews, L. J. Colpe, E. Hiripi, D. K. Mroczek, S.-L. T. Normand, E. E. Walters, and A. M. Zaslavsky. 2002. “Short Screening Scales to Monitor Population Prevalences and Trends in Non-Specific Psychological Distress.” Psychological Medicine 32 (6): 959–76. https://doi.org/10.1017/S0033291702006074.
Morgan, Stephen L., and Christopher Winship. 2014. Counterfactuals and Causal Inference: Methods and Principles for Social Research. 2nd ed. Analytical Methods for Social Research. Cambridge: Cambridge University Press. https://doi.org/10.1017/CBO9781107587991.
Neal, Brady. 2020. “Introduction to Causal Inference from a Machine Learning Perspective.” Course Lecture Notes (Draft). https://www.bradyneal.com/Introduction_to_Causal_Inference-Dec17_2020-Neal.pdf.
Ogburn, Elizabeth L., and Ilya Shpitser. 2021. “Causal Modelling: The Two Cultures.” Observational Studies 7 (1): 179–83. https://doi.org/10.1353/obs.2021.0006.
Pearl, Judea, and Elias Bareinboim. 2022. “External Validity: From Do-Calculus to Transportability Across Populations.” In Probabilistic and Causal Inference: The Works of Judea Pearl, 451–82.
Stuart, Elizabeth A., Benjamin Ackerman, and Daniel Westreich. 2018. “Generalizability of Randomized Trial Results to Target Populations: Design and Analysis Possibilities.” Research on Social Work Practice 28 (5): 532–37.
VanderWeele, Tyler J., and Miguel A. Hernan. 2013. “Causal Inference Under Multiple Versions of Treatment.” Journal of Causal Inference 1 (1): 1–20.
VanderWeele, Tyler J., and Miguel A. Hernán. 2012. “Results on Differential and Dependent Measurement Error of the Exposure and the Outcome Using Signed Directed Acyclic Graphs.” American Journal of Epidemiology 175 (12): 1303–10. https://doi.org/10.1093/aje/kwr458.
Westreich, Daniel, and Stephen R. Cole. 2010. “Invited Commentary: Positivity in Practice.” American Journal of Epidemiology 171 (6). https://doi.org/10.1093/aje/kwp436.
Westreich, Daniel, Jessie K. Edwards, Catherine R. Lesko, Elizabeth Stuart, and Stephen R. Cole. 2017. “Transportability of Trial Results Using Inverse Odds of Sampling Weights.” American Journal of Epidemiology 186 (8): 1010–14.