Lab 10: End-to-End Research Report

Lab materials

Download the final assessment Option A report template. If the course-site download fails, use the Google Drive mirror.

Run the setup block in Lab Setup, run quarto install tinytex from a terminal, then restart R. Work from the project folder that the report template unzips into. Before attempting the manuscript, open setup.R and run the whole script. The first run downloads the required materials and packages, which may take about 5 to 10 minutes. When setup.R finishes without an error, open and render manuscript.qmd.

This lab puts the whole course together in one workflow. By the end of the session you will have a draft Option A research report rendered to PDF: an outcome-wide average treatment effect table, a forest plot, per-outcome policy trees, and an ethics paragraph.

What you will learn

  1. How to set up a single-source-of-truth setup.R for a research report.
  2. How to estimate four average treatment effects in one batch with margot::margot_causal_forest().
  3. How to apply a Bonferroni correction and report E-values for an outcome-wide design.
  4. How to fit policy trees and choose between depth-1 and depth-2 with a stated parsimony threshold.
  5. How to apply a transparent graphing rule so you graph only policies that survive a stated reporting test.
  6. How to build a research report in Quarto Markdown with citations.

Setup

Install packages before class, then restart R. Follow Lab Setup: R Packages and Build Tools first.

PDF rendering has one extra system step. Run this in a terminal, not in the R console:

quarto install tinytex
quarto list tools

On macOS, use Terminal or the RStudio Terminal tab. On Windows, use Command Prompt, PowerShell, or the RStudio Terminal tab. If Windows says quarto is not recognised, install Quarto from https://quarto.org/docs/download/, close and reopen RStudio, then rerun the commands above.

The R block below mirrors what the report template's setup.R requires at the top: it installs missing CRAN packages, including the tinytex R interface package. The R package is useful, but it does not replace quarto install tinytex, which installs the TeX distribution Quarto needs to build PDFs. The block also requires causalworkshop >= 0.6.0 and margot >= 1.0.322, stopping with a clear restart-R message if a stale namespace is loaded.

cran_packages <- c(
  "ggplot2", "dplyr", "tibble", "tidyr", "ggdag", "grf",
  "knitr", "kableExtra", "rmarkdown", "tinytex"
)
missing_cran <- cran_packages[
  !vapply(cran_packages, \(p) requireNamespace(p, quietly = TRUE), logical(1))
]
if (length(missing_cran) > 0) install.packages(missing_cran)

if (!requireNamespace("pak", quietly = TRUE)) install.packages("pak")
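if (!requireNamespace("margot", quietly = TRUE) ||
    packageVersion("margot") < "1.0.322") {
  pak::pak("go-bayes/margot")
  # an old margot namespace may still be loaded; force a clean restart
  if ("margot" %in% loadedNamespaces()) {
    stop("margot was upgraded; please restart R and re-run.", call. = FALSE)
  }
}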
if (!requireNamespace("causalworkshop", quietly = TRUE) ||
    packageVersion("causalworkshop") < "0.6.0") {
  pak::pak("go-bayes/causalworkshop")
  if ("causalworkshop" %in% loadedNamespaces()) {
    stop("causalworkshop was upgraded; please restart R and re-run.", call. = FALSE)
  }
}

suppressPackageStartupMessages({
  library(causalworkshop)
  library(margot)
  library(grf)
  library(ggdag)
  library(ggplot2)
  library(dplyr)
  library(tibble)
  library(knitr)
  library(kableExtra)
})

If the GitHub package installation fails after installing build tools, stop and use the course lab machine. Do not debug compilers during class.

Step 1: Start from the report template

Download research-report-template.zip and unzip it. If you already downloaded the template from the Google Drive mirror, you can keep using that copy. Open the resulting folder in RStudio (or another editor), then open manuscript.qmd.

In RStudio:

  1. Choose File > Open Project... if the folder contains an .Rproj file; otherwise choose File > Open... and select manuscript.qmd.
  2. Keep the Files pane pointed at the unzipped research-report-template folder.
  3. Click manuscript.qmd to open the report source.

If you use another editor, open the whole unzipped folder first, then open manuscript.qmd from inside that folder. Do not edit the zipped file itself.

Windows note

Keep the template in a simple folder such as Documents\PSYC434\research-report-template, not inside the zipped archive and not in a synced folder with a very long path. Render from the RStudio Terminal with:

quarto render manuscript.qmd

If the render fails before running any R code and mentions LaTeX, run quarto install tinytex from Command Prompt or PowerShell, restart RStudio, and render again.

Your folder should look like this:

research-report/
  setup.R          # study decisions, helpers, single source of truth
  manuscript.qmd   # prose, headings, tables, figures
  _quarto.yml      # render configuration
  references.bib   # BibTeX references
  README.md        # short orientation

setup.R holds decisions and reusable code. manuscript.qmd is the file you will write in: it holds the prose, headings, tables, figures, and code chunks that read from setup.R. The split lets you change one decision (the exposure, the parsimony threshold, the seed) without scattering edits across the manuscript.

The main decision controls live near the top of setup.R:

  • use_fit_cache: whether a successful model fit is reused on later renders. The template leaves this FALSE so a changed exposure or model reruns cleanly.
  • min_gain_for_depth_switch: how much held-out policy-value point gain depth-2 needs before it replaces depth-1.
  • policy_value_lower_threshold and treated_uplift_lower_threshold: the graphing-rule thresholds that decide whether a selected tree is strong enough to appear as a figure.
  • alpha_family_wise: the family-wise error rate for the four-outcome Bonferroni correction.
  • num_trees and n_iterations_stability: fitting controls that trade speed against Monte Carlo noise and stability information.

These are not cosmetic choices. They decide what counts as enough improvement, what counts as enough evidence to graph a tree, and how cautious the report should be.
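As a rough picture, the top of setup.R might be laid out as in the sketch below. The control names and values are the ones quoted in this lab; the layout and the sanity checks at the end are assumptions added for illustration, and the template's actual file may differ.

```r
# Sketch of the decision controls near the top of setup.R.
# Names and values follow this lab; layout and checks are illustrative only.
use_fit_cache <- FALSE                 # refit models on every render
min_gain_for_depth_switch <- 0.01      # point gain depth-2 needs over depth-1
policy_value_lower_threshold <- 0      # graphing rule: policy-value lower limit
treated_uplift_lower_threshold <- 0    # graphing rule: treated-uplift lower limit
alpha_family_wise <- 0.05              # family-wise error rate, four outcomes
num_trees <- 500                       # trees per causal forest
n_iterations_stability <- 50           # stability refits

# Hypothetical guard rails: fail early on an impossible setting.
stopifnot(
  is.logical(use_fit_cache),
  min_gain_for_depth_switch >= 0,
  alpha_family_wise > 0, alpha_family_wise < 1,
  num_trees >= 1, n_iterations_stability >= 1
)
```

Keeping every threshold in one place means a reader can audit the reporting rules without reading the manuscript chunks.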

Quarto Markdown survival guide

A Quarto report is a plain-text .qmd file that mixes prose, citations, tables, figures, and executable R code. When you render it, Quarto runs the code chunks, inserts their output, and turns the result into PDF and HTML. Treat manuscript.qmd as the source of truth for the written report.

Use this minimal syntax while drafting:

## Methods

This report estimates the causal effect of `r exposure_label` on four
wellbeing outcomes.

We follow the outcome-wide reporting logic in VanderWeele [@vanderweele2020].

```{r}
#| label: tbl-ate
#| echo: false
#| tbl-cap: "Average treatment effects for four wellbeing outcomes."

ate$transformed_table
```

The parts to recognise are:

  • ## starts a section heading. Use the template headings unless you have a good reason to change them.
  • Ordinary paragraphs are just text. Leave a blank line between paragraphs.
  • Inline R code uses `r object_name`. Use it when a number or label should update automatically.
  • R code chunks start with ```{r} and end with ```. Put chunk options such as echo: false or fig-cap: directly under the chunk header.
  • Citations use [@citekey], where the key must appear in references.bib. For example, [@vanderweele2020].
  • Figures and tables need captions. A reader should understand what the display shows without searching the prose.

Quarto code chunks: what runs and what appears

Quarto code chunks have three layers: the fence, the options, and the code. The fence says what language the block uses. The options say whether the code runs and whether the code or output appears in the report. The code does the work.

For R chunks, options begin with #| because # is an R comment:

```{r}
#| label: fig-example
#| echo: false
#| warning: false
#| fig-cap: "Example figure caption."

plot(1:10)
```

For LaTeX/TikZ chunks, options begin with %| because % is a LaTeX comment:

```{tikz}
%| label: fig-dag-tikz
%| eval: false
%| fig-cap: "TikZ DAG, switched off while using the ggdag version."

\begin{tikzpicture}
  \node {$A \to Y$};
\end{tikzpicture}
```

The most useful options are:

  • eval: false: do not run this chunk. Use this when you want to keep alternative code in the file, such as the TikZ DAG, without rendering it.
  • echo: false: run the code but hide the code from the PDF or HTML. Use this for most tables and figures.
  • include: false: run the code but hide the code and all its output. Use this for setup chunks that create objects used later.
  • warning: false and message: false: hide routine warnings or package messages from the report. Do not use these to hide real errors.
  • tbl-cap and fig-cap: add table and figure captions. Use tbl-cap for tables and fig-cap for figures.
  • label: name the chunk. Labels must be unique, short, and contain no spaces. A good pattern is tbl-results, fig-forest, or setup-fit.

Use the right option for the job:

Goal                                              Option
Keep code in the file but stop it running         eval: false
Run code but hide the code itself                 echo: false
Run setup code silently                           include: false
Show output as raw Markdown                       output: asis
Temporarily let render continue after an error    error: true

In the template, the ggdag DAG chunk has #| eval: true and the TikZ chunk has %| eval: false. To switch to the TikZ version, turn the ggdag chunk to #| eval: false and the TikZ chunk to %| eval: true. Keep only one DAG version active at a time.

Do not put options in the middle of a chunk. Quarto reads them at the top of the chunk, before the code. Also check the comment marker: #| is for R chunks, %| is for TikZ/LaTeX chunks.

For this report, write the prose in manuscript.qmd, put study decisions and helper code in setup.R, and keep references in references.bib. Do not paste screenshots of tables or figures into the report. Let the code chunks create them when the document renders.


Adding citations with BibTeX

The template already tells Quarto to use references.bib. You only need to add BibTeX entries to that file and cite them from manuscript.qmd.

Use this workflow for a source you find on Google Scholar:

  1. Search for the exact article title in Google Scholar.
  2. Find the correct record. Prefer the publisher version or a record with a Digital Object Identifier (DOI) when there are duplicates.
  3. Click the quotation-mark cite icon, or the Cite link if that is what your browser shows.
  4. Click BibTeX.
  5. Copy the full entry, from @article{... or @book{... through the closing }.
  6. Paste it at the bottom of references.bib.
  7. Give the entry a readable key, such as vanderweele2020, ding2016, or smith2024wellbeing.
  8. In manuscript.qmd, cite it with [@vanderweele2020] or use an in-text citation such as @vanderweele2020.
  9. Render the manuscript. If Quarto says a citation is missing, check that the key in the .qmd exactly matches the key in references.bib.
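The key-matching check in the last step can be sketched in base R. This is a minimal illustration on inline strings, not part of the template: the helper logic and the example text are hypothetical, and in practice you would read the two files with readLines(). Note that Quarto cross-references such as @tbl-ate would also be picked up by this simple pattern.

```r
# Toy stand-ins for manuscript.qmd and references.bib (hypothetical content).
qmd_text <- c(
  "We follow the outcome-wide logic in VanderWeele [@vanderweele2020].",
  "See also @ding2016 and [@smith2024wellbeing]."
)
bib_text <- c(
  "@article{vanderweele2020,",
  "  title = {Outcome-wide Epidemiology},",
  "}",
  "@article{ding2016,",
  "}"
)

# Keys cited in the manuscript: whatever follows an @ sign.
cited_keys <- unique(unlist(regmatches(
  qmd_text,
  gregexpr("(?<=@)[A-Za-z][A-Za-z0-9_:-]*", qmd_text, perl = TRUE)
)))

# Keys defined in the bibliography: the text between "{" and the first comma.
entry_lines <- grep("^@\\w+\\{", bib_text, value = TRUE, perl = TRUE)
defined_keys <- sub("^@\\w+\\{\\s*([^,\\s]+)\\s*,.*$", "\\1", entry_lines, perl = TRUE)

# Keys that are cited but never defined will fail at render time.
missing_keys <- setdiff(cited_keys, defined_keys)
missing_keys
```

Here the only key flagged is smith2024wellbeing, because it is cited but has no entry in the toy bibliography.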

A BibTeX entry looks like this:

@article{vanderweele2020,
  title = {Outcome-wide Epidemiology},
  author = {VanderWeele, Tyler J.},
  journal = {Epidemiology},
  year = {2020},
  volume = {31},
  number = {1},
  pages = {6--9},
  doi = {10.1097/EDE.0000000000001141}
}

Check Google Scholar entries before trusting them. Fix obvious errors in capitalisation, missing journal names, missing DOIs, page ranges, or author names. For title capitalisation that must be preserved in APA style, protect proper nouns with braces, for example title = {Te Tiriti o Waitangi and Wellbeing}.

For several sources, you can also set Google Scholar to show BibTeX links directly: open the menu, choose Settings, find Bibliography manager, select Show links to import citations into, choose BibTeX, and save. Some library guides also describe saving sources to My Library and exporting several records at once, but for this assessment it is usually safer to add and check one source at a time.

Step 2: Make your study decisions

Open setup.R. The main study decisions and reporting thresholds live near the top. Start with the target population, exposure, seed, and simulated sample size:

target_population <- "..."

# for the lab walkthrough, set this to "community_group"
# for your submitted report, set this to "religious_service" or "volunteer_work"
name_exposure <- "community_group"

study_seed <- 2026
study_n <- 2000

Lab demo vs report

In this session we walk through the same template using community_group as the exposure. We use community_group for the lab so we can work end-to-end without spoiling either of the two report exposures (religious_service or volunteer_work). When you start your report, change name_exposure to one of those two. The setup script accepts all three values so the lab can render, but only religious_service and volunteer_work are valid choices for submission.

Choose the people first. The target population tells the reader whose wellbeing the report is about and which interventions could sensibly apply to them. Only then define the exposure contrast and outcomes.

The four wellbeing outcomes are fixed by the assignment: sense of purpose, belonging, self-esteem, and life satisfaction at wave 2. The adjustment set $L$ contains baseline demographics, socioeconomic and personality variables, the baseline exposure, and the baseline value of every outcome (lagged-self adjustment — using each variable's own past value as a covariate).

Step 3: Simulate the panel

The simulator returns a long panel with three waves. The simulate_panel() helper in setup.R reshapes it into a wide tibble with one row per person: baseline covariates, exposure at wave 1, and the four outcomes at wave 2.

panel <- simulate_panel()
nrow(panel)
mean(panel$exposure_t1)
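The long-to-wide reshape that simulate_panel() performs can be pictured with a toy example. This sketch uses tidyr::pivot_wider() on invented data; the column names and the naming pattern are hypothetical and need not match what the helper in setup.R actually produces.

```r
library(tidyr)
library(tibble)

# Toy long panel: 3 people x 3 waves (0, 1, 2). All names are hypothetical.
long_panel <- tibble(
  id       = rep(1:3, each = 3),
  wave     = rep(0:2, times = 3),
  exposure = c(0, 1, 1,  0, 0, 0,  1, 1, 0),
  purpose  = c(4.1, 4.3, 4.8,  3.9, 3.7, 3.8,  5.0, 5.2, 5.1)
)

# One row per person; each wave-by-variable pair becomes its own column.
wide_panel <- pivot_wider(
  long_panel,
  id_cols     = id,
  names_from  = wave,
  values_from = c(exposure, purpose),
  names_glue  = "t{wave}_{.value}"
)

# Baseline outcome, wave-1 exposure, wave-2 outcome: the columns an analysis needs.
wide_panel[, c("id", "t0_purpose", "t1_exposure", "t2_purpose")]
```

The wide layout puts the baseline value, the exposure, and the later outcome for each person on one row, which is the shape a causal forest expects.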

Synthetic data

The simulator is the same one used in Lab 9. It supports the eight exposure-by-outcome combinations students may pick. Numerical results do not generalise to any real population; they let you practise the workflow against a known truth.

Step 4: Fit four causal forests in one call

margot::margot_causal_forest() fits one honest causal forest per outcome with a shared adjustment set, and returns a single object with combined results. This is more concise than looping over grf::causal_forest() yourself, and it produces the inputs the policy-tree pipeline expects.

X <- as.matrix(panel[, covariate_cols])
W <- panel$exposure_t1
weights <- rep(1, nrow(panel))

models_binary <- margot::margot_causal_forest(
  data = panel,
  outcome_vars = c("t2_purpose", "t2_belonging",
                   "t2_self_esteem", "t2_life_satisfaction"),
  covariates = X,
  W = W,
  weights = weights,
  grf_defaults = list(num.trees = 500, honesty = TRUE),
  top_n_vars = 12,
  save_models = TRUE,
  save_data = TRUE,
  compute_conditional_means = TRUE,
  train_proportion = 0.5,
  use_train_test_split = TRUE,
  seed = study_seed
)

print(models_binary$combined_table)

Reading combined_table

One row per outcome. Columns are the risk-difference estimate E[Y(1)]-E[Y(0)], the 95% confidence interval (2.5 %, 97.5 %), the E-value for the point estimate (E_Value), and the E-value for the bound nearest the null (E_Val_bound). The CI is unadjusted; you will apply Bonferroni in Step 5.
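For comparison, the manual loop that margot_causal_forest() replaces can be sketched with grf directly. The data below are simulated on the spot and the outcome names are hypothetical; this is not margot's internal code, only an illustration of fitting one honest causal forest per outcome on a shared covariate matrix and collecting unadjusted 95% intervals.

```r
library(grf)

set.seed(2026)
n <- 500
X <- matrix(rnorm(n * 3), nrow = n, dimnames = list(NULL, c("x1", "x2", "x3")))
W <- rbinom(n, 1, plogis(0.5 * X[, "x1"]))      # exposure depends on x1
outcomes <- data.frame(                          # two toy outcomes
  toy_purpose   = 0.3 * W + X[, "x1"] + rnorm(n),
  toy_belonging = 0.1 * W + X[, "x2"] + rnorm(n)
)

# Fit one forest for one outcome and return its ATE with a standard error.
fit_one <- function(outcome_name) {
  forest <- causal_forest(X, outcomes[[outcome_name]], W,
                          num.trees = 500, honesty = TRUE, seed = 2026)
  ate <- average_treatment_effect(forest)
  data.frame(outcome = outcome_name,
             estimate = ate[["estimate"]],
             std_err = ate[["std.err"]])
}

ate_table <- do.call(rbind, lapply(names(outcomes), fit_one))
ate_table$lower <- ate_table$estimate - qnorm(0.975) * ate_table$std_err
ate_table$upper <- ate_table$estimate + qnorm(0.975) * ate_table$std_err
ate_table
```

The loop works, but it leaves you to assemble E-values, labels, and policy-tree inputs yourself, which is the bookkeeping the batch call handles.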

Step 5: Outcome-wide reporting with margot_plot()

Reporting four outcomes in one table requires correcting for multiplicity. margot::margot_plot() does this in one call: pass the four-outcome combined table, set adjust = "bonferroni", and it returns a plot, a transformed table with the multiplicity-adjusted confidence intervals, and a short prose interpretation.

ate <- ate_plot_objects(models_binary)   # wrapper from setup.R
ate$plot                                  # forest plot
ate$transformed_table                     # ATE, Bonferroni 95% CI, E-values
cat(ate$interpretation)                   # auto-drafted summary

ate_plot_objects() wraps margot::margot_plot() with a plot_defaults list (in setup.R) so you can restyle the plot (colours, ordering, text size, x-axis limits) without editing the manuscript chunks. The Bonferroni correction widens each CI by the family-wise $z$ factor ($z_{\alpha_{FW}/(2k)}$, the upper-tail critical value, with $k = 4$; about 2.50 in place of 1.96 when $\alpha_{FW} = 0.05$), and the bound E-value is recomputed at the multiplicity-adjusted limit nearest the null. If the adjusted interval includes zero, the bound E-value is reported as 1, meaning no unmeasured-confounder strength is needed to push the bound to the null.
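To make the two E-value columns concrete, here is a minimal sketch of the standard risk-ratio E-value formula, combined with the common approximation RR ≈ exp(0.91 × d) for a standardised mean difference d. The function names and input numbers are hypothetical, and the sketch assumes estimates on a standardised scale; margot's own computation may differ in detail.

```r
# E-value for a risk ratio: RR + sqrt(RR * (RR - 1)), applied to RR >= 1.
# Ratios below 1 are inverted first.
evalue_rr <- function(rr) {
  rr <- ifelse(rr < 1, 1 / rr, rr)
  rr + sqrt(rr * (rr - 1))
}

# E-values for a standardised mean difference with a confidence interval.
# Assumption: RR is approximated by exp(0.91 * d).
evalue_smd <- function(estimate, lower, upper) {
  point <- evalue_rr(exp(0.91 * estimate))
  # Bound E-value: use the limit nearest the null; 1 if the CI includes 0.
  bound <- if (lower <= 0 && upper >= 0) {
    1
  } else if (lower > 0) {
    evalue_rr(exp(0.91 * lower))
  } else {
    evalue_rr(exp(0.91 * upper))
  }
  c(E_Value = point, E_Val_bound = bound)
}

# Hypothetical inputs: one interval excludes zero, one includes it.
evalue_smd(estimate = 0.20, lower = 0.08, upper = 0.32)
evalue_smd(estimate = 0.05, lower = -0.04, upper = 0.14)
```

The second call returns a bound E-value of 1 because its interval includes zero, which is the case described above.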

Why Bonferroni

The outcome-wide design asks how the exposure affects all four outcomes jointly. If you tested each outcome at $\alpha = 0.05$ and the tests were independent, the family-wise error rate would be approximately $1 - (1 - 0.05)^4 \approx 0.19$. Bonferroni tests each outcome at $0.05 / 4 = 0.0125$, which keeps the family-wise rate at or below 0.05. Other corrections are reasonable: Holm also controls the family-wise rate, while Benjamini-Hochberg controls the false discovery rate instead. Pick one and state which.
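The arithmetic behind the correction can be checked in a few lines of base R. The estimate and standard error below are hypothetical; only the error-rate and critical-value calculations come from the text.

```r
alpha_family_wise <- 0.05
k <- 4   # number of outcomes in the family

# Family-wise error rate if each of k independent tests uses alpha = 0.05.
fwer_unadjusted <- 1 - (1 - alpha_family_wise)^k
round(fwer_unadjusted, 3)   # 0.185

# Critical values: unadjusted versus Bonferroni-adjusted (two-sided).
z_unadjusted <- qnorm(1 - alpha_family_wise / 2)         # about 1.96
z_bonferroni <- qnorm(1 - alpha_family_wise / (2 * k))   # about 2.50

# Hypothetical estimate and standard error, to show the interval widening.
estimate <- 0.12
std_err  <- 0.04
ci_unadjusted <- estimate + c(-1, 1) * z_unadjusted * std_err
ci_bonferroni <- estimate + c(-1, 1) * z_bonferroni * std_err
rbind(ci_unadjusted, ci_bonferroni)
```

The adjusted interval is wider by the ratio of the two critical values, roughly 27% here, regardless of the estimate.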

Step 6: Policy trees

After estimating the four ATEs, the question becomes whether the same exposure benefits everyone equally. Lab 9 introduced policy trees as an interpretable allocation rule. Lab 10 uses the same margot workflow, then applies a transparent graphing rule before deciding which trees to put in the report.

The course workflow uses an outcome-only policy objective. A do not treat leaf means the fitted rule assigns the no-treatment action for that covariate profile; it does not compute a financial saving. If a decision-maker needs a scarce-budget or cost-sensitive rule, the cost must be specified explicitly, for example by subtracting a treatment cost c from the treatment reward before refitting the policy tree.
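The cost adjustment described above amounts to one subtraction on the reward matrix. The sketch below uses a hypothetical five-person matrix of reward estimates and a plug-in "pick the larger reward" rule in place of a fitted policy tree, so it shows only how a cost changes the assignments; in a real analysis the tree would be refitted on the net rewards.

```r
# Hypothetical reward estimates for five people, one column per action.
rewards <- cbind(
  control = c(0.10, 0.40, 0.25, 0.30, 0.05),
  treated = c(0.30, 0.45, 0.20, 0.50, 0.08)
)

# Outcome-only objective: choose the action with the larger estimated reward.
action_outcome_only <- colnames(rewards)[max.col(rewards, ties.method = "first")]

# Cost-sensitive objective: charge a per-person treatment cost c (hypothetical).
treatment_cost <- 0.10
rewards_net <- rewards
rewards_net[, "treated"] <- rewards_net[, "treated"] - treatment_cost
action_cost_sensitive <- colnames(rewards_net)[max.col(rewards_net, ties.method = "first")]

data.frame(action_outcome_only, action_cost_sensitive)
mean(action_outcome_only == "treated")     # share treated without a cost
mean(action_cost_sensitive == "treated")   # share treated once cost is charged
```

In this toy case the treated share falls from 0.8 to 0.4 once the cost is charged, because people with small gains no longer clear the bar.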

Step 6a: Stability and the workflow object

policy_tree_stability <- margot::margot_policy_tree_stability(
  model_results = models_binary,
  depth = 2,
  n_iterations = 50,
  vary_type = "split_only",
  parallel = FALSE,
  label_mapping = label_mapping,
  seed = study_seed
)

wf <- margot::margot_policy_workflow(
  stability = policy_tree_stability,
  original_df = panel,
  label_mapping = label_mapping,
  audience = "policy",
  prefer_stability = TRUE,
  min_gain_for_depth_switch = 0.01,
  signal_score = "pv_snr",
  signals_k = 3,
  interpret_models = "wins_borderline",
  plot_models = "wins_borderline"
)

print(wf$policy_brief_df)
print(wf$best$depth_summary_df)

policy_brief_df lists, for each outcome, the selected depth, the policy value with its 95% confidence interval, the treated-uplift with its 95% confidence interval, and coverage (the share of the sample the rule recommends for treatment).

best$depth_summary_df contains the depth-1 versus depth-2 comparison and the parsimony decision. With min_gain_for_depth_switch = 0.01, the workflow selects depth-2 only when the held-out policy-value point gain over depth-1 is at least 0.01 outcome units. Use uncertainty intervals, stability, equity, and implementation burden to judge how cautiously to interpret the selected rule; do not treat interval overlap as the depth-selection rule.
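The parsimony rule itself is a single comparison. The sketch below applies it to hypothetical policy values; the function and column names are invented for illustration, and margot's workflow may weigh additional information (such as stability) that this sketch ignores.

```r
# Select depth 2 only when its held-out point gain over depth 1 meets the threshold.
select_depth <- function(pv_depth1, pv_depth2, min_gain = 0.01) {
  gain <- pv_depth2 - pv_depth1
  ifelse(gain >= min_gain, 2L, 1L)
}

# Hypothetical held-out policy values for the four outcomes.
depth_summary <- data.frame(
  outcome   = c("purpose", "belonging", "self_esteem", "life_satisfaction"),
  pv_depth1 = c(0.040, 0.020, 0.031, 0.015),
  pv_depth2 = c(0.065, 0.024, 0.050, 0.012)
)
depth_summary$gain <- depth_summary$pv_depth2 - depth_summary$pv_depth1
depth_summary$selected_depth <- select_depth(
  depth_summary$pv_depth1, depth_summary$pv_depth2, min_gain = 0.01
)
depth_summary
```

With these numbers, depth 2 is selected for purpose and self-esteem only; the other two outcomes keep the simpler depth-1 rule because their gains fall short of 0.01.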

In the report template, these decisions appear in two tables:

  • tbl-depth reads from fit$wf$best$depth_summary_df and shows the depth-1 value, depth-2 value, point gain, and selected depth.
  • tbl-selection reads from fit$selection and shows whether each selected tree passes the graphing rule.

The corresponding controls are in setup.R: min_gain_for_depth_switch controls tbl-depth; policy_value_lower_threshold and treated_uplift_lower_threshold control tbl-selection.

Coverage is an output of the fitted rule, not a fixed treatment budget. If the rule treats 80% of people, the tree is mostly saying "treat broadly" under the outcome-only objective. It is not saying a programme with capacity for 20% should use the same rule.

Step 6b: The graphing rule

margot::margot_select_grf_policy_trees() applies a transparent reporting test. It keeps a policy tree only when both the policy-value lower confidence limit and the treated-uplift lower confidence limit exceed zero. Outcomes that fail the test still appear in your tables; their trees do not get graphed.

selection <- margot::margot_select_grf_policy_trees(
  policy_brief = wf$policy_brief_df,
  policy_value_lower_threshold = 0,
  treated_uplift_lower_threshold = 0
)
print(selection)

The graph_policy_tree column is TRUE for outcomes whose targeting story passes the test. State the thresholds in your methods. If you raise them (for example, policy_value_lower_threshold = 0.05), justify the choice.
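Written out by hand, the graphing rule is a pair of comparisons per outcome. The sketch below uses a hypothetical data frame; the Outcome and graph_policy_tree column names follow this lab, while the two lower-limit column names and all numbers are invented and are not margot's actual output.

```r
# Hypothetical lower confidence limits for the four outcomes.
policy_brief <- data.frame(
  Outcome              = c("Purpose", "Belonging", "Self-esteem", "Life satisfaction"),
  policy_value_lower   = c(0.012, -0.004, 0.020, 0.003),
  treated_uplift_lower = c(0.030,  0.010, -0.002, 0.015)
)

# A tree is graphed only if BOTH lower limits exceed their thresholds.
passes_graphing_rule <- function(brief, pv_threshold = 0, uplift_threshold = 0) {
  brief$policy_value_lower > pv_threshold &
    brief$treated_uplift_lower > uplift_threshold
}

policy_brief$graph_policy_tree <- passes_graphing_rule(policy_brief)
policy_brief$Outcome[policy_brief$graph_policy_tree]

# A stricter policy-value threshold removes every tree in this toy example.
sum(passes_graphing_rule(policy_brief, pv_threshold = 0.05))
```

Belonging fails on the policy-value limit and self-esteem fails on the treated-uplift limit, so neither would be graphed even though each passes one of the two tests.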

Why a graphing rule

The policy-tree algorithm always returns a tree. The interesting question is whether that tree is reliable enough to put in front of a reader. Reporting a tree whose policy-value confidence interval includes zero invites overclaiming. The graphing rule is a precommitment: state the test, then graph only what passes.

Step 6c: Plotting the selected trees

margot_policy_workflow() builds combined plots (decision tree + prediction-points scatter) for the models flagged for interpretation. Pull them out for the manuscript by mapping selected outcome labels back to their model names.

label_to_model <- setNames(
  wf$best$depth_summary_df$model,
  wf$best$depth_summary_df$outcome_label
)
graphed_labels <- selection$Outcome[selection$graph_policy_tree]
graphed_models <- unname(label_to_model[graphed_labels])

for (mn in graphed_models) {
  print(wf$plots[[mn]]$combined_plot)
}

If wf$plots does not contain a model you want to graph, change interpret_models and plot_models in the workflow call. The default wins_borderline covers most reports; recommended is more permissive.

Step 7: Audit against simulator ground truth

The simulator stores the true individual treatment effects in tau_* columns. A side-by-side comparison of the estimated ATE and the population mean tau is a sanity check the analyses themselves cannot perform.

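# ground_truth_audit() is the setup.R helper that supplies true_mean_tau per outcome.
# `results` is assumed to hold the estimated ATEs, one row per outcome,
# with columns outcome_label and estimate.
truth <- ground_truth_audit()
results |>
  select(outcome_label, estimate) |>
  left_join(truth, by = "outcome_label") |>
  mutate(diff = estimate - true_mean_tau)   # estimate minus simulator truth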

In a real research report you do not have a tau_* column. The audit is a teaching scaffold: in the lab, it shows you how close the estimates come to the truth. Do not include the audit in the submitted report.
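The idea of the audit can be reproduced on a toy simulation where the truth is known by construction. Everything below is hypothetical and independent of the course simulator: the exposure is randomised, so a simple difference in means stands in for the forest estimate.

```r
set.seed(2026)
n <- 5000
x   <- rnorm(n)
tau <- 0.2 + 0.1 * x          # true individual effects, known only in simulation
w   <- rbinom(n, 1, 0.5)      # randomised exposure, so no confounding
y   <- x + tau * w + rnorm(n)

# Estimated ATE and its standard error from a difference in means.
estimate <- mean(y[w == 1]) - mean(y[w == 0])
std_err  <- sqrt(var(y[w == 1]) / sum(w == 1) + var(y[w == 0]) / sum(w == 0))

# Side-by-side comparison with the ground truth.
data.frame(
  estimate,
  true_mean_tau = mean(tau),
  diff = estimate - mean(tau),
  std_err
)
```

A difference that is small relative to the standard error is what a healthy audit looks like; a difference of several standard errors would point to a bug or a violated assumption.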

Step 8: Render the manuscript

Open manuscript.qmd in RStudio (or your editor) before rendering. This is the report source file. The preamble sources setup.R, simulates the panel, fits the four causal forests, builds the policy-tree workflow, and applies the graphing rule. Every table and figure reads from the resulting objects.

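In the terminal (not the R console), move into the unzipped template folder (shown here as research-report; use your folder's actual name) and run: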

cd research-report
quarto render manuscript.qmd

On Windows, first open the unzipped template folder in RStudio, then run the render command from the RStudio Terminal. If you are using Command Prompt or PowerShell, use cd to move into the folder that contains manuscript.qmd; paths with spaces are easier to handle if you keep the folder under Documents\PSYC434.

Quarto writes a PDF file and an HTML file next to manuscript.qmd. Read both alongside the .qmd source. Check that every table and figure updates from setup.R; do not hard-code numerical results in the prose.

Render early and often

Render after every meaningful change. A render that fails an hour after the last successful render is easier to repair than one that fails after a day. While drafting, lower study_n, num_trees, and n_iterations_stability in setup.R for faster iteration; raise them for the final render.
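One way to switch between fast drafting and the final render is a single flag. The draft_mode variable and the draft values below are hypothetical and are not part of the template; the final values are the ones used elsewhere in this lab.

```r
# Hypothetical switch; the template may not define `draft_mode`.
draft_mode <- TRUE

# Smaller settings while drafting, full settings for the final render.
study_n                <- if (draft_mode) 500 else 2000
num_trees              <- if (draft_mode) 200 else 500
n_iterations_stability <- if (draft_mode) 10 else 50

# Make the active mode visible in the render log.
message(
  "Rendering in ", if (draft_mode) "DRAFT" else "FINAL", " mode: ",
  "study_n = ", study_n, ", num_trees = ", num_trees,
  ", n_iterations_stability = ", n_iterations_stability
)
```

Set draft_mode to FALSE before the render you submit, since draft settings trade accuracy for speed.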

If the render fails, read the first error message in the terminal. Most failures come from a missing package, an object name that does not exist, an unclosed code chunk, or a citation key that is absent from references.bib.

Step 9: Write the report

The template gives you headings, code chunks, tables, and figures. What it does not give you is the prose. Replace each italicised placeholder paragraph with your own writing, using the Introduction, Methods, Results, and Discussion (IMRAD) structure. IMRAD is the default for psychology, epidemiology, and most life-sciences journals.

The ten steps in the Causal Workflow tell you what a defensible causal study has to establish. IMRAD tells you where in the manuscript each piece goes. The two are complementary: every IMRAD slot is anchored to specific workflow steps.

IMRAD checklist, mapped to the causal workflow

  • Title: concise and informative. Name the exposure, the outcome family, and the population. Highlight the result, not only the method.
  • Abstract: a stand-alone summary with background, causal question, key methods, principal results, and conclusions. Anchored in workflow Steps 0, 1, 2, 3 (target population, exposure, time zero, outcomes).
  • Introduction: set the scientific context, identify the gap, and state why the question matters for the target population. Do not turn the Introduction into the ten-step workflow and do not summarise results here.
  • Methods: detailed enough for another investigator to reproduce the study. Anchored in Steps 0-8: target population, time zero, exposure, outcomes, the DAG and adjustment set for exchangeability, the causal-consistency argument, positivity diagnostics, measurement choices, and the strategy for preserving representativeness (attrition, missingness).
  • Results: an objective presentation of findings, namely the four ATEs with Bonferroni-adjusted intervals, E-values, and any policy trees that pass the graphing rule. No interpretation, no citations. These are outputs of estimation, conditional on the identification arguments stated in Methods.
  • Discussion: interpret results, compare with existing literature, address limitations, state implications. Anchored in Step 9 (transparent documentation): name the assumptions you made, where each could fail, and what the sensitivity analyses (E-values) imply. Acknowledge limitations honestly.
  • In summaries and abstracts, keep to one tense.
  • Use flowing prose in the main text. Bullet lists belong in methods checklists and supplementary material.

The template's section ordering follows the workflow after the Introduction: target population first, then the causal contrast and outcomes, then identification, then estimation. The Introduction, Methods, Results, and Discussion blocks map cleanly onto the IMRAD slots above; use the checklist as your editing pass.

Ethics paragraph (subsection of the Discussion)

The Option A criteria require a short statement on what would need to be considered before acting on a policy rule. Cover fairness, proxy variables, governance, and one value judgement the analysis depends on. In the template this lives as the Ethics subsection of the Discussion (three to five sentences). A policy tree that splits on income or deprivation can be defensible, or it can be a proxy for ethnicity or migration history; the ethics paragraph names the trade-off. Also state that the current rule does not include treatment cost unless you have explicitly specified one. The model cannot settle public values; you can describe what evidence the model can and cannot provide.

Pointers