Assessments

Overview

| Assessment | CLOs | Weight | Due |
|---|---|---|---|
| Lab diaries (8 × 1.25%) | 1, 2, 3 | 10% | Weekly |
| In-class test 1 | 2 | 20% | 22 April (w7) |
| In-class test 2 | 2, 3 | 20% | 20 May (w11) |
| In-class presentation | 1, 2, 3 | 10% | 27 May (w12) |
| Research report | 1, 2, 3 | 40% | Friday 29 May |

Assessment 1: Lab Diaries (10%)

Nine weekly diaries, one per lab (weeks 1–6 and 8–10). There are no labs in week 7 (test 1), week 11 (test 2), or week 12 (presentations). Your best eight diaries count (8 × 1.25%), so you may miss one without penalty. Each diary is graded satisfactory/not satisfactory. You receive full credit for submitting a satisfactory entry. Diaries are due by the end of the lab session.
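The best-eight rule can be checked with a few lines of R. This is a minimal sketch of the arithmetic only; the record of satisfactory diaries is made up for illustration.

```r
# TRUE = satisfactory; FALSE = missed or not satisfactory
# (hypothetical record for the nine diaries)
satisfactory <- c(TRUE, TRUE, TRUE, FALSE, TRUE, TRUE, TRUE, TRUE, TRUE)

counted    <- min(sum(satisfactory), 8)  # only the best eight count
diary_mark <- counted * 1.25             # percentage points towards the course grade
diary_mark
```

With one diary missed, eight still count, so the diary component is unaffected.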

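| Diary | Week | Due date |
|---|---|---|
| lab-01.md | w1 | Wed 25 Feb |
| lab-02.md | w2 | Wed 4 Mar |
| lab-03.md | w3 | Wed 11 Mar |
| lab-04.md | w4 | Wed 18 Mar |
| lab-05.md | w5 | Wed 25 Mar |
| lab-06.md | w6 | Wed 1 Apr |
| lab-08.md | w8 | Wed 29 Apr |
| lab-09.md | w9 | Wed 6 May |
| lab-10.md | w10 | Wed 13 May |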

What to write

Each diary is a short reflection (~150 words) covering:

  1. What the lab covered and what you did.
  2. A connection to the week's readings or lecture content.
  3. One thing you found useful, surprising, or challenging.

Several labs also set focussed exercises.

Diaries are marked full-credit/no-credit. If your diary shows that you engaged with the lab, you will get full credit, even if some exercises are wrong.

Format

Write each diary as a plain markdown (.md) file named by week number: lab-01.md, lab-02.md, …, lab-10.md (there is no lab-07.md). Use GitHub-flavoured markdown formatting: headings, paragraphs, bold, italics, and lists. Because you push diaries to GitHub, your files will render there automatically. These submissions build your markdown fluency; later in the course you will use Quarto to render markdown documents as PDF and Word files.

Submission

Push your diary files to your private GitHub Classroom repository set up in Lab 1. The commit timestamp is your submission record. Your repository is private and visible only to you and the course coordinator; no additional sharing step is needed.

Markdown example

Here is a minimal diary entry showing basic markdown formatting:

```markdown
# Lab 01: Introduction to R

This week we installed R and RStudio, then ran our first script.
The exercise connected to the lecture on **causal questions** by
showing how we structure data for analysis.

I found the following steps useful:

- Creating an RStudio project
- Writing a short R script
- Pushing changes to GitHub
```

Assessment 2: In-Class Test 1 (20%) — 22 April

Covers material from weeks 1–6 (causal questions, causal diagrams, confounding, average treatment effect (ATE), effect modification). The test is written to take about 50 minutes, and the full class time of 1 hour 50 minutes is allocated for it. Bring a pen or pencil and one A4 sheet of notes. No devices permitted.

Test Location

The test is in class. Come to the seminar room (EA120) with a writing instrument.

Assessment 3: In-Class Test 2 (20%) — 20 May

Covers material from weeks 8–10 (heterogeneous treatment effects, machine learning, resource allocation, policy trees, and classical measurement theory). Same format and conditions as test 1: in class, on paper, with one A4 sheet of notes, no devices, and no AI tools.

Use the Assessment Self-Checks while preparing. They list the things you should be able to state clearly: the causal question, heterogeneous treatment effects, policy trees, and measurement assumptions.

Test Location

The test is in class. Come to the seminar room (EA120) with a writing instrument.

Assessment 4: In-Class Presentation (10%) — 27 May

You will present your research report (Option A) or your Marsden EOI concept (Option B). Your job is to answer two questions for a non-specialist audience: what is it, and why should anyone care?

The presentation is 10 minutes, followed by one question from the panel. You must answer the question after your talk. You may ask one brief clarifying question before answering.

You may use the whiteboard and paper notes. Do not use slides, handouts, devices, or other materials.

Your talk should cover the following points, in this order.

  1. Title and motivation (what is it, so what).
  2. Causal question, target population, exposure, and outcomes.
  3. A simple causal diagram showing your identification strategy.
  4. Causal estimand and analysis plan (what you will estimate, and how).
  5. One key limitation or risk, and how you will address it.

The full grade-banded rubric is in the Presentation Rubric.

Assessment 5: Research Report (40%) — Due Friday 29 May

You choose your format

Students choose one of two formats for the research report:

  • Option A: Research Report — quantify an average treatment effect using the synthetic dataset produced by the course simulator.
  • Option B: Marsden Fund EOI — write a first-round Marsden Fund Expression of Interest using the causal inference framework.

You must declare your choice by submitting the option form on Nuku by Friday 3 April (end of w6). If no declaration is received by this date, Option A is assumed.

For Option A, generate your data using the causalworkshop package:

```r
# install (once)
install.packages("pak")
pak::pak("go-bayes/causalworkshop")

# generate data
library(causalworkshop)
d <- simulate_nzavs_data(n = 5000, seed = 2026)
```

Choose one exposure (religious_service or volunteer_work) and report effects outcome-wide on all four wellbeing outcomes (purpose, belonging, self_esteem, life_satisfaction). The third exposure in the simulator, community_group, is reserved for the worked example in Lab 9 and may not be used for the report. Lab sessions support you in this assignment. We assume no statistical background.

Current template and historical examples

  • Start from the current research-report template, which is the 2026 scaffold for Option A. If the course-site download fails, use the Google Drive mirror.
  • A historical report example is available here: Example PSYC434 Report. Use it to see tone and structure, not as a checklist for this year's narrower simulated-data assignment. Do not treat its topic, subgroup choices, or ethical framing as a model for this year's report.
  • A historical Marsden one-pager is available here: Marsden one-page example. It predates the current criteria. For Option B, follow the 2026 requirements below and the official RSNZ templates.

Use the Assessment Self-Checks before drafting your report or presentation. In particular, check that policy tree splits are interpreted as descriptive targeting rules, not as identified causes of differential response.

Late Penalty

Late assignments, and assignments with extensions, may be subject to delays in marking and may not receive comprehensive feedback.

Assignments submitted late without an approved extension will incur a grade penalty of 5% of the total marks available for the assignment per day late (i.e., in 24-hr increments), up to a maximum of 5 days (up to 24 hrs late = −5%; up to 48 hrs late = −10%, etc.).

Assignments submitted more than five days late without an approved extension will not be graded unless exceptional circumstances are accepted by the Course Coordinator.
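The penalty arithmetic can be sketched as a small R function. `late_penalty()` is a hypothetical helper written for this illustration, not part of any course package, and it assumes lateness is measured in hours past the deadline.

```r
# Sketch of the late-penalty rule described above (hypothetical helper).
late_penalty <- function(hours_late, extension_approved = FALSE) {
  if (extension_approved || hours_late <= 0) {
    return(0)
  }
  days_late <- ceiling(hours_late / 24)  # penalties accrue in 24-hour increments
  if (days_late > 5) {
    # more than five days late: not graded unless exceptional
    # circumstances are accepted by the Course Coordinator
    return(NA_real_)
  }
  5 * days_late  # percentage of the marks available for the assignment
}

late_penalty(3)    # within the first 24 hours
late_penalty(30)   # within the second 24 hours
late_penalty(120)  # exactly five days
late_penalty(121)  # beyond five days
```

The function returns the percentage deducted, or `NA` when the assignment would not be graded.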

Option A: Research Report

Estimate the Average Treatment Effect of a single exposure on all four wellbeing outcomes using the synthetic three-wave panel generated by causalworkshop::simulate_nzavs_data(). The outcome-wide design follows VanderWeele et al. (2020) and the workflow practised in Labs 8–10. You choose one exposure (religious_service or volunteer_work); the four outcomes (purpose, belonging, self_esteem, life_satisfaction) are fixed.

  • Introduction: 800-word hard limit.
  • Discussion: 1,000-word hard limit (extended to accommodate outcome-wide interpretation).
  • Methods/Results: concise, no specific word limit.
  • American Psychological Association (APA) style. Submit as a single PDF with R code appendix.

Assessment Criteria (Option A)

The criteria are listed in the order in which they should appear in the report and follow the ten-step causal workflow: first define the target population, then the contrast and outcomes, then check the identification assumptions, then state the causal estimand, then estimate, then report, then ethics. The same order is reflected in the manuscript.qmd scaffold inside the research-report template.

Introduction. Set up the scientific interest of the question. Explain why this population and topic matter, what is already known, what remains uncertain, and why a causal answer would be useful. The Introduction should read like the beginning of a scientific paper, not like a numbered workflow checklist.

Define the target population. State the target population: for whom would the answer apply if the assumptions held? Would the exposure apply to all New Zealanders, or only to a subset? Identify subpopulations for whom the contrast is ill-defined or for whom positivity is unlikely to hold. Explain how the synthetic sample relates to your target population. Comment on transportability.

Eligibility criteria. Force yourself to think concretely. Name the people for whom the answer is relevant ("would a Kiwi reader say this applies to me?"), and the people for whom the intervention would be incoherent (for example, "increase religious-service attendance" is not a coherent intervention for a self-described atheist with no intention of attending). Excluding the second group up front protects positivity and clarifies what the estimate is about.

State the causal question. State your question clearly after naming the target population. Frame it as a causal question: among this population, what would later wellbeing look like if everyone received the exposure compared with if no one received it? Note that the question is outcome-wide: you ask how the exposure affects four wellbeing outcomes jointly, rather than picking one favourite outcome after looking at the results. Describe any practical or ethical relevance. Confirm that the data come from the course simulator distributed in causalworkshop and that you use all three waves.

Determine the exposure. Define the exposure $A$. Explain the contrast (binary or modified treatment policy). Address the consistency assumption — would two participants with the same recorded exposure value share the same potential outcomes? Confirm the exposure precedes the outcomes in time.

Determine the outcomes. Define each outcome $Y_k$ for $k = 1, \dots, 4$. State the scale (continuous, z-scored), the timing (wave 2, post-exposure), and how each outcome relates to the question. Following VanderWeele's outcome-wide approach (VanderWeele et al., 2020), the four outcomes together index wellbeing as a composite construct rather than four independent endpoints. Explain why this design reduces selective reporting (selecting only the outcome that gave the strongest result after seeing the data) and lets the reader read the pattern across the family rather than over-interpreting any single outcome.

Sample characteristics. Provide descriptive statistics for baseline demographics, exposure prevalence, and outcome distributions. Make clear the data are simulated.

Account for confounders. Define the baseline confounder set $L$. Justify each variable as a plausible cause of both $A$ and the $Y_k$. Confirm temporal order. Include baseline values of the exposure and of every outcome in $L$ — lagged-self adjustment (treating each variable's own past value as a covariate). Use z-scores for continuous covariates and one-hot encoding for categorical covariates with three or more levels. Describe how the adjustment set supports conditional exchangeability: that for two people who match on $L$, exposure status is, on average, as good as random with respect to potential outcomes. Note positivity (overlap of exposure levels) is required as well.
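The two encoding steps can be sketched in base R. The data frame and its column names (`age`, `education`) are hypothetical and are not taken from the simulator; the research-report template may handle encoding differently.

```r
# Toy baseline covariates; the column names are hypothetical.
baseline <- data.frame(
  age       = c(23, 35, 47, 52, 61),
  education = factor(c("school", "degree", "postgrad", "degree", "school"))
)

# z-score a continuous covariate: subtract the mean, divide by the standard deviation
baseline$age_z <- as.numeric(scale(baseline$age))

# one-hot encode a categorical covariate: one 0/1 column per level (no intercept)
education_onehot <- model.matrix(~ education - 1, data = baseline)

# combined numeric matrix, ready to pass to an estimator as the adjustment set
L <- cbind(age_z = baseline$age_z, education_onehot)
L
```

Each row of the one-hot block contains exactly one 1, and the z-scored column has mean 0 and standard deviation 1.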

Draw a causal diagram. Include measured confounders, unmeasured confounders, and time indices. The diagram should make the identification strategy legible at a glance. Recall that directed acyclic graph (DAG) arrows are qualitative claims about which variables can affect which others, not commitments to specific causal mechanisms; the diagram is for checking d-separation (a graphical rule for reading off conditional independencies).

Identify the causal estimand. State the causal contrast for each outcome:

$$ \mathrm{ATE}_k = \mathbb{E}[Y_k(1) - Y_k(0)], \quad k = 1, \dots, 4. $$

State the joint causal estimand of interest (the vector of four ATEs) and how you will summarise it.
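As a reminder of how the identification assumptions connect this estimand to observed data, here is the standard adjustment (g-formula) identity. It is a textbook result rather than anything specific to the course template, and it holds only if consistency, conditional exchangeability given $L$, and positivity all hold:

$$ \mathrm{ATE}_k = \mathbb{E}\big[\, \mathbb{E}[Y_k \mid A = 1, L] - \mathbb{E}[Y_k \mid A = 0, L] \,\big], \quad k = 1, \dots, 4. $$

The inner expectations are computed from observed data within levels of $L$; the outer expectation averages over the distribution of $L$ in the target population.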

Missing data. Describe how missing data are handled. Use inverse-probability-of-censoring weights (IPCW) — a method that reweights observed cases so they stand in for those who dropped out — for attrition, or state why the synthetic panel does not require them.
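A minimal sketch of the reweighting idea on toy data follows. The variable names, the logistic censoring model, and the simulated dropout mechanism are all assumptions made for illustration; the template's own handling of attrition may differ.

```r
set.seed(2026)
n <- 2000
baseline_x <- rnorm(n)                    # hypothetical baseline covariate

# probability of remaining in the study depends on the baseline covariate
p_stay   <- plogis(1 + 0.8 * baseline_x)
observed <- rbinom(n, 1, p_stay)

# model the probability of being observed, given baseline information
censor_fit <- glm(observed ~ baseline_x, family = binomial())
p_hat      <- fitted(censor_fit)

# inverse-probability-of-censoring weights for the people who remained
ipcw <- 1 / p_hat[observed == 1]

full_mean     <- mean(baseline_x)                              # everyone at baseline
naive_mean    <- mean(baseline_x[observed == 1])               # completers only
weighted_mean <- weighted.mean(baseline_x[observed == 1], ipcw) # completers, reweighted

c(full = full_mean, naive = naive_mean, weighted = weighted_mean)
```

People who were unlikely to remain but did are given larger weights, so the reweighted completers resemble the full baseline sample more closely than the unweighted completers do.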

Model approach. Estimate the four ATEs in one batch using margot::margot_causal_forest() (a wrapper around grf::causal_forest() with sensible defaults: honest splitting, doubly-robust ATEs, and a shared adjustment set). Briefly explain how machine learning is used and what "doubly robust" means (the estimator is consistent if either the outcome model or the propensity-score model is well specified). State the tuning choices (number of trees, honesty, seed). The research-report-template scaffold (Lab 10) configures these choices in setup.R; report the values you used.
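To make "doubly robust" concrete, here is a minimal sketch of an augmented inverse-probability-weighted (AIPW) estimator on toy data. It is not the margot or grf implementation: it swaps the forests for a logistic propensity model and a linear outcome model, and the variable names and the true effect of 0.3 are assumptions built into the simulation.

```r
set.seed(2026)
n <- 5000
l <- rnorm(n)                          # one baseline confounder
a <- rbinom(n, 1, plogis(0.5 * l))     # exposure depends on the confounder
y <- 0.3 * a + 0.7 * l + rnorm(n)      # true ATE is 0.3 by construction
dat <- data.frame(y, a, l)

# propensity-score model: P(A = 1 | L)
ps_fit <- glm(a ~ l, family = binomial(), data = dat)
e_hat  <- fitted(ps_fit)

# outcome model, then predictions for everyone under A = 1 and under A = 0
out_fit <- lm(y ~ a + l, data = dat)
mu1 <- predict(out_fit, newdata = transform(dat, a = 1))
mu0 <- predict(out_fit, newdata = transform(dat, a = 0))

# AIPW scores: outcome-model contrast plus weighted residual corrections
scores <- mu1 - mu0 +
  dat$a * (dat$y - mu1) / e_hat -
  (1 - dat$a) * (dat$y - mu0) / (1 - e_hat)

ate_hat <- mean(scores)
ate_se  <- sd(scores) / sqrt(n)
naive   <- mean(dat$y[dat$a == 1]) - mean(dat$y[dat$a == 0])

c(aipw = ate_hat, se = ate_se, naive = naive)
```

The unadjusted difference in means is pulled away from 0.3 by the confounder, while the AIPW estimate combines both models so that a well-specified version of either one is enough for consistency.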

Multiple-testing correction. Because you report four outcomes, apply a Bonferroni correction at $\alpha_{FW} = 0.05$ and report the multiplicity-adjusted confidence intervals. margot::margot_plot(adjust = "bonferroni") does this automatically.
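The arithmetic behind the correction can be sketched in base R. The estimate and standard error below are made-up numbers, and whether margot computes its adjusted intervals in exactly this way is an assumption.

```r
alpha_fw   <- 0.05                        # family-wise error rate from the text
n_outcomes <- 4
alpha_each <- alpha_fw / n_outcomes       # 0.0125 per outcome
ci_level   <- 1 - alpha_each              # each interval is a 98.75% interval
z_adjusted <- qnorm(1 - alpha_each / 2)   # about 2.50, versus about 1.96 unadjusted

# hypothetical estimate and standard error, for illustration only
estimate <- 0.10
std_err  <- 0.03

ci_adjusted   <- estimate + c(-1, 1) * z_adjusted * std_err
ci_unadjusted <- estimate + c(-1, 1) * qnorm(0.975) * std_err

rbind(adjusted = ci_adjusted, unadjusted = ci_unadjusted)
```

The adjusted interval is wider, which is the price of holding the family-wise error rate at 0.05 across four outcomes.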

Sensitivity analysis. Report an E-value (VanderWeele & Ding, 2017) for every outcome's point estimate and for the lower bound of its multiplicity-adjusted (Bonferroni) confidence interval. margot::margot_plot() recomputes the bound E-value at the multiplicity-adjusted limit and reports it in the transformed table. Interpret each E-value in plain language.
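For intuition, the E-value formula from VanderWeele and Ding (2017) can be sketched in a few lines of R. The helpers `evalue_rr()` and `evalue_smd()` are hypothetical names written for this illustration; the conversion from a standardised effect to an approximate risk ratio, exp(0.91 × d), follows the same paper. Treat margot's output as the reported value and this sketch only as an aid to interpretation.

```r
# E-value for a risk ratio: RR + sqrt(RR * (RR - 1)), after inverting RR < 1
evalue_rr <- function(rr) {
  if (rr < 1) rr <- 1 / rr
  rr + sqrt(rr * (rr - 1))
}

# E-values for a standardised effect (est) and its confidence limits (lo, hi)
evalue_smd <- function(est, lo, hi) {
  to_rr <- function(d) exp(0.91 * d)  # approximate conversion to the risk-ratio scale
  bound <- if (lo <= 0 && hi >= 0) {
    1                                 # interval includes the null
  } else if (est > 0) {
    evalue_rr(to_rr(lo))              # limit closest to the null for a positive effect
  } else {
    evalue_rr(to_rr(hi))              # limit closest to the null for a negative effect
  }
  c(point = evalue_rr(to_rr(est)), bound = bound)
}

# hypothetical standardised effect with multiplicity-adjusted limits
ev <- evalue_smd(est = 0.20, lo = 0.05, hi = 0.35)
ev
```

In plain language, the point E-value is the minimum strength of association, on the risk-ratio scale, that an unmeasured confounder would need with both exposure and outcome to explain away the estimate; the bound E-value asks the same question of the confidence limit nearest the null.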

Report results. Present the four ATE estimates, multiplicity-adjusted 95% confidence intervals, and E-values in a single forest plot and an accompanying table. Order outcomes by effect magnitude. Interpret the pattern as a whole, not outcome by outcome. Use the auto-drafted prose returned by margot::margot_plot() as a starting point; revise it for clarity and your audience.

Identify optimal policy trees. A policy tree finds the partition of the population that maximises a utility function (here, the benefit of treating the people the rule recommends treating, relative to a no-one-treated baseline). The tree therefore identifies an optimal targeting rule, not a set of subgroup contrasts; each leaf is a partition, not a comparison group with its own causal effect estimate. After estimating the four ATEs, fit depth-1 and depth-2 policy trees per outcome using the margot policy-tree workflow (Lab 9 walks through the steps; Lab 10 wires it into the report). Use the parsimony rule from the Reporting Guide to choose between depth-1 and depth-2 (the course default is min_gain_for_depth_switch = 0.01 outcome units). Translate the leaves of the chosen tree into plain language a non-specialist could repeat. Report the policy value with its 95% confidence interval, and the proportion of the sample assigned to treatment. Read the split variables as descriptors of where the rule sends people, not as identified causes of differential response. The current margot workflow does not force a fixed percentage treated; if you discuss a budget cap, state it as an extra decision problem.
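To show what "finding the partition that maximises a utility function" means, here is a coarse depth-1 search in base R. It is a sketch under stated assumptions, not the margot or policytree algorithm: it tries only decile cut points, and the covariates and per-person benefit scores (`gain`) are simulated for illustration rather than estimated from a causal forest.

```r
set.seed(2026)
n  <- 1000
x1 <- rnorm(n)                         # hypothetical baseline covariates
x2 <- rnorm(n)
# hypothetical per-person benefit scores: treating helps only when x1 > 0
gain <- ifelse(x1 > 0, 0.4, -0.2) + rnorm(n, sd = 0.5)

X <- cbind(x1 = x1, x2 = x2)

# start from the "treat no one" baseline, which has value 0
best <- list(value = 0, variable = NA, threshold = NA, treat_side = NA)

for (j in colnames(X)) {
  for (threshold_j in quantile(X[, j], probs = seq(0.1, 0.9, by = 0.1))) {
    for (side in c("above", "below")) {
      treat <- if (side == "above") X[, j] > threshold_j else X[, j] <= threshold_j
      value <- mean(gain * treat)      # average gain relative to treating no one
      if (value > best$value) {
        best <- list(value = value, variable = j,
                     threshold = threshold_j, treat_side = side)
      }
    }
  }
}

best
mean(X[, best$variable] > best$threshold)  # proportion on the "above" side of the split
```

The search returns a rule of the form "treat people above (or below) a threshold on one variable". The split variable describes where the rule sends people; it is not evidence that the variable causes the difference in response.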

Note the graphing rule for policy trees. The template's setup.R already calls margot::margot_select_grf_policy_trees() to decide which trees to plot; you do not have to call it yourself. The rule keeps a tree only when both the policy-value and the treated-uplift lower confidence limits exceed zero (the course default thresholds are zero for both). What you do need to do: (i) state the thresholds in your methods so a reader can see the rule, and (ii) report what the rule decided. Outcomes that fail the rule remain in your tables and prose; their trees do not appear as figures. If no outcome passes the rule, say so and discuss what an unidentified targeting story implies.
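The logic of the rule can be sketched on a toy summary table. The numbers and column names below are invented for illustration and do not reflect the object that margot::margot_select_grf_policy_trees() actually returns.

```r
# hypothetical summary: one row per outcome, lower 95% confidence limits
policy_summary <- data.frame(
  outcome            = c("purpose", "belonging", "self_esteem", "life_satisfaction"),
  policy_value_lower = c(0.021, -0.004,  0.010, 0.003),
  uplift_lower       = c(0.035,  0.012, -0.002, 0.008)
)

min_policy_value <- 0  # course default threshold, as stated above
min_uplift       <- 0  # course default threshold, as stated above

# keep a tree only when BOTH lower limits exceed their thresholds
policy_summary$plot_tree <-
  policy_summary$policy_value_lower > min_policy_value &
  policy_summary$uplift_lower > min_uplift

policy_summary
policy_summary$outcome[policy_summary$plot_tree]
```

In this made-up example two outcomes pass and two fail; the failing outcomes would still appear in your tables and prose, only without a tree figure.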

Discussion. Pull the report together in five short subsections (the template provides the scaffold; the whole discussion is capped at 1,000 words):

  1. Summary of results. Restate the causal question and summarise the pattern across the four outcomes (direction, rough magnitude, multiplicity-adjusted uncertainty, and the policy-tree targeting story). Describe the pattern, not each outcome in turn.
  2. Limitations. Acknowledge the identification assumptions and name the one most likely to fail in a real-data study. Note that the data are simulated, and what that implies for external validity.
  3. Importance of methods. Briefly motivate the outcome-wide + Bonferroni + policy-tree workflow against simpler alternatives. Two or three sentences.
  4. Importance of findings (theory and policy). State what the result would mean for theory and for practice if it held in real data. Name a decision-maker who would care and the smallest change to their practice the evidence could support.
  5. Ethics. Three to five sentences naming the equity, proxy-variable, and oversight considerations that would need attention before anyone acted on the policy tree. State one value judgement the analysis depends on. The aim is to flag the considerations, not to resolve them; the model cannot settle public values.

Contributor Roles Taxonomy (CRediT) and artificial-intelligence (AI) disclosure. Append a CRediT contributor statement (a National Information Standards Organization (NISO) standard) and an AI disclosure statement (per the course's AI use policy) after the Discussion. The template provides headings and short placeholder text for both.

Option B: Marsden Fund Expression of Interest (EOI)

Write a first-round Marsden Fund Expression of Interest (EOI) following the RSNZ 2026 guidelines. Your research question must use the causal inference framework taught in this course. Assume an Economics and Human and Behavioural Sciences (EHB) panel.

Templates and Guidelines

Download the official 2026 RSNZ templates before you begin.

Formatting: 12-point Times New Roman, single spacing, 2 cm margins. Submit as a single PDF.

Required Sections

Section numbers follow the 2026 RSNZ EOI form.

1a. Research Title (max 25 words). Plain language, no jargon. The title should be accessible to a scientifically literate non-specialist.

1d. Research Summary (max 200 words). This summary must be standalone: assessors outside your discipline will read it. Answer four questions in this order:

  1. What is the current state of the field? (1–2 sentences establishing the gap or problem.)
  2. What do you aim to do? (State the causal question plainly.)
  3. How will you do it? (Name the data source, design, and analytic approach.)
  4. What do you expect to find? (One sentence on anticipated results and their significance.)

2. Vision Mātauranga (max 200 words). Describe how the proposed research relates to the four Vision Mātauranga (VM) themes: (i) indigenous innovation, drawing on Māori knowledge, resources, and people; (ii) taiao, achieving environmental sustainability through iwi and hapū relationships with land and sea; (iii) hauora/oranga, improving health and social wellbeing; (iv) mātauranga, exploring indigenous knowledge and its contribution to NZ research. If none of the themes apply, you may state "not applicable" with a considered justification.

3a. Abstract (max 1 page). Cover the following: aims of the research; importance of the research area; novelty, originality, insight, and ambition of the proposed work; potential impact; methodology; and your capacity to deliver.

3b. Benefit Statement (max 400 words, 1 page). Describe the economic, environmental, or health benefit of the research to New Zealand. Explain why NZ is the right place for this research and describe potential impacts for Māori. In a student context the benefit case may be aspirational, but it must be concrete.

3c. References (max 3 pages). Bold your own name. Include article titles and full author lists (up to 12 authors; use "et al." thereafter).

3d. Roles and Resources (max 1 page). Describe the contributions of each team member, the resources required, and any ethical considerations. Use the Roles and Resources form.

Assessment Criteria (Option B)

Research. Quantifiable impact potential through novelty, originality, insight, and ambition. Rigorous methods grounded in prior research. Ability and capacity to deliver.

Benefit. Economic, environmental, or health benefit to New Zealand. Rationale for NZ-based research. In a student context the benefit case may be aspirational but must be concrete.

Vision Mātauranga. Relation to VM themes; where relevant, engagement with Māori. "Not applicable" is acceptable with considered justification.

Causal reasoning (course-specific). Well-defined causal question, clearly stated causal estimand, appropriate identification strategy. This criterion carries substantial weight.

For the full Marsden Fund assessment criteria, see the RSNZ 2026 EOI Guidelines (pdf).

AI use in this course

You may use AI tools in this course. You do not have to.

AI use policy

  • You may use AI for coding help, brainstorming, and clearer writing.
  • Use it as a critical reader: ask it to find holes in your argument, push you to clearer thinking, and point out vague wording. Push back when it gives bland or wrong advice.
  • You are responsible for all submitted work. Verify all claims, code, and references.
  • You must be able to explain your work in your own words.
  • For lab diaries and the final report, add a short note if AI use is substantial (tool, date, and how it was used).
  • If AI output shaped your submission in an important way, acknowledge it as a source.
  • AI tools are not permitted in in-class tests.
  • Do not upload confidential, identifiable, or sensitive information.

Useful AI uses while studying:

  1. Ask for a plain-language explanation of a concept after you have checked the lecture notes.
  2. Ask whether your answer states the target population, causal contrast, outcome, estimand, and assumptions.
  3. Ask where your wording sounds associational when you mean causal.
  4. Ask for a similar practice question, then answer that question yourself.

Poor AI uses:

  1. Asking it to write an answer you have not attempted.
  2. Trusting its answer without checking the course materials.
  3. Letting it swap the causal steps for vague statistics language.
  4. Using it to memorise phrases rather than understand the logic.

VanderWeele, T. J., & Ding, P. (2017). Sensitivity analysis in observational research: Introducing the E-value. Annals of Internal Medicine, 167(4), 268–274. https://doi.org/10.7326/M16-2607

VanderWeele, T. J., Mathur, M. B., & Chen, Y. (2020). Outcome-wide longitudinal designs for causal inference: A new template for empirical studies. Statistical Science, 35(3), 437–466.