Week 1: How to Ask a Question in Psychological Science

Slides

Lab: Git and GitHub

Causal graph: we refer to this image in the lecture and begin reviewing causal graphs in Week 2

Background readings

None today. Recommended readings are listed at the end of this page.

Key concepts for the test(s)

Today we introduce three problems that recur throughout the course:

  • Defining the question: a causal question requires a clear contrast
  • Specifying the target population: the answer depends on who the question is about
  • Unobservability of causal effects: we never observe both sides of the contrast for one person

Before next week

Bring your laptop to the Week 1 lab. Install R before Week 2. Instructions are in Lab 2: Install R and Set Up Your IDE.


Motivating example: does social media harm adolescent wellbeing?

Orben & Przybylski (2019) report a negative association between time spent on social media and wellbeing among British teenagers. The observed correlation was 0.04 in magnitude, comparable to the association between wearing glasses and wellbeing in the same dataset.

The finding was widely reported as evidence that social media harms young people. Some investigators argued these conclusions understated the harm: the teenagers who engaged most frequently with social media exhibited the lowest wellbeing scores, and the negative association is non-linear (Twenge et al., 2020).

Questions about whether social media use harms young people remain live. On 18 February 2026, CNN reported testimony in ongoing litigation about adolescent social media use (link). Courts, legislators, and parents are making decisions right now on the basis of findings reported in scientific journals.

Yet what do the associational findings really tell us? Can we move from associations to policy-relevant causal conclusions about whether social media use harms young people? If so, what steps would we need to take? And for whom would our conclusions generalise?

These questions will occupy us over the coming weeks. The aim of this course is to provide you with a set of skills that enable you to ask and answer causal questions using observational data, and to identify variability in response across subgroups in the population of interest.

A simple map for week 1

This week gives you a checklist for deciding whether a claim is even asking a causal question yet.

Three questions to ask of any causal claim

  1. What are the two states of the world being compared?
  2. For which population is the comparison meant to hold?
  3. What part of the contrast is necessarily missing from observation?

If a study claim does not answer the first two questions, it is still too vague. If it forgets the third question, it will slip from causal language into loose talk about associations.

Psychology begins with a question

Before we can answer whether social media harms teenagers, we must ask a question that is clear enough to be answered. "Does social media harm wellbeing?" is not yet a causal question because it does not specify what is being compared with what. A causal question compares two states of the world. This question names only one.

We will use two words that are easy to confuse. An association asks whether two variables co-occur. A causal effect, on the other hand, asks what would happen if we changed something about the world.

Consider the difference. "Is time on social media associated with lower wellbeing?" asks whether two variables co-occur. "Would adolescent wellbeing improve if we replaced two hours of nightly doom-scrolling with two hours of study?" asks what would happen under a specific comparison. The second question states a contrast (scrolling versus studying), a population (adolescents), an outcome (wellbeing), and a time horizon (nightly, over some stated period). The first question does not.

The comparison between two states is what we call a causal contrast, or contrast for short. A contrast is the simplest structure a causal question can have: state A versus state B, for a defined group, measured on a defined outcome, over a defined time horizon.

A practical template is: for population, what is the effect of intervention versus control on outcome, measured by measure after an exposure period of time horizon? The arrow of time is built in: the intervention comes first, then we measure the outcome.

Five parts of a usable causal question

  • Population: who is the question about?
  • Intervention: what state of the world are we interested in?
  • Control: what is the comparison condition?
  • Outcome: what do we measure?
  • Timing: when do we measure it?
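
One way to keep all five parts explicit is to treat the template as a structure to fill in. The sketch below is a hypothetical illustration in Python (not course software); the example question it renders is the scrolling-versus-studying contrast from this lecture.

```python
from dataclasses import dataclass

@dataclass
class CausalQuestion:
    population: str    # who the question is about
    intervention: str  # the state of the world we want to evaluate
    control: str       # the comparison condition
    outcome: str       # the construct and its measure
    timing: str        # the exposure period before the outcome is measured

    def render(self) -> str:
        # Follows the course template: for population, what is the effect of
        # intervention versus control on outcome, after the exposure period?
        return (f"For {self.population}, what is the effect of "
                f"{self.intervention} versus {self.control} on "
                f"{self.outcome}, measured after {self.timing}?")

q = CausalQuestion(
    population="adolescents aged 13 to 18",
    intervention="two hours of nightly social media scrolling",
    control="two hours of nightly study",
    outcome="life satisfaction on a 0 to 10 scale",
    timing="three months",
)
print(q.render())
```

If any field is hard to fill in, the question is not yet a causal question; that is the point of the checklist.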

Everything in this course follows from the demand that psychological questions specify their contrasts. This lecture introduces three problems that make specifying a contrast harder than it first appears.

Problem 1: both sides of the contrast must be precisely defined

When investigators evaluate "time on social media," what do they mean? Passive scrolling, direct messaging, and creative content production may differ in their consequences. We need an interval over which the behaviour occurs: one week, one month, one year. We need to specify what the comparison condition is: passive scrolling versus studying, versus socialising in person, versus something else. Without precise specification, the question has no answer because it has not yet been asked.

The two sides of a contrast have names. The condition whose consequences we want to evaluate is the intervention. The state we compare it against is the control. "Intervention" and "control" are placeholders: neither implies a medical procedure or a laboratory setting. They simply label the two states of the world that define our comparison.

Precision extends to what we measure. "Wellbeing" aggregates self-esteem, life satisfaction, anxiety symptoms, and depressive symptoms. Each is a distinct construct, and they do not always move together. We must define the outcome, its measure (for example, life satisfaction on a 0 to 10 scale), and the time frame over which we assess it. The consequences of scrolling for a teenager's wellbeing in five hundred years are zero because life ends.

Notice that specifying interventions and outcomes forces us to order events along a timeline. For one state to influence another, it must precede it. There must be a contrast condition and a stated time horizon, because timing affects the magnitude of interest. The effects of scrolling for five minutes for three weeks (contrasted with no social media) might differ from the effects of scrolling for five hours every day for five years.

In later weeks we extend this idea to more complex questions with more than two states of the world, or with sequences of actions over time. The same demand applies: name the states, the population, the outcome, and the time horizon.

Problem 2: the answer to a causal question depends on the population

The teenagers that Orben & Przybylski (2019) studied were a convenience sample from one country at one moment in time. Would the association of 0.04 hold in other countries? Would it hold today? The concept of "teenager" is itself vague. It lumps thirteen-year-olds, essentially children, with nineteen-year-olds, essentially adults. The answer to a causal question may systematically differ by age, gender, socioeconomic background, or parental attention.

Before we can evaluate whether social media influences wellbeing, we must specify the target population. The answer to a contrast for one population may differ from, or reverse for, another. There is no abstract answer to a causal question without reference to both the contrast conditions and a population.

The distinction between the sample population (who you studied) and the target population (who you want to learn about) is central to external validity, which we formalise in Week 4. We return to population specificity when we discuss variation in responses across subgroups (Weeks 6, 8, and 9) and transportability (Week 4).

Problem 3: no more than one side of the contrast can be measured for each individual

Consider Alice, who spends two hours doom-scrolling each night before bed. Suppose she enrolled in an experiment and was randomised to the doom-scrolling condition. The comparison condition is studying mathematics for two hours each night. At the end of the trial Alice reports high life satisfaction. Can we say that doom-scrolling caused Alice's high life satisfaction?

We cannot, because no more than one side of the contrast can be measured for Alice in a given period. Alice followed the doom-scrolling protocol. She did not follow the mathematics protocol. We observe only one state of the world for Alice, never both.

This is the central logical problem in causal inference. A causal effect compares two possible futures for the same person, but we only ever observe one future.

We formalise this with potential outcomes notation. Let $Y_i(1)$ denote the outcome that person $i$ would experience under the intervention ($A = 1$), and $Y_i(0)$ the outcome under the control condition ($A = 0$). The individual causal effect is:

$$\delta_i = Y_i(1) - Y_i(0)$$

This quantity, $\delta_i$, is the contrast at the level of a single person: the difference between what would happen to person $i$ under treatment and what would happen under control. We observe only one of $Y_i(1)$ or $Y_i(0)$ for any individual. The individual causal effect $\delta_i$ is therefore never directly observable. This is not a limitation of our methods; it is a logical constraint. No amount of data collection, no statistical technique, and no machine learning algorithm can reveal both potential outcomes for the same person at the same time.
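
The logic can be made concrete with a small simulation. The sketch below invents both potential outcomes for five hypothetical people (numbers are made up for illustration), then shows that the data reveal only one of them per person, so $\delta_i$ is computable in the simulation but never in practice.

```python
import random

random.seed(1)

# Hypothetical potential outcomes for five people: life satisfaction (0-10)
# under the intervention, Y(1), and under the control, Y(0). In real data we
# never see both; here we invent both to illustrate the logic.
people = [
    {"id": i, "y1": random.randint(3, 9), "y0": random.randint(3, 9)}
    for i in range(5)
]

for p in people:
    p["delta"] = p["y1"] - p["y0"]   # individual causal effect, delta_i
    p["a"] = random.randint(0, 1)    # condition actually experienced
    # The observed outcome reveals only the potential outcome that matches
    # the condition received; the other remains counterfactual.
    p["y_obs"] = p["y1"] if p["a"] == 1 else p["y0"]

for p in people:
    print(f"person {p['id']}: A = {p['a']}, observed Y = {p['y_obs']}, "
          f"delta_i = {p['delta']} (known only because we simulated it)")
```

Deleting the unobserved column from this table is exactly what nature does: every dataset we collect is the table with one potential outcome missing per row.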

Pair exercise: formulating a contrast

  1. Take the headline "Screen time linked to poor sleep in teenagers."
  2. Write a causal question that specifies both sides of the contrast (screen time versus what?), a defined outcome (which aspect of sleep?), a target population, and a time horizon.
  3. Swap with your partner and critique: is the other side of the contrast well defined? Is the population specific enough?

From individuals to populations

If individual causal effects are unobservable, what can we learn? We can learn about average effects across a population. The average treatment effect (ATE) is:

$$\text{ATE} = \mathbb{E}[Y(1) - Y(0)]$$

This is the expected difference in the outcome if everyone in the target population experienced the intervention versus if everyone experienced the control condition. The ATE is a population-level quantity. It tells us what would happen on average, not what would happen to any particular person.
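
A simulation can show why the ATE, unlike $\delta_i$, is learnable. The sketch below (hypothetical numbers: an average effect of one point of life satisfaction, with individual effects that vary) computes the true ATE from both potential outcomes, then shows that when each person reveals only one outcome under random assignment, the difference in group means comes close to it.

```python
import random

random.seed(0)

n = 100_000
# Hypothetical population: Y(0) is life satisfaction under control;
# the intervention adds 1 point on average, varying across individuals.
y0 = [random.gauss(6.0, 1.0) for _ in range(n)]
y1 = [y + random.gauss(1.0, 0.5) for y in y0]

# True ATE: the mean of the individual effects, E[Y(1) - Y(0)].
ate = sum(b - a for a, b in zip(y0, y1)) / n

# Random assignment: each person reveals one potential outcome only.
assign = [random.randint(0, 1) for _ in range(n)]
treated = [y1[i] for i in range(n) if assign[i] == 1]
control = [y0[i] for i in range(n) if assign[i] == 0]
diff_in_means = sum(treated) / len(treated) - sum(control) / len(control)

print(f"true ATE            = {ate:.3f}")
print(f"difference in means = {diff_in_means:.3f}")  # close to the true ATE
```

The difference in means recovers the ATE here only because assignment was random; Week 2 takes up why, and what happens when it is not.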

Causal inference contrasts counterfactual states at the population or subpopulation level. When we say "social media influences wellbeing," we mean that on average, across a defined population, one pattern of use changes life satisfaction relative to the counterfactual of another pattern. We must specify the contrast conditions and the population for this statement to have content.

A short memory aid

  • A causal question needs a contrast.
  • A causal answer is always population-specific.
  • Individual causal effects are not directly observed.

What we learned

Return to the social media question. Orben & Przybylski (2019) found a negative association of magnitude 0.04 between social media use and wellbeing. Courts and legislators are treating this as evidence of harm. We now see three reasons why the leap from association to causation fails.

First, "the influence of social media on wellbeing" is undefined until we specify the interventions (scrolling versus what?), the outcomes (which dimension of wellbeing?), and the time frame. Second, the answer to a causal question depends on the target population, and the populations that matter (thirteen-year-olds in Aotearoa New Zealand today) may differ from the population studied (British teenagers before 2019). Third, the individual causal effect is never observable; we can only recover average effects under assumptions we have not yet stated.

The lesson is that before answering a question we must ask it. Psychology begins with a clearly defined question. A well-defined causal question requires a contrast between at least two interventions, a specified outcome and time horizon, and a target population. In later weeks we add the further question of whether the observed data can identify that contrast.

Most psychological research cannot randomise the variables we care about. We cannot randomly assign people to experience trauma, adopt a religion, or lose a job. Week 2 introduces the randomised experiment as the benchmark for causal inference and the graphical tools (causal diagrams) that allow us to reason about causation when randomisation is impossible.

Pair exercise: three problems in one claim

  1. Take the claim "Religion improves mental health."
  2. Specify the contrast by naming a concrete intervention and control condition (religion versus what, exactly?).
  3. Specify the population (for whom, where, and when?).
  4. Specify the outcome, the measure, and the timing (what do we measure, and when do we measure it after the exposure period?).
  5. Rewrite the claim using the course template: for population, what is the effect of intervention versus control on outcome, measured by measure after an exposure period of time horizon?
  6. Swap with your partner and critique: is the contrast precise, is the population defensible, and does the timing make sense (intervention first, outcome later)?

Further reading

For an accessible introduction to causal inference and its history, see Pearl & Mackenzie (2018). The two core causal questions and the formal treatment of causal inference appear in Bulbulia (2024).


Lab materials: Lab 1: Git and GitHub

Bulbulia, J. A. (2024). Methods in causal inference part 1: Causal diagrams and confounding. Evolutionary Human Sciences, 6, e40. https://doi.org/10.1017/ehs.2024.35

Orben, A., & Przybylski, A. K. (2019). The association between adolescent well-being and digital technology use. Nature Human Behaviour, 3(2), 173–182. https://doi.org/10.1038/s41562-018-0506-1

Pearl, J., & Mackenzie, D. (2018). The book of why: The new science of cause and effect. Basic Books.

Twenge, J. M., Haidt, J., Joiner, T. E., et al. (2020). Underestimating digital media harm. Nature Human Behaviour, 4, 346–348. https://doi.org/10.1038/s41562-020-0839-4