Week 11 workbook and solutions

Joseph Bulbulia https://josephbulbulia.netlify.app (Victoria University of Wellington) https://www.wgtn.ac.nz
05-23-2021

Import the jittered NZAVS dataset

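# load packages: dplyr supplies %>%, lubridate supplies make_date()
library(dplyr)
library(lubridate)

# read data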

nz_0 <- as.data.frame(readr::read_csv2(
  url(
    "https://raw.githubusercontent.com/go-bayes/psych-447/main/data/nzj.csv"
  )
))

# to re-level kessler 6 variables
f <-
  c(
    "None Of The Time",
    "A Little Of The Time",
    "Some Of The Time",
    "Most Of The Time",
    "All Of The Time"
  )

# get data into shape
nz_cr <- nz_0 %>%
  dplyr::mutate_if(is.character, factor) %>%
  dplyr::select(
    -c(
      SWB.Kessler01,
      SWB.Kessler02,
      SWB.Kessler03,
      SWB.Kessler04,
      SWB.Kessler05,
      SWB.Kessler06
    )
  ) %>% 
  dplyr::mutate(Wave = as.factor(Wave)) %>%
  dplyr::mutate(FeelHopeless = forcats::fct_relevel(FeelHopeless, f)) %>%
  dplyr::mutate(FeelDepressed = forcats::fct_relevel(FeelDepressed, f)) %>%
  dplyr::mutate(FeelRestless = forcats::fct_relevel(FeelRestless, f)) %>%
  dplyr::mutate(EverythingIsEffort = forcats::fct_relevel(EverythingIsEffort, f)) %>%
  dplyr::mutate(FeelWorthless = forcats::fct_relevel(FeelWorthless, f)) %>%
  dplyr::mutate(FeelNervous = forcats::fct_relevel(FeelNervous, f)) %>%
  dplyr::mutate(male_id = as.factor(Male)) %>%
  dplyr::mutate(date = make_date(year = 2009, month = 6, day = 30) + TSCORE) %>%
  dplyr::mutate(
    FeelWorthless_int = as.integer(FeelWorthless),
    FeelNervous_int =  as.integer(FeelNervous),
    FeelHopeless_int =  as.integer(FeelHopeless),
    EverythingIsEffort_int =  as.integer(EverythingIsEffort),
    FeelRestless_int =  as.integer(FeelRestless),
    FeelDepressed_int =  as.integer(FeelDepressed),
    HLTH.Fatigue_int = as.integer(HLTH.Fatigue + 1)
  ) %>%
  # years since the earliest TSCORE (TSCORE is in days); subtract first, then divide
  dplyr::mutate(yearS = (TSCORE - min(TSCORE)) / 365) %>%
  dplyr::mutate(KESSLER6sum = as.integer(KESSLER6sum))

## if you do anything with covid (warning, such a model would be tricky)

ord_dates_class <- c("Baseline",
                     "PreCOVID",
                     "JanFeb",
                     "EarlyMarch",
                     "Lockdown",
                     "PostLockdown")

nzl <- nz_cr %>%
  dplyr::filter(YearMeasured == 1) %>%
  # years since the earliest TSCORE in the filtered data; subtract first, then divide
  dplyr::mutate(yearS = (TSCORE - min(TSCORE)) / 365) %>%
  dplyr::mutate(WSCORE = as.factor(WSCORE)) %>%
  dplyr::mutate(Covid_Timeline =
                  as.factor(ifelse(
                    TSCORE %in% 3896:3921,
                    # feb 29 - march 25th
                    "EarlyMarch",
                    ifelse(
                      TSCORE %in% 3922:3954,
                      "Lockdown",
                      #march 26- Mon 27 April 2020
                      ifelse(
                        TSCORE > 3954,
                        # after april 27th 2020
                        "PostLockdown",
                        ifelse(
                          TSCORE %in% 3842:3895,
                          # jan 6 to feb 28
                          "JanFeb",
                          ifelse(TSCORE %in% 3665:3841 &
                                   Wave == 2019,
                                 "PreCOVID",
                                 "Baseline"  # TSCORE 3672 is 20 July 2019
                          )
                        )
                      )
                    ))))%>%
  dplyr::mutate(Covid_Timeline = forcats::fct_relevel(Covid_Timeline, ord_dates_class))%>%
  dplyr::mutate(Id = factor(Id))
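
The nested ifelse() calls above are hard to audit by eye. As a minimal sketch, assuming the same TSCORE cut-points and the 30 June 2009 origin used in the make_date() call, the same classification can be written flat in base R. The names origin, period_levels, and covid_period are hypothetical and are not part of the workbook.

```r
# Hypothetical helper (not from the workbook): a flat version of the nested ifelse().
# Assumes TSCORE counts days since 30 June 2009, as in the make_date() call above.
origin <- as.Date("2009-06-30")

period_levels <- c("Baseline", "PreCOVID", "JanFeb",
                   "EarlyMarch", "Lockdown", "PostLockdown")

covid_period <- function(tscore, wave) {
  # start everyone at "Baseline", then overwrite the disjoint TSCORE ranges
  out <- rep("Baseline", length(tscore))
  out[which(tscore %in% 3665:3841 & wave == 2019)] <- "PreCOVID"
  out[which(tscore %in% 3842:3895)] <- "JanFeb"      # 6 Jan to 28 Feb 2020
  out[which(tscore %in% 3896:3921)] <- "EarlyMarch"  # 29 Feb to 25 Mar 2020
  out[which(tscore %in% 3922:3954)] <- "Lockdown"    # 26 Mar to 27 Apr 2020
  out[which(tscore > 3954)] <- "PostLockdown"
  out[is.na(tscore)] <- NA                           # keep missing dates missing
  factor(out, levels = period_levels)
}

# Check the cut-points against calendar dates
format(origin + c(3842, 3896, 3922, 3954))
# "2020-01-06" "2020-02-29" "2020-03-26" "2020-04-27"
```

Because the ranges do not overlap, the order of the assignments does not matter, which is the main readability gain over the nested form. With the data loaded, something like covid_period(nz_cr$TSCORE, nz_cr$Wave) could be compared against Covid_Timeline as a sanity check.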

Assessment: briefly answer three of the following five questions.

  1. What is the difference between prediction and causal inference? Give an example of how regression may be useful for each task. Make sure your explanation includes a DAG.

  2. Explain “collider confounding” (or “collider bias”), and explain how collider bias can spoil inference. Make sure that your explanation includes a DAG.

  3. Using the nzl dataset, select at least five demographic/ideological variables that might be related to an exposure variable and an outcome variable of your choice. Create a DAG and identify which variables you should include to obtain an unbiased estimate of the causal effect of your exposure variable on your outcome variable. Test your model and interpret the results.

  4. Write a DAG for one of your previous workbook regression analyses. Explain your DAG. Redo your analysis to obtain an unbiased causal inference (conditional on your DAG). Compare the results of your current model with the results of your previous model and note any differences in interpretation.

  5. Examine the causal assumptions of a psychological theory for which you have data. Test the causal assumptions of this model. Make sure to draw a DAG. Consider simulating data if/where relevant.
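
For questions that invite simulation, the following is a minimal sketch of collider bias in base R. It assumes the DAG x -> collider <- y, with no causal path between x and y. All variable names, the seed, and the sample size are illustrative choices, not taken from the NZAVS data.

```r
# Simulated illustration of collider bias (assumed DAG: x -> collider <- y).
set.seed(447)
n <- 10000

x <- rnorm(n)                   # exposure
y <- rnorm(n)                   # outcome, independent of x by construction
collider <- x + y + rnorm(n)    # common effect of both x and y

# Without the collider, the estimated effect of x on y is close to zero
b_unadjusted <- unname(coef(lm(y ~ x))["x"])

# Conditioning on the collider induces a spurious negative association
b_adjusted <- unname(coef(lm(y ~ x + collider))["x"])

round(c(unadjusted = b_unadjusted, adjusted = b_adjusted), 2)
```

Under these assumptions the adjusted coefficient should sit near -0.5 even though x has no effect on y, which is one way to show how including a common effect as a covariate can spoil inference.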

Marking criteria

  1. Clarity, accuracy, initiative, and organisation in the descriptive component of your work.

  2. Clarity, accuracy, initiative, and organisation in your DAG and, if relevant, your regression model(s).

  3. Accuracy and insight in the interpretation of your DAG and, if relevant, your regression model(s).

Supplement: NZ-jitter information:

Link to the NZAVS data dictionary is here

Link to questions only here

Corrections

If you see mistakes or want to suggest changes, please create an issue on the source repository.

Reuse

Text and figures are licensed under Creative Commons Attribution CC BY-NC-SA 4.0. Source code is available at https://go-bayes.github.io/psych-447/, unless otherwise noted. The figures that have been reused from other sources don't fall under this license and can be recognized by a note in their caption: "Figure from ...".

Citation

For attribution, please cite this work as

Bulbulia (2021, May 23). Psych 447: Week 11 workbook and solutions. Retrieved from https://vuw-psych-447.netlify.app/workbooks/W_11_s/

BibTeX citation

@misc{bulbulia2021week,
  author = {Bulbulia, Joseph},
  title = {Psych 447: Week 11 workbook and solutions},
  url = {https://vuw-psych-447.netlify.app/workbooks/W_11_s/},
  year = {2021}
}