Psych 447: Week 3 Workbook Solutions

Joseph Bulbulia; Johannes Karl

Libraries we will need

library("tidyverse")
library("patchwork")
library("readr")
library("sjPlot")
library("MASS")

theme_set(theme_classic())

Question 1: Why is this graph not printing any output?

library("tidyverse")
ggplot(data = mtcars) + 
  aes(mpg, wt, colour=factor(cyl))

Solution: there are no layers. The code supplies data and aesthetic mappings but no geom, so no data are drawn.
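A minimal sketch of the fix, assuming a scatterplot is what was intended (the question does not say which geom to use): adding `geom_point()` gives ggplot2 a layer to draw.

```r
library("ggplot2")

# Data and aesthetic mappings alone define no layer, so no data are drawn.
p_empty <- ggplot(data = mtcars) +
  aes(mpg, wt, colour = factor(cyl))

# Adding a geom supplies the missing layer (geom_point() is an assumed choice).
p_points <- p_empty + geom_point()
p_points
```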

Question 2. Using the mpg dataset, graph the relationship between city mileage and highway mileage by year of manufacture

Solution

ggplot(mpg, aes(
  x = hwy,
  y = cty,
  colour = as.factor(year)
)) +
  geom_point() +
  geom_smooth(method = loess)

Question 3. Edit this graph so that the x axis and the y axis both start at 0

# Create graph and add title
ggplot(data = mpg) + 
  geom_point(mapping = aes(x = displ, y = hwy)) + 
  labs(title = "Relationship between engine displacement and fuel efficiency in the mpg automobile dataset") + 
  xlab("Engine displacement (litres)") +
  ylab("Highway miles per gallon")

Solution

ggplot(data = mpg) +
  geom_point(mapping = aes(x = displ, y = hwy)) +
  labs(title = "Relationship between engine displacement and fuel efficiency in the mpg automobile dataset") +
  xlab("Engine displacement (litres)") +
  ylab("Highway miles per gallon") +
  # extend both axes so that they include 0
  expand_limits(x = 0, y = 0)

Question 4: What is one benefit and one limitation of this graph (in which the x and y axes start at 0)?

Solution

Example benefits:

We clearly see that engine displacement starts at just below
We clearly see that no car has a highway mileage below 10 mpg.

Example weaknesses:

We can already read the lowest values off the axes without starting the graph at zero (i.e. the added range carries no information).
Zero engine displacement and zero highway miles per gallon are not physically meaningful concepts, so starting the graphs at these thresholds adds no new information.
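To check how far the data sit from zero, one can inspect the observed ranges directly. A small sketch using base R's `range()` on the two plotted variables:

```r
library("ggplot2")

# Observed minimum and maximum of each plotted variable in the mpg dataset
range(mpg$displ)
range(mpg$hwy)
```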

Question 5. Which of these two graphs do you prefer and why?

g1 <- ggplot(data = mpg) +
  geom_point(mapping = aes(x = displ, y = hwy, colour = class)) +
  labs(title = "Relationship between engine displacement and fuel efficiency in the mpg automobile dataset") +
  xlab("Engine displacement (litres)") +
  ylab("Highway miles per gallon")

g2 <- ggplot(data = mpg) +
  geom_point(mapping = aes(x = displ, y = hwy, shape = class)) +
  labs(title = "Relationship between engine displacement and fuel efficiency in the mpg automobile dataset") +
  xlab("Engine displacement (litres)") +
  ylab("Highway miles per gallon")

library("patchwork")

g1 / g2 + plot_annotation(title = "Which plot do you prefer and why?", tag_levels = 'a')

Solution

Arguably, patterns are easier to detect using the colour aesthetic, but there are no hard and fast rules.
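One concrete point in favour of colour here: by default ggplot2 supplies only six shapes for a discrete variable, and `class` has seven levels, so the shape version warns and leaves one class undrawn. A hedged sketch of one workaround, using `scale_shape_manual()` with seven shape codes chosen arbitrarily for illustration:

```r
library("ggplot2")

# class has seven levels, one more than the default shape palette provides
n_classes <- length(unique(mpg$class))

# Supplying one shape code per level lets every class be drawn;
# the codes 0:6 are an arbitrary illustrative choice.
g2_all <- ggplot(data = mpg) +
  geom_point(mapping = aes(x = displ, y = hwy, shape = class)) +
  scale_shape_manual(values = 0:6)
g2_all
```

Even with all seven shapes drawn, the shapes may remain harder to tell apart than colours, so this does not settle the question.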

Question 6. Add facets to this graph for the “class” variable

g2 <- ggplot(data = mpg) +
  geom_point(mapping = aes(x = displ, y = hwy, shape = class)) +
  labs(title = "Relationship between engine displacement and fuel efficiency in the mpg automobile dataset") +
  xlab("Engine displacement (litres)") +
  ylab("Highway miles per gallon")

Solution

ggplot(data = mpg) + 
  geom_point(mapping = aes(x = displ, y = hwy)) + 
  facet_wrap(~ class) + 
  labs(title = "Relationship between engine displacement and fuel efficiency in the mpg automobile dataset") + 
  xlab("Engine displacement (litres)") +
  ylab("Highway miles per gallon")

Question 7. Which graph is more informative and why?

Solution

Arguably, the facets make the role of class more evident by making the class indicator more salient. Where possible, it is a good idea to declutter your graph.

Question 8. Remove the legend from the facet graph below

ggplot(data = mpg) +
  geom_point(mapping = aes(x = displ, y = hwy, color = class)) +
  facet_wrap( ~ class, nrow = 2)

Solution

ggplot(data = mpg) +
  geom_point(mapping = aes(x = displ, y = hwy, color = class)) +
  facet_wrap( ~ class, nrow = 2) +
  theme(legend.position = "none")
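An alternative sketch, not the workbook's solution: the legend can also be suppressed for a single layer with the `show.legend` argument of the geom, which leaves the theme untouched.

```r
library("ggplot2")

# show.legend = FALSE drops this layer from the legend; since it is the only
# layer in the plot, no legend is drawn at all.
p_nolegend <- ggplot(data = mpg) +
  geom_point(mapping = aes(x = displ, y = hwy, colour = class),
             show.legend = FALSE) +
  facet_wrap(~ class, nrow = 2)
p_nolegend
```

The theme-based solution is the more convenient choice when a plot has several layers, since it removes the legend in one place.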

Question 9. Convert the y variable to “numeric” and graph the relationship between `religiosity` (x-axis) and `thr_ms` (y-axis) in the ISSP dataset. Create new axis labels.

Download the ISSP questionnaire used in this study [here](https://github.com/go-bayes/psych-447/blob/main/data_raw/ISSP/ISSP_2018_Religion_Questionnaire_final_version1-2.pdf)

Note¹

# read the issp dataset; for the questionnaire see: ISSP_2018_Religion_Questionnaire_final_version1-2.pdf
# 
# subset of data from the issp dataset
issp <- readr::read_csv2(url("https://raw.githubusercontent.com/go-bayes/psych-447/main/data/issp.csv"))

# note that we need to check our data
head(issp)

# when we do we find that classes of the variables need to be adjusted
str(issp)

Let’s get the data into shape using dplyr. Run the code below.

ip <- issp %>%
  mutate(
    id = factor(id),
    thr_ath = as.factor(thr_ath),
    thr_bd = as.factor(thr_bd),
    thr_ch = as.factor(thr_ch),
    thr_hd = as.factor(thr_hd),
    thr_jw = as.factor(thr_jw),
    thr_ms = as.factor(thr_ms),
    neg_ath = as.factor(neg_ath),
    neg_bd = as.factor(neg_bd),
    neg_ch = as.factor(neg_ch),
    neg_hd  = as.factor(neg_hd),
    neg_jw = as.factor(neg_jw),
    neg_ms = as.factor(neg_ms),
    wave  = as.factor(wave),
    nzeuro = as.factor(nzeuro),
    eduyears = as.numeric(eduyears),
    male = as.factor(male),
    age = as.numeric(age),
    rightwing = as.numeric(rightwing),
    rural = as.factor(rural),
    religiosity = as.numeric(religiosity)
  )
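The same conversions can be written more compactly with `dplyr::across()`. The sketch below uses a small made-up data frame (`toy`, hypothetical, with invented values) that borrows a few column names from the ISSP subset, so it runs without downloading the data. It is an illustration, not the workbook's code.

```r
library("dplyr")

# Hypothetical stand-in for the ISSP subset, with invented values
toy <- tibble::tibble(
  id = 1:3,
  thr_ms = c(1, 4, 2),
  neg_ms = c(2, 2, 3),
  wave = c(2018, 2018, 2019),
  age = c("34", "51", "27"),
  religiosity = c("1", "7", "4")
)

toy_clean <- toy %>%
  mutate(
    # convert id, wave, and every thr_/neg_ column to factors in one step
    across(c(id, wave, starts_with("thr_"), starts_with("neg_")), as.factor),
    # convert the remaining columns to numeric
    across(c(age, religiosity), as.numeric)
  )

str(toy_clean)
```

Selecting by prefix keeps the code short, at the cost of converting any future column that happens to share the prefix, so the explicit version above remains the safer default.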

Solution

library(ggplot2)
ggplot(data = ip, aes(y = as.numeric(thr_ms), x = religiosity, colour = wave)) + 
  geom_smooth(method = lm, fullrange = FALSE, alpha = 0.1) + 
  labs(title = "Religiosity predicts Muslim acceptance post-Christchurch shootings") +
  xlab("Level of religiosity (scale 1-7)") +
  ylab("Acceptance of Muslims (1-4)")
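One caveat about `as.numeric(thr_ms)`: because `thr_ms` was converted to a factor earlier, `as.numeric()` returns the internal level codes rather than the original labels. The two coincide only when the labels are consecutive integers starting at 1, which is assumed here but not checked in the workbook. A small sketch with invented values:

```r
# A factor whose labels are not 1, 2, 3, ...
f <- factor(c("2", "4", "4"))

# as.numeric() on a factor gives the level codes: 1 2 2
codes <- as.numeric(f)

# Going through character first recovers the original values: 2 4 4
values <- as.numeric(as.character(f))
```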

Question 10. The following graph should start from 1 and run to 4, but it currently runs from 0 to 4. Fix the graph.

library(ggplot2)
ggplot(data = ip, aes(y = as.numeric(thr_ms), x = religiosity, colour = wave)) +
  geom_jitter(alpha = .1) +
  geom_smooth(method = lm, fullrange = FALSE, alpha = 0.1) +
  scale_y_continuous(limits = c(0, 4))

Solution

library(ggplot2)
ggplot(data = ip, aes(y = as.numeric(thr_ms), x = religiosity, colour = wave)) +
  geom_jitter(alpha = .1) +
  geom_smooth(method = lm, fullrange = FALSE, alpha = 0.1) +
  scale_y_continuous(limits = c(1, 4))
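A caveat on setting limits through a scale: observations outside the range are discarded before any statistics are computed, so a fitted line can change, whereas `coord_cartesian()` only zooms the view. This matters mainly when data fall outside the chosen range. A minimal sketch on the built-in `mpg` data (the limits 20 and 30 are arbitrary, chosen so that some points fall outside):

```r
library("ggplot2")

base_plot <- ggplot(mpg, aes(x = displ, y = hwy)) +
  geom_point() +
  geom_smooth(method = "lm")

# Scale limits: values outside 20-30 are set to NA before the line is fitted
p_limits <- base_plot + scale_y_continuous(limits = c(20, 30))

# Coordinate limits: all data are kept and only the visible window changes
p_zoom <- base_plot + coord_cartesian(ylim = c(20, 30))
```

Printing `p_limits` produces warnings about removed rows, while `p_zoom` fits the line to the full dataset.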

Extra question 11. Find one way of improving the following code and explain your solution

library(sjPlot)
plot_xtab(
    ip$thr_ms,
    ip$wave,
    show.total = FALSE,
    show.n = FALSE,
    geom.colors = c("lightgreen", "darkred")
  ) +
  xlab("Threatened by Muslims") +  ylab("Frequency") +
  #scale_y_continuous(limits=c(0,7)) + #theme(plot.title = element_text(size=09))
  theme(axis.text.x = element_text(angle = 20, hjust = 1))

Solution.

E.g. add a title

library(sjPlot)
plot_xtab(
    ip$thr_ms,
    ip$wave,
    show.total = FALSE,
    show.n = FALSE,
    geom.colors = c("lightgreen", "darkred")
  ) +
  xlab("Threatened by Muslims") +  ylab("Frequency") +
  #scale_y_continuous(limits=c(0,7)) + #theme(plot.title = element_text(size=09))
  theme(axis.text.x = element_text(angle = 20, hjust = 1)) + 
  labs(title = "Comparison of sample responses to Muslim threat in 2018 and 2019")

¹ Data were collected with Barry Milne and Martin Van dataset; the data are only authorised for the purposes of teaching.
