Course basics

Goals: (1) Download RStudio (2) Get Git (3) Teach you the essentials of Rmarkdown (4) Integrate (1)-(3).

Joseph Bulbulia (https://josephbulbulia.netlify.app), Victoria University of Wellington (https://www.wgtn.ac.nz); Johannes Karl (https://johannes-karl.com), Victoria University of Wellington (https://www.wgtn.ac.nz)
2021-FEB-23

To do:

Download R

R is freely available for download at: https://www.r-project.org/ Please make sure that you have a current version of R on your machine.
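If you are unsure which version of R you have, you can ask R itself. The sketch below uses only base R; the 4.0.0 threshold is our own assumption for illustration, not a course requirement.

```r
# Print the installed R version as a human-readable string
print(R.version.string)

# Compare the installed version against a threshold
# (4.0.0 is an illustrative assumption, not an official requirement)
is_recent <- getRversion() >= "4.0.0"
print(is_recent)
```

If the last line prints FALSE, downloading the current release from the link above will replace the older version.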

Download R-Studio desktop

RStudio Desktop is also freely available for download at:

https://rstudio.com/products/rstudio/download/#download

Please make sure that you have the current version of RStudio Desktop on your machine.

Read about Rmarkdown

Read Danielle Navarro’s brief account of Rmarkdown here: https://slides.djnavarro.net/starting-rmarkdown/#8

This is a thorough Rmarkdown workshop. You might quickly feel lost. Don’t worry about it. We only expect you to acquire the basics this week.

Rmarkdown workshop

Who is this course for?

If you have always wanted to learn R but never thought you could, this course is for you.

How will you benefit from this course?

You’ll learn how to use R and GitHub, and we’ll teach you the fundamentals of statistics with a focus on thinking, not on rules.

You’ll learn how to learn – that is, how to obtain the resources you need to address a problem at hand.

By the end of the course, you’ll know how to:

Data skills:

  1. perform your data analysis in R
  2. document your analysis and collaborate in GitHub
  3. create a publication-ready article, with tables and graphs
  4. create a free personal website on GitHub.

Statistical skills:

  1. state clearly the question you want to answer
  2. collect data that bear on your question
  3. explore your data visually
  4. avoid common modelling pitfalls
  5. improve your inference using multi-level models

Our approach to teaching and learning

This course is designed to provide you with basic understanding, useful tips, and some guide rails for learning. Our main task is to give you the confidence, and the inspiration, for independent learning.

What is R?

In a nutshell

R is a free programming language and software environment for statistical computing (download links for Windows and Mac are available at https://www.r-project.org/).

History

R is the brainchild of Ross Ihaka and [Robert Gentleman](https://en.wikipedia.org/wiki/Robert_Gentleman_(statistician)). It was created at the University of Auckland, where Ross Ihaka taught statistics for many years.

Purpose

R was conceived as a flexible language for data analysis that researchers could use. Since the first stable release of R (version 1.0.0) in 2000, the language has gained substantial popularity inside and outside of academia (have a look at this blog post for an interesting analysis). New versions of R are released periodically and can be downloaded and installed to replace the older R version.

What is R Studio?

The IDE

There are many ways to use R on your computer. For the purposes of getting started, we will be using an Integrated Development Environment (IDE): RStudio.

RStudio provides an interface with a number of user-friendly options, including a separate console and an editor with various help and syntax auto-complete functions, as well as tools for plotting, history, data visualization, debugging, and workspace management. It is important to remember that R and RStudio are not the same thing.

A quick walk through R Studio

GitHub

What is Git/GitHub?

Git is a version control system, and GitHub is an online service that hosts Git repositories. Together they work a bit like Google Docs, though for code. Version control is useful for collaboration because code breaks easily. It is only rarely possible for several people to work on the same code simultaneously in real time: the code will eventually break, and where and how it broke is not easy to assess.

A second function of GitHub is that it allows us to reconstruct histories of analysis. This is critical for open and reproducible science. This is the main function that we will be examining here.

A third function, which pertains to single users, is that when writing code you can rewind and recover from your mistakes. This will save you a whole lot of time in the end.

Note that GitHub has an interface with Rstudio. You will be using GitHub with Rstudio throughout this course.

Install Git

We suggested signing up for the educational version of GitHub because this will allow you to have private repositories.

If you haven’t done that but want to get started, you can open a free account and add your educational account later.

PRO TIP Pick a user name that will be OK for professional purposes. If in doubt use your name.

Directions for installing Git can be found at http://github.com

Create a repository

First create a repository on GitHub

Next copy the location

Then open a new project in Rstudio as a GitHub project

Then open an Rmarkdown document, write and save it

First, make sure that Rmarkdown is installed:

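# run this code
# install devtools first if it is not already available
if (!requireNamespace("devtools", quietly = TRUE))
  install.packages("devtools")
# install the development version of rmarkdown from GitHub
devtools::install_github("rstudio/rmarkdown")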

Next, create a document

Make sure you save your document

Ctrl + S (Windows/Linux) or Cmd + S (Mac) is the command for “save”.

Then commit your Rmarkdown document by including a message, and “pushing” to GitHub

Note that we don’t want to push .Rproj files to GitHub (this will mess up your collaborations), so I edited my .gitignore file.

Typically you won’t want to be pushing large html files back and forth to GitHub (that can cause GitHub to freeze).

You can edit your .gitignore file by adding a pattern with a * wildcard, like so:

/*.html

see: https://git-scm.com/docs/gitig
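If you prefer to make this edit from R rather than by hand, something like the following might work. It is a minimal sketch: `add_to_gitignore()` is a hypothetical helper we made up for illustration (it is not part of any package), and the `*.Rproj` pattern is our assumption about how to ignore the project files mentioned above. The demo writes to a temporary file so that it does not touch a real repository.

```r
# Hypothetical helper: add ignore patterns to a .gitignore file,
# skipping any pattern that is already listed
add_to_gitignore <- function(patterns, path = ".gitignore") {
  existing <- if (file.exists(path)) readLines(path, warn = FALSE) else character(0)
  new_patterns <- setdiff(patterns, existing)
  # Rewrite the file with the old lines followed by the new ones
  writeLines(c(existing, new_patterns), path)
  invisible(new_patterns)
}

# Demo on a temporary file rather than a real repository
demo_file <- tempfile("gitignore-demo-")
add_to_gitignore(c("/*.html", "*.Rproj"), path = demo_file)
add_to_gitignore("/*.html", path = demo_file)  # already present, so nothing is added
print(readLines(demo_file))
```

In a real project you would call the helper with the default path from the project's root folder, then commit the changed .gitignore file.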

Next commit!

Voila!

Extra information

JB’s recommendations for using Git and Rstudio

This is a very good tutorial on github and Rstudio: link

Video link

A very brief setup video for Mac Users Link

JK’s recommendations for using Git and RStudio

setup

RMarkdown

What is Rmarkdown?

In the example above we breezed through Rmarkdown without explaining it. What is Rmarkdown?

Rmarkdown is a format for combining data-analysis with ordinary writing using a simple markup language.

The Rmarkdown code we used to write the opening paragraph looks like this.

## To do
Read Danielle Navarro's brief account of 
Rmarkdown [here](https://slides.djnavarro.net/starting-rmarkdown/#8)

The:

##

makes a heading. Then we write as we ordinarily would write:

Read Danielle Navarro’s brief account of Rmarkdown

and we include a link by typing

[here](https://slides.djnavarro.net/starting-rmarkdown/#8)

Think of Rmarkdown as writing in Word, but without having to use your mouse all the time, and with your analysis and your write-up in a one-stop shop.

Rmarkdown is just an efficient method for composing text without having to reach for your mouse, and a way of documenting and reporting your analysis.
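To see how these pieces fit together in one file, here is a minimal sketch that writes a small Rmarkdown document from R and, if possible, renders it. The title, the file name, and the `summary(cars)` chunk are our own illustrative choices; in practice you would normally create the file from the RStudio menu instead.

```r
# Build a code-chunk fence (three backticks) so it can be written from a string
fence <- strrep("`", 3)

# Lines of a minimal Rmarkdown document:
# a YAML header, a heading, ordinary text with a link, and one R code chunk
rmd_lines <- c(
  "---",
  "title: \"My first Rmarkdown document\"",
  "output: html_document",
  "---",
  "",
  "## To do",
  "",
  "Read Danielle Navarro's brief account of",
  "Rmarkdown [here](https://slides.djnavarro.net/starting-rmarkdown/#8)",
  "",
  paste0(fence, "{r}"),
  "summary(cars)  # cars is a data set that ships with R",
  fence
)

# Save the document in a temporary folder
rmd_file <- file.path(tempdir(), "first-document.Rmd")
writeLines(rmd_lines, rmd_file)

# Render to HTML only if rmarkdown and pandoc are available on this machine
if (requireNamespace("rmarkdown", quietly = TRUE) && rmarkdown::pandoc_available()) {
  rmarkdown::render(rmd_file, quiet = TRUE)
}
```

When the document is rendered, the code chunk is run and its output appears in the HTML file directly beneath the text.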

Why is Rmarkdown useful?

Rmarkdown merges two very powerful ideas: (1) R, a code-based tool that makes your analysis repeatable; and (2) markdown, an approach to writing text that allows for the direct embedding of code output.

This is an immensely powerful approach that can be used for everything from writing research papers to writing complex technical documentation. This website is written in Rmarkdown.

You will be creating a website similar to Johannes Karl’s website and Joseph Bulbulia’s website, and you will write it using Rmarkdown.

You might think that writing in R markdown is only a nice technical trick for people really into coding, but in reality it addresses a central problem of statistical analysis.

The majority of errors in quantitative research papers (some meta-researchers indicate values as high as 80%) are human errors in transcribing values from the statistical software they are using to the final document (for a marginally entertaining story around this issue see this post).
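Here is a small sketch of how computing values in code sidesteps the transcription step. The reaction times are made-up numbers, and the variable names are ours. The reporting sentence is assembled from the computed values, so nothing is retyped by hand.

```r
# Made-up reaction times in milliseconds (illustrative data only)
reaction_times <- c(512, 498, 530, 475, 505)

# Compute the statistics once, in code
mean_rt <- mean(reaction_times)
sd_rt <- sd(reaction_times)

# Build the reporting sentence from the computed values rather than typing them
report <- sprintf("Mean reaction time was %.1f ms (SD = %.1f).", mean_rt, sd_rt)
print(report)
```

In an Rmarkdown document you can go one step further and place inline code such as `r round(mean_rt, 1)` in the middle of a sentence. If the data change, the reported number changes with them.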

Extra information

coding tips Rmarkdown website

JB’s recommendation for a very short introduction to Rmarkdown: https://rpubs.com/bpbond/626346

Tips and tricks (JB)

One day, someone might ask you to collaborate in LaTeX (pronounced “Lay-Tek”). LaTeX is a document preparation system developed by Leslie Lamport in the 1980s that uses TeX, a typesetting system that Donald Knuth developed in the 1970s to create mathematical documents. Writing in LaTeX is only a little more complicated than writing in markdown. For example, instead of writing # Heading, ## Subheading, ### Subsubheading, you would write \section{Heading}, \subsection{Subheading}, \subsubsection{Subsubheading}. However, the principles of mouseless composition that make Rmarkdown so nice also make LaTeX nice. Rmarkdown shares features for bibliographic referencing with LaTeX that we’ll cover in later weeks. For now, since we are teaching you about Rmarkdown, we thought it’d be useful to teach you about LaTeX too. Stay tuned for more.

Final thought: why quantitative psychology has to change (Why can’t I use SPSS, JK)

Quantitative psychology has long struggled with the replicability of its results, in both substantive and statistical areas. Concerns of this kind were already raised in the 1930s about the work of authors such as Joseph Banks Rhine, the founder of modern parapsychology. Numerous authors, even at the time, criticized both the methods of the experiments and the methods of analysis [@gulliksenExtraSensoryPerceptionWhat1938]. In modern times, Daryl Bem’s article “Feeling the Future”, which reported evidence in favor of extrasensory perception, revived this debate and led to an increased uptake of Open Science methods. Importantly, this is not only an issue in psychology; it affects all quantitative fields, including biology, chemistry, and physics. Out of the many issues addressed by the open science movement (if you are interested in getting active in it, have a look at ANZORN), we will focus mostly on reproducibility in analysis.

Until recently, IBM’s SPSS (Statistical Package for the Social Sciences), which originally launched in 1968, dominated the research space in psychology. If you have never had the fortune of working with SPSS, this is what it looked like:

SPSS presented the user with a GUI (Graphical User Interface) through which they could run tests on their data. The big issue was that each statistical test has many different options that researchers can choose (you will often hear people talk about researcher degrees of freedom), and a GUI makes it very difficult to accurately record every small setting a researcher has chosen. As a workaround, researchers could store the output of their analysis, which recorded some settings, but even for a moderately complex analysis this output could stretch to hundreds of pages. Alternatively, researchers could save the underlying code that SPSS used, but this was also very clunky and extremely arcane. To give you a sense of scope, below you see a snippet from a widely used analysis in SPSS aimed at examining the similarity of factor structures across groups. This code has a total of 130 lines that researchers would have needed to enter largely by hand and double-check for any potential coding errors.

Additionally, some changes made by researchers were extremely difficult to account for. For example, when a researcher re-coded a variable, say by reversing its direction, there was no way of knowing that this had taken place if you later looked at the data set. Together with the rise of complex analyses in psychology, this has led to a steady decline in the use of SPSS, and most psychology departments, as well as private and governmental stakeholders, now require a certain fluency in R or similar code-based languages.
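By contrast, when a variable is re-coded in a script, the re-coding itself is on record. The sketch below uses made-up responses to a 5-point item; the data frame and column names are hypothetical. The reversed scores go into a new column, so the original values remain visible alongside them.

```r
# Made-up responses to a 5-point Likert item (illustrative data only)
survey <- data.frame(
  id = 1:5,
  item_3 = c(1, 4, 5, 2, 3)
)

# Reverse-score the item into a new column: on a 1-5 scale, 1 becomes 5, 2 becomes 4, and so on
survey$item_3_reversed <- 6 - survey$item_3

print(survey)
```

Anyone who later reads the script can see exactly which variable was reversed and how, and with Git they can also see when the change was made.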

Final thought

Lecture slides

Click here to go to the lecture slides.

Corrections

If you see mistakes or want to suggest changes, please create an issue on the source repository.

Reuse

Text and figures are licensed under Creative Commons Attribution CC BY-NC-SA 4.0. Source code is available at https://go-bayes.github.io/psych-447/, unless otherwise noted. The figures that have been reused from other sources don't fall under this license and can be recognized by a note in their caption: "Figure from ...".

Citation

For attribution, please cite this work as

Bulbulia & Karl (2021, Feb. 23). Psych 447: Course basics. Retrieved from https://vuw-psych-447.netlify.app/posts/1_1/

BibTeX citation

@misc{bulbulia2021course,
  author = {Bulbulia, Joseph and Karl, Johannes},
  title = {Psych 447: Course basics},
  url = {https://vuw-psych-447.netlify.app/posts/1_1/},
  year = {2021}
}