5  Appendix: Introduction to R?

5.1 R

For data sets with hundreds or thousands of observations, calculating by hand is not feasible, and you will need statistical software. R is one such tool. It can also be thought of as a high-level programming language, and it is among the languages most widely used by data analysts and data scientists. Researchers around the world develop and maintain a large number of R packages for different data problems. Most importantly, R is free! In this section, we will learn how to use R to conduct basic statistical analyses.

5.2 IDE

5.2.1 RStudio

RStudio is an integrated development environment (IDE) designed specifically for working with the R programming language. It provides a user-friendly interface that includes a source editor, console, environment pane, and tools for plotting, debugging, version control, and package management. RStudio supports both R and Python and is widely used for data analysis, statistical modeling, and reproducible research. It also integrates seamlessly with tools like R Markdown, Shiny, and Quarto, making it popular among data scientists, statisticians, and educators.

5.2.2 Visual Studio Code (VS Code)

VS Code is a versatile code editor that supports multiple programming languages, including R. With the R extension for VS Code, users can write and execute R code, access R’s console, and utilize features like syntax highlighting, code completion, and debugging. While not as specialized as RStudio for R development, VS Code offers a lightweight alternative with extensive customization options and support for various programming tasks.

5.2.3 Positron

Positron IDE is the next-generation integrated development environment developed by Posit, the company behind RStudio. Designed to be a modern, extensible, and language-agnostic IDE, Positron builds on the strengths of RStudio while supporting a broader range of languages and workflows, including R, Python, and Quarto.

5.3 RStudio Layout

RStudio consists of several panes:

- Source: Where you write scripts and markdown documents.
- Console: Where you type and execute R commands.
- Environment/History: Shows your variables and command history.
- Files/Plots/Packages/Help/Viewer: For file management, viewing plots, managing packages, accessing help, and viewing web content.

5.4 R Scripts

R scripts are plain text files containing R code. You can create a new script in RStudio by clicking File > New File > R Script.
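
As an illustration, the contents of a small script might look like the sketch below. The file name analysis.R and the score values are made up for this example.

```r
# analysis.R (hypothetical file name)
# Lines starting with # are comments and are ignored by R.

scores <- c(90, 85, 77)   # made-up values for illustration
average <- mean(scores)   # arithmetic mean of the scores
print(average)
```

In RStudio you can run the current line or selection with Ctrl+Enter (Cmd+Return on macOS). Assuming the file is saved in the working directory, source("analysis.R") runs the whole script.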

5.5 R Help

Use ?function_name or help(function_name) to access help for any R function. For example:

?mean
help(mean)
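
A few related lookups can help when you do not remember a function's exact name or arguments. The sketch below uses only functions that ship with R; the choice of sd and mean is arbitrary.

```r
# Show the arguments a function accepts without opening its help page
args(sd)

# List objects whose names contain "mean"
matches <- apropos("mean")
head(matches)

# Run the examples from the help page of mean()
example(mean)
```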

5.6 R Packages

Packages extend R’s functionality. Install a package with:

install.packages("package_name")

Load a package with:

library(package_name)
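
A common pattern is to install a package only when it is missing and then load it. The sketch below is one way to do this. It uses "stats", which ships with R, as a stand-in so that the code runs without a network connection; replace it with the package you actually need.

```r
# "stats" is a placeholder chosen because it is always installed
pkg <- "stats"

# requireNamespace() returns TRUE if the package is installed
if (!requireNamespace(pkg, quietly = TRUE)) {
  install.packages(pkg)
}

# character.only = TRUE lets library() take the name from a variable
library(pkg, character.only = TRUE)
```

Installation is needed only once per R installation, while library() must be called in every new R session.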

5.7 R Markdown

R Markdown allows you to combine text, code, and output in a single document. Create a new R Markdown file in RStudio via File > New File > R Markdown....

More recently, Posit has developed Quarto, a successor to R Markdown. Quarto documents use the file extension .qmd. Quarto is still under active development.

5.8 Vectors

Vectors are the most basic data structure in R.

x <- c(1, 2, 3, 4, 5)
x
[1] 1 2 3 4 5

You can perform operations on vectors:

x * 2
[1]  2  4  6  8 10
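
A few more common vector operations, as a short sketch using the same x as above:

```r
x <- c(1, 2, 3, 4, 5)

length(x)                   # number of elements: 5
x[2]                        # second element (R indexes from 1): 2
x[x > 3]                    # elements greater than 3: 4 5
mean(x)                     # arithmetic mean: 3
x + c(10, 20, 30, 40, 50)   # element-wise addition: 11 22 33 44 55
```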

5.9 Data Sets

Data frames are used for storing data tables. Create a data frame:

df <- data.frame(Name = c("Alice", "Bob"), Score = c(90, 85))
df
   Name Score
1 Alice    90
2   Bob    85
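
Once a data frame exists, you can inspect and modify it. The sketch below reuses df from above; the Passed column and the cutoff of 60 are hypothetical additions for illustration.

```r
df <- data.frame(Name = c("Alice", "Bob"), Score = c(90, 85))

df$Score                      # one column, returned as a vector
nrow(df)                      # number of rows
df[df$Score > 85, ]           # rows where Score exceeds 85
df$Passed <- df$Score >= 60   # add a logical column (cutoff is made up)
str(df)                       # compact summary of columns and types
```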

You can import data from files using read.csv() or read.table().
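
To keep the following sketch self-contained, it first writes a small data frame to a temporary CSV file and then reads it back. With your own data you would pass the path of an existing file instead. The variable names csv_path, scores, and scores2 are chosen for this example.

```r
# Write a small data frame to a temporary CSV file
df <- data.frame(Name = c("Alice", "Bob"), Score = c(90, 85))
csv_path <- tempfile(fileext = ".csv")
write.csv(df, csv_path, row.names = FALSE)

# read.csv() assumes a header row and comma-separated fields
scores <- read.csv(csv_path)

# read.table() is more general; the header and separator must be stated
scores2 <- read.table(csv_path, header = TRUE, sep = ",")

str(scores)
```

read.csv() is a wrapper around read.table() with defaults suited to comma-separated files, so both calls here should return the same data.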


This appendix is adapted from Why R?.