Defining variables

There are three different ways to define variables:

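x <- 1 #The conventional assignment operator
y = 2 #Also works at the top level of a script
assign("z", 3) #Takes the variable name as a string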
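
As a small sketch of when assign is useful: it lets you build the variable name at runtime. The names var_name and total below are made up for illustration.

```r
x <- 1
y = 2
assign("z", 3)

# The variable name is held in a string, so it can be constructed in code
var_name <- "total"
assign(var_name, x + y + z)

get(var_name)  # 6, looks the variable up by its name
total          # 6, the same variable accessed directly
```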

Testing data types

x <- 1
is.numeric(x) #Tests if value is of a specific type; swap numeric for character, logical, factor etc.
class(x) #Returns the data type
is(x) #The first value returned is the data type
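
A minimal sketch of these checks on a single number. The results in the comments assume x is defined exactly as shown.

```r
x <- 1
is.numeric(x)    # TRUE
is.character(x)  # FALSE
is.integer(x)    # FALSE, a plain 1 is stored as a double; write 1L for an integer
class(x)         # "numeric"
is(x)[1]         # "numeric"
```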

Converting data types

x <- as.numeric(value) #Swap numeric for the type you want to convert to
Valid types include integer, numeric, character, logical and factor.
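
A short sketch of common conversions, including one known trap: converting a factor straight to numeric returns its internal level codes, not the values you see printed. The variable f is just an example.

```r
as.integer("42")                    # 42
as.character(3.14)                  # "3.14"
as.logical("TRUE")                  # TRUE
suppressWarnings(as.numeric("abc")) # NA, values that cannot be converted become NA

f <- factor(c("10", "20", "10"))
as.numeric(f)                # 1 2 1, the internal level codes
as.numeric(as.character(f))  # 10 20 10, the actual values
```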

Logical operators

x == y #Is equal to
x != y #Is not equal to
x > y #Is greater than
x < y  #Is less than
x >= y #Is greater than or equal to (not less than)
x <= y #Is less than or equal to (not greater than)
x == y | x == z #Or
x == y & x == z #And
x %in% c(1, 2, 3) #Is found in the vector or list
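
These operators work element by element on vectors. A minimal sketch, with made-up values for x and y:

```r
x <- c(1, 5, 10)
y <- 5

x >= y               # FALSE TRUE TRUE
x <= y               # TRUE TRUE FALSE
x < 3 | x > 8        # TRUE FALSE TRUE
x > 3 & x < 8        # FALSE TRUE FALSE
x %in% c(5, 10, 15)  # FALSE TRUE TRUE
!(x %in% c(5, 10))   # TRUE FALSE FALSE, use ! to negate a test
```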

Converting to date

df$column <- as.Date(df$column, format = "%d/%m/%Y") #Use this if the column is a character
df$column <- as.Date(df$column, origin = "1970-01-01") #Use this if the column is a number of days since the origin date

Date format codes (such as %d, %m and %Y) are listed in the R help page ?strptime
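
A minimal sketch of both conversions on standalone vectors instead of a dataframe column. The values in dates_chr and days_num are invented for the example.

```r
# Character dates in day/month/year order
dates_chr <- c("25/12/2020", "01/01/2021")
d <- as.Date(dates_chr, format = "%d/%m/%Y")
format(d, "%Y-%m-%d")  # "2020-12-25" "2021-01-01"
d[2] - d[1]            # Time difference of 7 days

# Numbers counted as days since 1 January 1970
days_num <- c(0, 18621)
as.Date(days_num, origin = "1970-01-01")  # "1970-01-01" "2020-12-25"
```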


Installing & loading packages

install.packages("nameofpackage") #You only need to call this once
library(nameofpackage) #Put this line in any script that uses the package

Creating & subsetting vectors

x <- c(10,20,30,40,50)
x[1:3] #Returns values at indices 1 to 3
x[c(1,2,4)] #Returns values at indices 1, 2, and 4
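
A few more ways to subset the same vector, as a brief sketch. Note that R counts indices from 1, not 0.

```r
x <- c(10, 20, 30, 40, 50)

x[1]          # 10, the first element
x[-1]         # 20 30 40 50, a negative index drops that element
x[x > 25]     # 30 40 50, a logical test keeps the matching elements
x[length(x)]  # 50, the last element
```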

Creating & subsetting dataframes

df <- data.frame(numeric_col = c(1,2,3), character_col = c("My", "name", "is"), stringsAsFactors = FALSE) #Use stringsAsFactors = TRUE to store strings as factors
df$numeric_col #Returns just numeric_col
df[,1] #Returns all rows in the first column
df[1,] #Returns only the first row
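
Columns can also be selected by name, and rows and columns can be combined in one call. A minimal sketch using the same example dataframe:

```r
df <- data.frame(numeric_col = c(1, 2, 3),
                 character_col = c("My", "name", "is"),
                 stringsAsFactors = FALSE)

df[["character_col"]]    # "My" "name" "is", same result as df$character_col
df[2, "character_col"]   # "name", row 2 of the named column
df[, 1, drop = FALSE]    # the first column, kept as a one-column dataframe
df[df$numeric_col > 1, ] # rows 2 and 3
str(df)                  # prints the structure: column names, types and first values
```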

Loading data

Load data via .csv or .xlsx

read.csv("path_to_file", stringsAsFactors = TRUE/FALSE, header = TRUE/FALSE) #Specify whether you want columns with strings as factors and if the file has headers
library(readxl) #read_excel requires the readxl package
read_excel("path_to_file", col_names = TRUE/FALSE, sheet = "sheetname"/sheet_index)
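
To try read.csv without a file to hand, here is a self-contained sketch that writes a small .csv to a temporary location and reads it back. The column names id and name are made up for the example.

```r
# Write a small example file to a temporary path
path <- tempfile(fileext = ".csv")
write.csv(data.frame(id = 1:3, name = c("a", "b", "c")), path, row.names = FALSE)

# Read it back, keeping strings as character and using the first row as headers
loaded <- read.csv(path, stringsAsFactors = FALSE, header = TRUE)
str(loaded)   # shows the column types that were read
head(loaded)  # shows the first rows
```

Remember to store the result in a variable, otherwise the data is only printed.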

Filtering

dataframe[dataframe$column ==/!=/</>/%in% value,]
subset(dataframe, dataframe$column ==/!=/</>/%in% value)

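You can use any logical operator in your filter criteria.
Note: subset and the [] method may return the results in different data structures! On a dataframe with a single column, [] returns a vector while subset returns a dataframe. The [] method also keeps rows where the condition is NA, while subset drops them.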
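
A minimal sketch of how the two methods can differ, using a made-up one-column dataframe that contains a missing value:

```r
df <- data.frame(score = c(5, NA, 12))

df[df$score > 4, ]                # 5 NA 12, a plain vector with the NA kept
df[df$score > 4, , drop = FALSE]  # a dataframe with 3 rows, the NA row kept
subset(df, score > 4)             # a dataframe with 2 rows, the NA row dropped
df[which(df$score > 4), , drop = FALSE]  # 2 rows, which() removes the NA
```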


Plotting

Use plots to visualise data and its relationships

plot(x = df$x_column, y = df$y_column, main = "Title of plot", xlab = "X axis label", ylab = "Y axis label", pch = 19) #Creates a plot of x against y; pch sets the point symbol (0 to 25)
hist(df$column) #Creates a histogram from df$column
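
A runnable sketch using mtcars, a dataset that ships with R, so no file is needed. The titles and labels are just examples.

```r
# Scatter plot of car weight against fuel economy
plot(x = mtcars$wt, y = mtcars$mpg,
     main = "Fuel economy by weight",
     xlab = "Weight (1000 lbs)", ylab = "Miles per gallon",
     pch = 19)  # 19 is a solid circle

# Histogram of fuel economy
hist(mtcars$mpg, main = "Distribution of fuel economy", xlab = "Miles per gallon")
```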