1 What is R?

R is a free programming language and software environment for statistical computing and graphics. The R language is widely used among statisticians and data miners for developing statistical software and data analysis.

R and its libraries implement a wide variety of statistical and graphical techniques, including linear and nonlinear modeling, classical statistical tests, time-series analysis, classification, clustering, and others. R is easily extensible through functions and extensions, and the R community is noted for its active contributions in terms of packages.

R isn’t typically used as a general-purpose programming language. It sees more use in data analysis and data visualisation, primarily because of the powerful libraries behind it.

2 Install R and RStudio

3 Scalars

3.1 Data Types

R works with numerous data types. Some of the most basic types to get started are:

  • Text (or string) values are called characters.
  • Decimal values like 4.5 are called numerics.
  • Whole numbers written with an L suffix, like 4L, are called integers (a plain 4 is stored as a numeric). Integers are also numerics.
  • Boolean values (TRUE or FALSE) are called logicals.
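
One way to confirm these types is to ask R for the class of a few values. This is a minimal sketch; the variable names are made up for illustration.

```r
# Illustrative values, one per basic type
my_character <- "some text"
my_numeric   <- 4.5
my_integer   <- 4L      # the L suffix requests an integer
my_logical   <- TRUE

class(my_character)     # "character"
class(my_numeric)       # "numeric"
class(my_integer)       # "integer"
class(my_logical)       # "logical"

# A plain 4 without the L suffix is stored as a numeric
class(4)                # "numeric"
# Integers still count as numerics
is.numeric(my_integer)  # TRUE
```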

3.2 Strings

As is customary with every new language one learns, let’s learn how to print “Hello world!”:

"Hello world!"
## [1] "Hello world!"
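
Typing a string at the console prints it automatically. The sketch below shows a few other common ways to display and inspect a string; the variable name greeting is only an illustration.

```r
# Store the string in a variable (name chosen for illustration)
greeting <- "Hello world!"
# print() displays the value explicitly, including index and quotes
print(greeting)                  # [1] "Hello world!"
# cat() writes the text itself, without the index or the quotes
cat(greeting, "\n", sep = "")    # Hello world!
# Single and double quotes both create character values
identical('Hello world!', "Hello world!")   # TRUE
# nchar() counts the characters in a string
nchar(greeting)                  # 12
# paste() joins strings, separated by a space by default
paste("Hello", "world!")         # "Hello world!"
```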

3.3 Arithmetic

In its most basic form, R can be used as a simple calculator. Consider the following arithmetic operators:

  • Addition: +
  • Subtraction: -
  • Multiplication: *
  • Division: /
  • Exponentiation: ^
  • Modulo: %%

For example:

# Remainder when 3 is divided by 2
3 %% 2
## [1] 1

Another interesting thing to point out is that R follows the standard order of operations (BODMAS):

# Multiplication and division are evaluated before addition and subtraction
6 - 7 * 2 / 1 + 3
## [1] -5
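
Parentheses override the default order. The sketch below contrasts the expression above with a parenthesised version and adds two related cases. Note that the integer-division operator %/% is not in the list above; it is included here only as a companion to %%.

```r
# Without parentheses: multiplication and division happen first
6 - 7 * 2 / 1 + 3      # -5
# Parentheses force the subtraction and addition to happen first
(6 - 7) * 2 / (1 + 3)  # -0.5
# Exponentiation binds tighter than the unary minus
-2^2                   # -4
(-2)^2                 # 4
# Integer division (%/%) gives the quotient, modulo (%%) the remainder
7 %/% 2                # 3
7 %% 2                 # 1
```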

3.4 Logical Values

There are 3 logical operators:

  • &, && - and
  • |, || - or
  • ! - not

The (logical) comparison operators known to R are:

  • < for less than
  • > for greater than
  • <= for less than or equal to
  • >= for greater than or equal to
  • == for equal to each other
  • != not equal to each other

Most programming languages refer to logical values as boolean values. Some expressions return TRUE or FALSE. When numbers are used with logical operators, 0 is treated as FALSE and any other number as TRUE:

# AND
1 & 0
## [1] FALSE
# OR
0 | 1
## [1] TRUE
# NOT
!0
## [1] TRUE
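
The single and double forms of the operators differ in how many values they handle. Here’s a minimal sketch of the difference; it uses c(...) to build vectors of several values, which this tutorial has not introduced yet.

```r
# & and | compare element by element and return one result per element
c(TRUE, FALSE, TRUE) & c(TRUE, TRUE, FALSE)    # TRUE FALSE FALSE
c(TRUE, FALSE, TRUE) | c(FALSE, FALSE, TRUE)   # TRUE FALSE TRUE
# && and || work on single values and return a single value
TRUE && FALSE   # FALSE
TRUE || FALSE   # TRUE
# Numbers are coerced: 0 is FALSE, any other number is TRUE
!5              # FALSE
```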

Here’s an example of a comparison operator:

3 < 4
## [1] TRUE

The == operator checks whether two expressions are equal:

# == sign checks for equivalence
2 + 2 == 5
## [1] FALSE
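
One thing to watch for: == compares exactly, so decimal arithmetic can give surprising results because of tiny rounding errors. A small sketch, using the base function all.equal(...) as a tolerant alternative:

```r
# Whole-number arithmetic compares exactly
2 + 2 == 4                          # TRUE
# Decimal arithmetic can carry tiny rounding errors
0.1 + 0.2 == 0.3                    # FALSE
# all.equal() compares with a small tolerance; wrap it in isTRUE()
isTRUE(all.equal(0.1 + 0.2, 0.3))   # TRUE
```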

T and F are shorthand for TRUE and FALSE respectively:

T == TRUE
## [1] TRUE
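
A word of caution: T and F are ordinary variables that merely start out as TRUE and FALSE, so they can be overwritten. The sketch below shows why spelling out TRUE and FALSE is the safer habit.

```r
# By default T holds the value TRUE
T == TRUE        # TRUE
# T can be overwritten, which silently changes the meaning of later code
T <- 0
T == TRUE        # FALSE
# Remove the override so that T refers to the built-in value again
rm(T)
T == TRUE        # TRUE
```

TRUE and FALSE themselves are reserved words and cannot be reassigned.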

4 Variables

4.1 Assignment

As in other programming languages, you can store a value in a variable to access it later:

# Assign the value 42 to x
x <- 42
# Print out the value of the variable x
x
## [1] 42

Another way to assign a variable is by calling the assign(...) function:

# Call the assign function
assign("this_var_x", 43210)
# Display 'this_var_x'
this_var_x
## [1] 43210
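
Because assign(...) takes the variable name as a string, it pairs naturally with get(...) and exists(...), which look a variable up by name. A minimal sketch (the variable y is made up for illustration):

```r
# assign() takes the variable name as a string
assign("this_var_x", 43210)
# get() is the counterpart: it fetches a variable by its name
get("this_var_x")        # 43210
# exists() reports whether a variable with that name is defined
exists("this_var_x")     # TRUE
# The = operator also assigns, although <- is the more common style
y = 10
y                        # 10
```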

4.2 Arithmetic

Arithmetic also works on variables, but notice that it doesn’t change the original value of x:

# Divide x by 21
x/21
## [1] 2
# Print x
x
## [1] 42

To change the value of x, we’d have to do the following:

# Save the result of x/21 to a variable x
x <- x/21
# Print x
x
## [1] 2

Let’s assume we have three variables, each holding the number of a particular fruit we bought at the market:

# Fruits bought at the market
apples <- 5
oranges <- 6
tomatoes <- "ten"

Let’s find the total number of apples and oranges bought:

# Sum of apples and oranges
apples + oranges
## [1] 11

What about the sum of the apples and tomatoes?

apples + tomatoes
## Error in apples + tomatoes: non-numeric argument to binary operator

This fails because the two variables (apples and tomatoes) are of different data types (numeric and character), and R cannot add a number to text.
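
If the text actually contains digits, it can be converted to a number first. Here’s a minimal sketch using as.numeric(...); changing tomatoes to "10" is an assumption made for illustration.

```r
apples <- 5
tomatoes <- "ten"
# "ten" is not a number R can parse, so conversion yields NA (with a warning)
suppressWarnings(as.numeric(tomatoes))   # NA

# A string of digits converts cleanly
tomatoes <- "10"
apples + as.numeric(tomatoes)            # 15
```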

4.3 Logical

Variables can be compared to scalars:

# Is the number of apples less than 2?
apples < 2
## [1] FALSE

Variables can be compared to other variables as well:

# Comparing two variables
apples == oranges
## [1] FALSE

We can build more complex logical expressions by linking comparisons with any of the three logical operators (!, |, &):

apples > 2 & oranges < 7
## [1] TRUE
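
The other two operators combine in the same way. A short sketch, reusing the apples and oranges values from above:

```r
apples <- 5
oranges <- 6
# OR: TRUE when at least one side is TRUE
apples > 10 | oranges < 7                   # TRUE
# NOT: flips the result of the expression in parentheses
!(apples > 2 & oranges < 7)                 # FALSE
# Parentheses make the grouping explicit when mixing operators
(apples > 10 | oranges < 7) & apples == 5   # TRUE
```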

5 Introduction to Functions and Helpers

5.1 Examples of functions

Functions exist to perform repeated tasks. You call a function by typing its name, followed by its arguments (if any) in parentheses. Let’s see an example of the sum(...) function:

# Sum of 1, 3, 5
sum(1,3,5)
## [1] 9

Another example:

# Sum of apples and oranges
sum(apples, oranges)
## [1] 11

Some arguments have names. For example, to repeat a value 3 times, you would call the rep(...) function and provide its times argument:

# Repeats "FIRE" 3 times
rep("FIRE", times = 3)
## [1] "FIRE" "FIRE" "FIRE"
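
Named arguments let you pick which option of a function you are setting. The sketch below contrasts the times and each arguments of rep(...) and shows the digits argument of round(...); it uses c(...) to build a two-element vector, which this tutorial has not introduced yet.

```r
# Arguments can be matched by position instead of by name
rep("FIRE", 3)                   # "FIRE" "FIRE" "FIRE"
# times repeats the whole vector
rep(c("A", "B"), times = 2)      # "A" "B" "A" "B"
# each repeats every element in turn
rep(c("A", "B"), each = 2)       # "A" "A" "B" "B"
# round() takes a digits argument for the number of decimal places
round(3.14159, digits = 2)       # 3.14
```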

Most mathematical operations have built-in functions, such as sqrt(...) for square roots:

# Square root of 16
sqrt(16)
## [1] 4
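
A few more of base R’s mathematical functions, as a quick sketch:

```r
# Absolute value
abs(-7)        # 7
# Round down and round up to the nearest whole number
floor(2.7)     # 2
ceiling(2.1)   # 3
# Exponential and natural logarithm
exp(0)         # 1
log(1)         # 0
# Functions can be nested: the inner call is evaluated first
sqrt(abs(-16)) # 4
```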

5.2 Helpers

help(...) brings up help for the given function. Try displaying help for the sum function:

help(sum)
# We can also bring up the help file in this way
?sum

example(...) brings up examples of usage for the given function. Try displaying examples for the sum function:

example(sum)
## 
## sum> ## Pass a vector to sum, and it will add the elements together.
## sum> sum(1:5)
## [1] 15
## 
## sum> ## Pass several numbers to sum, and it also adds the elements.
## sum> sum(1, 2, 3, 4, 5)
## [1] 15
## 
## sum> ## In fact, you can pass vectors into several arguments, and everything gets added.
## sum> sum(1:2, 3:5)
## [1] 15
## 
## sum> ## If there are missing values, the sum is unknown, i.e., also missing, ....
## sum> sum(1:5, NA)
## [1] NA
## 
## sum> ## ... unless  we exclude missing values explicitly:
## sum> sum(1:5, NA, na.rm = TRUE)
## [1] 15

5.3 Tests

One of the most important helper functions used is class(...). It helps us to determine the datatype of the variable:

# Returns the class of the object
class(apples)
## [1] "numeric"

Another way to check an object’s type is to test for a specific type with one of the is.*(...) functions, which return TRUE or FALSE:

# Returns a boolean value
is.numeric(tomatoes)
## [1] FALSE
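
There is one such test per basic type. The sketch below also shows typeof(...), which reports how R stores the value internally; it reuses the apples and tomatoes values from earlier.

```r
apples <- 5
tomatoes <- "ten"
# Test functions return a single TRUE or FALSE
is.character(tomatoes)   # TRUE
is.numeric(apples)       # TRUE
is.logical(apples > 2)   # TRUE
# typeof() reports the internal storage type
typeof(apples)           # "double"
typeof(5L)               # "integer"
```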

6 Miscellaneous

A syntactically valid variable name consists of letters, numbers and the dot or underline characters and starts with a letter or the dot not followed by a number. Names such as “.2way” are not valid, and neither are the reserved words. See ?make.names.
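
To see these rules in action, make.names(...) turns arbitrary strings into syntactically valid names. A brief sketch; the input strings are made up for illustration.

```r
# A valid name is returned unchanged
make.names("valid_name")   # "valid_name"
# A name starting with a digit gets an X prepended
make.names("2way")         # "X2way"
# A dot followed by a number is also invalid
make.names(".2way")        # "X.2way"
# Invalid characters such as spaces are replaced by dots
make.names("my var")       # "my.var"
# Reserved words get a dot appended
make.names("if")           # "if."
```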

Commands are separated either by a semicolon (;) or by a newline. Elementary commands can be grouped together into one compound expression by braces ({ and }). Comments can be put almost anywhere: starting with a hashmark (#), everything to the end of the line is a comment. Finally, print(...) prints a value.

# Commands separated by ;
x <- 1; y <- 2; print(x + y)
## [1] 3
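
Grouping with braces can be sketched as follows. The value of a compound expression is the value of its last command; the variable names are made up for illustration.

```r
# Braces group several commands into one compound expression
total <- {
  a <- 1   # comments may follow a command on the same line
  b <- 2
  a + b    # the last value becomes the value of the whole group
}
print(total)   # [1] 3
```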