R is a free programming language and software environment for statistical computing and graphics. The R language is widely used among statisticians and data miners for developing statistical software and data analysis.
R and its libraries implement a wide variety of statistical and graphical techniques, including linear and nonlinear modeling, classical statistical tests, time-series analysis, classification, clustering, and others. R is easily extensible through functions and extensions, and the R community is noted for its active contributions in terms of packages.
R isn’t typically used as a general-purpose programming language. It sees more use in data analysis and data visualisation, primarily because of the powerful libraries behind it.
R works with numerous data types. Some of the most basic types to get started are:
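- Decimal values like 4.5 are called numerics.
- Whole numbers like 4 are called integers. Integers are also numerics. (In R code, a bare 4 is stored as a numeric; writing 4L stores it as an integer.)
- Boolean values (TRUE or FALSE) are called logical.
As is customary with every new language one learns, let’s learn how to print “Hello world!”:
"Hello world!"
## [1] "Hello world!"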
In its most basic form, R can be used as a simple calculator. Consider the following arithmetic operators:
# Remainder when 3 is divided by 2
3 %% 2
## [1] 1
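The other arithmetic operators work the same way. The sketch below lists the common ones in base R; the operands are arbitrary values chosen for illustration.

```r
# Common arithmetic operators in base R (operands chosen for illustration)
5 + 3    # addition: 8
5 - 3    # subtraction: 2
5 * 3    # multiplication: 15
5 / 2    # division: 2.5
5 ^ 2    # exponentiation: 25
5 %% 2   # modulo (remainder of 5 divided by 2): 1
5 %/% 2  # integer division (quotient without the remainder): 2
```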
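Another thing worth pointing out is that R follows the standard order of operations (BODMAS): division and multiplication are evaluated before addition and subtraction.
# Evaluated as 6 - ((7 * 2) / 1) + 3
6 - 7 * 2 / 1 + 3
## [1] -5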
There are 3 logical operators:
- & and && (and)
- | and || (or)
- ! (not)
The (logical) comparison operators known to R are:
- < for less than
- > for greater than
- <= for less than or equal to
- >= for greater than or equal to
- == for equal to each other
- != for not equal to each other
Most programming languages refer to logical values as boolean values. Some expressions return TRUE or FALSE:
# AND
1 & 0
## [1] FALSE
# OR
0 | 1
## [1] TRUE
# NOT
!0
## [1] TRUE
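The single and double forms of these operators differ in how they treat vectors: & and | work element by element, while && and || expect single values and return a single result. A brief sketch, using made-up vectors a and b built with c(...):

```r
# Hypothetical logical vectors for illustration
a <- c(TRUE, FALSE, TRUE)
b <- c(TRUE, TRUE, FALSE)

# Element-wise AND and OR return one result per position
and_each <- a & b   # TRUE FALSE FALSE
or_each  <- a | b   # TRUE TRUE TRUE

# && and || take single values and return a single result
and_one <- a[1] && b[1]   # TRUE
or_one  <- a[2] || b[3]   # FALSE
```

The double forms are the ones normally used in if conditions. From R 4.3.0 onward, passing them a vector longer than one is an error.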
Here’s an example of a comparison operator:
3 < 4
## [1] TRUE
The == operator checks whether two expressions are equal:
# == checks for equality
2 + 2 == 5
## [1] FALSE
T and F are shorthand for TRUE and FALSE, respectively:
T == TRUE
## [1] TRUE
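One caveat: TRUE and FALSE are reserved words, whereas T and F are ordinary variables that hold those values by default, so they can be overwritten. The sketch below reassigns T inside a function (a hypothetical helper named shadow_demo) so that nothing leaks into the workspace:

```r
# shadow_demo is a hypothetical helper used only for this illustration
shadow_demo <- function() {
  T <- 0              # allowed: T is just a variable name
  identical(T, TRUE)  # compares the overwritten T with the real TRUE
}

shadow_demo()         # FALSE, because T was overwritten inside the function
```

For this reason, spelling out TRUE and FALSE is the safer habit in scripts.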
As in other programming languages, you can store a value in a variable to access it later:
# Assign the value 42 to x
x <- 42
# Print out the value of the variable x
x
## [1] 42
Another way to assign a variable is by calling the assign(...) function:
# Call the assign function
assign('this_var_x', 43210)
# Display 'this_var_x'
this_var_x
## [1] 43210
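assign(...) is mostly useful when the variable name is only known as a string, for example when it is built at run time. Its counterpart get(...) looks a variable up by name. A minimal sketch, with a hypothetical name built by paste0(...):

```r
# Build a variable name at run time (the name is hypothetical)
var_name <- paste0("count_", "apples")

assign(var_name, 5)   # same effect as count_apples <- 5
count_apples          # 5
get(var_name)         # 5, looked up by its name as a string
```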
Variable arithmetic also works, but notice that it doesn’t change the original value of x:
# Divide x by 21
x / 21
## [1] 2
# Print x
x
## [1] 42
To change the value of x, we’d have to do the following:
# Save the result of x/21 to a variable x
x <- x/21
# Print x
x
## [1] 2
Let’s assume we have 3 variables where each variable here denotes the number of fruits we bought at the market:
# Fruits bought at the market
apples <- 5
oranges <- 6
tomatoes <- "ten"
Let’s find out the sum of the apples and oranges bought collectively:
# Sum of apples and oranges
apples + oranges
## [1] 11
What about the sum of the apples and tomatoes?
apples + tomatoes
## Error in apples + tomatoes: non-numeric argument to binary operator
This fails because the two variables (apples and tomatoes) have different data types (numeric and character), and + is not defined for character values.
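If a number has been stored as text, it can be converted before doing arithmetic. The sketch below assumes a hypothetical variable tomatoes_text that holds digits as a string. Note that as.numeric(...) cannot interpret a spelled-out word such as "ten"; it returns NA with a warning in that case.

```r
apples <- 5
tomatoes_text <- "10"   # hypothetical: the count stored as a character string

# Convert the string to a number, then add
total <- apples + as.numeric(tomatoes_text)
total                   # 15

# A word is not a number: the result is NA
# (suppressWarnings hides the coercion warning)
word_value <- suppressWarnings(as.numeric("ten"))
is.na(word_value)       # TRUE
```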
Variables can be compared to scalars:
# Is the number of apples less than 2?
apples < 2
## [1] FALSE
Variables can be compared to each other as well (in R, a single value is simply a vector of length one):
# Comparing two variables
apples == oranges
## [1] FALSE
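The same operators work element by element on longer vectors. A small sketch with hypothetical counts from three market visits, built with c(...):

```r
# Hypothetical fruit counts for three market visits
apples_by_visit  <- c(5, 2, 7)
oranges_by_visit <- c(6, 2, 3)

# Compare position by position
same_count  <- apples_by_visit == oranges_by_visit   # FALSE TRUE FALSE
more_apples <- apples_by_visit > oranges_by_visit    # FALSE FALSE TRUE

# A single value is recycled against every element
few_apples <- apples_by_visit < 3                     # FALSE TRUE FALSE
```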
We can build more complex logical expressions by combining comparisons with any of the three logical operators (!, |, &):
apples > 2 & oranges < 7
## [1] TRUE
Functions exist to perform repeated tasks. You call a function by typing its name, followed by its arguments in parentheses. Let’s see an example of the sum(...) function:
# Sum of 1, 3, 5
sum(1, 3, 5)
## [1] 9
Another example:
# Sum of apples and oranges
sum(apples, oranges)
## [1] 11
Some arguments have names. For example, to repeat a value 3 times, you would call the rep(...) function and provide its times argument:
# Repeats "FIRE" 3 times
rep("FIRE", times = 3)
## [1] "FIRE" "FIRE" "FIRE"
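Arguments can be matched by position or by name, and naming them makes it possible to pick between related options. The sketch below contrasts the times and each arguments of rep(...); the input values are arbitrary.

```r
# Positional and named forms give the same result
by_position <- rep("FIRE", 3)
by_name     <- rep(x = "FIRE", times = 3)
identical(by_position, by_name)              # TRUE

# 'each' repeats every element before moving to the next one
each_twice <- rep(c("A", "B"), each = 2)     # "A" "A" "B" "B"

# 'times' repeats the whole vector
whole_twice <- rep(c("A", "B"), times = 2)   # "A" "B" "A" "B"
```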
Most common mathematical operations, such as the square root, have built-in functions:
# Square root of 16
sqrt(16)
## [1] 4
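A few other built-in mathematical functions from base R, shown with arbitrary inputs as a quick sketch:

```r
abs(-3)             # absolute value: 3
round(3.14159, 2)   # round to 2 decimal places: 3.14
exp(0)              # e raised to the power 0: 1
log10(100)          # base-10 logarithm: 2
max(2, 7, 5)        # largest of the arguments: 7
min(2, 7, 5)        # smallest of the arguments: 2
```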
help(...) brings up help for the given function. Try displaying help for the sum function:
help(sum)
# We can also bring up the help file in this way
?sum
example(...) brings up examples of usage for the given function. Try displaying examples for the sum function:
example(sum)
##
## sum> ## Pass a vector to sum, and it will add the elements together.
## sum> sum(1:5)
## [1] 15
##
## sum> ## Pass several numbers to sum, and it also adds the elements.
## sum> sum(1, 2, 3, 4, 5)
## [1] 15
##
## sum> ## In fact, you can pass vectors into several arguments, and everything gets added.
## sum> sum(1:2, 3:5)
## [1] 15
##
## sum> ## If there are missing values, the sum is unknown, i.e., also missing, ....
## sum> sum(1:5, NA)
## [1] NA
##
## sum> ## ... unless we exclude missing values explicitly:
## sum> sum(1:5, NA, na.rm = TRUE)
## [1] 15
One of the most important helper functions is class(...). It tells us the data type of a variable:
# Returns the class of the object
class(apples)
## [1] "numeric"
Another way to determine the class of an object follows:
# Returns a logical value
is.numeric(tomatoes)
## [1] FALSE
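To tie this back to the basic data types, the sketch below applies class(...) and a few of the is.*() checks to literal values. The values themselves are arbitrary.

```r
class(4.5)      # "numeric"
class(4)        # "numeric": a bare whole number is stored as a double
class(4L)       # "integer": the L suffix requests an integer
class(TRUE)     # "logical"
class("ten")    # "character"

# The is.*() family returns a single logical value
is.character("ten")   # TRUE
is.numeric(4L)        # TRUE: integers count as numeric
is.logical(0)         # FALSE
```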
A syntactically valid variable name consists of letters, numbers and the dot or underline characters and starts with a letter or the dot not followed by a number. Names such as “.2way” are not valid, and neither are the reserved words. See ?make.names.
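As a quick illustration of these rules, make.names(...) converts arbitrary strings into syntactically valid names. The candidate names below are made up for the example; a name that comes back unchanged was already valid.

```r
# Hypothetical candidate names
candidates <- c("total_count", ".hidden", ".2way", "my var", "if")

fixed <- make.names(candidates)
fixed
# "total_count" ".hidden" "X.2way" "my.var" "if."

# A name is already valid when make.names() leaves it unchanged
candidates == fixed   # TRUE TRUE FALSE FALSE FALSE
```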
Commands are separated either by a semicolon (;) or by a newline. Elementary commands can be grouped together into one compound expression by braces ({ and }). Comments can be put almost anywhere: starting with a hashmark (#), everything to the end of the line is a comment. Finally, print(...) prints the value of its argument.
# Commands separated by ;
x <- 1; y <- 2; print(x + y)
## [1] 3
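Braces and trailing comments can be sketched in the same way. In the example below (variable names are arbitrary), the value of the compound expression is the value of its last command:

```r
# A compound expression: the braces evaluate to the value of the last command
total <- {
  a <- 1   # comments can follow a command on the same line
  b <- 2
  a + b
}
print(total)   # 3
```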