# Introduction to R

## Basics

In its simplest form, R can be used as a calculator.

1 + 1
##  2
2 + log(2)
##  2.693147
sin(pi / 2) # pi is a predefined constant in R (3.141593)
##  1

## Variables

Printed results are interesting, but not very useful if we cannot reuse them.

In R, two operators (= and <-) can be used to assign a value to a variable. By convention, <- is preferred. In RStudio, press Alt + - to insert the <- symbol automatically.

x = 120 # Works, but not recommended by convention
x <- pi
x <- "hi"
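To see why storing a value is useful, here is a minimal sketch that reuses a variable in a later calculation. The names `radius` and `area` are illustrative and are not used elsewhere in this material.

```r
# Store a value once, then reuse it in a later calculation
radius <- 2
area <- pi * radius^2   # Uses the stored value of radius

area
# 12.56637
```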

## Variable names

You have to be careful when choosing a variable name. There are a few rules to respect.

1. Can contain both letters and numbers, but must start with a letter (good: x123, bad: 1x23)

2. Should not contain special characters (à, è, æ, ø, å, ?, %, ’, \$, etc.)

3. Should not be the same as an R function or predefined variable name (cos, sin, pi, min, etc.)
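One way to check candidate names against the first two rules is base R's make.names() function, which returns a syntactically valid version of each name it receives. A name that comes back unchanged was already valid. This is only a sketch: make.names() checks syntax, not whether a name clashes with an existing function such as cos.

```r
candidates <- c("x123", "1x23", "my var")
valid <- make.names(candidates)

valid
# "x123"   "X1x23"  "my.var"

candidates == valid   # TRUE where the original name was already valid
# TRUE FALSE FALSE
```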

## Basic data types

• numeric: 12.3, pi, 1, etc.
• character: "a", "text", "long text with space"
• boolean/logical: TRUE/1, FALSE/0
• date: 2016-01-31 (use ISO format, large -> small)

We can use the class() function to get the type of a variable.

x <- -32.234 # Create variable x and assign the value -32.234
class(x) # What is the class of x?
##  "numeric"
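The same check works for the other basic types. In this sketch, as.Date() builds a date from an ISO-formatted string; the variable names are illustrative.

```r
n <- 12.3
s <- "text"
b <- TRUE
d <- as.Date("2016-01-31")   # Convert an ISO-formatted string to a date

class(n)   # "numeric"
class(s)   # "character"
class(b)   # "logical"
class(d)   # "Date"
```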

## Basic data types

#### Exercise #1

What is the class of x?

x <- "2015-05-24"

## Vectors

• Vectors are arrays of successive elements. Elements of a vector can be, for example: numeric, character, logical, date.
• Vectors are created using the c() function, with commas separating the elements.
x <- c(1, 4, -1, 10, 5.4) # Vector of numeric values
y <- c("This", "is", "a", "vector") # Vector of character values
z <- c(FALSE, TRUE, FALSE, FALSE, TRUE) # Vector of logical values

class(z)
##  "logical"

## Exercise

#### Question #1

What is the class of this vector?

x <- c(1, "2", -3, "pi")

## Elements of vector

• Particular elements of a vector can be accessed using the [] operator.
• The first element of a vector is located at position 1.
x <- c(1, 4, -1, 10, 5.4) # Vector of numeric values
x[1] # Value at the first position
##  1
x[1] + x[4] # 1 + 10
##  11
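Several elements can be extracted at once by placing a vector of positions inside the square brackets. A short sketch using the same vector x:

```r
x <- c(1, 4, -1, 10, 5.4)

x[c(1, 3)]   # Elements at positions 1 and 3
# 1 -1
x[2:4]       # Elements at positions 2 to 4
# 4 -1 10
```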

## Exercise

Given the vector x:

x <- c(1, 4, -1, 10, 5.4)

#### Question #1

What is the value of:

x

## Exercise

Given the vector x:

x <- c(1, 4, -1, 10, 5.4)

#### Question #2

What is the value of:

x[-1]

## Elements of vector

We can also apply a function to all elements of a vector.

x <- c(1, 4, -1, 10, 5.4)

x / 2   # Divide all elements of x by 2
##   0.5  2.0 -0.5  5.0  2.7
sin(x)  # sin value of all elements of x
##   0.8414710 -0.7568025 -0.8414710 -0.5440211 -0.7727645
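Many other built-in functions work element by element in the same way. A brief sketch with the same vector, using round() to keep the output short:

```r
x <- c(1, 4, -1, 10, 5.4)

abs(x)              # Absolute value of every element
# 1.0 4.0 1.0 10.0 5.4
round(sin(x), 2)    # Sine of every element, rounded to 2 decimals
# 0.84 -0.76 -0.84 -0.54 -0.77
```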

## Elements of vector

Vectors and matrices of the same size can be operated on arithmetically, with the operations acting element-wise.

x <- c(1, 3, 5, 1)  # vector 1
y <- c(3, 1, 4, 1)  # vector 2

x + y
##  4 4 9 2
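The other arithmetic operators behave the same way. A sketch with the same two vectors:

```r
x <- c(1, 3, 5, 1)
y <- c(3, 1, 4, 1)

x - y   # Element-wise subtraction
# -2 2 1 0
x * y   # Element-wise multiplication (not a dot product)
# 3 3 20 1
```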

## Exercise

#### Question #1

What will happen when executing the following code:

x <- c(1, 3, 5, 1)  # vector 1
y <- c(3, 1)        # vector 2

x + y

## Exercise

#### Question #2

What will happen when executing the following code:

x <- c(1, 3, 5, 1)  # vector 1
y <- c(3, 1, 4)     # vector 2

x + y

## Matrix

• A matrix is a 2-dimensional structure made of rows and columns.
• The upper left element is located at row 1 and column 1.

$A_{m,n} = \begin{bmatrix} a_{1,1} & a_{1,2} & \cdots & a_{1,n} \\ a_{2,1} & a_{2,2} & \cdots & a_{2,n} \\ \vdots & \vdots & \ddots & \vdots \\ a_{m,1} & a_{m,2} & \cdots & a_{m,n} \end{bmatrix}$

## Matrix

To understand how matrices work in R, let's create the following matrix A.

set.seed(1234)

# Create a 5 by 5 matrix
A <- matrix(data = sample(25), nrow = 5, ncol = 5)

A
##      [,1] [,2] [,3] [,4] [,5]
## [1,]    3   13   11   16    6
## [2,]   15    1    8   25   18
## [3,]   24    5    4   10   21
## [4,]   14   12   17    2   22
## [5,]   19    9   20    7   23
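Because sample() draws random numbers, the values in A depend on the random number generator and can differ between R versions. As a deterministic sketch, the hypothetical matrix B below shows that matrix() fills values column by column by default, and that dim() reports the number of rows and columns.

```r
B <- matrix(data = 1:6, nrow = 2, ncol = 3)

B
#      [,1] [,2] [,3]
# [1,]    1    3    5
# [2,]    2    4    6

dim(B)   # Number of rows, then number of columns
# 2 3
```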

## Matrix indexes

Elements of a matrix can be accessed using the following scheme:

A[m, n] # Element located at row m and column n

For example, the value at position $$A_{3,1}$$ is:

A[3, 1] # Return the value at row 3 and column 1
##  24

## Matrix indexes

If the row index is omitted, the whole column is returned as a vector; if the column index is omitted, the whole row is returned.

A[1, ] # Returns all elements from the first row
##   3 13 11 16  6
A[, 4] # Returns all elements from the 4th column
##  16 25 10  2  7
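Row and column selections can be combined to extract a sub-matrix. This sketch uses a hypothetical matrix M with predictable values rather than the random matrix A:

```r
M <- matrix(data = 1:25, nrow = 5, ncol = 5)   # Filled column by column

M[3, 1]            # Row 3, column 1
# 3
M[1:2, c(1, 3)]    # Rows 1 to 2, columns 1 and 3
#      [,1] [,2]
# [1,]    1   11
# [2,]    2   12
```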

## Exercise

#### Question #1

What is the value of:

A

#### Question #2

What is the value of:

A[, -1]

## Missing values

Missing values in R are represented by NA (not available). Most operations performed on NA values return NA.

# Create a vector with one NA at the 4th position
x <- c(1, 2, 3, NA, 5)

x + 1
##   2  3  4 NA  6

NA can be easily identified with the is.na() function.

is.na(x)
##  FALSE FALSE FALSE  TRUE FALSE
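Since is.na() returns a logical vector, it can be combined with other functions to count, locate, or drop missing values. A minimal sketch:

```r
x <- c(1, 2, 3, NA, 5)

sum(is.na(x))     # How many values are missing?
# 1
which(is.na(x))   # At which positions?
# 4
x[!is.na(x)]      # Keep only the non-missing values
# 1 2 3 5
```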

## Operations on NA

Most R functions will return NA if such values are present in the data.

x <- c(1, 2, 3, NA, 5)
mean(x) # Returns NA
##  NA

You have to explicitly tell R what to do with NA. This is often done using the na.rm argument.

mean(x, na.rm = TRUE) # Remove NA values before calculating the mean
##  2.75
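Several other summary functions accept the same argument. A short sketch:

```r
x <- c(1, 2, 3, NA, 5)

sum(x)                 # NA
sum(x, na.rm = TRUE)   # 11
max(x, na.rm = TRUE)   # 5
```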

## Logical operators

Logical operators are another important concept. In R, these operators return TRUE or FALSE (or NA when a missing value is involved).

| Operator | Description |
|---|---|
| < | less than |
| > | greater than |
| <= | less than or equal to |
| >= | greater than or equal to |
| == | equal to |
| != | not equal to |
| \| | element-wise or |
| & | element-wise and |
| ! | not |

## Some examples

1 < 0
##  FALSE
x <- c(1, 2, 3, 4, 5, 6)
x < 4
##   TRUE  TRUE  TRUE FALSE FALSE FALSE
x != 4
##   TRUE  TRUE  TRUE FALSE  TRUE  TRUE
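A logical vector such as x < 4 can also be placed inside square brackets to keep only the elements for which the condition is TRUE. A sketch:

```r
x <- c(1, 2, 3, 4, 5, 6)

x[x < 4]       # Keep the elements smaller than 4
# 1 2 3
x[x != 4]      # Keep everything except 4
# 1 2 3 5 6
```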

## Boolean

It is important to know that: 0 == FALSE and 1 == TRUE.

0 == FALSE # 0 equal to FALSE
##  TRUE
1 == TRUE  # 1 equal to TRUE
##  TRUE

Logical operators can also be used on characters or strings.

"Hi" == "hi" # R is case sensitive
##  FALSE
"A" > "B" # Strings are compared in alphabetical order (the exact rules depend on the locale)
##  FALSE
"AB" < "AC"
##  TRUE
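One practical consequence of TRUE behaving like 1 and FALSE like 0 is that logical vectors can be summed or averaged. A sketch with an illustrative vector named passed:

```r
passed <- c(TRUE, FALSE, TRUE, TRUE)

sum(passed)    # Number of TRUE values
# 3
mean(passed)   # Proportion of TRUE values
# 0.75
```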

## Logical table

Logical operators can also be combined.

| A | B | AND (A & B) | OR (A \| B) | NOT (!A) |
|---|---|---|---|---|
| FALSE | FALSE | FALSE | FALSE | TRUE |
| FALSE | TRUE | FALSE | TRUE | TRUE |
| TRUE | FALSE | FALSE | TRUE | FALSE |
| TRUE | TRUE | TRUE | TRUE | FALSE |

## Examples

Logical operators can also be combined.

(1 < 2) & (2 < 1) # If 1 < 2 AND 2 < 1.  TRUE and FALSE = FALSE
##  FALSE
(1 < 2) | (2 < 1) # If 1 < 2 OR 2 < 1.  TRUE or FALSE = TRUE 
##  TRUE
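The & and | operators also work element by element on vectors. A sketch:

```r
x <- 1:6

(x > 2) & (x < 5)    # Both conditions must hold
# FALSE FALSE TRUE TRUE FALSE FALSE
(x < 2) | (x > 5)    # At least one condition must hold
# TRUE FALSE FALSE FALSE FALSE TRUE
```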

## Exercise

#### Question #1

What will be the result of the following code?

!(10 < 2) | (2 > 1) & ("B" > "A") 

## Functions

A function is a named block of code that runs a series of commands and returns the computed result.

The user usually does not need to see the source code of a function, since only the result is of interest.

## Schema of a function

For example, in the following code, the mean() and rnorm() functions are used to calculate the average of 10 randomly generated numbers.

mean(rnorm(10))
##  0.1565971

We know what it does, but we do not know how it has been programmed internally.

## Parameters order

The order of the parameters passed to a function is important. There are two ways for passing parameters to a function.

### Unnamed approach

add(1, 2, 3) # R will automatically set x = 1, y = 2, z = 3

### Named approach

add(x = 1, y = 2, z = 3) # The named approach is usually a better way
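The add() calls above assume a function that is not defined in this material. To make the difference concrete, here is a sketch with a hypothetical function power(), where the order of the arguments changes the result:

```r
# Hypothetical function used only for illustration
power <- function(base, exponent) {
  return(base^exponent)
}

power(2, 3)                     # Unnamed: base = 2, exponent = 3
# 8
power(3, 2)                     # Unnamed, swapped: base = 3, exponent = 2
# 9
power(exponent = 3, base = 2)   # Named: the order no longer matters
# 8
```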

## Structure of a function

my_function <- function(param1, param2, ...) {
  # Do something super cool with param1 and param2

  # Return the result(s) of what you did in the function
  return(...)
}

The function return() is used to return the final result(s) of a function.

Let's create a function that will sum 3 values.

my_first_r_function <- function(x, y, z) {
  # Do something here with arguments x, y, z
  result <- x + y + z

  # Return the value of result
  return(result)
}
# x = 1, y = 2, z = 3
my_first_r_function(1, 2, 3)
##  6

# x = -3.141593, y = 0.9092974, z = 32
my_first_r_function(-pi, sin(2), 32)
##  29.7677
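The same structure applies to any calculation. As a further sketch, the hypothetical function below converts a temperature from degrees Celsius to degrees Fahrenheit:

```r
# Hypothetical function used only for illustration
celsius_to_fahrenheit <- function(celsius) {
  fahrenheit <- celsius * 9 / 5 + 32
  return(fahrenheit)
}

celsius_to_fahrenheit(100)        # 212
celsius_to_fahrenheit(c(0, 37))   # 32.0 98.6
```

Because arithmetic in R is element-wise, the function also accepts a vector of temperatures without any change.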

## Quick-RStudio tip

To rapidly create a new function, start typing “fun”, press Tab, and select the snippet.

## Required vs optional parameters

• Sometimes, a function needs a minimum number of parameters to be invoked. These are called required parameters.

• On the other hand, optional parameters have default values and can be omitted.

## Required parameters

In some cases, the parameters of a function must be provided by the user.

# x = 1, y = 2, z = 3
my_first_r_function(1, 2, 3) # Call the function with 3 parameters.
##  6
# x = 1, y = 2, z = ???
my_first_r_function(1, 2) # Call the function with only 2 parameters.
## Error in my_first_r_function(1, 2): argument "z" is missing, with no default

## Optional parameters

Optional parameters mean that the function has been written with default values for some of its parameters. If the user does not specify them, the default values are used.

my_first_r_function <- function(x, y, z = 2) {
  result <- x + y + z
  return(result)
}
my_first_r_function(1, 2) # Same as x = 1, y = 2, z = 2
##  5
my_first_r_function(1, 2, 1) # Same as x = 1, y = 2, z = 1
##  4
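Many built-in functions work this way. For example, round() has an optional digits argument whose default value is 0:

```r
round(pi)               # digits defaults to 0
# 3
round(pi, digits = 2)   # Override the default
# 3.14
```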

## Control flow

Control flow determines which parts of your code are executed, and in what order, depending on one or more conditions.

For example, you could tell R to execute one particular task if the value of a variable is greater than a specific threshold.

## if-then-else

An if statement makes it possible to execute a section of code only if a certain condition is TRUE.

if (TRUE) {
  "This will be executed."
}

It is also possible to add an else clause to specify what to do if the condition is not met.

if (TRUE) {
  "This will be executed."
} else {
  "Otherwise, this will be executed."
}

## Examples

if (1 < 3) {
  print("A")
} else {
  print("B")
}
##  "A"
if ("abc" == "ABC") {
  print("abc is equal to ABC")
}
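Several conditions can be chained with else if. A sketch with an illustrative variable named temperature:

```r
temperature <- 12

if (temperature < 0) {
  label <- "freezing"
} else if (temperature < 20) {
  label <- "mild"
} else {
  label <- "warm"
}

label
# "mild"
```

The conditions are tested from top to bottom, and only the first branch whose condition is TRUE is executed.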

## ifelse

For simple conditions, it can be convenient to use the ifelse() function.

# Test if pi is less than 3. If yes, return "A"; if no, return "B"
ifelse(test = pi < 3, yes = "A", no = "B")
##  "B"
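Unlike if, ifelse() tests every element of a vector and returns a vector of the same length. A sketch with illustrative names:

```r
x <- c(1, 5, 3, 8)

size <- ifelse(test = x > 3, yes = "big", no = "small")

size
# "small" "big" "small" "big"
```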

## Quick-RStudio tip

To rapidly create an if condition, start typing “if”, press Tab, and select the snippet.

## More complex control

#### Question #1:

What will be the value of b after this code is executed?

a <- TRUE
b <- 10

if (a == TRUE & b != 9) {
  b <- b + 1
} else {
  b <- b - 1
}

## Loops

Loops make it possible to repeat one or more operations a number of times. There are two main types of loops in R: (1) the for loop and (2) the while loop. In both cases, everything inside the curly brackets {} is repeated.

## For loops

For loops are usually used when we know in advance how many iterations are needed. This type of loop depends on a counter that determines when the looping should stop. For loops are defined as follows:

for (counter in vector) {
  # Stuff to do here
}

At each iteration, the variable counter takes the value of the next element of vector.

## Example

For example, the following for loop will be executed five times.

# Loop through all elements of a numeric vector
for (i in 1:5) {
  print(i)
}
##  1
##  2
##  3
##  4
##  5

## Example

# Loop through all elements of a character vector
for (i in c("a", "c", "b")) {
  print(i)
}
##  "a"
##  "c"
##  "b"
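A loop can also run over the positions of a vector, which is useful for storing one result per element. In this sketch, seq_along() generates the positions 1, 2, 3, and the names values and squares are illustrative:

```r
values <- c(2, 4, 6)
squares <- numeric(length(values))   # Vector of zeros to hold the results

for (i in seq_along(values)) {
  squares[i] <- values[i]^2          # Store the square at the same position
}

squares
# 4 16 36
```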

## Quick-RStudio tip

To rapidly create a for loop, start typing “for”, press Tab, and select the snippet.

## Exercise

#### Exercise #1

First, use the following code to create a numeric vector with 100 random numbers.

set.seed(1234)
vec <- rnorm(100)

Create a loop that will calculate the average of all elements in the vector vec. The answer should be: -0.1567617.

## while loop

While loops are usually used when we do not know in advance how many iterations are needed. A while loop executes a block of commands until its condition is no longer satisfied.

## Example

In this example, the loop is executed as long as $$x \geq 2$$.

x <- 10

while (x >= 2) {
  print(x)    # Print the value of x

  x <- x - 2  # Decrease the value of x by 2
}
##  10
##  8
##  6
##  4
##  2
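As a second sketch, the loop below keeps doubling x for as long as it is smaller than 100. The stopping point depends on the values themselves rather than on a fixed counter.

```r
x <- 1

while (x < 100) {
  x <- x * 2   # Double the value of x
}

x
# 128
```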

## Quick-RStudio tip

To rapidly create a while loop, start typing “while”, press Tab, and select the snippet.

## Exercise

#### Exercise #1

Create a while loop that will divide a given number (ex: $$x = 12334$$) by 3 until the result is smaller than 0.001.

x <- 12334

while(...){
}

#### Exercise #2

How many iterations were performed?

## Packages

There are already many functions defined in R.

R has a rapidly growing community whose users develop additional functions, which are grouped into packages or libraries.
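A package has to be installed once and then loaded in each session with library(). In this sketch, ggplot2 is only an example of a package name, and the install line is commented out because it needs an internet connection. The stats package ships with R, so it can be loaded without installing anything.

```r
# install.packages("ggplot2")   # Run once to download and install a package

library(stats)                  # Load a package for the current session

# The package::function() form calls a function from a specific package
stats::median(c(1, 3, 5))
# 3
```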