Introduction to R

Basics

In its simplest form, R can be used as a calculator.

## [1] 2
## [1] 2.693147
## [1] 1
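
The commands behind this output are not shown. As an illustrative guess, expressions such as the following print the same three values:

```r
# Basic arithmetic typed directly at the R console
1 + 1        # addition
2 + log(2)   # log() is the natural logarithm
cos(0)       # trigonometric functions expect radians
```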

Variables

Printed results are interesting, but not very useful if we cannot reuse them.

In R, two operators (= and <-) can be used to assign a value to a variable. By convention, <- is preferred. In RStudio, the keyboard shortcut Alt + - inserts the <- symbol.
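
A minimal sketch of both assignment operators; the variable names x, y and z are arbitrary choices for illustration:

```r
x <- 5      # preferred assignment operator
y = 3       # also valid, but less conventional
z <- x + y  # stored values can be reused in later expressions
z           # typing a name prints its value
```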

Variable names

You have to be careful when choosing a variable name. There are a few rules to respect.

  1. Can contain both letters and numbers, but must start with a letter (good: x123, bad: 1x23)

  2. Should not contain special characters (à, è, æ, ø, å, ?, %, ’, $, etc.)

  3. Should not be the same as an R function or predefined variable name (cos, sin, pi, min, etc.)

Basic data types

  • numeric: 12.3, pi, 1, etc.
  • character: "a", "text", "long text with space"
  • boolean/logical: TRUE/1, FALSE/0
  • date: 2016-01-31 (use ISO format, large -> small)

We can use the class() function to get the type of a variable.

## [1] "numeric"
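
As a small sketch, class() can be called on one value of each basic type listed above (the values are taken from that list):

```r
class(12.3)                   # "numeric"
class("text")                 # "character"
class(TRUE)                   # "logical"
class(as.Date("2016-01-31"))  # "Date"
```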

Exercise #1

What is the class of x?

Vectors

  • Vectors are arrays of successive elements. Elements of a vector can be, for example: numeric, character, logical, date.
  • Vectors are created using the c() function with , used to separate elements.
## [1] "logical"
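
The original command is not shown. A sketch of creating vectors with c(), using made-up variable names, could look like this:

```r
numbers <- c(1, 5, 11)
words <- c("a", "text", "long text with space")
flags <- c(TRUE, FALSE, TRUE)
class(flags)     # "logical"
length(numbers)  # 3
```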

Exercise

Question #1

What is the class of this vector?

Elements of vector

  • Particular elements of a vector can be accessed using the [ ] operator.
  • The first element of a vector is located at position 1.
## [1] 1
## [1] 11
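
The vector behind this output is not shown. Assuming a small example vector x, indexing with square brackets works as follows:

```r
x <- c(1, 5, 11)   # assumed example values
x[1]               # first element: 1
x[3]               # third element: 11
x[c(1, 3)]         # several positions at once: 1 11
```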

Exercise

Given the vector x:

Question #1

What is the value of:

Exercise

Given the vector x:

Question #2

What is the value of:

Elements of vector

We can also apply a function to all elements of a vector.

## [1]  0.5  2.0 -0.5  5.0  2.7
## [1]  0.8414710 -0.7568025 -0.8414710 -0.5440211 -0.7727645
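
The input vector is not shown. The values below are a reconstruction chosen to be consistent with the printed output:

```r
x <- c(1, 4, -1, 10, 5.4)   # reconstructed from the output above
x / 2                       # every element is divided by 2
sin(x)                      # sin() is applied to every element
```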

Vectors and matrices of the same size can be operated on arithmetically, with the operations acting element-wise.

## [1] 4 4 9 2
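
The operands are not shown. As a sketch, two assumed vectors of equal length that give the same sum are:

```r
x <- c(1, 2, 3, 1)   # assumed values
y <- c(3, 2, 6, 1)   # assumed values
x + y                # element-wise sum: 4 4 9 2
x * y                # element-wise product: 3 4 18 1
```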

Exercise

Question #1

What will happen when executing the following code:

Exercise

Question #2

What will happen when executing the following code:

Matrix

  • A matrix is a 2-dimensional entity containing a number of rows and columns.
  • The upper left element is located at row 1 and column 1.

\[ A_{m,n} = \begin{bmatrix} a_{1,1} & a_{1,2} & \cdots & a_{1,n} \\ a_{2,1} & a_{2,2} & \cdots & a_{2,n} \\ \vdots & \vdots & \ddots & \vdots \\ a_{m,1} & a_{m,2} & \cdots & a_{m,n} \end{bmatrix} \]

To understand how matrices work in R, let's create the following matrix A.

##      [,1] [,2] [,3] [,4] [,5]
## [1,]    3   13   11   16    6
## [2,]   15    1    8   25   18
## [3,]   24    5    4   10   21
## [4,]   14   12   17    2   22
## [5,]   19    9   20    7   23
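
The command that produced A is not shown (it may have drawn the values at random). One way to build the same matrix is to type the values in explicitly, as in this sketch:

```r
A <- matrix(
  c( 3, 13, 11, 16,  6,
    15,  1,  8, 25, 18,
    24,  5,  4, 10, 21,
    14, 12, 17,  2, 22,
    19,  9, 20,  7, 23),
  nrow = 5,
  byrow = TRUE   # fill row by row; the default is column by column
)
dim(A)           # number of rows and columns: 5 5
```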

Matrix indexes

Elements of a matrix can be accessed using the following scheme:

For example, the value at position \(A_{3,1}\) is:

## [1] 24

If the column index is omitted, the whole row is returned as a vector; if the row index is omitted, the whole column is returned.

## [1]  3 13 11 16  6
## [1] 16 25 10  2  7
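
The indexing commands are not shown. Assuming the usual A[row, column] form, the three results above can be obtained as follows (A is re-entered here so the sketch runs on its own):

```r
# Same values as the matrix A printed earlier, entered row by row
A <- matrix(c(3, 13, 11, 16, 6, 15, 1, 8, 25, 18, 24, 5, 4, 10, 21,
              14, 12, 17, 2, 22, 19, 9, 20, 7, 23),
            nrow = 5, byrow = TRUE)
A[3, 1]   # row 3, column 1: 24
A[1, ]    # column index omitted: all of row 1
A[, 4]    # row index omitted: all of column 4
```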

Exercise

Question #1

What is the value of:

Question #2

What is the value of:

Missing values

Missing values in R are represented by NA (not available). Most operations performed on NA values return NA.

## [1]  2  3  4 NA  6

NA can be easily identified with the is.na() function.

## [1] FALSE FALSE FALSE  TRUE FALSE
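
The commands are not shown. A vector consistent with both outputs above is assumed in this sketch:

```r
x <- c(1, 2, 3, NA, 5)   # assumed values, consistent with the output above
x + 1                    # the NA propagates: 2 3 4 NA 6
is.na(x)                 # FALSE FALSE FALSE TRUE FALSE
```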

Operations on NA

Most R functions will return NA if such values are present in the data.

## [1] NA

You have to explicitly tell R what to do with NA. This is often done using the na.rm argument.

## [1] 2.75
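
As a sketch, assuming the same kind of vector with one missing value, mean() behaves like this with and without na.rm:

```r
x <- c(1, 2, 3, NA, 5)   # assumed values
mean(x)                  # NA, because one element is missing
mean(x, na.rm = TRUE)    # NA is dropped first: (1 + 2 + 3 + 5) / 4 = 2.75
```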

Logical operators

Logical operators are another important concept. When using these operators in R, the result is always a logical value: TRUE or FALSE (or NA if a missing value is involved).

Operator   Description
<          less than
>          greater than
<=         less than or equal to
>=         greater than or equal to
==         equal to
!=         not equal to
|          element-wise or
&          element-wise and
!          not

Some examples

## [1] FALSE
## [1]  TRUE  TRUE  TRUE FALSE FALSE FALSE
## [1]  TRUE  TRUE  TRUE FALSE  TRUE  TRUE
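
The expressions behind these results are not shown. As an assumed example, comparisons like the following give output of the same shape:

```r
5 < 3     # FALSE
x <- 1:6  # assumed example vector
x <= 3    # TRUE TRUE TRUE FALSE FALSE FALSE
x != 4    # TRUE TRUE TRUE FALSE TRUE TRUE
```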

Boolean

It is important to know that: 0 == FALSE and 1 == TRUE.

## [1] TRUE
## [1] TRUE
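
A short sketch of this equivalence. The sum() line is an added illustration: because TRUE counts as 1, summing a logical vector counts its TRUE values.

```r
0 == FALSE                  # TRUE
1 == TRUE                   # TRUE
sum(c(TRUE, FALSE, TRUE))   # 2
```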

Logical operators can also be used on characters or strings.

## [1] FALSE
## [1] FALSE
## [1] TRUE
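
The compared strings are not shown. As an assumed illustration, equality tests on character values behave like this:

```r
"a" == "b"         # FALSE
"text" == "Text"   # FALSE: the comparison is case-sensitive
"text" == "text"   # TRUE
```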

Logical table

The table below shows how two logical values A and B combine under AND, OR, and NOT.

A      B      AND (A & B)  OR (A | B)  NOT (!A)
FALSE  FALSE  FALSE        FALSE       TRUE
FALSE  TRUE   FALSE        TRUE        TRUE
TRUE   FALSE  FALSE        TRUE        FALSE
TRUE   TRUE   TRUE         TRUE        FALSE

Examples

Logical operators can also be combined.

## [1] FALSE
## [1] TRUE
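
The original expressions are not shown. With an assumed value of x, combining two comparisons looks like this:

```r
x <- 5              # assumed value
(x > 3) & (x < 4)   # TRUE & FALSE gives FALSE
(x > 3) | (x < 4)   # TRUE | FALSE gives TRUE
!(x > 3)            # negation of TRUE gives FALSE
```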

Exercise

Question #1

What will be the result of the following code?

Functions

A function is a procedure that runs a sequence of commands and returns only the computed result.

The user does not see the source code of the function, since he or she is (usually) only interested in the result.

Schema of a function

For example, within the following code, the mean() and rnorm() functions are used to calculate the average of 10 randomly generated numbers.

## [1] 0.1565971

We know what it does, but we do not know how it has been programmed internally.
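
A sketch of such a call is shown below. The seed is a hypothetical addition that only makes the run repeatable, so the result will differ from the value printed above.

```r
set.seed(1)           # hypothetical seed, not part of the original example
values <- rnorm(10)   # 10 draws from a standard normal distribution
mean(values)          # average of the 10 draws
```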

Definition of a function

Parameters order

The order of the parameters passed to a function is important. There are two ways for passing parameters to a function.

Unnamed approach

Named approach
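
The examples for the two approaches are not shown. As a sketch using rnorm(), whose arguments are n, mean and sd, both approaches give the same result when the seed is reset before each call:

```r
set.seed(42)
a <- rnorm(5, 10, 2)                   # unnamed: matched by position (n, mean, sd)
set.seed(42)
b <- rnorm(sd = 2, n = 5, mean = 10)   # named: the order no longer matters
identical(a, b)                        # TRUE
```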

Structure of a function

The function return() is used to return the final result(s) of a function.

Your first R function

Let's create a function that will sum 3 values.

## [1] 6
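
The function definition is not shown. The name my_first_r_function and the argument z appear in an error message later in this document; the body below is an assumed sketch:

```r
my_first_r_function <- function(x, y, z) {
  result <- x + y + z   # assumed body: add the three arguments
  return(result)
}
my_first_r_function(1, 2, 3)   # 6
```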

Your first R function

## [1] 29.7677

Quick-RStudio tip

To rapidly create a new function, start typing “fun”, press Tab, and select the snippet.

Required vs optional parameters

  • Sometimes, a function needs a minimum number of parameters to be invoked. These are called required parameters.

  • On the other hand, optional parameters have default values and can be omitted.

Required parameters

In some cases, the parameters of a function must be provided by the user.

## [1] 6
## Error in my_first_r_function(1, 2): argument "z" is missing, with no default
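
A sketch of how this error arises, assuming the function simply adds its three arguments. tryCatch() is used here only so the script keeps running and the message can be inspected:

```r
my_first_r_function <- function(x, y, z) {
  return(x + y + z)   # assumed body
}
my_first_r_function(1, 2, 3)   # all three arguments supplied: 6
# z is not supplied and has no default, so the call fails
msg <- tryCatch(
  my_first_r_function(1, 2),
  error = function(e) conditionMessage(e)
)
msg   # the captured error message
```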

Optional parameters

Optional parameters mean that the function has been written with default values for some of its parameters. If the user does not specify them, the default values will be used.

## [1] 5
## [1] 4
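
The function used above is not shown. As a sketch, a hypothetical function add_values() with a default for its second parameter behaves like this:

```r
add_values <- function(x, y = 2) {   # hypothetical function; y defaults to 2
  return(x + y)
}
add_values(3)          # y omitted, default used: 5
add_values(3, y = 1)   # default overridden: 4
```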

Control flow

Control flow describes how code is evaluated according to one or more conditions while it is being executed.

For example, you could tell R to execute one particular task if the value of a variable is greater than a specific threshold.

if-then-else

An if condition offers the possibility to execute a code section if a certain condition is TRUE.

It is also possible to add an else clause to specify what to do if the condition is not met.

Examples

## [1] "A"
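
The example code is not shown. A minimal sketch of an if statement with an else clause, using an assumed value and threshold:

```r
x <- 10          # assumed value
if (x > 5) {     # assumed threshold
  print("A")     # runs when the condition is TRUE
} else {
  print("B")     # runs otherwise
}
```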

ifelse

For simple conditions, it can be convenient to use the ifelse() function.

## [1] "B"
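
The original call is not shown. As a sketch with assumed values, ifelse() takes a condition, a value for TRUE and a value for FALSE, and also works on whole vectors:

```r
x <- 3                         # assumed value
ifelse(x > 5, "A", "B")        # "B"
scores <- c(2, 8, 5, 9)        # assumed values
ifelse(scores > 5, "A", "B")   # "B" "A" "B" "A"
```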

Quick-RStudio tip

To rapidly create an if condition, start typing “if”, press Tab, and select the snippet.

More complex control

Question #1:

What will be the value of b after this code is executed?

Loops

Loops allow repeating one or more operations a number of times. There are two main types of loop in R: (1) the for loop and (2) the while loop. In both cases, everything inside the curly brackets {} will be repeated a certain number of times.

For loops

For loops are usually used when we know in advance how many loops we need to do. This type of loop depends on a counter that determines when the looping should stop. For loops are defined as follows:

At each iteration, the variable counter will take a new value based on all elements contained in vector.

Example

For example, the following for loop will be executed five times.

## [1] 1
## [1] 2
## [1] 3
## [1] 4
## [1] 5
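
The loop itself is not shown. A for loop that prints these five values can be sketched as follows; the counter name i is an arbitrary choice:

```r
# General form: for (counter in vector) { commands }
for (i in 1:5) {
  print(i)   # i takes the values 1, 2, 3, 4, 5 in turn
}
```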

Example

## [1] "a"
## [1] "c"
## [1] "b"
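
The code for this example is not shown. A loop over a character vector that prints the same values could look like this sketch (the variable names are assumptions):

```r
letters_vec <- c("a", "c", "b")   # assumed vector, in the printed order
for (letter in letters_vec) {
  print(letter)                   # one element per iteration
}
```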

Quick-RStudio tip

To rapidly create a for loop, start typing “for”, press Tab, and select the snippet.

Exercise

Exercise #1

First, use the following code to create a numeric vector with 100 random numbers.

Create a loop that will calculate the average of all elements in the vector vec. The answer should be: -0.1567617.

while loop

While loops are usually used when we do not know in advance how many loops we need to do. A while loop will execute a block of commands until the condition is no longer satisfied.

Example

In this example, the loop is executed as long as \(x \geq 2\).

## [1] 10
## [1] 8
## [1] 6
## [1] 4
## [1] 2
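
The loop is not shown. Assuming x starts at 10 and decreases by 2 at each iteration, the following sketch prints the same values:

```r
x <- 10            # assumed starting value
while (x >= 2) {   # condition is checked before every iteration
  print(x)
  x <- x - 2       # assumed step; without it the loop would never end
}
```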

Quick-RStudio tip

To rapidly create a while loop, start typing “while”, press Tab, and select the snippet.

Exercise

Exercise #1

Create a while loop that repeatedly divides a given number (e.g., \(x = 12344\)) by 3 until the result is smaller than 0.001.

Exercise #2

How many iterations were performed?

Packages

There are already many functions defined in R.

R has a rapidly growing community: users develop additional functions, which are grouped into packages or libraries.

Packages in R

The number of packages downloaded every day is impressive.

Most downloaded packages

R packages

We first need to install and load libraries before we can use them.

Installation

Loading

Updating
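
The commands for these three steps are not shown. Using the vegan package mentioned in this document as the example, a typical sequence could be sketched as follows; installing and updating need an internet connection and a CRAN mirror:

```r
# Installation: done once per machine (downloads the package from CRAN)
install.packages("vegan")

# Loading: done in every R session that uses the package
library(vegan)

# Updating: checks CRAN for newer versions of installed packages
update.packages()
```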

Packages

There are a ton of packages on CRAN (10 000+) and not all of them are good.

The frequency at which a package is updated is a good first sign of quality. The date at which a package has been updated can be found in the DESCRIPTION file.

vegan package

https://cran.r-project.org/web/packages/available_packages_by_name.html

https://cran.r-project.org/web/packages/vegan/index.html

Using R help

Documentation is one of the most important aspects of a good package.

Style guide

http://r-pkgs.had.co.nz/style.html

Good coding style is like using correct punctuation. You can manage without it, but it sure makes things easier to read.

You do not have to follow every rule, but be consistent.

Naming convention

Object names

Spacing

Line wrapping

File names

# Good
fit_models.R
utility_functions.R

# Bad
foo.r
stuff.r

Exercises

Exercise #1

Create the following vector and find a way to extract odd numbers. Hint: use the seq() function.

##  [1]  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15

Exercise #2

Create a matrix A using the following command. Then, write a command to find how many values are smaller than 0. Hints: use the sum() and which() functions.

##            [,1]       [,2]        [,3]       [,4]       [,5]
## [1,] -1.2070657  0.5060559 -0.47719270 -0.1102855  0.1340882
## [2,]  0.2774292 -0.5747400 -0.99838644 -0.5110095 -0.4906859
## [3,]  1.0844412 -0.5466319 -0.77625389 -0.9111954 -0.4405479
## [4,] -2.3456977 -0.5644520  0.06445882 -0.8371717  0.4595894
## [5,]  0.4291247 -0.8900378  0.95949406  2.4158352 -0.6937202

Exercise #3

Create the subtraction() function using the following code:

What will be the outputs of these commands?

Exercise #4

Create a vector of integers using the following code:

How many even numbers (0, 2, 4, …) are there in the vector?

Hint: Type ?Arithmetic and read about mod.

You can also find hints here: https://en.wikipedia.org/wiki/Modulo_operation