Do you remember our favourite package? It’s tidyverse. Load it now.
library(tidyverse)
A function in R is a command. It is a pre-made set of instructions
that is easily called upon by using the name of the function. Functions
in R have parentheses after their name. For example, sum() is a function.
The parentheses contain the arguments for a function. Functions can have more than one argument. How many arguments can sum take?
One?
sum(1)
## [1] 1
Two?
sum(1,2)
## [1] 3
More?
sum(1,2,3,4,5,6,7,8,9,10)
## [1] 55
The help function gives you lots of information about a function, including details of its arguments.
help(sum)
Description
sum returns the sum of all the values present in its arguments.
Arguments
... numeric or complex or logical vectors.
na.rm logical. Should missing values (including NaN) be removed?
help(name) and ?name are two ways to call help for things in R, where name is the function you want to read about.
?sum
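The na.rm argument listed in the help above can be tried out directly. This is a small sketch, assuming we deliberately put a missing value (NA) among the numbers; the values themselves are made up for illustration.

```r
# A missing value (NA) makes the whole sum missing
sum(1, 2, NA)
## [1] NA

# Setting na.rm = TRUE removes missing values before adding
sum(1, 2, NA, na.rm = TRUE)
## [1] 3
```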
Functions usually aren’t as forgiving as sum(), and most expect
particular arguments. Leaving out a required argument will result in an
error or other unexpected behaviour. It’s always good to look at the
help documentation for a function and scroll down to check the examples
for use.
Let’s look at another function.
help(seq)
Description
Generate regular sequences...
Arguments
... arguments passed to or from methods.
from, to the starting and (maximal) end values of the sequence. Of length 1 unless just from is supplied as an unnamed argument.
by number: increment of the sequence.
length.out desired length of the sequence. A non-negative number, which for seq and seq.int will be rounded up if fractional.
along.with take the length from the length of this argument.
The seq() function creates a sequence, and it has several arguments
that control how the sequence is built. The first arguments are from
and to. These define the boundaries of the sequence.
seq(from = 1, to = 8)
## [1] 1 2 3 4 5 6 7 8
Notice that you can actually type the names of the arguments and use =
to specify their values. The seq() function also has a by argument,
which allows you to specify the size of the sequence’s increments:
seq(from = 10, to = 100, by = 5)
## [1] 10 15 20 25 30 35 40 45 50 55 60 65 70 75 80 85 90 95 100
Arguments also have default positions in a function. Compare:
# not naming arguments
seq(1,10)
## [1] 1 2 3 4 5 6 7 8 9 10
# naming arguments
seq(from = 1, to = 10)
## [1] 1 2 3 4 5 6 7 8 9 10
# naming arguments in a different order
seq(to = 10, from = 1)
## [1] 1 2 3 4 5 6 7 8 9 10
What is the default argument order for seq()?
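One way to explore this question is to leave the names off and see whether the result changes. The sketch below reuses the seq() calls from earlier; identical() is a base R function that returns TRUE when two objects are exactly the same.

```r
# Unnamed values are matched to arguments by position
identical(seq(10, 100, 5), seq(from = 10, to = 100, by = 5))
## [1] TRUE

# You can also mix the two styles: positions first, then names
seq(1, 10, by = 3)
## [1]  1  4  7 10
```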
A variable is an R object that you create. It can be the result of a
function, the result of data being loaded in, or the result of you
manually typing the values. To create a variable, write its name on the
left side, followed by <-, followed by the value of the variable.
# value is text
dogs <- 'cool'
cats <- 'drool'
# type the variable name to see the value.
dogs
## [1] "cool"
cats
## [1] "drool"
# value is numbers
new.zealand <- 1
australia <- 2
new.zealand
## [1] 1
australia
## [1] 2
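A variable can also hold the result of a function, as mentioned above. Here is a minimal sketch; the name total is made up for this example.

```r
# value is the result of a function
total <- sum(1, 2, 3)
total
## [1] 6

# a variable can be used anywhere its value could be
total * 2
## [1] 12
```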
Can you use seq() to make a variable that holds ten numbers between 2
and 20? Save the results as a variable named ten.twenty
ten.twenty <- seq(2,20,2)
ten.twenty
## [1] 2 4 6 8 10 12 14 16 18 20
Check the help for the c() function.
?c
Can you use the c() function to make a variable that is the first ten
letters of the English alphabet (a through j)? Save the results as a
variable named ten.letters. Remember to use quotes around your letters.
ten.letters <- c('a','b','c','d','e','f','g','h','i','j')
ten.letters
## [1] "a" "b" "c" "d" "e" "f" "g" "h" "i" "j"
Search the help for the length() function.
?length
What is the length of ten.twenty and ten.letters?
length(ten.twenty)
## [1] 10
length(ten.letters)
## [1] 10
Using JUST the length() function and the ten.twenty and ten.letters
variables, can you create a new variable named one.hundred which is the
value 100?
one.hundred <- length(ten.twenty) * length(ten.letters)
one.hundred
## [1] 100
Pipes are that weird %>% thing. We use them to chain functions
together. This helps us write more readable code, saves time, avoids
having to rename variables, and helps with debugging. Knowing how
arguments work should help you understand the pipe a bit better now.
Let’s create a tibble of our variables. Name the tibble lals.pipes.
Then look at the tibble with the glimpse() function.
lals.pipes <- tibble(numbers = ten.twenty, letters = ten.letters)
glimpse(lals.pipes)
## Rows: 10
## Columns: 2
## $ numbers <dbl> 2, 4, 6, 8, 10, 12, 14, 16, 18, 20
## $ letters <chr> "a", "b", "c", "d", "e", "f", "g", "h", "i", "j"
Now let’s repeat the same procedure but with a pipe. What is different about this code? What does this tell us about what pipes “do”?
lals.pipes <- tibble(numbers = ten.twenty, letters = ten.letters) %>%
  glimpse()
## Rows: 10
## Columns: 2
## $ numbers <dbl> 2, 4, 6, 8, 10, 12, 14, 16, 18, 20
## $ letters <chr> "a", "b", "c", "d", "e", "f", "g", "h", "i", "j"
So, pipes take an R object and pass it along as the first argument of the next function in the pipe. The first argument of many functions in R is the object to work on. Tidyverse is designed to take advantage of this.
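To see this in action outside of a tibble, here is a small sketch using the ten.twenty values from earlier (recreated so the example runs on its own). The piped and unpiped calls to sum() should give the same answer.

```r
library(tidyverse)

ten.twenty <- seq(2, 20, 2)

# without a pipe: the object is typed as the first argument
sum(ten.twenty)
## [1] 110

# with a pipe: the object on the left becomes the first argument
ten.twenty %>% sum()
## [1] 110

# any other arguments are typed as usual
ten.twenty %>% head(3)
## [1] 2 4 6
```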
To update an object with a pipe, start the pipe with the object and assign the result back to the same name, like this:
object <- object %>%
more functions here...
Let’s add a column to our data. The mutate() function adds variables
to a tibble. The first argument for mutate() is the name of the object,
and the second argument is the name of the new variable. However, the
second argument also requires a = and then the values to set the
variable to. For example, mutate(data, variable1 = 1). You can also put
a function inside the mutate() call, such as
mutate(data, variable_1 = sum(1,2)), which should demonstrate how
powerful mutate() is.
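Here is a minimal sketch of both forms. The tibble demo and its column x are made up for this example and are not part of the exercise.

```r
library(tidyverse)

# a small made-up tibble with three rows
demo <- tibble(x = c(1, 2, 3))

# first argument is the object, then new_name = value
demo <- mutate(demo, variable1 = 1, variable_1 = sum(1, 2))

# $ pulls one column out of a tibble;
# a single value is repeated for every row
demo$variable_1
## [1] 3 3 3
```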
Create a new variable in our lals.pipes called colour and assign it the
value “blue”. Then run the glimpse() function on lals.pipes.
lals.pipes <- lals.pipes %>%
  mutate(colour = 'blue')
glimpse(lals.pipes)
## Rows: 10
## Columns: 3
## $ numbers <dbl> 2, 4, 6, 8, 10, 12, 14, 16, 18, 20
## $ letters <chr> "a", "b", "c", "d", "e", "f", "g", "h", "i", "j"
## $ colour <chr> "blue", "blue", "blue", "blue", "blue", "blue", "blue", "blue"…
What happened to our “colour” variable?
You can make multiple variables with one mutate() function:
d1 <- lals.pipes %>%
  mutate(variable1 = seq(1,10),
         variable2 = seq(11,20))
glimpse(d1)
## Rows: 10
## Columns: 5
## $ numbers <dbl> 2, 4, 6, 8, 10, 12, 14, 16, 18, 20
## $ letters <chr> "a", "b", "c", "d", "e", "f", "g", "h", "i", "j"
## $ colour <chr> "blue", "blue", "blue", "blue", "blue", "blue", "blue", "blu…
## $ variable1 <int> 1, 2, 3, 4, 5, 6, 7, 8, 9, 10
## $ variable2 <int> 11, 12, 13, 14, 15, 16, 17, 18, 19, 20
Now for your challenge…
Look up the help for the rep() command.
Create a new variable called pipes.rule which is the value of lals.pipes.
Change colour so that instead of ten values of blue, it instead alternates between “yellow” and “green” (please use the rep() function). You will have to use c() inside the rep() function.
Create a variable called numbers2 in lals.pipes which is the same as numbers but in the opposite order (use the seq() function inside a mutate() function). You will have to use a negative by value.
Create a variable called numbers3 in lals.pipes which is the result of multiplying numbers and numbers2.
Use the mutate() function to do this.
Here is the answer with three mutate() calls:
# Look up help for rep, it stands for "replicate"
?rep
pipes.rule <- lals.pipes %>%
  mutate(colour = rep(c('yellow','green'),5)) %>%
  mutate(numbers2 = seq(20,2,-2)) %>%
  mutate(numbers3 = numbers * numbers2)
pipes.rule
## # A tibble: 10 × 5
##    numbers letters colour numbers2 numbers3
##      <dbl> <chr>   <chr>     <dbl>    <dbl>
##  1       2 a       yellow       20       40
##  2       4 b       green        18       72
##  3       6 c       yellow       16       96
##  4       8 d       green        14      112
##  5      10 e       yellow       12      120
##  6      12 f       green        10      120
##  7      14 g       yellow        8      112
##  8      16 h       green         6       96
##  9      18 i       yellow        4       72
## 10      20 j       green         2       40
Here is the answer with one mutate() call:
pipes.rule <- lals.pipes %>%
  mutate(colour = rep(c('yellow','green'),5),
         numbers2 = seq(20,2,-2),
         numbers3 = numbers * numbers2)
pipes.rule
## # A tibble: 10 × 5
##    numbers letters colour numbers2 numbers3
##      <dbl> <chr>   <chr>     <dbl>    <dbl>
##  1       2 a       yellow       20       40
##  2       4 b       green        18       72
##  3       6 c       yellow       16       96
##  4       8 d       green        14      112
##  5      10 e       yellow       12      120
##  6      12 f       green        10      120
##  7      14 g       yellow        8      112
##  8      16 h       green         6       96
##  9      18 i       yellow        4       72
## 10      20 j       green         2       40
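If the rep() or seq() calls in the answer are unclear, it may help to try them on their own, outside of mutate(). The sketch below compares the times and each arguments described in ?rep, using a shorter repeat count than the answer so the output is easy to read.

```r
# times repeats the whole vector, giving an alternating pattern
rep(c('yellow', 'green'), times = 2)
## [1] "yellow" "green"  "yellow" "green"

# each repeats every element in turn, which does not alternate
rep(c('yellow', 'green'), each = 2)
## [1] "yellow" "yellow" "green"  "green"

# a negative by makes seq() count down
seq(20, 2, by = -2)
## [1] 20 18 16 14 12 10  8  6  4  2
```

Leaving the second argument of rep() unnamed, as the answer does with rep(c('yellow','green'),5), behaves like times.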