Goal: To be able to define a function that takes arguments and returns a value. Overall, be able to explain why we should divide programs into small, single-purpose functions.
## What is a function?
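Functions gather a sequence of operations into a whole, preserving it for ongoing use. Functions provide a name we can remember and invoke them by, relief from the need to remember the individual operations, a defined set of inputs and expected outputs, and rich connections to the larger programming environment.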
As the basic building block of most programming languages, user-defined functions constitute programming as much as any single abstraction can. If you have written a function, you are a computer programmer.
Open a new R script (File > New File > R Script) and define a function:
my_sum <- function(a, b) {
  the_sum <- a + b
  return(the_sum)
}
Using (calling) this function:
my_sum(10, 15)
## [1] 25
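As a small illustration of how arguments are matched when calling a function, the sketch below calls my_sum by position and by name. The definition is repeated only so the snippet runs on its own; the variable names by_position and by_name are made up for this example.

```r
# Same definition as above, repeated so this snippet is self-contained.
my_sum <- function(a, b) {
  the_sum <- a + b
  return(the_sum)
}

# Arguments can be matched by position ...
by_position <- my_sum(10, 15)

# ... or by name, in which case the order does not matter.
by_name <- my_sum(b = 15, a = 10)

# Both calls give the same result.
identical(by_position, by_name)
```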
Components of a function:
body(my_sum)
## {
## the_sum <- a + b
## return(the_sum)
## }
formals(my_sum)
## $a
##
##
## $b
#args(my_sum) #for a more human readable version
environment(my_sum)
## <environment: R_GlobalEnv>
** show environment window - read only functions view **
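The components can also be inspected programmatically. A minimal sketch, assuming the same my_sum definition as above (arg_names is a name introduced here for illustration):

```r
# Same definition as above, repeated so this snippet is self-contained.
my_sum <- function(a, b) {
  the_sum <- a + b
  return(the_sum)
}

# formals() returns a list with one entry per argument;
# names() pulls out just the argument names.
arg_names <- names(formals(my_sum))
arg_names

# A user-defined function is an ordinary R object.
is.function(my_sum)
```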
Let’s define a function that converts Fahrenheit to Kelvin.
fahr_to_kelvin <- function(temp) {
  kelvin <- ((temp - 32) * (5 / 9)) + 273.15
  return(kelvin)
}
We define fahr_to_kelvin by assigning it the output of function.
The body of the function (the operations that are executed) is within the curly braces {}.
The return statement causes a function to stop executing and immediately return a value to the caller.
Calling the function:
#freezing point of water
fahr_to_kelvin(32)
## [1] 273.15
#boiling point of water
fahr_to_kelvin(212)
## [1] 373.15
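Because arithmetic in R works element by element on vectors, the same function can convert several temperatures in one call. A short sketch, with temps_f and temps_k as names introduced for this example:

```r
# Same definition as above, repeated so this snippet is self-contained.
fahr_to_kelvin <- function(temp) {
  kelvin <- ((temp - 32) * (5 / 9)) + 273.15
  return(kelvin)
}

# Freezing and boiling point of water, in Fahrenheit.
temps_f <- c(32, 212)

# One call converts every element of the vector.
temps_k <- fahr_to_kelvin(temps_f)
temps_k
## [1] 273.15 373.15
```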
http://swcarpentry.github.io/r-novice-gapminder/10-functions#challenge-1
Write a function called kelvin_to_celsius that takes a temperature in Kelvin and returns that temperature in Celsius. Hint: to convert from Kelvin to Celsius, subtract 273.15.
kelvin_to_celsius <- function(temp) {
  celsius <- temp - 273.15
  return(celsius)
}
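A quick way to check the solution is to call it with temperatures whose Celsius value is known. The check values below are standard physical constants, not part of the original exercise:

```r
# Same definition as the solution above, repeated so this snippet runs on its own.
kelvin_to_celsius <- function(temp) {
  celsius <- temp - 273.15
  return(celsius)
}

# Freezing point of water: 273.15 K is 0 degrees Celsius.
kelvin_to_celsius(273.15)
## [1] 0

# Absolute zero: 0 K is -273.15 degrees Celsius.
kelvin_to_celsius(0)
## [1] -273.15
```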
The real power of functions comes when we mix, match and combine them into ever larger chunks to get what we need.
As an example, let’s define two functions that will convert temperature from Fahrenheit to Kelvin, and Kelvin to Celsius:
fahr_to_kelvin <- function(temp) {
  kelvin <- ((temp - 32) * (5 / 9)) + 273.15
  return(kelvin)
}
kelvin_to_celsius <- function(temp) {
  celsius <- temp - 273.15
  return(celsius)
}
#freezing point of water in Celsius
kelvin_to_celsius(fahr_to_kelvin(32.0))
## [1] 0
http://swcarpentry.github.io/r-novice-gapminder/10-functions#challenge-2
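One possible solution sketch for this challenge wraps the nested call in a function of its own. The name fahr_to_celsius and the intermediate variable names are choices made here, not prescribed by the lesson:

```r
# The two building blocks, repeated so this snippet is self-contained.
fahr_to_kelvin <- function(temp) {
  kelvin <- ((temp - 32) * (5 / 9)) + 273.15
  return(kelvin)
}
kelvin_to_celsius <- function(temp) {
  celsius <- temp - 273.15
  return(celsius)
}

# Convert Fahrenheit to Celsius by reusing the two functions above.
fahr_to_celsius <- function(temp) {
  temp_k <- fahr_to_kelvin(temp)
  result <- kelvin_to_celsius(temp_k)
  return(result)
}

# Freezing point of water in Celsius.
fahr_to_celsius(32)
## [1] 0
```

Giving the combination a name means callers no longer need to remember which conversion comes first.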
Now, we’re going to define a function that calculates the Gross Domestic Product of a nation from the data available in our dataset:
gapminder <- read.csv(
  file = 'data/gapminder-FiveYearData.csv',
  header = TRUE
)
head(gapminder)
## country year pop continent lifeExp gdpPercap
## 1 Afghanistan 1952 8425333 Asia 28.801 779.4453
## 2 Afghanistan 1957 9240934 Asia 30.332 820.8530
## 3 Afghanistan 1962 10267083 Asia 31.997 853.1007
## 4 Afghanistan 1967 11537966 Asia 34.020 836.1971
## 5 Afghanistan 1972 13079460 Asia 36.088 739.9811
## 6 Afghanistan 1977 14880372 Asia 38.438 786.1134
# Takes a dataset and multiplies the population column
# with the GDP per capita column.
calcGDP <- function(dat) {
  gdp <- dat$pop * dat$gdpPercap
  return(gdp)
}
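The function takes one argument, dat, given in parentheses. The body computes gdp by multiplying the population column by the GDP per capita column. The return statement causes the function to stop executing and send back the result.
calcGDP(head(gapminder))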
## [1] 6567086330 7585448670 8758855797 9648014150 9678553274 11697659231
When we call the function calcGDP(), the values we pass to it, here head(gapminder), are assigned to the arguments, which become variables inside the body of the function.
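Since the argument is just a name for whatever is passed in, the function works on any data frame that has pop and gdpPercap columns. The sketch below uses a tiny data frame called toy whose values are made up for illustration and are not taken from the gapminder data:

```r
# Same definition as above, repeated so this snippet is self-contained.
calcGDP <- function(dat) {
  gdp <- dat$pop * dat$gdpPercap
  return(gdp)
}

# Made-up values for illustration only (not gapminder data).
toy <- data.frame(
  country   = c("A", "B"),
  pop       = c(100, 200),
  gdpPercap = c(10, 20)
)

# Inside the call, dat refers to toy.
calcGDP(toy)
## [1] 1000 4000
```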
The output is not very informative. Let’s add year and country arguments so we can extract the GDP for a given year and country:
# Takes a dataset and multiplies the population column
# with the GDP per capita column.
calcGDP <- function(dat, year=NULL, country=NULL) {
  if (!is.null(year)) {
    dat <- dat[dat$year %in% year, ]
  }
  if (!is.null(country)) {
    dat <- dat[dat$country %in% country, ]
  }
  gdp <- dat$pop * dat$gdpPercap
  new <- cbind(dat, gdp=gdp)
  return(new)
}
Let’s save our functions in a script so we can load them again later.
You can load the functions into your R session by using the source function:
source("functions/functions.R")
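To see what source does without depending on the functions/functions.R file, the sketch below writes a function definition to a temporary file and sources that instead. The temporary path (script_path) is an assumption made so the example is self-contained:

```r
# Write a function definition to a temporary script file.
script_path <- tempfile(fileext = ".R")
writeLines(
  c(
    "calcGDP <- function(dat) {",
    "  gdp <- dat$pop * dat$gdpPercap",
    "  return(gdp)",
    "}"
  ),
  con = script_path
)

# source() runs the code in the file, which defines calcGDP in the session.
source(script_path)

# The function is now available to call.
is.function(calcGDP)
```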
There is a lot going on in this function: it subsets the data by country and/or year if those arguments aren’t empty, calculates gdp on the resulting subset, then adds the result as a new column to the subsetted data and returns the result.
GDP for a single year:
head(calcGDP(gapminder, year=2007))
## country year pop continent lifeExp gdpPercap gdp
## 12 Afghanistan 2007 31889923 Asia 43.828 974.5803 31079291949
## 24 Albania 2007 3600523 Europe 76.423 5937.0295 21376411360
## 36 Algeria 2007 33333216 Africa 72.301 6223.3675 207444851958
## 48 Angola 2007 12420476 Africa 42.731 4797.2313 59583895818
## 60 Argentina 2007 40301927 Americas 75.320 12779.3796 515033625357
## 72 Australia 2007 20434176 Oceania 81.235 34435.3674 703658358894
calcGDP(gapminder, country="Australia")
## country year pop continent lifeExp gdpPercap gdp
## 61 Australia 1952 8691212 Oceania 69.120 10039.60 87256254102
## 62 Australia 1957 9712569 Oceania 70.330 10949.65 106349227169
## 63 Australia 1962 10794968 Oceania 70.930 12217.23 131884573002
## 64 Australia 1967 11872264 Oceania 71.100 14526.12 172457986742
## 65 Australia 1972 13177000 Oceania 71.930 16788.63 221223770658
## 66 Australia 1977 14074100 Oceania 73.490 18334.20 258037329175
## 67 Australia 1982 15184200 Oceania 74.740 19477.01 295742804309
## 68 Australia 1987 16257249 Oceania 76.320 21888.89 355853119294
## 69 Australia 1992 17481977 Oceania 77.560 23424.77 409511234952
## 70 Australia 1997 18565243 Oceania 78.830 26997.94 501223252921
## 71 Australia 2002 19546792 Oceania 80.370 30687.75 599847158654
## 72 Australia 2007 20434176 Oceania 81.235 34435.37 703658358894
calcGDP(gapminder, year=2007, country="Australia")
## country year pop continent lifeExp gdpPercap gdp
## 72 Australia 2007 20434176 Oceania 81.235 34435.37 703658358894
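Because the subsetting uses %in%, each argument can also be a vector of several values. A minimal sketch on a made-up data frame (toy and its values are invented for illustration, not gapminder data):

```r
# Same definition as above, repeated so this snippet is self-contained.
calcGDP <- function(dat, year = NULL, country = NULL) {
  if (!is.null(year)) {
    dat <- dat[dat$year %in% year, ]
  }
  if (!is.null(country)) {
    dat <- dat[dat$country %in% country, ]
  }
  gdp <- dat$pop * dat$gdpPercap
  new <- cbind(dat, gdp = gdp)
  return(new)
}

# Made-up values for illustration only.
toy <- data.frame(
  country   = c("A", "A", "B", "B"),
  year      = c(2002, 2007, 2002, 2007),
  pop       = c(100, 110, 200, 220),
  gdpPercap = c(10, 12, 20, 25)
)

# Keep both years, but only country "B".
both_years <- calcGDP(toy, year = c(2002, 2007), country = "B")
both_years
```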
We added two arguments, country and year, each with a default value of NULL, meaning the arguments take on that value unless the caller specifies otherwise.
calcGDP <- function(dat, year=NULL, country=NULL) {
  if (!is.null(year)) {
    dat <- dat[dat$year %in% year, ]
  }
  if (!is.null(country)) {
    dat <- dat[dat$country %in% country, ]
  }
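This makes the function more flexible: it can calculate GDP for the whole dataset, a single year, a single country, or a combination of year and country (as in the examples above).
Each if block checks whether its argument is NULL and, if it is not, subsets the rows of dat to the matching values.
Variables created or modified inside a function only exist while it runs. E.g. the variables dat, gdp, and new only exist in the body of the function.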
  gdp <- dat$pop * dat$gdpPercap
  new <- cbind(dat, gdp=gdp)
  return(new)
}
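The point about variables living only inside the function body can be checked directly. The sketch below uses the simpler one-argument version of calcGDP and a made-up data frame (toy), and assumes a fresh session in which no variable called gdp has been defined:

```r
# Simple version of calcGDP, repeated so this snippet is self-contained.
calcGDP <- function(dat) {
  gdp <- dat$pop * dat$gdpPercap   # gdp exists only while the function runs
  return(gdp)
}

# Made-up values for illustration only.
toy <- data.frame(pop = c(100, 200), gdpPercap = c(10, 20))

result <- calcGDP(toy)

# After the call, gdp is not visible outside the function
# (FALSE in a fresh session with no other gdp defined).
gdp_visible <- exists("gdp")
gdp_visible

# The data frame passed in is unchanged: it still has only two columns.
names(toy)
```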
*TIP: It is important to test and document your functions, so that others understand how to use them and so that you can be sure each function does what you think it does.
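One lightweight way to follow this tip, shown here only as a sketch, is to describe the inputs and output in comments and to check the function against known values with stopifnot. The input check inside the function is an addition made for this example, not part of the earlier definition:

```r
# Converts a temperature from Fahrenheit to Kelvin.
#   temp:    numeric vector of temperatures in degrees Fahrenheit
#   returns: numeric vector of temperatures in Kelvin
fahr_to_kelvin <- function(temp) {
  stopifnot(is.numeric(temp))   # fail early on non-numeric input
  kelvin <- ((temp - 32) * (5 / 9)) + 273.15
  return(kelvin)
}

# Simple tests against known values; all.equal() allows for rounding error.
stopifnot(isTRUE(all.equal(fahr_to_kelvin(32), 273.15)))    # freezing point
stopifnot(isTRUE(all.equal(fahr_to_kelvin(212), 373.15)))   # boiling point
```

If a later change breaks the conversion, the stopifnot lines raise an error as soon as the script is sourced.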