Goal: To be able to define a function that takes arguments and returns a value. Overall, be able to explain why we should divide programs into small, single-purpose functions.
## What is a function?
Functions gather a sequence of operations into a whole, preserving it for ongoing use. Functions provide:
As the basic building block of most programming languages, user-defined functions constitute programming as much as any single abstraction can. If you have written a function, you are a computer programmer.
Open a new R script file (File > New File > R Script):
my_sum <- function(a, b) {
  the_sum <- a + b
  return(the_sum)
}

Using (calling) this function:

my_sum(10, 15)
## [1] 25

Components of a function:

body(my_sum)
## {
##     the_sum <- a + b
##     return(the_sum)
## }

formals(my_sum)
## $a
## 
## 
## $b

#args(my_sum) #for a more human readable version

environment(my_sum)
## <environment: R_GlobalEnv>

** show environment window - read only functions view **
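As a small sketch of how these components can be queried from code, the snippet below redefines my_sum and adds a hypothetical variant, my_sum_default (not part of the lesson), whose second argument has a default value. It assumes nothing beyond base R.

```r
my_sum <- function(a, b) {
  the_sum <- a + b
  return(the_sum)
}

# Hypothetical variant for illustration only: b defaults to 1
my_sum_default <- function(a, b = 1) {
  return(a + b)
}

# formals() returns the argument list; names() keeps just the argument names
arg_names <- names(formals(my_sum))
print(arg_names)

# A default value is stored alongside the argument name
default_b <- formals(my_sum_default)$b
print(default_b)

# body() returns the unevaluated code of the function
print(body(my_sum_default))
```

Calling my_sum_default(2) uses the stored default for b, while my_sum_default(2, 5) overrides it.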
Let’s define a function that converts Fahrenheit to Kelvin.
fahr_to_kelvin <- function(temp) {
  kelvin <- ((temp - 32) * (5 / 9)) + 273.15
  return(kelvin)
}

- fahr_to_kelvin is defined by assigning it the output of function.
- The body of the function (the operations that are executed) is within the curly braces {}.
- A return statement causes a function to stop executing and immediately return a value to the caller.

Calling the function

#freezing point of water
fahr_to_kelvin(32)
## [1] 273.15

#boiling point of water
fahr_to_kelvin(212)
## [1] 373.15

http://swcarpentry.github.io/r-novice-gapminder/10-functions#challenge-1
Write a function called kelvin_to_celsius that takes a temperature in Kelvin and returns that temperature in Celsius. Hint: To convert from Kelvin to Celsius, subtract 273.15.
kelvin_to_celsius <- function(temp) {
  celsius <- temp - 273.15
  return(celsius)
}

The real power of functions comes when we mix, match, and combine them into ever larger chunks to get what we need.
As an example, let’s define two functions that will convert temperature from Fahrenheit to Kelvin, and Kelvin to Celsius:
fahr_to_kelvin <- function(temp) {
  kelvin <- ((temp - 32) * (5 / 9)) + 273.15
  return(kelvin)
}
kelvin_to_celsius <- function(temp) {
  celsius <- temp - 273.15
  return(celsius)
}

#freezing point of water in Celsius
kelvin_to_celsius(fahr_to_kelvin(32.0))
## [1] 0

http://swcarpentry.github.io/r-novice-gapminder/10-functions#challenge-2
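One way to package the nested call above is to wrap it in a third function. The sketch below assumes a hypothetical name, fahr_to_celsius, and simply reuses the two functions already defined; it is one possible approach rather than the only one.

```r
fahr_to_kelvin <- function(temp) {
  kelvin <- ((temp - 32) * (5 / 9)) + 273.15
  return(kelvin)
}

kelvin_to_celsius <- function(temp) {
  celsius <- temp - 273.15
  return(celsius)
}

# Hypothetical wrapper: convert Fahrenheit to Celsius by chaining
# the two existing conversions instead of writing a new formula
fahr_to_celsius <- function(temp) {
  temp_k <- fahr_to_kelvin(temp)
  celsius <- kelvin_to_celsius(temp_k)
  return(celsius)
}

# Freezing and boiling points of water, given in Fahrenheit
print(fahr_to_celsius(c(32, 212)))
```

Because each small function does one job, the wrapper stays short and any fix to a conversion only has to be made in one place.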
Now, we’re going to define a function that calculates the Gross Domestic Product of a nation from the data available in our dataset:
gapminder <- read.csv(
  file = 'data/gapminder-FiveYearData.csv',
  header = TRUE
)

head(gapminder)
##       country year      pop continent lifeExp gdpPercap
## 1 Afghanistan 1952  8425333      Asia  28.801  779.4453
## 2 Afghanistan 1957  9240934      Asia  30.332  820.8530
## 3 Afghanistan 1962 10267083      Asia  31.997  853.1007
## 4 Afghanistan 1967 11537966      Asia  34.020  836.1971
## 5 Afghanistan 1972 13079460      Asia  36.088  739.9811
## 6 Afghanistan 1977 14880372      Asia  38.438  786.1134

# Takes a dataset and multiplies the population column
# with the GDP per capita column.
calcGDP <- function(dat){
  gdp <- dat$pop * dat$gdpPercap
  return(gdp)
}

- dat, in parentheses, is the function's argument.
- gdp stores the result of the calculation.
- The return statement stops execution and sends back the result.

calcGDP(head(gapminder))
## [1]  6567086330  7585448670  8758855797  9648014150  9678553274 11697659231

When we call the function calcGDP(), the values we pass to it (here, head(gapminder)) are assigned to the arguments, which become variables inside the body of the function.
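To see the argument binding without needing the gapminder file, here is a minimal sketch that calls the same one-argument calcGDP on a tiny data frame. The data frame toy and its values are made up for illustration and are not taken from the dataset.

```r
# Takes a dataset and multiplies the population column
# with the GDP per capita column.
calcGDP <- function(dat) {
  gdp <- dat$pop * dat$gdpPercap
  return(gdp)
}

# Made-up values, chosen only so the arithmetic is easy to check by hand
toy <- data.frame(
  pop = c(1000, 2000),
  gdpPercap = c(1.5, 2)
)

# Positional and named calls both bind the data frame to the argument dat
by_position <- calcGDP(toy)
by_name <- calcGDP(dat = toy)

print(by_position)
print(identical(by_position, by_name))
```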
The bare vector of numbers that calcGDP() returns is not very informative. Let’s add year and country arguments so we can extract results for a given year and country.
# Takes a dataset and multiplies the population column
# with the GDP per capita column.
calcGDP <- function(dat, year=NULL, country=NULL) {
  if(!is.null(year)) {
    dat <- dat[dat$year %in% year, ]
  }
  if (!is.null(country)) {
    dat <- dat[dat$country %in% country,]
  }
  gdp <- dat$pop * dat$gdpPercap
  new <- cbind(dat, gdp=gdp)
  return(new)
}

Let’s save this as a script so we can source our functions later.
You can load the functions into your R session by using the source function:

source("functions/functions.R")

There is a lot going on in this function:

- It subsets the data by country and/or year if their arguments aren’t empty.
- It calculates gdp on the resulting subsetted data.
- It then adds the result as a new column to the subsetted data and returns the result.
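The steps above can be traced on a small data frame. In this sketch the function is the same as in the lesson, while toy and its values are made up for illustration. Because the filters use %in%, each argument can also take several values at once.

```r
calcGDP <- function(dat, year = NULL, country = NULL) {
  if (!is.null(year)) {
    dat <- dat[dat$year %in% year, ]
  }
  if (!is.null(country)) {
    dat <- dat[dat$country %in% country, ]
  }
  gdp <- dat$pop * dat$gdpPercap
  new <- cbind(dat, gdp = gdp)
  return(new)
}

# Made-up data: two hypothetical countries, two years each
toy <- data.frame(
  country = c("A", "A", "B", "B"),
  year = c(2002, 2007, 2002, 2007),
  pop = c(10, 20, 30, 40),
  gdpPercap = c(2, 3, 4, 5)
)

# No filters: every row is kept and gains a gdp column
all_rows <- calcGDP(toy)

# One year: only the 2007 rows remain
one_year <- calcGDP(toy, year = 2007)

# Several years and one country: %in% accepts a vector of values
country_b <- calcGDP(toy, year = c(2002, 2007), country = "B")

print(all_rows)
print(one_year)
print(country_b)
```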
GDP for a single year:

head(calcGDP(gapminder, year=2007))
##        country year      pop continent lifeExp  gdpPercap          gdp
## 12 Afghanistan 2007 31889923      Asia  43.828   974.5803  31079291949
## 24     Albania 2007  3600523    Europe  76.423  5937.0295  21376411360
## 36     Algeria 2007 33333216    Africa  72.301  6223.3675 207444851958
## 48      Angola 2007 12420476    Africa  42.731  4797.2313  59583895818
## 60   Argentina 2007 40301927  Americas  75.320 12779.3796 515033625357
## 72   Australia 2007 20434176   Oceania  81.235 34435.3674 703658358894

calcGDP(gapminder, country="Australia")
##      country year      pop continent lifeExp gdpPercap          gdp
## 61 Australia 1952  8691212   Oceania  69.120  10039.60  87256254102
## 62 Australia 1957  9712569   Oceania  70.330  10949.65 106349227169
## 63 Australia 1962 10794968   Oceania  70.930  12217.23 131884573002
## 64 Australia 1967 11872264   Oceania  71.100  14526.12 172457986742
## 65 Australia 1972 13177000   Oceania  71.930  16788.63 221223770658
## 66 Australia 1977 14074100   Oceania  73.490  18334.20 258037329175
## 67 Australia 1982 15184200   Oceania  74.740  19477.01 295742804309
## 68 Australia 1987 16257249   Oceania  76.320  21888.89 355853119294
## 69 Australia 1992 17481977   Oceania  77.560  23424.77 409511234952
## 70 Australia 1997 18565243   Oceania  78.830  26997.94 501223252921
## 71 Australia 2002 19546792   Oceania  80.370  30687.75 599847158654
## 72 Australia 2007 20434176   Oceania  81.235  34435.37 703658358894

calcGDP(gapminder, year=2007, country="Australia")
##      country year      pop continent lifeExp gdpPercap          gdp
## 72 Australia 2007 20434176   Oceania  81.235  34435.37 703658358894

The country and year arguments have a default value of NULL, meaning they take on that value unless the caller specifies otherwise.

calcGDP <- function(dat, year=NULL, country=NULL) {
  if(!is.null(year)) {
    dat <- dat[dat$year %in% year, ]
  }
  if (!is.null(country)) {
    dat <- dat[dat$country %in% country,]
  }

This was done so the function can be more flexible: it can calculate GDP for the whole data set, a single year, a single country, or a year and country together (as in the examples above). Whenever an argument is not NULL, the function subsets the rows accordingly.

Variables defined inside a function are local to it. E.g. the variables dat, gdp, and new only exist in the body of the function.
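A minimal sketch of this scoping behaviour, using a hypothetical helper add_gdp and made-up values (neither is part of the lesson): changes made to dat inside the function do not touch the caller's data frame, and a variable created in the body is not visible afterwards.

```r
# Hypothetical helper for illustration only
add_gdp <- function(dat) {
  # dat is local to the function; the caller's data frame is not modified
  dat$gdp <- dat$pop * dat$gdpPercap
  local_note <- "only visible inside add_gdp"
  return(dat)
}

# Made-up values
toy <- data.frame(pop = c(10, 20), gdpPercap = c(2, 3))
with_gdp <- add_gdp(toy)

# The original data frame still has only its two columns
print(names(toy))

# The returned copy has the extra gdp column
print(names(with_gdp))

# local_note was created inside the function and does not exist out here
print(exists("local_note"))
```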
  gdp <- dat$pop * dat$gdpPercap
  new <- cbind(dat, gdp=gdp)
  return(new)
}

TIP: It is important to test and document your functions so that others understand how to use them, and to make sure each function does what you think it will do.
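As one possible way to follow this tip, the sketch below documents fahr_to_kelvin with plain comments and checks it against the two reference points used earlier. The comment layout is just one convention, and stopifnot() with all.equal() is only a lightweight stand-in for a fuller testing setup.

```r
# fahr_to_kelvin: convert a temperature from Fahrenheit to Kelvin.
#   temp:    numeric vector of temperatures in degrees Fahrenheit
#   returns: numeric vector of temperatures in Kelvin
fahr_to_kelvin <- function(temp) {
  kelvin <- ((temp - 32) * (5 / 9)) + 273.15
  return(kelvin)
}

# Simple checks against known reference points.
# stopifnot() raises an error if any condition is not TRUE;
# all.equal() compares numbers with a small tolerance for rounding.
stopifnot(
  isTRUE(all.equal(fahr_to_kelvin(32), 273.15)),   # freezing point of water
  isTRUE(all.equal(fahr_to_kelvin(212), 373.15))   # boiling point of water
)
```

If a later edit to the function breaks one of these conversions, running the script stops with an error instead of silently producing wrong values.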