Goal: To be able to define a function that takes arguments and returns a value. Overall, be able to explain why we should divide programs into small, single-purpose functions.

## What is a function?

Functions gather a sequence of operations into a whole, preserving it for ongoing use.

As the basic building block of most programming languages, user-defined functions constitute programming as much as any single abstraction can. If you have written a function, you are a computer programmer.

Creating a function

Open a new R script file (File > New File > R Script):

my_sum <- function(a, b) {
  the_sum <- a + b
  return(the_sum)
}

Using (calling) this function:

my_sum(10, 15)
## [1] 25

Components of a function:

  1. body
  2. arguments
  3. environment
body(my_sum)
## {
##     the_sum <- a + b
##     return(the_sum)
## }
formals(my_sum)
## $a
## 
## 
## $b
# args(my_sum)  # for a more human-readable version
environment(my_sum)
## <environment: R_GlobalEnv>
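
As a quick check of these components, the sketch below repeats the my_sum definition so it runs on its own and then inspects the arguments and the body. names(), formals(), args(), body() and deparse() are all base R functions.

```r
# Redefine my_sum so this block is self-contained.
my_sum <- function(a, b) {
  the_sum <- a + b
  return(the_sum)
}

# Argument names only, without the empty default values.
names(formals(my_sum))
## [1] "a" "b"

# A more human-readable view of the argument list.
args(my_sum)

# The body is an R language object; deparse() turns it into lines of text.
deparse(body(my_sum))
```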

** show environment window - read only functions view **

Let’s define a function that converts Fahrenheit to Kelvin.

fahr_to_kelvin <- function(temp) {
  kelvin <- ((temp - 32) * (5 / 9)) + 273.15
  return(kelvin)
}

Calling the function

#freezing point of water
fahr_to_kelvin(32)
## [1] 273.15
#boiling point of water
fahr_to_kelvin(212)
## [1] 373.15
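
Because arithmetic in R is vectorized, the same function also accepts a vector of temperatures. A minimal sketch, repeating the definition so it runs on its own; the variable name temps_f is only illustrative.

```r
fahr_to_kelvin <- function(temp) {
  kelvin <- ((temp - 32) * (5 / 9)) + 273.15
  return(kelvin)
}

# Hypothetical input: freezing and boiling points of water in Fahrenheit.
temps_f <- c(32, 212)
fahr_to_kelvin(temps_f)
## [1] 273.15 373.15
```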

Challenge 1

http://swcarpentry.github.io/r-novice-gapminder/10-functions#challenge-1

Write a function called kelvin_to_celsius that takes a temperature in Kelvin and returns that temperature in Celsius. Hint: To convert from Kelvin to Celsius, subtract 273.15.

kelvin_to_celsius <- function(temp) {
  celsius <- temp - 273.15
  return(celsius)
}

Combining functions

fahr_to_kelvin <- function(temp) {
  kelvin <- ((temp - 32) * (5 / 9)) + 273.15
  return(kelvin)
}

kelvin_to_celsius <- function(temp) {
  celsius <- temp - 273.15
  return(celsius)
}
#freezing point of water in Celsius
kelvin_to_celsius(fahr_to_kelvin(32.0))
## [1] 0

Challenge 2

http://swcarpentry.github.io/r-novice-gapminder/10-functions#challenge-2

More on combining functions

Calculate GDP from GapMinder dataset

Now, we’re going to define a function that calculates the Gross Domestic Product of a nation from the data available in our dataset:

  • make sure dataset is loaded as ‘gapminder’
gapminder <- read.csv(
  file = 'data/gapminder-FiveYearData.csv',
  header = TRUE
)
head(gapminder)
##       country year      pop continent lifeExp gdpPercap
## 1 Afghanistan 1952  8425333      Asia  28.801  779.4453
## 2 Afghanistan 1957  9240934      Asia  30.332  820.8530
## 3 Afghanistan 1962 10267083      Asia  31.997  853.1007
## 4 Afghanistan 1967 11537966      Asia  34.020  836.1971
## 5 Afghanistan 1972 13079460      Asia  36.088  739.9811
## 6 Afghanistan 1977 14880372      Asia  38.438  786.1134
  • type function example:
# Takes a dataset and multiplies the population column
# with the GDP per capita column.
calcGDP <- function(dat){
  gdp <- dat$pop * dat$gdpPercap
  return(gdp)
}
  • we define calcGDP by assigning it the output of function.
  • the argument, dat, is listed in parentheses.
  • the body of the function calculates gdp.
  • the return statement stops execution and sends back the result.
calcGDP(head(gapminder))
## [1]  6567086330  7585448670  8758855797  9648014150  9678553274 11697659231
  • When we call calcGDP(), the value we pass to it, head(gapminder), is assigned to the argument dat, which becomes a variable inside the body of the function.
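
To see how a passed value is bound to the argument name, here is a small sketch using a made-up two-row data frame (toy is hypothetical and not part of the gapminder data). Passing the argument by position or by name gives the same result.

```r
calcGDP <- function(dat) {
  gdp <- dat$pop * dat$gdpPercap
  return(gdp)
}

# Hypothetical toy data, chosen so the arithmetic is easy to check by hand.
toy <- data.frame(pop = c(100, 200), gdpPercap = c(2, 3))

calcGDP(toy)        # argument matched by position
## [1] 200 600
calcGDP(dat = toy)  # argument matched by name
## [1] 200 600
```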

  • the output is not very informative. Let’s add year and country arguments so we can subset the data by year and country.

# Takes a dataset and multiplies the population column
# with the GDP per capita column. Optionally subsets the
# data by year and/or country first.
calcGDP <- function(dat, year=NULL, country=NULL) {
  if(!is.null(year)) {
    dat <- dat[dat$year %in% year, ]
  }
  if (!is.null(country)) {
    dat <- dat[dat$country %in% country,]
  }
  gdp <- dat$pop * dat$gdpPercap

  new <- cbind(dat, gdp=gdp)
  return(new)
}
  • let’s save this in a script so we can load our functions by sourcing them.

  • you can load the functions into your R session by using the source function

source("functions/functions.R")
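
If you want to try source() before setting up a functions directory, the following sketch writes a one-function script to a temporary file and loads it. The temporary file is only a stand-in for functions/functions.R and is an assumption made for illustration.

```r
# Write a small script to a temporary file (stand-in for functions/functions.R).
script_path <- tempfile(fileext = ".R")
writeLines(c(
  "kelvin_to_celsius <- function(temp) {",
  "  celsius <- temp - 273.15",
  "  return(celsius)",
  "}"
), con = script_path)

# source() runs the file, which defines the function in the current session.
source(script_path)
kelvin_to_celsius(273.15)
## [1] 0
```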

There is a lot going on in this function:

  • the function now subsets the data by year and/or country if those arguments aren’t NULL.
  • it then calculates the GDP on the resulting subset.
  • finally, it adds the result as a new column to the subset and returns it.
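
The subsetting step can be tried in isolation. The sketch below uses a made-up data frame (toy and its values are hypothetical) to show that %in% keeps the rows matching any of the supplied values, so the year argument can be a single year or a vector of years.

```r
# Hypothetical toy data: two countries, two years each.
toy <- data.frame(
  country = c("A", "A", "B", "B"),
  year = c(1952, 2007, 1952, 2007),
  pop = c(10, 20, 30, 40),
  gdpPercap = c(1, 2, 3, 4)
)

# Keep only the 2007 rows.
toy[toy$year %in% 2007, ]

# Keep rows for either year, then narrow down to country "B".
both_years <- toy[toy$year %in% c(1952, 2007), ]
both_years[both_years$country %in% "B", ]
```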

  • GDP for year

head(calcGDP(gapminder, year=2007))
##        country year      pop continent lifeExp  gdpPercap          gdp
## 12 Afghanistan 2007 31889923      Asia  43.828   974.5803  31079291949
## 24     Albania 2007  3600523    Europe  76.423  5937.0295  21376411360
## 36     Algeria 2007 33333216    Africa  72.301  6223.3675 207444851958
## 48      Angola 2007 12420476    Africa  42.731  4797.2313  59583895818
## 60   Argentina 2007 40301927  Americas  75.320 12779.3796 515033625357
## 72   Australia 2007 20434176   Oceania  81.235 34435.3674 703658358894
  • GDP for specific country
calcGDP(gapminder, country="Australia")
##      country year      pop continent lifeExp gdpPercap          gdp
## 61 Australia 1952  8691212   Oceania  69.120  10039.60  87256254102
## 62 Australia 1957  9712569   Oceania  70.330  10949.65 106349227169
## 63 Australia 1962 10794968   Oceania  70.930  12217.23 131884573002
## 64 Australia 1967 11872264   Oceania  71.100  14526.12 172457986742
## 65 Australia 1972 13177000   Oceania  71.930  16788.63 221223770658
## 66 Australia 1977 14074100   Oceania  73.490  18334.20 258037329175
## 67 Australia 1982 15184200   Oceania  74.740  19477.01 295742804309
## 68 Australia 1987 16257249   Oceania  76.320  21888.89 355853119294
## 69 Australia 1992 17481977   Oceania  77.560  23424.77 409511234952
## 70 Australia 1997 18565243   Oceania  78.830  26997.94 501223252921
## 71 Australia 2002 19546792   Oceania  80.370  30687.75 599847158654
## 72 Australia 2007 20434176   Oceania  81.235  34435.37 703658358894
  • or both year and country
calcGDP(gapminder, year=2007, country="Australia")
##      country year      pop continent lifeExp gdpPercap          gdp
## 72 Australia 2007 20434176   Oceania  81.235  34435.37 703658358894
  • walk through body of function:
  • we added two arguments, year and country, with a default value of NULL, meaning each argument takes that value unless the caller specifies otherwise
calcGDP <- function(dat, year=NULL, country=NULL) {
  • we check whether each one is not NULL with an if statement
  if(!is.null(year)) {
    dat <- dat[dat$year %in% year, ]
  }
  if (!is.null(country)) {
    dat <- dat[dat$country %in% country,]
  }
  • if an argument is not NULL, we subset the rows to the matching year or country

  • this makes the function more flexible: it can calculate GDP for the whole dataset, a single year, a single country, or a year and country together (as in the examples above)

  • scoping: any variables or functions created or modified in the body of the function exist only for the lifetime of the function’s execution.
  • E.g. the variables dat, gdp, and new exist only inside the body of the function.

  gdp <- dat$pop * dat$gdpPercap
  new <- cbind(dat, gdp=gdp)
  return(new)
}
  • lastly, we calculate gdp and create a new data frame with the added column to return (output)
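
The scoping rule can be checked directly. In this sketch, scope_demo and local_total are hypothetical names chosen for illustration, and the last check assumes no object called local_total already exists in your session.

```r
x <- 10

scope_demo <- function(x) {
  x <- x + 1            # modifies the function's own copy of x
  local_total <- x * 2  # created inside the function only
  return(local_total)
}

scope_demo(x)
## [1] 22
x                       # the x outside the function is unchanged
## [1] 10
exists("local_total")   # the local variable is gone after the call
## [1] FALSE
```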

*TIP: It is important to test and document your functions so that others can understand how to use them and so you can be sure each function does what you think it does.
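
One lightweight way to follow this tip, sketched here with base R only, is to describe the function in comments above its definition and check known values with stopifnot(). The expected values are the freezing and boiling points used earlier; all.equal() is used because floating-point results can differ by tiny rounding errors.

```r
# fahr_to_kelvin: convert a temperature from Fahrenheit to Kelvin.
# temp: numeric vector of temperatures in Fahrenheit.
# Returns: numeric vector of the same length, in Kelvin.
fahr_to_kelvin <- function(temp) {
  kelvin <- ((temp - 32) * (5 / 9)) + 273.15
  return(kelvin)
}

# Simple tests: stopifnot() raises an error if any check is not TRUE.
stopifnot(
  isTRUE(all.equal(fahr_to_kelvin(32), 273.15)),   # freezing point of water
  isTRUE(all.equal(fahr_to_kelvin(212), 373.15))   # boiling point of water
)
```

If the function is later changed in a way that breaks the conversion, running the script stops with an error at the failing check.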