shoot for 1hr
We’re going to teach you some of the fundamentals of the R language, as well as some best practices for organizing code for scientific projects that will make your life easier.
Tools > Keyboard Shortcuts Help
Alt+- inserts <- at cursor
Control+Shift+M inserts %>% at cursor (we’ll get into this on day 2)
Control+Enter = run current line/selection
Ctrl+1 = move cursor to source
Ctrl+2 = move cursor to console
Much of your time in R will be spent in the R interactive console.
This is where you will run all of your code, and can be a useful environment to try out ideas before adding them to an R script file.
* much time spent in console working out code
* console `>` with blinking cursor, much like command line
* "read, evaluate, print, loop" REPL - many languages adopt this paradigm (bash, stata, python)
* you type commands, R tries to execute them, and then returns a result
1 + 100
## [1] 101
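If you type an incomplete command, R shows a + prompt instead of >, waiting for you to complete the command. ESC or Control+C will escape.
Using R as a calculator: R uses the same order of operations as ordinary arithmetic, from highest to lowest precedence:
( )
^ or **
* and / (equal precedence)
+ and - (equal precedence)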
3 + 5 * 2 vs. (3 + 5) * 2
3+5*2
## [1] 13
Use parentheses to group operations in order to force the order of evaluation if it differs from the default, or to make clear what you intend.
(3+5) * 2
## [1] 16
(3 + (5 * (2 ^ 2))) # hard to read
## [1] 23
3 + 5 * 2 ^ 2 # clear, if you remember the rules
## [1] 23
3 + 5 * (2 ^ 2) # if you forget some rules, this might help
## [1] 23
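A few extra cases that tend to surprise people, as a small sketch of the precedence rules above (the variable names are only for illustration):

```r
neg_square   <- -2^2       # ^ binds tighter than the leading minus, so this is -(2^2)
paren_square <- (-2)^2     # parentheses force the minus to apply first
star_star    <- 2 ** 3     # ** is read as ^
left_assoc   <- 10 - 4 - 3 # + and - evaluate left to right: (10 - 4) - 3
right_assoc  <- 2^3^2      # ^ evaluates right to left: 2^(3^2)

neg_square
paren_square
right_assoc
```

When in doubt, add parentheses; they cost nothing and document intent.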
2/10000
## [1] 2e-04
So 2e-4 is shorthand for 2 * 10^(-4).
You can write numbers in scientific notation too:
5e3 # Note the lack of minus here
## [1] 5000
To call a function, we simply type its name, followed by open and closing parentheses.
Anything we type inside the parentheses is called the function’s arguments.
sin(1) #trig functions
## [1] 0.841471
log(1) # natural log
## [1] 0
log10(10) #base-10 log
## [1] 1
exp(0.5) # e^(1/2)
## [1] 1.648721
Notice the use of the # after the code. Any idea what this does? Anything after # doesn’t get evaluated because it’s a comment; use comments to document code or leave notes for yourself, e.g. #TODO fix code
Use RStudio’s autocompletion feature if you can remember the beginning of a function name.
Typing ? before a function name brings up its help page in the RStudio help panel; we’ll look at help later on.
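Functions can take more than one argument, and arguments can be passed by position or by name. A minimal sketch using base R functions (the variable names are only for illustration):

```r
log_named      <- log(100, base = 10)        # named argument: base-10 log of 100
log_positional <- log(8, 2)                  # same idea by position: base-2 log of 8
rounded        <- round(3.14159, digits = 2) # keep two decimal places

log_named
log_positional
rounded
```

Naming arguments is a bit more typing, but it makes the call easier to read later.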
1 == 1 # equality (note two equals signs, read as "is equal to")
## [1] TRUE
1 != 2 # inequality (read as "is not equal to")
## [1] TRUE
1 < 2 # less than
## [1] TRUE
1 <= 1 # less than or equal to
## [1] TRUE
1 > 0 # greater than
## [1] TRUE
1 >= -9 # greater than or equal to
## [1] TRUE
# Tip: don't use == to compare numbers unless they are integers; computers represent decimals with only a certain degree of precision
# check out ?all.equal for comparing things involving doubles
0.1+0.05==0.15
## [1] FALSE
all.equal(0.1+0.05, 0.15)
## [1] TRUE
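One thing to watch: when the values differ, all.equal() returns a description of the difference rather than FALSE, so wrap it in isTRUE() when you need a plain TRUE/FALSE. A small sketch (variable names are only for illustration):

```r
a <- sqrt(2)^2                           # mathematically 2, but stored with rounding error

exact_match  <- a == 2                   # exact comparison fails
close_enough <- isTRUE(all.equal(a, 2))  # comparison within a small tolerance succeeds
far_apart    <- isTRUE(all.equal(1, 1.1)) # genuinely different numbers still compare as different

exact_match
close_enough
far_apart
```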
Store values in variables using the assignment operator <-, like x <- 1/40
x <- 1/40
Notice that assignment does not print a value.
Look in the Environment tab in RStudio.
x
## [1] 0.025
R stores a decimal approximation of this fraction, called a floating point number.
Our variable can be used in place of a number in calculations, e.g. log(x)
log(x)
## [1] -3.688879
Variables can be reassigned:
x <- 100
x
## [1] 100
x used to contain the value 0.025 and now it has the value 100.
x <- x + 1 #notice how RStudio updates its description of x on the top right tab
y <- x * 2
x
## [1] 101
y
## [1] 202
The right-hand side of an assignment can be any valid R expression, and it is evaluated before the assignment takes place.
You can also use = for assignment, but it is less common among R users.
Be consistent with operator usage; <- is more common and recommended.
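A short sketch of the evaluate-then-assign order, starting from a fresh value of x (the numbers are only for illustration):

```r
x <- 5
x <- x * 2 + 1     # the right-hand side is computed with the old x (5), then x is overwritten
y = sqrt(x + 5)    # = also assigns, shown only for comparison; prefer <-

x
y
```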
Challenge 1 (5 mins)
1:5
## [1] 1 2 3 4 5
2^(1:5)
## [1] 2 4 8 16 32
x <- 1:5
2^x
## [1] 2 4 8 16 32
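The same idea works for other operators and many functions: they are applied to every element of the vector at once. A minimal sketch (variable names are only for illustration):

```r
x <- 1:5

doubled <- x * 2   # multiply every element by 2
squares <- x^2     # square every element
total   <- sum(x)  # collapse the vector to a single number

doubled
squares
total
```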
Challenge 2 & 3 (5 mins)
ls() lists all variables and functions stored in the R global environment
ls()
## [1] "x" "y"
ls alone will print out the code for that function (this works for any R function)
ls
## function (name, pos = -1L, envir = as.environment(pos), all.names = FALSE,
## pattern, sorted = TRUE)
## {
## if (!missing(name)) {
## pos <- tryCatch(name, error = function(e) e)
## if (inherits(pos, "error")) {
## name <- substitute(name)
## if (!is.character(name))
## name <- deparse(name)
## warning(gettextf("%s converted to character string",
## sQuote(name)), domain = NA)
## pos <- name
## }
## }
## all.names <- .Internal(ls(envir, all.names, sorted))
## if (!missing(pattern)) {
## if ((ll <- length(grep("[", pattern, fixed = TRUE))) &&
## ll != length(grep("]", pattern, fixed = TRUE))) {
## if (pattern == "[") {
## pattern <- "\\["
## warning("replaced regular expression pattern '[' by '\\\\['")
## }
## else if (length(grep("[^\\\\]\\[<-", pattern))) {
## pattern <- sub("\\[<-", "\\\\\\[<-", pattern)
## warning("replaced '[<-' by '\\\\[<-' in regular expression pattern")
## }
## }
## grep(pattern, all.names, value = TRUE)
## }
## else all.names
## }
## <bytecode: 0x7fed8d828040>
## <environment: namespace:base>
When using ls, the parentheses are important: they tell R to call the function ls.
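A small sketch of the difference between the function object and a function call. It uses a scratch environment (an assumption made here so the result does not depend on what is in your workspace):

```r
scratch <- new.env()              # hypothetical scratch environment for the demo
assign("x", 1, envir = scratch)
assign("y", 2, envir = scratch)

listed  <- ls(scratch)            # with parentheses: the function is called and returns names
is_func <- is.function(ls)        # without parentheses: ls is just the function object

listed
is_func
```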
You can use rm to delete objects you no longer need:
rm(x)
If you have a lot of objects and want to delete all, use rm(list=ls())
rm(list=ls())
In this case we are using the ls() function inside another function: rm() takes a list argument, so we are listing all objects and then deleting them with rm().
Arguments need =, not <-; rm(list <- ls()) causes errors.
Run once to demonstrate: rm(list <- ls())
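To try this without wiping your real workspace, here is a sketch that does the same thing inside a scratch environment (the environment and object names are hypothetical):

```r
scratch <- new.env()                     # hypothetical scratch environment
assign("a", 1, envir = scratch)
assign("b", 2, envir = scratch)

before <- ls(scratch)
rm(list = ls(scratch), envir = scratch)  # list = expects a character vector of object names
after <- ls(scratch)

# With <- instead of =, rm() is handed an assignment expression, not a named argument.
outcome <- tryCatch({
  rm(list <- ls(scratch))
  "no error"
}, error = function(e) "error")

before
after
outcome
```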
Challenge 4
installed.packages() #list installed packages
install.packages(c("packagename1", "packagename2")) #install one or many packages
update.packages() #update installed packages
remove.packages("packagename") #remove a package
library(packagename) #make package available
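A common pattern is to install only what is missing. A minimal sketch, assuming a hypothetical list of wanted packages (base packages are used here so nothing actually gets installed):

```r
# Hypothetical package list; swap in the packages your project needs.
wanted <- c("stats", "utils")

# requireNamespace() returns TRUE if the package can be loaded, FALSE otherwise.
is_installed <- vapply(wanted, requireNamespace, logical(1), quietly = TRUE)
to_install   <- wanted[!is_installed]

if (length(to_install) > 0) {
  install.packages(to_install)   # takes a character vector of package names
}

library(stats)                   # make an installed package available
```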
Challenge 5
The scientific process is naturally incremental, and many projects start life as random notes, some code, then a manuscript, and eventually everything is a bit mixed together.
A good project layout will ultimately make your life easier:
- It will help ensure the integrity of your data.
- It makes it simpler to share your code with someone else (a lab-mate, collaborator, or supervisor).
- It allows you to easily upload your code with your manuscript submission.
- It makes it easier to pick the project back up after a break.
We’ll look a little more at Data Management near the end of the quarter.
http://swcarpentry.github.io/r-novice-gapminder/fig/bad_layout.png
Creating a project in RStudio
We’re going to create a new project in RStudio:
- Click the “File” menu button, then “New Project”.
- Click “New Directory”.
- Click “Empty Project”.
- Type in the name of the directory to store your project, e.g. “swc_ucla”.
- If available, select the checkbox for “Create a git repository.” (We’ll come back to this tomorrow)
- Click the “Create Project” button.
Although there is no “best” way to lay out a project, there are some general principles to adhere to that will make project management easier:
Data is typically time consuming and/or expensive to collect.
Working with data interactively (e.g., in Excel), where it can be modified, means you are never sure where the data came from or how it has been modified since collection.
It is therefore a good idea to treat your data as “read-only”.
Cleaning raw data into a usable format with scripts is sometimes called “data munging”.
It is useful to store these scripts in a separate folder, and to create a second “read-only” data folder to hold the “cleaned” data sets.
There are lots of different ways to manage the output your scripts generate. It’s useful to have an output folder with different sub-directories for each separate analysis.
This makes it easier later, as many analyses are exploratory and don’t end up being used in the final project, and some get shared between projects.
gives the following recommendations for project organization:
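- Put text documents associated with the project in the doc directory.
- Put raw data in the data directory, and files generated during cleanup and analysis in a results directory.
- Put source code for the project in the src directory, and programs brought in from elsewhere or compiled locally in the bin directory.
- Name all files to reflect their content or function.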
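The layout above can also be created from R. This is only a sketch: it builds the folders under a temporary directory (an assumption made here so nothing is written into a real project), reusing the swc_ucla name from earlier.

```r
# Build the skeleton under tempdir(); use your real project path instead.
project <- file.path(tempdir(), "swc_ucla")
folders <- c("doc", "data", "results", "src", "bin")

for (folder in folders) {
  # recursive = TRUE also creates the project directory itself if needed
  dir.create(file.path(project, folder), recursive = TRUE, showWarnings = FALSE)
}

created <- dir.exists(file.path(project, folders))
list.files(project)
```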
mention TIER Protocol
- covered later on in the Data Management section
Challenge 1
To be able to access R help files for functions and operators.
?function_name or help(function_name) opens the help page for a function
?"+" opens the help page for an operator (note the quotes)
??function_name will do a fuzzy search for function help (if you don’t know the exact name)
**hope to be at 2:00pm**