Genomics - Data Manipulation in R
Introduction
This workshop is geared toward researchers who want to use R for basic data manipulation and analysis. It introduces the basics of data tidying and manipulation (notably with tidyr and dplyr), shows how to set up data processing pipelines, and briefly covers working with a database. We will use a genomics dataset throughout. The workshop is designed for novices, but you should have some experience with R or have attended the Intro to R course on March 7.
Topics:
- Data Manipulation Using dplyr (subsetting, filtering, aggregating)
- Dataframe Manipulation with tidyr (reshaping data)
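To give a flavor of the two topics, here is a minimal sketch in R. The `variants` table, its column names, and the quality cutoff of 30 are made up for illustration and are not the workshop dataset. The sketch also assumes reasonably recent package versions (dplyr 1.0 or later for the `.groups` argument, tidyr 1.0 or later for `pivot_wider()`); older tidyr releases would use `spread()` for the same reshaping step.

```r
library(dplyr)
library(tidyr)

# Hypothetical toy data: one row per variant call (NOT the workshop dataset)
variants <- data.frame(
  sample_id = c("S1", "S1", "S1", "S2", "S2", "S3"),
  chrom     = c("chr1", "chr1", "chr2", "chr1", "chr2", "chr2"),
  quality   = c(40, 12, 55, 60, 35, 8),
  stringsAsFactors = FALSE
)

# dplyr: filter rows by quality, then aggregate to a count
# per sample and chromosome
counts <- variants %>%
  filter(quality >= 30) %>%                      # illustrative cutoff
  group_by(sample_id, chrom) %>%
  summarise(n_variants = n(), .groups = "drop")

# tidyr: reshape from long (one row per sample/chromosome pair)
# to wide (one row per sample, one column per chromosome)
wide <- counts %>%
  pivot_wider(names_from = chrom, values_from = n_variants)

print(wide)
```

The first pipeline covers the subsetting, filtering, and aggregating listed under dplyr; the second shows the kind of reshaping tidyr is used for.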
Date
March 8, 2016 (10:00 AM - 12:00 PM)
Location
Biomedical Library Building - Classroom 4
Audience
All graduate students and researchers.
Code Samples:
Resources
Collaborative Notes: Etherpad
Registration Page
Setup
This workshop is taught in a hands-on style, so participants are encouraged to use their own computers. To participate in this course, you will need access to the software described below. In addition, you will need an up-to-date web browser. This class is
R
R is a programming language that is especially powerful for data exploration, visualization, and statistical analysis. To interact with R, we use RStudio.
Windows
Install R by downloading and running this .exe file from CRAN. Also, please install the RStudio IDE.
Mac OS X
Install R by downloading and running this .pkg file from CRAN. Also, please install the RStudio IDE.
Linux
You can download the binary files for your distribution from CRAN, or you can use your package manager (e.g. for Debian/Ubuntu run sudo apt-get install r-base, and for Fedora run sudo yum install R). Also, please install the RStudio IDE.