#install.packages('ggplot2')
#library(ggplot2)

GOALS: Students should be able to use ggplot2 to generate publication-quality graphics, and to understand and use the basics of the grammar of graphics.
##DataViz
ggplot2 is built on the grammar of graphics.
The key idea in ggplot2 is to think about a figure in layers, as you would in ArcGIS or in programs like Photoshop.
Each layer is drawn by a geom function, for example geom_point(), geom_bar(), geom_density(), geom_line(), or geom_area().

#gapminder <- read.csv("https://goo.gl/BtBnPg", header = T)
gapminder <- read.csv('gapminder-FiveYearData.csv', header = TRUE)

Let’s start off with an example:

library(ggplot2)
ggplot(data = gapminder, aes(x = gdpPercap, y = lifeExp)) +
  geom_point()

NOTE:
First we call the ggplot function, which tells R that we’re creating a new plot.
Any arguments we give the ggplot function are global options: they apply to all layers on the plot.
We passed two arguments to ggplot:
* the data we want to show, e.g. the gapminder data
* an aes function, which tells ggplot how variables in the data map to aesthetic properties of the figure, here the x and y locations: the gdpPercap column on the x axis and the lifeExp column on the y axis

Notice that we didn’t have to spell out the data frame for each column (e.g. gapminder$gdpPercap); ggplot is smart enough to look in the data for the columns.
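The notes above can be checked by inspecting the plot object itself. The following is a minimal sketch, assuming a small made-up data frame (`toy`, hypothetical, standing in for gapminder) and assuming the plot object exposes its `mapping` and `layers` fields through `$`, as it has in ggplot2 releases to date.

```r
library(ggplot2)

# Hypothetical toy data standing in for the two gapminder columns used above
toy <- data.frame(
  gdpPercap = c(500, 2000, 10000, 40000),
  lifeExp   = c(45, 58, 70, 80)
)

# Global options: the data and the aesthetic mappings passed to ggplot()
p <- ggplot(data = toy, aes(x = gdpPercap, y = lifeExp))

# The global mapping is stored on the plot object
names(p$mapping)

# No geom layers have been added yet
length(p$layers)

# Adding a geom appends one layer, which inherits the global mapping
p_points <- p + geom_point()
length(p_points$layers)
```

Because the mapping lives on the plot rather than on any one geom, every layer added later picks it up unless that layer overrides it.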
On its own, the ggplot call isn’t enough to draw any data on the plot.

ggplot(data = gapminder, aes(x = gdpPercap, y = lifeExp))
## If run, this produces a blank plot (axes only, no data); very old ggplot2 versions raise an error instead.

We need to tell ggplot how to represent the data visually by adding a geom layer. Here we use geom_point to create a scatter plot that represents the relationship between x and y.

ggplot(data = gapminder, aes(x = gdpPercap, y = lifeExp)) +
  geom_point()

We can also tell ggplot to visualize the data as a line plot.

ggplot(data = gapminder, aes(x = year, y = lifeExp, group = country, color = continent)) +
  geom_line()

* We used geom_line instead of geom_point for the geom layer.
* We added a grouping aesthetic, group = country, to get one line per country, and colored the lines by continent.
What if we want to show both lines and points? All we need to do is add another layer, + geom_point(), to the plot:

ggplot(data = gapminder, aes(x = year, y = lifeExp, group = country, color = continent)) +
  geom_line() + geom_point()

It is important to note that each layer is drawn on top of the previous one, so here the points have been drawn on top of the lines.
As an example of this:

ggplot(data = gapminder, aes(x = year, y = lifeExp, group = country)) +
  geom_line(aes(color = continent)) + geom_point()

In the code above, the aesthetic mapping of color has been moved from the global plot options in ggplot to the geom_line layer, so it no longer applies to the points.
The points are now drawn in the default black while the lines are colored, which shows clearly that the points are drawn on top of the lines.
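The difference between global and layer-specific mappings, and the order in which layers are stacked, can be seen by inspecting the plot object. This is a rough sketch under stated assumptions: `toy` is a hypothetical data frame, the standard `group` aesthetic is used for the per-country grouping, and the `mapping` and `layers` fields are assumed to be reachable with `$`.

```r
library(ggplot2)

# Hypothetical toy data: two countries on two continents, three years each
toy <- data.frame(
  year      = rep(c(1952, 1957, 1962), times = 2),
  lifeExp   = c(30, 32, 35, 68, 70, 71),
  country   = rep(c("A", "B"), each = 3),
  continent = rep(c("X", "Y"), each = 3)
)

p <- ggplot(data = toy, aes(x = year, y = lifeExp, group = country)) +
  geom_line(aes(color = continent)) +   # color is mapped for this layer only
  geom_point()                          # no layer-specific mapping

# Global mappings, shared by every layer
names(p$mapping)

# Layer-specific mappings (ggplot2 stores "color" under the name "colour")
names(p$layers[[1]]$mapping)
names(p$layers[[2]]$mapping)

# Layers are kept, and drawn, in the order they were added
layer_geoms <- sapply(p$layers, function(l) class(l$geom)[1])
layer_geoms
```

Swapping the order of geom_line() and geom_point() in the call would swap the order of the stored layers, and with it which geom ends up on top.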
ggplot(data = gapminder, aes(x = gdpPercap, y = lifeExp, color = continent)) +
  geom_point()

We can change the scale of units on the x axis using the scale functions. We’ll also use the alpha argument, which is helpful when you have a large amount of data that is very clustered.
Alpha values are numbers from 0 (transparent) to 1 (opaque); the default is usually 1.
ggplot(data = gapminder, aes(x = gdpPercap, y = lifeExp)) +
  geom_point(alpha = 0.5) + scale_x_log10()

The scale_x_log10 function applies a log10 transformation to the values of the gdpPercap column before rendering them on the plot.
This makes it easier to visualize the spread of data on the x-axis.
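To see what the log10 scale does to the values, we can look at the data ggplot2 actually draws. This is a small sketch, assuming a made-up data frame (`toy`, hypothetical) with gdpPercap on the x axis; `layer_data()` is the ggplot2 helper that returns the computed data for one layer.

```r
library(ggplot2)

# Hypothetical toy data with GDP values spread over several orders of magnitude
toy <- data.frame(
  gdpPercap = c(100, 1000, 10000, 100000),
  lifeExp   = c(40, 55, 70, 80)
)

p <- ggplot(data = toy, aes(x = gdpPercap, y = lifeExp)) +
  geom_point(alpha = 0.5) +
  scale_x_log10()

# Data for the first (and only) layer, after the scale transformation
built <- layer_data(p, 1)

# The plotted x positions are log10 of the original values; y is unchanged
data.frame(original = toy$gdpPercap, plotted_x = built$x, plotted_y = built$y)
all.equal(built$x, log10(toy$gdpPercap))
```

The axis labels still show the original units; only the positions are computed on the log scale, which is why evenly spaced tick marks correspond to factors of ten.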
We can fit a simple relationship to the data by adding another layer, geom_smooth:
ggplot(data = gapminder, aes(x = gdpPercap, y = lifeExp)) +
  geom_point() + scale_x_log10() + geom_smooth(method = "lm")

We can make the line thicker by setting the size aesthetic in the geom_smooth layer.
You can also assign plots to variables using the <- operator:
# example of assigning a plot to variable pwd
pwd <- ggplot(data = gapminder, aes(x = gdpPercap, y = lifeExp)) +
  geom_point() + scale_x_log10() +
  geom_smooth(method = "lm", size = 1.5)  # ggplot2 >= 3.4.0 prefers linewidth = 1.5 for lines
pwd

Here we set the size aesthetic by passing it as an argument to geom_smooth. Previously we used the aes function to define a mapping between data variables and their visual representation.

starts.with <- substr(gapminder$country, start = 1, stop = 1)
az.countries <- gapminder[starts.with %in% c("A", "Z"), ]

Talk thru code:
* We’ll start by subsetting the data, using the substr function to pull out a part of a character string;
* in this case, the letters that occur in positions start through stop, inclusive, of the gapminder$country vector.
* The operator %in% allows us to make multiple comparisons rather than write out long subsetting conditions (in this case, starts.with %in% c("A", "Z") is equivalent to starts.with == "A" | starts.with == "Z").
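The subsetting steps can be tried on a short vector before applying them to the full data set. This is a minimal sketch; the `country` vector here is a hypothetical stand-in for gapminder$country.

```r
# Hypothetical stand-in for gapminder$country
country <- c("Afghanistan", "Albania", "Brazil", "Zambia", "Zimbabwe")

# First letter of each name: characters in positions 1 through 1
starts.with <- substr(country, start = 1, stop = 1)

# TRUE where the first letter is one of "A" or "Z"
keep <- starts.with %in% c("A", "Z")

# The same test written out as separate comparisons
keep_long <- starts.with == "A" | starts.with == "Z"
identical(keep, keep_long)

# Use the logical vector to subset
az <- country[keep]
az
```

With a data frame, the same logical vector goes in the row position of the square brackets, as in the az.countries line above.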
ggplot(data = az.countries, aes(x = year, y = lifeExp, color = continent)) +
  geom_line() + facet_wrap( ~ country)

The facet_wrap layer took a “formula” as its argument, denoted by the tilde (~).
The ~ tells R to draw a panel for each unique value in the country column of the data (here the az.countries subset of gapminder).

Now let’s clean up this figure for publication.
The x-axis is too cluttered, and the y-axis should read “Life expectancy” instead of the column name.
Axis labels, the plot title, and legend titles can be set with the labs function.
Legend titles are set using the same names we used in the aes specification.
Here the color legend title is set using color = "Continent", while the title of a fill legend would be set using fill = "MyTitle".

ggplot(data = az.countries, aes(x = year, y = lifeExp, color = continent)) +
  geom_line() + facet_wrap( ~ country) +
  labs(
    x = "Year",              # x axis title
    y = "Life expectancy",   # y axis title
    title = "Figure 1",      # main title of figure
    color = "Continent"      # title of legend
  ) +
  theme(axis.text.x = element_blank(), axis.ticks.x = element_blank())

To save the most recent plot to a file, use ggsave:

ggsave('~/path/to/figure/filename.png')
ggsave(filename = "filename.png", path = "~/path/to/figure")  # file name, path to save location
ggsave(filename = "/path/to/figure/filename.png", width = 6,
       height = 4)  # Plot size in units ("in", "cm", or "mm"). If not supplied, uses the size of current graphics device.

# device can either be a device function (e.g. png), or one of "eps", "ps", "tex" (pictex), "pdf", "jpeg", "tiff", "png", "bmp", "svg" or "wmf" (windows only).
# If device is not given, it is guessed from the filename extension.
ggsave(filename = "/path/to/figure/filename.eps")
ggsave(filename = "/path/to/figure/filename.jpg")
ggsave(filename = "/path/to/figure/filename.pdf")

This is just a taste of what you can do with ggplot2. RStudio provides a really useful cheat sheet of the different layers available, and more extensive documentation is available on the ggplot2 website. Finally, if you have no idea how to change something, a quick Google search will usually send you to a relevant question and answer on Stack Overflow with reusable code to modify!
ggplot save reference http://ggplot2.tidyverse.org/reference/ggsave.html
ggplot cheat sheet https://www.rstudio.com/wp-content/uploads/2015/03/ggplot2-cheatsheet.pdf
ggplot site http://ggplot2.org/