1 Introduction

R has a steep learning curve. It is often useful to cannibalize and modify someone else’s code rather than starting from scratch. I hope this page offers just such a resource. If you have a plot that you are particularly proud of, please feel free to send me the code. I will post it here and acknowledge you as author.

Often half the battle of R is reading the data in. If you plan to share a code snippet, consider posting your data to a public repository. You can send a link, and R will read the data in. This obviates the need for each user to set up a unique working directory. The ‘read_csv’ command from the readr library works pretty well. You can also link to Google Drive or a similar public URL, but GitHub seems to work the smoothest.

Here’s sample code that successfully reads data in from a CSV file posted on my lab’s GitHub repository.

library(readr)  #provides read_csv
url.dat <- read_csv("https://raw.githubusercontent.com/reilly-lab/reilly-lab.github.io/master/BoyGirl.csv", 
    col_names = TRUE)
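
Links go stale and networks drop, so it can help to wrap the read in an error handler. What follows is a minimal sketch rather than part of the original page: ‘read_remote_csv’ is a hypothetical helper name, and the small temporary file (with made-up values) stands in for a remote CSV so the example runs offline. ‘read_csv’ takes a local path or a URL in the same argument.

```r
library(readr)

# Hypothetical helper: returns a tibble, or NULL if the file or URL cannot be read
read_remote_csv <- function(path) {
    tryCatch(suppressMessages(read_csv(path, col_names = TRUE)), error = function(e) {
        message("Could not read ", path, ": ", conditionMessage(e))
        NULL
    })
}

# Stand-in data written to a temporary file (values invented for illustration)
tmp <- tempfile(fileext = ".csv")
writeLines(c("id,score", "1,10", "2,20", "3,30"), tmp)

ok.dat <- read_remote_csv(tmp)  #a 3-row tibble with columns id and score
bad.dat <- read_remote_csv(file.path(tempdir(), "no_such_file.csv"))  #NULL, plus a message
```

Swapping ‘tmp’ for a raw GitHub URL should behave the same way, provided the link is public.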

Here are some libraries we will be using.

library(reshape2)  #melt and cast
library(tidyverse)  #ggplot2 and dplyr
library(readr)  #read in URLs
library(formatR)  #tidies code chunks in the knitted output
library(knitr)  #knits this RMarkdown to HTML or whatever the hell else you want it to
library(gplots)  #heatmap.2 and other plotting extras
library(corrplot)  #correlation matrices and correlograms
library(tibble)  #clean little tibbles
library(RColorBrewer)  #creates custom color palettes
library(splitstackshape)  #generates ID variables, crazy useful little tool.
library(ggdendro)  #dendrograms
library(TTR)  #smoothing and simple moving averages for time series
library(ggthemes)  #tufte, economist, etc.
library(psych)  #describeBy
library(igraph)  #used here for converting hclust to igraph object (vector(s) of edges)
library(ggraph)  #plots cluster dendrogram as graph/network
library(dendextend)  #the mighty dendrogram
library(ggrepel)  #nudges overlapping text labels apart a wee bit
library(DescTools)  #reorders factors
library(readxl)  #reads Excel files

2 Aesthetics, Themes, Graphical Parameters

2.1 My custom theme for ggplot2

Here’s a minimalist home brew of a theme for ggplot2. I’ll add this to most of the plots to follow. It strips panel gridlines and all sorts of other default junk.

jamie.theme <- theme_bw() + theme(axis.line = element_line(colour = "black"), 
    panel.grid.minor = element_blank(), panel.grid.major = element_blank(), 
    panel.border = element_blank(), panel.background = element_blank(), legend.title = element_blank())
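
If you would rather not add the theme to every plot by hand, ggplot2’s ‘theme_set’ can make it the session default. Here is a minimal sketch, assuming the built-in ‘mtcars’ data as a stand-in (it is not used elsewhere on this page). The names ‘p.one’, ‘p.all’, and ‘old.theme’ are made up for illustration.

```r
library(ggplot2)

jamie.theme <- theme_bw() + theme(axis.line = element_line(colour = "black"), 
    panel.grid.minor = element_blank(), panel.grid.major = element_blank(), 
    panel.border = element_blank(), panel.background = element_blank(), legend.title = element_blank())

# Option 1: add the theme to a single plot
p.one <- ggplot(mtcars, aes(wt, mpg)) + geom_point() + jamie.theme

# Option 2: make it the session default; theme_set returns the previous default
old.theme <- theme_set(jamie.theme)
p.all <- ggplot(mtcars, aes(wt, mpg)) + geom_point()  #plots drawn while this default is active use jamie.theme

# Restore the original default when you are done
theme_set(old.theme)
```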


2.2 Axis adjustments (limits, breaks, etc.)

ggplot2 can be a bit challenging regarding axes. Let’s first generate a data frame with two columns of integers, each randomly sampled from 0 to 100 without replacement. Then plot it.

set.seed(999)  #fixed random sampling
dat <- data.frame(replicate(2, sample(0:100, 100, replace = FALSE)))  #selection without replacement
baseplot <- ggplot(dat, aes(X1, X2)) + geom_point(shape = 21, fill = "blue", 
    size = 2.3, alpha = 0.6) + jamie.theme + ylab(NULL) + xlab(NULL)
print(baseplot)
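
Before plotting, it can be worth confirming that the simulated data look the way you expect. This is a quick sanity check, offered as a sketch; ‘no.repeats’ and ‘in.range’ are hypothetical names.

```r
set.seed(999)  #fixed random sampling
dat <- data.frame(replicate(2, sample(0:100, 100, replace = FALSE)))

dim(dat)  #100 rows, 2 columns
names(dat)  #data.frame() labels the matrix columns X1 and X2

# TRUE for a column when none of its values repeat (sampling without replacement)
no.repeats <- sapply(dat, function(col) anyDuplicated(col) == 0)

# TRUE for a column when every value falls between 0 and 100
in.range <- sapply(dat, function(col) all(col >= 0 & col <= 100))
```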

2.2.1 Specify user-defined axis breaks.

Yuck. We need finer-grained notation on both axes. Add breaks every 10.

newplot <- baseplot + scale_x_continuous(breaks = seq(0, 100, 10), limits = c(0, 
    100)) + scale_y_continuous(breaks = seq(0, 100, by = 10))
print(newplot)  #breaks are built with seq(from, to, by)
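
The code above sets limits on the x-axis only. If you want both axes to match exactly, one option is to build the breaks once and reuse them. This is a sketch and not the original code; ‘axis.breaks’ and ‘evenplot’ are hypothetical names.

```r
library(ggplot2)

set.seed(999)
dat <- data.frame(replicate(2, sample(0:100, 100, replace = FALSE)))

# seq(from, to, by) returns 0, 10, 20, ..., 100
axis.breaks <- seq(from = 0, to = 100, by = 10)

# Reuse the same breaks and limits on both axes
evenplot <- ggplot(dat, aes(X1, X2)) + geom_point(shape = 21, fill = "blue", 
    size = 2.3, alpha = 0.6) + scale_x_continuous(breaks = axis.breaks, limits = range(axis.breaks)) + 
    scale_y_continuous(breaks = axis.breaks, limits = range(axis.breaks))
```

Changing the spacing then only requires editing the ‘by’ argument in one place.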

2.2.2 Set axis limits.

‘xlim’ and ‘ylim’ drop any data that fall outside the limits, and ggplot2 warns you about the removed rows. Tread lightly. Here is what restricting the x-axis to 0-50 looks like.

smaller <- baseplot + xlim(c(0, 50))
print(smaller)
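
To see how much data the limits cost you, you can count the affected rows directly. This is a rough sketch; ‘x.limits’, ‘outside’, ‘n.dropped’, and ‘n.kept’ are hypothetical names.

```r
set.seed(999)
dat <- data.frame(replicate(2, sample(0:100, 100, replace = FALSE)))

x.limits <- c(0, 50)

# Rows whose X1 value lies outside the limits
outside <- dat$X1 < x.limits[1] | dat$X1 > x.limits[2]

n.dropped <- sum(outside)  #rows that xlim(c(0, 50)) would remove from the plot
n.kept <- sum(!outside)  #rows that remain
```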

2.2.3 Zoom in on a specific plot range.

‘coord_cartesian’ zooms in on a specific range without cutting or eliminating data.

focused <- baseplot + coord_cartesian(xlim = c(0, 25), ylim = c(0, 50))
print(focused)
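
The difference matters most when a layer computes something from the data. Below is a minimal sketch, assuming a ‘geom_smooth’ layer that is not part of the original plots. With ‘xlim’ the line is fitted only to the rows that survive the limits. With ‘coord_cartesian’ it is fitted to every row and the view is then cropped. The object names are hypothetical.

```r
library(ggplot2)

set.seed(999)
dat <- data.frame(replicate(2, sample(0:100, 100, replace = FALSE)))

p <- ggplot(dat, aes(X1, X2)) + geom_point() + geom_smooth(method = "lm", 
    formula = y ~ x, se = FALSE)

clipped <- p + xlim(c(0, 50))  #line fitted to rows with X1 between 0 and 50 only
zoomed <- p + coord_cartesian(xlim = c(0, 50))  #line fitted to all rows, view cropped

# The same contrast computed by hand
slope.all <- coef(lm(X2 ~ X1, data = dat))[["X1"]]
slope.clipped <- coef(lm(X2 ~ X1, data = subset(dat, X1 <= 50)))[["X1"]]
```

Printing ‘clipped’ should trigger a warning about removed rows, while ‘zoomed’ should print without one.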