Graphics in R
Dot Plots
Create dot plots with the dotchart(x, labels=) function, where x is a numeric vector and
labels is a vector of labels for each point. You can add a groups= option to designate a factor
specifying how the elements of x are grouped. If you do, the option gcolor= controls the color of
the group labels. The cex option controls the size of the labels.
#############################################################################
# Simple Dotplot
dotchart(mtcars$mpg,labels=row.names(mtcars),cex=.7,
main="Gas Mileage for Car Models",
xlab="Miles Per Gallon")
#############################################################################
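The groups= and gcolor= options described above are not used in the simple example, so here is a minimal sketch of a grouped dot plot. The choice of cylinder count as the grouping factor, the three colors, and the helper name cars_sorted are illustrative assumptions, not part of the original example.

```r
# Dotplot grouped by number of cylinders (illustrative sketch)
cars_sorted <- mtcars[order(mtcars$mpg), ]          # sort cars by mpg
cars_sorted$cyl <- factor(cars_sorted$cyl)          # grouping factor: 4, 6, 8
# one color per cylinder group (colors chosen arbitrarily)
cars_sorted$color <- c("red", "blue", "darkgreen")[as.integer(cars_sorted$cyl)]
dotchart(cars_sorted$mpg, labels = row.names(cars_sorted), cex = .7,
         groups = cars_sorted$cyl,                  # how points are grouped
         gcolor = "black",                          # color of the group labels
         color = cars_sorted$color,                 # color of points and labels
         main = "Gas Mileage for Car Models\ngrouped by cylinder",
         xlab = "Miles Per Gallon")
```

Sorting before plotting is optional; it simply orders the points within each group by mileage.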
Bar Plots
Create barplots with the barplot(height) function, where height is a vector or matrix. If
height is a vector, the values determine the heights of the bars in the plot. If height is a
matrix and the option beside=FALSE then each bar of the plot corresponds to a column of
height, with the values in the column giving the heights of stacked “sub-bars”. If height is a
matrix and beside=TRUE, then the values in each column are juxtaposed rather than
stacked. Include the option names.arg=(character vector) to label the bars. Use the option
horiz=TRUE to create a horizontal barplot.
#############################################################################
# Simple Bar Plot
counts <- table(mtcars$gear)
barplot(counts, main="Car Distribution",
xlab="Number of Gears")
#############################################################################
# Simple Horizontal Bar Plot with Added Labels
counts <- table(mtcars$gear)
barplot(counts, main="Car Distribution", horiz=TRUE,
names.arg=c("3 Gears", "4 Gears", "5 Gears"))
#############################################################################
# Stacked Bar Plot with Colors and Legend
counts <- table(mtcars$vs, mtcars$gear)
barplot(counts, main="Car Distribution by Gears and VS",
xlab="Number of Gears", col=c("darkblue","red"),
legend = rownames(counts))
#############################################################################
# Grouped Bar Plot
counts <- table(mtcars$vs, mtcars$gear)
barplot(counts, main="Car Distribution by Gears and VS",
xlab="Number of Gears", col=c("darkblue","red"),
legend = rownames(counts), beside=TRUE)
#############################################################################
Notes
Bar plots need not be based on counts or frequencies. You can create bar plots that
represent means, medians, standard deviations, etc. Use the aggregate( ) function and pass
the results to the barplot( ) function.
By default, the categorical axis line is suppressed. Include the option axis.lty=1 to draw it.
With many bars, bar labels may start to overlap. You can decrease the font size using the
cex.names = option. Values smaller than one will shrink the size of the label. Additionally,
you can use graphical parameters such as the following to help text spacing:
#############################################################################
# Fitting Labels
par(las=2) # make label text perpendicular to axis
par(mar=c(5,8,4,2)) # increase y-axis margin.
counts <- table(mtcars$gear)
barplot(counts, main="Car Distribution", horiz=TRUE, names.arg=c("3 Gears",
"4 Gears", "5 Gears"), cex.names=0.8)
#############################################################################
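The note above says bar plots can represent summary statistics by passing aggregate( ) results to barplot( ). A minimal sketch of that idea follows, assuming mean mpg per cylinder count as the statistic of interest; the variable name means and the titles are illustrative.

```r
# Bar plot of group means (illustrative sketch)
# aggregate() returns a data frame with one row per value of cyl
means <- aggregate(mpg ~ cyl, data = mtcars, FUN = mean)
barplot(means$mpg, names.arg = means$cyl,
        main = "Mean MPG by Number of Cylinders",
        xlab = "Number of Cylinders", ylab = "Mean Miles Per Gallon",
        axis.lty = 1)   # draw the categorical axis line
```

Swapping FUN = mean for median or sd gives the other summaries mentioned in the note.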
Scatterplots
Simple Scatterplot
There are many ways to create a scatterplot in R. The basic function is plot(x, y), where x
and y are numeric vectors denoting the (x,y) points to plot.
#############################################################################
# Simple Scatterplot
# refer to columns through the data frame rather than attach( ),
# which can silently mask other objects with the same names
plot(mtcars$wt, mtcars$mpg, main="Scatterplot Example",
xlab="Car Weight", ylab="Miles Per Gallon", pch=19)
#############################################################################
# Add fit lines
abline(lm(mpg~wt, data=mtcars), col="red") # regression line (y~x)
lines(lowess(mtcars$wt, mtcars$mpg), col="blue") # lowess line (x,y)
#############################################################################
Scatterplot Matrices
There are at least 4 useful functions for creating scatterplot matrices (analysts must love
scatterplot matrices!). The base graphics function is pairs( ), shown here.
#############################################################################
# Basic Scatterplot Matrix
pairs(~mpg+disp+drat+wt,data=mtcars,
main="Simple Scatterplot Matrix")
#############################################################################
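As a small variation on the basic matrix, pairs( ) accepts panel functions that control what is drawn in each cell. The sketch below assumes the base panel.smooth( ) function is a reasonable smoother for the lower triangle; that choice and the variable name vars are illustrative.

```r
# Scatterplot matrix with a smoothed curve in each lower-triangle panel
vars <- mtcars[, c("mpg", "disp", "drat", "wt")]   # same four variables as above
pairs(vars, lower.panel = panel.smooth,
      main = "Scatterplot Matrix with Smoothers")
```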
Boxplots
Boxplots can be created for individual variables or for variables by group. The format is
boxplot(x, data=), where x is a formula and data= denotes the data frame providing the
data. An example of a formula is y~group where a separate boxplot for numeric variable y
is generated for each value of group. Add varwidth=TRUE to make boxplot widths
proportional to the square root of the sample sizes. Add horizontal=TRUE to reverse the
axis orientation.
#############################################################################
# Boxplot of MPG by Car Cylinders
boxplot(mpg~cyl,data=mtcars, main="Car Mileage Data",
xlab="Number of Cylinders", ylab="Miles Per Gallon")
#############################################################################
# Notched Boxplot of Tooth Growth Against 2 Crossed Factors
# boxes colored for ease of interpretation
boxplot(len~supp*dose, data=ToothGrowth, notch=TRUE,
col=c("gold","darkgreen"),
main="Tooth Growth", xlab="Supplement and Dose")
#############################################################################
In the notched boxplot, if the notches of two boxes do not overlap, this is ‘strong evidence’ that
their medians differ (Chambers et al., 1983, p. 62).
Colors recycle. In the example above, if I had listed 6 colors, each box would have its own
color. Earl F. Glynn has created an easy-to-use list of colors in PDF format.
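The varwidth= and horizontal= options described in this section are not exercised by the examples, so here is a minimal sketch using the same mtcars data. Storing the result in bp is an illustrative choice; boxplot( ) invisibly returns the statistics it plots.

```r
# Horizontal boxplot with box widths reflecting group sizes (illustrative sketch)
bp <- boxplot(mpg ~ cyl, data = mtcars,
              varwidth = TRUE,      # widths proportional to sqrt(sample size)
              horizontal = TRUE,    # numeric axis runs left to right
              main = "Car Mileage Data",
              xlab = "Miles Per Gallon", ylab = "Number of Cylinders")
bp$n   # number of cars in each cylinder group
```

Because the groups differ in size, the boxes are drawn with visibly different widths.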
Line Charts
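Line charts are created with the function lines(x, y, type=), where x and y are numeric
vectors of (x,y) points to connect. lines( ) adds to an existing graph, so it usually follows a
plot( ) call. type= can take values including the following:
type    description
p       points
l       lines
b       both points and lines
s, S    stair steps
n       no plotting (sets up the axes only)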
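A short sketch of how the type= option changes the result may help before the larger example. The data (x and its squares), the three-panel layout, and the loop variable tp are illustrative assumptions; each panel first draws an empty frame with plot( ) and then adds the data with lines( ).

```r
# Compare three values of type= side by side (illustrative sketch)
x <- 1:5
y <- x^2
opar <- par(mfrow = c(1, 3))            # three panels in one row
for (tp in c("p", "l", "s")) {
  # type = "n" draws the axes only, leaving the frame empty
  plot(x, y, type = "n", main = paste0('type = "', tp, '"'))
  lines(x, y, type = tp)                # add the data in the chosen style
}
par(opar)                               # restore the previous layout
```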
To demonstrate the creation of a more complex line chart, let's plot the growth of 5 orange
trees over time. Each tree will have its own distinctive line. The data come from the dataset
Orange.
#############################################################################
# Create Line Chart
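# convert factor labels to numeric tree numbers for convenience
# (go through as.character: Tree is an ordered factor, so as.numeric alone
# would return the internal level codes, not the tree numbers)
Orange$Tree <- as.numeric(as.character(Orange$Tree))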
ntrees <- max(Orange$Tree)
#############################################################################
# get the range for the x and y axis
xrange <- range(Orange$age)
yrange <- range(Orange$circumference)
#############################################################################
# set up the plot
plot(xrange, yrange, type="n", xlab="Age (days)",
ylab="Circumference (mm)" )
colors <- rainbow(ntrees)
linetype <- 1:ntrees
plotchar <- seq(18, 18+ntrees-1, 1) # one plotting symbol per tree
#############################################################################
# add lines
for (i in 1:ntrees) {
  tree <- subset(Orange, Tree==i)
  lines(tree$age, tree$circumference, type="b", lwd=1.5,
    lty=linetype[i], col=colors[i], pch=plotchar[i])
}
#############################################################################
# add a title and subtitle
title("Tree Growth", "example of line plot")
#############################################################################
# add a legend
legend(xrange[1], yrange[2], 1:ntrees, cex=0.8, col=colors,
pch=plotchar, lty=linetype, title="Tree")
#############################################################################
Pie Charts
Pie charts are not recommended in the R documentation, and their features are somewhat
limited. The authors recommend bar or dot plots over pie charts because people are able to
judge length more accurately than area. Pie charts are created with the function pie(x,
labels=), where x is a non-negative numeric vector indicating the area of each slice and
labels= is a character vector of names for the slices.
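For completeness, a minimal sketch of the pie( ) call described above, reusing the gear counts from the bar plot examples. The label text and the variable names counts and lbls are illustrative.

```r
# Simple pie chart of cars by number of gears (illustrative sketch)
counts <- table(mtcars$gear)                  # cars with 3, 4 and 5 gears
lbls <- paste(names(counts), "Gears")         # slice labels
pie(as.vector(counts), labels = lbls,
    main = "Car Distribution by Gears")
```

Plotting the same counts with barplot(counts) makes the comparison between groups easier to read, which is the point of the recommendation above.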