A Crash R Course on Statistical Graphics

Dr. Isabella R. Ghement
Ghement Statistical Consulting Company Ltd.
[email protected]
American Statistical Association Conference on Statistical Practice
February 21, 2013, 1:00pm – 5:00pm
New Orleans, LA

Outline
1. Learning Goals
2. Overview of R
3. Things to Know about R
4. Good Practices
5. Getting Started with R
6. Data Import/Export
7. Graphical Systems in R

8. Basic R Graphics
9. Customizing Basic R Graphics
10. Advanced R Graphics
11. Customizing Advanced R Graphics
12. R Graphics Housekeeping
13. Summary
14. References

Learning Goals

Overarching Learning Goals
After attending this course, you will be able to:
• Organize your work in R by creating and saving R scripts;
• Import/export data using R;
• Produce standard statistical plots using the R package graphics;
• Produce advanced statistical plots using the R package lattice;
• Customize basic and advanced statistical plots;
• Save basic and advanced statistical plots in a variety of formats (e.g., jpeg, pdf).

Overview of R
Learning Goal: Understand what R is, what it can do for you and where to find R resources.

Overview of R
• R is an open-source software environment and programming language for statistical computing and graphics.
• R’s use is governed by the GNU General Public License.
• R was created in the mid-1990s by Ross Ihaka and Robert Gentleman (also known as “R & R”) of the Statistics Department at the University of Auckland, New Zealand.
• Some people claim that R was created by academics for academics. This may explain the steep learning curve some learners face when switching to R.

Overview of R
• R is updated several times a year, and each upgrade includes new functionality. It is good to keep up by installing the latest version of R. However, it is also worth keeping previously installed versions of R, as old R code sometimes no longer works with recent versions.
• You can check the website http://cran.stat.ucla.edu/ for R upgrades.
• R runs on all major operating systems: Windows, Mac, Linux and Unix.
• R is currently developed by the R Development Core Team, a group of researchers with write access to the R source code.

Overview of R
R has its own dedicated website: http://www.r-project.org/
The R website provides access to a variety of resources, including:
- R Mailing Lists (e.g., R-help)
- R Conferences (e.g., useR!)
- CRAN (i.e., the go-to website for installing R)
- Search resources
- R Manuals
- R Books
- R Journal

Overview of R
To cite R in publications, use the following:

R Development Core Team (2012). R: A language and environment for statistical computing. R Foundation for Statistical Computing, Vienna, Austria. ISBN 3-900051-07-0, URL http://www.R-project.org/.

Overview of R
The original R is focused on function rather than form, and its graphical user interface reflects this focus.

Efforts to improve R’s graphical user interface have led to enhanced versions of R such as:
• RStudio (http://www.rstudio.com)
• Revolution R (http://www.revolutionanalytics.com)

Overview of R
R offers a powerful and versatile platform for:
• Data Processing and Manipulation (e.g., packages plyr, reshape);
• Statistical Graphics (e.g., packages graphics, grid, lattice, ggplot2);
• Statistical Analyses (see the CRAN Task Views website for a list of R packages dedicated to implementing specific statistical analyses, http://cran.r-project.org/web/views/);
• Statistical Programming (e.g., the built-in R programming language as well as the ability to interface with C++, FORTRAN, Java and Python via packages such as Rcpp, Rfortran, rJava and RPy);
• Statistical Reporting (e.g., excellent interface with LaTeX via Sweave and some interface with Word via packages such as R2HTML and rtf).

Overview of R
To learn more about R, you can refer to introductory R books such as:
• “R in Action: Data Analysis and Graphics with R”, by Robert I. Kabacoff (Manning Publications Co., 2011)
• “R for Dummies”, by Andrie de Vries and Joris Meys (John Wiley & Sons, 2012)
• “R Cookbook”, by Paul Teetor (O'Reilly, 2011)
• “R for Statistics”, by P.-A. Cornillon et al. (CRC Press, 2012)

The first of these books is accompanied by an excellent website, Quick-R: http://www.statmethods.net/.

Overview of R
R users familiar with other statistical software (e.g., Stata, SAS, SPSS) can also consult these books:
• “R for Stata Users”, by Robert A. Muenchen and Joseph M. Hilbe (Springer, 2010);
• “R for SAS and SPSS Users”, by Robert A. Muenchen (Springer, 2009).

See http://www.r-project.org/doc/bib/R-books.html for additional R book references.

Things To Know About R
Learning Goal: Be aware of some of R’s unique features and quirks.

Things to know about R
• R is case sensitive!
  e.g.: anova is different from Anova
• R uses the assignment operator <- to assign names to values or create new data objects.
  e.g.: m <- 1 + 2
• R provides access to help files via the question mark.
  e.g.: ?mean

Things to know about R
• R uses the function c() (short for concatenate or combine) to combine values or labels into a vector.
  e.g.: var <- c(1, 2, 3)
• R uses quotation marks for character strings, which can be interpreted as names or labels.
  e.g.: col <- c("red", "blue")
        data <- read.csv("datafile.csv")

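The points made so far (case sensitivity, assignment, c() and quoted strings) can be tried out in a few lines. This is only a minimal sketch; the object names m, M, values and colours are made up for illustration.

```r
# R is case sensitive: m and M are two different objects
m <- 1 + 2
M <- 10
m == M                      # FALSE

# c() combines values into a vector
values <- c(1, 2, 3)
length(values)              # 3

# Quotation marks create character strings
colours <- c("red", "blue")
class(colours)              # "character"
class(values)               # "numeric"
```
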
Things to know about R
R uses different types of structures for storing data:
• vectors
• factors
• matrices
• arrays
• data frames
• lists

[Figure: small examples of each structure, e.g. a vector (1, 2, 3), a factor with levels M and F, a matrix with rows (1, 10), (2, 20), (3, 30), an array, and a data frame with rows (1, M, 1.5), (2, F, 1.8), (3, M, 1.7)]

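Each of these structures can be created with a one-line call. The sketch below uses made-up values and object names (v, sex, mat, arr, df, lst) purely for illustration.

```r
# Vector: values of a single type
v <- c(1, 2, 3)

# Factor: a categorical variable with a fixed set of levels
sex <- factor(c("M", "F", "M"))

# Matrix: a two-dimensional table of values of a single type
mat <- matrix(c(1, 2, 3, 10, 20, 30), nrow = 3, ncol = 2)

# Array: like a matrix, but with any number of dimensions
arr <- array(1:12, dim = c(3, 2, 2))

# Data frame: columns may have different types
df <- data.frame(id = 1:3, sex = sex, height = c(1.5, 1.8, 1.7))

# List: a container whose elements can be any R objects
lst <- list(numbers = v, labels = sex, table = df)

str(df)
```
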
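Things to know about R
• R uses the symbol NA to denote missing values (i.e., Not Available).
  e.g.: 1
        NA
        3
• In R, most operations performed on variables that include missing values produce a missing value as a result. Many summary functions (e.g., sum(), mean()) accept the argument na.rm = TRUE to drop missing values first.
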
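A short sketch of how NA propagates, and how it can be set aside, using a made-up vector x:

```r
x <- c(1, NA, 3)

x + 1                   # NA stays NA: 2 NA 4
sum(x)                  # NA
mean(x)                 # NA

is.na(x)                # FALSE TRUE FALSE
sum(x, na.rm = TRUE)    # 4
mean(x, na.rm = TRUE)   # 2
```
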
Things to know about R
• R relies on functions for the automation of operations.
  e.g.:

f <- function(x){
  plot(x)
  return(summary(x))
}

Things to know about R
• R uses packages to bundle up functions useful for performing certain data processing tasks, producing certain types of graphs or performing specialized statistical analyses. R packages may also include data sets and help documentation.
• Thousands of R packages are available on CRAN (Comprehensive R Archive Network) and can be installed in R with the command:
  install.packages("package_name")
• Once installed in R, packages need to be attached to the current R working session:
  require(package_name)
• For a list of R packages available on CRAN, see: http://cran.r-project.org/web/packages/

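One common pattern is to install a package only when it is missing and then attach it. The helper use_package() below is a hypothetical convenience function written for this sketch, not part of R; grid is used as the example because it is part of every R installation.

```r
# Hypothetical helper (not a base R function):
# install a package only if it is missing, then attach it
use_package <- function(pkg) {
  if (!requireNamespace(pkg, quietly = TRUE)) {
    install.packages(pkg)
  }
  library(pkg, character.only = TRUE)
}

use_package("grid")            # grid ships with R, so nothing is downloaded
"package:grid" %in% search()   # TRUE once the package is attached
```
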
Things to know about R
For the purpose of creating graphs or implementing statistical analyses, R uses formulas such as:

y ~ x        (y as a function of x)
y ~ x | f    (y as a function of x, conditional on f)
y ~ x1*x2    (y as a function of x1, x2 and their interaction)

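Formulas are ordinary R objects that can be stored and inspected before being handed to a plotting or modelling function. The sketch below uses placeholder variable names for the three formulas and the built-in cars data set for the plot and the model; these choices are illustrative assumptions.

```r
# Variable names here are placeholders; nothing is evaluated until the
# formula is handed to a plotting or modelling function
f1 <- y ~ x
f2 <- y ~ x | f
f3 <- y ~ x1 * x2

class(f1)                            # "formula"
all.vars(f2)                         # "y" "x" "f"
attr(terms(f3), "term.labels")       # "x1" "x2" "x1:x2"

# The same formula can drive a plot and a model (built-in cars data set)
plot(dist ~ speed, data = cars)
fit <- lm(dist ~ speed, data = cars)
coef(fit)
```
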
Things to know about R
• R uses various types of brackets:
  [ ]
  [[ ]]
  ( )
  { }
It takes a while to get used to the meaning of each of these brackets and to know when and how to use them.

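As a rough guide to the four kinds of brackets, the following sketch applies each of them to a made-up named vector (scores) and list (info); the helper double_it() is likewise invented for the example.

```r
scores <- c(a = 10, b = 20, c = 30)
info <- list(name = "ozone", values = scores)

# ( ) calls a function or groups an expression
mean(scores)            # 20
(1 + 2) * 3             # 9

# [ ] extracts a subset and keeps the container type
scores[2]               # named numeric vector of length 1
info["name"]            # a list of length 1

# [[ ]] extracts a single element itself
info[["name"]]          # "ozone"

# { } groups several statements, e.g. in a function body
double_it <- function(x) {
  y <- 2 * x
  y
}
double_it(scores[["a"]])   # 20
```
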
Good Practices
Learning Goal: Adopt a basic set of good practices when working with R in order to keep your R work organized and ensure it is reproducible.

Be organized when working with R
• Set your working directory at the very beginning of each R session. This way, everything you save during that session is placed in your working directory.
• Type all of your R commands in script files, to ensure your work is reproducible. Script files are simply text files with the extension .R.
• Set desired options for controlling various aspects of the session (e.g., maximum object size, maximum memory size):

options(object.size=10e10)   # not among the options documented in ?options; R stores it but may not act on it
memory.size(max=TRUE)        # Windows only; a non-functional stub in R 4.2.0 and later

Note: To access the help file for the options() function, type the following R command in the R Console window:
?options

Be organized when working with R
• It pays off to be diligent about dating, versioning and commenting all of your R script files.
• The pound symbol, #, is used to comment lines in an R script file.
  e.g.: # This is a comment in an R script file.
        demo(graphics)
• R script files bear the extension .R.
• Suggested naming conventions for R script files:
  Project_Results.R
  ProjectResults.R
  Project.Results.R

Getting Started with R
Learning Goal: Understand the R workflow and know how to interact with R via R script files.

R Workflow
1. Launch R and set up the current session.
2. Type R commands in an R script.
3. Send R commands to the R Console for execution.
4. Create R output (e.g., numerical output, graphs, processed data sets).
5. Save R output and quit R.

Launching R
• If you have an R icon on your desktop, double-click on it to launch R.
  [Image: example of R icon]
• If you don’t have an R icon on your desktop, go to Start → All Programs. Find the R application among the list of programs installed on your computer and select it in order to launch R.

Taking Stock of R’s Interface
Notes:
• R has an R Console window, where we can either type commands directly or send commands stored in script files for execution.
• R also has a GUI menu, which allows us to change the working directory for the current R working session and open new R script files.

GUI = Graphical User Interface

Setting Up the Current R Session
• Set your working directory for the current R session via the R GUI commands:

File → Change dir...

• Check your current working directory using the R command:

getwd()

• Set your options for the current R session by typing the following in your R script:

# set options for current session
options(object.size=10e10)
memory.size(max=TRUE)        # Windows only

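The working directory can also be set from a script with setwd(), which keeps the session setup reproducible. In the sketch below, tempdir() merely stands in for your own project folder.

```r
# tempdir() stands in for your project folder in this sketch
old_dir <- getwd()          # remember where we started
setwd(tempdir())            # scripted equivalent of File -> Change dir...
getwd()                     # confirm the new working directory

# Anything saved with a relative name now lands in the working directory
writeLines("# my first script", "Script.R")
file.exists("Script.R")     # TRUE

setwd(old_dir)              # restore the original working directory
```
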
Opening and Saving an R Script File
• Open a new R script using the following commands from the R GUI menu:
  File → New script
  Note: Existing R script files can be accessed in R via the R GUI menu commands:
  File → Open script...
• Save the script using the R GUI commands File → Save as... For now, you can call the script Script.R.
• Make it a good habit to keep saving the script file as you continue to add R commands to it. Simply press the keyboard keys Ctrl + S whenever you are ready to save the script.

Modifying an R Script File
• Create a header for your script file, similar to the one below.

###################################
# A Crash R Course on Statistical Graphics
# New Orleans, LA
# February 21, 2013
###################################

• Save the script file and carry on.
• As we progress through the course, please copy and paste R commands from the course slides into your R script file(s) and then send these commands to R for execution by selecting them and using the keyboard keys Ctrl + R.

R Graphics Demo
• For now, type the following command in your R script to access an R Graphics Demo:

demo(graphics)

  Press the Enter key to scroll through the various graphs available in this Demo.
• You can send the demo(graphics) command to R for execution by selecting it in the script file and pressing the keys Ctrl + R.
• You can also send commands to R for execution by copying them from the script file with Ctrl + C and pasting them into the R Console window with Ctrl + V.

Quitting R
• To quit R at the end of a session, you can simply type the following command in the R Console window:

quit()

• In general, you don’t need to save the workspace attached to the current R session if you save all of the script files and the numerical and graphical output they produce.
• For this course, we do not need to quit R yet.

Data Import
Learning Goal: Be able to import comma delimited data files and text data files in R.

R Functions for Data Import
R offers a variety of functions for importing data files. Two of these functions are shown below.

File Type               File Extension   R Function     R Help
Comma Separated File    .csv             read.csv()     ?read.csv
Text File               .txt             read.table()   ?read.table

Note that read.csv() is a special version of read.table().

Both of these functions require the name of the import file to be specified (provided the file is located in the current R working directory):

dataset <- read.csv("datafile.csv")
dataset <- read.table("datafile.txt")


read.csv()
One of the easiest ways to import data into R is to save that data as a comma delimited file (.csv) and then use the function read.csv() to bring this file into R.

dataset <- read.csv("datafile.csv", as.is = TRUE)

• "datafile.csv": name of the csv file where the import data are stored; this file must be located in the R working directory
• as.is = TRUE: option for preserving character variables

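read.csv()
When calling read.csv(), it is important to use the option as.is = TRUE. This will prevent R from automatically converting all of the character variables in the data to factors. (Since R 4.0.0, read.csv() no longer converts character variables to factors by default, so the option is harmless but no longer required.)

Because date columns are then kept as character strings, they are particularly easy to convert to dates using the lubridate package.
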
read.csv()
The command read.csv() can also be used with the following arguments:

# browse interactively for the import data file
dataset <- read.csv(file.choose(), as.is = TRUE)

# read the import data file from a specific location on the computer
dataset <- read.csv("C:/desktop/datafile.csv", as.is = TRUE)

read.table()
The function read.table() is used to import text data files (.txt) into R. In general, this function requires more arguments than read.csv().

dataset <- read.table("datafile.txt",
                      sep="\t",
                      header = TRUE,
                      as.is = TRUE)

Notes:
• sep specifies the type of separator used to delimit data columns, such as:
  sep="\t"   tab
  sep=" "    white space
  sep=","    comma
• header indicates whether or not the first line of the file contains the variable names.
• as.is indicates whether or not character variables should be preserved as character (rather than converted to factors).

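To see how sep, header and as.is interact without needing a file on disk, the text argument of read.table() can be given the file contents directly. The data values and the object name lines below are made up for illustration.

```r
# Inline tab-delimited data standing in for the contents of a .txt file
# (the values are made up for illustration)
lines <- "site\tozone\ttemp\nA\t0.04\t71\nB\t0.06\t85\nC\t0.05\t78"

dataset <- read.table(text = lines,
                      sep = "\t",        # columns are separated by tabs
                      header = TRUE,     # first line holds the variable names
                      as.is = TRUE)      # keep character columns as character

str(dataset)
names(dataset)                           # "site" "ozone" "temp"
class(dataset$site)                      # "character"
```
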
read.table()
The command read.table() can also be used with the following arguments:

# browse interactively for the import data file
dataset <- read.table(file.choose(), as.is = TRUE)

# read the import data file from a specific location on the computer
dataset <- read.table("C:/desktop/datafile.txt", as.is = TRUE)

Example of Data Import in R
air <- read.csv("Air Quality Baton Rouge 2011.csv", as.is=TRUE)
str(air)

require(lubridate)
air$Date <- mdy(air$Date)
str(air)

air <- air[1:365, ]
View(air)

Notes on lubridate package:
The lubridate package includes the following functions for converting character variables storing dates into date variables:
ymd()   year month day
mdy()   month day year
dmy()   day month year

Examples of dates handled by these functions:
ymd() → "2012-10-31" or "2012/10/31"
mdy() → "10-31-2012" or "10/31/2012"
dmy() → "31-10-2012" or "31/10/2012"

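If the lubridate package is not available, base R can perform the same kind of conversion with as.Date(), provided the date format is spelled out. This is a swapped-in base R alternative rather than the approach used in the course, and the date strings are made up.

```r
# Dates stored as text in month/day/year order (made-up values)
date_text <- c("01/15/2011", "10/31/2011")

# Base R alternative to lubridate's mdy(): spell out the format
dates <- as.Date(date_text, format = "%m/%d/%Y")

class(dates)                 # "Date"
format(dates, "%Y-%m-%d")    # "2011-01-15" "2011-10-31"
months(dates)                # month names (these depend on the locale)
diff(dates)                  # time difference in days
```
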
Exploring Data Imported in R
R stores any data set imported via read.csv() or read.table() as a data frame (i.e., a tabular data set whose columns correspond to statistical variables and whose rows correspond to records).

R Commands for Exploring a Data Frame   Description
View(dataset)                           View data frame
str(dataset)                            Explore structure of data frame
names(dataset), rownames(dataset)       Extract names of variables and records in data frame
nrow(dataset), ncol(dataset)            Extract number of rows and columns in data frame
summary(dataset)                        Summarize the data frame
attach(dataset), detach(dataset)        Attach/detach the data frame to/from the R workspace

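These exploration commands can be tried on any data frame. The sketch below uses airquality, a data set that ships with R, as a stand-in for an imported dataset; View() is left out because it opens an interactive window.

```r
# airquality is a data set that ships with R; it stands in for "dataset"
str(airquality)        # structure: variable names, types, first values
names(airquality)      # variable names
nrow(airquality)       # number of records (153)
ncol(airquality)       # number of variables (6)
summary(airquality)    # per-variable summaries, including NA counts
head(airquality, 3)    # first three records

# with() is an alternative to attach()/detach() for one-off expressions
with(airquality, mean(Temp))
```
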
Exercise on Data Import
Import the data file ozone.txt into R. For ease, the R commands for data import are given below. Explore the resulting data frame.

ozone <- read.table("ozone.txt", header=TRUE, as.is=TRUE)
head(ozone)
str(ozone)
rownames(ozone)
ozone$date <- rownames(ozone)
require(lubridate)
ozone$date <- ymd(ozone$date)
head(ozone)
attach(ozone)

Details on ozone.txt
• Ozone and meteorological variables collected in Rennes (France) during the summer of 2001.
• The variables available are:
  - maxO3 (maximum daily ozone)
  - T12 (temperature at midday)
  - wind (wind direction)
  - rain
  - Wx12 (projection of the wind speed vector on the east-west axis at midday)

Data Export
Learning Goal: Be able to export data from R in the form of comma delimited or text files.

R Functions for Data Export
R offers a variety of functions for exporting data files, but we will focus only on the two functions listed below.

File Type               File Extension   R Function      R Help
Comma Separated File    .csv             write.csv()     ?write.csv
Text File               .txt             write.table()   ?write.table

Data Export
R Command: write.csv()

Generic Syntax:
write.csv(dataframe, "datafile.csv",
          row.names=FALSE,
          quote=FALSE)

Notes:
When using write.csv():
• The argument dataframe can be any data frame available in your R workspace;
• The argument "datafile.csv" represents the name of the csv file storing the exported data frame;
• The option row.names=FALSE prevents R from adding row names to the exported data file;
• The option quote=FALSE prevents R from adding quotes around values of character variables.

Data Export
R Command: write.table()

Generic Syntax:
write.table(dataframe, "datafile.txt",
            sep="\t",
            row.names=FALSE,
            quote=FALSE)

Notes:
When using write.table():
• The argument dataframe can be any data frame available in your R workspace;
• The argument sep is used to specify the column separator;
• The argument "datafile.txt" represents the name of the text file storing the exported data frame;
• The option row.names=FALSE prevents R from adding row names to the exported data file;
• The option quote=FALSE prevents R from adding quotes around values of character variables.

Example of Data Export in R
# Export comma separated file
write.csv(air, "airexport.csv",
          row.names=FALSE,
          quote=FALSE)

# Export text file
write.table(air, "airexport.txt",
            sep="\t",
            row.names=FALSE,
            quote=FALSE)

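A quick way to check an export is to read the file back in and compare it with the original data frame. The sketch below does this round trip with a small made-up data frame (df) and temporary files, so it does not touch your working directory.

```r
# Small made-up data frame standing in for the course data
df <- data.frame(site = c("A", "B"), ozone = c(0.04, 0.06),
                 stringsAsFactors = FALSE)

# tempfile() keeps the sketch from writing into your project folder
csv_file <- tempfile(fileext = ".csv")
txt_file <- tempfile(fileext = ".txt")

write.csv(df, csv_file, row.names = FALSE, quote = FALSE)
write.table(df, txt_file, sep = "\t", row.names = FALSE, quote = FALSE)

readLines(csv_file)          # header line followed by one line per record

# Read both files back and compare with the original
from_csv <- read.csv(csv_file, as.is = TRUE)
from_txt <- read.table(txt_file, sep = "\t", header = TRUE, as.is = TRUE)
all.equal(df, from_csv)      # TRUE
all.equal(df, from_txt)      # TRUE
```
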
Exercise on Data Export
Export the data frame ozone in your workspace as a comma delimited file (.csv). For your convenience, the R command for data export is given below.

# Export as a comma separated file
write.csv(ozone, "ozone.csv",
          row.names=TRUE,
          quote=FALSE)

Graphical Systems in R
Learning Goal: Know about the 4 graphical systems available in R and how to access references and help for each system.

Graphical Systems in R
From most to least sophisticated:
• Grammar of Graphics (most sophisticated)
• Trellis Graphics
• Grid Graphics
• Base Graphics (least sophisticated)

Graphical Systems in R
Graphical System      R Package   Book Reference
Base Graphics         graphics    “Graphics for Statistics and Data Analysis with R”, by Kevin J. Keen (CRC Press, 2010)
Trellis Graphics      lattice     “Lattice: Multivariate Data Visualization with R”, by Deepayan Sarkar (Springer, 2008)
Grammar of Graphics   ggplot2     “ggplot2: Elegant Graphics for Data Analysis”, by Hadley Wickham (Springer-Verlag, 2009)
Grid Graphics         grid        “R Graphics”, 2nd Edition, by Paul Murrell (Chapman & Hall/CRC, 2006)

Note: The graphics and grid packages come with the default installation of R, and lattice is a recommended package included in standard R distributions. The ggplot2 package needs to be installed in R one time only. Apart from graphics, which is attached automatically, each package must then be required in each R session.
install.packages("ggplot2")
require("lattice")
require("ggplot2")
require("grid")

Example of Graph Produced in R
[Figure: graphics package. Daily Max Ozone (ppm) by Month, January to December; 2011 BATON ROUGE/CAPITOL]

Example of Graph Produced in R
[Figure: lattice package. Daily Max Ozone (ppm) by Month, January to December; 2011 BATON ROUGE/CAPITOL]

Example of Graph Produced in R
[Figure: ggplot2 package. Daily Max Ozone (ppm) by Month, January to December; 2011 BATON ROUGE/CAPITOL]

Getting Help on R Graphical Systems
To access the R help files associated with each of the four graphical systems, type the following commands in the R Console window:

help(package="graphics")
help(package="lattice")
help(package="ggplot2")
help(package="grid")

Getting Help on R Graphical Systems
To access the R help files associated with specific functions within a particular graphical system package, use commands of the form help(function_name, package="package_name"):

help(barplot, package="graphics")
help(bwplot, package="lattice")
help(qplot, package="ggplot2")
help(arrowsGrob, package="grid")

Basic R Graphics
Learning Goal: Learn how to produce basic graphs using the R graphics package.

Basic R Graphics
R offers a collection of functions for producing standard graphics that are useful when conducting exploratory data analysis.

These functions are available via the graphics package, which is pre-installed in R.

Basic R Graphics
Graph Type                     R Command
Histogram                      hist(x)
Density Plot                   plot(density(x))
Boxplot                        boxplot(x)
Cumulative Distribution Plot   plot.ecdf(x)

x = quantitative variable

Basic R Graphics
Graph Type   R Command
Bar chart    barplot(table(f))
Dot chart    dotchart(table(f))
Pie Chart    pie(table(f))

f = qualitative variable (i.e., factor)

Basic R Graphs
Graph Type               R Command
Scatter plot             plot(y ~ x)
Time series plot         plot(y ~ date)
Coplot                   coplot(y ~ x|z)
Line Plot                matplot(x, cbind(y,z))
Pairs plot               pairs(cbind(y,x,z))
Side-by-side boxplots    boxplot(y ~ f)
Side-by-side bar charts  barplot(table(f1, f2))

x, y, z = quantitative variables
f, f1, f2 = qualitative variables (i.e., factors)
date = time variable

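Several of the commands in these tables can be tried at once on a data set that ships with R. The sketch below assumes the built-in airquality data as a stand-in for the course data; the names x, y and f mirror the placeholders used in the tables.

```r
x <- airquality$Temp                        # quantitative variable
y <- airquality$Wind                        # second quantitative variable
f <- factor(airquality$Month,               # qualitative variable
            labels = c("May", "Jun", "Jul", "Aug", "Sep"))

par(mfrow = c(2, 3))                        # 2 x 3 grid of plots on one page
hist(x)
plot(density(x))
boxplot(x)
barplot(table(f))
plot(y ~ x)
boxplot(y ~ f)
par(mfrow = c(1, 1))                        # back to one plot per page
```
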
Histogram
[Figure: "Histogram" of air quality in Baton Rouge in 2011; x-axis: Daily Max Ozone (ppm), 0.00 to 0.14; y-axis: Frequency; subtitle: 2011 BATON ROUGE/CAPITOL]
?hist

R Code for Histogram
air <- read.csv("Air Quality Baton Rouge 2011.csv", as.is=TRUE)
View(air)
str(air)

require(lubridate)
air$Date <- mdy(air$Date)
str(air)

air <- air[1:365, ]
names(air)
attach(air)

hist(Ozone,
     xlab="Daily Max Ozone (ppm)",
     main="Histogram",
     col="lightblue",
     sub="2011 BATON ROUGE/CAPITOL",
     col.sub="red")

Density Plot
[Figure: "Kernel Density Plot"; x-axis: 0.00 to 0.15 (N = 365, Bandwidth = 0.004334); y-axis: Density]
?density

R Code for Density Plot
plot(density(Ozone),
     main="Kernel Density Plot")

Boxplot
[Figure: "Boxplot"; y-axis: Daily Max Ozone (ppm), 0.00 to 0.14; subtitle: 2011 BATON ROUGE/CAPITOL]
?boxplot

R Code for Boxplot
boxplot(Ozone,
        ylab="Daily Max Ozone (ppm)",
        main="Boxplot",
        col="lightblue",
        sub="2011 BATON ROUGE/CAPITOL",
        col.sub="red")

Side-by-Side Boxplots
[Figure: "Side-by-Side Boxplots"; x-axis: Month, January to December; y-axis: Daily Max Ozone (ppm), 0.00 to 0.14; subtitle: 2011 BATON ROUGE/CAPITOL]
?boxplot

R Code for Side-by-Side Boxplots
Month <- months(Date)
Month <- factor(Month, levels=unique(Month))

boxplot(Ozone ~ Month,
        xlab="Month",
        ylab="Daily Max Ozone (ppm)",
        main="Side-by-Side Boxplots",
        col="lightblue",
        sub="2011 BATON ROUGE/CAPITOL",
        col.sub="red")

Empirical CDF
plot.ecdf(Ozone,
          xlab="Daily Max Ozone (ppm)",
          main="Empirical Cumulative Distribution Function")

[Figure: "Empirical Cumulative Distribution Function"; x-axis: Daily Max Ozone (ppm), 0.00 to 0.15; y-axis: Fn(x), 0.0 to 1.0]
?plot.ecdf

Barchart
Percent Change in Hispanic Population from Previous Decade, United States

Decade   US
1990     53%
2000     58%
2010     43%

[Figure: bar chart titled "United States"; x-axis: Decade; y-axis: Percent Change in Hispanic Population from Previous Decade, 0 to 70; bars labelled 53%, 58%, 43%]
?barplot

Barchart
Decade <- c(1990, 2000, 2010)
Pct.Change.Hispanic.US <- c(53, 58, 43)

b <- barplot(Pct.Change.Hispanic.US,
             col=c("#599ad3"),
             xlab="Decade",
             ylab="Percent Change in Hispanic Population from Previous Decade",
             ylim=c(0,70),
             main="United States")

abline(h=0)

text(b, Pct.Change.Hispanic.US + 2,
     paste(Pct.Change.Hispanic.US,"%",sep=""))

Side-by-Side Barcharts
Percent Change in Hispanic Population from Previous Decade

Decade   US    New Orleans Metro
1990     53%   6%
2000     58%   9%
2010     43%   57%

[Figure: side-by-side bar chart for United States and New Orleans Metro; x-axis: Decade (1990, 2000, 2010); y-axis: Percent Change in Hispanic Population from Previous Decade, 0 to 70; bars labelled with their percentages]
?barplot

R Code for Side-by-Side Barcharts
Decade <- c(1990, 2000, 2010)
Pct.Change.Hispanic.US <- c(53, 58, 43)
Pct.Change.Hispanic.NewOrleansMetro <- c(6,9,57)

Pct.Change.Hispanic <- data.frame(Pct.Change.Hispanic.US,
                                  Pct.Change.Hispanic.NewOrleansMetro)
Pct.Change.Hispanic <- data.matrix(Pct.Change.Hispanic)
rownames(Pct.Change.Hispanic) <- Decade
Pct.Change.Hispanic

b <- barplot(t(Pct.Change.Hispanic),
             beside=TRUE,
             col=c("#599ad3", "#79c36a"),
             xlab="Decade",
             ylab="Percent Change in Hispanic Population from Previous Decade",
             ylim=c(0,70))

abline(h=0)

text(b[1,], Pct.Change.Hispanic.US + 2,
     paste(Pct.Change.Hispanic.US,"%",sep=""))
text(b[2,], Pct.Change.Hispanic.NewOrleansMetro + 2,
     paste(Pct.Change.Hispanic.NewOrleansMetro,"%",sep=""))

legend("topleft", c("United States","New Orleans Metro"),
       fill=c("#599ad3", "#79c36a"), bty="n")

Stacked Barcharts
Age Distribution in New Orleans Metro (Expressed as Counts)

Decade   Under 5 years   5 to 17 years   18 to 64 years   65 years and older
1980     105,801         285,440         774,773          116,291
1990     97,768          256,363         771,383          138,877
2000     90,471          261,362         815,010          149,667
2010     77,154          195,664         752,855          142,091

[Figure: horizontal stacked bar chart; x-axis: Proportion, 0.0 to 1.0; one bar per decade, with segments labelled by the proportions below]

Decade   Under 5 years   5 to 17 years   18 to 64 years   65 years and over
2010     0.07            0.17            0.64             0.12
2000     0.07            0.20            0.62             0.11
1990     0.08            0.20            0.61             0.11
1980     0.08            0.22            0.60             0.09

R Code for Stacked Barcharts
library(lattice)
library(plyr)

aged <- matrix(c(105801, 285440, 774773, 116291,
                 97768, 256363, 771383, 138877,
                 90471, 261362, 815010, 149667,
                 77154, 195664, 752855, 142091),
               nrow=4, ncol=4, byrow=TRUE)
aged

colnames(aged) <- c("Under 5 years", "5 to 17 years", "18 to 64 years",
                    "65 years and over")
rownames(aged) <- c("1980","1990","2000","2010")
aged

colors <- c(rgb(166,27,30,maxColorValue = 255),
            rgb(192,80,77,maxColorValue = 255),
            rgb(24,65,83,maxColorValue = 255),
            rgb(130,184,208,maxColorValue = 255))

colorset <- simpleTheme(col=colors, border="white")

sb <- barchart(prop.table(aged, margin=1), xlab="Proportion",
               par.settings=colorset,
               panel=function(...) {
                 panel.barchart(...)
                 tmp <- list(...)
                 tmp <- data.frame(x=tmp$x, y=tmp$y)
                 # calculate positions of text labels
                 df <- ddply(tmp, .(y),
                             function(x) {
                               data.frame(x, pos=cumsum(x$x)-x$x/2)
                             })
                 panel.text(x=df$pos, y=df$y,
                            label=sprintf("%.02f", df$x),
                            cex=0.7)
               },
               auto.key=list(columns=4, space="bottom",
                             cex=0.8, size=1.4, adj=1,
                             between=0.2, between.columns=0.1))

plot(sb)

Comparative Pie Charts
Share of Hispanic Population by Nationality in Three New Orleans Parishes in 2010 (Expressed as a Count)

Parish        Mexican   Puerto Rican   Cuban   Other
Jefferson     10,194    2,682          3,840   36,986
Orleans       4,298     948            1,285   11,520
St. Tammany   3,593     933            816     5,628

[Figure: three pie charts, one per parish, each with slices Mexican, Puerto Rican, Cuban and Other]
?pie

R Code for Comparative Piecharts
PopulationShare <- c(10194, 2682, 3840, 36986,
                     4298, 948, 1285, 11520,
                     3593, 933, 816, 5628)

PopulationShare <- matrix(PopulationShare, nrow=3, ncol=4, byrow=TRUE)
PopulationShare

rownames(PopulationShare) <- c("Jefferson","Orleans","St. Tammany")
colnames(PopulationShare) <- c("Mexican","Puerto Rican","Cuban","Other")
PopulationShare

layout(matrix(c(1,2,3),1,3, byrow=TRUE))

cols <- c("#599ad3", "#9e66ab", "#79c36a", "#f9a65a")

pie(PopulationShare["Jefferson",], init=90, clockwise=TRUE, col=cols,
    radius=1.2)
pie(PopulationShare["Orleans",], init=90, clockwise=TRUE, col=cols,
    radius=1.2)
pie(PopulationShare["St. Tammany",], init=90, clockwise=TRUE, col=cols,
    radius=1.2)

Line Charts
[Figure: line chart of Population Living in Poverty (0 to 150000) by Year (1979, 1989, 1999, 2009) for Orleans Parish and the Rest of the New Orleans Metro, with each point labelled by its value]
?matplot
?matpoints
?matlines

R Code for Line Charts
Year <- c(1979, 1989, 1999, 2009)
OrleansParish <- c(62114, 105687, 110179, 104349)
RestNewOrleansMetro <- c(143793, 152042, 130896, 82469)

matplot(Year, cbind(OrleansParish, RestNewOrleansMetro),
        type="l",
        ylab="Population Living in Poverty",
        ylim=c(0,160000),
        lty=1,
        lwd=2,
        col=c("darkgreen","orange"),
        axes=FALSE
        )

axis(1, at=c(1979, 1989, 1999, 2009), labels=c(1979, 1989, 1999, 2009))
axis(2, at=pretty(c(0,160000)))

R Code for Line Charts
# short vertical segments connecting each point to its label
segments(Year[1],OrleansParish[1]-10000, Year[1], OrleansParish[1])
segments(Year[2],OrleansParish[2]-10000, Year[2], OrleansParish[2])
segments(Year[3],OrleansParish[3]-10000, Year[3], OrleansParish[3])
segments(Year[4],OrleansParish[4]+10000, Year[4], OrleansParish[4])

# format(..., big.mark=",") builds the label from the full value, so leading
# zeros after the comma are kept (paste(152,042,sep=",") would give "152,42")
text(Year[1], OrleansParish[1]-15000, format(OrleansParish[1], big.mark=","), col="darkgreen")
text(Year[2], OrleansParish[2]-15000, format(OrleansParish[2], big.mark=","), col="darkgreen")
text(Year[3], OrleansParish[3]-15000, format(OrleansParish[3], big.mark=","), col="darkgreen")
text(Year[4], OrleansParish[4]+15000, format(OrleansParish[4], big.mark=","), col="darkgreen")

R Code for Line Charts
segments(Year[1],RestNewOrleansMetro[1]-10000, Year[1], RestNewOrleansMetro[1])
segments(Year[2],RestNewOrleansMetro[2]-10000, Year[2], RestNewOrleansMetro[2])
segments(Year[3],RestNewOrleansMetro[3]+10000, Year[3], RestNewOrleansMetro[3])
segments(Year[4],RestNewOrleansMetro[4]-10000, Year[4], RestNewOrleansMetro[4])

text(Year[1], RestNewOrleansMetro[1]-15000, format(RestNewOrleansMetro[1], big.mark=","), col="orange")
text(Year[2], RestNewOrleansMetro[2]-15000, format(RestNewOrleansMetro[2], big.mark=","), col="orange")
text(Year[3], RestNewOrleansMetro[3]+15000, format(RestNewOrleansMetro[3], big.mark=","), col="orange")
text(Year[4], RestNewOrleansMetro[4]-15000, format(RestNewOrleansMetro[4], big.mark=","), col="orange")

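When labelling values with thousands separators, it helps to know that assembling a label from two numeric pieces with paste() silently drops leading zeros in the second piece, whereas format() with big.mark works from the full value. The short sketch below illustrates the difference; the vector name counts is made up.

```r
# A leading zero is lost when a number is typed in two pieces
paste(152, 042, sep = ",")                 # "152,42"

# format() inserts the separators from the full value instead
format(152042, big.mark = ",")             # "152,042"

# trim = TRUE avoids padding the shorter labels with spaces
counts <- c(62114, 105687, 110179, 104349)
format(counts, big.mark = ",", trim = TRUE)
# "62,114" "105,687" "110,179" "104,349"
```
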
R Code for Line Charts
legend("bottomright",
       c("Orleans Parish","Rest of the New Orleans Metro"),
       col=c("darkgreen","orange"),
       lty=1,
       lwd=2,
       bty="n"
       )

box()

Time Series Plot
Used for plotting the values of a quantitative variable Y versus a time variable T.
e.g.: Y = Ozone
      T = Date
?plot

[Figure: three panels of Y versus T (Jan 01 to Jan 31), produced by plot(Y ~ T), plot(Y ~ T, type="l") and plot(Y ~ T, type="h")]

Time Series Plot (v.1)
plot(Ozone ~ Date,
     ylab="Daily Max Ozone (ppm)",
     main="Time Series Plot",
     sub="2011 BATON ROUGE/CAPITOL",
     col.sub="red")

[Figure: "Time Series Plot" drawn with points; x-axis: Date, Jan to Jan; y-axis: Daily Max Ozone (ppm), 0.00 to 0.14; subtitle: 2011 BATON ROUGE/CAPITOL]

Time Series Plot (v.2)
plot(Ozone ~ Date,
type="l",
ylab="Daily Max Ozone (ppm)",
main="Time Series Plot",
sub="2011 BATON ROUGE/CAPITOL",
col.sub="red")

[Figure: "Time Series Plot" drawn as a line; Daily Max Ozone (ppm) versus Date; subtitle "2011 BATON ROUGE/CAPITOL".]

Time Series Plot (v.3)
plot(Ozone ~ Date,
type="h",
ylab="Daily Max Ozone (ppm)",
main="Time Series Plot",
sub="2011 BATON ROUGE/CAPITOL",
col.sub="red")

[Figure: "Time Series Plot" drawn with vertical lines; Daily Max Ozone (ppm) versus Date; subtitle "2011 BATON ROUGE/CAPITOL".]

Time Series Plot (v.4)

bad <- ifelse(Ozone > 0.10, "red", "darkgrey")

plot(Ozone ~ Date,
type="h",
ylab="Daily Max Ozone (ppm)",
col=bad,
main="Time Series Plot",
sub="2011 BATON ROUGE/CAPITOL",
col.sub="blue")

abline(h=0.10, lty=2, col="red")

text(locator(1),"0.10")

[Figure: "Time Series Plot" with vertical lines, values above 0.10 ppm in red and a dashed red reference line at 0.10; Daily Max Ozone (ppm) versus Date.]

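Because locator(1) waits for a mouse click, it only works in an interactive session. The sketch below shows one possible way to place the same label at fixed coordinates in a script. The data are simulated stand-ins (an assumption), not the Baton Rouge measurements used in the course.

```r
# Simulated stand-in data (assumption): one year of daily ozone-like values
set.seed(1)
Date  <- seq(as.Date("2011-01-01"), as.Date("2011-12-31"), by = "day")
Ozone <- round(runif(length(Date), min = 0, max = 0.14), 3)

# Same colour rule as in the slide: red above 0.10 ppm, dark grey otherwise
bad <- ifelse(Ozone > 0.10, "red", "darkgrey")

plot(Ozone ~ Date, type = "h", col = bad,
     ylab = "Daily Max Ozone (ppm)")
abline(h = 0.10, lty = 2, col = "red")

# Label the reference line at fixed coordinates instead of clicking;
# pos = 3 draws the text just above the given point
text(x = Date[15], y = 0.10, labels = "0.10", pos = 3, col = "red")
```
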
Scatterplot (v.1)

plot(Ozone ~ Temp,
xlab="Temperature (°F)",
ylab="Daily Max Ozone (ppm)",
main="Scatterplot")

[Figure: "Scatterplot" of Daily Max Ozone (ppm) versus Temperature (°F).]

Scatterplot (v.2)
require(car)

scatterplot(Ozone ~ Temp,
xlab="Temperature (°F)",
ylab="Daily Max Ozone (ppm)",
smooth=FALSE,
reg.line=FALSE,
main="Scatterplot")

help(scatterplot, package="car")

[Figure: "Scatterplot" of Daily Max Ozone (ppm) versus Temperature (°F), produced with the car package.]

Coplot
coplot(Ozone ~ Temp | RelativeHumidity,
panel = panel.smooth)

?coplot

[Figure: conditioning plot of Ozone versus Temp, given RelativeHumidity, with a smooth curve in each panel.]

Pairs Plot
pairs(cbind(Ozone, Temp, WindSpeed))

[Figure: pairwise scatterplots of Ozone, Temp and WindSpeed.]

Exercise on Basic R Graphics
For this exercise, refer to the ozone data frame
available in your R working space and follow the
instructions below to create a variety of basic R
graphs using variables from this data frame.

1. Create a histogram of maxO3.
2. Create a density plot of maxO3.
3. Create a boxplot of maxO3.
4. Create a cumulative distribution plot of maxO3.
5. Create a scatter plot of maxO3 versus T12.
6. Create a time series plot of maxO3.
7. Create side-by-side boxplots of maxO3 for the
four wind directions (stored in the wind
variable).
8. Create a bar chart for the variable rain.
9. Create a bar chart for rain according to wind.

Customizing Basic R Graphics
Learning Goal:

Learn how to customize basic graphs using the R graphics package.

Adding a main title:
e.g.: hist(Ozone,
main="BATON ROUGE")

[Figure: histogram of Ozone (Frequency versus Ozone) with the main title "BATON ROUGE".]

Adding a subtitle:
e.g.: hist(Ozone,
sub="Year 2011")

[Figure: "Histogram of Ozone" with the subtitle "Year 2011" below the x-axis label.]

Adding x-axis and y-axis labels:
e.g.: hist(Ozone,
xlab="Ozone",
ylab="Frequency")

[Figure: "Histogram of Ozone" with x-axis label "Ozone" and y-axis label "Frequency".]

Adding a legend:
hist(Ozone, freq=FALSE, ylim=c(0,30))
lines(density(Ozone))
legend("topright", "Density Curve", lty=1,bty="n")

[Figure: "Histogram of Ozone" on the density scale with an overlaid density curve and a "Density Curve" legend in the top right corner.]

Adding text annotation:

hist(Ozone)
text(locator(1), "BATON ROUGE")
text(0.12, 20, "Year 2011")

Notes:

When using the text() function:
• locator(1) places text wherever we click on the current graph;
• text(x, y, "some text") places text at the graph location defined by the (x, y) coordinates.

[Figure: "Histogram of Ozone" annotated with the text "BATON ROUGE" and "Year 2011".]

Adding colors:

hist(Ozone, col="violet")

[Figure: "Histogram of Ozone" with violet bars.]

Note:
To see the list of 600+ colors available in R, type the following command in the
R Console window:

colors()

See also the help files for functions such as rainbow(), heat.colors(),
terrain.colors() and palette().

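As a small illustration of the note above, the following sketch queries the named colours and the palette functions it mentions. The search term "violet" and the use of random data for the histogram are choices made here for illustration only.

```r
# Number of named colours known to R
n_colours <- length(colors())
n_colours

# Named colours whose name contains "violet"
violets <- grep("violet", colors(), value = TRUE)
violets

# Palette generators return character vectors of hex colour codes
five_rainbow <- rainbow(5)
five_rainbow
heat.colors(3)

# Colours from a palette function can be passed to col= like any named colour
hist(rnorm(100), col = terrain.colors(8))
```
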
Adding graphical symbols:

par(mfrow=c(1,2))
plot(Ozone ~ Temp, pch=1, main="pch=1")
plot(Ozone ~ Temp, pch=19, main="pch=19")

[Figure: two scatterplots of Ozone versus Temp, side by side, titled "pch=1" and "pch=19".]

Note:
To see the graphical symbols available in R, use the command:
example(pch)

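For a quick reference chart, the plotting symbols numbered 0 to 25 can also be drawn by hand. The grid layout below (three rows of nine) is just one possible arrangement, and the names pch_values, x and y are hypothetical.

```r
# Draw plotting symbols 0 to 25 in a grid, each labelled with its pch number
pch_values <- 0:25
x <- (pch_values %% 9) + 1      # column position, 1 to 9
y <- 3 - (pch_values %/% 9)     # row position, 3 (top) down to 1

# bg only affects the fillable symbols (pch 21 to 25)
plot(x, y, pch = pch_values, cex = 2, bg = "lightblue",
     xlim = c(0.5, 9.5), ylim = c(0.5, 3.5),
     axes = FALSE, xlab = "", ylab = "", main = "pch = 0 to 25")
text(x, y - 0.3, labels = pch_values, cex = 0.8)
```
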
Controlling the size of graphical symbols:
par(mfrow=c(1,3))
plot(Ozone ~ Temp, cex=1, main="cex=1")
plot(Ozone ~ Temp, cex=0.5, main="cex=0.5")
plot(Ozone ~ Temp, cex=2, main="cex=2")

Options for cex:
cex = 1 (default size)
cex = 0.5 (half default)
cex = 2 (twice default)

[Figure: three scatterplots of Ozone versus Temp, side by side, titled "cex=1", "cex=0.5" and "cex=2".]

Adding lines:

hist(Ozone, freq=FALSE, ylim=c(0,30))
lines(density(Ozone))

Controlling the type and width of lines:

hist(Ozone, freq=FALSE, ylim=c(0,30))
lines(density(Ozone), lty=2, lwd=2)

[Figure: "Histogram of Ozone" on the density scale with a dashed density curve of double width.]

Options for line width:
lwd = 1 (default)
lwd = 0.5 (half default)
lwd = 2 (twice default)

Options for line type:
lty=1 (solid)
lty=2 (dashed)
lty=3 (dotted)
lty=4 (dotdash)
lty=5 (longdash)
lty=6 (twodash)

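The six line types listed above can be compared side by side with a short sketch such as the one below. The layout and the name line_types are illustrative choices, not part of the course material.

```r
# Empty plotting region, then one horizontal line per line type
plot(c(0, 1), c(0.5, 6.5), type = "n",
     xlab = "", ylab = "lty", yaxt = "n", main = "Line types 1 to 6")
axis(2, at = 1:6, las = 1)

line_types <- 1:6
for (i in line_types) {
  abline(h = i, lty = i, lwd = 2)
}
```
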
Exercise on Customizing Basic R Graphics
Create a scatter plot of maxO3 versus T12 using
the variables in the ozone data frame. Enhance
this scatter plot by adding the following
elements to it:

- main title and subtitle
- x-axis and y-axis labels
- some text annotation
- some color
- a particular type of graphical symbol (e.g., pch=8)

Advanced R Graphics
Learning Goal:

Learn how to produce advanced graphs using the R lattice package.

Recall that advanced R graphics can be produced
using any of the following R packages:
• grid (not covered in this course)
• lattice (covered in this course)
• ggplot2 (not covered in this course)

Note:
The lattice package can replicate most of the basic graphics. However, the lattice package is particularly helpful for visualizing data conditional on the values of one or more variables.

Lattice Functions
lattice function   graphics function   Description
histogram()        hist()              Histogram
densityplot()      plot(density())     Density Plot
bwplot()           boxplot()           Boxplot
stripplot()        stripchart()        Strip Plot
xyplot()           plot()              Scatter Plot
dotplot()          dotchart()          Dot Plot
barchart()         barplot()           Bar Chart
splom()            pairs()             Pairwise Scatterplot
cloud()            persp()             3-D Scatterplot

Lattice Formulas
The functions in the lattice package rely on a
formula framework. For instance:

histogram(~ Y)
histogram(~ Y | F)
histogram(~ Y | F1*F2)
xyplot(Y ~ X)
xyplot(Y ~ X | F)
xyplot(Y ~ X | F1*F2)

Symbol   Interpretation
~        as a function of
|        conditional on
*        crossed with

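To make the formula notation concrete, here is a minimal sketch on simulated data. The data frame dat and its columns Y, X and F1 are assumptions introduced for illustration; they mirror the placeholder names used in the formulas above.

```r
library(lattice)

# Simulated stand-in data (assumption): a response Y, a predictor X and a factor F1
set.seed(1)
dat <- data.frame(
  Y  = rnorm(120),
  X  = runif(120),
  F1 = factor(rep(c("a", "b", "c"), each = 40))
)

# One panel: distribution of Y
p1 <- histogram(~ Y, data = dat)

# One panel per level of F1: Y as a function of X, conditional on F1
p2 <- xyplot(Y ~ X | F1, data = dat, layout = c(3, 1))

# Lattice functions return "trellis" objects; printing one draws it
print(p1)
print(p2)
```

Supplying the variables through data= avoids relying on objects in the workspace, which is how the course examples access Ozone, Temp and Month.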
Histogram
require(lattice)

histogram(~Ozone,
xlab="Daily Max Ozone (ppm)",
type="count",
main="2011 BATON ROUGE/CAPITOL")

[Figure: "2011 BATON ROUGE/CAPITOL"; histogram of Count versus Daily Max Ozone (ppm).]

Note:

For the histogram function, we can also use type="density" to obtain a density
histogram or type="percent" to obtain a percent-of-total histogram.

Conditional Histograms
histogram(~Ozone | Month,
xlab="Daily Max Ozone (ppm)",
type="count",
main="2011 BATON ROUGE/CAPITOL")

[Figure: "2011 BATON ROUGE/CAPITOL"; histograms of Count versus Daily Max Ozone (ppm), one panel per month, with January in the bottom left panel.]

Density Plot

densityplot(~Ozone,
xlab="Daily Max Ozone (ppm)",
main="2011 BATON ROUGE/CAPITOL")

[Figure: "2011 BATON ROUGE/CAPITOL"; density plot of Daily Max Ozone (ppm).]

Conditional Density Plots
densityplot(~Ozone | Month,
xlab="Daily Max Ozone (ppm)",
main="2011 BATON ROUGE/CAPITOL",
as.table=TRUE)

[Figure: "2011 BATON ROUGE/CAPITOL"; density plots of Daily Max Ozone (ppm), one panel per month, with January in the top left panel.]

Boxplot
bwplot(~Ozone,
xlab="Daily Max Ozone (ppm)",
main="2011 BATON ROUGE/CAPITOL")

[Figure: "2011 BATON ROUGE/CAPITOL"; horizontal boxplot of Daily Max Ozone (ppm).]

Side-by-Side Boxplots
bwplot(Ozone ~ Month,
xlab="Month",
ylab="Daily Max Ozone (ppm)")

[Figure: side-by-side boxplots of Daily Max Ozone (ppm) by Month, January to December.]

Conditional Boxplots
bwplot(~Ozone | Month,
xlab="Daily Max Ozone (ppm)",
main="2011 BATON ROUGE/CAPITOL")

[Figure: "2011 BATON ROUGE/CAPITOL"; boxplots of Daily Max Ozone (ppm), one panel per month.]

Strip Plot
stripplot(~Ozone,
xlab="Daily Max Ozone (ppm)",
main="2011 BATON ROUGE/CAPITOL")

[Figure: "2011 BATON ROUGE/CAPITOL"; strip plot of Daily Max Ozone (ppm).]

Side-by-Side Strip Plots
stripplot(Ozone ~ Month,
xlab="Month",
ylab="Daily Max Ozone (ppm)",
main="2011 BATON ROUGE/CAPITOL",
jitter=FALSE)

[Figure: "2011 BATON ROUGE/CAPITOL"; strip plots of Daily Max Ozone (ppm) by Month, without jitter.]

stripplot(Ozone ~ Month,
xlab="Month",
ylab="Daily Max Ozone (ppm)",
main="2011 BATON ROUGE/CAPITOL",
jitter=TRUE)

[Figure: "2011 BATON ROUGE/CAPITOL"; strip plots of Daily Max Ozone (ppm) by Month, with jitter.]

Conditional Strip Plots
stripplot(~ Ozone | Month,
xlab="Daily Max Ozone (ppm)",
main="2011 BATON ROUGE/CAPITOL",
as.table=TRUE)

[Figure: "2011 BATON ROUGE/CAPITOL"; strip plots of Daily Max Ozone (ppm), one panel per month, with January in the top left panel.]

Time Series Plot
xyplot(Ozone ~ Date, type="l",
ylab="Daily Max Ozone (ppm)",
main="2011 BATON ROUGE/CAPITOL")

[Figure: "2011 BATON ROUGE/CAPITOL"; line plot of Daily Max Ozone (ppm) versus Date.]

Scatterplot
xyplot(Ozone ~ Temp,
xlab="Temperature (°F)",
ylab="Daily Max Ozone (ppm)",
main="2011 BATON ROUGE/CAPITOL")

[Figure: "2011 BATON ROUGE/CAPITOL"; scatterplot of Daily Max Ozone (ppm) versus Temperature (°F).]

Conditional Scatterplots
xyplot(Ozone ~ Temp | Month,
xlab="Temperature (°F)",
ylab="Daily Max Ozone (ppm)",
as.table=TRUE,
main="2011 BATON ROUGE/CAPITOL")

[Figure: "2011 BATON ROUGE/CAPITOL"; scatterplots of Daily Max Ozone (ppm) versus Temperature (°F), one panel per month, with January in the top left panel.]

Dot Plot (v.1)

require(plyr)

air$Month <- Month

View(air)

OzoneMonthlySummary <- ddply(air, "Month",
summarise,
Median = median(Ozone),
Q1 = quantile(Ozone, probs=0.25),
Q3 = quantile(Ozone, probs=0.75))

OzoneMonthlySummary

[Figure: dot plot of the median value of Daily Max Ozone (ppm) for each month, January to December.]

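The same monthly summary can be computed without the plyr package. The sketch below uses base R's aggregate(); the air data frame is a simulated three-month stand-in (an assumption), and summary_fun and agg are hypothetical helper names.

```r
# Simulated stand-in for the air data frame (assumption): three months of values
set.seed(1)
air <- data.frame(
  Month = factor(rep(month.name[1:3], each = 30), levels = month.name[1:3]),
  Ozone = round(runif(90, min = 0, max = 0.14), 3)
)

# Median and quartiles of a numeric vector, returned as a named vector
summary_fun <- function(v) {
  c(Median = median(v),
    Q1 = unname(quantile(v, probs = 0.25)),
    Q3 = unname(quantile(v, probs = 0.75)))
}

# aggregate() applies summary_fun to Ozone within each level of Month
agg <- aggregate(Ozone ~ Month, data = air, FUN = summary_fun)

# The three statistics come back as a matrix column; flatten it into plain columns
OzoneMonthlySummary <- data.frame(Month = agg$Month, agg$Ozone)
OzoneMonthlySummary
```

The result has the same column names (Month, Median, Q1, Q3) as the ddply() version, so it can be used with the dotplot() and barchart() calls in this section.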
Dot Plot (v.1)

dotplot(Month ~ Median, data=OzoneMonthlySummary,
aspect=1.0,
xlab="Median Value of Daily Max Ozone on a Given Month (ppm)",
scales=list(cex=1.0),
panel = function (x, y) {
panel.abline(h = as.numeric(y), col = "gray", lty = 2)
panel.xyplot(x, as.numeric(y), col = "blue", pch = 16)
}
)

Dot Plot (v.2)

[Figure: dot plot of the median value of Daily Max Ozone (ppm) for each month, with a horizontal segment through each dot.]

Note:
Reported ranges represent inter-quartile ranges.

dotplot(Month ~ Median, data = OzoneMonthlySummary,
aspect = 1,
xlim = c(0, 0.10),
xlab = "Median Value of Daily Max Ozone on a Given Month (ppm)",
panel = function (x, y) {
panel.xyplot(x, y, pch = 16, col = "red")
panel.segments(OzoneMonthlySummary$Q1, as.numeric(y),
OzoneMonthlySummary$Q3, as.numeric(y),
lty = 1, col = "black")
}
)

Bar Chart (v.1)
barchart(Median ~ Month, data = OzoneMonthlySummary,
xlab="Month",
ylab="Median Value of Daily Max Ozone on a Given Month (ppm)",
main="2011 BATON ROUGE/CAPITOL")

[Figure: "2011 BATON ROUGE/CAPITOL"; vertical bar chart of the monthly median of Daily Max Ozone (ppm), January to December.]

Bar Chart (v.2)
barchart(Median ~ Month, data = OzoneMonthlySummary,
xlab="Month",
ylab="Median Value of Daily Max Ozone on a Given Month (ppm)",
main="2011 BATON ROUGE/CAPITOL",
panel=function(x, y, ...) {
panel.barchart(x, y, ...)
ltext(x=x, y=y+0.001, labels=y)
}
)

[Figure: "2011 BATON ROUGE/CAPITOL"; vertical bar chart of the monthly median of Daily Max Ozone (ppm), with each bar labelled by its value.]

Bar Chart (v.3)
barchart(Month ~ Median, data = OzoneMonthlySummary,
xlab="Median Value of Daily Max Ozone on a Given Month (ppm)",
xlim=c(0,0.08),
ylab="Month",
main="2011 BATON ROUGE/CAPITOL",
panel=function(x, y, ...) {
panel.barchart(x, y, ...)
ltext(x=x+0.003, y=y, labels=x)
}
)

[Figure: "2011 BATON ROUGE/CAPITOL"; horizontal bar chart of the monthly median of Daily Max Ozone (ppm), one bar per month, each labelled with its value.]

Splom Plots
splom( ~ cbind(Ozone, Temp, RelativeHumidity))

[Figure: scatter plot matrix of Ozone, Temp and RelativeHumidity.]

Cloud Plot
cloud(Ozone ~ Temp*RelativeHumidity)

[Figure: 3-D cloud plot of Ozone against Temp and RelativeHumidity.]

Conditional Cloud Plot
RelHum <- RelativeHumidity

cloud(Ozone ~ Temp*RelHum | Month, as.table=TRUE, panel.aspect=0.8)

[Figure: 3-D cloud plots of Ozone against Temp and RelHum, one panel per month, with January in the top left panel.]

Enhancing Lattice Graphs
Learning Goal:

Learn how to enhance advanced graphs using the R lattice package.

Enhancing Lattice Graphs
Lattice graphs can be enhanced by:
• Adding basic features (e.g., titles, x-axis and
y-axis labels, colors)
• Using panel functions
• Using additional lattice graphics via the
latticeExtra package

Basic Enhancement of Lattice Graphs
histogram( ~ Ozone,
xlab = "Daily Max Ozone (ppm)",
main= "2011 BATON ROUGE/CAPITOL",
col= "lightblue")

[Figure: "2011 BATON ROUGE/CAPITOL"; light blue histogram of Percent of Total versus Daily Max Ozone (ppm).]

Panel Enhancements of Lattice Graphs
histogram( ~ Ozone,
xlab = "Ozone", type = "density",
panel = function(x, ...) {
panel.histogram(x, ...)
panel.mathdensity(dmath = dnorm,
col = "black",
args = list(mean=mean(x),sd=sd(x)))
}
)

[Figure: density histogram of Ozone with a fitted normal density curve overlaid.]

mypanel <- function(x,y){
panel.xyplot(x,y)
panel.lmline(x,y, col="red", lty=2)
panel.loess(x,y, col="blue")
}

xyplot(Ozone ~ Date, panel=mypanel)

[Figure: scatterplot of Ozone versus Date with a dashed red regression line and a blue loess curve.]

The latticeExtra Package
The latticeExtra package extends the lattice package and has its own
dedicated website:

http://latticeextra.r-forge.r-project.org/

The latticeExtra package can be installed in R with the command:

install.packages("latticeExtra")

Once installed in R, the latticeExtra package can be attached to the
current R session with the command:

require(latticeExtra)

latticeExtra: marginal.plot()
General Syntax:
marginal.plot( ~ Y)
marginal.plot( ~ Y, groups = G,
auto.key=list(lines=TRUE))

require(latticeExtra)

air$Month <- Month

air1 <- subset(air, Month=="January" |
Month=="February" |
Month=="March")

air2 <- air1[ ,c("Month","Ozone","Temp","RelativeHumidity","SolarRadiation")]

air2$Month <- factor(air2$Month, levels=c("January","February","March"))

marginal.plot(air2[ ,-1])

marginal.plot(air2[,-1], groups=air2$Month,
auto.key=list(lines=TRUE))

[Figure: marginal distributions of Ozone, Temp, RelativeHumidity and SolarRadiation, grouped by month (January, February, March).]

latticeExtra: xyplot()
General Syntax:
xyplot( Y~ X,
xlab="x-axis label",
ylab="y-axis label",
main="Main Title",
panel = function(...) {
panel.xyplot(...)
panel.smoother(..., span = 0.5)
}
)

xyplot(Ozone~Date,
xlab="Year 2011",
ylab="Daily Max Ozone (ppm)",
main="BATON ROUGE/CAPITOL",
panel = function(...) {
panel.xyplot(...)
panel.smoother(..., span = 0.9)
}
)
latticeExtra: ecdfplot()
require(latticeExtra)

ecdfplot(~Ozone,
xlab="Daily Max Ozone (ppm)",
main="Empirical Cumulative Distribution Plot")

[Figure: "Empirical Cumulative Distribution Plot"; Empirical CDF versus Daily Max Ozone (ppm).]

ecdfplot(~Ozone | Month,
xlab="Daily Max Ozone (ppm)",
main="Empirical Cumulative Distribution Plot")

[Figure: "Empirical Cumulative Distribution Plot"; Empirical CDF versus Daily Max Ozone (ppm), one panel per month.]

Exercise on Lattice Graphics
For this exercise, please refer to the variables in the
ozone data frame. Before starting the exercise,
create a month variable using the R commands
provided below.

month <- months(date)


month <- factor(month, levels=unique(month))
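
The second command matters because a plain factor() would sort the month names alphabetically. The sketch below illustrates the difference on a hypothetical date vector (an assumption; in the exercise, date comes from the ozone data frame).

```r
# Hypothetical dates (assumption): one observation every 20 days in 2011
date <- seq(as.Date("2011-01-01"), as.Date("2011-12-31"), by = "20 days")

# months() returns month names as character strings (in the current locale)
month <- months(date)

# factor() alone sorts the levels alphabetically;
# levels = unique(month) keeps them in order of first appearance,
# which is calendar order as long as the dates are sorted
alphabetical <- factor(month)
calendar     <- factor(month, levels = unique(month))

levels(alphabetical)
levels(calendar)
```

With calendar-ordered levels, lattice panels and axis categories follow the months of the year rather than the alphabet.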

Remember to attach the lattice package to your
current R session with the command:

require(lattice)
Exercise on Lattice Graphics
Use the functions in the lattice package to
create the following graphs.
1) Create a histogram of maxO3.
2) Create conditional histograms of maxO3 for
each month.
3) Create a density plot of maxO3.
4) Create conditional density plots of maxO3 for
each month.
5) Create a boxplot of maxO3.
6) Create side-by-side boxplots of maxO3 for each
month.
7) Create a scatter plot of maxO3 vs. T12 and
enhance it by adding a title and axes labels.
8) Create conditional scatter plots of maxO3 vs.
T12 given month.
9) Create a time series plot of maxO3.
10) Create a separate time series plot of maxO3
for each month.
11) Create a splom plot of the variables maxO3, T12
and Wx12.
12) Create a cloud plot visualizing the dependency
of maxO3 on T12 and Wx12.

R Graphics Housekeeping
Learning Goal:

Learn how to save the graphs you produce in R.

Exporting R Graphics
R Graphics can be exported in a variety of formats:

• metafile (.wmf)
• postscript (.ps)
• pdf (.pdf)
• png (.png)
• bmp (.bmp)
• TIFF (.tiff)
• JPEG (.jpeg)

Exporting R Graphics
Graphics can be exported from R in one of three
ways:

1. Using the R GUI menu: File → Save as.

2. Using Ctrl + C or Ctrl + W to copy the graph from the R Graphics window and Ctrl + V to insert the graph in a Word, Excel or PowerPoint document.

3. Using the R command line.

To save graphs from the R command line, use any of
the following R functions:

Export function    Help file
win.metafile()     ?win.metafile
postscript()       ?postscript
pdf()              ?pdf
png()              ?png
bmp()              ?bmp
tiff()             ?tiff
jpeg()             ?jpeg

Each of these functions will re-direct graphical output from the R graphics window to a file of the corresponding type (e.g., pdf). Using dev.off() after constructing the graph is required.

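The open, draw, close sequence described above can be sketched as follows. The temporary file name out_file and the chosen page size are illustrative assumptions.

```r
# Write to a temporary file so the sketch does not clutter the working directory
out_file <- tempfile(fileext = ".pdf")

pdf(out_file, width = 7, height = 5)   # open the file device (size in inches)
hist(rnorm(100), main = "Histogram saved to PDF")
dev.off()                              # close the device; this finalises the file

# Check that the file was written
file.exists(out_file)
file.size(out_file) > 0
```

If dev.off() is omitted, later plots keep going to the same file and the file may be incomplete when opened.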
R Graphics Housekeeping
To create a windows metafile from the R
command line, use:

win.metafile("graph.wmf")
hist(rnorm(100))
dev.off()

To create a postscript file from the R
command line, use:

postscript("graph.ps")
hist(rnorm(100))
dev.off()

PostScript files can be viewed with GPL Ghostscript
(http://www.cs.wisc.edu/~ghost/).

To create a pdf file from the R command line,
use:

pdf("graph.pdf")
hist(rnorm(100))
dev.off()
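
When the graph is a lattice graph created inside a function, a loop or a sourced script, the plot object has to be printed explicitly or the file will come out empty. The sketch below shows one way to do this; save_histogram() is a hypothetical helper written for illustration, not a function from any package.

```r
library(lattice)

out_file <- tempfile(fileext = ".pdf")

# Hypothetical helper: saves a lattice histogram of x to a PDF file
save_histogram <- function(x, file) {
  pdf(file)
  on.exit(dev.off())        # make sure the device is closed even on error
  p <- histogram(~ x, xlab = "x")
  print(p)                  # explicit print() is what actually draws the plot
  invisible(file)
}

save_histogram(rnorm(100), out_file)
file.size(out_file) > 0
```

At the top level of an interactive session the explicit print() is not needed, because R prints the returned trellis object automatically.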

To create a png file from the R command line,
use:

png("graph.png")
hist(rnorm(100))
dev.off()

To create a bmp file from the R command line,
use:

bmp("graph.bmp")
hist(rnorm(100))
dev.off()

To create a tiff file from the R command line,
use:

tiff("graph.tiff")
hist(rnorm(100))
dev.off()

To create a jpeg file from the R command line,
use:

jpeg("graph.jpeg")
hist(rnorm(100))
dev.off()

Exercise on R Graphics Housekeeping
With reference to the ozone data frame, create a
histogram of the variable maxO3 using either the
hist() function in the graphics package or the
histogram() function in the lattice package.

1) Save this histogram as a pdf file using the R GUI
menu commands File → Save as.

2) Save this histogram as a pdf file using the
command line approach facilitated by the pdf()
function.

Summary

• R provides 4 different graphical systems for producing
elegant, publication-quality graphics: graphics, grid,
lattice, ggplot2.
• In this course, we explored in more detail some of the
functionality available in the graphics and lattice packages.
The lattice package relies heavily on the grid package.
• Once you get comfortable with the graphics and lattice
packages, you can start exploring the ggplot2 package.
• The ggplot2 package purports to combine the best
features of the graphics and lattice packages,
but has a completely different, more abstract syntax.

References on Graphics in R
Books:
• “Graphics for Statistics and Data Analysis with
R”, by Kevin J. Keen (CRC Press, 2010)
• “Lattice: Multivariate Data Visualization with
R”, by Deepayan Sarkar (Springer, 2008)
• “ggplot2: Elegant Graphics for Data Analysis”,
by Hadley Wickham (Springer-Verlag, 2009)
• “R Graphics”, 2nd Edition, by Paul Murrell
(Chapman & Hall/CRC, 2011)

Websites:
Website Address                                        Website Description
http://www.r-project.org                               R Project
http://www.statmethods.net                             Quick-R
http://www.r-bloggers.com                              R Bloggers
http://lmdvr.r-forge.r-project.org                     Lattice Website
http://ggplot2.org                                     ggplot2 Website
http://www.stat.auckland.ac.nz/~paul/grid/grid.html    Grid Website
http://gallery.r-enthusiasts.com                       R Graph Gallery

Thank you
Thank you very much for attending this course.

If you have any questions related to the content of this course, please contact Dr. Isabella Ghement at the following address:
Dr. Isabella R. Ghement
Ghement Statistical Consulting Company Ltd.
301-7031 Blundell Road
Richmond, B.C.
Canada, V6Y 1J5
Tel: 604-767-1250
Fax: 604-270-3922
E-Mail: [email protected]