Graphics Lecture

Uploaded by

madiha yousaf

A grammar for graphics

A taxonomy for understanding data graphics:

Visual cues

Coordinate systems

Scale

The ggplot2 package can be used to create data graphics.

Other systems for two-dimensional data graphics in R include base graphics and the lattice system.

We use ggplot2 because it provides a grammar for describing and specifying graphics.
The grammar of ggplot2

Each kind of visual representation has its own function.

Geoms: these are the geometric objects that represent the data in the plot. Do you need bars, points, lines?

Geoms are added to a plot as layers. ggplot2 offers many different geoms; some common ones include:

geom_point() for scatter plots, dot plots, etc.

geom_boxplot() for boxplots

geom_line() for trend lines, time series, etc.

geom_col() for bar charts whose bar heights are values in the data
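
As a rough illustration of the geoms listed above, the sketch below draws the same kind of data with four different geoms. It assumes the built-in mtcars data set as a stand-in for the lecture's own data; the object names (base, p_point, avg_mpg, and so on) are chosen here for illustration only.

```r
library(ggplot2)

# mtcars ships with R; it stands in for the lecture's own data.
base <- ggplot(mtcars, aes(x = wt, y = mpg))

# Same data and mappings, different geometric objects.
p_point <- base + geom_point()
p_line  <- base + geom_line()

# geom_boxplot() needs a grouping variable on x.
p_box <- ggplot(mtcars, aes(x = factor(cyl), y = mpg)) +
  geom_boxplot()

# geom_col() draws bars whose heights come from a column of values,
# here the mean mpg for each number of cylinders.
avg_mpg <- aggregate(mpg ~ cyl, data = mtcars, FUN = mean)
p_col <- ggplot(avg_mpg, aes(x = factor(cyl), y = mpg)) +
  geom_col()
```

Printing any of these objects (for example p_point) draws the plot. Only the geom changes between p_point and p_line; the data and the mappings stay the same.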
The grammar of ggplot2
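# Stacked bar: a single empty category on x, filled by workshop
ggplot(mydata100, aes(x = factor(""), fill = workshop)) +
  geom_bar()

# The same stacked bar drawn in polar coordinates becomes a pie chart
ggplot(mydata100, aes(x = factor(""), fill = workshop)) +
  geom_bar() +
  coord_polar(theta = "y") +
  scale_x_discrete("")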
The grammar of ggplot2

Aesthetics (aes()): We typically understand aesthetics as how something looks: color, size, etc. In ggplot2, aesthetics do not refer to how something looks.

Instead, they are the roles that the variables play in each graph. A variable may control where points appear, the color or shape of a point, the height of a bar, and so on.

An aesthetic maps a variable to a visual cue.

aes(y = gdp, x = educ)

aes(label = country, color = net_users)

A fixed value, such as one color for every point, is set outside aes() and must be quoted:

ggplot(data, aes(x = distance, y = dep_delay)) +
  geom_point(color = "blue")
Aesthetics
g <- ggplot(data = CIACountries, aes(y = gdp, x = educ))
g + geom_point(size = 3)
g + geom_text(aes(label = country, color = net_users), size = 3)

g + geom_point(aes(color = net_users, size = roadways))
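
The difference between mapping a variable to an aesthetic and setting an aesthetic to a fixed value can be sketched as follows. This assumes the built-in mtcars data set rather than CIACountries, and the names p_set and p_map are illustrative.

```r
library(ggplot2)

# Setting: color is a constant given outside aes(), so every point is blue
# and no legend is drawn.
p_set <- ggplot(mtcars, aes(x = wt, y = mpg)) +
  geom_point(color = "blue")

# Mapping: color is given inside aes(), so it varies with the cyl variable
# and ggplot2 adds a legend for it.
p_map <- ggplot(mtcars, aes(x = wt, y = mpg)) +
  geom_point(aes(color = factor(cyl)))
```

Writing aes(color = "blue") by mistake would treat "blue" as a one-level variable rather than as a color name, which is a common source of confusion.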


The grammar of ggplot2
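Scales: a scale controls how data values are translated into visual values such as positions, colors, or shapes. The legend or axis it produces explains the translation, for example that circular symbols represent females while a different symbol, such as a triangle, represents males.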
ggplot(mydata100, aes(x = factor(""), fill = workshop)) +
  geom_bar() +
  coord_polar(theta = "y") +
  scale_x_discrete("")

Common scale functions include scale_x_continuous(), scale_y_continuous(), scale_x_discrete(), and the scale_color_*() family, for example scale_color_manual().
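
A minimal sketch of how scale functions adjust an axis and a color legend, assuming the built-in mtcars data set; the axis title, limits, and color choices are arbitrary values picked for illustration.

```r
library(ggplot2)

p_scaled <- ggplot(mtcars, aes(x = wt, y = mpg, color = factor(cyl))) +
  geom_point() +
  # Position scale: rename the x axis and fix its limits.
  scale_x_continuous(name = "Weight (1000 lbs)", limits = c(1, 6)) +
  # Color scale: choose the color for each level and title the legend.
  scale_color_manual(
    name   = "Cylinders",
    values = c("4" = "steelblue", "6" = "darkorange", "8" = "firebrick")
  )
```

Each scale function is named after the aesthetic it controls (x, y, color, fill, and so on) followed by the type of scale.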
The grammar of ggplot2
Guides: Context is provided by guides (more commonly called
legends).
A guide helps a human reader understand the meaning of the visual cues by
providing context.
For position visual cues, the most common sort of guide is the familiar
axis with its tick marks and labels.

Legends explain how a visual cue such as dot color corresponds to the values of a variable.

The geom_text() and geom_label() functions can also add text annotations to the plot to provide context.


The grammar of ggplot2
Using multiple aesthetics such as shape, color, and size to display multiple variables can produce a confusing, hard-to-read graph.

Facets: multiple side-by-side graphs, one for each level of a categorical variable, provide a simple and effective alternative.
facet_wrap(): creates a facet for each level of a single categorical variable.
facet_grid(): creates a facet for each combination of two categorical variables, arranging them in a grid.

g +
  geom_point(alpha = 0.9, aes(size = roadways)) +
  coord_trans(y = "log10") +
  facet_wrap(~net_users, nrow = 1) +
  theme(legend.position = "top")
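
The two faceting functions can be compared with a short sketch, assuming the built-in mtcars data set; cyl and am are treated as categorical variables here, and the object names are illustrative.

```r
library(ggplot2)

base <- ggplot(mtcars, aes(x = wt, y = mpg)) +
  geom_point()

# One panel per level of a single categorical variable (3 panels).
p_wrap <- base + facet_wrap(~ cyl, nrow = 1)

# One panel per combination of two categorical variables,
# with am in rows and cyl in columns (2 x 3 = 6 panels).
p_grid <- base + facet_grid(am ~ cyl)
```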
Canonical data graphics in R

Univariate displays: show how a single variable is distributed.

If the variable is numeric, its distribution is commonly summarized graphically using a histogram or density plot.

g <- ggplot(data = SAT_2010, aes(x = math))

g + geom_histogram(binwidth = 10) + labs(x = "Average math SAT score")
Canonical data graphics in R

g + geom_density(adjust = 0.3)

If the variable is categorical, a bar graph is used to display its distribution.

ggplot(
  data = head(SAT_2010, 10),
  aes(x = reorder(state, math), y = math)
) +
  geom_col() +
  labs(x = "State", y = "Average math SAT score")
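
For readers without the SAT_2010 data, the same univariate displays can be sketched with data sets that ship with R. The use of faithful and mtcars, and the binwidth and adjust values, are assumptions made for this illustration.

```r
library(ggplot2)

# Numeric variable: waiting time between eruptions in the faithful data.
g_wait <- ggplot(faithful, aes(x = waiting))

# Histogram: binwidth sets the width of each bar in data units.
p_hist <- g_wait + geom_histogram(binwidth = 5)

# Density plot: adjust rescales the default bandwidth, so values below 1
# give a wigglier curve and values above 1 a smoother one.
p_dens_wiggly <- g_wait + geom_density(adjust = 0.3)
p_dens_smooth <- g_wait + geom_density(adjust = 2)

# Categorical variable: geom_bar() counts the rows in each category.
p_bar <- ggplot(mtcars, aes(x = factor(cyl))) +
  geom_bar()
```

Note that geom_bar() counts rows itself, whereas geom_col() expects the bar heights to be supplied as a y variable.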
Canonical data graphics in R

Multivariate displays: the most effective way to convey the relationship between more than one variable.

The relationship between two quantitative variables is commonly shown using a scatter plot.

g <- ggplot(
  data = SAT_2010,
  aes(x = expenditure, y = math)
) +
  geom_point()

# SAT_rate is assumed to be a categorical column already present in SAT_2010
g + aes(color = SAT_rate)

g + facet_wrap(~ SAT_rate)
Canonical data graphics in R

Maps: used to display geographically distributed data.
Canonical data graphics in R
Networks: a network is a set of connections, called edges, between nodes, called vertices. A vertex represents an entity. The edges indicate pairwise relationships between those entities.
