Visualization Techniques
Visualization Tools
1. HISTOGRAM
2. DENSITY PLOT
3. BOX PLOT
4. BAR GRAPH
5. PIE CHART
6. LINE CHART
7. SCATTERPLOT
8. JOINT BAR GRAPH
A box plot helps answer the following questions:
• Is a factor significant?
• Does the location differ between subgroups?
• Does the variation differ between subgroups?
• Are there any outliers?
Boxplot: using ggplot
library(ggplot2)
# Age distribution by survival status; assumes a passenger-level
# data frame named titanic is already loaded
ggplot(titanic, aes(x = factor(Survived), y = Age, fill = factor(Survived))) +
  geom_boxplot() +
  labs(title = "Titanic Survivals by Age",
       x = "Survival (0 = No, 1 = Yes)",
       y = "Age") +
  scale_fill_manual(values = c("red", "pink"),
                    labels = c("Did not Survive", "Survived")) +
  theme_minimal()
Plotting Age against Survival (0 = No, 1 = Yes).
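The questions a box plot answers (location, variation, outliers) can also be checked numerically. The following is a minimal sketch in base R, assuming a small made-up data frame named toy in place of the Titanic data; the ages are hypothetical and chosen only so that one group contains an obvious outlier.

```r
# Hypothetical toy data standing in for the Titanic passengers
toy <- data.frame(
  Survived = rep(c(0, 1), each = 8),
  Age = c(10, 12, 13, 14, 15, 16, 18, 60,   # group 0
          20, 22, 24, 26, 28, 30, 32, 34)   # group 1
)

# Split ages by survival group
ages_by_group <- split(toy$Age, toy$Survived)

# Location: the median is the line drawn inside each box
medians <- sapply(ages_by_group, median)

# Variation: the interquartile range corresponds to the height of each box
spreads <- sapply(ages_by_group, IQR)

# Outliers: points beyond 1.5 times the box height from the box edges,
# the same rule boxplot() uses by default
outliers <- lapply(ages_by_group, function(a) boxplot.stats(a)$out)

print(medians)
print(spreads)
print(outliers)
```

Comparing medians across groups addresses location, comparing the interquartile ranges addresses variation, and a non-empty outliers entry corresponds to the individual points drawn beyond the whiskers.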
Bargraph: using ggplot
A stacked bar graph showing the count of passengers by class in the Titanic dataset, with "Passenger Class" on the x-axis and "Count" on the y-axis, and each bar filled by survival.
ggplot(titanic, aes(x = as.factor(Pclass), fill = as.factor(Survived))) +
  geom_bar(position = "stack") +
  labs(x = "Passenger Class", y = "Count", fill = "Survived") +
  theme_minimal() +
  scale_fill_manual(values = c("red", "green"))
Explanation
Pclass is on the x-axis, and the bars are filled based on Survived.
geom_bar(position = "stack"): This tells ggplot2 to stack the bars.
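To see what the stacked bars represent, it can help to compute the underlying counts directly. This is a minimal sketch in base R, assuming a small made-up data frame named toy with the same column names as the Titanic data; the values are hypothetical.

```r
# Hypothetical toy data with the same column names as the Titanic data
toy <- data.frame(
  Pclass   = c(1, 1, 2, 3, 3, 3),
  Survived = c(1, 0, 1, 0, 0, 1)
)

# Cross-tabulate survival by class: each cell is the height of one
# colored segment in the stacked bar for that class
counts <- table(Survived = toy$Survived, Pclass = toy$Pclass)
print(counts)

# Column totals give the full height of each stacked bar
bar_heights <- colSums(counts)
print(bar_heights)
```

ggplot2 also accepts position = "dodge", which places the segments side by side, and position = "fill", which rescales each bar to show proportions instead of counts.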
Pie Chart: using simple method
• A pie chart is used to see the proportion of each category of a particular categorical variable.
• It produces a pie chart showing the distribution of the different classes of passengers (e.g., 1st class, 2nd class, 3rd class) in the dataset, with each group represented in a different color.
type_counts <- table(titanic$Pclass)
Explanation
1. type_counts <- table(titanic$Pclass): This line creates a frequency table of the groups.
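The call that draws the chart does not appear above. The following is a minimal sketch, assuming base R's pie() is the intended "simple method" and using a small made-up vector named pclass in place of titanic$Pclass; the labels and colors are illustrative choices, not taken from the original.

```r
# Hypothetical toy vector standing in for titanic$Pclass
pclass <- c(1, 1, 2, 3, 3, 3, 3, 2)

# Frequency table of the classes
type_counts <- table(pclass)

# Share of each class, used to label the slices
proportions <- prop.table(type_counts)
slice_labels <- paste0("Class ", names(type_counts),
                       " (", round(100 * proportions), "%)")

# Draw the pie chart, one color per class
pie(type_counts,
    labels = slice_labels,
    col = c("gold", "skyblue", "salmon"),
    main = "Passengers by class (toy data)")
```

With the real data, replacing pclass by titanic$Pclass should give one slice per passenger class.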
Line Chart: using simple method
Step 1:
plot(airmiles, type = "o", col = "blue", lwd = 2,
     xlab = "Year", ylab = "Passenger Miles (millions)",
     main = "Airline Passenger Miles (1937-1960)")
Explanation
1. type = "o": Plot both points and lines, with points overlaid on the lines.
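The plot() call needs no explicit x variable because airmiles, a dataset that ships with R, is a time series object that carries its own year index. A short sketch that inspects this structure:

```r
# airmiles is a built-in time series (class "ts")
print(class(airmiles))

# First and last year covered by the series
first_year <- start(airmiles)[1]
last_year  <- end(airmiles)[1]
print(c(first_year, last_year))

# Number of yearly observations
n_years <- length(airmiles)
print(n_years)
```

Because the object is a time series, plot() places the years on the x-axis automatically.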
Line Chart: using ggplot
Step 1:
airmiles1 <- data.frame(Year = 1937:1960,
                        Miles = as.numeric(airmiles))
Step 2:
ggplot(airmiles1, aes(x = Year, y = Miles)) +
  geom_line(color = "blue", linewidth = 1.2) +
  geom_point(color = "blue", size = 2) +
  labs(x = "Year", y = "Passenger Miles (millions)",
       title = "Airline Passenger Miles (1937-1960)") +
  theme_minimal()
Explanation
Step 1 converts 'airmiles' to a dataframe, because ggplot expects a data frame rather than a time series.
geom_line() adds a line connecting the data points.
geom_point() adds the data points themselves; size = 2 increases the size of the points.
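The conversion step hard-codes the years 1937:1960. As an alternative sketch, the years can be read from the series itself with time(), which avoids a mismatch if the series is ever swapped for another one; the name airmiles_df is an illustrative choice.

```r
# Build the data frame from the time series' own index
airmiles_df <- data.frame(
  Year  = as.numeric(time(airmiles)),  # years stored in the ts object
  Miles = as.numeric(airmiles)         # drop the ts attributes
)

# Quick check of the result
print(head(airmiles_df, 3))
print(nrow(airmiles_df))
```

The resulting data frame has the same two columns as airmiles1 and can be passed to ggplot() in the same way.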
Scatterplot: using simple method
A scatterplot is used to show the relationship between two variables.
You can easily see that there is a negative correlation between MPG and weight.
Dataset used: mtcars
plot(mtcars$mpg, mtcars$wt,
     main = "Scatter plot of MPG vs Weight",
     xlab = "Miles per Gallon (MPG)",
     ylab = "Weight (wt)",
     pch = 19, col = "blue")
Explanation
The plot function automatically creates a scatter plot by pairing each value of mpg with its corresponding value of wt.
pch stands for "plot character" and the number 19 specifies that the points should be filled circles.
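The negative relationship visible in the scatter plot can be quantified with the correlation coefficient. A minimal sketch using base R's cor() on the same mtcars columns:

```r
# Pearson correlation between fuel economy and weight
r <- cor(mtcars$mpg, mtcars$wt)

# A value close to -1 indicates a strong negative linear relationship:
# heavier cars tend to have lower MPG
print(round(r, 2))
```

A coefficient near zero would indicate no linear relationship, and a positive value would mean the two variables rise together.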
Scatter plot: using ggplot
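No ggplot code for the scatter plot appears here. The following is a sketch of what an equivalent ggplot2 version might look like, assuming the same mtcars columns, title, and axis labels as the base R version; the point size and theme are illustrative choices.

```r
library(ggplot2)

# Scatter plot of MPG against weight using the built-in mtcars data
p <- ggplot(mtcars, aes(x = mpg, y = wt)) +
  geom_point(color = "blue", size = 2) +   # one filled point per car
  labs(title = "Scatter plot of MPG vs Weight",
       x = "Miles per Gallon (MPG)",
       y = "Weight (wt)") +
  theme_minimal()

print(p)
```

geom_point() draws one point per row of the data, so no conversion step is needed because mtcars is already a data frame.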