22MSM40206 Data Visualisation
SUBMITTED TO
CHANDIGARH UNIVERSITY, MOHALI PUNJAB
IN PARTIAL FULFILMENT OF THE
REQUIREMENTS FOR THE DEGREE OF
M.SC DATA SCIENCE
install.packages("dplyr")   # one-time installation
library(dplyr)
summary(mtcars)             # data set used
Output: summary statistics for each variable.
mtcars is a data frame with 32 observations on 11 (numeric) variables.
• Aesthetic layer: here we map variables of the dataset to aesthetics (visual attributes such as position, colour, size, and shape).
• Geometric layer: it controls the essential elements, that is, how our data are displayed, using geometries such as point, line, histogram, bar, and boxplot.
col = disp: col (a parameter) means colour, and disp (a variable) represents the engine displacement of a car. This
variable is commonly used in datasets related to cars or the automotive industry.
col = disp maps the values of the "disp" variable to the colour of the points in the plot. This means that points with
different "disp" values are shown in different colours, making it easier to distinguish visually between low and high
values of the "disp" variable. Because disp is continuous, the colours form a gradient rather than discrete groups.
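The mapping described above can be sketched as follows. This is a minimal illustration, not the original plot: the choice of hp and mpg for the x and y axes is an assumption, and only col = disp comes from the text.

```r
library(ggplot2)

# Aesthetic layer: map hp to x, mpg to y and disp to colour.
# hp and mpg are assumed here purely for illustration.
p_aes <- ggplot(mtcars, aes(x = hp, y = mpg, col = disp)) +
  geom_point()   # geometric layer: one point per car

print(p_aes)

# Inspect the computed layer: each row carries the colour derived from disp.
aes_data <- layer_data(p_aes)
head(aes_data[, c("x", "y", "colour")])
```

Writing col = disp inside aes() maps a variable to colour. Writing col = "red" outside aes() would instead set one fixed colour for every point.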
• Geometric Layer: adding size, colour, and shape, and then plotting a histogram.
In mtcars, cyl is the variable that represents the number of cylinders in the engine of a car.
The $ operator is used to extract a single variable (column) from a data frame, for example mtcars$cyl.
In mtcars, am is the variable that represents the type of transmission in a car (0 = automatic, 1 = manual).
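A short sketch of these ideas, assuming hp and mpg as the plotted variables (they are not named in the text). It extracts cyl with $ and then maps cyl, am, and disp to colour, shape, and size.

```r
library(ggplot2)

# `$` extracts one column of the data frame as a plain vector.
cyl_values <- mtcars$cyl
print(table(cyl_values))   # number of cars per cylinder count

# cyl and am are stored as numbers; factor() makes ggplot2 treat them
# as categories, which the shape aesthetic requires.
p_points <- ggplot(mtcars, aes(x = hp, y = mpg,
                               col = factor(cyl),
                               shape = factor(am),
                               size = disp)) +
  geom_point()

print(p_points)
points_data <- layer_data(p_points)
```

Here colour separates the cylinder groups, shape separates the two transmission types, and point size grows with engine displacement.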
• Q: Why is the y-axis not specified for a histogram in R?
• Ans: In a histogram plot, the y-axis represents the frequency or count of observations that fall within each bin or interval on the x-axis.
These counts are computed from the x variable, so only x has to be supplied. The y-axis is therefore often not labelled or is simply
labelled "Frequency" or "Count".
• The primary focus of a histogram is to display the distribution of a single variable.
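As an illustration of the answer above, the sketch below maps only x and lets ggplot2 compute the counts. The variable mpg and the value bins = 10 are arbitrary choices made for this example.

```r
library(ggplot2)

# Only x is mapped; geom_histogram() bins mpg and computes the counts itself.
p_hist <- ggplot(mtcars, aes(x = mpg)) +
  geom_histogram(bins = 10)   # bins = 10 is an arbitrary choice

print(p_hist)

# The computed y values are stored in the `count` column of the layer data.
hist_data <- layer_data(p_hist)
total_count <- sum(hist_data$count)
print(total_count)   # every one of the 32 cars falls into exactly one bin
```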
• Facet Layer: in R, this allows you to create multiple plots arranged in a grid, where each plot represents a subset of your data based on
a categorical variable.
• To create a facet plot, the facet_grid() or facet_wrap() functions are used.
• The facet_grid() function creates a grid of plots with one variable along the rows and another variable along the columns.
• The facet_wrap() function creates a series of plots, each representing a level of a single variable.
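The two faceting functions can be compared with a small sketch. The scatter plot of hp against mpg and the use of am and cyl as faceting variables are assumptions for illustration.

```r
library(ggplot2)

p_base <- ggplot(mtcars, aes(x = hp, y = mpg)) +
  geom_point()

# facet_grid(): am along the rows, cyl along the columns (2 x 3 panels).
p_grid <- p_base + facet_grid(am ~ cyl)

# facet_wrap(): one panel per level of a single variable (3 panels for cyl).
p_wrap <- p_base + facet_wrap(~ cyl)

print(p_grid)
print(p_wrap)

# Count the panels that actually contain data.
n_grid_panels <- length(unique(layer_data(p_grid)$PANEL))
n_wrap_panels <- length(unique(layer_data(p_wrap)$PANEL))
```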
• Statistic Layer: we transform our data using binning, smoothing, descriptive, and intermediate statistics.
• You can use geom_point() to create a scatter plot, and then use stat_smooth() to add a smooth curve to the plot.
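A minimal sketch of this combination, assuming hp and mpg as the plotted variables. The method is set to "loess" explicitly, which is an assumption here since the text does not name a smoothing method.

```r
library(ggplot2)

p_smooth <- ggplot(mtcars, aes(x = hp, y = mpg)) +
  geom_point() +                                   # layer 1: raw data
  stat_smooth(method = "loess", formula = y ~ x)   # layer 2: smoothed curve

print(p_smooth)

# Layer 2 holds the transformed data: fitted values and a confidence band.
smooth_data <- layer_data(p_smooth, 2)
head(smooth_data[, c("x", "y", "ymin", "ymax")])
```

The points show the observations themselves, while the statistic layer draws values that were computed from them.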
Coordinate Layer: it controls the coordinate system in which the data points are displayed. It is often used in scatter
plots and other types of plots where the x and y axes represent numeric values.
The functions we are going to use are coord_*().
1. coord_flip(): flips the x and y axes, useful for horizontal plots or bar charts.
2. coord_polar(): useful for creating pie charts or radial plots.
3. coord_sf(): a coordinate system for spatial data handled with the simple features (sf) package, useful
for visualising maps.
4. coord_cartesian(): sets the limits of the x and y axes, zooming in or out on the plot without
discarding the data that fall outside the limits.
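Three of the functions listed above can be sketched on mtcars. The plotted variables and axis limits are assumptions for illustration, and coord_sf() is left out because it needs spatial data and the sf package.

```r
library(ggplot2)

# coord_flip(): a bar chart of cylinder counts drawn horizontally.
p_flip <- ggplot(mtcars, aes(x = factor(cyl))) +
  geom_bar() +
  coord_flip()

# coord_polar(): a single stacked bar wrapped around a circle gives a pie chart.
p_polar <- ggplot(mtcars, aes(x = "", fill = factor(cyl))) +
  geom_bar(width = 1) +
  coord_polar(theta = "y")

# coord_cartesian(): zoom in on a region; the limits are arbitrary here.
p_zoom <- ggplot(mtcars, aes(x = hp, y = mpg)) +
  geom_point() +
  coord_cartesian(xlim = c(50, 200), ylim = c(15, 35))

print(p_flip)
print(p_polar)
print(p_zoom)

# Zooming does not drop rows: all 32 points remain in the layer data.
zoom_rows <- nrow(layer_data(p_zoom))
```

Setting limits with xlim() or ylim() on the scales would instead remove the points outside the range before any statistics are computed, which is why coord_cartesian() is preferred for zooming.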
The geom_smooth() function adds a layer to the plot with a fitted
regression line.
The method = "lm" argument specifies that a linear regression
line should be used.
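A minimal sketch of a fitted regression line, assuming mpg is regressed on hp (the text does not name the variables). The separate lm() call is included only to show that the drawn line matches an ordinary linear model.

```r
library(ggplot2)

p_lm <- ggplot(mtcars, aes(x = hp, y = mpg)) +
  geom_point() +
  geom_smooth(method = "lm", formula = y ~ x)   # straight regression line

print(p_lm)

# The same model fitted directly, for comparison with the plotted line.
fit <- lm(mpg ~ hp, data = mtcars)
print(coef(fit))

# Values drawn by geom_smooth() versus predictions from lm().
lm_data <- layer_data(p_lm, 2)
lm_pred <- unname(predict(fit, newdata = data.frame(hp = lm_data$x)))
```

The grey band around the line is the confidence interval, which can be switched off with se = FALSE.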