Data Visualization Exp. 2

Experiment No. 2
Aim: Create bar charts using Matplotlib or Seaborn to compare categorical data or
display frequency distributions with data labels.
Introduction:
What Are Bar Plots?
A bar chart (or bar plot) is a graphical representation of data in which each
category is drawn as a rectangular bar. The length or height of each bar is
proportional to the value it represents. One axis lists the categories of a column
in the dataset, and the other axis shows the values or counts associated with
them. Bar charts can be plotted vertically or horizontally; a vertical bar chart is
often called a column chart.
Barplot vs Histograms: Which Plot to Use?
Histograms are used to represent the distribution of numerical data, most often
continuous data. The data is divided into bins (intervals), and the height of each
bar shows the number of data points that fall within that bin, giving us a sense of
the distribution of the data. Some common uses of histograms include:

1. Exploring the distribution of a single variable: A histogram can help you
understand the distribution of a single variable, such as height, weight, or
income.
2. Comparing two or more groups: By plotting histograms for different groups,
you can compare the distribution of a variable across two or more groups.
This can help you identify differences and similarities between groups.
3. Detecting outliers: Histograms can help you detect outliers or data points
significantly different from the rest of the data.
4. Estimating probability density: By normalizing a histogram, you can
estimate the probability density function; the area under it over a given
interval gives the probability of observing a data point within that interval.
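The normalization mentioned in item 4 can be sketched as follows. This is a minimal illustration, not part of the original experiment: the height values are synthetic, generated with NumPy, and the variable names are assumptions chosen for this example. With density=True, Matplotlib rescales the bars so that their total area equals 1.

```python
import numpy as np
import matplotlib.pyplot as plt

# Synthetic, illustrative data: 1,000 simulated heights in cm (not from the experiment).
rng = np.random.default_rng(seed=0)
heights = rng.normal(loc=170, scale=10, size=1000)

fig, ax = plt.subplots()
# density=True rescales the bar heights so that the total bar area equals 1.
densities, bin_edges, _ = ax.hist(heights, bins=20, density=True, edgecolor="black")
ax.set_xlabel("Height (cm)")
ax.set_ylabel("Estimated probability density")
ax.set_title("Normalized histogram of a continuous variable")

# Area under the histogram: sum of (bar height * bin width) over all bins.
total_area = float(np.sum(densities * np.diff(bin_edges)))

if __name__ == "__main__":
    print(f"Total area under the histogram: {total_area:.3f}")
    plt.show()
```

Because the bins touch each other and cover a continuous range, the bars here have no gaps, which is the usual visual difference from a bar plot.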

A bar plot, on the other hand, is used to represent categorical data, which is data
that can be divided into distinct categories. In a bar plot, each category is
represented by a separate bar, and the height of the bar represents the frequency or
count of data points in that category.
Some of the most common types of bar plots in Python are:
1. Simple bar plot: A bar plot representing a single data set, where each bar
represents a single category.
2. Grouped bar plot: A bar plot representing multiple sets of data, where the bars
of the different data sets are placed side by side within each category.
3. Stacked bar plot: A bar plot representing multiple sets of data, where the height
of each bar is the sum of the values for each data set.
4. Horizontal bar plot: A bar plot rotated 90 degrees, where the categories lie
along the vertical axis and the values along the horizontal axis.
5. Error bar plot: A bar plot that includes error bars representing the data’s
uncertainty.
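The grouped, stacked, horizontal, and error bar variants listed above can be sketched in one figure. This is a rough illustration under stated assumptions: the genres, sales figures, and error values are made up for this example, and the variable names are hypothetical.

```python
import numpy as np
import matplotlib.pyplot as plt

# Illustrative values only: units sold per genre in two hypothetical years.
genres = ["Fiction", "Science", "History"]
sales_2022 = np.array([120, 80, 60])
sales_2023 = np.array([150, 70, 90])
errors_2023 = np.array([10, 8, 12])  # made-up uncertainty for the error bars

x = np.arange(len(genres))  # one numeric position per category
width = 0.4

fig, axes = plt.subplots(2, 2, figsize=(10, 8))

# Grouped: shift each data set to the left or right of the category position.
ax = axes[0, 0]
ax.bar(x - width / 2, sales_2022, width, label="2022")
ax.bar(x + width / 2, sales_2023, width, label="2023")
ax.set_xticks(x)
ax.set_xticklabels(genres)
ax.set_title("Grouped")
ax.legend()

# Stacked: draw the second data set on top of the first via `bottom`.
ax = axes[0, 1]
ax.bar(genres, sales_2022, label="2022")
stacked = ax.bar(genres, sales_2023, bottom=sales_2022, label="2023")
ax.set_title("Stacked")
ax.legend()

# Horizontal: barh puts the categories on the vertical axis.
ax = axes[1, 0]
ax.barh(genres, sales_2023)
ax.set_title("Horizontal")

# Error bars: yerr draws a plus/minus interval on top of each bar.
ax = axes[1, 1]
ax.bar(genres, sales_2023, yerr=errors_2023, capsize=4)
ax.set_title("With error bars")

fig.tight_layout()

# The top edge of each stacked bar equals the sum of both data sets.
stack_tops = [float(bar.get_y() + bar.get_height()) for bar in stacked]

if __name__ == "__main__":
    print(stack_tops)
    plt.show()
```

In the grouped panel the bars are positioned numerically and the category names are attached afterwards as tick labels, which is what allows the side-by-side offset.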
So, the choice between a histogram and a bar plot depends on the type of data
you are working with. If you have continuous data, a histogram is an
appropriate choice. If you have categorical data, a bar plot is an appropriate
choice. Additionally, if you have ordinal data, which is data that can be ordered,
such as star ratings or levels of education, you may choose to use a bar plot.
Common Use Cases for Bar Plots
Here are some common use cases for bar plots:

1. Comparing frequencies or counts: Bar plots can be used to compare the
frequency or count of data points in different categories. For example, you
could use a bar plot to compare the number of books sold in different
genres.
2. Displaying proportions: Bar plots can be used to display proportions, such
as the percentage of respondents who selected each option in a survey.

3. Visualizing changes over time: Bar plots can be used to display changes in
categorical data over time, such as the number of sales in different months
or the number of new customers in different years.
4. Comparing multiple variables: By grouping bar plots, you can compare
multiple variables simultaneously. For example, you could compare the
number of books sold by different authors and by different genres.
5. Displaying nominal data: Bar plots are a common way to display nominal
data, which is data that has no inherent order or structure, such as hair color
or preferred drink.
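The first two use cases, counts and proportions, can be sketched from raw nominal data. The survey responses below are hypothetical and exist only for this illustration; collections.Counter is one of several ways to tally them.

```python
from collections import Counter

import matplotlib.pyplot as plt

# Hypothetical survey responses (nominal data); the values are made up.
responses = [
    "Tea", "Coffee", "Coffee", "Juice", "Tea",
    "Coffee", "Water", "Tea", "Coffee", "Juice",
]

counts = Counter(responses)
# Nominal data has no inherent order, so alphabetical order is an arbitrary choice.
categories = sorted(counts)
frequencies = [counts[c] for c in categories]
total = sum(frequencies)
percentages = [100 * f / total for f in frequencies]

fig, (ax_count, ax_pct) = plt.subplots(1, 2, figsize=(9, 4))

# Left panel: raw counts per category.
ax_count.bar(categories, frequencies)
ax_count.set_ylabel("Number of responses")
ax_count.set_title("Frequencies")

# Right panel: the same data as a share of all responses.
ax_pct.bar(categories, percentages)
ax_pct.set_ylabel("Share of responses (%)")
ax_pct.set_title("Proportions")

fig.tight_layout()

if __name__ == "__main__":
    plt.show()
```

Both panels have the same shape; only the scale of the value axis changes, which is why proportions are often preferred when groups of different sizes are compared.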
Creating a Python Bar Plot Using Matplotlib
The Python Matplotlib module provides various functions for plotting data and understanding
the distribution of the data values.
The matplotlib.pyplot.bar() function is used to create a bar plot.
Syntax:
matplotlib.pyplot.bar(x, height, width=0.8, bottom=None, align='center')
• x: The x-coordinates of the bars (category labels or numbers)
• height: The height of the bars to be plotted
• width: The width of the bars to be plotted (optional, default 0.8)
• bottom: The vertical baseline, i.e. the y-coordinate at which the bars start (optional, default 0)
• align: The alignment of the bars relative to the x-coordinates, 'center' or 'edge' (optional)
• Bar plots are intended for categorical data, so x usually holds category labels,
although numeric positions are also accepted.
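The effect of the width, bottom, and align parameters can be checked with a small sketch. The positions and heights are arbitrary numbers chosen for this illustration; the code reads back the rectangle coordinates that Matplotlib computes for each bar.

```python
import matplotlib.pyplot as plt

# Arbitrary numeric positions and heights, used only to show the parameters.
x = [0, 1, 2]
height = [5, 3, 4]

fig, (ax_center, ax_edge) = plt.subplots(1, 2, sharey=True)

# align="center" (the default): each bar is centered on its x-coordinate.
centered = ax_center.bar(x, height, width=0.5, align="center")
ax_center.set_title("align='center'")

# align="edge": the left edge of each bar sits on its x-coordinate.
# bottom=1 raises the baseline, so each bar spans from 1 to 1 + height.
edged = ax_edge.bar(x, height, width=0.5, bottom=1, align="edge")
ax_edge.set_title("align='edge', bottom=1")

# Read back the rectangle geometry computed by Matplotlib.
center_lefts = [float(bar.get_x()) for bar in centered]  # x - width / 2
edge_lefts = [float(bar.get_x()) for bar in edged]       # x
edge_bottoms = [float(bar.get_y()) for bar in edged]     # the baseline

if __name__ == "__main__":
    print(center_lefts, edge_lefts, edge_bottoms)
    plt.show()
```

The bottom parameter is also what makes stacked bar plots possible: the heights of one data set serve as the baseline for the next.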
Example:
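import matplotlib.pyplot as plt

# Categories for the x-axis and the value (bar height) of each category
country = ['INDIA', 'JAPAN', 'CHINA', 'USA', 'GERMANY']
population = [1000, 800, 600, 400, 1100]

plt.bar(country, population)  # draw one vertical bar per country
plt.show()  # display the figure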
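The aim of the experiment also asks for data labels, which the example above does not draw. One possible way to add them is Axes.bar_label, sketched below on the same values. This assumes Matplotlib 3.4 or newer, where bar_label is available; the axis titles and the padding and margin values are choices made for this illustration.

```python
import matplotlib.pyplot as plt

# Same values as in the example above.
country = ["INDIA", "JAPAN", "CHINA", "USA", "GERMANY"]
population = [1000, 800, 600, 400, 1100]

fig, ax = plt.subplots()
bars = ax.bar(country, population)

# bar_label writes each bar's height above it; padding is given in points.
labels = ax.bar_label(bars, padding=3)

ax.set_xlabel("Country")
ax.set_ylabel("Population")
ax.set_title("Bar chart with data labels")
ax.margins(y=0.1)  # leave headroom so the tallest bar's label is not clipped

# The text of each label, in the same order as the bars.
label_texts = [label.get_text() for label in labels]

if __name__ == "__main__":
    print(label_texts)
    plt.show()
```

On older Matplotlib versions a similar result can be obtained by calling ax.text once per bar with the bar's position and height.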
Output:
Figure: Bar plot using Matplotlib

Bar Plot using Seaborn module


The Python Seaborn module is built on top of the Matplotlib module and offers
advanced functionality for better visualization of the data values.
Syntax:
seaborn.barplot(x=None, y=None, data=None)
Example:
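import seaborn as sns
import matplotlib.pyplot as plt
import pandas as pd

# Load the dataset; BIKE.csv must be present in the working directory
BIKE = pd.read_csv("BIKE.csv")
# One bar per season; by default the bar height is the mean of "cnt"
sns.barplot(x="season", y="cnt", data=BIKE)
plt.show()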
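The example above needs the file BIKE.csv. For readers without that file, the following sketch uses a tiny in-memory DataFrame instead. The rows are made up and are not the contents of BIKE.csv; only the column names season and cnt are taken from the example. The data labels are added with Matplotlib's bar_label, which assumes Matplotlib 3.4 or newer.

```python
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt

# Made-up stand-in for BIKE.csv: two observations per season.
bike = pd.DataFrame({
    "season": ["spring", "spring", "summer", "summer", "fall", "fall"],
    "cnt": [100, 140, 300, 340, 250, 270],
})

# barplot aggregates the rows of each category; the default estimator is the mean.
ax = sns.barplot(data=bike, x="season", y="cnt")

# Seaborn draws on a Matplotlib Axes, so Matplotlib's bar_label can add data labels.
for container in ax.containers:
    ax.bar_label(container, fmt="%.0f")

# The same aggregation done directly in pandas, for comparison with the bar heights.
season_means = bike.groupby("season", sort=False)["cnt"].mean()
bar_heights = sorted(float(bar.get_height()) for bar in ax.patches)

if __name__ == "__main__":
    print(season_means)
    plt.show()
```

The small lines drawn on top of the bars are Seaborn's default error bars, which indicate the uncertainty of each mean.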

Output:
Figure: Bar plot using Seaborn

Result: Thus, in this experiment we have learned various techniques for constructing
bar plots in Python using Matplotlib and Seaborn.
