42. Histograms
import pandas as pd
import matplotlib.pyplot as plt
# Sample data: only the numeric columns (Height, Weight) are plotted
data = {'Name': ['Arnav', 'Sheela', 'Azhar', 'Bincy', 'Yash', 'Nazar'],
        'Height': [60, 61, 63, 65, 61, 60],
        'Weight': [47, 89, 52, 58, 50, 47]}
df = pd.DataFrame(data)
df.plot(kind='hist')
plt.show()
Program 4-9 displays the histogram for all attributes that have numeric
values, i.e., the 'Height' and 'Weight' attributes, as shown in Figure 4.9. The
plot() function calculated the bin values on the basis of the height and weight
values provided in the DataFrame. It is also possible to set a value for the
bins parameter, for example:
df.plot(kind='hist', bins=20)
df.plot(kind='hist', bins=[18, 19, 20, 21, 22])
df.plot(kind='hist', bins=range(18, 25))
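To see what the bins argument does without drawing anything, the sketch below applies numpy.histogram to the Height values. It is an illustration rather than part of Program 4-9, and it assumes that the plot bins the data the same way NumPy does. The variable names are chosen only for this example.

```python
import numpy as np

# Height values taken from the DataFrame in Program 4-9
heights = [60, 61, 63, 65, 61, 60]

# An integer asks for that many equal-width bins between min and max
counts_int, edges_int = np.histogram(heights, bins=5)

# A sequence gives the bin edges explicitly
counts_seq, edges_seq = np.histogram(heights, bins=[60, 62, 64, 66])

print(counts_int, edges_int)
print(counts_seq, edges_seq)
```

With numpy.histogram, values that lie outside explicitly listed edges are not counted, so the edges should cover the range of the data.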
A histogram is a graph of a frequency distribution: it shows the number of
observations that fall within each given interval.
Example: if you record the heights of 250 people, you might end up with a
histogram like this:
Creating a Histogram
To create a histogram, first divide the whole range of the values into a
series of intervals (bins), then count the values that fall into each
interval. Bins are consecutive, non-overlapping intervals of a variable.
The matplotlib.pyplot.hist() function is used to compute and draw the
histogram of x.
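The two steps described above (make the bins, then count) can be sketched in plain Python. The helper count_into_bins and the sample values are hypothetical and exist only for this illustration. The sketch assumes the convention that every bin includes its left edge and excludes its right edge, except the last bin, which includes both.

```python
def count_into_bins(values, edges):
    """Count how many values fall into each consecutive interval.

    Every interval is half-open, [left, right), except the last one,
    which also includes its right edge.
    """
    counts = [0] * (len(edges) - 1)
    last = len(counts) - 1
    for v in values:
        for i in range(len(counts)):
            left, right = edges[i], edges[i + 1]
            if left <= v < right or (i == last and v == right):
                counts[i] += 1
                break  # bins do not overlap, so stop at the first match
    return counts


# Illustrative sample values (not from the text)
sample = [3, 7, 12, 18, 20, 25, 29, 30]
edges = [0, 10, 20, 30]
print(count_into_bins(sample, edges))  # prints [2, 2, 4]
```

In practice hist() does this counting itself. The loop is only meant to make the idea of a bin concrete.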
The following table shows some of the parameters accepted by the
matplotlib.pyplot.hist() function:
Parameter   Description
x           array or sequence of arrays
bins        optional parameter; an integer, a sequence, or a string
density     optional parameter; a boolean value
range       optional parameter; the lower and upper range of the bins
histtype    optional parameter; the type of histogram to draw [bar, barstacked, step, stepfilled], default is "bar"
align       optional parameter; controls the alignment of the bars [left, mid, right]
weights     optional parameter; an array of weights with the same dimensions as x
bottom      location of the baseline of each bin
rwidth      optional parameter; the relative width of the bars with respect to the bin width
color       optional parameter; a color or a sequence of color specs
label       optional parameter; a string or a sequence of strings to match multiple datasets
log         optional parameter; sets the histogram axis to a log scale
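As a rough illustration of how several of these parameters combine, the sketch below passes bins, range, density, histtype, color, and label in one call. The data, the seed, and the chosen values are assumptions made for the example. plt.show() is left commented out so that the snippet also runs where no display is available.

```python
import numpy as np
import matplotlib.pyplot as plt

# Reproducible sample: 250 values centred on 170 (illustrative only)
rng = np.random.default_rng(0)
x = rng.normal(170, 10, 250)

fig, ax = plt.subplots()
n, bin_edges, patches = ax.hist(
    x,
    bins=15,                # number of equal-width bins
    range=(140, 200),       # lower and upper limit of the bins
    density=True,           # normalise so the total area is 1
    histtype="stepfilled",  # filled outline instead of separate bars
    color="skyblue",
    label="height",
)
ax.legend()  # uses the label given above
# plt.show()  # uncomment to display the figure
```

With density=True the heights in n are no longer counts. Each height multiplied by its bin width gives the fraction of the counted values in that bin.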
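Let's create a basic histogram of some random values. The code below draws
a simple histogram of 250 values taken from a normal distribution with mean
170 and standard deviation 10.
import numpy as np
import matplotlib.pyplot as plt
x = np.random.normal(170, 10, 250)
plt.hist(x)
plt.show()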
from matplotlib import pyplot as plt
import numpy as np
# Creating dataset
a = np.array([22, 87, 5, 43, 56, 73, 55, 54, 11,
              20, 51, 5, 79, 31, 27])
# Creating histogram with explicit bin edges
fig, ax = plt.subplots(figsize=(10, 7))
ax.hist(a, bins=[0, 25, 50, 75, 100])
# Show plot
plt.show()
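The hist() call also returns what it computed: the counts per bin, the bin edges, and the drawn bars. The sketch below reuses the dataset and bin edges from the example above and prints the count for each interval. The variable names are chosen for this illustration, and plt.show() is omitted because only the returned values are inspected.

```python
import numpy as np
import matplotlib.pyplot as plt

a = np.array([22, 87, 5, 43, 56, 73, 55, 54, 11,
              20, 51, 5, 79, 31, 27])

fig, ax = plt.subplots()
# hist() returns the counts, the bin edges, and the bar objects
counts, edges, bars = ax.hist(a, bins=[0, 25, 50, 75, 100])

# Pair each bin's left and right edge with its count
for left, right, c in zip(edges[:-1], edges[1:], counts):
    print(int(left), "to", int(right), ":", int(c))
```

For this dataset the four bins hold 5, 3, 5, and 2 values, which together account for all 15 entries.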
Customising Histogram:
Taking the same DataFrame as in Program 4-9, let us now see how the histogram can be
customised. Let us change the edgecolor, which is the border of each bar, to green. Also,
let us change the line style to ":" and the line width to 2. Let us try another property
called fill, which takes boolean values. The default, True, means each bar will be filled
with color, and False means each bar will be empty. Another property called hatch can be
used to fill each bar with a pattern ('-', '+', 'x', '\\', '*', 'o', 'O', '.'). In
Program 4-10, we have used the hatch value "o".
import pandas as pd
import matplotlib.pyplot as plt
data = {'Name': ['Arnav', 'Sheela', 'Azhar', 'Bincy', 'Yash', 'Nazar'],
        'Height': [60, 61, 63, 65, 61, 60],
        'Weight': [47, 89, 52, 58, 50, 47]}
df = pd.DataFrame(data)
# Green dotted outline, no fill, and circles as the hatch pattern
df.plot(kind='hist', edgecolor='Green', linewidth=2, linestyle=':',
        fill=False, hatch='o')
plt.show()
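The same styling keywords can also be given to matplotlib's hist() directly, which makes it possible to inspect the resulting bars. The sketch below is an illustration using only the Height values. It assumes the default histtype "bar", where each bin is drawn as its own rectangle, and it leaves plt.show() commented out so that it runs without a display.

```python
import matplotlib.pyplot as plt

# Height values from the DataFrame used in Program 4-10
heights = [60, 61, 63, 65, 61, 60]

fig, ax = plt.subplots()
counts, edges, bars = ax.hist(
    heights,
    bins=5,
    edgecolor="green",  # border colour of each bar
    linewidth=2,        # border thickness
    linestyle=":",      # dotted border
    fill=False,         # leave the bars empty
    hatch="o",          # fill pattern drawn inside each bar
)

# Each bar is a rectangle that carries the styling applied above
first_bar = bars[0]
print(first_bar.get_hatch(), first_bar.get_fill(), first_bar.get_linewidth())
# plt.show()  # uncomment to display the figure
```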