Practical 2

Uploaded by

Om Bachhav
Copyright © All Rights Reserved
4/22/24, 11:02 AM Practical 2

MGV's Loknete Vyankatrao Hiray Arts, Science and Commerce College Nashik

Department of Mathematics

M. Sc. 1 Data Science

Practical 2: Probability Distributions of Discrete Random Variables

a. Binomial Random Variable

1. Example. Given 10 trials of a coin toss, generate 10 data points. The sampler takes three parameters:
n - number of trials.
p - probability of success on each trial (e.g. 0.5 for the toss of a fair coin).
size - the shape of the returned array.
In [1]: from numpy import random
x = random.binomial(n=10, p=0.5, size=10)
print(x)

[3 3 3 6 3 3 5 6 7 6]
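
Because the draws are random, the array printed above changes on every run. A minimal sketch of a reproducible version, assuming NumPy's `default_rng` generator interface and an arbitrary seed of 42 (the seed and the variable names are illustrative and not part of the original practical):

```python
import numpy as np

# Seeded generator so the same sample is produced on every run
# (the seed value 42 is an arbitrary choice for illustration).
rng = np.random.default_rng(42)

# 10 data points, each the number of heads in 10 fair coin tosses.
sample = rng.binomial(n=10, p=0.5, size=10)
print(sample)

# The sample mean should be near n * p = 5, though 10 points give only a rough estimate.
print("sample mean =", sample.mean())
```

Re-running the cell with the same seed should reproduce the same ten values, which makes results easier to compare between runs.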

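In [5]: # Visualization of Binomial Distribution
from numpy import random
import matplotlib.pyplot as plt
import seaborn as sns
# sns.distplot is deprecated since seaborn 0.11; histplot is its replacement.
# discrete=True centers one bar on each integer count, kde=True adds the density curve.
sns.histplot(random.binomial(n=10, p=0.5, size=1000), discrete=True, kde=True)
plt.show()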

localhost:8888/notebooks/Desktop/LVH Academic/Data Science/practical exercis/Practical 2.ipynb 1/7



2. Example. Consider a random experiment of tossing a biased coin 6 times, where the probability of getting a head is 0.6. If ‘getting a head’ is considered a ‘success’, then the binomial distribution table contains the probability of x successes for each possible value of x (0 through 6).
In [6]: from scipy.stats import binom
# setting the values of n and p
n = 6
p = 0.6
# defining the list of r values
r_values = list(range(n + 1))
# obtaining the mean and variance
mean, var = binom.stats(n, p)
# list of pmf values
dist = [binom.pmf(r, n, p) for r in r_values]
# printing the table
print("r\tp(r)")
for i in range(n + 1):
    print(str(r_values[i]) + "\t" + str(dist[i]))
# printing mean and variance
print("mean = " + str(mean))
print("variance = " + str(var))

r p(r)
0 0.0040960000000000015
1 0.03686400000000002
2 0.1382400000000001
3 0.2764800000000001
4 0.3110400000000001
5 0.1866240000000001
6 0.04665599999999999
mean = 3.5999999999999996
variance = 1.44
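
The table values can be checked against the closed form P(X = r) = C(n, r) p^r (1 - p)^(n - r), with mean np and variance np(1 - p). A small sketch using only the standard library (the helper name `binomial_pmf` is hypothetical, introduced here for illustration):

```python
from math import comb

def binomial_pmf(r, n, p):
    """P(X = r) for a binomial(n, p) random variable."""
    return comb(n, r) * p**r * (1 - p)**(n - r)

n, p = 6, 0.6
table = [binomial_pmf(r, n, p) for r in range(n + 1)]

for r, prob in enumerate(table):
    print(f"{r}\t{prob:.6f}")

# Mean and variance computed directly from the table.
mean = sum(r * prob for r, prob in enumerate(table))
var = sum((r - mean) ** 2 * prob for r, prob in enumerate(table))
print(f"mean = {mean:.2f}, variance = {var:.2f}")
```

The results should agree with the `binom.pmf` and `binom.stats` output up to floating-point rounding, since np = 6 x 0.6 = 3.6 and np(1 - p) = 3.6 x 0.4 = 1.44.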


In [7]: from scipy.stats import binom
import matplotlib.pyplot as plt
# setting the values of n and p
n = 6
p = 0.6
# defining list of r values
r_values = list(range(n + 1))
# list of pmf values
dist = [binom.pmf(r, n, p) for r in r_values ]
# plotting the graph
plt.bar(r_values, dist)
plt.show()

b. Poisson Random Variable

1. Example. If someone eats twice a day on average, what is the probability that he will eat three times in a day? The Poisson sampler takes two parameters:
lam - rate, or known average number of occurrences (e.g. 2 for the problem above).
size - the shape of the returned array.
Generate a random 1x10 sample for a rate of 2.
In [8]: from numpy import random
x = random.poisson(lam=2, size=10)
print(x)

[4 2 1 0 2 2 2 1 4 1]
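
The random sample does not by itself answer the question posed in the example. A minimal sketch of the direct calculation, assuming the Poisson model with rate 2 and using the closed form P(X = k) = e^(-lam) lam^k / k! alongside `scipy.stats.poisson.pmf` (the variable names are illustrative):

```python
from math import exp, factorial
from scipy.stats import poisson

lam = 2   # average number of meals per day
k = 3     # number of meals we ask about

# Closed form: P(X = k) = exp(-lam) * lam**k / k!
by_hand = exp(-lam) * lam**k / factorial(k)

# Same quantity from scipy.stats.
from_scipy = poisson.pmf(k, mu=lam)

print(f"P(X = 3) by formula: {by_hand:.4f}")
print(f"P(X = 3) from scipy: {from_scipy:.4f}")
```

Both lines should print the same value, roughly 0.18.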

In [10]: # Visualization of Poisson Distribution


from numpy import random
import matplotlib.pyplot as plt
import seaborn as sns


In [11]: # sns.distplot is deprecated since seaborn 0.11; histplot is its replacement
sns.histplot(random.poisson(lam=2, size=1000), discrete=True)
plt.show()

2. Example of frequencies of hurricanes. Assume that we have data on hurricanes observed over a period of 20 years, and we find that the average number of hurricanes per year is 7.
In [1]: import numpy as np
import matplotlib.pyplot as plt
from scipy.stats import poisson
k = np.arange(0, 21)
print(k)

[ 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20]

In [16]: # Poisson PMF (Probability mass function)

In [2]: pmf = poisson.pmf(k, mu=7)
pmf = np.round(pmf, 5)
print(pmf)

[9.1000e-04 6.3800e-03 2.2340e-02 5.2130e-02 9.1230e-02 1.2772e-01
 1.4900e-01 1.4900e-01 1.3038e-01 1.0140e-01 7.0980e-02 4.5170e-02
 2.6350e-02 1.4190e-02 7.0900e-03 3.3100e-03 1.4500e-03 6.0000e-04
 2.3000e-04 9.0000e-05 3.0000e-05]


In [3]: for val, prob in zip(k, pmf):
    print(f"k-value {val} has probability = {prob}")

k-value 0 has probability = 0.00091
k-value 1 has probability = 0.00638
k-value 2 has probability = 0.02234
k-value 3 has probability = 0.05213
k-value 4 has probability = 0.09123
k-value 5 has probability = 0.12772
k-value 6 has probability = 0.149
k-value 7 has probability = 0.149
k-value 8 has probability = 0.13038
k-value 9 has probability = 0.1014
k-value 10 has probability = 0.07098
k-value 11 has probability = 0.04517
k-value 12 has probability = 0.02635
k-value 13 has probability = 0.01419
k-value 14 has probability = 0.00709
k-value 15 has probability = 0.00331
k-value 16 has probability = 0.00145
k-value 17 has probability = 0.0006
k-value 18 has probability = 0.00023
k-value 19 has probability = 9e-05
k-value 20 has probability = 3e-05
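
Note that k = 6 and k = 7 share the same probability. This follows from the ratio P(X = k) / P(X = k - 1) = mu / k, which equals 1 when k = mu = 7. A brief check, reusing the same `poisson.pmf` call (the variable names are illustrative):

```python
import numpy as np
from scipy.stats import poisson

mu = 7
k = np.arange(1, 21)

# Ratio of consecutive probabilities: P(X = k) / P(X = k - 1).
ratio = poisson.pmf(k, mu=mu) / poisson.pmf(k - 1, mu=mu)

# The ratio should match mu / k: above 1 while k < mu, 1 at k = mu, below 1 afterwards.
for kk, r in zip(k[:9], ratio[:9]):
    print(f"k = {kk}: ratio = {r:.4f}, mu/k = {mu / kk:.4f}")
```

This is why the probabilities rise up to k = 6, stay level at k = 7, and fall from k = 8 onward.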

In [15]: plt.plot(k, pmf, marker='o')
plt.xlabel('k')
plt.ylabel('Probability')
plt.show()

In [17]: # Poisson CDF (Cumulative Distribution function)

In [4]: cdf = poisson.cdf(k, mu=7)
cdf = np.round(cdf, 3)
print(cdf)

[0.001 0.007 0.03 0.082 0.173 0.301 0.45 0.599 0.729 0.83 0.901 0.947
0.973 0.987 0.994 0.998 0.999 1. 1. 1. 1. ]


In [19]: for val, prob in zip(k, cdf):
    print(f"k-value {val} has probability = {prob}")

k-value 0 has probability = 0.001
k-value 1 has probability = 0.007
k-value 2 has probability = 0.03
k-value 3 has probability = 0.082
k-value 4 has probability = 0.173
k-value 5 has probability = 0.301
k-value 6 has probability = 0.45
k-value 7 has probability = 0.599
k-value 8 has probability = 0.729
k-value 9 has probability = 0.83
k-value 10 has probability = 0.901
k-value 11 has probability = 0.947
k-value 12 has probability = 0.973
k-value 13 has probability = 0.987
k-value 14 has probability = 0.994
k-value 15 has probability = 0.998
k-value 16 has probability = 0.999

In [20]: plt.plot(k, cdf, marker='o')
plt.xlabel('k')
plt.ylabel('Cumulative Probability')
plt.show()
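
Each CDF value is the cumulative probability P(X <= k), so it answers range questions directly. A short sketch under the same assumption of mu = 7 (the threshold 10 and the interval 5 to 9 are chosen only for illustration):

```python
from scipy.stats import poisson

mu = 7

# P(X <= 10): at most 10 hurricanes in a year.
at_most_10 = poisson.cdf(10, mu=mu)

# P(X > 10): more than 10 hurricanes, via the survival function sf = 1 - cdf.
more_than_10 = poisson.sf(10, mu=mu)

# P(5 <= X <= 9) = P(X <= 9) - P(X <= 4).
between_5_and_9 = poisson.cdf(9, mu=mu) - poisson.cdf(4, mu=mu)

print(f"P(X <= 10)     = {at_most_10:.3f}")
print(f"P(X > 10)      = {more_than_10:.3f}")
print(f"P(5 <= X <= 9) = {between_5_and_9:.3f}")
```

Subtracting two CDF values gives the probability of an interval, which avoids summing the individual PMF terms.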

c. Hypergeometric Random Variable

1. Example. Aces in a Five-Card Poker Hand. The number of aces in a five-card poker hand has the hypergeometric distribution with population size 52, four good elements in the population, and a simple random sample size of 5.
In [22]: import scipy.stats as stats


In [23]: k = np.arange(5)
N = 52 # population size
G = 4 # number of good elements in population
n = 5 # simple random sample size
stats.hypergeom.pmf(k, N, G, n)

Out[23]: array([6.58841998e-01, 2.99473636e-01, 3.99298181e-02, 1.73607905e-03,
       1.84689260e-05])

In [25]: a = np.round(stats.hypergeom.pmf(k, N, G, n), 3)

In [27]: a

Out[27]: array([0.659, 0.299, 0.04 , 0.002, 0. ])

In [26]: plt.plot(k, a, marker='o')
plt.show()
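
The probabilities above follow from the counting formula P(X = k) = C(G, k) C(N - G, n - k) / C(N, n). Note that `scipy.stats.hypergeom` takes its arguments positionally as population size, number of good elements, then sample size, which is the order used in the call above even though SciPy's own documentation names them M, n and N. A minimal check with the standard library (the helper name `aces_pmf` is hypothetical):

```python
from math import comb
from scipy.stats import hypergeom

N = 52  # population size
G = 4   # number of good elements (aces)
n = 5   # sample size (cards in the hand)

def aces_pmf(k):
    """P(exactly k aces) from the counting formula."""
    return comb(G, k) * comb(N - G, n - k) / comb(N, n)

for k in range(G + 1):
    print(f"k = {k}: formula = {aces_pmf(k):.6f}, scipy = {hypergeom.pmf(k, N, G, n):.6f}")
```

The two columns should agree, and the five probabilities should sum to 1 because a hand holds between 0 and 4 aces.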
