keeratsi_HW8

The document discusses statistical analysis of extreme events and peak discharge values using various distributions such as Gumbel, Weibull, and GEV. It includes data fitting, error calculations, and visualizations of probability density functions (PDF) and cumulative distribution functions (CDF). The analysis concludes that the Gumbel distribution provides the best fit for peak annual discharge values, with a calculated 100-year event discharge of 130110.13 cfs.


import math

import numpy as np
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt
import scipy.stats as stats

from google.colab import drive


drive.mount('/content/drive')

APPLDIR = 'drive/MyDrive/Data Analysis/HW8/'

Mounted at /content/drive

Problem 1
The return period T of an extreme event is the average interval of time between events of that
magnitude or greater, so the probability of exceedance in any single year is 1/T. If the return
period equals the design lifetime n, the probability of at least one such event occurring during
the lifetime is 1 - (1 - 1/T)^n; for n = T this is roughly 63% (it approaches 1 - e^-1 as T grows),
not 100%. In other words, for some technology with a given design lifetime, extreme events with
return periods less than or equal to the design lifetime are more likely than not to occur within
that lifetime, and we should design accordingly.
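
A quick numerical check of this claim (a minimal sketch; the 100-year values below are illustrative, not part of the assignment):

# Probability of at least one T-year event during an n-year design lifetime,
# assuming independent years with annual exceedance probability 1/T
def lifetime_exceedance_prob(T, n):
    return 1 - (1 - 1/T)**n

print(lifetime_exceedance_prob(T=100, n=100))  # ~0.634, not 1.0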

Problem 2
import matplotlib.pyplot as plt
from mpl_toolkits.mplot3d import Axes3D
import numpy as np
import pandas as pd
import scipy

biv = pd.read_csv(APPLDIR + 'bivariate_data_xy-1.csv')

plt.scatter(biv['x'], biv['y'])

[Figure: scatter plot of x vs. y]
# Fit a bivariate normal distribution

x = biv['x'].values
y = biv['y'].values

mu_x = np.mean(x)
mu_y = np.mean(y)
cov_xy = np.cov(x,y)
mu = [mu_x, mu_y]

print('mu:', mu)
print('cov:', cov_xy)

mu: [np.float64(2.1451007015642416), np.float64(-1.4775267035869502)]


cov: [[ 0.20984158 -0.13670113]
[-0.13670113 0.22063963]]
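
The negative off-diagonal covariance matches the downward trend in the scatter plot. As a small added check (not in the original notebook), the implied correlation coefficient can be computed from the fitted matrix:

# Correlation coefficient implied by the fitted covariance matrix:
# rho = cov(x, y) / (sigma_x * sigma_y)
rho = cov_xy[0, 1] / np.sqrt(cov_xy[0, 0] * cov_xy[1, 1])
print(rho)  # approximately -0.64 for the values printed above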

x_list = np.arange(0, 3, 0.1)
y_list = np.arange(-3, 0, 0.1)
X_list, Y_list = np.meshgrid(x_list, y_list)
XY_pts = np.dstack((X_list, Y_list))

# PDF values
f_xy = scipy.stats.multivariate_normal.pdf(XY_pts, mean=mu, cov=cov_xy)
# CDF values
F_xy = scipy.stats.multivariate_normal.cdf(XY_pts, mean=mu, cov=cov_xy)

fig, ax = plt.subplots(2, 2, squeeze=False,
                       gridspec_kw={'wspace': 0.25, 'hspace': 0.2})
# ----- PDF -----
# 3D Plot
ax[0,0].remove()
ax[0,0] = fig.add_subplot(2,2,1, projection='3d')
ax[0,0].plot_surface(X_list, Y_list, f_xy, cmap='viridis',
edgecolor='k',alpha=0.7)
ax[0,0].set_xlabel('x')
ax[0,0].set_ylabel('y')
ax[0,0].set_zlabel('PDF', rotation=90)
# Density Plot (the third positional argument to scatter sets the marker size from the PDF value)
ax[0,1].scatter(X_list, Y_list, f_xy, color='k')
ax[0,1].contour(X_list, Y_list, f_xy, levels=10)
ax[0,1].set_xlabel('x')
ax[0,1].set_ylabel('y')
# ----- CDF -----
# 3D plot
ax[1,0].remove()
ax[1,0] = fig.add_subplot(2,2,3, projection='3d')
ax[1,0].plot_surface(X_list, Y_list, F_xy, cmap='viridis',
edgecolor='k',alpha=0.7)
ax[1,0].set_xlabel('x')
ax[1,0].set_ylabel('y')
ax[1,0].set_zlabel('CDF', rotation=90)
# Density Plot
ax[1,1].scatter(X_list, Y_list, F_xy, color='k')
ax[1,1].contour(X_list, Y_list, F_xy, levels=10)
ax[1,1].set_xlabel('x')
ax[1,1].set_ylabel('y')
fig.set_size_inches(10,8)
plt.show()
# F(X=2.5, Y=-1) = P(X < 2.5, Y < -1)
P_e = scipy.stats.multivariate_normal.cdf([2.5, -1], mean=mu, cov=cov_xy)

fig, ax = plt.subplots(2, 2, squeeze=False,
                       gridspec_kw={'wspace': 0.25, 'hspace': 0.2})
# ----- PDF -----
# 3D Plot
ax[0,0].remove()
ax[0,0] = fig.add_subplot(2,2,1, projection='3d')
ax[0,0].plot_surface(X_list, Y_list, f_xy, cmap='viridis',
edgecolor='k',alpha=0.7)
ax[0,0].set_xlabel('x')
ax[0,0].set_ylabel('y')
ax[0,0].set_zlabel('PDF', rotation=90)
# Density Plot
ax[0,1].scatter(X_list, Y_list, f_xy, color='k')
ax[0,1].contour(X_list, Y_list, f_xy, levels=10)
ax[0,1].set_xlabel('x')
ax[0,1].set_ylabel('y')
# ----- CDF -----
# 3D plot
ax[1,0].remove()
ax[1,0] = fig.add_subplot(2,2,3, projection='3d')
ax[1,0].plot_surface(X_list, Y_list, F_xy, cmap='viridis',
edgecolor='k',alpha=0.7)
ax[1,0].scatter(2.5,-1,P_e, color='red', s=100)
ax[1,0].set_xlabel('x')
ax[1,0].set_ylabel('y')
ax[1,0].set_zlabel('CDF', rotation=90)
# Density Plot
ax[1,1].scatter(X_list, Y_list, F_xy, color='k')
ax[1,1].contour(X_list, Y_list, F_xy, levels=10)
ax[1,1].set_xlabel('x')
ax[1,1].set_ylabel('y')
fig.set_size_inches(10,8)
plt.show()
print('P(X < 2.5, Y < -1) =', np.round(P_e, 2))

P(X < 2.5, Y < -1) = 0.63
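
As a sanity check on this value (a minimal sketch added here, not part of the original notebook), the same probability can be estimated by Monte Carlo sampling from the fitted distribution:

# Monte Carlo estimate of P(X < 2.5, Y < -1) under the fitted bivariate normal
rng = np.random.default_rng(0)
samples = scipy.stats.multivariate_normal.rvs(mean=mu, cov=cov_xy,
                                              size=100_000, random_state=rng)
mc_estimate = np.mean((samples[:, 0] < 2.5) & (samples[:, 1] < -1))
print(mc_estimate)  # should land near 0.63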

Problem 3
Parts a and b)
peak_discharge = pd.read_csv(APPLDIR + 'allegheny_annual_discharge_peak-1.csv')

plt.scatter(peak_discharge.year, peak_discharge.discharge_cfs)

[Figure: scatter plot of peak annual discharge by year]
# Fitting peak discharge values to the four candidate extreme-value distributions
peak_vals = peak_discharge['discharge_cfs'].values
peak_vals_sorted = np.sort(peak_vals)
# Weibull plotting positions: p_i = i / (n + 1)
p_x = (np.arange(len(peak_vals)) + 1) / (len(peak_vals) + 1)

h_values = [1.5, 1.25, 1.0, 0.75, 0.5, 0.25]

# Gumbel: fit once by MLE, then compute the h-weighted max error for each h
gumbel_mean, gumbel_std = scipy.stats.gumbel_r.fit(peak_vals, method='MLE')
F_x = scipy.stats.gumbel_r.cdf(peak_vals_sorted, loc=gumbel_mean, scale=gumbel_std)

h_gumbel_max_error = []
for h in h_values:
    max_error = np.max(np.abs(p_x**h - F_x**h))
    h_gumbel_max_error.append(np.round(max_error, 3))

# Weibull (location fixed at zero)
weibull_beta, weibull_epsilon, weibull_sigma = scipy.stats.weibull_min.fit(peak_vals, floc=0)
F_x = scipy.stats.weibull_min.cdf(peak_vals_sorted, weibull_beta,
                                  loc=weibull_epsilon, scale=weibull_sigma)

h_weibull_max_error = []
for h in h_values:
    max_error = np.max(np.abs(p_x**h - F_x**h))
    h_weibull_max_error.append(np.round(max_error, 3))

# GEV
gev_delta, gev_mu, gev_sigma = scipy.stats.genextreme.fit(peak_vals)
F_x = scipy.stats.genextreme.cdf(peak_vals_sorted, gev_delta, loc=gev_mu, scale=gev_sigma)

h_gev_max_error = []
for h in h_values:
    max_error = np.max(np.abs(p_x**h - F_x**h))
    h_gev_max_error.append(np.round(max_error, 3))

# Frechet
frechet_gamma, frechet_mu, frechet_sigma = scipy.stats.invweibull.fit(peak_vals)
F_x = scipy.stats.invweibull.cdf(peak_vals_sorted, frechet_gamma,
                                 loc=frechet_mu, scale=frechet_sigma)

h_frechet_max_error = []
for h in h_values:
    max_error = np.max(np.abs(p_x**h - F_x**h))
    h_frechet_max_error.append(np.round(max_error, 3))

fig, ax = plt.subplots(1, 4, squeeze=False,
                       gridspec_kw={'wspace': 0.3, 'hspace': 0.3})

# Data
ax[0,0].scatter(peak_vals_sorted, p_x)
ax[0,1].scatter(peak_vals_sorted, p_x)
ax[0,2].scatter(peak_vals_sorted, p_x)
ax[0,3].scatter(peak_vals_sorted, p_x)
# Evenly spaced discharge values for drawing the fitted CDF curves
# (np.linspace avoids the huge array a 0.1-cfs arange step would create)
xline = np.linspace(peak_vals.min(), peak_vals.max(), 500)
# Gumbel CDF
Fxline_gumbel = scipy.stats.gumbel_r.cdf(xline, loc=gumbel_mean, scale=gumbel_std)
ax[0,0].plot(xline, Fxline_gumbel, color='k')
ax[0,0].set_title('Gumbel (Type I)', fontsize=14)
# Frechet CDF
Fxline_frechet = scipy.stats.invweibull.cdf(xline, frechet_gamma,
                                            loc=frechet_mu, scale=frechet_sigma)
ax[0,1].plot(xline, Fxline_frechet, color='k')
ax[0,1].set_title('Frechet (Type II)', fontsize=14)
# Weibull CDF
Fxline_weibull = scipy.stats.weibull_min.cdf(xline, weibull_beta,
                                             loc=weibull_epsilon, scale=weibull_sigma)
ax[0,2].plot(xline, Fxline_weibull, color='k')
ax[0,2].set_title('Weibull (Type III)', fontsize=14)
# GEV CDF
Fxline_gev = scipy.stats.genextreme.cdf(xline, gev_delta, loc=gev_mu, scale=gev_sigma)
ax[0,3].plot(xline, Fxline_gev, color='k')
ax[0,3].set_title('GEV', fontsize=14)
fig.set_size_inches(10, 4)
plt.show()

max_error_df = pd.DataFrame({'h': h_values,
                             'Gumbel': h_gumbel_max_error,
                             'Frechet': h_frechet_max_error,
                             'Weibull': h_weibull_max_error,
                             'GEV': h_gev_max_error})

max_error_df

{"summary":"{\n \"name\": \"max_error_df\",\n \"rows\": 6,\n


\"fields\": [\n {\n \"column\": \"h\",\n \"properties\":
{\n \"dtype\": \"number\",\n \"std\":
0.46770717334674267,\n \"min\": 0.25,\n \"max\": 1.5,\n
\"num_unique_values\": 6,\n \"samples\": [\n 1.5,\n
1.25,\n 0.25\n ],\n \"semantic_type\": \"\",\n
\"description\": \"\"\n }\n },\n {\n \"column\":
\"Gumbel\",\n \"properties\": {\n \"dtype\": \"number\",\n
\"std\": 0.02650597416935787,\n \"min\": 0.048,\n
\"max\": 0.122,\n \"num_unique_values\": 5,\n
\"samples\": [\n 0.073,\n 0.122,\n 0.06\n
],\n \"semantic_type\": \"\",\n \"description\": \"\"\n
}\n },\n {\n \"column\": \"Frechet\",\n
\"properties\": {\n \"dtype\": \"number\",\n \"std\":
0.06571453416102101,\n \"min\": 0.55,\n \"max\": 0.719,\
n \"num_unique_values\": 6,\n \"samples\": [\n
0.6,\n 0.641,\n 0.55\n ],\n
\"semantic_type\": \"\",\n \"description\": \"\"\n }\
n },\n {\n \"column\": \"Weibull\",\n \"properties\":
{\n \"dtype\": \"number\",\n \"std\":
0.02013620288601271,\n \"min\": 0.112,\n \"max\":
0.163,\n \"num_unique_values\": 6,\n \"samples\": [\n
0.129,\n 0.12,\n 0.163\n ],\n
\"semantic_type\": \"\",\n \"description\": \"\"\n }\
n },\n {\n \"column\": \"GEV\",\n \"properties\": {\n
\"dtype\": \"number\",\n \"std\": 0.06826248359579853,\n
\"min\": 0.559,\n \"max\": 0.741,\n
\"num_unique_values\": 6,\n \"samples\": [\n 0.63,\n
0.67,\n 0.559\n ],\n \"semantic_type\": \"\",\n
\"description\": \"\"\n }\n }\n ]\
n}","type":"dataframe","variable_name":"max_error_df"}

The Gumbel distribution has the lowest maximum error across the full range of h-values, which
weight the fit toward different parts of the sample distribution (low h emphasizes the lower
tail, high h the upper tail). This implies that Gumbel works best for fitting the peak annual
discharge values, in the tails as well as in the bulk of the distribution.
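
Since the same fit-plus-error procedure is repeated for every candidate distribution (and again below for the pre- and post-1965 subsets), it could be consolidated into a small helper. The hypothetical h_max_errors function below is one sketch of that refactoring, not part of the original notebook:

# Hypothetical helper: fit a scipy.stats distribution by MLE and return the
# h-weighted max errors between the empirical and fitted CDFs
def h_max_errors(dist, data, h_values, **fit_kwargs):
    params = dist.fit(data, **fit_kwargs)
    data_sorted = np.sort(data)
    p = (np.arange(len(data)) + 1) / (len(data) + 1)  # Weibull plotting positions
    F = dist.cdf(data_sorted, *params)
    return [np.round(np.max(np.abs(p**h - F**h)), 3) for h in h_values]

# Example usage (same pattern for gumbel_r, invweibull, genextreme):
# h_max_errors(scipy.stats.weibull_min, peak_vals, h_values, floc=0)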

Part c)
gumbel_100yr = scipy.stats.gumbel_r.ppf(.99, loc=gumbel_mean, scale=gumbel_std)

print(f'The 100-yr event is {gumbel_100yr:.2f} cfs')

The 100-yr event is 130110.13 cfs
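
The 0.99 quantile follows from the return-period relation T = 1/(1 - F(x)), so the T-year event is the (1 - 1/T) quantile of the fitted distribution. A minimal sketch generalizing this (the 50- and 500-year periods are added here for illustration):

# T-year event from the fitted Gumbel: the (1 - 1/T) quantile
for T in [50, 100, 500]:
    q = scipy.stats.gumbel_r.ppf(1 - 1/T, loc=gumbel_mean, scale=gumbel_std)
    print(f'{T}-yr event: {q:.2f} cfs')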

Part d)
# Repeating analysis for years prior to 1965

peak_vals = peak_discharge[peak_discharge['year'] < 1965]['discharge_cfs'].values
peak_vals_sorted = np.sort(peak_vals)
p_x = (np.arange(len(peak_vals)) + 1) / (len(peak_vals) + 1)

h_values = [1.5, 1.25, 1.0, 0.75, 0.5, 0.25]

# Gumbel: fit once by MLE, then compute the h-weighted max error for each h
gumbel_mean, gumbel_std = scipy.stats.gumbel_r.fit(peak_vals, method='MLE')
F_x = scipy.stats.gumbel_r.cdf(peak_vals_sorted, loc=gumbel_mean, scale=gumbel_std)

h_gumbel_max_error = []
for h in h_values:
    max_error = np.max(np.abs(p_x**h - F_x**h))
    h_gumbel_max_error.append(np.round(max_error, 3))

# Weibull (location fixed at zero)
weibull_beta, weibull_epsilon, weibull_sigma = scipy.stats.weibull_min.fit(peak_vals, floc=0)
F_x = scipy.stats.weibull_min.cdf(peak_vals_sorted, weibull_beta,
                                  loc=weibull_epsilon, scale=weibull_sigma)

h_weibull_max_error = []
for h in h_values:
    max_error = np.max(np.abs(p_x**h - F_x**h))
    h_weibull_max_error.append(np.round(max_error, 3))

# GEV
gev_delta, gev_mu, gev_sigma = scipy.stats.genextreme.fit(peak_vals)
F_x = scipy.stats.genextreme.cdf(peak_vals_sorted, gev_delta, loc=gev_mu, scale=gev_sigma)

h_gev_max_error = []
for h in h_values:
    max_error = np.max(np.abs(p_x**h - F_x**h))
    h_gev_max_error.append(np.round(max_error, 3))

# Frechet
frechet_gamma, frechet_mu, frechet_sigma = scipy.stats.invweibull.fit(peak_vals)
F_x = scipy.stats.invweibull.cdf(peak_vals_sorted, frechet_gamma,
                                 loc=frechet_mu, scale=frechet_sigma)

h_frechet_max_error = []
for h in h_values:
    max_error = np.max(np.abs(p_x**h - F_x**h))
    h_frechet_max_error.append(np.round(max_error, 3))

fig, ax = plt.subplots(1, 4, squeeze=False,
                       gridspec_kw={'wspace': 0.3, 'hspace': 0.3})
fig.suptitle('Pre-1965')

# Data
ax[0,0].scatter(peak_vals_sorted, p_x)
ax[0,1].scatter(peak_vals_sorted, p_x)
ax[0,2].scatter(peak_vals_sorted, p_x)
ax[0,3].scatter(peak_vals_sorted, p_x)
# Evenly spaced discharge values for drawing the fitted CDF curves
xline = np.linspace(peak_vals.min(), peak_vals.max(), 500)
# Gumbel CDF
Fxline_gumbel = scipy.stats.gumbel_r.cdf(xline, loc=gumbel_mean, scale=gumbel_std)
ax[0,0].plot(xline, Fxline_gumbel, color='k')
ax[0,0].set_title('Gumbel (Type I)', fontsize=14)
# Frechet CDF
Fxline_frechet = scipy.stats.invweibull.cdf(xline, frechet_gamma,
                                            loc=frechet_mu, scale=frechet_sigma)
ax[0,1].plot(xline, Fxline_frechet, color='k')
ax[0,1].set_title('Frechet (Type II)', fontsize=14)
# Weibull CDF
Fxline_weibull = scipy.stats.weibull_min.cdf(xline, weibull_beta,
                                             loc=weibull_epsilon, scale=weibull_sigma)
ax[0,2].plot(xline, Fxline_weibull, color='k')
ax[0,2].set_title('Weibull (Type III)', fontsize=14)
# GEV CDF
Fxline_gev = scipy.stats.genextreme.cdf(xline, gev_delta, loc=gev_mu, scale=gev_sigma)
ax[0,3].plot(xline, Fxline_gev, color='k')
ax[0,3].set_title('GEV', fontsize=14)
fig.set_size_inches(10, 4)
plt.show()
max_error_df = pd.DataFrame({'h': h_values,
                             'Gumbel': h_gumbel_max_error,
                             'Frechet': h_frechet_max_error,
                             'Weibull': h_weibull_max_error,
                             'GEV': h_gev_max_error})

max_error_df

{"summary":"{\n \"name\": \"max_error_df\",\n \"rows\": 6,\n


\"fields\": [\n {\n \"column\": \"h\",\n \"properties\":
{\n \"dtype\": \"number\",\n \"std\":
0.46770717334674267,\n \"min\": 0.25,\n \"max\": 1.5,\n
\"num_unique_values\": 6,\n \"samples\": [\n 1.5,\n
1.25,\n 0.25\n ],\n \"semantic_type\": \"\",\n
\"description\": \"\"\n }\n },\n {\n \"column\":
\"Gumbel\",\n \"properties\": {\n \"dtype\": \"number\",\n
\"std\": 0.04025750447639131,\n \"min\": 0.058,\n
\"max\": 0.173,\n \"num_unique_values\": 6,\n
\"samples\": [\n 0.094,\n 0.084,\n 0.173\n
],\n \"semantic_type\": \"\",\n \"description\": \"\"\n
}\n },\n {\n \"column\": \"Frechet\",\n
\"properties\": {\n \"dtype\": \"number\",\n \"std\":
0.06722325986343318,\n \"min\": 0.452,\n \"max\":
0.631,\n \"num_unique_values\": 6,\n \"samples\": [\n
0.516,\n 0.559,\n 0.452\n ],\n
\"semantic_type\": \"\",\n \"description\": \"\"\n }\
n },\n {\n \"column\": \"Weibull\",\n \"properties\":
{\n \"dtype\": \"number\",\n \"std\":
0.012832251036613437,\n \"min\": 0.054,\n \"max\":
0.091,\n \"num_unique_values\": 6,\n \"samples\": [\n
0.091,\n 0.082,\n 0.07\n ],\n
\"semantic_type\": \"\",\n \"description\": \"\"\n }\
n },\n {\n \"column\": \"GEV\",\n \"properties\": {\n
\"dtype\": \"number\",\n \"std\": 0.0721544639413714,\n
\"min\": 0.466,\n \"max\": 0.663,\n
\"num_unique_values\": 6,\n \"samples\": [\n 0.558,\n
0.602,\n 0.466\n ],\n \"semantic_type\": \"\",\
n \"description\": \"\"\n }\n }\n ]\
n}","type":"dataframe","variable_name":"max_error_df"}

For discharge prior to 1965, the Weibull distribution performs better in the tails, while Gumbel
and Weibull are comparable over the whole range of the sample distribution. We'll choose the
Weibull.

weibull_100yr = scipy.stats.weibull_min.ppf(.99, weibull_beta,
                                            loc=weibull_epsilon, scale=weibull_sigma)

print(f'The 100-yr event is {weibull_100yr:.2f} cfs')

The 100-yr event is 134777.35 cfs

# Repeating analysis for years after 1965
# (note: with '< 1965' above and '> 1965' here, the year 1965 itself falls in neither subset)

peak_vals = peak_discharge[peak_discharge['year'] > 1965]['discharge_cfs'].values
peak_vals_sorted = np.sort(peak_vals)
p_x = (np.arange(len(peak_vals)) + 1) / (len(peak_vals) + 1)

h_values = [1.5, 1.25, 1.0, 0.75, 0.5, 0.25]

# Gumbel: fit once by MLE, then compute the h-weighted max error for each h
gumbel_mean, gumbel_std = scipy.stats.gumbel_r.fit(peak_vals, method='MLE')
F_x = scipy.stats.gumbel_r.cdf(peak_vals_sorted, loc=gumbel_mean, scale=gumbel_std)

h_gumbel_max_error = []
for h in h_values:
    max_error = np.max(np.abs(p_x**h - F_x**h))
    h_gumbel_max_error.append(np.round(max_error, 3))

# Weibull (location fixed at zero)
weibull_beta, weibull_epsilon, weibull_sigma = scipy.stats.weibull_min.fit(peak_vals, floc=0)
F_x = scipy.stats.weibull_min.cdf(peak_vals_sorted, weibull_beta,
                                  loc=weibull_epsilon, scale=weibull_sigma)

h_weibull_max_error = []
for h in h_values:
    max_error = np.max(np.abs(p_x**h - F_x**h))
    h_weibull_max_error.append(np.round(max_error, 3))

# GEV
gev_delta, gev_mu, gev_sigma = scipy.stats.genextreme.fit(peak_vals)
F_x = scipy.stats.genextreme.cdf(peak_vals_sorted, gev_delta, loc=gev_mu, scale=gev_sigma)

h_gev_max_error = []
for h in h_values:
    max_error = np.max(np.abs(p_x**h - F_x**h))
    h_gev_max_error.append(np.round(max_error, 3))

# Frechet
frechet_gamma, frechet_mu, frechet_sigma = scipy.stats.invweibull.fit(peak_vals)
F_x = scipy.stats.invweibull.cdf(peak_vals_sorted, frechet_gamma,
                                 loc=frechet_mu, scale=frechet_sigma)

h_frechet_max_error = []
for h in h_values:
    max_error = np.max(np.abs(p_x**h - F_x**h))
    h_frechet_max_error.append(np.round(max_error, 3))

fig, ax = plt.subplots(1, 4, squeeze=False,
                       gridspec_kw={'wspace': 0.3, 'hspace': 0.3})
fig.suptitle('Post-1965')

# Data
ax[0,0].scatter(peak_vals_sorted, p_x)
ax[0,1].scatter(peak_vals_sorted, p_x)
ax[0,2].scatter(peak_vals_sorted, p_x)
ax[0,3].scatter(peak_vals_sorted, p_x)
# Evenly spaced discharge values for drawing the fitted CDF curves
xline = np.linspace(peak_vals.min(), peak_vals.max(), 500)
# Gumbel CDF
Fxline_gumbel = scipy.stats.gumbel_r.cdf(xline, loc=gumbel_mean, scale=gumbel_std)
ax[0,0].plot(xline, Fxline_gumbel, color='k')
ax[0,0].set_title('Gumbel (Type I)', fontsize=14)
# Frechet CDF
Fxline_frechet = scipy.stats.invweibull.cdf(xline, frechet_gamma,
                                            loc=frechet_mu, scale=frechet_sigma)
ax[0,1].plot(xline, Fxline_frechet, color='k')
ax[0,1].set_title('Frechet (Type II)', fontsize=14)
# Weibull CDF
Fxline_weibull = scipy.stats.weibull_min.cdf(xline, weibull_beta,
                                             loc=weibull_epsilon, scale=weibull_sigma)
ax[0,2].plot(xline, Fxline_weibull, color='k')
ax[0,2].set_title('Weibull (Type III)', fontsize=14)
# GEV CDF
Fxline_gev = scipy.stats.genextreme.cdf(xline, gev_delta, loc=gev_mu, scale=gev_sigma)
ax[0,3].plot(xline, Fxline_gev, color='k')
ax[0,3].set_title('GEV', fontsize=14)
fig.set_size_inches(10, 4)
plt.show()

max_error_df = pd.DataFrame({'h': h_values,
                             'Gumbel': h_gumbel_max_error,
                             'Frechet': h_frechet_max_error,
                             'Weibull': h_weibull_max_error,
                             'GEV': h_gev_max_error})

max_error_df

{"summary":"{\n \"name\": \"max_error_df\",\n \"rows\": 6,\n


\"fields\": [\n {\n \"column\": \"h\",\n \"properties\":
{\n \"dtype\": \"number\",\n \"std\":
0.46770717334674267,\n \"min\": 0.25,\n \"max\": 1.5,\n
\"num_unique_values\": 6,\n \"samples\": [\n 1.5,\n
1.25,\n 0.25\n ],\n \"semantic_type\": \"\",\n
\"description\": \"\"\n }\n },\n {\n \"column\":
\"Gumbel\",\n \"properties\": {\n \"dtype\": \"number\",\n
\"std\": 0.015501612819316578,\n \"min\": 0.056,\n
\"max\": 0.101,\n \"num_unique_values\": 6,\n
\"samples\": [\n 0.076,\n 0.071,\n 0.101\n
],\n \"semantic_type\": \"\",\n \"description\": \"\"\n
}\n },\n {\n \"column\": \"Frechet\",\n
\"properties\": {\n \"dtype\": \"number\",\n \"std\":
0.0782672771128948,\n \"min\": 0.494,\n \"max\": 0.703,\
n \"num_unique_values\": 6,\n \"samples\": [\n
0.639,\n 0.675,\n 0.494\n ],\n
\"semantic_type\": \"\",\n \"description\": \"\"\n }\
n },\n {\n \"column\": \"Weibull\",\n \"properties\":
{\n \"dtype\": \"number\",\n \"std\":
0.008318653737234168,\n \"min\": 0.076,\n \"max\":
0.096,\n \"num_unique_values\": 6,\n \"samples\": [\n
0.096,\n 0.094,\n 0.083\n ],\n
\"semantic_type\": \"\",\n \"description\": \"\"\n }\
n },\n {\n \"column\": \"GEV\",\n \"properties\": {\n
\"dtype\": \"number\",\n \"std\": 0.0803683187995535,\n
\"min\": 0.499,\n \"max\": 0.716,\n
\"num_unique_values\": 6,\n \"samples\": [\n 0.65,\n
0.685,\n 0.499\n ],\n \"semantic_type\": \"\",\
n \"description\": \"\"\n }\n }\n ]\
n}","type":"dataframe","variable_name":"max_error_df"}

For years after 1965, the Gumbel distribution generally fits better than the others.

gumbel_100yr = scipy.stats.gumbel_r.ppf(.99, loc=gumbel_mean, scale=gumbel_std)

print(f'The 100-yr event is {gumbel_100yr:.2f} cfs')

The 100-yr event is 89643.48 cfs
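
The post-1965 estimate is far below both the full-record (130110.13 cfs) and pre-1965 (134777.35 cfs) estimates, so the choice of record matters. A minimal sketch collecting the printed results above for side-by-side comparison (values copied from the outputs, not recomputed):

# 100-yr event estimates from the three analyses above
summary = pd.DataFrame({'record': ['full', 'pre-1965', 'post-1965'],
                        'best fit': ['Gumbel', 'Weibull', 'Gumbel'],
                        '100-yr event (cfs)': [130110.13, 134777.35, 89643.48]})
print(summary)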
