
EDA Dumps 2 PDF

The document summarizes an exploratory data analysis (EDA) of a CSV file of truck dump records. The file has 2,141,621 rows and 15 columns covering arrival time, truck, destination, tons, distance, travel time and other variables. Summary statistics are given for tonnage, distance, times and the other numeric fields. The data is then grouped and counted by truck, destination, origin, shovel and material type. Rows where distance or travel time is zero are isolated as a subset and removed.

Uploaded by

Samuel Lambrecht

EDA Dumps

January 7, 2021

[1]: from IPython.display import display, HTML

     # Widen the notebook container so wide tables are not clipped
     display(HTML("<style>.container { width:100% !important; }</style>"))

<IPython.core.display.HTML object>

[2]: import pandas as pd


import numpy as np
import seaborn as sns
import datetime
from matplotlib import pyplot as plt
from pandas import Timestamp, Series, date_range
import warnings
warnings.filterwarnings('ignore')
pd.set_option('display.max_rows', 100)

[3]: def full_print(data):
         # Show every row and column of `data`. The previous display limits
         # are restored automatically when the with-block exits.
         with pd.option_context('display.max_rows', None, 'display.max_columns', None):
             display(data)

[4]: path = "C:/Users/Samuel/Documents/Machine_Learning/datasets/Dumps/"


fname = path + "Dumps.csv"

[5]: fname

[5]: 'C:/Users/Samuel/Documents/Machine_Learning/datasets/Dumps/Dumps.csv'

[6]: df = pd.read_csv(fname, sep =";")
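The load step can also fix column types up front. The sketch below is not the notebook's code: it reads a hypothetical two-row sample in the same semicolon-separated layout, forces `OpCAEX` to string so that IDs such as `0985` keep their leading zero, and parses `TimeArrived` while reading. In the real file the operator columns stay as strings only because values such as `mmsunk` are present, so the explicit `dtype` is an assumption about what is wanted, not a correction.

```python
import io
import pandas as pd

# Hypothetical sample with a subset of the Dumps.csv columns (semicolon-separated).
sample = io.StringIO(
    "TimeArrived;Truck;Tons;OpCAEX\n"
    "2020-12-22 06:28:04.000;CA64;324.0;8689\n"
    "2020-12-22 06:07:05.000;CA94;269.0;0985\n"
)

# dtype keeps operator IDs as text; parse_dates converts the timestamp column on load.
sample_df = pd.read_csv(
    sample,
    sep=";",
    dtype={"OpCAEX": str},
    parse_dates=["TimeArrived"],
)
print(sample_df.dtypes)
```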

[7]: df.shape

[7]: (2141621, 15)

[8]: df.columns

[8]: Index(['TimeArrived', 'Truck', 'Destino', 'Origen', 'Shovel', 'Tronadura',
'Tons', 'Distancia', 'TpoViaje', 'TpoEsperaDump', 'TpoDump',
'TipoMIneral', 'OpCAEX', 'OpSHV', 'ExtraLoad'],
dtype='object')

[9]: df.describe()

[9]: Tons Distancia TpoViaje TpoEsperaDump TpoDump \


count 2.141621e+06 2.141621e+06 2.141621e+06 2.141621e+06 2.141621e+06
mean 3.029798e+02 3.738749e+03 1.031167e+01 2.084086e+00 1.589107e+00
std 1.985622e+01 2.340538e+03 6.047821e+00 3.391814e+00 8.192434e-01
min 0.000000e+00 0.000000e+00 0.000000e+00 0.000000e+00 0.000000e+00
25% 2.960000e+02 2.009000e+03 5.916667e+00 0.000000e+00 1.000000e+00
50% 3.060000e+02 3.125000e+03 9.383333e+00 7.500000e-01 1.400000e+00
75% 3.104576e+02 5.122000e+03 1.361667e+01 2.766667e+00 2.000000e+00
max 4.000000e+02 9.494100e+04 2.443833e+02 3.010667e+02 3.000000e+00

ExtraLoad
count 2.141621e+06
mean 3.782182e-04
std 1.944416e-02
min 0.000000e+00
25% 0.000000e+00
50% 0.000000e+00
75% 0.000000e+00
max 1.000000e+00

[10]: df.head(5)

[10]: TimeArrived Truck Destino Origen Shovel \


0 2020-12-22 06:28:04.000 CA64 CH-02 F7R1-2757-02/MP2 CF10
1 2020-12-22 06:20:01.000 CA103 CH-02 STOC-2930-00/MM1 PA05
2 2020-12-22 06:11:17.000 CA61 CH-02 STOC-2930-00/MM1 PA05
3 2020-12-22 06:07:05.000 CA94 CH-02 STOC-2930-00/MM1 PA05
4 2020-12-22 05:54:04.000 CA70 CH-02 STOC-2930-00/MM1 PA05

Tronadura Tons Distancia TpoViaje TpoEsperaDump TpoDump \


0 F7R1-2757-02 324.0 3875 16.950000 0.0 0.933333
1 STOC-2930-00 296.0 1752 7.733333 0.0 0.983333
2 STOC-2930-00 287.0 1752 7.733333 0.0 0.950000
3 STOC-2930-00 269.0 1752 7.733333 0.0 0.900000
4 STOC-2930-00 318.0 1752 7.733333 0.0 0.183333

TipoMIneral OpCAEX OpSHV ExtraLoad


0 Media Ley Prim 8689 6668 0
1 Mineral Media 3015 4243 0
2 Mineral Media 2751 4243 0

3 Mineral Media 0985 4243 0
4 Mineral Media 6574 4243 0

[11]: df = df[df['ExtraLoad']!=1]

[12]: df.shape

[12]: (2140811, 15)

[13]: del df['ExtraLoad']

df['Fase'] = df['Origen'].str[:4]
df['Banco'] = df['Origen'].str[5:9]
df['Malla'] = df['Origen'].str[10:12]
df['Otro'] = df['Origen'].str[-3:]
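# Speed in km/h: Distancia is in metres and TpoViaje in minutes, so metres per
# minute is divided by 16.666... (1000 m / 60 min). -1 marks rows with no travel time.
df['velocidad'] = np.where(df['TpoViaje']>0,
                           round(df['Distancia'] / df['TpoViaje'] / 16.6666666666666666, 2),
                           -1)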

del df['Tronadura']
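The speed conversion in this cell can be checked by hand. The sketch below uses two distance/time pairs that appear later in the notebook output (rows 1492 and 496679) plus one invented zero-time row. Unlike `np.where`, which evaluates the division for every row, it divides only where the travel time is positive. The constant name `M_PER_MIN_TO_KMH` and the frame `trips` are illustrative.

```python
import numpy as np
import pandas as pd

# (m/min) * 60 min/h / 1000 m/km; equivalent to dividing by 16.666...
M_PER_MIN_TO_KMH = 60 / 1000

trips = pd.DataFrame({
    "Distancia": [3305, 94941, 1200],        # metres
    "TpoViaje": [9.533334, 244.3833, 0.0],   # minutes; last row is a made-up zero-time trip
})

# Divide only on valid rows; keep -1 as the sentinel used in the notebook.
valid = (trips["TpoViaje"] > 0).to_numpy()
speed = np.full(len(trips), -1.0)
speed[valid] = (
    trips.loc[valid, "Distancia"] / trips.loc[valid, "TpoViaje"] * M_PER_MIN_TO_KMH
).round(2).to_numpy()
trips["velocidad"] = speed
print(trips)
```

The first two results, 20.8 and 23.31 km/h, match the `velocidad` values shown for those rows in the notebook output.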

[14]: df = df.rename(columns={'TipoMIneral': 'TipoMineral'})

[15]: df.groupby('Truck').size()

[15]: Truck
CA01 164
CA08 174
CA09 233
CA100 27982
CA101 28238
CA102 27168
CA103 20625
CA104 14670
CA105 14209
CA106 12201
CA107 11845
CA11 168
CA13 180
CA28 2705
CA29 4440
CA30 9785
CA50 41859
CA51 41032
CA52 38715
CA53 42359
CA54 41910
CA55 42527

CA56 41851
CA57 41484
CA58 43558
CA59 43856
CA60 41336
CA61 44554
CA62 41522
CA63 44779
CA64 43870
CA65 44751
CA66 43018
CA67 41648
CA68 40109
CA69 44803
CA70 42125
CA71 19395
CA72 42134
CA73 43380
CA74 44336
CA76 44294
CA77 43453
CA78 41129
CA79 43274
CA80 41352
CA81 42595
CA82 44235
CA83 44035
CA84 42904
CA85 44005
CA86 46061
CA87 43986
CA88 40653
CA89 41732
CA90 33116
CA91 34768
CA92 34935
CA93 30465
CA94 30156
CA95 30864
CA96 31620
CA97 31924
CA98 32285
CA99 31250
CF10 14
CF11 8
dtype: int64

[16]: df.groupby('Destino').size()

[16]: Destino
CH-02 1057559
CH-1 1083252
dtype: int64

[17]: df.groupby('Origen').size()

[17]: Origen
CASE-3020-00/MM1 443
CASE-3020-70/MM1 572
CH-APIRES/MM1 51
CH_3-3065-00/MM1 6
DERR-3230-00/MM1 125
...
STOC-SEC-00/MM1 149
STOC-SECU-00/MM1 1179
STOCK-SEC-00/MM1 224
STOCK_PEL-01/MM1 67
VENT-3027-00/MM1 190
Length: 5990, dtype: int64
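Some `Origen` values listed above, such as `STOC-SEC-00/MM1` and `CH-APIRES/MM1`, do not follow the fixed character positions that the `str[:4]`, `str[5:9]` and `str[10:12]` slices assume. For `STOC-SEC-00/MM1` those slices give `SEC-` as Banco and `0/` as Malla. A delimiter-based parse is one possible alternative. The sketch assumes the layout is `<Fase>-<Banco>-<Malla>/<Otro>` with an optional Malla part, which is inferred from the column names and not confirmed by the source.

```python
import pandas as pd

origen = pd.Series(["F7R1-2757-02/MP2", "STOC-SEC-00/MM1", "CH-APIRES/MM1"])

# Assumed layout: <Fase>-<Banco>[-<Malla>]/<Otro>. Parts that are absent become NaN.
pattern = (
    r"^(?P<Fase>[^-/]+)-(?P<Banco>[^-/]+)"
    r"(?:-(?P<Malla>[^-/]+))?/(?P<Otro>.+)$"
)
parts = origen.str.extract(pattern)
print(parts)
```

Rows that do not match the pattern at all come back as all-NaN, which makes irregular origins easy to list with `parts[parts["Fase"].isna()]`.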

[18]: df.groupby('Shovel').size()

[18]: Shovel
BH09 16886
CF02 30522
CF08 69420
CF09 36800
CF10 117978
CF11 123392
PA01 114913
PA03 206792
PA04 360493
PA05 343258
PA06 52401
PA07 451091
PA10 179858
PA11 37007
dtype: int64

[19]: df.groupby('TipoMineral').size()

[19]: TipoMineral
As Alto Prim 272
As Medio Prim 1431

Baja Ley 19457
Baja Ley Prim 1638
Esteril 411
Media Ley Prim 21883
Min As Alto 10798
Min As Medio 13325
Min Cal Alto 22259
Min Cal Medio 40696
Mineral Alta 96077
Mineral Media 1903909
Mineral Stock 8650
Nieve 5
dtype: int64

[20]: df.groupby('OpCAEX').size()

[20]: OpCAEX
0001 30
0002 1
0006 2
0013 1
0022 1
...
9850 6824
9917 952
9922 602
9949 7086
mmsunk 11376
Length: 456, dtype: int64

[21]: df.groupby('OpSHV').size()

[21]: OpSHV
0001 267
0002 2
0005 3
0006 4
0011 51
...
9670 9110
9826 35094
9850 1150
9922 3903
mmsunk 52213
Length: 121, dtype: int64

[22]: dfZeros = df[(df['Distancia']==0)|(df['TpoViaje']==0)]

[23]: dfZeros

[23]: TimeArrived Truck Destino Origen Shovel \


1531 2020-12-19 11:47:14.000 CA52 CH-02 F10N-3260-05/MM2 PA10
2254 2020-12-18 02:21:51.000 CA64 CH-1 F10N-3260-01/MB1 PA10
2686 2020-12-17 11:17:23.000 CA93 CH-1 F9SE-3335-08/MM4 PA04
3414 2020-12-15 22:08:39.000 CA86 CH-02 F7R1-2757-05/MP6 PA05
3448 2020-12-15 15:32:11.000 CA83 CH-02 F10N-3260-07/MF1 PA07
... ... ... ... ... ...
2141111 2010-12-01 16:00:43.000 CA57 CH-02 F5SW-2885-04/MM2 CF09
2141187 2010-12-01 13:41:16.000 CA09 CH-02 F5SW-2885-04/MM2 CF09
2141195 2010-12-01 13:21:28.000 CA01 CH-02 F5SW-2885-04/MM2 CF09
2141209 2010-12-01 12:52:38.000 CA09 CH-02 F5SW-2885-04/MM2 CF09
2141302 2010-12-01 09:32:49.000 CA61 CH-1 F6NE-3245-14/MM1 PA03

Tons Distancia TpoViaje TpoEsperaDump TpoDump \


1531 305.7373 0 0.0 0.000000 0.116667
2254 305.7373 0 0.6 0.000000 0.000000
2686 304.0000 0 0.0 0.000000 1.433333
3414 309.0000 0 0.0 0.000000 0.000000
3448 305.7373 0 0.0 0.000000 0.166667
... ... ... ... ... ...
2141111 304.0000 0 0.0 1.716667 3.000000
2141187 244.0000 0 0.0 0.000000 1.466667
2141195 213.7500 0 0.0 0.000000 2.233333
2141209 258.0000 0 0.0 0.000000 2.316667
2141302 304.0000 0 0.0 0.000000 0.000000

TipoMineral OpCAEX OpSHV Fase Banco Malla Otro velocidad


1531 Mineral Media 8675 2550 F10N 3260 05 MM2 -1.0
2254 Min As Medio 8689 1180 F10N 3260 01 MB1 0.0
2686 Mineral Media 5789 5329 F9SE 3335 08 MM4 -1.0
3414 Media Ley Prim 7334 4243 F7R1 2757 05 MP6 -1.0
3448 As Medio Prim 8863 1750 F10N 3260 07 MF1 -1.0
... ... ... ... ... ... ... ... ...
2141111 Mineral Media 7997 9232 F5SW 2885 04 MM2 -1.0
2141187 Mineral Media 6225 1022 F5SW 2885 04 MM2 -1.0
2141195 Mineral Media 0390 1022 F5SW 2885 04 MM2 -1.0
2141209 Mineral Media 6225 1022 F5SW 2885 04 MM2 -1.0
2141302 Mineral Media 9108 mmsunk F6NE 3245 14 MM1 -1.0

[24555 rows x 18 columns]

[24]: df = df[(df['Distancia']!=0)&(df['TpoViaje']!=0)]

[25]: df.shape

[25]: (2116256, 18)

[26]: df['velocidad'].min()

[26]: 0.56

[27]: dfveloc = df[(df['velocidad']<2 )|(df['velocidad']>60)]

[28]: dfveloc

[28]: TimeArrived Truck Destino Origen Shovel \


1142370 2015-05-24 04:12:53.000 CA69 CH-02 F8NE-3035-17/MM2 PA07
1142405 2015-05-24 03:16:55.000 CA56 CH-02 F8NE-3035-17/MM2 PA07
1142425 2015-05-24 02:50:12.000 CA72 CH-02 F8NE-3035-17/MM2 PA07
1142432 2015-05-24 02:45:28.000 CA81 CH-1 F8NE-3035-17/MM2 PA07
1142487 2015-05-24 01:33:27.000 CA83 CH-02 F8NE-3035-17/MM2 PA07
... ... ... ... ... ...
1363478 2014-05-25 05:22:03.000 CA58 CH-1 STOC-3095-02/MM1 CF08
1363484 2014-05-25 05:11:17.000 CA55 CH-1 STOC-3095-02/MM1 CF08
1363490 2014-05-25 05:03:23.000 CA69 CH-1 STOC-3095-02/MM1 CF08
1363494 2014-05-25 04:53:53.000 CA58 CH-1 STOC-3095-02/MM1 CF08
1363497 2014-05-25 04:45:31.000 CA55 CH-1 STOC-3095-02/MM1 CF08

Tons Distancia TpoViaje TpoEsperaDump TpoDump TipoMineral \


1142370 308.0 1886 0.60 0.716667 0.033333 Mineral Media
1142405 296.0 1886 0.60 4.883333 0.366667 Mineral Media
1142425 305.0 1886 0.60 4.466667 1.000000 Mineral Media
1142432 308.0 2133 1.25 0.000000 0.950000 Mineral Media
1142487 300.0 1886 0.60 3.766667 0.250000 Mineral Media
... ... ... ... ... ... ...
1363478 294.0 1920 1.20 0.933333 2.000000 Mineral Media
1363484 321.0 1920 1.20 1.183333 2.000000 Mineral Media
1363490 295.0 1920 1.20 0.400000 2.000000 Mineral Media
1363494 311.0 1920 1.20 0.000000 1.366667 Mineral Media
1363497 298.0 1920 1.20 0.233333 2.000000 Mineral Media

OpCAEX OpSHV Fase Banco Malla Otro velocidad


1142370 5321 0427 F8NE 3035 17 MM2 188.60
1142405 8420 1750 F8NE 3035 17 MM2 188.60
1142425 1842 1750 F8NE 3035 17 MM2 188.60
1142432 0333 1750 F8NE 3035 17 MM2 102.38
1142487 8016 1750 F8NE 3035 17 MM2 188.60
... ... ... ... ... ... ... ...
1363478 5614 8657 STOC 3095 02 MM1 96.00
1363484 1249 8657 STOC 3095 02 MM1 96.00
1363490 8863 8657 STOC 3095 02 MM1 96.00
1363494 5614 8657 STOC 3095 02 MM1 96.00

1363497 1249 8657 STOC 3095 02 MM1 96.00

[2463 rows x 18 columns]

[29]: df = df[(df['velocidad']>=2 )&(df['velocidad']<=60)]

[30]: df['velocidad'].hist(bins=40)

[30]: <AxesSubplot:>

[31]: df['Distancia'].min()

[31]: 71

[32]: df['Distancia'].hist(bins=40)

[32]: <AxesSubplot:>

[33]: df[(df['Distancia']>70000)]

[33]: TimeArrived Truck Destino Origen Shovel Tons \


496679 2018-07-20 17:47:32.000 CA73 CH-1 F7R1-2855-60/MM2 PA10 311.0

Distancia TpoViaje TpoEsperaDump TpoDump TipoMineral OpCAEX \


496679 94941 244.3833 0.0 0.0 Mineral Media 3915

OpSHV Fase Banco Malla Otro velocidad


496679 mmsunk F7R1 2855 60 MM2 23.31

[34]: # Correlation matrix of the numeric columns

sns.set_theme(style="white")
corr = df.select_dtypes(include='number').corr()  # leave out the object (text) columns explicitly
mask = np.triu(np.ones_like(corr, dtype=bool))    # hide the upper triangle and the diagonal
f, ax = plt.subplots(figsize=(11, 9))
cmap = sns.diverging_palette(230, 20, as_cmap=True)
sns.heatmap(corr, mask=mask, cmap=cmap, vmax=.3, center=0, square=True,
            linewidths=.5, annot=True, cbar_kws={"shrink": .5})

[34]: <AxesSubplot:>
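The two data steps behind the heatmap, a correlation over numeric columns only and a mask for the upper triangle, can be tried without any plotting. This is a minimal sketch on an invented four-row frame. `select_dtypes` is used so the text column is dropped explicitly and the result does not depend on how a given pandas version treats non-numeric columns in `corr()`.

```python
import numpy as np
import pandas as pd

toy = pd.DataFrame({
    "Truck": ["CA64", "CA94", "CA61", "CA70"],   # text column, excluded below
    "Distancia": [1000, 2000, 3000, 4000],
    "TpoViaje": [3.0, 6.1, 8.9, 12.0],
})

corr = toy.select_dtypes(include="number").corr()

# True marks the cells to hide: upper triangle plus the diagonal.
mask = np.triu(np.ones(corr.shape, dtype=bool))
lower = corr.mask(mask)   # same frame with hidden cells set to NaN
print(lower)
```

Note that `vmax=.3` in the heatmap call caps the colour scale, so every coefficient above 0.3 is drawn in the same colour. The annotations still show the actual values.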

[35]: df['Tons'].hist(bins=40)

[35]: <AxesSubplot:>

[36]: df[(df['Origen']=='F10N-3260-05/MM2') & (df['Destino']=='CH-02') &
         (df['Distancia']!=0) & (df['OpCAEX']=='8675')]

[36]: TimeArrived Truck Destino Origen Shovel Tons \


1492 2020-12-19 12:44:22.000 CA52 CH-02 F10N-3260-05/MM2 PA10 335.0

Distancia TpoViaje TpoEsperaDump TpoDump TipoMineral OpCAEX \


1492 3305 9.533334 0.0 1.083333 Mineral Media 8675

OpSHV Fase Banco Malla Otro velocidad


1492 2550 F10N 3260 05 MM2 20.8

[37]: df.describe()

[37]: Tons Distancia TpoViaje TpoEsperaDump TpoDump \


count 2.113793e+06 2.113793e+06 2.113793e+06 2.113793e+06 2.113793e+06
mean 3.029493e+02 3.780687e+03 1.044349e+01 2.084809e+00 1.591700e+00
std 1.981255e+01 2.322065e+03 5.970199e+00 3.382245e+00 8.164145e-01
min 0.000000e+00 7.100000e+01 1.333333e-01 0.000000e+00 0.000000e+00
25% 2.960000e+02 2.046000e+03 6.050000e+00 0.000000e+00 1.000000e+00
50% 3.060000e+02 3.162000e+03 9.500000e+00 7.666667e-01 1.400000e+00
75% 3.104576e+02 5.139000e+03 1.368333e+01 2.766667e+00 2.000000e+00
max 4.000000e+02 9.494100e+04 2.443833e+02 3.010667e+02 3.000000e+00

velocidad

count 2.113793e+06
mean 2.147781e+01
std 3.917369e+00
min 2.460000e+00
25% 1.869000e+01
50% 2.259000e+01
75% 2.456000e+01
max 5.903000e+01

[38]: df.info()

<class 'pandas.core.frame.DataFrame'>
Int64Index: 2113793 entries, 0 to 2141620
Data columns (total 18 columns):
# Column Dtype
--- ------ -----
0 TimeArrived object
1 Truck object
2 Destino object
3 Origen object
4 Shovel object
5 Tons float64
6 Distancia int64
7 TpoViaje float64
8 TpoEsperaDump float64
9 TpoDump float64
10 TipoMineral object
11 OpCAEX object
12 OpSHV object
13 Fase object
14 Banco object
15 Malla object
16 Otro object
17 velocidad float64
dtypes: float64(5), int64(1), object(12)
memory usage: 306.4+ MB

[39]: wrong=df[df['Tons']==0]

[40]: wrong

[40]: TimeArrived Truck Destino Origen Shovel Tons \


1518420 2013-09-18 19:08:39.000 CA71 CH-1 F7NE-3275-07/MM1 PA05 0.0
1518421 2013-09-18 19:03:03.000 CA97 CH-1 F7NE-3275-07/MM1 PA05 0.0
1518423 2013-09-18 18:49:22.000 CA67 CH-1 F7NE-3275-07/MM1 PA05 0.0
1518425 2013-09-18 18:46:55.000 CA83 CH-1 F7NE-3275-07/MM1 PA05 0.0
1518426 2013-09-18 18:46:16.000 CA82 CH-1 F7DE-3275-04/MM1 PA07 0.0

... ... ... ... ... ... ...
2140687 2010-12-02 07:35:08.000 CA82 CH-02 F5SW-2900-01/MM1 CF09 0.0
2140704 2010-12-02 06:57:31.000 CA82 CH-02 F5SW-2900-01/MM1 CF09 0.0
2140743 2010-12-02 05:46:36.000 CA82 CH-02 F5SW-2900-01/MM1 CF09 0.0
2140797 2010-12-02 03:54:29.000 CA82 CH-02 F5SW-2900-01/MM1 CF09 0.0
2140818 2010-12-02 03:18:38.000 CA82 CH-02 F5SW-2900-01/MM1 CF09 0.0

Distancia TpoViaje TpoEsperaDump TpoDump TipoMineral OpCAEX \


1518420 4060 9.533334 8.933333 3.000000 Mineral Media 9458
1518421 4060 10.133330 9.816667 3.000000 Mineral Media 6275
1518423 4060 9.533334 16.883330 3.000000 Mineral Media 1950
1518425 4060 10.133330 8.133333 3.000000 Mineral Media 1086
1518426 4597 10.383330 1.583333 3.000000 Mineral Media 3310
... ... ... ... ... ... ...
2140687 1208 5.650000 1.383333 3.000000 Mineral Media 7841
2140704 1208 5.650000 1.533333 3.000000 Mineral Media 7841
2140743 1208 5.650000 0.000000 1.733333 Mineral Media 7841
2140797 1208 5.650000 2.150000 3.000000 Mineral Media 7841
2140818 1208 5.650000 1.166667 3.000000 Mineral Media 7841

OpSHV Fase Banco Malla Otro velocidad


1518420 8715 F7NE 3275 07 MM1 25.55
1518421 8715 F7NE 3275 07 MM1 24.04
1518423 8715 F7NE 3275 07 MM1 25.55
1518425 8715 F7NE 3275 07 MM1 24.04
1518426 1180 F7DE 3275 04 MM1 26.56
... ... ... ... ... ... ...
2140687 8715 F5SW 2900 01 MM1 12.83
2140704 8715 F5SW 2900 01 MM1 12.83
2140743 8715 F5SW 2900 01 MM1 12.83
2140797 8715 F5SW 2900 01 MM1 12.83
2140818 8715 F5SW 2900 01 MM1 12.83

[4680 rows x 18 columns]

[41]: df = df[df['Tons']!=0]

[42]: df['TimeArrived']= pd.to_datetime(df['TimeArrived'])
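With about two million rows, passing an explicit format to `pd.to_datetime` avoids format inference and is usually faster. The sketch uses two timestamps from the notebook output. The format string is an assumption based on the `YYYY-MM-DD HH:MM:SS.mmm` pattern visible in `df.head()`.

```python
import pandas as pd

raw = pd.Series(["2020-12-22 06:28:04.000", "2010-11-30 21:00:00.000"])

# %f accepts the three-digit fractional seconds seen in the file.
parsed = pd.to_datetime(raw, format="%Y-%m-%d %H:%M:%S.%f")
print(parsed)
```

Because `df` is a filtered slice at this point, assigning a column to it can raise a `SettingWithCopyWarning`, which the earlier `warnings.filterwarnings('ignore')` hides. Calling `.copy()` after the last filter is one way to make the assignment unambiguous.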

[43]: df.info()

<class 'pandas.core.frame.DataFrame'>
Int64Index: 2109113 entries, 0 to 2141620
Data columns (total 18 columns):
# Column Dtype
--- ------ -----
0 TimeArrived datetime64[ns]

1 Truck object
2 Destino object
3 Origen object
4 Shovel object
5 Tons float64
6 Distancia int64
7 TpoViaje float64
8 TpoEsperaDump float64
9 TpoDump float64
10 TipoMineral object
11 OpCAEX object
12 OpSHV object
13 Fase object
14 Banco object
15 Malla object
16 Otro object
17 velocidad float64
dtypes: datetime64[ns](1), float64(5), int64(1), object(11)
memory usage: 305.7+ MB

[44]: df=df[df['Tons']!=0]

[45]: df.shape

[45]: (2109113, 18)
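The cleaning so far removed 24,555 rows with zero distance or travel time, 2,463 rows with a speed outside 2 to 60 km/h, and 4,680 rows with zero tons, taking the frame from 2,140,811 to 2,109,113 rows. A small helper can record these counts as the filters run instead of reading them off successive `df.shape` calls. The function name `apply_filters` and the toy data are illustrative, not part of the notebook.

```python
import pandas as pd

def apply_filters(frame, steps):
    """Apply (label, keep_mask_function) steps in order; log rows dropped by each."""
    log = []
    for label, keep in steps:
        before = len(frame)
        frame = frame[keep(frame)]
        log.append((label, before - len(frame)))
    return frame, log

toy = pd.DataFrame({
    "Tons": [300.0, 0.0, 310.0, 305.0],
    "Distancia": [2000, 1500, 0, 3000],   # metres
    "TpoViaje": [6.0, 5.0, 0.0, 0.5],     # minutes
})

steps = [
    ("zero distance or travel time",
     lambda d: (d["Distancia"] != 0) & (d["TpoViaje"] != 0)),
    ("speed outside 2-60 km/h",
     lambda d: (d["Distancia"] / d["TpoViaje"] * 0.06).between(2, 60)),
    ("zero tons", lambda d: d["Tons"] != 0),
]

clean, dropped = apply_filters(toy, steps)
for label, n in dropped:
    print(f"{label}: {n} rows removed")
```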

[46]: nroTrucks = df['Truck'].nunique()


nroOrigens = df['Origen'].nunique()
nroShovels = df['Shovel'].nunique()
nroTipoMin = df['TipoMineral'].nunique()
nroOpTruck = df['OpCAEX'].nunique()
nroOpShvs = df['OpSHV'].nunique()
nroFases = df['Fase'].nunique()
nroBancos = df['Banco'].nunique()
print("Trucks: %d Origins: %d Shovels: %d Mineral types: %d Truck operators: %d "
      "Shovel operators: %d Phases: %d Benches: %d" %
      (nroTrucks, nroOrigens, nroShovels, nroTipoMin, nroOpTruck, nroOpShvs,
       nroFases, nroBancos))

Trucks: 65 Origins: 5935 Shovels: 14 Mineral types: 14 Truck operators: 456 Shovel operators: 121 Phases: 32 Benches: 110
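The same distinct-value counts can be obtained in one call, since `DataFrame.nunique()` returns one count per column. A minimal sketch on invented rows:

```python
import pandas as pd

toy = pd.DataFrame({
    "Truck": ["CA64", "CA94", "CA64"],
    "Shovel": ["CF10", "PA05", "PA05"],
    "TipoMineral": ["Mineral Media"] * 3,
})

# One distinct-value count per selected column, returned as a Series.
counts = toy[["Truck", "Shovel", "TipoMineral"]].nunique()
print(counts.to_string())
```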

[47]: # -- Now review the numeric variables that could show a trend,
# -- such as Tons, Distancia, TpoViaje, TpoEsperaDump, TpoDump

# -- Select the numeric variables --
numericals = df[['Tons', 'Distancia', 'TpoViaje', 'TpoEsperaDump', 'TpoDump',
                 'velocidad']]

# -- Set up seaborn --
sns.set(style='whitegrid', rc={"grid.linewidth": 0.1})
sns.set_context("paper", font_scale=1.25)
color = sns.color_palette("Set2", 6)

# -- Create the grid of axes and draw one boxplot per variable --
fig, ax = plt.subplots(1,6)
fig.set_size_inches(30, 10)
sns.boxplot(y=numericals['Tons'],palette=color, ax=ax[0])
sns.boxplot(y=numericals['Distancia'], palette=color, ax=ax[1])
sns.boxplot(y=numericals['TpoViaje'], palette=color, ax=ax[2])
sns.boxplot(y=numericals['TpoEsperaDump'],palette=color, ax=ax[3])
sns.boxplot(y=numericals['TpoDump'],palette=color, ax=ax[4])
sns.boxplot(y=numericals['velocidad'],palette=color, ax=ax[5])
fig.show()

[48]: plt.figure(figsize=[10,8])
n, bins, patches = plt.hist(x=numericals['Tons'], bins=40,
                            color='#0504aa', alpha=0.7, rwidth=0.85)

plt.grid(axis='y', alpha=0.75)
plt.xlabel('Value',fontsize=15)
plt.ylabel('Frequency',fontsize=15)
plt.xticks(fontsize=15)
plt.yticks(fontsize=15)
plt.title('Histogram of Tons',fontsize=15)
plt.show()

[49]: plt.figure(figsize=[10,8])
n, bins, patches = plt.hist(x=numericals['Distancia'], bins=40,
                            color='#0504aa', alpha=0.7, rwidth=0.85)

plt.grid(axis='y', alpha=0.75)
plt.xlabel('Value',fontsize=15)
plt.ylabel('Frequency',fontsize=15)
plt.xticks(fontsize=15)
plt.yticks(fontsize=15)
plt.title('Histogram of Distancia',fontsize=15)
plt.show()

[50]: plt.figure(figsize=[10,8])
n, bins, patches = plt.hist(x=numericals['TpoViaje'], bins=40,
                            color='#0504aa', alpha=0.7, rwidth=0.85)

plt.grid(axis='y', alpha=0.75)
plt.xlabel('Value',fontsize=15)
plt.ylabel('Frequency',fontsize=15)
plt.xticks(fontsize=15)
plt.yticks(fontsize=15)
plt.title('Histogram of TpoViaje',fontsize=15)
plt.show()

[51]: plt.figure(figsize=[10,8])
n, bins, patches = plt.hist(x=numericals['TpoEsperaDump'], bins=80,
                            color='#0504aa', alpha=0.7, rwidth=0.85)

plt.grid(axis='y', alpha=0.75)
plt.xlabel('Value',fontsize=15)
plt.ylabel('Frequency',fontsize=15)
plt.xticks(fontsize=15)
plt.yticks(fontsize=15)
plt.title('Histogram of TpoEsperaDump',fontsize=15)
plt.show()

[52]: plt.figure(figsize=[10,8])
n, bins, patches = plt.hist(x=numericals['velocidad'], bins=40,
                            color='#0504aa', alpha=0.7, rwidth=0.85)

plt.grid(axis='y', alpha=0.75)
plt.xlabel('Value',fontsize=15)
plt.ylabel('Frequency',fontsize=15)
plt.xticks(fontsize=15)
plt.yticks(fontsize=15)
plt.title('Histogram of velocidad',fontsize=15)
plt.show()

[53]: # -- Look at some distributions of categorical and a few numeric variables (quick view)

# -- Helper that annotates a bar chart: to show the number behind each bar,
# -- a label is added on top of every bar
def barchart_prep(counts, ax):
    rects = ax.patches
    labels = counts.values
    for rect, label in zip(rects, labels):
        height = rect.get_height()
        ax.text(rect.get_x() + rect.get_width()/2, height + 10, label,
                ha='center', va='bottom', fontsize=9, rotation=90)

def plotSetupAndShow(x, mainT, xLa, yLa):
    plt.figure(figsize=(20,12))
    # x and y passed by keyword; recent seaborn releases reject them as positional arguments
    ax = sns.barplot(x=x.index, y=x.values, alpha=0.8, palette="deep")
    plt.title(mainT, fontsize=12)
    plt.ylabel(yLa, fontsize=9)
    plt.xlabel(xLa, fontsize=9)
    plt.tick_params(labelsize=10)
    plt.xticks(rotation=90)
    barchart_prep(x, ax)
    plt.show()
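The manual `ax.text` placement in `barchart_prep` can be replaced by `Axes.bar_label`, which is available from matplotlib 3.4 onward and so may not exist in the environment this notebook was originally run in. The sketch uses three of the shovel counts printed earlier in the notebook and the non-interactive `Agg` backend so that it runs without a display. Treat it as an alternative under those assumptions, not as the notebook's method.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; no window is opened
import matplotlib.pyplot as plt
import pandas as pd

# Three values taken from the groupby('Shovel').size() output
counts = pd.Series({"PA07": 451091, "PA04": 360493, "PA05": 343258})

fig, ax = plt.subplots(figsize=(6, 4))
bars = ax.bar(counts.index, counts.values)

# One label per bar, placed just above it; padding is in points.
labels = ax.bar_label(
    bars,
    labels=[str(v) for v in counts.values],
    rotation=90,
    padding=3,
    fontsize=9,
)
ax.set_xlabel("Shovel ID")
ax.set_ylabel("# loads")
plt.close(fig)
```

Because `padding` is given in points and not data units, the label offset does not depend on the scale of the y-axis, unlike the fixed `height + 10` offset.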

[54]: # Count of trips from each bench
x = df['Banco'].value_counts()

plotSetupAndShow(x, mainT='Distribution of Loads by Bench', xLa='Bench ID', yLa='# of Loads')

[55]: # --> Number of loads per truck
x=df['Truck'].value_counts()
#x=x.sort_index()

# -- Set up the chart and show it
plotSetupAndShow(x, mainT='Distribution of Loads per Truck', xLa='Truck ID', yLa='# Dumps')

[56]: # --> Tonnage fed per truck
x=round(df.groupby('Truck')['Tons'].sum()/1e3,2)
x=x.sort_values(ascending=False)

# -- Set up the chart and show it
plotSetupAndShow(x, mainT='Tonnage Fed per Truck', xLa='Truck ID', yLa='Tons [KTons]')

[57]: # --> Number of loads per shovel
x=df['Shovel'].value_counts()
#x=x.sort_index()

# -- Set up the chart and show it
plotSetupAndShow(x, mainT='Distribution of Loads per Shovel', xLa='Shovel ID', yLa='# Loads')

[58]: # --> Tonnage loaded per shovel
x=round(df.groupby('Shovel')['Tons'].sum()/1e6,2)
x=x.sort_values(ascending=False)

# -- Set up the chart and show it
plotSetupAndShow(x, mainT='Tonnage Loaded per Shovel', xLa='Shovel ID', yLa='Tons [MTons]')

[59]: # --> Tonnage fed by mineral type
x=round(df.groupby('TipoMineral')['Tons'].sum()/1e6,2)
x=x.sort_values(ascending=False)

# -- Set up the chart and show it
plotSetupAndShow(x, mainT='Tonnage Fed by Mineral Type', xLa='Mineral Type', yLa='Tons [MTons]')

[60]: # Tonnage fed from each phase to date
x=round(df.groupby('Fase')['Tons'].sum()/1e6,2)
x=x.sort_values(ascending=False)

# -- Set up the chart and show it
plotSetupAndShow(x, mainT='Tonnage Fed from Each Phase', xLa='Phases', yLa='Tons [MTons]')

[61]: # Tonnage fed from each bench to date
x=round(df.groupby('Banco')['Tons'].sum()/1e3,0)   # divided by 1e3, so the unit is kilotons
x=x.sort_values(ascending=False)

# -- Set up the chart and show it
plotSetupAndShow(x, mainT='Tonnage Fed from Each Bench', xLa='Benches', yLa='Tons [KTons]')

[62]: s = df.groupby(['Shovel']).TimeArrived.agg(['min','max'])  # list, not set, so the column order is fixed

[63]: s

[63]: min max


Shovel
BH09 2017-04-28 06:19:11 2020-12-08 02:25:52
CF02 2010-12-01 20:02:54 2013-06-14 02:13:47
CF08 2010-12-02 18:49:10 2017-12-12 11:04:53
CF09 2010-12-01 11:17:48 2013-10-04 11:18:22
CF10 2013-05-24 14:17:26 2020-12-22 06:28:04
CF11 2013-10-10 12:19:15 2020-12-08 22:26:22
PA01 2011-03-17 11:54:01 2018-06-28 10:10:07
PA03 2010-11-30 21:16:15 2020-04-21 02:58:22
PA04 2010-11-30 21:12:22 2020-12-21 06:42:53
PA05 2010-12-01 16:35:21 2020-12-22 06:20:01
PA06 2010-11-30 21:00:00 2013-10-27 21:57:07
PA07 2013-02-02 03:16:39 2020-12-19 21:16:25
PA10 2018-02-14 10:48:40 2020-12-22 04:06:49
PA11 2018-05-10 21:31:39 2020-12-21 18:10:47

[64]: s['nrodays'] = s['max']-s['min']


s['nrodays'] = s['nrodays']/datetime.timedelta(days=1)

[65]: s.reset_index()

[65]: Shovel min max nrodays
0 BH09 2017-04-28 06:19:11 2020-12-08 02:25:52 1319.837975
1 CF02 2010-12-01 20:02:54 2013-06-14 02:13:47 925.257558
2 CF08 2010-12-02 18:49:10 2017-12-12 11:04:53 2566.677581
3 CF09 2010-12-01 11:17:48 2013-10-04 11:18:22 1038.000394
4 CF10 2013-05-24 14:17:26 2020-12-22 06:28:04 2768.674051
5 CF11 2013-10-10 12:19:15 2020-12-08 22:26:22 2616.421609
6 PA01 2011-03-17 11:54:01 2018-06-28 10:10:07 2659.927847
7 PA03 2010-11-30 21:16:15 2020-04-21 02:58:22 3429.237581
8 PA04 2010-11-30 21:12:22 2020-12-21 06:42:53 3673.396192
9 PA05 2010-12-01 16:35:21 2020-12-22 06:20:01 3673.572685
10 PA06 2010-11-30 21:00:00 2013-10-27 21:57:07 1062.039664
11 PA07 2013-02-02 03:16:39 2020-12-19 21:16:25 2877.749838
12 PA10 2018-02-14 10:48:40 2020-12-22 04:06:49 1041.720937
13 PA11 2018-05-10 21:31:39 2020-12-21 18:10:47 955.860509
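The `nrodays` column is a timedelta divided by a one-day timedelta, which yields a float number of days. The sketch reproduces the CF09 row using its first and last timestamps from the table above, with named aggregation and `pd.Timedelta` in place of `datetime.timedelta`. The column names `first` and `last` are illustrative.

```python
import pandas as pd

toy = pd.DataFrame({
    "Shovel": ["CF09", "CF09", "PA10"],
    "TimeArrived": pd.to_datetime([
        "2010-12-01 11:17:48",   # first CF09 record in the notebook output
        "2013-10-04 11:18:22",   # last CF09 record in the notebook output
        "2018-02-14 10:48:40",   # a single PA10 record, so its span is zero
    ]),
})

# Named aggregation gives explicit, ordered column names.
span = toy.groupby("Shovel")["TimeArrived"].agg(first="min", last="max")

# Timedelta / Timedelta is a plain float: the span in days.
span["nrodays"] = (span["last"] - span["first"]) / pd.Timedelta(days=1)
print(span)
```

For CF09 this gives 1038.000394 days, the value shown in the notebook: 1038 whole days plus 34 seconds.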

[66]: s.sort_values(by=['min'])

[66]: min max nrodays


Shovel
PA06 2010-11-30 21:00:00 2013-10-27 21:57:07 1062.039664
PA04 2010-11-30 21:12:22 2020-12-21 06:42:53 3673.396192
PA03 2010-11-30 21:16:15 2020-04-21 02:58:22 3429.237581
CF09 2010-12-01 11:17:48 2013-10-04 11:18:22 1038.000394
PA05 2010-12-01 16:35:21 2020-12-22 06:20:01 3673.572685
CF02 2010-12-01 20:02:54 2013-06-14 02:13:47 925.257558
CF08 2010-12-02 18:49:10 2017-12-12 11:04:53 2566.677581
PA01 2011-03-17 11:54:01 2018-06-28 10:10:07 2659.927847
PA07 2013-02-02 03:16:39 2020-12-19 21:16:25 2877.749838
CF10 2013-05-24 14:17:26 2020-12-22 06:28:04 2768.674051
CF11 2013-10-10 12:19:15 2020-12-08 22:26:22 2616.421609
BH09 2017-04-28 06:19:11 2020-12-08 02:25:52 1319.837975
PA10 2018-02-14 10:48:40 2020-12-22 04:06:49 1041.720937
PA11 2018-05-10 21:31:39 2020-12-21 18:10:47 955.860509

[67]: shnop = s.query("max < '2020-01-01 00:00:00'")

[68]: shsiop= s.query("max >= '2020-01-01 00:00:00'")

[69]: shsiop

[69]: min max nrodays


Shovel
BH09 2017-04-28 06:19:11 2020-12-08 02:25:52 1319.837975
CF10 2013-05-24 14:17:26 2020-12-22 06:28:04 2768.674051
CF11 2013-10-10 12:19:15 2020-12-08 22:26:22 2616.421609
PA03 2010-11-30 21:16:15 2020-04-21 02:58:22 3429.237581

PA04 2010-11-30 21:12:22 2020-12-21 06:42:53 3673.396192
PA05 2010-12-01 16:35:21 2020-12-22 06:20:01 3673.572685
PA07 2013-02-02 03:16:39 2020-12-19 21:16:25 2877.749838
PA10 2018-02-14 10:48:40 2020-12-22 04:06:49 1041.720937
PA11 2018-05-10 21:31:39 2020-12-21 18:10:47 955.860509

[70]: shnop

[70]: min max nrodays


Shovel
CF02 2010-12-01 20:02:54 2013-06-14 02:13:47 925.257558
CF08 2010-12-02 18:49:10 2017-12-12 11:04:53 2566.677581
CF09 2010-12-01 11:17:48 2013-10-04 11:18:22 1038.000394
PA01 2011-03-17 11:54:01 2018-06-28 10:10:07 2659.927847
PA06 2010-11-30 21:00:00 2013-10-27 21:57:07 1062.039664
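The split into shovels still reporting in 2020 and shovels whose last record is earlier can also be written with a boolean mask and an explicit `pd.Timestamp` cutoff, which keeps the comparison typed and avoids referring to a column named `max` inside a query string. The sketch uses the last-record timestamps of CF02 and CF10 from the tables above. The variable names are illustrative.

```python
import pandas as pd

span = pd.DataFrame(
    {"max": pd.to_datetime(["2013-06-14 02:13:47", "2020-12-22 06:28:04"])},
    index=pd.Index(["CF02", "CF10"], name="Shovel"),
)

cutoff = pd.Timestamp("2020-01-01")
still_active = span["max"] >= cutoff

# The mask and its negation partition the frame without overlap.
active, retired = span[still_active], span[~still_active]
print(active.index.tolist(), retired.index.tolist())
```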

[71]: t = df.groupby(['Truck']).TimeArrived.agg(['min','max'])  # list, not set, so the column order is fixed

[72]: t = t.sort_values(by=['min'])

[73]: t['nrodays'] = t['max']-t['min']


t['nrodays'] = t['nrodays']/datetime.timedelta(days=1)

[74]: t.reset_index()

[74]: Truck min max nrodays


0 CA74 2010-11-30 21:00:00 2020-12-21 17:11:59 3673.841655
1 CA29 2010-11-30 21:00:00 2013-09-09 16:21:49 1013.806817
2 CA72 2010-11-30 21:00:00 2020-12-21 10:48:21 3673.575243
3 CA09 2010-11-30 21:06:09 2010-12-19 07:14:48 18.422674
4 CA81 2010-11-30 21:12:22 2020-12-21 19:19:09 3673.921377
5 CA80 2010-11-30 21:13:13 2020-12-21 16:35:54 3673.807419
6 CA64 2010-11-30 21:16:15 2020-12-22 06:28:04 3674.383206
7 CA79 2010-11-30 21:17:27 2020-12-21 19:10:45 3673.912014
8 CA85 2010-11-30 21:17:34 2020-12-21 16:21:31 3673.794410
9 CA52 2010-11-30 21:23:02 2020-12-21 18:54:27 3673.896817
10 CA86 2010-11-30 21:25:05 2020-12-22 05:49:53 3674.350556
11 CA59 2010-11-30 21:30:14 2020-12-22 03:57:52 3674.269190
12 CA89 2010-11-30 21:33:03 2020-12-20 10:15:30 3672.529479
13 CA51 2010-11-30 21:34:21 2020-12-20 19:17:07 3672.904699
14 CA68 2010-11-30 21:36:49 2020-12-22 03:53:06 3674.261308
15 CA50 2010-11-30 21:41:03 2020-12-21 10:22:45 3673.528958
16 CA62 2010-11-30 21:43:44 2020-12-20 22:47:39 3673.044387
17 CA67 2010-11-30 21:49:07 2020-12-21 16:44:02 3673.788137
18 CA54 2010-11-30 22:09:25 2020-12-22 01:49:16 3674.152674
19 CA13 2010-11-30 23:02:22 2010-12-09 16:51:05 8.742164
20 CA83 2010-12-01 00:42:20 2020-12-22 05:32:54 3674.201782

21 CA56 2010-12-01 01:44:16 2020-12-22 03:59:33 3674.093947
22 CA01 2010-12-01 01:55:19 2010-12-10 05:39:18 9.155544
23 CA11 2010-12-01 02:00:42 2010-12-09 16:50:16 8.617755
24 CA08 2010-12-01 02:07:33 2010-12-22 08:11:40 21.252859
25 CA88 2010-12-01 02:15:29 2020-12-21 20:49:40 3673.773738
26 CA57 2010-12-01 02:15:44 2020-12-21 11:46:54 3673.396644
27 CA77 2010-12-01 02:31:17 2020-12-22 03:39:02 3674.047049
28 CA69 2010-12-01 03:36:37 2020-12-21 17:27:51 3673.577245
29 CA61 2010-12-01 04:21:01 2020-12-22 06:11:17 3674.076574
30 CA82 2010-12-01 07:01:44 2020-12-21 12:43:30 3673.237338
31 CA76 2010-12-01 09:01:08 2020-12-21 15:52:58 3673.285995
32 CA66 2010-12-01 11:52:39 2020-12-21 22:21:19 3673.436574
33 CA73 2010-12-01 12:58:59 2020-12-21 16:21:07 3673.140370
34 CA28 2010-12-01 13:30:34 2012-08-18 13:45:48 626.010579
35 CA84 2010-12-01 15:52:36 2020-12-21 18:01:34 3673.089560
36 CA70 2010-12-01 18:45:41 2020-12-22 05:54:04 3673.464155
37 CA30 2010-12-01 19:48:35 2014-01-12 09:51:17 1137.585208
38 CA71 2010-12-02 08:43:35 2014-09-05 20:40:26 1373.497813
39 CA55 2010-12-02 09:17:31 2020-12-21 14:23:13 3672.212292
40 CA63 2010-12-02 10:38:06 2020-12-21 15:57:07 3672.221539
41 CA60 2010-12-02 21:37:12 2020-12-21 17:24:42 3671.824653
42 CA53 2010-12-07 08:03:34 2020-12-21 19:22:04 3667.471181
43 CA78 2010-12-08 21:00:00 2020-12-21 22:30:23 3666.062766
44 CA65 2010-12-18 00:49:27 2020-12-21 18:43:38 3656.745961
45 CA87 2011-01-06 05:58:38 2020-12-19 19:20:36 3635.556921
46 CA58 2011-01-09 13:24:52 2020-12-21 17:41:15 3634.178044
47 CA90 2012-09-07 20:06:14 2020-12-22 00:21:21 3027.177164
48 CA91 2012-09-18 13:45:51 2020-12-20 22:45:26 3015.374711
49 CA92 2012-10-04 10:37:24 2020-12-21 16:24:01 3000.240706
50 CA93 2013-03-16 19:56:07 2020-12-21 19:37:07 2836.986806
51 CA94 2013-03-30 22:58:31 2020-12-22 06:07:05 2823.297616
52 CA95 2013-04-13 18:51:47 2020-12-22 05:23:18 2809.438553
53 CA96 2013-04-23 19:01:56 2020-12-19 23:44:18 2797.196088
54 CA97 2013-05-01 22:21:31 2020-12-21 12:43:53 2790.598866
55 CA98 2013-05-19 18:55:12 2020-12-21 20:48:26 2773.078634
56 CA99 2013-06-05 19:40:51 2020-12-22 00:39:38 2756.207488
57 CA100 2014-02-02 23:03:13 2020-12-22 05:21:28 2514.262674
58 CA101 2014-02-12 13:15:51 2020-12-22 03:24:19 2504.589213
59 CA102 2014-02-20 19:35:55 2020-12-22 04:02:50 2496.352025
60 CA103 2015-04-23 00:08:14 2020-12-22 06:20:01 2070.258183
61 CA105 2017-04-26 19:09:10 2020-12-22 03:11:45 1335.335127
62 CA104 2017-04-27 01:14:59 2020-12-22 02:18:24 1335.044039
63 CA107 2017-07-26 22:44:57 2020-12-22 00:37:49 1244.078380
64 CA106 2017-10-12 04:00:52 2020-12-21 18:19:53 1166.596539

[75]: trsiop = t.query("max >= '2020-01-01 00:00:00'")

[76]: trnop = t.query("max < '2020-01-01 00:00:00'")

[77]: trnop

[77]: min max nrodays


Truck
CA29 2010-11-30 21:00:00 2013-09-09 16:21:49 1013.806817
CA09 2010-11-30 21:06:09 2010-12-19 07:14:48 18.422674
CA13 2010-11-30 23:02:22 2010-12-09 16:51:05 8.742164
CA01 2010-12-01 01:55:19 2010-12-10 05:39:18 9.155544
CA11 2010-12-01 02:00:42 2010-12-09 16:50:16 8.617755
CA08 2010-12-01 02:07:33 2010-12-22 08:11:40 21.252859
CA28 2010-12-01 13:30:34 2012-08-18 13:45:48 626.010579
CA30 2010-12-01 19:48:35 2014-01-12 09:51:17 1137.585208
CA71 2010-12-02 08:43:35 2014-09-05 20:40:26 1373.497813

[78]: trsiop

[78]: min max nrodays


Truck
CA74 2010-11-30 21:00:00 2020-12-21 17:11:59 3673.841655
CA72 2010-11-30 21:00:00 2020-12-21 10:48:21 3673.575243
CA81 2010-11-30 21:12:22 2020-12-21 19:19:09 3673.921377
CA80 2010-11-30 21:13:13 2020-12-21 16:35:54 3673.807419
CA64 2010-11-30 21:16:15 2020-12-22 06:28:04 3674.383206
CA79 2010-11-30 21:17:27 2020-12-21 19:10:45 3673.912014
CA85 2010-11-30 21:17:34 2020-12-21 16:21:31 3673.794410
CA52 2010-11-30 21:23:02 2020-12-21 18:54:27 3673.896817
CA86 2010-11-30 21:25:05 2020-12-22 05:49:53 3674.350556
CA59 2010-11-30 21:30:14 2020-12-22 03:57:52 3674.269190
CA89 2010-11-30 21:33:03 2020-12-20 10:15:30 3672.529479
CA51 2010-11-30 21:34:21 2020-12-20 19:17:07 3672.904699
CA68 2010-11-30 21:36:49 2020-12-22 03:53:06 3674.261308
CA50 2010-11-30 21:41:03 2020-12-21 10:22:45 3673.528958
CA62 2010-11-30 21:43:44 2020-12-20 22:47:39 3673.044387
CA67 2010-11-30 21:49:07 2020-12-21 16:44:02 3673.788137
CA54 2010-11-30 22:09:25 2020-12-22 01:49:16 3674.152674
CA83 2010-12-01 00:42:20 2020-12-22 05:32:54 3674.201782
CA56 2010-12-01 01:44:16 2020-12-22 03:59:33 3674.093947
CA88 2010-12-01 02:15:29 2020-12-21 20:49:40 3673.773738
CA57 2010-12-01 02:15:44 2020-12-21 11:46:54 3673.396644
CA77 2010-12-01 02:31:17 2020-12-22 03:39:02 3674.047049
CA69 2010-12-01 03:36:37 2020-12-21 17:27:51 3673.577245
CA61 2010-12-01 04:21:01 2020-12-22 06:11:17 3674.076574
CA82 2010-12-01 07:01:44 2020-12-21 12:43:30 3673.237338
CA76 2010-12-01 09:01:08 2020-12-21 15:52:58 3673.285995
CA66 2010-12-01 11:52:39 2020-12-21 22:21:19 3673.436574

CA73 2010-12-01 12:58:59 2020-12-21 16:21:07 3673.140370
CA84 2010-12-01 15:52:36 2020-12-21 18:01:34 3673.089560
CA70 2010-12-01 18:45:41 2020-12-22 05:54:04 3673.464155
CA55 2010-12-02 09:17:31 2020-12-21 14:23:13 3672.212292
CA63 2010-12-02 10:38:06 2020-12-21 15:57:07 3672.221539
CA60 2010-12-02 21:37:12 2020-12-21 17:24:42 3671.824653
CA53 2010-12-07 08:03:34 2020-12-21 19:22:04 3667.471181
CA78 2010-12-08 21:00:00 2020-12-21 22:30:23 3666.062766
CA65 2010-12-18 00:49:27 2020-12-21 18:43:38 3656.745961
CA87 2011-01-06 05:58:38 2020-12-19 19:20:36 3635.556921
CA58 2011-01-09 13:24:52 2020-12-21 17:41:15 3634.178044
CA90 2012-09-07 20:06:14 2020-12-22 00:21:21 3027.177164
CA91 2012-09-18 13:45:51 2020-12-20 22:45:26 3015.374711
CA92 2012-10-04 10:37:24 2020-12-21 16:24:01 3000.240706
CA93 2013-03-16 19:56:07 2020-12-21 19:37:07 2836.986806
CA94 2013-03-30 22:58:31 2020-12-22 06:07:05 2823.297616
CA95 2013-04-13 18:51:47 2020-12-22 05:23:18 2809.438553
CA96 2013-04-23 19:01:56 2020-12-19 23:44:18 2797.196088
CA97 2013-05-01 22:21:31 2020-12-21 12:43:53 2790.598866
CA98 2013-05-19 18:55:12 2020-12-21 20:48:26 2773.078634
CA99 2013-06-05 19:40:51 2020-12-22 00:39:38 2756.207488
CA100 2014-02-02 23:03:13 2020-12-22 05:21:28 2514.262674
CA101 2014-02-12 13:15:51 2020-12-22 03:24:19 2504.589213
CA102 2014-02-20 19:35:55 2020-12-22 04:02:50 2496.352025
CA103 2015-04-23 00:08:14 2020-12-22 06:20:01 2070.258183
CA105 2017-04-26 19:09:10 2020-12-22 03:11:45 1335.335127
CA104 2017-04-27 01:14:59 2020-12-22 02:18:24 1335.044039
CA107 2017-07-26 22:44:57 2020-12-22 00:37:49 1244.078380
CA106 2017-10-12 04:00:52 2020-12-21 18:19:53 1166.596539

[79]: print("Shovels INS: %d Shovels OOS: %d Trucks INS: %d, Trucks OOS: %d"
      % (shsiop.shape[0], shnop.shape[0], trsiop.shape[0], trnop.shape[0]))

Shovels INS: 9 Shovels OOS: 5 Trucks INS: 56, Trucks OOS: 9
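
The in-service / out-of-service split above is repeated for trucks, shovels and operators. A minimal sketch of how it could be wrapped in one helper, assuming the same `TimeArrived` column and the 2020-01-01 cutoff used in the queries; the function name `split_by_last_seen` and the toy rows are hypothetical and are not taken from the dump file:

```python
import pandas as pd

def split_by_last_seen(df, key, cutoff, time_col="TimeArrived"):
    """Return (active, inactive) per-key summaries split on the last record time."""
    summary = df.groupby(key)[time_col].agg(["min", "max"])
    # Span between first and last record, in fractional days.
    summary["nrodays"] = (summary["max"] - summary["min"]) / pd.Timedelta(days=1)
    is_active = summary["max"] >= pd.Timestamp(cutoff)
    return summary[is_active], summary[~is_active]

# Toy data (hypothetical values, for illustration only).
toy = pd.DataFrame({
    "Truck": ["CA01", "CA01", "CA64", "CA64"],
    "TimeArrived": pd.to_datetime([
        "2010-12-01 00:00:00", "2010-12-10 12:00:00",
        "2010-11-30 21:00:00", "2020-12-22 06:00:00",
    ]),
})
active, inactive = split_by_last_seen(toy, "Truck", "2020-01-01")
print("INS: %d OOS: %d" % (active.shape[0], inactive.shape[0]))
```

Computing `nrodays` before the split means both halves carry the column, which the `optrSI` / `optrNO` style frames would otherwise lack.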

[80]: # 'OpCAEX'
optr = df.groupby(['OpCAEX']).TimeArrived.agg(['min','max'])
optrSI = optr.query("max >= '2020-01-01 00:00:00'")
optrNO = optr.query("max < '2020-01-01 00:00:00'")

[81]: optr['nrodays'] = optr['max']-optr['min']
optr['nrodays'] = optr['nrodays']/datetime.timedelta(days=1)

[82]: optrSI

[82]: min max


OpCAEX

0001 2011-02-15 14:54:35 2020-11-22 18:34:36
0002 2020-06-07 18:09:24 2020-06-07 18:09:24
0046 2010-12-22 21:47:05 2020-12-08 23:54:46
0069 2020-07-16 03:39:24 2020-12-15 14:57:52
0070 2010-11-30 21:34:21 2020-12-04 20:31:05
... ... ...
9782 2012-01-07 01:44:05 2020-12-15 15:06:47
9795 2010-12-05 12:23:17 2020-12-21 16:32:17
9922 2017-04-13 17:33:45 2020-12-19 16:46:58
9949 2014-06-15 02:21:40 2020-12-21 06:33:56
mmsunk 2010-11-30 21:00:00 2020-12-20 21:00:00

[293 rows x 2 columns]

[83]: optrNO

[83]: min max


OpCAEX
0006 2015-10-17 18:50:04 2015-10-17 19:08:20
0013 2017-10-30 14:49:44 2017-10-30 14:49:44
0022 2012-05-06 06:33:57 2012-05-06 06:33:57
0079 2011-04-28 15:31:29 2013-01-03 20:53:00
0080 2011-09-30 03:54:24 2017-05-03 18:44:00
... ... ...
9691 2012-09-20 12:14:09 2013-05-29 18:25:37
9694 2010-11-30 21:30:14 2019-02-13 08:21:14
9826 2010-12-22 16:55:40 2015-03-13 16:27:26
9850 2010-12-08 16:20:17 2019-01-28 20:38:24
9917 2014-06-06 14:02:48 2015-05-26 07:42:54

[163 rows x 2 columns]

[84]: # 'OpSHV',
opsh = df.groupby(['OpSHV']).TimeArrived.agg(['min','max'])
opshSI = opsh.query("max >= '2020-01-01 00:00:00'")
opshNO = opsh.query("max < '2020-01-01 00:00:00'")

[85]: opsh['nrodays'] = opsh['max']-opsh['min']
opsh['nrodays'] = opsh['nrodays']/datetime.timedelta(days=1)

[86]: opshSI

[86]: min max


OpSHV
0001 2014-12-22 17:21:10 2020-12-02 18:14:50
0005 2020-09-01 03:35:48 2020-09-01 03:51:40
0046 2010-12-08 12:13:01 2020-11-27 17:02:33

0070 2012-10-08 23:09:09 2020-12-06 17:14:07
0157 2010-12-02 18:49:10 2020-12-15 06:55:41
0406 2020-01-23 00:56:37 2020-07-12 22:22:23
0467 2020-06-13 17:36:52 2020-06-13 19:37:10
0676 2010-12-09 16:53:34 2020-12-11 05:05:30
0812 2017-06-17 22:58:31 2020-11-28 23:50:59
0954 2013-07-23 19:44:24 2020-11-30 13:57:30
0964 2011-01-07 07:02:03 2020-12-15 10:04:53
1022 2010-12-01 11:17:48 2020-12-21 06:42:53
1149 2011-03-07 05:34:21 2020-11-30 22:03:52
1155 2012-10-27 10:15:05 2020-12-15 20:48:52
1180 2010-12-02 21:25:07 2020-12-22 04:06:49
1222 2016-09-16 10:16:25 2020-12-07 11:52:40
1228 2016-05-02 23:13:52 2020-12-08 07:22:54
1317 2011-12-03 00:43:44 2020-05-26 12:37:37
1375 2017-07-27 23:16:29 2020-12-12 23:20:33
1750 2010-12-23 14:42:31 2020-12-15 20:51:10
1815 2011-07-02 03:18:08 2020-12-21 16:55:54
2044 2010-12-01 09:40:19 2020-12-21 13:29:17
2550 2010-12-01 16:35:21 2020-12-21 20:52:42
2902 2010-12-02 04:22:54 2020-12-21 07:06:44
3185 2010-12-09 01:46:45 2020-12-15 15:12:07
3655 2016-02-04 18:22:21 2020-12-08 20:42:31
3673 2016-09-12 02:16:46 2020-07-27 20:50:16
3707 2019-09-18 21:38:16 2020-11-01 11:27:32
3902 2016-08-11 01:09:39 2020-02-19 02:39:04
4055 2013-12-09 07:28:04 2020-11-28 23:00:54
4091 2010-12-08 13:57:11 2020-12-15 08:39:52
4243 2010-11-30 21:34:21 2020-12-22 06:20:01
4344 2017-10-10 10:49:13 2020-11-28 19:08:59
4374 2012-12-13 17:52:13 2020-12-13 05:38:57
5001 2010-12-03 15:28:42 2020-12-21 20:48:26
5329 2010-12-03 16:58:36 2020-12-21 09:34:55
5384 2010-12-07 21:25:14 2020-12-15 13:39:03
5418 2015-09-12 04:48:40 2020-08-21 11:26:51
5469 2010-12-07 21:33:38 2020-11-22 16:38:49
5477 2011-01-12 12:49:04 2020-12-21 05:43:43
5865 2010-12-21 21:27:15 2020-12-01 16:33:55
5920 2012-05-02 01:42:07 2020-12-19 19:17:14
6423 2010-12-01 09:42:21 2020-12-21 19:44:07
6584 2011-01-03 12:10:45 2020-05-11 17:04:00
6668 2016-12-11 17:54:11 2020-12-22 06:28:04
6696 2017-06-17 20:44:32 2020-11-28 20:54:06
6733 2010-12-08 16:05:42 2020-12-15 08:37:11
6852 2012-04-20 08:39:26 2020-12-20 08:50:06
7377 2018-09-30 20:38:55 2020-12-08 01:49:23
7381 2011-02-13 10:00:30 2020-11-24 05:43:09

7514 2010-12-08 09:18:12 2020-03-09 08:48:07
7648 2010-12-01 09:33:31 2020-12-21 18:43:38
7948 2020-09-04 21:58:03 2020-09-06 23:18:11
8256 2016-03-09 00:53:53 2020-12-08 22:26:22
8261 2020-06-18 07:28:44 2020-08-30 14:02:04
8715 2010-12-01 21:30:09 2020-04-27 16:56:50
8882 2012-07-06 16:02:43 2020-12-15 13:56:53
8974 2011-08-12 02:01:02 2020-12-18 02:56:43
8997 2017-04-28 11:04:30 2020-12-07 20:50:47
9108 2010-12-02 16:43:03 2020-12-16 20:53:24
9579 2014-04-13 07:32:17 2020-12-15 02:43:32
9670 2013-03-25 08:03:30 2020-11-24 05:58:10
9922 2017-04-28 06:19:11 2020-12-16 13:55:10
mmsunk 2010-11-30 21:00:00 2020-12-21 21:21:29

[87]: opshNO

[87]: min max


OpSHV
0002 2015-08-02 20:37:25 2015-08-02 20:42:19
0006 2016-04-30 14:40:42 2017-05-05 13:58:51
0011 2013-12-11 13:35:41 2014-02-09 12:51:39
0036 2011-02-19 05:44:08 2014-08-13 15:18:08
0079 2010-12-22 21:32:12 2013-01-04 13:49:17
0094 2018-07-19 11:56:34 2018-07-19 11:58:35
0376 2011-10-24 19:43:43 2012-02-12 20:50:53
0427 2010-12-09 04:07:38 2015-08-04 15:31:31
0774 2011-03-16 13:23:10 2011-11-18 01:03:25
0875 2010-11-30 21:34:24 2016-01-30 16:48:34
0985 2017-10-25 21:39:44 2018-01-18 08:53:43
0999 2012-04-15 16:31:35 2012-04-15 17:34:13
1202 2011-01-18 21:21:44 2016-07-30 04:51:47
1569 2010-12-09 11:55:36 2019-01-01 16:20:35
1593 2010-12-02 12:51:52 2012-07-31 06:08:33
1812 2015-06-01 03:32:11 2015-06-01 06:21:34
2407 2012-12-17 19:16:46 2016-02-05 18:04:55
2431 2016-11-20 16:40:26 2016-11-20 18:05:32
2519 2010-12-08 15:01:09 2017-08-24 08:53:33
2536 2017-09-14 09:49:12 2017-10-02 03:14:04
2577 2010-12-24 11:12:44 2015-07-05 20:08:16
3201 2012-11-14 10:52:35 2015-02-09 11:32:38
3217 2011-01-12 15:48:27 2013-04-20 00:40:05
3473 2011-08-14 01:34:47 2011-08-14 04:00:34
3568 2012-07-02 11:34:40 2012-10-17 11:27:02
3943 2017-10-15 10:00:22 2018-07-07 07:24:49
3957 2011-01-02 03:25:11 2013-04-29 02:12:53
4626 2017-01-24 09:26:46 2017-01-24 09:59:32

5254 2016-03-14 16:03:47 2016-03-14 16:27:24
6039 2010-12-19 09:22:10 2013-02-21 05:48:48
6310 2015-10-24 05:15:08 2017-10-24 01:19:36
6373 2010-12-01 20:02:54 2018-06-10 17:44:20
6458 2013-08-16 11:56:54 2018-08-26 08:47:59
6574 2017-11-14 09:41:18 2017-12-25 00:43:21
6586 2010-12-01 21:40:35 2018-11-25 05:51:14
6825 2010-12-25 03:18:45 2017-12-17 06:46:31
7278 2017-11-11 01:26:04 2018-11-23 20:28:25
7283 2012-05-11 00:59:14 2016-07-18 08:59:36
7451 2011-01-21 17:36:22 2019-09-07 12:39:29
7513 2016-01-16 21:27:50 2016-01-16 21:40:37
7645 2010-11-30 21:17:34 2010-12-02 08:49:15
7657 2012-12-08 01:13:44 2019-01-25 05:23:55
7945 2017-12-08 18:22:43 2017-12-24 03:30:07
8070 2011-09-15 10:27:37 2017-12-13 08:57:16
8225 2015-11-09 04:59:55 2016-05-08 16:38:19
8428 2010-12-02 21:55:18 2012-01-12 16:51:52
8542 2010-12-09 04:26:44 2017-12-29 08:56:09
8657 2010-12-07 21:45:03 2018-12-17 18:48:14
8719 2010-12-20 19:00:35 2015-09-23 05:51:33
8751 2012-11-04 20:36:36 2013-10-18 23:38:21
8880 2012-03-15 22:42:39 2012-03-16 08:57:47
8987 2011-06-29 21:55:41 2011-06-30 00:34:02
9232 2010-12-01 15:33:32 2018-04-30 06:17:39
9342 2011-01-22 02:37:22 2017-08-25 04:59:27
9378 2011-04-16 12:33:11 2016-07-05 15:43:01
9826 2010-12-09 00:14:26 2019-12-15 16:49:59
9850 2012-02-17 02:34:14 2017-11-06 20:49:08

[88]: print("Opers CAEX INS: %d Opers CAEX OOS: %d Opers SHOVEL INS: %d Opers SHOVEL OOS: %d"
      % (optrSI.shape[0], optrNO.shape[0], opshSI.shape[0], opshNO.shape[0]))

Opers CAEX INS: 293 Opers CAEX OOS: 163 Opers SHOVEL INS: 64 Opers SHOVEL OOS: 57

[89]: velop = df.groupby(['OpCAEX']).velocidad.agg(['min','mean','median','max'])

[90]: velop

[90]: min mean median max


OpCAEX
0001 14.57 21.840690 22.11 26.58
0002 20.59 20.590000 20.59 20.59
0006 24.61 24.610000 24.61 24.61
0013 25.27 25.270000 25.27 25.27
0022 25.91 25.910000 25.91 25.91

... ... ... ... ...
9850 2.65 21.264105 22.48 27.83
9917 2.65 21.798007 22.44 43.34
9922 13.80 20.139268 21.01 25.82
9949 10.47 20.733605 21.33 27.33
mmsunk 2.65 21.325989 22.33 47.08

[456 rows x 4 columns]

[91]: idxmi = velop[['min']].idxmin()

[92]: idxmi

[92]: min 0390


dtype: object

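[93]: # Label slice: prints every row from the operator with the lowest minimum to the end of the index
print(velop[idxmi.iloc[0]:])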

min mean median max


OpCAEX
0390 2.46 21.147653 22.49 51.77
0406 2.69 21.669481 22.80 56.25
0437 2.65 21.882526 22.66 32.75
0451 20.28 23.793333 25.09 26.01
0465 2.69 21.312389 22.57 47.08
... ... ... ... ...
9850 2.65 21.264105 22.48 27.83
9917 2.65 21.798007 22.44 43.34
9922 13.80 20.139268 21.01 25.82
9949 10.47 20.733605 21.33 27.33
mmsunk 2.65 21.325989 22.33 47.08

[428 rows x 4 columns]
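
Slicing with `velop[label:]` returns the row for that operator and every row after it, which is why the output above has 428 rows. If only the extreme operator's row is wanted, `.loc` with the label is one option. A minimal sketch on toy data (the per-record values are hypothetical, chosen only so the group minima and maxima resemble the table above):

```python
import pandas as pd

# Toy records: two velocity readings per operator (hypothetical).
toy = pd.DataFrame({
    "OpCAEX": ["0001", "0001", "0390", "0390", "3568", "3568", "9850", "9850"],
    "velocidad": [14.57, 26.58, 2.46, 51.77, 3.91, 59.03, 2.65, 27.83],
})
velop = toy.groupby("OpCAEX")["velocidad"].agg(["min", "mean", "median", "max"])

slowest = velop["min"].idxmin()  # label of the operator with the lowest minimum
fastest = velop["max"].idxmax()  # label of the operator with the highest maximum

# .loc with a list of labels returns exactly those rows.
extremes = velop.loc[[slowest, fastest]]
print(extremes)

# For comparison, a label slice keeps everything from that label onward.
tail = velop.loc[slowest:]
print(len(tail))
```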

[94]: idxma = velop[['max']].idxmax()

[95]: idxma

[95]: max 3568


dtype: object

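[96]: # Label slice: prints every row from the operator with the highest maximum to the end of the index
print(velop[idxma.iloc[0]:])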

min mean median max


OpCAEX
3568 3.91 21.800806 22.860 59.03
3594 13.33 19.525833 20.855 25.03
3625 2.65 21.242113 22.180 56.11

3646 6.55 21.565922 23.030 32.21
3655 2.67 21.467263 22.770 27.50
... ... ... ... ...
9850 2.65 21.264105 22.480 27.83
9917 2.65 21.798007 22.440 43.34
9922 13.80 20.139268 21.010 25.82
9949 10.47 20.733605 21.330 27.33
mmsunk 2.65 21.325989 22.330 47.08

[302 rows x 4 columns]

[97]: fases = df.groupby(['Fase']).TimeArrived.agg(['min','max'])

[98]: fases['nrodays'] = fases['max']-fases['min']
fases['nrodays'] = fases['nrodays']/datetime.timedelta(days=1)

[99]: fases.sort_values(by=['min'], ascending=True)

[99]: min max nrodays


Fase
STOC 2010-11-30 21:00:00 2020-12-22 06:20:01 3674.388900
F6SE 2010-11-30 21:12:22 2012-04-20 10:49:50 506.567685
F6NE 2010-11-30 21:16:15 2015-03-15 06:25:52 1565.381678
F5SW 2010-12-01 11:17:48 2014-01-03 19:49:44 1129.355509
F6SW 2010-12-06 09:47:38 2015-10-23 08:55:34 1781.963843
F4NN 2010-12-08 16:05:42 2011-05-06 17:59:22 149.078935
F7NN 2011-01-06 05:42:42 2013-05-01 20:04:09 846.598229
DERR 2011-02-13 20:13:21 2011-02-25 21:46:42 12.064826
F7NE 2011-03-17 11:54:01 2015-11-30 09:45:10 1718.910521
F6NW 2011-10-14 06:57:34 2011-10-14 06:57:34 0.000000
F7NW 2012-03-12 20:32:23 2014-01-24 13:10:38 682.693229
CH-A 2012-03-26 23:46:59 2012-03-27 08:07:16 0.347419
SM-A 2012-03-27 11:49:17 2015-08-31 09:57:33 1251.922407
POR- 2012-07-20 20:18:13 2012-07-25 09:01:53 4.530324
F7DE 2012-12-09 21:27:31 2015-10-04 14:41:57 1028.718356
F9NW 2013-06-14 12:12:48 2014-01-07 13:58:11 207.073183
POLI 2013-11-26 09:37:03 2014-01-08 00:52:48 42.635937
F10N 2014-01-15 04:32:32 2020-12-22 04:06:49 2532.982141
F7DW 2014-05-18 19:04:30 2016-05-22 12:57:16 734.744977
F8NE 2014-08-28 01:24:40 2017-01-02 22:16:55 858.869618
F9PD 2014-09-29 09:11:28 2018-01-15 21:47:33 1204.525058
F7DN 2014-11-03 21:52:06 2014-11-04 09:24:54 0.481111
F8SE 2015-05-31 10:02:43 2015-06-13 22:26:59 13.516852
RELL 2015-08-27 12:57:20 2015-09-07 20:51:13 11.329086
VENT 2015-12-18 19:04:06 2016-09-26 04:15:45 282.383090
F7R1 2016-01-03 14:07:16 2020-12-22 06:28:04 1814.681111
CASE 2016-08-04 12:46:50 2016-09-29 12:08:39 55.973484

OP04 2016-09-21 19:35:24 2016-09-21 19:35:24 0.000000
CH_3 2016-12-16 14:21:18 2017-01-28 14:28:25 43.004942
F10E 2018-02-16 08:04:53 2018-12-31 21:32:59 318.561181
F9SE 2018-05-19 01:11:38 2020-12-21 18:10:47 947.707743
F11W 2020-02-13 21:17:45 2020-08-21 22:19:37 190.042963

[100]: fases = fases.sort_values(by=['nrodays'],ascending=False)
fig = plt.figure(figsize=(16,9))
ax = fig.add_axes([0,0,1,1])
y=fases['nrodays']
x=fases.index
ax.bar(x,y)
plt.xticks(rotation=90)
plt.show()

[101]: bancos = df.groupby(['Banco']).TimeArrived.agg(['min','max'])

[102]: bancos['nrodays'] = bancos['max']-bancos['min']
bancos['nrodays'] = bancos['nrodays']/datetime.timedelta(days=1)

[103]: bancos.sort_values(by=['min'], ascending=True)

[103]: min max nrodays


Banco
3095 2010-11-30 21:00:00 2016-03-18 12:27:16 1934.643935
3110 2010-11-30 21:12:22 2016-09-21 19:35:24 2121.932662

3245 2010-11-30 21:16:15 2015-04-06 10:33:13 1587.553449
2885 2010-12-01 11:17:48 2018-05-18 03:46:39 2724.686701
3275 2010-12-01 16:35:21 2020-12-18 14:07:54 3669.897604
... ... ... ...
2772 2020-07-26 19:26:40 2020-10-09 13:23:50 74.748032
2765 2020-08-13 23:57:56 2020-11-17 06:52:28 95.287870
3150 2020-09-09 11:34:35 2020-12-09 18:59:09 91.308727
2757 2020-11-14 00:59:21 2020-12-22 06:28:04 38.228275
3210 2020-12-08 22:17:28 2020-12-13 18:12:53 4.830150

[110 rows x 3 columns]

[104]: bancos = bancos.sort_values(by=['nrodays'],ascending=False)
fig = plt.figure(figsize=(16,9))
ax = fig.add_axes([0,0,1,1])
y=bancos['nrodays']
x=bancos.index
ax.bar(x,y)
plt.xticks(rotation = 90, fontsize=9)
plt.show()

[105]: df.head(10)

[105]: TimeArrived Truck Destino Origen Shovel Tons \


0 2020-12-22 06:28:04 CA64 CH-02 F7R1-2757-02/MP2 CF10 324.0
1 2020-12-22 06:20:01 CA103 CH-02 STOC-2930-00/MM1 PA05 296.0

2 2020-12-22 06:11:17 CA61 CH-02 STOC-2930-00/MM1 PA05 287.0
3 2020-12-22 06:07:05 CA94 CH-02 STOC-2930-00/MM1 PA05 269.0
4 2020-12-22 05:54:04 CA70 CH-02 STOC-2930-00/MM1 PA05 318.0
5 2020-12-22 05:49:53 CA86 CH-1 STOC-2930-00/MM1 PA05 261.0
6 2020-12-22 05:46:01 CA103 CH-02 STOC-2930-00/MM1 PA05 294.0
7 2020-12-22 05:32:54 CA83 CH-1 F7R1-2757-02/MP2 CF10 305.0
8 2020-12-22 05:29:24 CA103 CH-02 STOC-2930-00/MM1 PA05 272.0
9 2020-12-22 05:23:18 CA95 CH-02 STOC-2930-00/MM1 PA05 314.0

Distancia TpoViaje TpoEsperaDump TpoDump TipoMineral OpCAEX OpSHV \


0 3875 16.950000 0.000000 0.933333 Media Ley Prim 8689 6668
1 1752 7.733333 0.000000 0.983333 Mineral Media 3015 4243
2 1752 7.733333 0.000000 0.950000 Mineral Media 2751 4243
3 1752 7.733333 0.000000 0.900000 Mineral Media 0985 4243
4 1752 7.733333 0.000000 0.183333 Mineral Media 6574 4243
5 2018 9.033334 0.800000 1.916667 Mineral Media 2628 4243
6 1752 7.733333 0.000000 0.950000 Mineral Media 3015 4243
7 4031 17.383330 0.016667 1.000000 Media Ley Prim 6107 6668
8 1752 7.733333 0.000000 1.000000 Mineral Media 3015 4243
9 1752 7.733333 0.050000 1.200000 Mineral Media 5671 4243

Fase Banco Malla Otro velocidad


0 F7R1 2757 02 MP2 13.72
1 STOC 2930 00 MM1 13.59
2 STOC 2930 00 MM1 13.59
3 STOC 2930 00 MM1 13.59
4 STOC 2930 00 MM1 13.59
5 STOC 2930 00 MM1 13.40
6 STOC 2930 00 MM1 13.59
7 F7R1 2757 02 MP2 13.91
8 STOC 2930 00 MM1 13.59
9 STOC 2930 00 MM1 13.59

[106]: df['Year'] = pd.DatetimeIndex(df['TimeArrived']).year
df['Month'] = pd.DatetimeIndex(df['TimeArrived']).month
df['Day'] = pd.DatetimeIndex(df['TimeArrived']).day

[107]: df

[107]: TimeArrived Truck Destino Origen Shovel Tons \


0 2020-12-22 06:28:04 CA64 CH-02 F7R1-2757-02/MP2 CF10 324.0
1 2020-12-22 06:20:01 CA103 CH-02 STOC-2930-00/MM1 PA05 296.0
2 2020-12-22 06:11:17 CA61 CH-02 STOC-2930-00/MM1 PA05 287.0
3 2020-12-22 06:07:05 CA94 CH-02 STOC-2930-00/MM1 PA05 269.0
4 2020-12-22 05:54:04 CA70 CH-02 STOC-2930-00/MM1 PA05 318.0
... ... ... ... ... ... ...
2141616 2010-11-30 21:12:22 CA81 CH-1 F6SE-3110-03/MM1 PA04 304.0

2141617 2010-11-30 21:06:09 CA09 CH-02 STOC-3095-02/MM1 PA06 212.0
2141618 2010-11-30 21:00:00 CA72 CH-02 STOC-3095-02/MM1 PA06 304.0
2141619 2010-11-30 21:00:00 CA29 CH-1 STOC-3095-02/MM1 PA06 303.0
2141620 2010-11-30 21:00:00 CA74 CH-1 STOC-3095-02/MM1 PA06 304.0

Distancia TpoViaje TpoEsperaDump TpoDump ... OpCAEX OpSHV \


0 3875 16.950000 0.000000 0.933333 ... 8689 6668
1 1752 7.733333 0.000000 0.983333 ... 3015 4243
2 1752 7.733333 0.000000 0.950000 ... 2751 4243
3 1752 7.733333 0.000000 0.900000 ... 0985 4243
4 1752 7.733333 0.000000 0.183333 ... 6574 4243
... ... ... ... ... ... ... ...
2141616 1123 3.566667 1.783333 3.000000 ... 1222 mmsunk
2141617 1255 4.300000 0.000000 0.633333 ... mmsunk mmsunk
2141618 1255 4.300000 0.000000 0.000000 ... mmsunk mmsunk
2141619 1533 5.433333 0.000000 0.000000 ... mmsunk mmsunk
2141620 1533 5.433333 0.000000 0.000000 ... mmsunk mmsunk

Fase Banco Malla Otro velocidad Year Month Day


0 F7R1 2757 02 MP2 13.72 2020 12 22
1 STOC 2930 00 MM1 13.59 2020 12 22
2 STOC 2930 00 MM1 13.59 2020 12 22
3 STOC 2930 00 MM1 13.59 2020 12 22
4 STOC 2930 00 MM1 13.59 2020 12 22
... ... ... ... ... ... ... ... ...
2141616 F6SE 3110 03 MM1 18.89 2010 11 30
2141617 STOC 3095 02 MM1 17.51 2010 11 30
2141618 STOC 3095 02 MM1 17.51 2010 11 30
2141619 STOC 3095 02 MM1 16.93 2010 11 30
2141620 STOC 3095 02 MM1 16.93 2010 11 30

[2109113 rows x 21 columns]
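
The calendar columns can also be derived with the `.dt` accessor, which avoids building a separate `DatetimeIndex` for each field. A minimal sketch, assuming `TimeArrived` is already a datetime column; the toy frame below is hypothetical and only mirrors the column names used in this notebook:

```python
import pandas as pd

# Toy frame with the same column names as the notebook (values are illustrative).
toy = pd.DataFrame({
    "TimeArrived": pd.to_datetime(["2020-12-22 06:28:04", "2010-11-30 21:00:00"]),
    "TpoViaje": [16.95, 5.43],
})

# .dt exposes the datetime components of a datetime64 Series.
toy["Year"] = toy["TimeArrived"].dt.year
toy["Month"] = toy["TimeArrived"].dt.month
toy["Day"] = toy["TimeArrived"].dt.day

# Yearly travel-time summary, with the statistics in a fixed column order.
tviaje_toy = toy.groupby("Year")["TpoViaje"].agg(["min", "mean", "median", "max"])
print(tviaje_toy)
```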

[108]: tviaje = df.groupby(['Year']).TpoViaje.agg(['min','mean','median','max'])

[109]: tviaje

[109]: min mean median max


Year
2010 1.100000 6.941381 8.500000 72.01667
2011 0.700000 6.681072 6.366667 52.25000
2012 0.283333 8.802006 7.466667 80.16666
2013 0.183333 11.392940 11.333330 64.33334
2014 0.333333 7.269753 6.466667 83.01667
2015 0.133333 7.722635 6.316667 139.51670
2016 0.233333 7.735175 6.383333 95.10000
2017 0.283333 11.209873 9.116667 109.58330

2018 0.500000 13.896893 14.433330 244.38330
2019 0.933333 14.440410 13.916670 176.15000
2020 0.766667 16.033147 14.283330 125.40000

[110]: tfase = df.groupby(['Year','Fase']).TpoViaje.agg(['min','mean','median','max'])

[111]: tfase.sort_index(inplace=True)

[112]: tfase

[112]: min mean median max


Year Fase
2010 F4NN 8.650000 8.942965 8.966666 10.933330
F5SW 1.100000 9.474975 9.516666 72.016670
F6NE 3.733333 9.547163 9.116667 39.300000
F6SE 1.933333 5.514640 3.616667 36.333330
F6SW 2.100000 4.166666 4.166666 6.233333
STOC 1.466667 4.684810 4.700000 12.000000
2011 DERR 8.566667 12.466398 13.250000 15.650000
F4NN 1.416667 9.713199 9.416667 18.283330
F5SW 1.100000 10.106184 10.016670 43.233330
F6NE 0.700000 6.621954 6.366667 52.250000
F6NW 9.583333 9.583333 9.583333 9.583333
F6SE 1.100000 3.178918 2.950000 23.933330
F6SW 3.100000 3.100000 3.100000 3.100000
F7NE 20.500000 20.500000 20.500000 20.500000
F7NN 14.750000 15.994101 15.133330 27.900000
STOC 0.816667 5.239502 5.216667 14.850000
2012 CH-A 1.650000 2.120833 2.250000 2.333333
F5SW 1.100000 11.213214 11.300000 37.483330
F6NE 0.283333 5.241083 5.316667 80.166660
F6SE 1.500000 2.874076 2.800000 7.866667
F7DE 4.200000 15.194060 12.283330 54.583330
F7NE 1.116667 14.049476 12.816670 63.766670
F7NN 0.950000 13.637557 12.316670 55.833330
F7NW 3.683333 6.250000 6.250000 8.816667
POR- 5.016667 16.290007 16.016670 33.316670
SM-A 1.116667 4.031312 3.433333 21.333330
STOC 0.850000 5.222534 4.666667 36.700000
2013 F5SW 1.200000 12.580138 12.683330 39.883340
F6NE 1.316667 8.495041 8.400000 29.500000
F7DE 0.183333 11.152819 10.983330 48.950000
F7NE 0.400000 11.903489 11.716670 64.333340
F7NN 3.983333 9.343864 8.266666 25.600000
F7NW 9.316667 13.170000 10.716670 23.300000
F9NW 1.916667 22.639686 22.833330 32.266670
POLI 6.633333 9.353703 7.816667 22.783330

SM-A 1.883333 6.092091 5.883333 29.850000
STOC 2.766667 8.649688 5.950000 33.500000
2014 F10N 0.716667 24.766733 23.216670 83.016670
F5SW 13.550000 13.785632 13.550000 14.150000
F6NE 1.483333 9.569675 9.616667 56.116660
F7DE 0.333333 7.046135 6.766667 68.233330
F7DN 7.316667 7.810378 7.916667 8.683333
F7DW 0.783333 8.711853 8.900000 51.083330
F7NE 0.733333 5.384588 5.033333 69.316670
F7NW 8.350000 16.672223 8.350000 33.316670
F8NE 1.033333 6.064710 5.366667 54.616660
F9NW 11.100000 24.024053 24.500000 25.466670
F9PD 4.783333 4.783333 4.783333 4.783333
POLI 8.700000 9.100000 9.300000 9.300000
SM-A 1.033333 3.691833 3.516667 48.066670
STOC 0.516667 8.411961 6.966667 70.633330
2015 F10N 0.133333 25.657162 24.466670 139.516700
F6NE 1.266667 8.804618 8.733334 52.366660
F6SW 1.083333 10.169895 9.500000 73.100000
F7DE 1.000000 7.402948 6.516667 68.833340
F7DW 0.366667 8.083917 7.283333 96.983330
F7NE 2.600000 9.686597 12.133330 61.466670
F8NE 0.533333 5.845189 4.966667 89.016670
F8SE 1.083333 3.940164 3.766667 6.833333
RELL 4.283333 6.715823 6.566667 9.250000
SM-A 3.000000 3.594136 3.433333 5.633333
STOC 0.333333 7.812250 4.366667 80.983330
VENT 3.566667 3.841667 3.841667 4.116667
2016 CASE 1.116667 3.005451 3.000000 13.850000
CH_3 3.416667 3.416667 3.416667 3.416667
F10N 0.750000 22.022711 20.050000 95.100000
F7DW 1.033333 6.361630 6.166667 60.916670
F7R1 0.233333 6.216798 6.183333 74.533330
F8NE 1.050000 6.903034 6.716667 64.416660
OP04 22.016670 22.016670 22.016670 22.016670
STOC 0.733333 7.906810 6.400000 64.383330
VENT 2.966667 3.685379 3.566667 5.550000
2017 CH_3 3.416667 3.416667 3.416667 3.416667
F10N 0.583333 18.562358 17.683330 109.583300
F7R1 0.283333 8.205488 8.183333 73.200000
F8NE 6.166667 8.351394 7.933333 39.833330
STOC 0.416667 6.112060 4.266667 48.550000
2018 F10E 8.366667 16.549612 16.050000 60.233330
F10N 1.200000 16.624713 15.966670 93.750000
F7R1 1.366667 12.088320 12.166670 244.383300
F9PD 10.900000 10.900000 10.900000 10.900000
F9SE 1.950000 13.513334 13.783330 28.566670

STOC 0.500000 4.525952 3.916667 64.233330
2019 F10N 0.933333 13.978458 13.633330 176.150000
F7R1 2.733333 14.786083 14.600000 116.816700
F9SE 2.866667 25.697110 24.566670 110.633300
STOC 1.166667 5.108507 4.100000 82.566670
2020 F10N 1.200000 13.014895 13.500000 59.416670
F11W 10.383330 13.137500 13.816670 14.533330
F7R1 1.200000 16.658428 16.500000 80.516670
F9SE 1.200000 27.364230 27.083330 125.400000
STOC 0.766667 4.576475 3.383333 80.783330

[113]: tposOD = df.groupby(['Origen','Destino']).TpoViaje.agg(['min','mean','median','max'])

[114]: display(tposOD)

min mean median max


Origen Destino
CASE-3020-00/MM1 CH-02 1.116667 2.790721 2.583333 9.983334
CH-1 2.916667 3.155601 3.100000 10.200000
CASE-3020-70/MM1 CH-02 2.466667 2.652976 2.483333 10.650000
CH-1 2.916667 3.164917 3.000000 13.850000
CH-APIRES/MM1 CH-02 1.650000 1.950000 1.950000 2.250000
... ... ... ... ...
STOCK-SEC-00/MM1 CH-1 7.583333 8.665728 7.616667 64.383330
STOCK_PEL-01/MM1 CH-02 5.816667 12.328886 12.633330 14.566670
CH-1 7.200000 13.111856 13.183330 14.016670
VENT-3027-00/MM1 CH-02 2.966667 3.358839 3.566667 5.500000
CH-1 3.516667 3.906229 4.116667 5.550000

[11098 rows x 4 columns]
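
Several groups in the travel-time summaries show identical min, mean, median and max, which is what a group with a single trip would produce. Adding a trip count next to the statistics makes that visible. A minimal sketch using pandas named aggregation; the column name `trips` and the toy rows are hypothetical:

```python
import pandas as pd

# Toy trips (hypothetical travel times, in the same units as TpoViaje).
toy = pd.DataFrame({
    "Origen": ["STOC-2930-00/MM1", "STOC-2930-00/MM1",
               "STOC-2930-00/MM1", "F7R1-2757-02/MP2"],
    "Destino": ["CH-02", "CH-02", "CH-1", "CH-02"],
    "TpoViaje": [7.0, 8.0, 9.0, 17.0],
})

# Named aggregation: each keyword becomes an output column.
tposOD_toy = toy.groupby(["Origen", "Destino"])["TpoViaje"].agg(
    trips="count", min="min", mean="mean", median="median", max="max"
)

# Origin-destination pairs backed by a single trip carry little statistical weight.
single_trip = tposOD_toy[tposOD_toy["trips"] == 1]
print(tposOD_toy)
print(len(single_trip))
```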

[ ]:
