L03 - Data Analysis Procedure
The processing of experimental data recorded on the data sheet is based on the
random character of traffic streams. Thus, the random variable X is defined.
The random variables used in traffic engineering are of two types:
• Discrete random variables (V.A.D.), for instance, the number of vehicles arriving at a
certain location;
• Continuous random variables (V.A.C.), for instance, vehicle speed recorded at a
certain point along the roadway or space headways between vehicles in a traffic
stream, etc.
Table E.3
page / xi    0    1    2    3    4    5    6    7    8
1            3   10   11   16    6    1    1    0    0
2            6   10    6   11   11    6    3    1    0
3            3    9   16    9    9    4    2    1    1
4            7   14    6    7   11    4    3    2    0
5            3    5    9   13   10    8    2    4    0
6            0    6   15   10   11    8    3    1    0
7            3    3   15    7    8    4    2    0    0
Ni          25   57   78   73   66   35   16    9    1
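The Ni row is the column-wise sum of the page tallies. A minimal Python sketch of that aggregation follows; the names `pages`, `totals` and `n_intervals` are chosen here for illustration only.

```python
# Per-page tallies from Table E.3: one row per data-sheet page,
# one column per value xi = 0..8.
pages = [
    [3, 10, 11, 16, 6, 1, 1, 0, 0],
    [6, 10, 6, 11, 11, 6, 3, 1, 0],
    [3, 9, 16, 9, 9, 4, 2, 1, 1],
    [7, 14, 6, 7, 11, 4, 3, 2, 0],
    [3, 5, 9, 13, 10, 8, 2, 4, 0],
    [0, 6, 15, 10, 11, 8, 3, 1, 0],
    [3, 3, 15, 7, 8, 4, 2, 0, 0],
]

# Summing each column gives the absolute frequency Ni of the value xi.
totals = [sum(column) for column in zip(*pages)]
n_intervals = sum(totals)

for xi, ni in enumerate(totals):
    print(f"N{xi} = {ni}")
print("N =", n_intervals)
```

Table E.4 lists 10 for xi = 7, which equals N7 + N8 = 9 + 1, so the last two classes appear to have been merged there.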
A discrete random variable may be described through a range of values, as follows:
$$X = \begin{pmatrix} x_1 & x_2 & \dots & x_n \\ p_1 & p_2 & \dots & p_n \end{pmatrix},$$
where $x_i$ is a value taken by the random variable X and $p_i$ is the probability of that
value occurring, or $f_i$, its relative frequency of occurrence, on condition that
$0 \le p_i = f_i \le 1$ for every i and $\sum_{i=1}^{n} p_i = 1$.
or
$$X = \begin{pmatrix} x_1 & x_2 & \dots & x_n \\ N_1 & N_2 & \dots & N_n \end{pmatrix},$$
with $N_i \ge 0$, natural numbers, for every i and $\sum_{i=1}^{n} N_i = N$.
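The two forms of X are linked by fi = Ni / N. The following Python sketch converts the absolute-frequency form into the relative-frequency form and checks the two conditions stated above; the function name `relative_frequencies` is hypothetical and the counts are the Ni row of Table E.3.

```python
def relative_frequencies(counts):
    """Return fi = Ni / N for a list of absolute frequencies Ni."""
    if any(n < 0 for n in counts):
        raise ValueError("absolute frequencies must be non-negative")
    total = sum(counts)
    if total == 0:
        raise ValueError("at least one observation is required")
    return [n / total for n in counts]

values = list(range(9))                      # xi = 0..8
counts = [25, 57, 78, 73, 66, 35, 16, 9, 1]  # Ni row of Table E.3
freqs = relative_frequencies(counts)

# Conditions on the probability form of X.
assert all(0.0 <= f <= 1.0 for f in freqs)
assert abs(sum(freqs) - 1.0) < 1e-12

for x, f in zip(values, freqs):
    print(x, round(f, 6))
```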
Table E.4 comprises the results of recordings made on ……. at hour ……, at time intervals
of ∆t = 10 seconds.
Table E.4:
xi Ni fi f∑> f∑< x̄=xi*fi s²=(xi-x̄)²*fi P(X=x) Ni' Ni²/Ni'
(1) (2) (3) (4) (5) (6) (7) (8) (9) (10)
0 25 0,069444 0,069444 1 0 0,576222 0,056031 20,17126 30,98468
1 57 0,158333 0,227778 0,930556 0,158333 0,559944 0,161562 58,16232 55,86091
2 78 0,216667 0,444444 0,772222 0,433333 0,167999 0,232837 83,82146 72,58285
3 73 0,202778 0,647222 0,555556 0,608333 0,002893 0,223619 80,50302 66,19628
4 66 0,183333 0,830556 0,352778 0,733333 0,229745 0,161014 57,96489 75,14894
5 35 0,097222 0,927778 0,169444 0,486111 0,436727 0,092713 33,37662 36,70233
6 16 0,044444 0,972222 0,072222 0,266667 0,432486 0,04447 16,00933 15,99068
7 10 0,027778 1 0,027778 0,194444 0,471384 0,018276 6,579478 15,19877
∑ 360 1 1 2,880556 2,8774 356,5884 368,6654
From the table it can be seen that $\sum_{i=0}^{n} N_i = \frac{3600\ \text{s}}{10\ \text{s}} = 360$ intervals.
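Columns (3) to (7) of the table follow directly from xi and Ni. A short Python sketch of that empirical processing is given here, assuming (from the column totals) that column (6) sums to the mean and column (7) to the dispersion; variable names are illustrative.

```python
from itertools import accumulate

xs = [0, 1, 2, 3, 4, 5, 6, 7]              # column (1)
counts = [25, 57, 78, 73, 66, 35, 16, 10]  # column (2)

n_total = sum(counts)
# One observation per 10 s interval over a 3600 s recording.
assert n_total == 3600 // 10

f = [n / n_total for n in counts]                # column (3): relative frequency
cum_up = list(accumulate(f))                     # column (4): running total from xi = 0
cum_down = [sum(f[i:]) for i in range(len(f))]   # column (5): total from xi upwards

mean_terms = [x * fi for x, fi in zip(xs, f)]    # column (6)
mean = sum(mean_terms)

var_terms = [(x - mean) ** 2 * fi for x, fi in zip(xs, f)]  # column (7)
variance = sum(var_terms)

print(f"mean = {mean:.6f}, dispersion = {variance:.4f}")
```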
Table E.6
Input data Empirical data processing Theoretical data processing
xi Ni fi f∑> f∑< x̄=xi*fi s²=(xi-x̄)²*fi P(X=x) Ni' Ni²/Ni'
(1) (2) (3) (4) (5) (6) (7) (8) (9) (10)
0 25 0,069444 0,069444 1 0 0,576222 0,056031 20,17126 30,98468
1 57 0,158333 0,227778 0,930556 0,158333 0,559944 0,161562 58,16232 55,86091
2 78 0,216667 0,444444 0,772222 0,433333 0,167999 0,232837 83,82146 72,58285
3 73 0,202778 0,647222 0,555556 0,608333 0,002893 0,223619 80,50302 66,19628
4 66 0,183333 0,830556 0,352778 0,733333 0,229745 0,161014 57,96489 75,14894
5 35 0,097222 0,927778 0,169444 0,486111 0,436727 0,092713 33,37662 36,70233
6 16 0,044444 0,972222 0,072222 0,266667 0,432486 0,04447 16,00933 15,99068
7 10 0,027778 1 0,027778 0,194444 0,471384 0,018276 6,579478 15,19877
∑ 360 1- 2,880556 2,8774 356,5884 368,6654
[Figure: relative frequency f and cumulative frequencies f> and f<, plotted against xi]
Table E.8.
Input data Empirical data processing Theoretical data processing
xi Ni fi f∑> f∑< x̄=xi*fi s²=(xi-x̄)²*fi P(X=x) Ni' Ni²/Ni'
(1) (2) (3) (4) (5) (6) (7) (8) (9) (10)
0 25 0,069444 0,069444 1 0 0,576222 0,056031 20,17126 30,98468
1 57 0,158333 0,227778 0,930556 0,158333 0,559944 0,161562 58,16232 55,86091
2 78 0,216667 0,444444 0,772222 0,433333 0,167999 0,232837 83,82146 72,58285
3 73 0,202778 0,647222 0,555556 0,608333 0,002893 0,223619 80,50302 66,19628
4 66 0,183333 0,830556 0,352778 0,733333 0,229745 0,161014 57,96489 75,14894
5 35 0,097222 0,927778 0,169444 0,486111 0,436727 0,092713 33,37662 36,70233
6 16 0,044444 0,972222 0,072222 0,266667 0,432486 0,04447 16,00933 15,99068
7 10 0,027778 1 0,027778 0,194444 0,471384 0,018276 6,579478 15,19877
∑ 360 1- 2,880556 2,8774 356,5884 368,6654
Calculating the dispersion (variance) of the random variable, denoted D², s² or σ²; in this
example it is denoted s² (total of column 7).
Comparing the mean with the dispersion:
Mean = 2,880556 > Dispersion = 2,8774
Choosing the distribution model (random distribution):
Since Mean > Dispersion, the binomial distribution is adopted.
Determining the probabilities that describe the proposed model (binomial distribution,
column 8).
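The text does not say how the binomial parameters n and p behind column (8) were obtained. The sketch below assumes a method-of-moments fit (mean = n·p, dispersion = n·p·(1 − p)), so its probabilities should be close to column (8) but need not match it digit for digit. The function names are hypothetical. Only the "mean > dispersion" branch comes from the text; the Poisson and negative binomial branches follow the usual textbook convention and are labeled as assumptions.

```python
import math

def suggest_model(mean, variance):
    """Pick a counting distribution from the mean/dispersion comparison."""
    if mean > variance:
        return "binomial"           # the case stated in the text
    if mean < variance:
        return "negative binomial"  # assumption: usual convention, not in the text
    return "Poisson"                # assumption: usual convention, not in the text

def moment_binomial_params(mean, variance):
    """Method-of-moments estimate of (n, p); an assumed fitting method."""
    if not 0 < variance < mean:
        raise ValueError("binomial fit needs 0 < variance < mean")
    p_first = 1.0 - variance / mean    # from variance / mean = 1 - p
    n = max(1, round(mean / p_first))  # n must be a whole number of trials
    return n, mean / n                 # re-derive p so that n * p equals the mean

def binomial_pmf(k, n, p):
    """P(X = k) for n trials with success probability p."""
    return math.comb(n, k) * p**k * (1.0 - p) ** (n - k)

mean, variance = 2.880556, 2.8774
model = suggest_model(mean, variance)
n, p = moment_binomial_params(mean, variance)
probs = [binomial_pmf(k, n, p) for k in range(8)]

print(model, n, round(p, 6))
for k, pk in enumerate(probs):
    print(k, round(pk, 6))
```

Rounding n to an integer and then recomputing p keeps the fitted mean equal to the empirical mean, at the cost of a slightly different fitted dispersion.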
Table E.9.
Input data Empirical data processing Theoretical data processing
xi Ni fi f∑> f∑< x̄=xi*fi s²=(xi-x̄)²*fi P(X=x) Ni' Ni²/Ni'
(1) (2) (3) (4) (5) (6) (7) (8) (9) (10)
0 25 0,069444 0,069444 1 0 0,576222 0,056031 20,17126 30,98468
1 57 0,158333 0,227778 0,930556 0,158333 0,559944 0,161562 58,16232 55,86091
2 78 0,216667 0,444444 0,772222 0,433333 0,167999 0,232837 83,82146 72,58285
3 73 0,202778 0,647222 0,555556 0,608333 0,002893 0,223619 80,50302 66,19628
4 66 0,183333 0,830556 0,352778 0,733333 0,229745 0,161014 57,96489 75,14894
5 35 0,097222 0,927778 0,169444 0,486111 0,436727 0,092713 33,37662 36,70233
6 16 0,044444 0,972222 0,072222 0,266667 0,432486 0,04447 16,00933 15,99068
7 10 0,027778 1 0,027778 0,194444 0,471384 0,018276 6,579478 15,19877
∑ 360 1- 2,880556 2,8774 0,990523 356,5884 368,6654
Table E.10.
Input data Empirical data processing Theoretical data processing
xi Ni fi f∑> f∑< x̄=xi*fi s²=(xi-x̄)²*fi P(X=x) Ni' Ni²/Ni'
(1) (2) (3) (4) (5) (6) (7) (8) (9) (10)
0 25 0,069444 0,069444 1 0 0,576222 0,056031 20,17126 30,98468
1 57 0,158333 0,227778 0,930556 0,158333 0,559944 0,161562 58,16232 55,86091
2 78 0,216667 0,444444 0,772222 0,433333 0,167999 0,232837 83,82146 72,58285
3 73 0,202778 0,647222 0,555556 0,608333 0,002893 0,223619 80,50302 66,19628
4 66 0,183333 0,830556 0,352778 0,733333 0,229745 0,161014 57,96489 75,14894
5 35 0,097222 0,927778 0,169444 0,486111 0,436727 0,092713 33,37662 36,70233
6 16 0,044444 0,972222 0,072222 0,266667 0,432486 0,04447 16,00933 15,99068
7 10 0,027778 1 0,027778 0,194444 0,471384 0,018276 6,579478 15,19877
∑ 360 1- 2,880556 2,8774 0,990523 356,5884 368,6654
PROJECT EXAMPLE
The model, i.e. the binomial distribution, is validated with the test known as the
"Chi-square criterion" or "χ² criterion".
For this purpose, the absolute theoretical frequency is determined in column (9) of
the table above, using the relation:
$$N_i' = P(X = x_i) \cdot N$$
As a check on the calculations: when the probabilities in the preceding column sum to 1,
the absolute theoretical frequencies sum to N, the total number of observations. In the
example the two sums are 0,990523 and 356,5884, against N = 360.
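Column (9) and the accompanying check can be sketched in a few lines of Python. The probabilities are copied from column (8) of the table; the variable names are illustrative, and small differences in the last digits against column (9) come from the rounding of the printed probabilities.

```python
n_total = 360  # total number of observations N

# Column (8): P(X = xi) for xi = 0..7, as printed in the table.
p_col8 = [0.056031, 0.161562, 0.232837, 0.223619,
          0.161014, 0.092713, 0.04447, 0.018276]

# Column (9): absolute theoretical frequencies Ni' = P(X = xi) * N.
n_theor = [p * n_total for p in p_col8]

sum_p = sum(p_col8)
sum_n_theor = sum(n_theor)

print(f"sum of probabilities        = {sum_p:.6f}")
print(f"sum of theoretical Ni'      = {sum_n_theor:.4f}")
print(f"shortfall against N         = {n_total - sum_n_theor:.4f}")
```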
The next step is to determine the χ² value, given by the sum of the squared deviations of
the empirical frequencies from the theoretical frequencies, each divided by the theoretical
frequency:
$$\chi^2 = \sum_{i=0}^{n} \frac{(N_i - N_i')^2}{N_i'} = \sum_{i=0}^{n} \frac{N_i^2}{N_i'} - N.$$
The second form follows by expanding the square and using $\sum N_i = \sum N_i' = N$.
In column (10) only the ratios $N_i^2 / N_i'$ are determined; summed up in the last line,
they allow the χ² criterion value to be determined.
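A Python sketch of column (10) and of the two forms of the χ² sum is given here, using the Ni and Ni' values printed in columns (2) and (9). The variable names are illustrative. The two forms agree exactly only when the theoretical frequencies sum to N, so the sketch computes both and reports the difference rather than assuming they coincide.

```python
observed = [25, 57, 78, 73, 66, 35, 16, 10]          # column (2): Ni
theoretical = [20.17126, 58.16232, 83.82146, 80.50302,
               57.96489, 33.37662, 16.00933, 6.579478]  # column (9): Ni'

n_total = sum(observed)

# Column (10): ratios Ni^2 / Ni', summed in the last line of the table.
ratios = [o * o / t for o, t in zip(observed, theoretical)]
sum_ratios = sum(ratios)

# Shortcut form: sum(Ni^2 / Ni') - N.
chi2_shortcut = sum_ratios - n_total

# Direct form: sum((Ni - Ni')^2 / Ni').
chi2_direct = sum((o - t) ** 2 / t for o, t in zip(observed, theoretical))

print(f"sum of column (10) = {sum_ratios:.4f}")
print(f"chi2, shortcut     = {chi2_shortcut:.4f}")
print(f"chi2, direct       = {chi2_direct:.4f}")
print(f"difference         = {chi2_shortcut - chi2_direct:.4f}")
```

The difference between the two results equals N minus the sum of the Ni', which is why the check on the sum of column (9) matters before the shortcut form is used.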
Table E.11.
Input data Empirical data processing Theoretical data processing
xi Ni fi f∑> f∑< x̄=xi*fi s²=(xi-x̄)²*fi P(X=x) Ni' Ni²/Ni'
(1) (2) (3) (4) (5) (6) (7) (8) (9) (10)
0 25 0,069444 0,069444 1 0 0,576222 0,056031 20,17126 30,98468
1 57 0,158333 0,227778 0,930556 0,158333 0,559944 0,161562 58,16232 55,86091
2 78 0,216667 0,444444 0,772222 0,433333 0,167999 0,232837 83,82146 72,58285
3 73 0,202778 0,647222 0,555556 0,608333 0,002893 0,223619 80,50302 66,19628
4 66 0,183333 0,830556 0,352778 0,733333 0,229745 0,161014 57,96489 75,14894
5 35 0,097222 0,927778 0,169444 0,486111 0,436727 0,092713 33,37662 36,70233
6 16 0,044444 0,972222 0,072222 0,266667 0,432486 0,04447 16,00933 15,99068
7 10 0,027778 1 0,027778 0,194444 0,471384 0,018276 6,579478 15,19877
∑ 360 1- 2,880556 2,8774 0,990523 356,5884 368,6654
• Comparison of distributions obtained for the three time headways, for each recording.
The traffic values obtained also make it possible to draw up the graphs needed for
comparisons of:
• the importance of the various categories of vehicles on each traffic lane, for a single
recording;
• the importance of the various categories of vehicles on the same traffic lane, for
recordings conducted on different days and at different hours;
• the arrival times of public transport vehicles (for example, the frequency of buses or
trolleybuses).
Table E.12: χ2 values for a certain safety threshold δ.
ν  χ²_{0,995}  χ²_{0,99}  χ²_{0,975}  χ²_{0,95}  χ²_{0,05}  χ²_{0,025}  χ²_{0,01}  χ²_{0,005}  ν
Interpreting the graphs and drawing conclusions with regard to vehicle arrival times at a point.
Students will come up with personal conclusions, according to their capacity to analyze
and synthesize the data collected; students will partake in team discussions and will interpret
the results so as to draw up a common report.