[1] Random Variables and Exploratory Data Analysis

The document discusses random phenomena in civil and environmental engineering, emphasizing the variability and unpredictability of certain processes. It introduces key statistical concepts such as random variables, populations, samples, and measures of central tendency and dispersion, which are essential for analyzing data. Additionally, it explains how statistical methods can quantify uncertainty and draw inferences about populations based on sample data.

Uploaded by

Mrs Aamir
Copyright
© All Rights Reserved

RANDOM VARIABLES AND EXPLORATORY DATA ANALYSIS

1 RANDOM PHENOMENA
Many processes that are encountered in civil and environmental engineering disciplines are
subject to chance in that they exhibit substantial variability in time and/or space that cannot be
fully explained by physical laws. Variability means that successive observations of a system do
not produce the same results. Often, when we refer to these phenomena, we use the term random.
The term random is common in geophysical sciences and engineering and it conveys the idea of
the occurrence of a phenomenon that is uncertain. To put it another way, the occurrences of the
phenomena are not predictable with certainty. For instance, the speed of vehicles at a highway
location is random, since the speed of any individual vehicle cannot be determined in advance
with certainty. Likewise, the flow volume and discharge of a river cannot be determined with
certainty at any time or location.
In order to properly describe and analyze random phenomena mathematically, it is necessary to
define additional terminology such as random events and random variables. To talk about
random events, it is useful to introduce the concept of random experiments and sample space.
Consider the measurement of the speed of vehicles passing a specific location at a given time as
an experiment. The outcome varies from measurement to measurement. Thus, the measurements
can be considered to have a random component. An experiment that can result in different
outcomes, even though it is repeated in the same manner every time, is called a random
experiment. The set of all possible outcomes of an experiment is called the sample space for the
experiment.
In a random experiment, a variable whose value can change from one replicate of the experiment
to another is referred to as a random variable. A random variable is discrete if its possible values
come from a discrete set. For example, gender and race are discrete random variables. Note that
the set of possible values for a discrete random variable may be infinite, e.g., the set of all
integers is a discrete set. A random variable is continuous if its values come from interval(s)
(either finite or infinite) of real numbers. For example, speed of vehicles at a highway location is
a continuous random variable with possible values that may vary in the [10, 140] mph range.
In an experiment, a measurement is usually denoted by a variable, e.g. X. An uppercase letter is
used to denote a variable. For example, in the traffic example, X = speed of vehicles passing the
specified location. The measured value of a variable is denoted by a lowercase letter, e.g., x. In
the traffic data example shown in Fig 1, 𝑋1 = 65.7 𝑚𝑝ℎ. Thus, the sample of 35 measurements
may be denoted as 𝑋 = {𝑋1 , 𝑋2 , … , 𝑋35 } = {65.7,66.7, … ,54.9} 𝑚𝑝ℎ. The number of random
measurements (or observations) is called the sample size, which may be denoted by N.

CIVE 203 Class Notes - For resident students only - Do not distribute
Fig 1. Illustration of N random observations of speed of vehicles

2 POPULATION VERSUS SAMPLE


The field of statistics deals with methods for drawing inferences about the properties of a
population based on the properties of a sample from the population. However, statistics goes
beyond merely representing the properties of the population. Statistical measures are also used to
quantify uncertainty in knowledge about the population. Statistical methods are employed to
collect and analyze data to make decisions, solve problems, and design systems. In simple terms,
statistics is the science of data.
The fundamental concept in statistics is the population, which refers to a set of events (or
objects) whose measurable outcomes and properties are of interest. A population consists of the
set of all possible outcomes for a random variable. A sample is a subset of the population that is
collected via laboratory experiments or monitoring. Populations of interest can be finite and
enumerated explicitly. For example, we may be interested in the number of vehicles passing
through a certain highway intersection per minute. Populations can also be infinite, as in the
speed of vehicles passing through a road intersection.

3 DEFINITION OF STATISTICS AND PROBABILITY


Probability provides a theoretical underpinning for statistical methods. Probability deals with
methods for quantifying the likelihood of an event given known properties of the population. For
example, one may use probability methods to compute the likelihood of annual maximum speed
of vehicles at a highway location exceeding 100 mph, given that mean and standard deviation of
annual maximum speed at the location are 90 mph and 12 mph, respectively.
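The exceedance probability in this example can be sketched in Python. The notes give only the mean and standard deviation, so the sketch assumes (as an illustration, not as part of the original example) that the annual maximum speed follows a normal distribution. A different distribution would give a different answer.

```python
from statistics import NormalDist

# Given in the example
mean_speed = 90.0   # mph
std_speed = 12.0    # mph
threshold = 100.0   # mph

# ASSUMPTION (not stated in the notes): annual maximum speed is normally distributed.
annual_max = NormalDist(mu=mean_speed, sigma=std_speed)

# P(X > 100) = 1 - F(100), where F is the cumulative distribution function
p_exceed = 1.0 - annual_max.cdf(threshold)
print(f"P(annual maximum speed > {threshold:.0f} mph) = {p_exceed:.3f}")
```

Under the normality assumption, the result is roughly a one-in-five chance of exceeding 100 mph in a given year.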
Conversely, statistics deals with methods for drawing inferences about the properties (e.g., mean
or variance) of a population based on a given sample. For example, one may postulate that mean
chloride concentration in a drinking water well exceeds the maximum safe drinking water level
(e.g., 50 milligrams per liter). Suppose one had collected 35 samples of water from the well and
measured chloride concentration in each sample. One could ask whether, based on the
information in the sample, the mean of the population is greater than 50 mg/L, which would be a
hypothesis test.

4 BASIC CONCEPTS OF STATISTICS


When independent experiments are conducted repeatedly, as in flipping a coin, the relative
frequency of events often appears to approach a limit even though the outcomes of individual
experiments remain uncertain and are defined by chance. This effect is called statistical
regularity. In laboratory experiments, one can develop confidence that statistical regularity is
present by repeating experiments under nearly identical conditions. However, much of the data
that arise in civil and environmental engineering are observational rather than experimental. For
these data, statistical regularity cannot be demonstrated by repetition of the same experiment. For
example, we cannot repeat the experiment of a severe drought or flood. Thus, the justification for
the use of statistics and probability in civil and environmental engineering rests, in most cases,
on the insight that statistical methods provide into the expected magnitude and variability of
future observations.
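Statistical regularity in the coin-flip case can be illustrated with a small simulation. This is only a sketch: the function name, the seed, and the numbers of flips are arbitrary choices made for the illustration, and a simulated fair coin stands in for a physical one.

```python
import random


def relative_frequency_of_heads(n_flips, rng):
    """Flip a simulated fair coin n_flips times and return the fraction of heads."""
    heads = sum(rng.random() < 0.5 for _ in range(n_flips))
    return heads / n_flips


rng = random.Random(203)  # fixed seed so the run is repeatable
for n in (10, 100, 1_000, 10_000, 100_000):
    freq = relative_frequency_of_heads(n, rng)
    print(f"N = {n:>6d}   relative frequency of heads = {freq:.4f}")
```

Individual flips stay unpredictable, but the printed relative frequency should settle closer to 0.5 as the number of flips grows.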
The temporal and spatial variability of random processes that are commonly encountered in
engineering systems may be characterized by statistical analysis of empirical data (observations).
For this purpose, several statistical methods are available to measure various properties of a
random sample:
• Central tendency: mean, median, mode
• Dispersion: range, standard deviation, variance, coefficient of variation
• Asymmetry: skewness coefficient, kurtosis coefficient
For instance, the sample variance is one of the most important statistical characteristics of a
random sample, which provides some relevant information about the variability of the data.
Other statistics such as temporal and spatial correlations are important for describing the degree
of association and dependence that observations taken at various points in time and space may
possess. While sample statistics are important, often the frequency distribution is also needed in
order to observe numerically or graphically how the data is distributed and to make frequency
(probability) statements about the data.

5 MEASURES OF CENTRAL TENDENCY

Sample Mean
The sample mean measures the central tendency of a given sample. If 𝑋 = {𝑋1 , 𝑋2 , … , 𝑋𝑁 }
represents a sample or a sequence (series) of observations, where N is the sample size or the
number of observations, the sample mean (𝑋̅) can be determined by:

$$\bar{X} = \frac{1}{N} \sum_{i=1}^{N} X_i$$

The sample mean 𝑋̅ is also referred to as the sample arithmetic mean.


Alternative measures of the mean are the geometric mean, the harmonic mean, and the root mean
square. The sample geometric mean (𝑋̅𝐺 ) is estimated as:
$$\bar{X}_G = (X_1 X_2 \cdots X_N)^{1/N} = \left( \prod_{i=1}^{N} X_i \right)^{1/N}$$

Likewise, the sample harmonic mean (𝑋̅𝐻 ) is estimated as:

$$\bar{X}_H = \frac{1}{\frac{1}{N}\left( \frac{1}{X_1} + \frac{1}{X_2} + \cdots + \frac{1}{X_N} \right)} = \frac{N}{\sum_{i=1}^{N} \frac{1}{X_i}}; \quad X_i > 0$$

The sample root mean square (𝑋̅𝑅 ) is determined as:


$$\bar{X}_R = \left[ \frac{1}{N} \left( X_1^2 + X_2^2 + \cdots + X_N^2 \right) \right]^{1/2} = \left[ \frac{1}{N} \sum_{i=1}^{N} X_i^2 \right]^{1/2}$$

For a sample of positive values, it may be shown that 𝑋̅𝐻 ≤ 𝑋̅𝐺 ≤ 𝑋̅ ≤ 𝑋̅𝑅 , with equality only when
all values are equal. Also note that the geometric mean is equal to zero if at least one of the
values is zero, and that the harmonic mean is undefined if any value is zero.

Sample Weighted Arithmetic mean:


Samples of discrete random variables may contain repeated values. In these cases, each value is
weighted by the number of observations of each value (𝑁𝑗 ):
$$\bar{X} = \frac{1}{N} \sum_{j=1}^{K} N_j X_j$$

where 𝐾 is the number of discrete options with $\sum_{j=1}^{K} N_j = N$.
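As a small illustration of the weighted formula, the sketch below uses made-up counts (vehicles per minute and the number of minutes in which each count was observed). These numbers are hypothetical and do not come from the notes. It also checks that the weighted mean agrees with the ordinary mean of the expanded sample.

```python
# HYPOTHETICAL data for illustration only: X_j (vehicles per minute) -> N_j (minutes observed)
counts = {0: 2, 1: 5, 2: 2, 3: 1}

N = sum(counts.values())  # total number of observations, sum of N_j
weighted_mean = sum(n_j * x_j for x_j, n_j in counts.items()) / N

# Expanding the table back into individual observations gives the same mean
expanded = [x_j for x_j, n_j in counts.items() for _ in range(n_j)]
plain_mean = sum(expanded) / len(expanded)

print(f"N = {N}, weighted mean = {weighted_mean:.2f}, plain mean = {plain_mean:.2f}")
```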

Example 1: Compute sample mean for the following 35 random observations of speed of
vehicles at a road segment:

𝑋 = {65.7,66.7,67.8,72.2,67.0,68.2,68.6,65.5,67.4,64.4,70.2,66.7,68.9,70.1,70.2,70.6,69.0,70.3,
67.4, 68.8,67.4,66.5,61.5,69.1,71.0,66.4,68.6,68.3,70.9,70.6,72.5,66.9,57.4,54.4,54.9} mph

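Solution:

$$\bar{X} = \frac{1}{N} \sum_{i=1}^{N} X_i = \frac{1}{35} (65.7 + 66.7 + \cdots + 54.9) = 67.2 \text{ mph}$$

$$\bar{X}_G = \left( \prod_{i=1}^{N} X_i \right)^{1/N} = (65.7 \times 66.7 \times \cdots \times 54.9)^{1/35} = 67.1 \text{ mph}$$

$$\bar{X}_H = \frac{N}{\sum_{i=1}^{N} \frac{1}{X_i}} = \frac{35}{\frac{1}{65.7} + \frac{1}{66.7} + \cdots + \frac{1}{54.9}} = 66.9 \text{ mph}$$

$$\bar{X}_R = \sqrt{\frac{1}{N} \sum_{i=1}^{N} X_i^2} = \sqrt{\frac{1}{35} \left( 65.7^2 + 66.7^2 + \cdots + 54.9^2 \right)} = 67.3 \text{ mph}$$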
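The four means in Example 1 can be checked with a few lines of Python. This is a minimal sketch using only the standard library; the variable names are chosen for the illustration. The geometric mean is computed through logarithms, which is equivalent to the product formula but avoids forming a very large product.

```python
import math

# Speed observations from Example 1 (mph)
speeds = [
    65.7, 66.7, 67.8, 72.2, 67.0, 68.2, 68.6, 65.5, 67.4, 64.4, 70.2, 66.7,
    68.9, 70.1, 70.2, 70.6, 69.0, 70.3, 67.4, 68.8, 67.4, 66.5, 61.5, 69.1,
    71.0, 66.4, 68.6, 68.3, 70.9, 70.6, 72.5, 66.9, 57.4, 54.4, 54.9,
]
N = len(speeds)

arithmetic = sum(speeds) / N
geometric = math.exp(sum(math.log(x) for x in speeds) / N)  # requires X_i > 0
harmonic = N / sum(1.0 / x for x in speeds)                 # requires X_i > 0
rms = math.sqrt(sum(x * x for x in speeds) / N)

for name, value in [("arithmetic", arithmetic), ("geometric", geometric),
                    ("harmonic", harmonic), ("root mean square", rms)]:
    print(f"{name:>16s} mean = {value:.1f} mph")
```

Because the speeds vary by only a few percent around their mean, the four values differ only in the first decimal place.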

Sample Median
The median is another measure of central tendency of a given sample. The sample median,
denoted by 𝑋𝑚 , is the value such that half of the values of the sample lie on either side of 𝑋𝑚 .
Let 𝑌1 ≤ 𝑌2 ≤ ⋯ ≤ 𝑌𝑁 denote the ordered values (smallest to largest) of the random sample
𝑋1 , 𝑋2 , … , 𝑋𝑁 . The sample median is determined as:
$$X_m = Y_{(N+1)/2} \quad \text{if } N \text{ is odd}$$
$$X_m = \frac{1}{2} \left[ Y_{N/2} + Y_{(N/2)+1} \right] \quad \text{if } N \text{ is even}$$

Often the sample median is preferred over the sample mean because the former is far less
sensitive to outlier observations.

Example 2: Compute sample median for the speed data in example 1.


Solution:
Sort the observations in an ascending order (smallest to largest) and find the middle
value(s):
𝑌 = {54.4, 54.9, 57.4, 61.5, 64.4, 65.5,65.7,66.4,66.5,66.7,66.7,66.9,67.0,67.4,67.4,67.4,
67.8,68.2,68.3,68.6,68.6,68.8,68.9,69.0,69.1,70.1,70.2,70.2,70.3,70.6,70.6,70.9,71.0,
72.2,72.5}
Since N = 35 is odd:
→ $X_m = Y_{(N+1)/2} = Y_{18} = 68.2$ mph
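The odd and even cases of the median definition can be written as a short Python function. This is a sketch; `sample_median` is a name chosen for the illustration, and Python's zero-based indexing shifts the positions by one relative to the formulas.

```python
def sample_median(values):
    """Median from the ordered sample, handling odd and even sample sizes."""
    y = sorted(values)           # Y_1 <= Y_2 <= ... <= Y_N
    n = len(y)
    mid = n // 2
    if n % 2 == 1:
        return y[mid]            # Y_{(N+1)/2} in 1-based notation
    return 0.5 * (y[mid - 1] + y[mid])   # average of Y_{N/2} and Y_{N/2 + 1}


# Speed observations from Example 1 (mph)
speeds = [
    65.7, 66.7, 67.8, 72.2, 67.0, 68.2, 68.6, 65.5, 67.4, 64.4, 70.2, 66.7,
    68.9, 70.1, 70.2, 70.6, 69.0, 70.3, 67.4, 68.8, 67.4, 66.5, 61.5, 69.1,
    71.0, 66.4, 68.6, 68.3, 70.9, 70.6, 72.5, 66.9, 57.4, 54.4, 54.9,
]
print(f"Sample median = {sample_median(speeds)} mph")
```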

Sample Mode
The sample mode (𝑋̂) is the most frequent observation. For continuous random variables, the
sample mode may be obtained from the histogram of the empirical data.

Example 3: Compute sample mode for the speed data in example 1.


Solution:
Since speed of vehicles is a continuous random variable, compute the histogram of the
observations, and then find the center of the bin with the highest frequency:
→ 𝑋̂ = 68.0 𝑚𝑝ℎ
More information about histograms is presented in the Statistical Visualization section.
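The histogram-based mode in Example 3 can be sketched as follows. The notes do not state the classes used, so this sketch assumes six equal-width classes spanning the minimum to the maximum observation (six being the class count the notes later obtain for N = 35). The function name `histogram_mode` is illustrative, and a different choice of classes can shift the result.

```python
def histogram_mode(values, n_bins):
    """Return (center of the fullest class, class counts) for equal-width classes."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_bins
    counts = [0] * n_bins
    for x in values:
        k = min(int((x - lo) / width), n_bins - 1)  # the maximum falls in the last class
        counts[k] += 1
    k_max = counts.index(max(counts))
    center = lo + (k_max + 0.5) * width
    return center, counts


# Speed observations from Example 1 (mph)
speeds = [
    65.7, 66.7, 67.8, 72.2, 67.0, 68.2, 68.6, 65.5, 67.4, 64.4, 70.2, 66.7,
    68.9, 70.1, 70.2, 70.6, 69.0, 70.3, 67.4, 68.8, 67.4, 66.5, 61.5, 69.1,
    71.0, 66.4, 68.6, 68.3, 70.9, 70.6, 72.5, 66.9, 57.4, 54.4, 54.9,
]
mode, counts = histogram_mode(speeds, n_bins=6)  # ASSUMPTION: six equal-width classes
print(f"class counts = {counts}, sample mode = {mode:.1f} mph")
```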

6 MEASURES OF DISPERSION
The sample standard deviation (𝑠) measures the dispersion of sample values around the sample
mean. The sample variance is the square of the standard deviation and is denoted by $s^2$. The
sample standard deviation, based on the unbiased estimator of the population variance, is
computed as:
$$s = \left[ \frac{1}{N-1} \sum_{i=1}^{N} (X_i - \bar{X})^2 \right]^{1/2}$$

where 𝑁 is the sample size and 𝑋̅ denotes the sample mean, while 𝑠 2 is also commonly used to
denote the unbiased sample variance.
Samples of discrete random variables may contain repeated values. In these cases, each value is
weighted by the number of observations of each value (𝑁𝑗 ):
$$s = \left[ \frac{1}{N-1} \sum_{j=1}^{K} N_j (X_j - \bar{X})^2 \right]^{1/2}$$
where 𝐾 is the number of discrete options with $\sum_{j=1}^{K} N_j = N$.

The sample coefficient of variation is a dimensionless dispersion statistic that is equal to the
ratio of the sample standard deviation (𝑠) and the sample mean (𝑋̅), i.e.
$$\hat{\eta} = C_v = \frac{s}{\bar{X}}$$
The coefficient of variation gives a measure of the uncertainty of a sample relative to the mean.
When an ordered set of data is divided into four equal parts, the division points are called
quartiles. The first quartile (𝑄1) or lower quartile is a value that has approximately 25% of
observations below and approximately 75% of observations above it. The third quartile (𝑄3 ) or
upper quartile has approximately 75% of observations below its value. Similar to the sample
median, the first and third quartiles of a sample may be obtained from the ordered sample values.
Other measures of dispersion or variability of sample data include the range (R), interquartile
range (IQR), and mean absolute deviation (MAD). The range, the difference between the
maximum and the minimum, is a crude measure of dispersion. Instead, the range of some
specific quantiles such as the 25% and 75% quantiles (i.e., the first and third quartiles,
respectively) may be used. The mean absolute deviation is the average of the absolute deviations
of the sample values from the sample mean. These measures of dispersion are summarized below:
$$R = X_{max} - X_{min}$$
$$IQR = Q_3 - Q_1$$
$$MAD = \frac{1}{N} \sum_{i=1}^{N} |X_i - \bar{X}|$$
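These three measures can be sketched for the Example 1 speed data using Python's standard library. Several conventions exist for computing quartiles from a sample and the notes do not fix one, so the quartile values here reflect the default ("exclusive") method of `statistics.quantiles` and should be read as one reasonable choice rather than the method intended by the notes.

```python
import statistics

# Speed observations from Example 1 (mph)
speeds = [
    65.7, 66.7, 67.8, 72.2, 67.0, 68.2, 68.6, 65.5, 67.4, 64.4, 70.2, 66.7,
    68.9, 70.1, 70.2, 70.6, 69.0, 70.3, 67.4, 68.8, 67.4, 66.5, 61.5, 69.1,
    71.0, 66.4, 68.6, 68.3, 70.9, 70.6, 72.5, 66.9, 57.4, 54.4, 54.9,
]
mean = statistics.fmean(speeds)

sample_range = max(speeds) - min(speeds)          # R = X_max - X_min

# ASSUMPTION: quartiles by the library's default "exclusive" method
q1, q2, q3 = statistics.quantiles(speeds, n=4)
iqr = q3 - q1                                     # IQR = Q3 - Q1

mad = sum(abs(x - mean) for x in speeds) / len(speeds)   # mean absolute deviation

print(f"R = {sample_range:.1f} mph, Q1 = {q1:.1f}, Q3 = {q3:.1f}, "
      f"IQR = {iqr:.1f} mph, MAD = {mad:.2f} mph")
```

The range is driven entirely by the two extreme observations, whereas the IQR and MAD are much smaller, which illustrates why the range is described as a crude measure.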

7 MEASURES OF ASYMMETRY
The sample skewness coefficient indicates the degree of asymmetry of the frequency distribution
of the sample data. It may be computed by:
$$\hat{\gamma} = \frac{\sum_{i=1}^{N} (X_i - \bar{X})^3}{N s^3}$$
where 𝑁 and 𝑠 are the sample size and standard deviation, respectively. Division by the cube of
the sample standard deviation (s) gives a dimensionless measure. However, this equation is a
biased estimator of the population skewness coefficient. A bias-corrected sample skewness
coefficient, commonly referred to as the unbiased estimator, is:
$$\hat{\gamma} = \frac{N \sum_{i=1}^{N} (X_i - \bar{X})^3}{(N-1)(N-2) s^3}$$
Samples of discrete random variables may contain repeated values. In these cases, each value is
weighted by the number of observations of each value (𝑁𝑗 ):
$$\hat{\gamma} = \frac{\sum_{j=1}^{K} N_j (X_j - \bar{X})^3}{N s^3}$$
or (for the bias-corrected estimator):
$$\hat{\gamma} = \frac{N \sum_{j=1}^{K} N_j (X_j - \bar{X})^3}{(N-1)(N-2) s^3}$$
where 𝐾 is the number of discrete options with $\sum_{j=1}^{K} N_j = N$.

The skewness coefficient has an important meaning since it gives an indication of the symmetry
of the distribution of the data. Symmetrical frequency distributions have small or negligible
sample skewness coefficient while asymmetrical distributions have large positive (skewed to the
right, with a longer right tail) or negative (skewed to the left, with a longer left tail)
coefficients. A small value of |𝛾̂| may indicate that the frequency distribution of the sample may
be approximated by the normal distribution function since 𝛾 = 0 for the normal distribution.

No skew: |𝛾̂| ≈ 0 Positive skew: 𝛾̂ ≫ 0 Negative skew: 𝛾̂ ≪ 0


Fig 2. Illustration of the frequency distribution of random variable with different skewness coefficients

The sample kurtosis coefficient measures the peakedness or the flatness of the frequency
distribution near its mean. It can be estimated by:
$$\hat{\kappa} = \frac{\sum_{i=1}^{N} (X_i - \bar{X})^4}{N s^4}$$
where 𝑁 and 𝑠 are the sample size and standard deviation, respectively. Division by $s^4$ gives a
dimensionless coefficient. This equation gives a biased estimator of the population kurtosis
coefficient. A bias-corrected estimator of the population kurtosis coefficient is:
$$\hat{\kappa} = \frac{N^2 \sum_{i=1}^{N} (X_i - \bar{X})^4}{(N-1)(N-2)(N-3) s^4}$$
Figure 3 illustrates the frequency distribution of random variables with different kurtosis
coefficients.

Fig 3. Illustration of the frequency distribution of a random variable with different kurtosis coefficients
(curves labeled positive kurtosis, normal, and negative kurtosis)

A related coefficient, called the excess coefficient, is defined by 𝜀̂ = 𝜅̂ − 3. For the Gaussian
(normal) distribution, 𝜅 = 3 and 𝜀 = 0. Positive values of 𝜀̂ indicate that a frequency distribution
is more peaked around its mean than the Gaussian distribution, while negative values indicate that
the frequency distribution is flatter around its mean than the Gaussian distribution.

Example 4: Compute sample variance, coefficient of variation, skewness coefficient, and
kurtosis coefficient for the speed data in example 1.
Solution:
Sample standard deviation:
$$s = \left[ \frac{1}{N-1} \sum_{i=1}^{N} (X_i - \bar{X})^2 \right]^{1/2} = \left[ \frac{1}{35-1} \left( (65.7 - 67.2)^2 + (66.7 - 67.2)^2 + \cdots + (54.9 - 67.2)^2 \right) \right]^{1/2} = 4.26 \text{ mph}$$

Sample variance:
$$s^2 = 18.17 \text{ mph}^2$$

Sample coefficient of variation:
$$\hat{\eta} = C_v = \frac{s}{\bar{X}} = \frac{4.26}{67.2} = 0.06$$

Sample skewness coefficient:
$$\hat{\gamma} = \frac{N \sum_{i=1}^{N} (X_i - \bar{X})^3}{(N-1)(N-2) s^3} = \frac{35 \left[ (65.7 - 67.2)^3 + (66.7 - 67.2)^3 + \cdots + (54.9 - 67.2)^3 \right]}{34 \times 33 \times 4.26^3} = -1.82$$
→ Sample distribution is heavily negatively skewed.

Sample kurtosis coefficient:
$$\hat{\kappa} = \frac{N^2 \sum_{i=1}^{N} (X_i - \bar{X})^4}{(N-1)(N-2)(N-3) s^4} = \frac{35^2 \left[ (65.7 - 67.2)^4 + (66.7 - 67.2)^4 + \cdots + (54.9 - 67.2)^4 \right]}{34 \times 33 \times 32 \times 4.26^4} = 6.47$$
→ Sample is highly peaked.
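The calculations in Example 4 can be sketched in Python with the bias-corrected formulas given in Sections 6 and 7 (the kurtosis uses the $N^2$ numerator and fourth powers). Variable names are chosen for the illustration, and the results should agree with the values quoted in the example to within rounding.

```python
import math

# Speed observations from Example 1 (mph)
speeds = [
    65.7, 66.7, 67.8, 72.2, 67.0, 68.2, 68.6, 65.5, 67.4, 64.4, 70.2, 66.7,
    68.9, 70.1, 70.2, 70.6, 69.0, 70.3, 67.4, 68.8, 67.4, 66.5, 61.5, 69.1,
    71.0, 66.4, 68.6, 68.3, 70.9, 70.6, 72.5, 66.9, 57.4, 54.4, 54.9,
]
N = len(speeds)
mean = sum(speeds) / N
dev = [x - mean for x in speeds]                  # deviations X_i - mean

variance = sum(d ** 2 for d in dev) / (N - 1)     # s^2 (mph^2)
s = math.sqrt(variance)                           # s (mph)
cv = s / mean                                     # coefficient of variation

# Bias-corrected skewness and kurtosis coefficients, as written in the notes
skew = N * sum(d ** 3 for d in dev) / ((N - 1) * (N - 2) * s ** 3)
kurt = N ** 2 * sum(d ** 4 for d in dev) / ((N - 1) * (N - 2) * (N - 3) * s ** 4)
excess = kurt - 3.0

print(f"s = {s:.2f} mph, s^2 = {variance:.2f} mph^2, Cv = {cv:.2f}")
print(f"skewness = {skew:.2f}, kurtosis = {kurt:.2f}, excess = {excess:.2f}")
```

The strongly negative skewness and the large kurtosis are both driven mainly by the three slowest observations (54.4, 54.9, and 57.4 mph), which lie far below the rest of the sample.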

8 STATISTICAL VISUALIZATION

Scatter Plot
A scatter plot depicts values for two variables for a set of data. Data points are typically
displayed as markers with no line segments connecting them. Figure 4 shows an example of a
scatter plot for 20 observations of concrete compressive strength (y-axis) versus concrete density
(x-axis).

Fig 4. Scatter plot of compressive strength versus density of concrete

Time Series Plot
A time series plot is a graph in which the observations are displayed in the order in which they
occur (in time): the y-axis denotes the observed values and the x-axis denotes the time (which
could be minutes, days, years, etc.).

Bar Graph
The occurrences of a discrete variable can be summarized on a bar graph. In this type of graph,
the horizontal axis gives the values of the discrete variable and the number of occurrences of
each value is represented by the height of the corresponding vertical bar.

Fig 5. Bar graph of speed of vehicles in Example 1

Histogram
If there are at least, say, 25 observations, one of the most common graphical forms to depict the
frequency of observations is a histogram. To construct a histogram, the data are divided into
groups according to their magnitudes. The horizontal axis (x-axis) of the graph gives the
magnitude of classes while the vertical axis (y-axis) represents the number of observations in
each class (i.e., frequency). Histograms are used to determine the most common values (or
ranges) and symmetry in observed data. It is also common to re-scale the y-axis to show relative
frequency instead of number of occurrences. For each class, relative frequency is the number of
occurrences in the class divided by total number of observations.
Care should be given to the number of classes used for constructing a histogram. Too many classes
will not give a clear picture, while too few classes will cause omission of important features. As
a rule of thumb, the number of classes should be between 5 and 25. An appropriate number of
classes can be obtained as follows:
$$N_c = 1 + 3.322 \log_{10}(N)$$
where N is the sample size. The number of classes may be adjusted to the closest lower integer.
For example, for 𝑁 = 35, 𝑁𝑐 = 1 + 3.322 log10(35) ≈ 6.13, which is rounded down to 6.
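The class-count formula is simple to evaluate for any sample size. The sketch below wraps it in a small function (the name `number_of_classes` is chosen for the illustration) and rounds down to the closest lower integer as described above.

```python
import math


def number_of_classes(n):
    """Number of histogram classes from N_c = 1 + 3.322 log10(N), rounded down."""
    return math.floor(1 + 3.322 * math.log10(n))


for n in (25, 35, 100, 1000):
    print(f"N = {n:>4d}  ->  N_c = {number_of_classes(n)}")
```

Because the formula grows with the logarithm of N, even a sample of 1000 observations calls for only about ten classes, which stays well inside the 5 to 25 rule of thumb.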

Fig 6. Histogram of speed of vehicles in Example 1

Boxplot
A boxplot (Fig. 7) shows the three quartiles on a rectangular box, aligned either horizontally or
vertically. The box encloses the interquartile range (IQR) with the left (or lower) edge at the first
quartile (Q1) and the right (or upper) edge at the third quartile (Q3). A line is drawn at the
second quartile (or median, which is the 50th percentile). Note on the figure below how the upper
and lower whisker lines are drawn and how outliers are determined.
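Since the explanatory figure is not reproduced in this text, the sketch below uses a common convention as an assumption: whiskers extend to the most extreme observations within 1.5 × IQR of the box edges, and observations beyond those limits are flagged as outliers. The quartile method (the default of `statistics.quantiles`) is likewise an assumption, so the cited figure may differ in detail.

```python
import statistics

# Speed observations from Example 1 (mph)
speeds = [
    65.7, 66.7, 67.8, 72.2, 67.0, 68.2, 68.6, 65.5, 67.4, 64.4, 70.2, 66.7,
    68.9, 70.1, 70.2, 70.6, 69.0, 70.3, 67.4, 68.8, 67.4, 66.5, 61.5, 69.1,
    71.0, 66.4, 68.6, 68.3, 70.9, 70.6, 72.5, 66.9, 57.4, 54.4, 54.9,
]

q1, median, q3 = statistics.quantiles(speeds, n=4)   # box edges and center line
iqr = q3 - q1

# ASSUMPTION: 1.5 x IQR rule for the whisker limits
lower_limit = q1 - 1.5 * iqr
upper_limit = q3 + 1.5 * iqr

inside = [x for x in speeds if lower_limit <= x <= upper_limit]
whisker_low, whisker_high = min(inside), max(inside)
outliers = sorted(x for x in speeds if x < lower_limit or x > upper_limit)

print(f"box: Q1 = {q1:.1f}, median = {median:.1f}, Q3 = {q3:.1f}")
print(f"whiskers: {whisker_low:.1f} to {whisker_high:.1f}, outliers: {outliers}")
```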

Fig 7. Boxplot explanation (from Montgomery and Runger, Applied Statistics and Probability for Engineers, 7th
edition)

The boxplot of the data for example 1 is shown in Fig. 8.

Fig 8. Boxplot of speed of vehicles in Example 1

9 CROSS-CORRELATION COEFFICIENT
Consider two paired random samples 𝑋 = {𝑋1 , 𝑋2 , … , 𝑋𝑁 } and 𝑌 = {𝑌1 , 𝑌2 , … , 𝑌𝑁 }. For instance,
the 𝑋’s may represent annual precipitation over a drainage area and the 𝑌’s annual runoff at the
drainage outlet. The linear relationship between them may be investigated using cross-
correlation analysis. Specifically, the sample cross-correlation coefficient denoted by 𝜌̂
measures the linear association (dependence) between the samples 𝑋 and 𝑌 and is estimated by
$$\hat{\rho} = \frac{\sum_{i=1}^{N} (X_i - \bar{X})(Y_i - \bar{Y})}{\sqrt{\sum_{i=1}^{N} (X_i - \bar{X})^2} \, \sqrt{\sum_{i=1}^{N} (Y_i - \bar{Y})^2}}$$

where 𝑋̅ and 𝑌̅ are the sample means of 𝑋 and 𝑌, respectively. Often 𝑟 is used to denote the
sample cross-correlation coefficient. The cross-correlation coefficient is bounded by -1 and +1.
If 𝜌̂ is one in absolute value, then there is a perfect linear dependence between 𝑋 and 𝑌. A value
of zero, on the other hand, means no linear dependence. A positive 𝜌̂ value indicates that the
value of 𝑌 tends to increase as the value of 𝑋 increases. Conversely, a negative 𝜌̂ value indicates
that the value of 𝑌 tends to decrease as the value of 𝑋 increases. As a rule of thumb, the
dependence is weak when |𝜌̂| < 0.3, while it may be deemed strong when |𝜌̂| > 0.7.

Example 5. Compute the correlation coefficient for the concrete data below:
Density (kg/m^3)                145.4  265.0  507.3  491.9   83.3  269.6  339.8  279.2  411.3  395.4
Compressive Strength (N/mm^2)    27.7   48.8    5.3   72.6    4.5   22.8   37.2   52.6   34.4   77.7

Density (kg/m^3)                210.6  287.4   58.5  591.2  141.9  108.4  254.8  159.0  319.3  236.7
Compressive Strength (N/mm^2)    12.7   40.3    7.8   63.8   19.8   14.5    9.1    4.1   63.8    8.1
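Solution:
$$N = 20; \quad \bar{X} = 277.8 \text{ kg/m}^3; \quad \bar{Y} = 31.38 \text{ N/mm}^2$$
$$S_{XX} = \sum_{i=1}^{N} (X_i - \bar{X})^2 = 405714$$
$$S_{YY} = \sum_{i=1}^{N} (Y_i - \bar{Y})^2 = 11423$$
$$S_{XY} = \sum_{i=1}^{N} (X_i - \bar{X})(Y_i - \bar{Y}) = 41817$$
$$\hat{\rho} = \frac{S_{XY}}{\sqrt{S_{XX}} \sqrt{S_{YY}}} = \frac{41817}{\sqrt{405714} \sqrt{11423}} = 0.61$$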
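Example 5 can be reproduced with a short Python sketch of the cross-correlation formula. The function name `cross_correlation` is chosen for the illustration. Because the tabulated data are rounded to one decimal place, the intermediate sums computed here may differ slightly in the last digits from those quoted in the solution, while the coefficient itself agrees to two decimals.

```python
import math

# Concrete data from Example 5
density = [145.4, 265.0, 507.3, 491.9, 83.3, 269.6, 339.8, 279.2, 411.3, 395.4,
           210.6, 287.4, 58.5, 591.2, 141.9, 108.4, 254.8, 159.0, 319.3, 236.7]   # kg/m^3
strength = [27.7, 48.8, 5.3, 72.6, 4.5, 22.8, 37.2, 52.6, 34.4, 77.7,
            12.7, 40.3, 7.8, 63.8, 19.8, 14.5, 9.1, 4.1, 63.8, 8.1]              # N/mm^2


def cross_correlation(x, y):
    """Sample cross-correlation coefficient of two paired samples."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    s_yy = sum((yi - y_bar) ** 2 for yi in y)
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    return s_xy / (math.sqrt(s_xx) * math.sqrt(s_yy))


rho = cross_correlation(density, strength)
print(f"N = {len(density)}, rho = {rho:.2f}")
```

By the thresholds given above, a value near 0.6 indicates a moderate positive linear dependence between density and compressive strength: neither weak nor strong.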
