Engineering Data Analysis Reviewer
CHAPTER 2A: OBTAINING AND ORGANIZATION OF DATA

Data
Statistics is a tool for converting data into information.

Graphical Organization & Summarization of Data
Frequency Table/Distribution - a systematic arrangement of values grouped into class intervals. Frequency tables are used to summarize data so that the frequency of each interval is clearly displayed and the relative frequency of each interval can be easily computed.
Class Interval - range of numbers defined arbitrarily by
the highest and lowest numbers in the class.
Frequency - the number of times a particular value or
phenomenon occurs
Midpoint - the average of the upper and lower boundaries of a class.
Relative Frequency - the proportion of all given values
that fall within the interval. Usually expressed in
percent.
Cumulative Frequency - the sum of the frequency for that class and the frequencies of all the previous classes.
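To make these definitions concrete, here is a minimal sketch in Python; the score values and the class width are made-up illustration data, not from the notes. It tallies the frequency, midpoint, relative frequency, and cumulative frequency of each class interval.

```python
# Minimal sketch: building a frequency table by hand.
# The scores and the class width are hypothetical illustration values.
scores = [12, 15, 21, 22, 25, 27, 31, 33, 34, 38, 41, 45, 47, 52, 55]
low, width = 10, 10                      # first lower class limit and class width

n = len(scores)
cumulative = 0
print(f"{'Class':<10}{'Midpoint':>10}{'f':>5}{'Rel f (%)':>12}{'Cum f':>8}")
for lower in range(low, max(scores) + 1, width):
    upper = lower + width - 1
    f = sum(lower <= x <= upper for x in scores)   # frequency of this class
    midpoint = (lower + upper) / 2                 # average of the class limits
    rel = 100 * f / n                              # relative frequency, in percent
    cumulative += f                                # running total of frequencies
    print(f"{lower}-{upper:<5}{midpoint:>10.1f}{f:>5}{rel:>12.1f}{cumulative:>8}")
```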
Scatter Plot
● a graphic display of data points in a two-dimensional plane.
● Each data point represents a single unit of observation on which two measurements, X and Y, have been made.
● The values of each of the measurements are scaled on the X and Y axes, respectively.
● Each data point is located in the plane at the intersection of its associated X and Y values.
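A minimal sketch using matplotlib; the paired X and Y values are made-up illustration data. Each (X, Y) pair is drawn as one point at the intersection of its values on the two axes.

```python
import matplotlib.pyplot as plt

# Hypothetical paired measurements (X, Y) made on the same units of observation.
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
y = [2.1, 2.9, 3.8, 5.2, 5.9, 7.1]

plt.scatter(x, y)                      # one point per (X, Y) observation
plt.xlabel("X measurement")
plt.ylabel("Y measurement")
plt.title("Scatter plot of paired observations")
plt.show()
```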
Measures of Variability
A measure of variability is a single number that represents the spread or amount of dispersion in a set of data.

Range - measures the total spread of a set of data and is computed from only two numbers.
Range = largest measurement - smallest measurement
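A quick illustration of the range formula above, using hypothetical data:

```python
# Range = largest measurement - smallest measurement
data = [12, 15, 21, 27, 33, 41, 55]    # hypothetical measurements
data_range = max(data) - min(data)     # 55 - 12 = 43
print("Range =", data_range)
```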
Variance
● Variance is a numerical value that describes the variability of observations from their arithmetic mean.
● Variance measures how far the outcomes vary from the mean.
● The variance equals the average of all the squared deviations of the population from its mean (see the sketch after the Standard Deviation bullets below).
Standard Deviation, Sd
● the square root of the variance.
● This measurement is very useful for describing the spread or dispersion of a set of data around the mean.
● Measures how far observations typically fall from the expected value (the mean).
● Indicates how much the observations or individuals of a data set differ from the mean.
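The variance and standard deviation definitions above can be illustrated with a short Python sketch; the data values are hypothetical, and the population formulas (dividing by N) are used to match the "average of the squared deviations" wording.

```python
from math import sqrt

# Hypothetical population of measurements.
data = [12, 15, 21, 27, 33, 41, 55]
N = len(data)

mean = sum(data) / N
# Population variance: average of the squared deviations from the mean.
variance = sum((x - mean) ** 2 for x in data) / N
# Standard deviation: square root of the variance.
sd = sqrt(variance)

print(f"mean = {mean:.2f}, variance = {variance:.2f}, sd = {sd:.2f}")
```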
Coefficient of Variation (CV)
● Indicates the degree of precision with which the treatments are compared and is a good index of the reliability of the experiment.
● It expresses the experimental error as a percentage of the mean; thus, the higher the CV value, the lower the reliability of the experiment.
● Basically, CV < 10 is very good, 10-20 is good, 20-30 is acceptable, and CV > 30 is not acceptable.
● For field experiments a CV of 30% is tolerable, while for laboratory/clinical experiments 5% is the limit.
● The acceptable CV depends on different factors: experimental design, number of replications and size, experimental materials, parameters, etc.
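The notes do not state the CV formula explicitly; the sketch below assumes the conventional definition, the standard deviation expressed as a percentage of the mean, and applies the rule-of-thumb thresholds listed above to hypothetical data.

```python
from math import sqrt

# Hypothetical measurements from one experiment.
data = [12, 15, 21, 27, 33, 41, 55]
N = len(data)
mean = sum(data) / N
sd = sqrt(sum((x - mean) ** 2 for x in data) / N)

cv = 100 * sd / mean                   # CV as a percentage of the mean
if cv < 10:
    verdict = "very good"
elif cv < 20:
    verdict = "good"
elif cv < 30:
    verdict = "acceptable"
else:
    verdict = "not acceptable"
print(f"CV = {cv:.1f}% ({verdict})")
```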
Probability
● Study of random or nondeterministic
experiments
● We often frame probability in terms of a
random process giving rise to an outcome
● Probability is defined as a proportion, and it
always takes values between 0 and 1 (or 0%
and 100%, in percentage)
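Since probability is framed here as a proportion arising from a random process, a small simulation sketch (the die-rolling setup is just an illustration, not from the notes) shows an estimated probability falling between 0 and 1:

```python
import random

random.seed(0)                         # reproducible illustration

# Random process: roll a fair six-sided die many times.
trials = 10_000
hits = sum(1 for _ in range(trials) if random.randint(1, 6) == 6)

# Probability estimated as a proportion: always between 0 and 1.
p_hat = hits / trials
print(f"Estimated P(roll a 6) = {p_hat:.3f}  (exact value is 1/6 = {1/6:.3f})")
```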
Compound Events
Special Event
Visualizing Events
Computing Probability
Joint Probability
Conditional Probability
Compound Probability
Multiplicative Rule