Introduction - Sesh 1
LECTURE NOTES
Course: MATH 4 – Engineering Data Analysis
Sem/AY: First Semester/2023-2024
Module No.: 1
Lesson Title: Introduction and Role of Statistics to Engineering, Obtaining Data
Week Duration: 3
Date: Week 1-3
Description of the Lesson: This lesson discusses the role of statistics in the
engineering problem-solving process, how variability affects the data collected
and used for making engineering decisions, and the difference between
enumerative and analytical studies. Planning and conducting surveys as well as
experiments will be emphasized in this lesson.
Learning Outcomes
Intended Learning Outcomes: Students should be able to meet the following
intended learning outcomes:
• Understand the principles of statistics and the role of statistics in
engineering
• Differentiate the three methods of collecting data: retrospective study,
observational study, and designed experiment
• Understand the planning and conducting of surveys and experiments
Targets/Objectives: At the end of the lesson, students should be able to:
• differentiate retrospective studies, observational studies, and designed
experiments;
• assess the current situation of an engineering problem, support the results
of analyzing the data, and make final recommendations for a decision; and
• apply the planning and conducting of surveys and experiments in actual
situations.
Lecture Guide
1. Learning Guide Questions:
1. What are the principles of statistics? What is the role of statistics to
Engineering? What are the different steps in the engineering
method?
2. What are the three methods of collecting data? What are examples
and applications of these methods?
3. What are the different steps in planning and conducting surveys and
experiments in an actual situation?
-Organizing:
All collected data are arranged in logical and chronological order for viewing
and analysis. Datasets are normally organized in either ascending or
descending order.
-Summarizing:
Summarize the data to offer an overview of the situation.
-Presenting:
Develop a comprehensive way to present the dataset.
-Analyzing:
Analyze the dataset for the intended application.
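The four steps above can be sketched with a small, hypothetical dataset (the values below are illustrative, not from the text):

```python
import statistics

# Hypothetical pull-off force measurements (pounds); illustrative values only
data = [12.9, 13.7, 12.8, 13.9, 14.2, 13.2, 13.5, 13.1]

# Organizing: arrange the dataset in ascending order
organized = sorted(data)

# Summarizing: a few numbers that give an overview of the situation
summary = {
    "n": len(organized),
    "mean": round(statistics.mean(organized), 2),
    "min": organized[0],
    "max": organized[-1],
}

# Presenting: a simple text display of the summary
for key, value in summary.items():
    print(f"{key}: {value}")

# Analyzing would then use these organized, summarized data for the
# intended application (e.g. comparing designs, testing hypotheses).
```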
Statistics is the science of decision making in a world full of uncertainties.
Examples of uncertainty in the world:
-A person’s daily routine
-The fate of a human being
-A person’s career life
-The rise and fall of stock market
-The weather
-Global economy
-World politics
1. Descriptive Statistics
-collection and organization of data
-uses the data to provide descriptions of the population, either through
numerical calculations or graphs or tables
2. Inferential Statistics
-makes inference and predictions about a population based on a sample
of data taken from the population in question
-consists of generalizing from samples to populations, performing
hypothesis testing, determining relationships among variables, and
making predictions
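The distinction between the two branches can be sketched on a small sample (a minimal sketch with hypothetical data; the 1.96 normal critical value is an assumption used for a rough interval):

```python
import statistics

# Hypothetical sample of 8 pull-off forces (pounds); values are illustrative
sample = [12.9, 13.7, 12.8, 13.9, 14.2, 13.2, 13.5, 13.1]

# Descriptive statistics: numerical descriptions of the data in hand
mean = statistics.mean(sample)
stdev = statistics.stdev(sample)  # sample standard deviation

# Inferential statistics: generalize from the sample to the population,
# e.g. an approximate 95% confidence interval for the population mean
# (using the normal critical value 1.96 as a rough sketch)
margin = 1.96 * stdev / len(sample) ** 0.5
ci = (round(mean - margin, 2), round(mean + margin, 2))
print(round(mean, 1), round(stdev, 2), ci)
```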
Types of Data
Primary Data - data collected directly by the researcher himself. These are
first-hand or original sources, gathered:
-by mail of recording forms via ordinary or special mail, courier
services, e-mail and fax to reach distant data providers; and
-by experimentation, to find out the cause and effect of a certain phenomenon.
Sampling Design/Methods
1. Probability Sampling
a. Each of the units in the target population has a chance of being
included in the sample
b. Greater possibility of representative sample of the population
c. Conclusion derived from data gathered can be generalized for the
whole population
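Probability sampling can be sketched with a simple random sample, in which every unit has the same chance of selection (the numbered population and seed below are hypothetical):

```python
import random

# Sketch of simple random sampling: every unit in the target population
# has the same chance of being included in the sample.
population = list(range(1, 101))   # hypothetical 100 numbered units

random.seed(42)                    # fixed seed only so the draw is repeatable
sample = random.sample(population, k=10)  # 10 units, without replacement

print(sample)
```

Because every unit can be selected, conclusions drawn from such a sample can be generalized to the whole population.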
The steps in the engineering method are shown in Fig. 1-1. Notice that the
engineering method features a strong interplay between the problem, the
factors that may influence its solution, a model of the phenomenon, and
experimentation to verify the adequacy of the model and the proposed
solution to the problem. Steps 2–4 in Fig. 1-1 are enclosed in a box, indicating
that several cycles or iterations of these steps may be required to obtain the
final solution. As an example, consider a connector that may fail when it is
installed in an engine. Eight prototype units are produced and their pull-off
forces are measured (in pounds).
Figure 1-2 presents a dot diagram of these data. The dot diagram is a very
useful plot for displaying a small body of data—say, up to about 20
observations. This plot allows us to see easily two features of the data; the
location, or the middle, and the scatter or variability. When the number of
observations is small, it is usually difficult to identify any specific patterns in
the variability, although the dot diagram is a convenient way to see any
unusual data features. The need for statistical thinking arises often in the
solution of engineering problems. Consider the engineer designing the
connector. From testing the prototypes, he knows that the average pull-off
force is 13.0 pounds. However, he thinks that this may be too low for the
intended application, so he decides to consider an alternative design with a
greater wall thickness, 1/8 inch. Eight prototypes of this design are built, and the
observed pull-off force measurements are 12.9, 13.7, 12.8, 13.9, 14.2, 13.2,
13.5, and 13.1. The average is 13.4. Results for both samples are plotted as dot
diagrams in Fig. 1-3, page 3. This display gives the impression that increasing
the wall thickness has led to an increase in pull-off force. However, there are
some obvious questions to ask. For instance, how do we know that another
sample of prototypes will not give different results? Is a sample of eight
prototypes adequate to give reliable results? If we use the test results obtained
so far to conclude that increasing the wall thickness increases the strength,
what risks are associated with this decision? For example, is it possible that the
apparent increase in pull-off force observed in the thicker prototypes is only
due to the inherent variability in the system and that increasing the thickness
of the part (and its cost) really has no effect on the pull-off force?
Often, physical laws (such as Ohm’s law and the ideal gas law) are applied to
help design products and processes. We are familiar with this reasoning from
general laws to specific cases. But it is also important to reason from a specific
set of measurements to more general cases to answer the previous questions.
This reasoning is from a sample (such as the eight connectors) to a population
(such as the connectors that will be sold to customers). The reasoning is
referred to as statistical inference. See Fig. 1-4. Historically, measurements
were obtained from a sample of people and generalized to a population, and
the terminology has remained. Clearly, reasoning based on measurements from
some objects to measurements on all objects can result in errors (called
sampling errors). However, if the sample is selected properly, these risks can
be quantified and an appropriate sample size can be determined.
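How sampling error can be quantified is easy to see in a small simulation: draw many samples of size 8 from a hypothetical population and watch how much the sample mean varies from draw to draw (the population parameters and seed below are assumptions for illustration):

```python
import random
import statistics

# Hypothetical population of connector pull-off forces (pounds)
random.seed(1)
population = [random.gauss(13.4, 0.5) for _ in range(10_000)]

# Repeatedly draw samples of size 8 and record each sample mean
sample_means = [
    statistics.mean(random.sample(population, 8)) for _ in range(1_000)
]

# The spread of the sample means is the sampling error; with properly
# selected samples this spread can be quantified and used to choose
# an appropriate sample size.
print(round(statistics.mean(sample_means), 1))
print(round(statistics.stdev(sample_means), 2))
```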
1. Retrospective Study
This would use either all or a sample of the historical process data
archived over some period of time.
2. Observational Study
Collect relevant data from current operations without disturbing the
system. The engineer observes the process or population,
disturbing it as little as possible, and records the quantities of interest.
Because these studies are usually conducted for a relatively short time
period, sometimes variables that are not routinely measured can be
included.
3. Designed Experiments
Disturb the system and observe the impacts. The engineer makes
deliberate or purposeful changes in the controllable variables of the
system and then observes the resulting output.
Possible issues
-Missing data: records are often incomplete.
-Incompatible data: the response may be an hourly average while
temperatures may be instantaneous.
-Some factors may not have changed much, so we cannot detect their impact.
-Some factors may vary together, so we cannot separate their impacts.
Most importantly, we do not know what else might have been changing and
influencing the response.
Some improvements
-Data collection is more intensive than for historical records, so there are no
missing data, and variables can be measured on compatible time scales.
-But some factors may still not have changed much, and other factors may still
vary together, so we cannot detect or separate their impacts.
With more intensive effort, we can sometimes monitor other factors that might
influence the response.
Engineers choose two levels of each factor, a low level labelled “-” and a high
level labelled “+”.
If all other aspects of the distillation process are controlled, any differences in
the response for different treatments can be attributed to the differences in the
factor levels.
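The two-level idea can be sketched by enumerating every treatment in a small factorial design (the factor names below are hypothetical, chosen to fit the distillation example):

```python
from itertools import product

# Sketch of a two-level factorial design: each factor is run at a low ("-")
# and a high ("+") level, and every combination of levels is one treatment.
factors = ["reflux rate", "reboil rate"]   # hypothetical factor names

treatments = list(product("-+", repeat=len(factors)))
for levels in treatments:
    print(dict(zip(factors, levels)))
```

With k factors at two levels each, the full design has 2^k treatments, here 2^2 = 4.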
Advantages of surveys
Surveys are efficient ways of collecting information from a large number of
people; they are relatively easy to administer; and a wide variety of information
can be collected.
Disadvantages of surveys
Disadvantages arise from the fact that surveys depend on the subjects’
motivation, honesty, memory and ability to respond. Moreover, answer choices
to survey questions could lead to vague data.
Conducting a survey
“Why do we do Experiments?”
Experiments are the basis of all theoretical predictions. Without experiments,
there would be no results, and without any tangible data, there is no basis for
any scientist or engineer to formulate a theory. The advancement of culture
and civilization depends on experiments which bring about new technology (P.
Cuadra)
1. Planning
It is important to carefully plan for the course of experimentation before
embarking upon the process of testing and data collection. A few of the
considerations to keep in mind at this stage are a thorough and precise
objective identifying the need to conduct the investigation, assessment of time
and resources available to achieve the objective and integration of prior
knowledge to the experimentation procedure. A team composed of individuals
from different disciplines related to the product or process should be used to
identify possible factors to investigate and the most appropriate response(s) to
measure.
2. Screening
Screening experiments are used to identify the important factors that affect
the process under investigation out of the large pool of potential factors. These
experiments are carried out in conjunction with prior knowledge of the
process to eliminate unimportant factors and focus attention on the key factors
that require further detailed analyses. Screening experiments are usually
efficient designs requiring few executions, where the focus is not on
interactions but on identifying the vital few factors.
3. Optimization
Once attention has been narrowed down to the important factors affecting the
process, the next step is to determine the best setting of these factors to
achieve the desired objective. Depending on the product or process under
investigation, this objective may be to either increase yield or decrease
variability or to find settings that achieve both at the same time.
4. Robustness Testing
Once the optimal settings of the factors have been determined, it is
important to make the product or process insensitive to variations that are
likely to be experienced in the application environment. These variations
result from changes in factors that affect the process but are beyond the
control of the analyst. Such factors (e.g. humidity, ambient temperature,
variation in material, etc.) are referred to as noise or uncontrollable factors. It
is important to identify such sources of variation and take measures to ensure
that the product or process is made insensitive (or robust) to these factors.
5. Verification
This final stage involves validation of the best settings by conducting a few
follow-up experimental runs to confirm that the process functions as desired
and all objectives are met.
1. Randomization
Allocation of the experimental material and the order in which the runs of the
experiment are performed are randomly determined. Statistical methods require
that observations (or errors) be independently distributed random variables.
Randomization makes this assumption valid and averages out the effects of
extraneous factors. Randomization can be done with computer programs or
random number tables.
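Randomizing the run order by computer can be sketched with a simple shuffle (the run labels and seed below are hypothetical):

```python
import random

# Sketch of randomizing run order by computer instead of a random
# number table: shuffle the planned runs before executing the experiment.
planned_runs = ["A1", "A2", "B1", "B2", "C1", "C2"]  # hypothetical run labels

random.seed(7)              # fixed seed only so the example is repeatable
run_order = planned_runs.copy()
random.shuffle(run_order)   # the order of the runs is now randomly determined

print(run_order)
```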
2. Replication
An independent repeat of each factor combination. Replication allows the
experimental error to be estimated and improves the precision of the
estimated effects.
3. Blocking
Blocking helps in improving the precision of the experiment. It is used to
reduce or eliminate the variability transmitted from nuisance factors: factors
that may influence the response variable but in which we are not interested.
Blocking means putting similar experimental material in one block and
applying all treatments in each block. For example, two batches of raw
material in a hardness-testing experiment could form two blocks.
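A blocked plan for the two-batch hardness example can be sketched as follows (the treatment labels and seed are hypothetical):

```python
import random

# Sketch of blocking: each batch of raw material forms a block, and all
# treatments are applied (in random order) within each block, so batch-
# to-batch variability does not mix with the treatment effects.
treatments = ["tip 1", "tip 2", "tip 3", "tip 4"]   # hypothetical treatments
blocks = ["batch A", "batch B"]                     # two batches = two blocks

random.seed(3)
plan = {}
for block in blocks:
    order = treatments.copy()
    random.shuffle(order)   # randomize the treatment order within each block
    plan[block] = order

for block, order in plan.items():
    print(block, order)
```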
Strategy of Experimentation