MATH019A Engineering Data Analysis: Engr. Jan Justine A. Razon Faculty, Cpe Department
Engineering Data
Analysis
ENGR. JAN JUSTINE A. RAZON
FACULTY, CPE DEPARTMENT
MATH
019A
PRELIM TOPICS
1. OBTAINING DATA
1.1 Methods of Data Collection
1.2 Planning and Conducting Surveys
1.3 Introduction to Design of Experiments
2. PROBABILITY
2.1 Relationship among Events
2.2 Rules of Probability
3. Discrete Probability Distribution
3.1 Random Variables
3.2 Cumulative Distribution
3.3 Binomial Distribution
3.4 Poisson Distribution
4. Continuous Probability Distribution
4.1 Continuous Random Variables
4.2 Normal and Exponential Distribution
METHODS OF DATA COLLECTION
2. SECONDARY DATA
Data which have been collected by someone else and which have already been passed through the
statistical process.
METHODS OF DATA COLLECTION:
PRIMARY DATA
1. Observation
2. Interview
3. Questionnaire
4. Case Study
5. Survey
METHODS OF DATA COLLECTION: PRIMARY DATA
OBSERVATION
ADVANTAGES
• Subjective bias eliminated
• Current information
• Independent of respondent’s variable
DISADVANTAGES
• Time consuming
• Limited information
• Unforeseen factors
TYPES OF OBSERVATION
STRUCTURED and UNSTRUCTURED
1. Structured Observation
Observation characterized by a defined style of recording the observed information,
standardized conditions of observation, a definition of the units to be observed, and the
selection of pertinent data.
Example: An auditor performing inventory analysis in a store.
2. Unstructured Observation
Observation carried out without any of these characteristics being planned in advance.
Example: Observing children playing with new toys.
TYPES OF OBSERVATION
PARTICIPANT and NON-PARTICIPANT
1. Participant
when the observer is a member of the group being observed.
Advantages: 1. Observation of natural behavior
2. Closeness with the group
3. Better understanding
2. Non-participant
when the observer observes people without giving them any information.
Advantages: 1. Objectivity and neutrality
2. More willingness of the respondent
TYPES OF OBSERVATION
CONTROLLED and UNCONTROLLED
1. Controlled
when the observation takes place according to definite pre-arranged plans involving
experimental procedure. It is generally done in a laboratory under controlled conditions.
2. Uncontrolled
when the observation takes place in natural conditions. It is done to get a spontaneous
picture of life and persons.
METHODS OF DATA COLLECTION: PRIMARY DATA
INTERVIEW METHOD
This method of collecting data involves the presentation of oral-verbal stimuli and replies
in terms of oral-verbal responses.
TYPES OF INTERVIEW
• Personal interviews: The interviewer asks questions, generally in face-to-face contact
with the other person or persons.
• Structured interviews: A set of pre-decided questions is used.
• Unstructured interviews: No system of pre-determined questions is followed.
• Focused interviews: Attention is focused on a given experience of the respondent and its
possible effects.
• Clinical interviews: Concerned with broad underlying feelings or motivations, or with the
course of the individual’s life experience, rather than with the effects of a specific
experience as in the focused interview.
TYPES OF INTERVIEW
• Group interviews: A group of 6 to 8 individuals is interviewed.
• Qualitative and quantitative interviews: Divided on the basis of subject matter, i.e.
whether it is qualitative or quantitative.
• Individual interviews: The interviewer meets a single person and interviews him or her.
• Selection interviews: Done for the selection of people for certain jobs.
• Depth interviews: Deliberately aim to elicit unconscious as well as other types of
material, relating especially to personality dynamics and motivations.
• Telephonic interviews: Contacting samples by telephone.
METHODS OF DATA COLLECTION: PRIMARY DATA
QUESTIONNAIRE METHOD
This method of data collection is quite popular, particularly in the case of big enquiries.
QUESTIONNAIRE METHOD
ADVANTAGES
• Low cost, even if the geographical area is large.
• Answers are in the respondent’s own words, so they are free from bias.
• Adequate time to think about answers.
• Respondents who are not easily approachable may be conveniently contacted.
DISADVANTAGES
• Low rate of return of duly filled questionnaires.
• Slowest method of data collection.
• Difficult to know whether the intended respondent filled in the form or someone else did.
CASE STUDY METHOD
ADVANTAGES
• They are less costly and less time-consuming; they are advantageous when exposure data is
expensive or hard to obtain.
• They are advantageous when studying dynamic populations in which follow-up is difficult.
DISADVANTAGES
• They are subject to selection bias.
• They generally do not allow calculation of incidence (absolute risk).
METHODS OF DATA COLLECTION: PRIMARY DATA
SURVEY METHOD
One of the common methods of diagnosing and solving social problems is that of undertaking
surveys.
ADVANTAGES
• Relatively easy to administer
• Can be developed in less time (compared to other data-collection methods)
• Cost-effective, but cost depends on survey mode
DISADVANTAGES
• Respondents may not feel encouraged to provide accurate, honest answers
• Surveys with closed-ended questions may have a lower validity rate than other question
types
• Data errors due to question non-responses may exist
SECONDARY DATA:
SOURCES OF DATA
• Publications of central, state, and local governments
• Technical and trade journals
• Books, magazines, newspapers
• Reports and publications of industry, banks, stock exchanges
• Reports by research scholars, universities, economists
• Public records
FACTORS TO BE CONSIDERED BEFORE
USING SECONDARY DATA
SELECTION OF PROPER METHOD FOR
COLLECTION OF DATA
DESIGNING A SURVEY
Surveys can take different forms. They can be used to ask only
one question or they can ask a series of questions. We can use
surveys to test out people’s opinions or to test a hypothesis.
DESIGNING A SURVEY
Example:
1. Martha wants to construct a survey that shows which
sports students at her school like to play the most.
DESIGNING A SURVEY
Step 1: GOAL
The goal of the survey is to find the answer to the question: “Which
sports do students at Martha’s school like to play the most?”
Step 2: POPULATION
The population is the students in Martha’s school, and the survey would use a random sample
of them. A good strategy would be to randomly select students (using dice or a random number
generator) as they walk into an all-school assembly.
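The random-number-generator option in Step 2 can be sketched as follows. This is only an illustration: the roster of 800 student IDs, the sample size of 50, and the seed are assumptions made for the sketch, not values from the example.

```python
import random

def simple_random_sample(roster, k, seed=None):
    """Return k distinct students, each roster member equally likely to be picked."""
    rng = random.Random(seed)      # seeded generator so a run can be repeated
    return rng.sample(roster, k)   # sampling without replacement

# Hypothetical roster: student IDs 1..800 (assumed school size).
roster = list(range(1, 801))
chosen = simple_random_sample(roster, k=50, seed=19)
print(len(chosen), "students selected")
```

Sampling without replacement keeps any student from being surveyed twice.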
DESIGNING A SURVEY
Step 3: METHODS
Face-to-face interviews are a good choice in this case. Interviews will be
easy to conduct since the survey consists of only one question which can
be quickly answered and recorded, and asking the question face to face
will help eliminate non-response bias.
Step 4: DATA
DESIGNING A SURVEY
Example:
1. Juan wants to construct a survey that shows how many
hours per week the average student at his school works.
DESIGNING A SURVEY
Step 1: GOAL
The goal of the survey is to find the answer to the question “How many
hours per week do you work?”
Step 2: POPULATION
Juan suspects that older students might work more hours per week than
younger students. He decides that a stratified sample of the student
population would be appropriate in this case. The strata are grade levels
9th through 12th. He would need to find out what proportion of the
students in his school are in each grade level, and then include the same
proportions in his sample.
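A proportional stratified sample like the one Juan describes might be set up as below. The enrollment figures, the total sample size of 40, and the helper name `proportional_allocation` are all hypothetical; the sketch only shows how the grade-level proportions carry over to the sample.

```python
import random

def proportional_allocation(strata_sizes, n):
    """Split a total sample size n across strata in proportion to their sizes."""
    total = sum(strata_sizes.values())
    exact = {g: n * size / total for g, size in strata_sizes.items()}
    alloc = {g: int(v) for g, v in exact.items()}
    # Largest-remainder rounding so the allocations add up to exactly n.
    leftover = n - sum(alloc.values())
    by_remainder = sorted(exact, key=lambda g: exact[g] - alloc[g], reverse=True)
    for g in by_remainder[:leftover]:
        alloc[g] += 1
    return alloc

# Hypothetical enrollment per grade level (not from the example).
enrollment = {9: 260, 10: 240, 11: 200, 12: 100}
alloc = proportional_allocation(enrollment, n=40)
print(alloc)

# Draw a simple random sample of the allocated size within each grade.
rng = random.Random(0)
sample = {g: rng.sample(range(enrollment[g]), alloc[g]) for g in enrollment}
```

With these assumed numbers, each grade contributes 5% of its students, so the sample mirrors the school's grade mix.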
DESIGNING A SURVEY
Step 3: METHODS
Face-to-face interviews are a good choice in this case since the survey
consists of two short questions which can be quickly answered and
recorded.
Step 4: DATA
THE BASIS OF CONDUCTING AN
EXPERIMENT
1. With an experiment, the researcher is trying to learn something
new about the world, an explanation of 'why' something happens.
PROBABILITY
SAMPLE SPACE
The set of all possible outcomes of a statistical experiment is called the
sample space and is represented by the symbol S.
ELEMENT
Each outcome in a sample space is called an element or a member of
the sample space.
Example #1:
Consider the experiment of tossing a die. If we are interested in the
number that shows on the top face, the sample space would be
S = {1,2,3,4,5,6}
PROBABILITY | Sample Spaces & Events
Example #2:
An experiment consists of flipping a coin and then flipping it a second time
if a head occurs. If a tail occurs on the first flip, then a die is tossed once.
To list the elements of the sample space providing the most information, we
construct a tree diagram. Each path along its branches gives one sample point:
S = {HH, HT, T1, T2, T3, T4, T5, T6}
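The branches of the tree diagram for Example #2 can also be walked in code. The sketch below is one possible way to do it; the function name `sample_space` is just an illustrative choice.

```python
def sample_space():
    """Walk the two-stage experiment branch by branch, as a tree diagram does."""
    outcomes = []
    for first in ("H", "T"):
        if first == "H":
            # Head on the first flip: flip the coin a second time.
            outcomes += [first + second for second in ("H", "T")]
        else:
            # Tail on the first flip: toss a die once.
            outcomes += [first + str(face) for face in range(1, 7)]
    return outcomes

S = sample_space()
print(S)
```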
PROBABILITY | Sample Spaces & Events
EVENT
An event is a subset of a sample space, that is, any collection of sample points.
PROBABILITY | Basic Rules
1. The complement of an event A with respect to S is the subset of all
elements of S that are not in A. We denote the complement of A by the
symbol A’.
2. The intersection of two events A and B, denoted by the symbol A
∩ B, is the event containing all elements that are common to A and B.
3. Two events A and B are mutually exclusive, or disjoint, if A ∩ B = Ø,
that is, A and B have no elements in common.
4. The union of events A and B, denoted by A∪B, is the event containing
all the elements that belong to A or B or both.
S = {1,2,3,4,5,6,7,8,9,10}
A = {1,3,5,8,9}
B = {1,4,6,8,10}
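For the sets S, A, and B listed above, the four rules map directly onto Python's built-in set operations. This is a small sketch for checking hand-worked answers; the variable names are illustrative.

```python
S = set(range(1, 11))
A = {1, 3, 5, 8, 9}
B = {1, 4, 6, 8, 10}

A_complement = S - A                   # elements of S that are not in A
intersection = A & B                   # elements common to A and B
union = A | B                          # elements in A or B or both
mutually_exclusive = A.isdisjoint(B)   # True only if A ∩ B is empty

print("A' =", sorted(A_complement))
print("A ∩ B =", sorted(intersection))
print("A ∪ B =", sorted(union))
print("Mutually exclusive:", mutually_exclusive)
```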
PROBABILITY | Basic Rules
Example #4.
If M = {x | 3 < x < 9} and N= {y | 5 < y < 12}, then
M ∪ N = {z | 3 < z < 12}
VENN DIAGRAMS
A∩B=
B∩C =
A∪C=
B’ ∩ A =
A∩B∩C=
(A ∪ B) ∪ C =
Counting Sample Points
1st Rule: If an operation can be performed in n1 ways, and if for each of these
ways a second operation can be performed in n2 ways, then the two operations
can be performed together in n1n2 ways.
Example#1:
How many 4-digit even numbers can be formed from the digits 0, 1, 2, 5, 6, and 9 if
each digit can be used only once?
Example#2:
The number of permutations of letters a,b,c,d.
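Both examples can be checked by brute-force enumeration, which is a useful cross-check on a count obtained by hand. The sketch assumes that a 4-digit number may not start with 0.

```python
from itertools import permutations

# Example #1: 4-digit even numbers from the digits 0, 1, 2, 5, 6, 9, no digit repeated.
digits = "012569"
even_numbers = [
    "".join(p)
    for p in permutations(digits, 4)
    if p[0] != "0"        # assumption: a leading 0 does not give a 4-digit number
    and p[-1] in "026"    # an even number ends in an even digit
]
print(len(even_numbers))

# Example #2: all orderings of the letters a, b, c, d.
letter_orders = list(permutations("abcd"))
print(len(letter_orders))
```

By hand, Example #1 splits on the last digit: 5·4·3 = 60 numbers end in 0 and 4·4·3 = 48 end in each of 2 and 6, for 156 in total. Example #2 gives 4·3·2·1 = 24.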
Counting Sample Points
3rd Rule: The number of permutations of n distinct objects taken r at a time is
nPr = n! / (n − r)!
Example #3:
In one year, three awards (research, teaching, and service) will be given for
a class of 25 graduate students in a statistics department. If each student
can receive at most one award, how many possible selections are there?
Example #4:
A president and a treasurer are to be chosen from a student club consisting
of 50 people. How many different choices of officers are possible if
(a) there are no restrictions;
(b) A will serve only if he is president;
(c) B and C will serve together or not at all;
(d) D and E will not serve together?
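Examples #3 and #4 can be checked by listing every ordered (president, treasurer) pair and filtering on each restriction. In this sketch the 50 members are numbered 0 to 49, and A, B, C, D, E are arbitrarily taken to be members 0 to 4; that labeling is an assumption made only for the illustration.

```python
from itertools import permutations
from math import perm

# Example #3: ordered choice of 3 different award winners from 25 students.
awards = perm(25, 3)

# Example #4: ordered (president, treasurer) pairs from 50 members.
A, B, C, D, E = range(5)   # hypothetical labels for the five named members
pairs = list(permutations(range(50), 2))

no_restriction = len(pairs)
# (b) A serves only as president, so A is never the treasurer.
a_only_president = sum(1 for pres, treas in pairs if treas != A)
# (c) B and C are either both officers or both left out.
b_c_together = sum(1 for pair in pairs if (B in pair) == (C in pair))
# (d) D and E are never both officers.
d_e_apart = sum(1 for pair in pairs if not (D in pair and E in pair))

print(awards, no_restriction, a_only_president, b_c_together, d_e_apart)
```

The counts agree with the hand calculations: 25·24·23 = 13800, 50·49 = 2450, 49 + 49·48 = 2401, 2 + 48·47 = 2258, and 2450 − 2 = 2448.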
Counting Sample Points
4th Rule: The number of distinct permutations of n things of which n1 are
of one kind, n2 of a second kind, …, nk of a kth kind is
n! / (n1! n2! … nk!)
Counting Sample Points
5th Rule: The number of combinations of n distinct objects taken r at a time is
nCr = n! / (r! (n − r)!)
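The standard factorial formulas behind the 3rd, 4th, and 5th rules can be written out directly. The function names and the inputs used below (5 objects taken 2 at a time, the letters of "STATISTICS") are illustrative choices, not examples from the slides.

```python
from collections import Counter
from math import factorial

def n_permute_r(n, r):
    """3rd Rule: permutations of n distinct objects taken r at a time."""
    return factorial(n) // factorial(n - r)

def distinct_permutations(items):
    """4th Rule: n! divided by the factorial of each kind's count."""
    result = factorial(len(items))
    for count in Counter(items).values():
        result //= factorial(count)
    return result

def n_choose_r(n, r):
    """5th Rule: combinations of n distinct objects taken r at a time."""
    return factorial(n) // (factorial(r) * factorial(n - r))

print(n_permute_r(5, 2))                    # ordered pairs from 5 objects
print(distinct_permutations("STATISTICS"))  # 10 letters: S×3, T×3, I×2, A, C
print(n_choose_r(5, 2))                     # unordered pairs from 5 objects
```

Order matters for permutations and not for combinations, which is why nCr is nPr divided by r!.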
SW#1(Prelim)
1. If S = {0, 1, 2, 3, 4, 5, 6, 7, 8, 9} and
A = {0, 2, 4, 6, 8}
B = {1, 3, 5, 7, 9}
C = {2, 3, 4, 5}
D = {1, 6, 7}, list all the elements of the sets corresponding
to the following events:
a. A ∪ C
b. A ∩ B
c. C’
d. (C’∩ D) ∪ B
e. (S ∩ C)’
f. A ∩ C ∩ D’
Draw a Venn diagram for each item.
SW#1(Prelim)
2. The resumes of 2 male applicants for a college teaching position in chemistry are
placed in the same file as the resumes of 2 female applicants. Two positions become
available and the first, at the rank of assistant professor, is filled by selecting 1 of the 4
applicants at random. The second position, at the rank of instructor, is then filled by
selecting at random one of the remaining 3 applicants. Using the notation M2F1, for
example, to denote the simple event that the first position is filled by the second male
applicant and the second position is then filled by the first female applicant,
2. (a) How many three-digit numbers can be formed from the digits 0, 1, 2, 3, 4, 5, and 6, if
each digit can be used only once?
(b) How many of these are odd numbers?
(c) How many are greater than 330?
3. If a multiple-choice test consists of 5 questions each with 4 possible answers of which only 1
is correct,
(a) In how many different ways can a student check off one answer to each question?
(b) In how many different ways can a student check off one answer to each question and get
all the answers wrong?
4. Nine people are going on a skiing trip in 3 cars that hold 2, 4 and 5 passengers, respectively.
In how many ways is it possible to transport 9 people to the ski lodge, using all cars?
Random Variables and Probability Distribution
A random variable is a function that associates a real number with each
element in the sample space.
X denotes a random variable
x denotes one of its values
Types:
1. Discrete – if the sample space contains a finite number of possibilities, or an unending
sequence with as many elements as there are whole numbers
2. Continuous – if the sample space contains an infinite number of possibilities, equal to
the number of points on a line segment
Examples:
1. Number of automobile accidents per year in Q.C.
2. Length of time to play 15 holes of golf
3. Amount of milk produced yearly by a particular cow
4. Number of eggs laid each month by a hen
5. Length of grain produced per hectare.
Discrete Probability Distribution
- A discrete random variable assumes each of its values with a certain probability.
- The set of ordered pairs (x, f(x)) is a probability function, probability mass
function, or probability distribution of the discrete random variable X if, for each
possible outcome x:
1. f(x) ≥ 0
2. Σ f(x) = 1, summed over all x
3. P(X = x) = f(x)
Example:
Given: 8 computers, 3 of which are defective; 2 computers are selected at random.
Find: the probability distribution of the number of defectives.
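One way to work the computer example is to count the selections: choose x of the 3 defective computers and the rest from the 5 good ones, out of all ways of choosing 2 from 8. The sketch below assumes the 2 computers are drawn without replacement; the function name `defectives_pmf` is illustrative.

```python
from fractions import Fraction
from math import comb

def defectives_pmf(total=8, defective=3, drawn=2):
    """f(x) = C(defective, x) * C(good, drawn - x) / C(total, drawn)."""
    good = total - defective
    ways = comb(total, drawn)
    lowest = max(0, drawn - good)      # fewest defectives that can be drawn
    highest = min(defective, drawn)    # most defectives that can be drawn
    return {
        x: Fraction(comb(defective, x) * comb(good, drawn - x), ways)
        for x in range(lowest, highest + 1)
    }

pmf = defectives_pmf()
for x, p in pmf.items():
    print(f"f({x}) = {p}")
print("sum =", sum(pmf.values()))
```

Exact fractions make it easy to confirm that every f(x) is non-negative and that the values sum to 1.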
Cumulative Distribution Function
The cumulative distribution function F(x) of a discrete random variable X
with probability distribution f(x) is
F(x) = P(X ≤ x) = Σ f(t), summed over all t ≤ x, for −∞ < x < ∞
Example:
Find the cdf of the random variable X and, using F(x), verify that f(2) = 3/8.
Given: f(0) = 1/16, f(1) = 1/4, f(2) = 3/8, f(3) = 1/4, f(4) = 1/16
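The cdf in this example is a running total of the given probabilities. The sketch below takes the support as x = 0, 1, 2, 3, 4 (an assumption about the last listed value) and recovers f(2) as the jump F(2) − F(1).

```python
from fractions import Fraction
from itertools import accumulate

# Given probabilities; the last value is taken here to belong to x = 4.
pmf = {
    0: Fraction(1, 16),
    1: Fraction(1, 4),
    2: Fraction(3, 8),
    3: Fraction(1, 4),
    4: Fraction(1, 16),
}

xs = sorted(pmf)
cdf = dict(zip(xs, accumulate(pmf[x] for x in xs)))   # running totals

def F(x):
    """F(x) = P(X <= x) for any real x; a step function."""
    return sum((p for t, p in pmf.items() if t <= x), Fraction(0))

for x in xs:
    print(f"F({x}) = {cdf[x]}")
print("F(2) - F(1) =", F(2) - F(1))
```

Between jump points F stays flat, so F(2.5) equals F(2).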
SW#3 (Prelim)
1. Given the probability distribution:
(i) What is the probability that the engineer incorrectly passes a day’s
production as acceptable if only 80% of the day’s DVD players actually
conform to specification?
(ii) What is the probability that the engineer unnecessarily requires the entire
day’s production to be tested if in fact 90% of the DVD players conform to
specifications?
Joint Probability Distribution
- If X and Y are two discrete random variables, the probability distribution
for their simultaneous occurrence can be represented by a function with
values f(x,y) for any pair of values (x,y) within the range of the random
variables X and Y.
- The function f(x,y) is a joint probability distribution or probability mass
function of the discrete random variables X and Y if
1. f(x,y) ≥ 0 for all (x,y)
2. Σ Σ f(x,y) = 1, summed over all x and y
3. P(X = x, Y = y) = f(x,y)
For any region A in the xy plane,
P[(X,Y) ∈ A] = Σ Σ f(x,y), summed over all (x,y) in A
Joint Probability Distribution
Examples
1. A box contains 3 blue, 2 red, and 3 green refills. Two refills are
selected at random. Find:
a. f(x,y), if X is the number of blue refills and Y is the number of red refills selected
b. P[(X,Y) ∈ A], where A is the region {(x,y) | x + y ≤ 1}
2. The joint probability distribution of X and Y is given by:
X
f(x,y) 0 1 2 3
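For Example 1 above (the box of refills), the joint distribution can be tabulated by counting selections. The sketch takes X as the number of blue refills and Y as the number of red refills in a draw of 2 without replacement; the constant names are illustrative.

```python
from fractions import Fraction
from math import comb

BLUE, RED, GREEN, DRAWN = 3, 2, 3, 2
TOTAL_WAYS = comb(BLUE + RED + GREEN, DRAWN)   # ways to pick 2 of the 8 refills

def f(x, y):
    """Joint pmf: x blue, y red, and the remaining DRAWN - x - y green."""
    green = DRAWN - x - y
    if x < 0 or y < 0 or green < 0:
        return Fraction(0)
    ways = comb(BLUE, x) * comb(RED, y) * comb(GREEN, green)
    return Fraction(ways, TOTAL_WAYS)

# (a) Tabulate f(x, y) over every possible pair.
joint = {(x, y): f(x, y) for x in range(3) for y in range(3) if x + y <= DRAWN}
for (x, y), p in sorted(joint.items()):
    print(f"f({x},{y}) = {p}")

# (b) Sum f(x, y) over the region x + y <= 1.
p_region = sum(p for (x, y), p in joint.items() if x + y <= 1)
print("P[(X,Y) in A] =", p_region)
```

Part (b) adds f(0,0) + f(0,1) + f(1,0) = 3/28 + 6/28 + 9/28.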