EDA Prelim Week 1 Intro to Stats

This document outlines the fundamental concepts of statistics, including definitions, types, and applications in various fields such as business, health, and education. It emphasizes the importance of mastering statistical methods for reading professional literature, conducting research, and making informed decisions. Additionally, it covers key terminologies, data classification, levels of measurement, and sampling techniques essential for effective data analysis.

Uploaded by

therealncle
Copyright
© © All Rights Reserved

Week 1 - Engineering Data Analysis

LEARNING OUTCOMES
At the end of the chapter, the students should be able to:
1. Define Statistics;
2. Differentiate descriptive and inferential Statistics;
3. Define the important terminologies in Statistics;
4. Cite examples of qualitative and quantitative data;
5. Enumerate and describe each of the scales of measurement; and
6. Identify the type of measurement in a given data.

INTRODUCTION
People use statistics as tools to understand information. Learning to understand statistics helps a person react
intelligently to statistical claims. Statistics are used in the fields of business, math, economics, accounting,
banking, government, astronomy, and the natural and social sciences.

For this week’s lesson, we will review the basic concepts of statistical methods, the importance of
statistics, and the applications of statistics. Let us first take note of why students need statistics.

According to the National Open University of Nigeria, there are four simple reasons why students must
develop some mastery of the subject:
1. They must be able to read professional literature.

Students should never stop extending their skill in the art of reading, as this increases their
vocabulary. But one cannot read much of the literature in any specialized field in the social sciences,
particularly in the behavioral sciences and education, without encountering statistical symbols, concepts
and ideas at every turn. Therefore, persons who cannot read the average research paper in their field
with intelligence and with some appreciation as to whether sound conclusions have been reached are
severely limited. This appreciation requires some level of familiarity with basic statistical ideas.

2. They must master techniques needed in advanced courses.

In any laboratory course or experimental analysis, results cannot be treated or reports written without
at least minimal statistical operations. Even a field survey or the checking of a report involves
inevitable statistical steps.

3. Statistics are an essential part of professional training.

Trained psychologists and educators, as the professionals they are, need statistical logic, statistical
thinking, and statistical operations. Since their practice requires common technical instruments such as
tests and scales, psychologists and educators depend upon a statistical background in administering them
and in interpreting their results. You should note that using these tests and scales without knowledge of
the statistical reasoning upon which they depend is like a medical diagnostician using clinical tests
without knowledge of physiology and pathology.
____________________________________________________________________________________________________
Lecture 1 Introduction to Statistics PRELIM
This document is a property of University of Saint Louis Tuguegarao. It must not be reproduced or transmitted in
any form, in whole or in part, without expressed written permission.

4. Statistics are basic to research activities.

According to Guilford and Fruchter (1981), to the extent that the psychologist or educator intends to keep
alive his research interests and research activities, he must as a matter of necessity lean upon his
knowledge and skills in statistical methods. Let it therefore be emphasized that in any professional field
where there are still so many unknowns, as in the behavioral sciences, the advancement of the profession
and of the competence of its members depends to a high degree upon the continued research attitude and
research efforts of those members.

Advantages of Statistics in Research


1. They permit the most exact kind of description.

The goal of science is the description of phenomena. But a description is complete, accurate, and useful
only to someone who can understand the symbols in terms of which those phenomena are described.
Mathematics and statistics are a part of the descriptive language, an outgrowth of our verbal symbols
particularly adapted to the efficient kind of description which the scientist demands.

2. They force us to be definite and exact in our procedures and in our thinking.

Statistical operations require our methods to be definite, and statistical logic makes it imperative to
be right.

3. They enable us to summarize our results in a meaningful and convenient form.

Most observations taken are bewildering and meaningless by themselves, but statistics provide an
unrivaled device for bringing order out of chaos and for seeing the general picture in one’s results.

4. They enable us to draw general conclusions.

There is a process of extracting conclusions from data-based research, and this process is carried out
according to accepted rules. Therefore, by means of statistical steps we can say how much faith should be
placed in any conclusion and how far we may extend our generalization.

5. They enable us to predict.

Statistics are used to predict how much of a thing will happen under conditions we know and have
measured. Statistical methods also tell us how much margin of error to allow in making predictions. We
can therefore not only make predictions but also know how much faith to place in them.

6. They enable us to analyze some of the causal factors underlying complex and otherwise bewildering
events.

It is generally true in the social sciences, psychology and education that any event or outcome is the
result of numerous causal factors. Since it is not easy to control people and their affairs sufficiently
in experiments, the best alternative is to make a statistical study and to base our predictions on its
findings.

Uses of Statistics


Application of Statistics in the Other Fields

Business and Industry – Manufacturing
Build products and deliver services that satisfy consumers and increase the corporation’s profit margin

Engineering
Make a consistent product, detect problems, minimize waste, and predict product life in electronics,
chemicals, aerospace, pollution control, construction, and other industries

Statistical Computing
Work in software design and development, testing, quality assurance, technical support, education,
marketing, and sales to develop code that is both user-friendly and sufficiently complex

Health and Medicine – Epidemiology
Calculate cancer incidence rates, monitor disease outbreaks, and monitor changes in health-related
behaviors such as smoking and physical activity

Health and Medicine – Public Health
Prevent disease, prolong life, and promote health through organized community efforts, including
sanitation, hygiene education, diagnoses, and preventative treatment

Health and Medicine – Pharmacology
Work in drug discovery, development, approval, and marketing, to ensure the validity and accuracy of
findings at all stages of the process

Learning – Education
Teach K-12 through post-graduate students, assess teacher effectiveness, or develop statistical models to
represent student learning

Social Statistics – Law
Analyze data in court cases, including DNA evidence, salary discrepancies, discrimination lawsuits, and
disease clusters


Statistics
It is the science of collecting, organizing, analyzing, and interpreting data in order to make decisions.

TWO BRANCHES OF STATISTICS


I. Descriptive Statistics
▪ involves organizing, summarizing, and displaying data.
▪ collects data through surveys and interviews
▪ presents data by means of tables and graphs
▪ characterizes data using summary measures such as the sample mean

Examples
1. A bowler wants to find his bowling average for the past 10 games.
2. A teacher wishes to determine the percentage of students who passed the preliminary
examination in Differential Calculus.
3. A student wishes to determine the average monthly expenditure on school supplies
for the past 3 months.
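
The first example above can be sketched in a few lines of Python. The scores are hypothetical values chosen only for illustration; the point is that the result describes these 10 games and nothing beyond them.

```python
from statistics import mean

# Hypothetical scores for a bowler's last 10 games (illustrative values only).
scores = [182, 175, 201, 168, 190, 177, 185, 199, 172, 181]

# Descriptive statistics summarize the data at hand; no claim is made about future games.
bowling_average = mean(scores)
percent_190_plus = 100 * sum(s >= 190 for s in scores) / len(scores)

print(f"Average over {len(scores)} games: {bowling_average}")
print(f"Games with a score of 190 or more: {percent_190_plus:.0f}%")
```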

II. Inferential Statistics


▪ involves using sample data to draw conclusions and/or make decisions about a population.
Examples
1. A manager would like to predict, based on previous years’ sales, the sales performance
of a company for the next five years.
2. A politician would like to estimate, based on an opinion poll, his chance of winning in the
upcoming 2019 senatorial election.
3. A basketball player wants to estimate his chance of winning the most valuable player
(MVP) award based on his season averages and the averages of his opponents.
Estimation
Estimate the population mean weight using the sample mean weight

Hypothesis testing
Test the claim that the population mean weight is 120 pounds
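
As a rough sketch of both ideas, the Python snippet below estimates a population mean weight from a sample and then tests the claim that the population mean is 120 pounds. The ten weights are hypothetical, and the one-sample t procedure with the tabulated critical value 2.262 (two-sided, 5% level, 9 degrees of freedom) is one common choice assumed here, not something prescribed by this lecture.

```python
from math import sqrt
from statistics import mean, stdev

# Hypothetical sample of 10 weights in pounds (illustrative values only).
weights = [118, 125, 130, 121, 127, 119, 133, 124, 128, 122]

n = len(weights)
x_bar = mean(weights)        # sample mean: the point estimate of the population mean
s = stdev(weights)           # sample standard deviation (n - 1 in the denominator)
std_err = s / sqrt(n)        # standard error of the sample mean

# Estimation: 95% confidence interval for the population mean weight.
T_CRIT = 2.262               # t table value for 9 degrees of freedom, two-sided 5% level
ci_low, ci_high = x_bar - T_CRIT * std_err, x_bar + T_CRIT * std_err

# Hypothesis testing: the null hypothesis claims the population mean is 120 pounds.
MU_0 = 120
t_stat = (x_bar - MU_0) / std_err
reject_h0 = abs(t_stat) > T_CRIT

print(f"Estimate: {x_bar:.1f} lb, 95% CI ({ci_low:.1f}, {ci_high:.1f})")
print(f"t = {t_stat:.2f}, reject the claim of 120 lb: {reject_h0}")
```

The two parts agree by construction: the claim is rejected at the 5% level exactly when 120 falls outside the 95% confidence interval.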

Important Terminologies in Statistics


Population
The collection of all responses, measurements, or counts that are of interest


Sample
A portion or subset of the population

Parameter
A number that describes a population characteristic
Example: Average gross income of all people in the Philippines in 2020.

Statistic
A number that describes a sample characteristic
Example: Average 2020 gross income of the people in a sample drawn from 3 regions.
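
The four terms can be illustrated together with a small simulation. Everything below is hypothetical: the population is a list of randomly generated incomes, and the sample size of 50 is an arbitrary choice.

```python
import random
from statistics import mean

random.seed(1)  # fixed seed so the run is repeatable

# Hypothetical population: annual gross incomes (in thousands) of 1,000 people.
population = [random.randint(100, 900) for _ in range(1000)]

# Sample: a subset of the population, here 50 people chosen at random.
sample = random.sample(population, 50)

parameter = mean(population)   # describes the whole population
statistic = mean(sample)       # describes only the sample

print(f"Parameter (population mean): {parameter:.1f}")
print(f"Statistic (sample mean):     {statistic:.1f}")
```

In practice the parameter is usually unknown because the whole population cannot be measured, which is why the statistic is used to estimate it.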

DATA CLASSIFICATION
WHY DO WE NEED DATA?
▪ To provide input to a survey
▪ To provide input to a study
▪ To measure the performance of a service or production process
▪ To evaluate conformance to standards
▪ To assist in formulating alternative courses of action
▪ To satisfy curiosity

VARIABLES AND THEIR MEASUREMENT


VARIABLE
A characteristic of persons, objects or events that differs in value across persons, objects or events

Dichotomous variable
A variable that can have only two values

QUALITIES OF VARIABLES
▪ Exhaustive: should include all possible responses
▪ Mutually exclusive: no respondent should be able to have two attributes simultaneously
Example: Employed vs. Unemployed. Unless the categories are clearly defined, a respondent who is
employed but looking for a second job might seem to fit both.

QUALITATIVE VARIABLE
Variable whose observations vary in kind but not in degree
▪ Sex
▪ Religion
▪ Marital status

QUANTITATIVE VARIABLE
Variable whose observations vary in magnitude
▪ Age
▪ No. of children
▪ Income

DISCRETE VARIABLES
Quantitative variables whose observations can assume only a countable number of values
▪ No. of children in the family
▪ No. of family planning methods heard
▪ No. of dates in the past month

CONTINUOUS VARIABLES
Quantitative variables whose observations can assume any one of the uncountably many values in an
interval of the number line
▪ Height
▪ Weight
▪ Time

INDEPENDENT VARIABLES
Cause or determine or influence the dependent variable(s)

DEPENDENT VARIABLES
Presumed outcome of the influence of the independent variable(s)

INTERVENING VARIABLES
▪ Sometimes referred to as test or control variables
▪ Used to test whether the observed relations between the independent and dependent
variables are spurious
▪ Serve either to increase or decrease the effect the independent variable has on the dependent
variable

LEVELS OF MEASUREMENT
1. Nominal
A measurement level in which numbers are used as labels or names rather than to reflect
quantitative information
Examples
▪ Sex 1 = Male
2 = Female
▪ Marital status
▪ ID number

2. Ordinal
A measurement level in which values reflect only rank order
Examples
▪ Educational attainment
1 = Elementary
2 = High School
3 = College
▪ Opinion on an issue (Strongly agree, Agree, Neutral, Disagree, Strongly disagree)

3. Interval
A measurement level with an arbitrary zero point in which numerically equal intervals at
different locations on the scale reflect the same quantitative difference
Examples
▪ Temperature in Celsius or Fahrenheit
▪ IQ level

4. Ratio
The highest level of measurement that has all the characteristics of the interval scale plus a
true zero point
Examples
▪ Income
▪ No. of children
▪ Age

Properties Held by Each Level of Measurement


Level of measurement    Categories    Ranks    Equal intervals    True zero point
Nominal                 Yes           No       No                 No
Ordinal                 Yes           Yes      No                 No
Interval                Yes           Yes      Yes                No
Ratio                   Yes           Yes      Yes                Yes

Levels of measurement guidelines


▪ It is usually best to gather data at the highest level of measurement possible, because one can
perform more mathematical operations and gain greater precision of measurement
▪ Interval and ratio variables can be converted into ordinal or nominal variables, but not
vice versa
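
The second guideline can be seen in a short Python sketch. The ages and the cut-off points (18 and 60) are hypothetical choices made only for this illustration.

```python
# Ratio-level data: ages in years (hypothetical values).
ages = [8, 15, 22, 37, 41, 68]

def to_ordinal(age):
    """Collapse an age into ranked groups: 1 = child, 2 = adult, 3 = senior."""
    if age < 18:
        return 1
    if age < 60:
        return 2
    return 3

def to_nominal(age):
    """Collapse an age into an unranked label."""
    return "minor" if age < 18 else "adult"

ordinal = [to_ordinal(a) for a in ages]
nominal = [to_nominal(a) for a in ages]

print(ordinal)
print(nominal)
```

Going down a level discards information: the ages 22, 37 and 41 all become group 2, so the original values cannot be recovered from the ordinal codes. This is why the conversion works in one direction only.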


SAMPLING TECHNIQUES
Sampling is the process of selecting a sample from the population in such a way that what is true for
the sample can be expected to be true for the population, or simply “the process of measuring a small
portion of something and making a general statement about the whole thing.”

TYPES OF SAMPLING
Probability
each element in the population has a known, nonzero chance of being selected (in simple random sampling,
an equal chance). The goal is to obtain a sample representative of the target population
Examples
1. Simple random sampling
2. Stratified random sampling
3. Cluster sampling
4. Systematic Sampling

Nonprobability
1. Consecutive sampling: commonly used in intervention studies.
2. Convenience sampling
3. Purposive sampling: commonly used in qualitative research.

Random Sampling
Each member of the population has an equal chance of being selected.

Simple Random Sampling


All samples of the same size are equally likely.
Assign a number to each member of the population. Random numbers can be generated by a random number
table, a software program or a calculator. Data from members of the population that correspond to these
numbers become members of the sample.
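
A minimal sketch of this procedure in Python follows, assuming a hypothetical population of 100 numbered members and a sample size of 10. The software-generated random numbers play the role of the random number table.

```python
import random

random.seed(42)  # fixed seed so the run is repeatable

# Hypothetical population of 100 members, numbered 1 to 100.
population = [f"member_{i}" for i in range(1, 101)]

# Draw 10 distinct random numbers between 1 and 100.
numbers = random.sample(range(1, 101), 10)

# Members whose numbers were drawn become the sample.
simple_random_sample = [population[i - 1] for i in numbers]

print(sorted(numbers))
print(simple_random_sample)
```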

Stratified Random Sampling


Divide the population into groups (strata) and select a random sample from each group. Strata could
be age groups, gender or levels of education, for example.
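
The sketch below shows one way to do this in Python. The strata, their sizes, and the choice of sampling the same 10% fraction from every stratum (proportional allocation) are assumptions made for the illustration; the description above only requires a random sample from each group.

```python
import random

random.seed(7)  # fixed seed so the run is repeatable

# Hypothetical population of 100 people, grouped by level of education.
strata = {
    "elementary": list(range(0, 50)),
    "high_school": list(range(50, 80)),
    "college": list(range(80, 100)),
}

def stratified_sample(strata, fraction):
    """Draw a simple random sample of the given fraction from every stratum."""
    sample = {}
    for name, members in strata.items():
        k = max(1, round(len(members) * fraction))  # at least one member per stratum
        sample[name] = random.sample(members, k)
    return sample

result = stratified_sample(strata, 0.10)
for name, members in result.items():
    print(name, members)
```

Every stratum is represented in the sample, which simple random sampling alone does not guarantee.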

Cluster Sampling
Divide the population into individual units or groups and randomly select one or more units. The
sample consists of all members from selected unit(s).
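
In code, the difference from stratified sampling is that the random selection is applied to the groups, not to the members within them. The four class sections of five students each are hypothetical.

```python
import random

random.seed(3)  # fixed seed so the run is repeatable

# Hypothetical population already divided into 4 clusters (class sections) of 5 students.
clusters = {f"section_{c}": [f"{c}{i}" for i in range(1, 6)] for c in "ABCD"}

# Randomly select 2 whole clusters.
chosen = random.sample(sorted(clusters), 2)

# The sample consists of ALL members of the selected clusters.
cluster_sample = [member for name in chosen for member in clusters[name]]

print(chosen)
print(cluster_sample)
```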


Systematic Sampling
Choose a starting value at random. Then choose sample members at regular intervals.
We say we choose every kth member. For example, with k = 5, every 5th member of the population is
selected.
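
With k = 5, systematic sampling reduces to a random start followed by a fixed step. The population of 50 numbered members is hypothetical.

```python
import random

random.seed(5)  # fixed seed so the run is repeatable

population = list(range(1, 51))   # hypothetical population, members numbered 1 to 50
k = 5                             # sampling interval: every 5th member

start = random.randrange(k)       # random starting position among the first k members
systematic_sample = population[start::k]

print(systematic_sample)
```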

Convenience Sampling
Choose readily available members of the population for your sample.

Sample Size
Purpose: To make a rough estimate of how many subjects are required to answer the research question.
During the design of the study, the sample size calculation will indicate whether the study is feasible.
During the review phase, it will reassure the reviewers that not only is the study feasible, but also
that resources are not being wasted by recruiting more subjects than necessary.

Two Basic Methods of Sample Size Estimation


1. Hypothesis-based
2. Confidence interval-based

Brief Overview of Sample Size Calculations

Hypothesis-based sample sizes

Hypothesis-based sample sizes indicate the number of subjects necessary to reasonably test the primary
study hypothesis. Hypotheses can be shown to be wrong, but they can never be proven correct. This is
because the investigator cannot test all people in the world with the condition of interest. The
investigator attempts to test the research hypothesis through a sample of the larger target population.
Hypothesis-based sample size


From the data collected, inferences are made about the larger population. For example, if 80% of
patients self-administering analgesia report good pain control, whereas only 40% of patients
receiving nurse-administered analgesia report good pain control, one would conclude that there is a
difference between the two methods and that self-administered analgesia is superior. However,
since we have only used a sample of all possible patients, there is always a possibility that there is,
in fact, no difference between the two and that the results have occurred just by chance. To test
this formally, a statistical test would be done.

In this case the P value is 0.03. This means that, if in truth there is no difference between the two
methods, the probability of obtaining these results or results even more extreme is 3%. Therefore,
either self-administered analgesia is better than nurse-administered analgesia or a very unusual event
has occurred. When there is truly no difference between two interventions, but the results of our study
suggest there is a difference, a type 1 error has occurred. Generally, studies will accept a 5% risk
(α level) of making a type 1 error, so a P value below 0.05 is taken as evidence of a difference. Note
that the P value is not itself the probability that a type 1 error has been made; it is calculated on
the assumption that there is no true difference.
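
To make the calculation concrete, here is a sketch of a two-proportion z-test in Python. The text does not state how many patients were in each group or which test was used, so the group size of 15 and the choice of a z-test (a normal approximation that is rough for groups this small) are assumptions for illustration only.

```python
from math import sqrt
from statistics import NormalDist

def two_proportion_p_value(successes_1, n_1, successes_2, n_2):
    """Two-sided P value for the null hypothesis that two population proportions are equal."""
    p1, p2 = successes_1 / n_1, successes_2 / n_2
    pooled = (successes_1 + successes_2) / (n_1 + n_2)   # proportion if there is no difference
    std_err = sqrt(pooled * (1 - pooled) * (1 / n_1 + 1 / n_2))
    z = (p1 - p2) / std_err
    return 2 * (1 - NormalDist().cdf(abs(z)))

# Assumed group sizes: 15 patients per group (not given in the text).
# 12 of 15 = 80% good pain control (self-administered), 6 of 15 = 40% (nurse-administered).
p_value = two_proportion_p_value(12, 15, 6, 15)

ALPHA = 0.05                     # accepted risk of a type 1 error
significant = p_value < ALPHA

print(f"P value = {p_value:.3f}, significant at the 5% level: {significant}")
```

With these assumed group sizes the P value falls below 0.05, in the same region as the 0.03 quoted above. The same percentages from smaller groups would give a larger P value, which is why sample size matters.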

A sample size spreadsheet can be downloaded by clicking on the download button at the link:
https://www.research-advisors.com/tools/SampleSize.htm


A type 2 error occurs when we conclude there is no evidence of a difference between two groups,
when in truth there is. Most investigators accept a greater risk of making a type 2 error, usually 10%
or 20% (β level).

Components of the Hypothesis-based Sample Size Calculation


1. Type 1 error (α): falsely rejects a true null hypothesis
▪ Usual risk: 0.05
2. Type 2 error (β): fails to reject a false null hypothesis
▪ Usual risk: 0.1 to 0.2
▪ Study’s power = 1 − β
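
These components can be combined into a sample size estimate. The sketch below uses a standard normal-approximation formula for comparing two proportions, n per group = (z for α + z for β)² × [p1(1 − p1) + p2(1 − p2)] / (p1 − p2)². This formula is a common textbook choice assumed here, not one given in this lecture, and the function name is hypothetical. The proportions 0.80 and 0.40 are taken from the analgesia example.

```python
from math import ceil
from statistics import NormalDist

def sample_size_two_proportions(p1, p2, alpha=0.05, beta=0.20):
    """Subjects needed per group to detect p1 versus p2 (two-sided test, normal approximation)."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)   # from the type 1 error risk
    z_beta = NormalDist().inv_cdf(1 - beta)         # from the type 2 error risk; power = 1 - beta
    variance = p1 * (1 - p1) + p2 * (1 - p2)
    return ceil((z_alpha + z_beta) ** 2 * variance / (p1 - p2) ** 2)

n_80_power = sample_size_two_proportions(0.80, 0.40)             # beta = 0.20, power 80%
n_90_power = sample_size_two_proportions(0.80, 0.40, beta=0.10)  # beta = 0.10, power 90%

print(f"Per group at 80% power: {n_80_power}")
print(f"Per group at 90% power: {n_90_power}")
```

Accepting a smaller type 2 error risk (higher power) requires more subjects, and so does trying to detect a smaller difference between the two proportions.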
