STS Reviewer

Uploaded by

gwellnrd6
Copyright
© © All Rights Reserved
Overview of Statistics

Statistics
− Statistics is the science of collecting, organizing, summarizing, and analyzing information to draw conclusions or answer questions.
− It provides procedures for data collection, presentation, organization, and interpretation so that meaningful ideas can be drawn from the data.

Importance of Statistics
− Statistics plays a major role in many aspects of our lives.
− It is used in sports, for example, to help a general manager decide which player might be the best fit for a team.
− It is used in politics to help candidates understand how the public feels about various policies.
− It is used in medicine to help determine the effectiveness of new drugs.
− Statistical research in business enables managers to analyze past performance, predict future business practices, and lead organizations effectively. Statistics can describe markets, inform advertising, set prices, and respond to changes in consumer demand.
− Statistical tools are widely used in economics and finance, where they help to shape effective monetary and fiscal policies and to develop pricing models for financial assets such as equities, bonds, currencies, and derivative securities.

Computer Software
− Imagine you've just spent weeks, months, or even years gathering data for a research project, and now you want to analyze it all to find out what it means. If the data seems too massive to handle, then you use computer software to deal with the data and make sure the results are useful and informative.

SPSS
• SPSS (Statistical Package for the Social Sciences) is perhaps the most widely used statistics software package within human behavior research. SPSS offers the ability to easily compile descriptive statistics, parametric and non-parametric analyses, as well as graphical depictions of results through the graphical user interface (GUI).

R
• R is a free statistical software package that is widely used across both human behavior research and other fields. While R is very powerful software, it also has a steep learning curve, requiring a certain degree of coding.

MATLAB
• MATLAB is an analytical platform and programming language that is widely used by engineers and scientists. As with R, the learning path is steep, and you will be required to create your own code at some point.

SAS
• SAS is a statistical analysis platform that offers the option of either using the GUI or creating scripts for more advanced analyses. It is a premium solution that is widely used in business, healthcare, and human behavior research alike.

GraphPad Prism
• GraphPad Prism is premium software primarily used for statistics related to biology, but it offers a range of capabilities that can be used across various fields.

Minitab
• The Minitab software offers a range of both basic and fairly advanced statistical tools for data analysis.

Excel
• Excel offers a wide variety of tools for data visualization and simple statistics. It is simple to generate summary metrics and customizable graphics and figures, making it a usable tool for many who want to see the basics of their data.

Recall
• Statistics is the science of collecting, organizing, summarizing, and analyzing information to draw conclusions or answer questions.
• It provides procedures for data collection, presentation, organization, and interpretation so that meaningful ideas can be drawn from the data.

Data
• The information referred to in the definition is the data.
• According to the Merriam-Webster dictionary, data are “factual information used as a basis for reasoning, discussion, or calculation.”

Types of Statistics

Descriptive Statistics
It basically consists of organizing and summarizing data. Descriptive statistics describe data through numerical summaries, tables, and graphs.
Examples:
1. The average score of a volleyball player for the past 10 games
2. Birth rate in rural areas in the Philippines
3. Enrollment record of all colleges in BSU – TNEU Lipa Campus

Inferential Statistics
It is the logical process that involves generalizing from a sample to the population from which the sample was selected and assessing the reliability of such generalizations. It is also called statistical inference or inductive statistics.
Examples:
1. A car manufacturer wishes to estimate the average lifetime of batteries by testing a sample of 50 batteries.
2. The political views of the youth in the urban areas with respect to the inflation rate in Asia
3. A campaign manager analyzes the effect of TV ads on the promotion of a presidential candidate

Basic Terminologies in Statistics
❖ A population consists of all the members of the group about which you want to draw a conclusion, while a sample is a portion or part of the population of interest selected for analysis.
❖ A parameter is a numerical index describing a characteristic of a population, while a statistic is a numerical index describing a characteristic of a sample.

Sources of Data
❖ Primary data are data that come from an original source and are intended to answer a specific research question. They can be gathered by interview, mail-in questionnaire, survey, or experimentation.
❖ Secondary data are data taken from previously recorded data, such as information in previously conducted research, financial statements, business periodicals, and government reports. They can also be obtained electronically, for instance from internet websites.
❖ A constant is a characteristic of objects, people, or events that does not vary. For example, the temperature at which water boils (100 degrees Celsius) is a constant.
❖ A variable is a characteristic of objects, people, or events that can take different values. It can vary in quantity, like the weight of people, or in quality, like the hair color of people.

Two Types of Variables
❖ Qualitative variables or categorical variables are variables that yield categorical responses. These are words or codes that represent a class or category.
− Some examples of qualitative variables are eye color, sex, occupation, student number, etc.
❖ Quantitative variables or numerical variables are variables that take on numerical values representing an amount or quantity. These numerical values should answer the question how much or how many.
− Some examples of quantitative variables are height, weight, distance, salary, etc.
❖ Variables can also be classified according to purpose, as either experimental or mathematical.

Experimental Classification
➢ Independent variables or explanatory variables are variables controlled by the experimenter or researcher, and are expected to have an effect on the behavior of the subjects.
➢ Dependent variables or outcome variables measure the behavior of subjects and are expected to be influenced by the independent variable.
o Example: to determine the effect of fertilizer on the growth of plants, the dependent variable is the growth of plants while the independent variable is the amount of fertilizer used.

Mathematical Classification
→ Discrete variables are quantitative variables that have either a finite number of possible values or a countable number of possible values. These are variables that are countable.
Some examples of these variables are number of cars, number of siblings, etc.
→ Continuous variables are quantitative variables that have an infinite number of possible values that are not countable. These are variables that are not countable but are measurable.
Some examples of these variables are height, weight, volume, etc.

Level of Measurement of Variables
→ Nominal Level is the first level of measurement and it is characterized by data that consist of names, labels, or categories only. The data cannot be arranged in an ordering scheme. Nominal scales have no numerical value.
Some examples of nominal level variables are
- Sex (male or female)
- Type of School (public or private)
- Eye Color (blue, green, brown).

→ Ordinal Level involves data that may be arranged in some order, but differences between data values either cannot be determined or are meaningless. An ordinal scale not only classifies subjects but also
ranks them in terms of the degree to which they possess a characteristic of interest.
Some examples of ordinal level variables are
- Highest Educational Attainment (elementary, high school, bachelor's, master's, doctoral)
- Rank of military officer (lieutenant, captain, major, colonel).

→ Interval Level is a measurement level that specifies the distances between each interval on the scale. Variables of this level have no absolute zero. This means that a value of zero does not mean the absence of the quantity.
Some examples of interval level variables are
- Temperature on a Fahrenheit/Celsius thermometer
- IQ (e.g., high IQ vs. average IQ vs. low IQ)

→ Ratio Level represents the highest, most precise level of measurement. Variables of this level have an absolute zero, which means that a value of zero means the absence of the quantity.
Some examples of ratio level variables are
- Height and weight
- Time
- Distance and speed

Important Note
If the entire population is studied, then inferential statistics is not necessary, because descriptive statistics will provide all the information that we need regarding the population.

Data collection is the process of gathering and measuring information on variables of interest, in an established systematic fashion that enables one to answer stated research questions, test hypotheses, and evaluate outcomes.

Steps in Data Gathering
1. Set the objectives for collecting data.
2. Determine the data needed based on the set objectives.
3. Determine the method to be used in data gathering and define the comprehensive data collection points.
4. Design data gathering forms to be used.
5. Collect data.

Methods of Data Collection

Primary data can be collected by:
1. Direct personal interviews – The researcher has direct contact with the interviewee. The researcher gathers information by asking the interviewee questions.
2. Indirect/Questionnaire Method – These methods of data collection involve sourcing and accessing existing data that were originally collected for the purpose of the study.
→ Questions can either be:
o An open-ended question is a type of question that does not include response categories. This type of question is usually appropriate for collecting subjective data.
o A closed-ended question is a type of question that includes a list of response categories from which the respondent selects an answer. This type of question is usually appropriate for collecting objective data.
3. Focus Group – It is a group interview of approximately six to twelve people who share similar characteristics or common interests. A facilitator guides the group based on a predetermined set of topics.
4. Experiment – It is a method of collecting data where there is direct human intervention on the conditions that may affect the values of the variable of interest.
5. Observation – It is a method of collecting data on the phenomenon of interest by recording the observations made about the phenomenon as it actually happens. It involves collecting information without asking questions.

Secondary data can be collected by:
1. Published reports in newspapers and periodicals.
2. Financial data reported in annual reports.
3. Records maintained by the institution.
4. Internal reports of government departments.
5. Information from official publications.

Sample Size
→ The sample size is typically denoted by n and it is always a positive integer. No single sample size can be prescribed here; it varies across research settings. However, all else being equal, a larger sample leads to increased precision in estimates of various properties of the population.

The choice of sample size depends on both non-statistical and statistical considerations.
• Non-statistical considerations – These may include availability of resources, manpower, budget, ethics, and the sampling frame.
• Statistical considerations – These include the desired precision of the estimate.
Three criteria need to be specified to determine the appropriate sample size:

1. Level of Precision
− Also called sampling error, the level of precision is the range in which the true value of the population is estimated to be.

2. Confidence Level
− It is a statistical measure of the number of times out of 100 that results can be expected to be within a specified range. For example, a confidence level of 90% means that results of an action will probably meet expectations 90% of the time.

3. Degree of Variability
− Depending upon the target population and attributes under consideration, the degree of variability varies considerably. The more heterogeneous a population is, the larger the sample size required to get an optimum level of precision.

Basic Sampling Design
The goal in sampling is to obtain individuals for a study in such a way that accurate information about the population can be obtained.

Reason for Sampling
- It is important that the individuals included in a sample represent a cross section of individuals in the population.
- If a sample is not representative, it is biased, and you cannot generalize to the population from your statistical data.

Definitions
• Observation unit - An object on which a measurement is taken. This is the basic unit of observation, sometimes called an element. In studying human populations, observation units are often individuals.
• Target population - The complete collection of observations we want to study.
• Sampled population - The collection of all possible observation units that might have been chosen in a sample; the population from which the sample was taken.
• Sample - A subset of a population.
• Sampling unit - A unit that can be selected for a sample. We may want to study individuals, but do not have a list of all individuals in the target population. Instead, households serve as the sampling units, and the observation units are the individuals living in the households.
• Sampling frame - A list, map, or other specification of sampling units in the population from which a sample may be selected. For a survey using in-person interviews, the sampling frame might be a list of all street addresses.
• Sampling technique/Sampling Strategies - It is a plan you set forth to be sure that the sample you use in your research study represents the population from which you drew your sample.
• Sampling Bias - This involves problems in your sampling which mean that your sample is not representative of your population.

Advantages of Sampling
Here are the advantages of sampling over complete enumeration:
- Less Labor
- Greater Efficiency and Accuracy
- Reduced Cost
- Convenience
- Greater Speed
- Ethical Considerations
- Greater Scope

→ A population is the group to which the results of the study are intended to apply. A sample is a group in a research study on which information is obtained. One of the most important steps in the research process is to select the sample of individuals who will participate as a part of the study.
→ Sampling refers to the process of selecting these individuals.

Two Types of Sampling

Random Sampling or Probability Sampling
o It is a process in which every member of the population has an equal chance of being selected. Samples are obtained using some objective chance mechanism, thus involving randomization.
o They require the use of a complete listing of the elements of the universe called the sampling frame.
o The probabilities of selection are known. They are generally referred to as random samples. They allow drawing of valid generalizations about the universe/population.

a. Simple Random Sampling
It is the most basic method of drawing a probability sample, which assigns equal probabilities of selection to each possible sample. It is also a process of selecting a sample of size n from the population via random numbers or by lottery.
EXAMPLE:
Alice conducted a study to determine the prevalence of malaria in a province. From the list of 300 health centers, Alice obtained 100 health centers using a random number generator. The directors of each sampled health center were interviewed to obtain the necessary information.
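
The health-center example for simple random sampling can be sketched in Python as follows. This is only an illustration under stated assumptions: the sampling frame is represented by hypothetical ID numbers 1 to 300, the function name `simple_random_sample` is made up for this sketch, and the seed is fixed only so the draw can be repeated.

```python
import random


def simple_random_sample(frame, n, seed=None):
    """Draw n distinct units from the sampling frame without replacement.

    random.Random.sample gives every subset of size n the same chance
    of being chosen, which matches the definition of a simple random sample.
    """
    rng = random.Random(seed)
    return rng.sample(frame, n)


# Hypothetical sampling frame: health centers identified by ID numbers 1..300.
health_centers = list(range(1, 301))

# Draw 100 of the 300 centers, as in the example.
sampled_centers = simple_random_sample(health_centers, 100, seed=42)
print(len(sampled_centers), "centers selected")
```

Because the draw is made without replacement, no health center can appear twice in the sample.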
b. Systematic Sampling
It is obtained by selecting every kth individual from the population until the desired number of subjects or respondents is obtained. The first individual selected corresponds to a random number between 1 and k.
EXAMPLE:
Leni conducted a study to determine the prevalence of malaria in a province. From the list of all patients in the province, Leni sampled 50 patients starting from the patient with ID number 4 and every 23rd patient thereafter, and retrieved their medical records.

c. Stratified Random Sampling
It is obtained by separating the population into non-overlapping groups called strata and then obtaining a simple random sample from each stratum. The individuals within each stratum should be homogeneous (or similar) in some way.
EXAMPLE:
A media manager wants to determine the proportion of Filipino households who patronize their nationwide drama program simultaneously aired on radio and shown on TV. Using the sampling frame of households arranged by region, 200 households from each region were randomly selected.

d. Cluster Sampling
It is a process of selecting clusters from a population which is very large or widely spread out over a wide geographical area.
EXAMPLE:
The Fuds Administration (FA) wants to know if there are high levels of aflatoxin in Gagaraya’s Cracker Nut. The FA head took a random sample of batches of the said cracker nut, and all bags in the chosen batches are included in the sample.

Non-random Sampling or Non-probability Sampling
o It is a sampling procedure where samples are selected in a deliberate manner with little or no attention to randomization. Samples are obtained haphazardly, selected purposively, or are taken as volunteers. The probabilities of selection are unknown. They should not be used for statistical inference.

a. Convenience Sampling
It is a process of selecting a group of individuals who are conveniently available for a study.
EXAMPLE:
A researcher may include only close friends and clients in the sample.

b. Purposive Sampling
It is a process of selecting a sample based on judgment: the researcher chooses a sample which he or she believes, based on prior information, will provide the data needed.
EXAMPLE:
A human resource director interviews the qualified applicants for a supervisory position.

c. Quota Sampling
It is applied when an investigator collects information from an assigned number, or quota, of individuals from one of several sample units fulfilling certain prescribed criteria or belonging to one stratum.
EXAMPLE:
The respondents are composed of men aged over 30, or of 20 people who have bought cellular phones in the last week. It is at the interviewer’s discretion which men or cellular phone buyers are selected.

d. Snowball Sampling
It is a technique in which one or more members of a population are located and used to lead the researchers to other members of the population.
EXAMPLE:
To obtain a sample of homeless individuals, the researcher will interview individuals on the street or at a homeless shelter.

e. Voluntary Sampling
It is a technique in which the sample is composed of respondents who self-select (volunteer) into the study/survey. Most of the time, the respondents have a strong interest in the topic of the study.
EXAMPLE:
A news show asks its viewers to participate in an online poll. The sample consists of viewers who chose themselves rather than being chosen by the survey administrator.

Measure of Central Tendency
- A measure of central tendency, commonly referred to as an average, is a single value that represents a data set. Its purpose is to locate the center of a data set.

There are three different measures of central tendency: mean, median, mode.

Mean
− The mean, or arithmetic mean, is the most frequently used measure of central tendency. It is the only common measure in which all values play an equal role, meaning that to determine its value you need to consider all the values of any given data set.
− It is appropriate for determining the central tendency of interval or ratio data.
− The symbol x̄, called “x bar”, is used to represent the mean of a sample and the symbol μ, called “mu”, is used to denote the mean of a population.
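
As a rough illustration of the three measures of central tendency named in this section, the sketch below uses Python's standard `statistics` module. The scores are made-up values, with one deliberately extreme score included to show how differently the mean and the median react to it.

```python
import statistics

# Hypothetical scores; 95 is a deliberately extreme value (an outlier).
scores = [12, 15, 15, 18, 20, 22, 95]

mean_score = statistics.mean(scores)      # sum of all values / number of values
median_score = statistics.median(scores)  # middle value of the ordered data
mode_scores = statistics.multimode(scores)  # list of the most frequent value(s)

print("mean:", round(mean_score, 2))
print("median:", median_score)
print("mode(s):", mode_scores)
```

Here the single extreme score pulls the mean well above most of the data, while the median and the mode stay among the typical values.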
Properties of Mean
- A set of data has only one mean.
- Mean can be applied for interval and ratio data.
- All values in the data set are included in computing the mean.
- The mean is very useful in comparing two or more data sets.
- Mean is most appropriate in symmetrical data.
- Mean is affected by the extreme small or large values (outliers) in a data set.

Mean can be computed as:
$\text{Mean} = \frac{\text{Sum of all values}}{\text{Number of values}}$

Sample Mean
$\bar{x} = \frac{\sum x}{n}$, where n is the number of values in the sample.

Population Mean
$\mu = \frac{\sum x}{N}$, where N is the number of values in the population.

Median
→ The median is the midpoint of the data array.
→ When the data set is ordered, whether ascending or descending, it is called a data array.
→ Median is an appropriate measure of central tendency for data that are ordinal or above, but is more valuable in an ordinal type of data.

Properties of Median
- The median is unique; there is only one median for a set of data.
- The median is found by arranging the set of data from lowest to highest (or highest to lowest) and getting the value of the middle observation.
- Median is not affected by the extreme small or large values.
- Median can be applied for ordinal, interval, and ratio data.
- Median is most appropriate in skewed data.

To determine the value of the median in a data set with n values, we need to consider two rules.
A. If n is odd, the median is the middle-ranked value.
B. If n is even, the median is the average of the two middle-ranked values.

$\text{Median (Rank Value)} = \frac{n + 1}{2}$

Mode
The mode is the value in a data set that appears most frequently. Like the median and unlike the mean, the extreme values in a data set do not affect the mode.

→ A data set in which only one value occurs with the greatest frequency is said to be unimodal.

If the data has two values with the same greatest frequency, both values are considered the mode and the data set is bimodal.

If a data set has more than two modes, it is said to be multimodal.

In some cases all values in the data set occur with the same frequency; when this occurs, the data set is said to have no mode.

Properties of Mode
- The mode is found by locating the most frequently occurring value.
- The mode is the easiest average to compute.
- There can be more than one mode or even no mode in any given data set.
- Mode is not affected by the extreme small or large values.
- Mode can be applied for nominal, ordinal, interval, and ratio data.

MEASURES OF RELATIVE POSITION

The measure of relative position provides information about the position or location of particular values relative to the entire data set.

→ Quantiles are statistics that describe various subdivisions of a frequency distribution into equal proportions.
1. Quartiles – split the data array into 4 equal parts.
2. Deciles – split the data array into 10 equal parts.
3. Percentiles – split the data array into 100 equal parts.

Formula: (Position)
In each formula, n is the number of values in the data array and k is the index of the quantile being located.
QUARTILES: $Q_k = \frac{nk}{4} + 0.5$
DECILES: $D_k = \frac{nk}{10} + 0.5$
PERCENTILES: $P_k = \frac{nk}{100} + 0.5$
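If the resulting position is an INTEGER, then the observation at that position in the data array is taken as the value of the quantile.

If the resulting position is NOT AN INTEGER, then use interpolation.

INTERPOLATION (Formula [Value])
Here, Lower Value and Upper Value are the observations at the positions just below and just above the computed position, and Decimal is the fractional part of the position.

$Q = \text{Lower Value} + \text{Decimal} \times (\text{Upper Value} - \text{Lower Value})$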
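
The position and interpolation formulas can be combined into a short Python sketch. The function name `quantile_value`, its parameters, and the sample data are hypothetical and used only for illustration. Keeping the position inside the range 1 to n is an assumption added here for positions that would fall outside the data array; the formulas themselves do not address that case.

```python
import math


def quantile_value(data, k, parts):
    """Value of the k-th quantile when the data array is split into `parts` parts.

    Uses position = n*k/parts + 0.5 (counting from 1) and, when the position
    is not an integer, Q = lower + decimal * (upper - lower).
    """
    ordered = sorted(data)  # the data array
    n = len(ordered)
    position = n * k / parts + 0.5
    # Assumption: keep the position inside the data array (1..n).
    position = min(max(position, 1), n)
    lower_rank = math.floor(position)
    decimal = position - lower_rank
    lower_value = ordered[lower_rank - 1]  # ranks start at 1, list indexes at 0
    if decimal == 0:
        return lower_value  # integer position: take that observation
    upper_value = ordered[lower_rank]
    return lower_value + decimal * (upper_value - lower_value)


values = [2, 4, 6, 8, 10, 12, 14, 16]  # hypothetical data, n = 8
q1 = quantile_value(values, 1, 4)   # position 2.5, between 4 and 6
q3 = quantile_value(values, 3, 4)   # position 6.5, between 12 and 14
p50 = quantile_value(values, 50, 100)  # position 4.5, between 8 and 10
print(q1, q3, p50)
```

With parts set to 4, 10, or 100 the same function covers quartiles, deciles, and percentiles.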

Measure of Dispersion
→ Spread of data values from the average
→ Dispersion is the difference between the actual value and the average value.

Range
- Difference between the highest and lowest values.
- (A low value means less variability, that is, the values are close to the mean.)

Standard Deviation
- Describes the difference between the data values and the mean.
- Calculated as the square root of the variance.

Variance
- The square of the standard deviation.
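
A minimal Python sketch of the three measures of dispersion follows. The notes do not say whether the variance divides by N or by n − 1, so the population form (dividing by N) is assumed here; the function names and data are hypothetical.

```python
import math


def data_range(values):
    """Difference between the highest and lowest values."""
    return max(values) - min(values)


def population_variance(values):
    """Average squared difference between each value and the mean.

    Assumption: population form (divide by N). A sample variance
    would divide by n - 1 instead.
    """
    mean = sum(values) / len(values)
    return sum((x - mean) ** 2 for x in values) / len(values)


def population_std_dev(values):
    """Square root of the variance."""
    return math.sqrt(population_variance(values))


data = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical data with mean 5
print("range:", data_range(data))
print("variance:", population_variance(data))
print("standard deviation:", population_std_dev(data))
```

The standard deviation is in the same units as the data, which is why it is usually easier to interpret than the variance.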

Whatever you do, work at it with all your heart, as working for the Lord, not for human masters.

- Colossians 3:23
