
Statistics and Probability Notes

Statistics is the scientific study of collecting, organizing, analyzing, and interpreting data. There are two main categories of statistics - descriptive statistics which describe characteristics of a data set, and inferential statistics which make inferences about a population based on a sample. Key terms in statistics include variables, parameters, statistics, data types, measures of central tendency, and sampling techniques. Common graphs used to present statistical data include bar graphs, line graphs, pie charts, histograms, and frequency polygons.

Uploaded by

jeay

STATISTICS AND PROBABILITY - reviewer

SLP 1

Statistics - scientific body of knowledge that deals with the collection, organization or presentation, analysis, and interpretation of data.
Collection - gathering of information and data.
Organization or presentation - involves summarizing data or information in textual, graphical, or tabular forms.
Analysis - involves describing the data by using statistical methods and procedures.
Interpretation - process of drawing conclusions based on the analyzed data.

TWO CATEGORIES OF STATISTICS
1. DESCRIPTIVE STATISTICS - statistical procedure concerned with describing the characteristics and properties of a group of persons, places, or things.
2. INFERENTIAL STATISTICS - statistical procedure that is used to draw inferences about the properties or characteristics of a large group of people or things on the basis of information obtained from a small portion of that group.

TERMINOLOGIES IN STATISTICS
Population refers to a large collection of objects, persons, places, or things.
Sample is a small portion or part of the population. It could also be defined as a subgroup, subset, or representative of a population.
Parameter is any numerical or nominal characteristic of a population. It is a value or measurement obtained from a population.
Statistic is an estimate of a parameter. It is any value or measurement obtained from a sample.
Data (singular form is datum) are facts, or a set of information or observations under study.
1. QUALITATIVE DATA - can assume values that manifest the concept of attributes. These are sometimes called categorical data.
2. QUANTITATIVE DATA - data which are numerical in nature. These are data obtained from counting or measuring.
Variable is a characteristic or property of a population or sample which makes the members different from each other.
1. DISCRETE VARIABLE - one that can assume a finite or countable number of values. In other words, it can assume specific values only. (counting)
2. CONTINUOUS VARIABLE - one that can assume infinitely many values within a specified interval. (measuring)
3. DEPENDENT VARIABLE - variable which is affected or influenced by another variable.
4. INDEPENDENT VARIABLE - one which affects or influences the dependent variable.
Constant is a property or characteristic of a population or sample which makes the members of the group similar to each other.

SCALES OF MEASUREMENT
1. NOMINAL SCALE - This is the most primitive level of measurement. The nominal level is used when we want to distinguish one object from another for identification purposes. (Gender, nationality, and civil status)
2. ORDINAL SCALE - data are arranged in some specified order or rank. When objects are measured at this level, we can say that one is better or greater than the other. (ranking of contestants in a beauty contest, of siblings in a family, or of honor students in a class)
3. INTERVAL SCALE - we can say not only that one object is greater or less than another, but we can also specify the amount of difference. (Exam scores)
4. RATIO SCALE - Measurement is like the interval level. The only difference is that the ratio level always starts from an absolute or true zero point. (Weight)

ORGANIZATION AND PRESENTATION OF DATA
FREQUENCY DISTRIBUTION TABLE - a device for organizing and presenting grouped data.
RANGE - highest observed value minus lowest observed value.
CLASS INTERVAL - the range of values used in defining a class. The number of class intervals determines the number of rows in the table.
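As a rough illustration of the frequency-table terms in this part of the notes (range, class limits, class width, class mark, class boundaries, cumulative and relative frequency), the sketch below builds a small grouped table in Python. The scores, the choice of five classes, and every variable name are made up for the example; they are not from the notes.

```python
import math

# Hypothetical exam scores (illustrative data only, not from the notes).
scores = [52, 55, 58, 61, 63, 64, 67, 70, 71, 73, 75, 78, 80, 84, 89]

num_classes = 5                                  # assumed number of rows
data_range = max(scores) - min(scores)           # highest - lowest value
# Width rounded up so the classes cover every integer score.
class_width = math.ceil((data_range + 1) / num_classes)

table = []
cumulative = 0
lower = min(scores)
for _ in range(num_classes):
    upper = lower + class_width - 1              # upper class limit (integer data)
    freq = sum(lower <= s <= upper for s in scores)
    cumulative += freq
    table.append({
        "limits": (lower, upper),
        "boundaries": (lower - 0.5, upper + 0.5),  # midpoints between adjacent limits
        "mark": (lower + upper) / 2,               # class mid-point
        "freq": freq,
        "cum_freq": cumulative,
        "rel_freq": freq / len(scores),
    })
    lower += class_width                         # next lower class limit

for row in table:
    print(row)
```

Each lower limit differs from the next by the class width, and the last cumulative frequency equals the number of observations, which is a quick check that no score was left out.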
CLASS WIDTH - difference between the lower limit of a class and the lower limit of the next class.
CLASS MARK (class mid-point) - the numerical value that is exactly in the middle of each class.
CLASS BOUNDARIES - values that separate adjacent classes and do not belong to any class. The boundary between two adjacent classes is the midpoint between the upper limit of the lower class and the lower limit of the higher class.
CUMULATIVE FREQUENCY - the frequency of the class plus the frequencies of all previous classes.
RELATIVE FREQUENCY (RF) - the frequency of a class divided by the total number of observations. Expressed as a percent, it is also known as the percentage frequency.

GRAPHS
After the data have been collected and tabulated, the next step is to sketch a graph to make the data more presentable, easier to understand, and more appealing to the reader.
● BAR GRAPH - usually used to compare data or to determine which class or interval is most common in the data. Rectangular figures or bars are used to show variations in the frequencies of observations.
● LINE GRAPH - used to show trends. For increases or decreases in sales, scores, body temperatures of patients, enrolment of students in certain courses, or population per year, a line graph is more appropriate than a bar graph.
● PIE CHART - useful when presenting the sizes of the components that make up a whole.
● FREQUENCY HISTOGRAM - the frequencies are plotted on the vertical axis and the class intervals on the horizontal axis, with bars drawn side by side.
● FREQUENCY POLYGON - Unlike the frequency histogram, where bars drawn side by side are used, the frequency polygon uses points connected by line segments. It looks like a usual line graph except that the labels on the horizontal axis are class intervals.

METHODS OF COLLECTING DATA
1. DIRECT OR INTERVIEW METHOD - In this method, the researcher has direct contact with the interviewee. The researcher obtains the information needed by asking the interviewee questions.
2. INDIRECT OR QUESTIONNAIRE METHOD - makes use of a written questionnaire. The researcher distributes the questionnaire to the respondents either by personal delivery or by mail.
3. REGISTRATION METHOD - data collection is governed by law. (birth and death rates registered with the NSO)
4. EXPERIMENTAL METHOD - usually used to find out cause-and-effect relationships. Scientific researchers often use this method.

SAMPLING TECHNIQUES
1. PROBABILITY SAMPLING - In its basic form this is SIMPLE RANDOM SAMPLING. In this technique, the sample members are picked at random, so the selection is without any bias.
2. RESTRICTED RANDOM SAMPLING - This is often used when the population to be considered is too large.
a. Systematic Sampling - The sample is selected by picking every kth element of the population.
b. Stratified Sampling - The population is divided into strata (groups) based on their homogeneity or commonalities, and a sample is drawn from each stratum.
3. CLUSTER SAMPLING - This technique is frequently applied on a geographical basis when the population from which a sample is to be selected includes heterogeneous groups.
4. MULTI-STAGE SAMPLING - a combination of several sampling techniques.

MEASURES OF CENTRAL TENDENCY
A single value that is used to identify the "center" of the data. It is thought of as a typical value of the distribution: precise yet simple, and the most representative value of the data. It is the center of concentration of scores in any set of data.
● MEAN - To find the arithmetic mean, add all the items or observations, then divide the sum by the total number of observations. It is the overall average.
● MEDIAN - The median is the midpoint of an ordered array of numbers or observations. It is denoted by the symbol Md.
● MODE - The mode is the observation that appears most often in a distribution.

MEASURES OF LOCATION/POSITION
Fractiles/quantiles are measures of location or position which include not only central location but also any
position based on the number of equal divisions in a given distribution.
● QUARTILES - values that divide a distribution into four equal parts.
● DECILES - values that divide a distribution into 10 equal parts.
● PERCENTILES - values that divide a distribution into 100 equal parts.

MEASURES OF VARIATION
These enable you to know how varied the observations are, whether there are extreme values in the distribution, or whether the values are very close to each other.

● RANGE - the simplest way of measuring the variation of a distribution.
● MEAN ABSOLUTE DEVIATION - subtract the mean score from each raw score, then, using the absolute values of the differences, get the sum of the results. This sum is called the sum of the absolute deviations from the mean. Dividing it by the number of observations gives the mean absolute deviation.
● VARIANCE - another measure of variation which can be used instead of the range. It is the average of the squared deviations from the mean.
● STANDARD DEVIATION - the standard deviation, σ (sigma) for a population or s for a sample, is the square root of the variance.
● COEFFICIENT OF VARIATION - When it is necessary to compare the variability of two or more groups, the task is easy if the means are the same. When the means differ, the coefficient of variation (the standard deviation expressed as a percentage of the mean) is used.
● QUARTILE DEVIATION - another way of determining the spread of a distribution in terms of quartiles. It is half the difference between the 3rd and 1st quartiles.
● PERCENTILE RANGE - the difference between the 90th percentile and the 10th percentile.

MEASURES OF SHAPE
● COEFFICIENT OF QUARTILE DEVIATION - Another measure of relative dispersion which can be used if the 1st and 3rd quartiles are known.
● SKEWNESS - the measure of the shape of the curve. It refers to the symmetry or asymmetry of the frequency distribution, and indicates whether the distribution is symmetric (as in a normal curve) or skewed.
1. Negatively Skewed - skewed to the left, Sk < 0
2. Normal (symmetric) - not skewed, Sk = 0
3. Positively Skewed - skewed to the right, Sk > 0
● KURTOSIS - the degree to which a distribution curve is peaked or flat with respect to the normal distribution curve.
1. MESOKURTIC - the distribution whose kurtosis is that of a normally distributed curve. When the value of Ku is equal to 0.263, the curve is a normal curve or mesokurtic. (0.263 is the value of the percentile coefficient of kurtosis, the quartile deviation divided by the percentile range, for a normal curve.)
2. LEPTOKURTIC - the distribution whose curve is more peaked than that of a normally distributed curve. When the value of Ku is less than 0.263, the curve is leptokurtic or thin.
3. PLATYKURTIC - the distribution whose curve is flatter than that of a normally distributed curve. When the value of Ku is greater than 0.263, the curve is platykurtic or flat.

SLP 2
● Normal Distribution is one of the most important continuous probability distributions used in statistics. Many occurrences in both the natural and social sciences can be represented by the normal distribution.
● One of the assumptions in statistical estimation and hypothesis-testing problems concerning means and variances is the condition of normality.
● GAUSSIAN DISTRIBUTION - another name for the normal distribution. It is a function that gives the probability that an observation will fall between any two real limits, with the curve approaching zero on either side.
● The graph of the normal distribution is often called the normal curve or the bell-shaped curve.
● The normal distribution is also known as De Moivre's curve or the Gaussian distribution in honour of Abraham de Moivre and Carl Friedrich Gauss (1777-1855), who gave independent derivations of the mathematical equation for the normal curve.
● The standard normal distribution or unit normal distribution is used in problems involving the normal distribution.
● Areas under the normal curve can be computed using z-scores and a z-table (here, one that lists the area to the left of each z-score).

CASES
1. BETWEEN TWO Z-SCORES - get the area for each of the given z-scores, then find the difference.
2. BETWEEN THE MEAN AND ONE Z-SCORE - the area to the left of the mean is equal to 0.50; find the area for the given z-score, then find the difference of the two.
3. GREATER THAN A Z-SCORE (> z) - get the area for the given z-score, then subtract it from 1.
4. LESS THAN A Z-SCORE (< z) - the area for the given z-score is already the answer, as is.

SAMPLING & SAMPLING DISTRIBUTIONS
RANDOM SAMPLE is a sample of size n chosen from a finite population of size N, such that each subset of size n has an equal probability of being selected. This kind of sample is also called a simple random sample.
SAMPLING DISTRIBUTION - the probability distribution of a statistic.
● PARAMETER - any quantity that describes a characteristic of a population.
● STATISTIC - any quantity that describes a characteristic of a sample. It is a random variable computed from the values in a sample.
SAMPLING VARIABILITY - the variation in the value of a statistic from sample to sample, depending on the particular sample selected from a given population.
SAMPLING DISTRIBUTION OF THE SAMPLE MEAN - When we want to make inferences about the population, we naturally consider studying samples from the population.

CHEBYSHEV'S THEOREM
- It was proven by Pafnuty Chebyshev, a Russian mathematician.
The theorem states that the proportion of any distribution lying within k standard deviations of the mean is always at least 1 - 1/k^2, where k > 1. It is valid for any distribution of data and can be used for samples or populations.

ESTIMATION OF PARAMETERS
The process of approximating a parameter based on a statistic.
A parameter refers to the population measure, while a statistic refers to the sample measure.

TYPES OF ESTIMATE
1. POINT ESTIMATE - a single numerical value that approximates a parameter.
2. INTERVAL ESTIMATE - a range of numerical values that approximates the parameter.

ESTIMATOR - refers to a measurement taken from a sample gathered to represent the population. A good estimator should be:
● Unbiased
● Consistent
● Relatively efficient

CONFIDENCE INTERVAL - interval estimate constructed from a point estimate and a margin of error.
MARGIN OF ERROR - refers to the maximum estimate of how far the parameter can deviate from the point estimate at a given confidence level, that is, the largest expected distance between the parameter and the sample measure.
CONFIDENCE LEVEL - the "degree of confidence" or "degree of certainty of the confidence interval"; it tells us how sure we would like to be that the confidence interval contains the parameter it estimates.
POPULATION PROPORTION can be estimated by a confidence interval using the z-statistic. The population proportion is also referred to as the TRUE PROPORTION.

QUIZ
1. The total area under the normal curve is equal to one = CORRECT
2. A point estimate is constructed from an interval estimate and a margin of error = WRONG
3. To find the arithmetic mean, add all the items or observations, then divide the sum by the total number of observations = CORRECT
4. As the margin of error increases, the standard deviation decreases = WRONG
5. As the sample size increases, the margin of error decreases = CORRECT
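
To tie together the point estimate, margin of error, confidence level, and confidence interval defined in these notes, here is a minimal sketch of a z-based interval for a population proportion. The formula used, E = z * sqrt(p(1 - p) / n), is the standard large-sample one and is an assumption here, since the notes do not state a formula; the function name and the sample figures are hypothetical.

```python
from math import sqrt
from statistics import NormalDist

def proportion_interval(successes, n, confidence=0.95):
    """Return (point estimate, margin of error, lower, upper) for a proportion."""
    p_hat = successes / n                                # point estimate
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)   # critical z value
    margin = z * sqrt(p_hat * (1 - p_hat) / n)           # margin of error
    return p_hat, margin, p_hat - margin, p_hat + margin

# Same sample proportion (0.60) with increasing sample sizes (made-up figures).
for successes, n in ((60, 100), (240, 400), (960, 1600)):
    p_hat, margin, low, high = proportion_interval(successes, n)
    print(f"n={n}: {p_hat:.2f} +/- {margin:.3f} -> ({low:.3f}, {high:.3f})")
```

With the sample proportion held fixed, quadrupling the sample size halves the margin of error, which is the behaviour the notes describe: the margin of error shrinks as the sample size grows.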
