Introduction & Basic Concepts in Statistics
Statistics is used in business and economics. It plays
an important role in the exploration of new markets
for a product, forecasting of business trends, control
and maintenance of high-quality products, improvement
of employer-employee relationships and analysis of data
concerning insurance, investment, sales, employment,
transportation, communications, auditing and
accounting procedures.
STATISTICS is the branch of mathematics that deals
with the theory and method of collecting, organizing,
presenting, analyzing and interpreting data.
The average, or measure of the center of a data set, consisting of the mean, median, mode,
or midrange
The spread of a data set, which can be measured with the range or standard deviation
A confidence interval gives a range of values for an unknown parameter of the population by
measuring a statistical sample. This is expressed in terms of an interval and the degree of
confidence that the parameter is within the interval.
Tests of significance, or hypothesis testing, in which scientists test a claim about the population
by analyzing a statistical sample. By design, there is some uncertainty in this process. This can
be expressed in terms of a level of significance.
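The four ideas above can be sketched in a few lines of Python. This is only an illustration: the scores are made-up data, the claimed mean of 75 is hypothetical, and the interval and test use a normal (z) approximation for simplicity, although a t-distribution would usually be preferred for a sample this small.

```python
import math
import statistics
from statistics import NormalDist

# Hypothetical sample: exam scores of 12 students (made-up data).
sample = [72, 85, 90, 66, 85, 78, 94, 85, 70, 81, 77, 89]
n = len(sample)

# Measures of center.
mean = statistics.mean(sample)
median = statistics.median(sample)
mode = statistics.mode(sample)
midrange = (min(sample) + max(sample)) / 2

# Measures of spread.
data_range = max(sample) - min(sample)
std_dev = statistics.stdev(sample)  # sample standard deviation (divides by n - 1)

# 95% confidence interval for the population mean (normal approximation).
confidence = 0.95
z_critical = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
margin = z_critical * std_dev / math.sqrt(n)
ci = (mean - margin, mean + margin)

# Two-sided test of the hypothetical claim "the population mean is 75"
# at the 5% level of significance (again a normal approximation).
claimed_mean = 75
alpha = 0.05
z_stat = (mean - claimed_mean) / (std_dev / math.sqrt(n))
p_value = 2 * (1 - NormalDist().cdf(abs(z_stat)))
reject_claim = p_value < alpha

print("mean:", mean, "median:", median, "mode:", mode, "midrange:", midrange)
print("range:", data_range, "standard deviation:", round(std_dev, 2))
print("95% CI:", tuple(round(x, 2) for x in ci))
print("p-value:", round(p_value, 4), "reject claim:", reject_claim)
```

The interval and the test tell the same story here: the claimed mean lies outside the 95% interval, and the claim is rejected at the 5% level.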
Two Branches of Statistics
1. Statistical Theory – is concerned with the formulation of
theories, principles, and formulas which are used as bases
in the solution of problems related to Statistics.
2. Statistical Methods – is concerned with the application of
the theories, principles and formulas in the solution of
everyday problems.
OTHER STATISTICAL TERMS:
• POPULATION – a set of data consisting of all possible observations of a certain
phenomenon. It refers to the totality of the observations. The population size is denoted by
capital N.
The Likert Scale: strongly disagree, disagree, neutral, agree, strongly agree.
• Interval scale contains all the properties of the ordinal scale and, in addition,
allows the difference between values to be calculated. The main characteristic of
this scale is the equal distance between adjacent points on the scale.
Interval Scale Examples
There are situations where attitude scales are considered to be interval scales.
Apart from the temperature scale, time is also a very common example of an interval
scale as the values are already established, constant, and measurable.
Calendar years and time also fall under this category of measurement scales.
The Likert scale, Net Promoter Score, Semantic Differential Scale, Bipolar Matrix
Table, etc. are often treated as interval scales in practice, although strictly
speaking their response options are ordinal.
Celsius Temperature.
Fahrenheit Temperature.
IQ (intelligence scale).
SAT scores.
Time on a clock with hands.
• Ratio Scale: 4th Level of Measurement
• The ratio scale is a variable measurement scale that not only orders the values of
a variable but also makes the difference between values known, along with
information on the value of true zero. It assumes that the variable has a true zero
point, that the difference between adjacent values is the same, and that there is a
specific order among the values.
• With a true zero, a wide range of inferential and descriptive analysis techniques
can be applied to the variables. In addition to the fact that the ratio scale does
everything that a nominal, ordinal, and interval scale can do, it can also establish the
value of absolute zero. The best examples of ratio scales are weight and height. In
market research, a ratio scale is used to calculate market share, annual sales, the
price of an upcoming product, the number of consumers, etc.
• Examples of Ratio scale
Age
Weight
Height
Sales Figures
Ruler measurements.
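The practical difference between an interval scale and a ratio scale can be checked with a little arithmetic. The sketch below uses made-up temperatures and weights and an approximate kilogram-to-pound factor. On a ratio scale, "twice as much" survives a change of units. On an interval scale it does not, because the zero point is arbitrary.

```python
def celsius_to_fahrenheit(celsius):
    """Convert a Celsius temperature to Fahrenheit."""
    return celsius * 9 / 5 + 32


def kg_to_lb(kilograms):
    """Convert kilograms to pounds (approximate conversion factor)."""
    return kilograms * 2.20462


# Interval scale: 20 degrees C looks like "twice" 10 degrees C,
# but the ratio changes as soon as the unit changes.
cold_c, warm_c = 10.0, 20.0
ratio_celsius = warm_c / cold_c
ratio_fahrenheit = celsius_to_fahrenheit(warm_c) / celsius_to_fahrenheit(cold_c)

# Ratio scale: 60 kg is twice 30 kg in any unit, because 0 means "no weight".
light_kg, heavy_kg = 30.0, 60.0
ratio_kg = heavy_kg / light_kg
ratio_lb = kg_to_lb(heavy_kg) / kg_to_lb(light_kg)

# The difference on the interval scale is still meaningful: 10 Celsius degrees.
difference_celsius = warm_c - cold_c

print("Celsius ratio:", ratio_celsius, "Fahrenheit ratio:", round(ratio_fahrenheit, 2))
print("Kilogram ratio:", ratio_kg, "Pound ratio:", round(ratio_lb, 2))
```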
1. Collection of data
2. Presentation of data
3. Analysis of data
4. Interpretation of data
Data Collection and Data Presentation
What are DATA?
4. OBSERVATION
This method is a way of collecting data through observing. The observer gains
firsthand knowledge by being in and around the social setting that is being
investigated.
5. EXPERIMENTATION
An experiment is a procedure carried out to support, refute, or validate a
hypothesis. An experiment is a method that most clearly shows cause-and-effect
because it isolates and manipulates a single variable, in order to clearly show its
effect.
DATA PRESENTATION
Once data has been collected, it has to be classified and organized in such a way that it becomes
easily readable and interpretable, that is, converted to information.
1. BAR GRAPH
A bar chart or bar graph is a chart or graph that presents categorical data with
rectangular bars with heights or lengths proportional to the values that they
represent. The bars can be plotted vertically or horizontally.
2. LINE GRAPH
A line graph is a graphical display of information that changes continuously over time.
A line graph may also be referred to as a line chart. Within a line graph, there are
points connecting the data to show a continuous change. The lines in a line graph can
descend and ascend based on the data. We can use a line graph to compare different
events, situations, and information.
3. PIE GRAPH
A pie chart is a circular chart divided into wedge-like sectors, illustrating
proportion. Each wedge represents a proportionate part of the whole, and the total
value of the pie is always 100 percent.
Pie charts can make the size of portions easy to understand at a glance. They're
widely used in business presentations and education to show the proportions among a
large variety of categories including expenses, segments of a population, or answers to
a survey.
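The size of each wedge follows directly from its share of the total. A minimal sketch, using made-up monthly expense figures, converts each category into a percentage of the whole and into the central angle of its wedge (out of 360 degrees):

```python
# Hypothetical monthly expenses (made-up figures).
expenses = {"Rent": 500, "Food": 300, "Transport": 120, "Other": 80}

total = sum(expenses.values())

# Share of the whole, as a percentage and as a wedge angle in degrees.
percentages = {name: value / total * 100 for name, value in expenses.items()}
angles = {name: value / total * 360 for name, value in expenses.items()}

for name in expenses:
    print(f"{name:<10}{percentages[name]:6.1f}%  {angles[name]:6.1f} degrees")
```

Whatever the raw figures are, the percentages add up to 100 and the angles add up to 360.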
4. SCATTER DIAGRAM
A scatter diagram, also called a scatterplot, is a type of plot or
mathematical diagram using Cartesian coordinates to display values for typically two
variables for a set of data. If the points are coded (color/shape/size), one additional
variable can be displayed. The data are displayed as a collection of points, each having
the value of one variable determining the position on the horizontal axis and the value
of the other variable determining the position on the vertical axis.
5. PICTOGRAPH/PICTOGRAM
A pictograph is a chart or graph, which uses pictures to represent data. A pictograph
is one of the simplest forms of data visualization.
Two types of Sampling
• Probability sampling
• Simple random
• Systematic
• Stratified
• Cluster
• Non-probability sampling
• Convenience/Accidental
• Judgmental/Purposive
• Quota
• Snowball
Probability vs non-probability sampling
1. Probability or Random Sampling
Gives every single element of the population a known, non-zero chance of being
included in the sample (an equal chance in the case of simple random sampling).
2. Non-Probability Sampling
The samples are selected in a process that does not give all the
individuals in the population equal chances of being selected.
Samples are selected on the basis of their accessibility or by the
purposive personal judgment of the researcher.
Probability-based Sampling
Systematic Sampling
Example
Population is 1,000. Desired sample size is 100. Sampling interval is 1,000 / 100 = 10.
Get a random start from 1 to 10 in the list as the first sample, then take every 10th
element in the list after it.
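The example above can be sketched in Python. The function name systematic_sample and the numbered list standing in for the population are assumptions made for illustration, and the sketch assumes the population size is a multiple of the sample size, as it is here.

```python
import random


def systematic_sample(population, sample_size, rng=random):
    """Pick a random start within the first interval, then every k-th element."""
    interval = len(population) // sample_size      # sampling interval k
    start = rng.randint(1, interval)               # random start from 1 to k, inclusive
    # Positions are 1-based: start, start + k, start + 2k, ...
    positions = range(start, len(population) + 1, interval)
    return [population[pos - 1] for pos in positions][:sample_size]


# Hypothetical population: 1,000 list entries numbered 1 to 1000.
population = list(range(1, 1001))
sample = systematic_sample(population, 100, random.Random(42))

print("sample size:", len(sample))
print("first five selected:", sample[:5])
```

Only the starting point is random. Once it is chosen, the rest of the sample is fixed by the interval.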
Stratified Sampling
Used to ensure that different groups in the population are adequately represented in the sample.
Step 1. Identify the population and divide it into different groups or strata according to
chosen criteria.
Step 2. Decide on the sample size or the percentage of the population to be taken as the sample.
Step 3. Take a proportional share of the sample from each group.
Step 4. Select the respondents within each group by random sampling.
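The four steps can be sketched as follows. The grade levels, their sizes, and the 10% sampling fraction are hypothetical values chosen for illustration, and stratified_sample is not a library function.

```python
import random


def stratified_sample(strata, fraction, rng=random):
    """Draw the same fraction from every stratum by simple random sampling."""
    sample = {}
    for name, members in strata.items():
        share = round(len(members) * fraction)     # Step 3: proportional share
        sample[name] = rng.sample(members, share)  # Step 4: random selection
    return sample


# Step 1: hypothetical population of 500 students divided into strata by grade.
strata = {
    "Grade 7": [f"G7-{i}" for i in range(1, 201)],  # 200 students
    "Grade 8": [f"G8-{i}" for i in range(1, 151)],  # 150 students
    "Grade 9": [f"G9-{i}" for i in range(1, 151)],  # 150 students
}

# Step 2: take 10% of the population as the sample.
sample = stratified_sample(strata, 0.10, random.Random(0))

for name, chosen in sample.items():
    print(name, len(chosen))
```

Because every stratum is sampled at the same rate, each group appears in the sample in the same proportion as in the population.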
Cluster Sampling
Often called geographic sampling
Used in large scale surveys
The population is divided into multiple groups called clusters. The
clusters are selected with a simple random or systematic sampling
technique for data collection and data analysis.
Example: The population includes elementary schools in the province.
The province is first divided into districts, which are treated as clusters
and are randomly selected. From the selected districts, schools are picked
at random, then classes, and then students are selected at random.
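The school example above selects at random at every level, which can be sketched as nested random draws. The province below is entirely made up (8 districts, 5 schools per district, 4 classes per school, 30 students per class), and the numbers drawn at each stage are arbitrary choices for illustration.

```python
import random

# Hypothetical province: districts -> schools -> classes -> students.
province = {
    f"District {d}": {
        f"School {d}.{s}": {
            f"Class {c}": [f"Student {d}.{s}.{c}.{n}" for n in range(1, 31)]
            for c in range(1, 5)
        }
        for s in range(1, 6)
    }
    for d in range(1, 9)
}


def multistage_sample(province, n_districts, n_schools, n_classes, n_students, rng):
    """Randomly pick districts, then schools, then classes, then students."""
    selected = []
    for district in rng.sample(sorted(province), n_districts):
        schools = province[district]
        for school in rng.sample(sorted(schools), n_schools):
            classes = schools[school]
            for class_name in rng.sample(sorted(classes), n_classes):
                selected.extend(rng.sample(classes[class_name], n_students))
    return selected


# 3 districts x 2 schools x 2 classes x 5 students = 60 respondents.
sample = multistage_sample(province, 3, 2, 2, 5, random.Random(7))
print("respondents selected:", len(sample))
```

Only the selected districts need to be visited, which is what makes this approach practical for large-scale surveys.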
Non-Probability Sampling
1. Accidental or Convenience Sampling
Researcher selects subjects that are more readily accessible or
available.
2. Purposive Sampling
Subjects are selected based on the needs of the study.
3. Quota Sampling
Researcher takes a sample that is in proportion to some characteristic or trait of the
population.
The population is divided into groups or strata (the basis may be age, gender,
education level, race, religion, etc.).
Samples are taken from each group to meet a quota.
Care is taken to maintain the correct proportions representative of the population.
Example:
The population consists of 60% female and 40% male.
The desired sample size is 200.
Therefore, the sample should consist of ____ females and ____ males.
Example:
A study on science teaching is to be conducted in high schools of a region.
There are 4,641 teachers grouped according to area of specialization.
There are 2,243 biology teachers, 1,406 chemistry teachers and 992 physics
teachers.
The desired sample size is 300.
Select the sample according to the Quota Sampling technique.
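One way to work out the quotas for the teachers is to give each group its proportional share of the 300 and then settle the rounding. The sketch below uses largest-remainder rounding so that the quotas add up to exactly 300. That rounding rule is an assumption, not something the text prescribes, and quota_allocation is a hypothetical helper name.

```python
def quota_allocation(group_sizes, sample_size):
    """Split sample_size across groups in proportion to their sizes."""
    total = sum(group_sizes.values())
    # Exact (fractional) share of the sample for each group.
    exact = {g: size * sample_size / total for g, size in group_sizes.items()}
    # Start from the whole-number part of each share.
    quotas = {g: int(share) for g, share in exact.items()}
    # Hand out the remaining places to the largest fractional remainders.
    leftover = sample_size - sum(quotas.values())
    by_remainder = sorted(exact, key=lambda g: exact[g] - quotas[g], reverse=True)
    for g in by_remainder[:leftover]:
        quotas[g] += 1
    return quotas


teachers = {"Biology": 2243, "Chemistry": 1406, "Physics": 992}
quotas = quota_allocation(teachers, 300)
print(quotas)  # {'Biology': 145, 'Chemistry': 91, 'Physics': 64}
```

The quotas only fix how many teachers to take from each specialization. In quota sampling the researcher then fills each quota with whichever teachers are available, not by random selection.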
4. Snowball Sampling
This type of sampling starts with known sources of information, who or
which will in turn give other sources of information. As this goes on,
data accumulates.
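The way a snowball sample grows can be sketched as following referrals from one source to the next. The names and the referral list below are made up, and snowball_sample is a hypothetical helper that stops once a chosen sample size is reached.

```python
from collections import deque


def snowball_sample(referrals, seeds, max_size):
    """Start from known sources and follow their referrals until max_size is reached."""
    sampled = []
    seen = set(seeds)
    queue = deque(seeds)
    while queue and len(sampled) < max_size:
        person = queue.popleft()
        sampled.append(person)
        # Each respondent points to further sources not yet contacted.
        for contact in referrals.get(person, []):
            if contact not in seen:
                seen.add(contact)
                queue.append(contact)
    return sampled


# Hypothetical referrals: who each respondent recommends.
referrals = {
    "Ana": ["Ben", "Carla"],
    "Ben": ["Dino"],
    "Carla": ["Dino", "Ella"],
    "Dino": [],
    "Ella": ["Ana", "Fe"],
}

sample = snowball_sample(referrals, ["Ana"], 5)
print(sample)  # ['Ana', 'Ben', 'Carla', 'Dino', 'Ella']
```

Everyone in the sample is reachable from the starting source, so people outside that network can never be selected, which is why this is a non-probability method.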