unit 1 DS vs IS

Uploaded by

jyotibh966
Difference between Descriptive and Inferential Statistics
Descriptive statistics provide a summary of the features or attributes of a dataset, while inferential
statistics enable hypothesis testing and evaluation of the applicability of the data to a larger population.
Here are the key differences between descriptive vs inferential statistics:

| | Descriptive Statistics | Inferential Statistics |
|---|---|---|
| Purpose | Describe and summarize data | Make inferences and draw conclusions about a population based on sample data |
| Data Analysis | Analyzes and interprets the characteristics of a dataset | Uses sample data to make generalizations or predictions about a larger population |
| Population vs Sample | Focuses on the entire population or dataset | Focuses on a subset of the population (sample) to draw conclusions about the entire population |
| Measurements | Provides measures of central tendency and dispersion | Estimates parameters, tests hypotheses, and determines the level of confidence or significance in the results |
| Examples | Mean, median, mode, standard deviation, range, frequency tables | Hypothesis testing, confidence intervals, regression analysis, ANOVA (analysis of variance), chi-square tests, t-tests, etc. |
| Goal | Summarize, organize, and present data | Generalize findings to a larger population, make predictions, test hypotheses, evaluate relationships, and support decision-making |
| Population Parameters | Not typically estimated | Estimated using sample statistics (e.g., sample mean as an estimate of population mean) |
| Sample Representativeness | Not required | Crucial; the sample should be representative of the population to ensure accurate inferences |

Inferential statistical techniques are applied to a sample to analyze its behavior. These
include regression models and hypothesis tests. Conclusions about the population are then
drawn from the sample chosen in the first step.

Types of Inferential Statistics

Inferential Statistics helps to draw conclusions and make predictions based on a data set. It is done
using several techniques, methods, and types of calculations. Some of the most important types of
inferential statistics calculations are:

1. Regression Analysis

Regression models show the relationship between a set of independent variables and a dependent
variable. This statistical method lets you predict the value of the dependent variable based on
different values of the independent variables. Hypothesis tests are incorporated to determine whether
the relationships observed in the sample data actually exist in the population.
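
To make the link between regression and hypothesis testing concrete, here is a minimal sketch using only the Python standard library. It fits a straight line by least squares and computes the t statistic for the null hypothesis that the slope is zero. The data (hours studied and exam scores) and the function name `simple_linear_regression` are hypothetical, and the critical value is taken from a standard t table rather than computed.

```python
import math

def simple_linear_regression(x, y):
    """Least-squares fit y = intercept + slope * x, plus the slope's t statistic."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # Residual variance uses n - 2 degrees of freedom (two fitted coefficients).
    residuals = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]
    residual_var = sum(r ** 2 for r in residuals) / (n - 2)
    slope_se = math.sqrt(residual_var / sxx)
    t_stat = slope / slope_se  # tests H0: slope = 0
    return slope, intercept, slope_se, t_stat

# Hypothetical sample: hours studied (x) and exam score (y).
hours = [1, 2, 3, 4, 5, 6, 7, 8]
scores = [52, 55, 61, 64, 70, 71, 78, 83]

slope, intercept, slope_se, t_stat = simple_linear_regression(hours, scores)

# Two-tailed 5% critical value of t with n - 2 = 6 degrees of freedom (from a t table).
T_CRITICAL = 2.447
slope_is_significant = abs(t_stat) > T_CRITICAL

print(f"slope = {slope:.3f}, intercept = {intercept:.3f}, t = {t_stat:.2f}")
print("Relationship significant at the 5% level:", slope_is_significant)
```

If the absolute t statistic exceeds the critical value, the sample gives evidence that the relationship also holds in the population, under the usual assumptions of linear regression.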

2. Hypothesis Tests

Hypothesis testing is used to compare entire populations or assess relationships between variables
using samples. Hypotheses or predictions are tested using statistical tests so as to draw valid
inferences.

3. Confidence Intervals

The main goal of inferential statistics is to estimate population parameters, which are mostly
unknown or unknowable values. A confidence interval uses the variability in a statistic to produce an
interval estimate for a parameter. Confidence intervals take uncertainty and sampling error into
account to create a range of values within which the actual population value is estimated to fall.

Each confidence interval is associated with a confidence level, expressed as a percentage, that indicates
how often intervals built this way would contain the true parameter if you repeated the study many times.
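
The idea of "repeating the study" can be illustrated with a small simulation. The sketch below assumes a normally distributed population with a known true mean (all numbers are made up), builds a 95% interval from each of many random samples using the normal approximation, and counts how often the interval contains the true mean. The function name `mean_confidence_interval` is an illustrative choice, not a library function.

```python
import math
import random
from statistics import NormalDist, mean, stdev

def mean_confidence_interval(sample, level=0.95):
    """Normal-approximation interval for the population mean (assumes a large sample)."""
    z = NormalDist().inv_cdf(0.5 + level / 2)  # about 1.96 for a 95% level
    margin = z * stdev(sample) / math.sqrt(len(sample))
    centre = mean(sample)
    return centre - margin, centre + margin

# Simulate repeating the study: how often does the interval contain the true mean?
random.seed(0)
TRUE_MEAN, TRUE_SD, SAMPLE_SIZE, REPEATS = 100.0, 15.0, 50, 1000  # hypothetical values

hits = 0
for _ in range(REPEATS):
    sample = [random.gauss(TRUE_MEAN, TRUE_SD) for _ in range(SAMPLE_SIZE)]
    low, high = mean_confidence_interval(sample)
    if low <= TRUE_MEAN <= high:
        hits += 1

coverage = hits / REPEATS
print(f"Share of intervals containing the true mean: {coverage:.3f}")
```

The printed share should be close to the chosen confidence level. For small samples, the t-distribution would normally replace the normal approximation used here.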

Example of Descriptive Statistics

Descriptive statistics are used to enumerate and explain a dataset's key characteristics.
Measures like mean, median, mode, range, variance, and standard deviation are some examples. For
instance, if you wanted to summarize the ages of a group of individuals, you could use descriptive
statistics to determine the average age, the age distribution, and the standard deviation of the ages.
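
As a quick illustration of the age example, the following sketch computes these measures with Python's built-in `statistics` module. The list of ages is invented for the example.

```python
import statistics

# Hypothetical ages of ten individuals.
ages = [23, 25, 25, 29, 31, 34, 35, 35, 35, 41]

summary = {
    "mean": statistics.mean(ages),
    "median": statistics.median(ages),
    "mode": statistics.mode(ages),
    "range": max(ages) - min(ages),
    "variance": statistics.variance(ages),  # sample variance (divides by n - 1)
    "std_dev": statistics.stdev(ages),      # square root of the sample variance
}

for name, value in summary.items():
    print(f"{name}: {value:.2f}")
```

Every number here describes only the ten ages that were measured; nothing is claimed about a wider population.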

Example of Inferential Statistics

Inferential statistics uses a sample of data to draw conclusions or generalizations about a
broader population. Examples include regression analysis, confidence intervals, and hypothesis testing.
For instance, if you want to know whether a new drug is effective, you could use inferential statistics
to assess whether there is a significant difference between the outcomes of patients who receive the
drug and those who receive a placebo.
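
One possible way to carry out the drug-versus-placebo comparison is a two-proportion z-test, sketched below with the standard library. The patient counts are hypothetical, the function name `two_proportion_z_test` is illustrative, and the test assumes two independent, reasonably large random samples.

```python
import math
from statistics import NormalDist

def two_proportion_z_test(success_a, n_a, success_b, n_b):
    """Two-tailed z-test for H0: both groups have the same success proportion."""
    p_a = success_a / n_a
    p_b = success_b / n_b
    # Under H0 the two groups share one proportion, estimated by pooling.
    pooled = (success_a + success_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_a - p_b) / se
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value

# Hypothetical trial: 60 of 100 patients improved on the drug, 45 of 100 on the placebo.
z, p_value = two_proportion_z_test(60, 100, 45, 100)

print(f"z = {z:.3f}, p-value = {p_value:.4f}")
print("Significant at the 5% level:", p_value < 0.05)
```

A p-value below the chosen significance level suggests the difference seen in the sample is unlikely to be due to chance alone.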

Introduction
Hypothesis testing is one of the most important techniques applied in fields such as
statistics, economics, pharmaceuticals, mining, and manufacturing. Suppose we want
to know whether something took place, whether certain medicines are effective, whether groups
differ from each other, or whether one variable predicts another.

All in all, we want to determine whether the collected data differ from a reference value or from another
set of data to a statistically significant degree. This article is for anyone who wants to know and understand
the concept of hypothesis testing, which is a significant component of inferential statistics. The 5 steps taken
to conduct hypothesis testing are explained in detail.

Alright, let’s begin!

What is Hypothesis Testing?

Hypothesis testing is an inferential statistical method that uses sample data to test assumptions
about a population parameter (a characteristic that describes a population).

Unlike inferential statistics, descriptive statistics simply describes a data set without helping to draw
inferences. In this sense, inferential statistics goes beyond descriptive statistics. It is particularly
useful when it is not possible to examine every data point in the population.

Inferential statistics allows researchers to make generalizations about a
population by using a representative sample. However, since a sample can almost never
predict the behavior of a population exactly, the results always carry some
uncertainty.

Further, sampling error arises here. It is the difference between a sample statistic
and the true population value, and it grows when the sample drawn does not represent
the entire population. To reduce this error, it is recommended to collect a random
sample before applying inferential statistics.

Inferential statistics requires logical reasoning to arrive at the results. The procedure
for reaching the outcomes is as follows:

a. A sample is chosen from the population that needs to be studied. The chosen sample must
reflect the nature and characteristics of the population.
b. The tools of inferential statistics are applied to the sample to assess its behavior. These
include regression models and hypothesis testing models. The former consist of
linear regression, nominal regression, logistic regression, etc., while the latter consist of the
z-test, t-test, F-test, analysis of variance (ANOVA), etc.
c. Inferences are drawn from the sample chosen in the first step. The inferences are
estimates or conclusions about the entire population.
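
The three steps can be sketched in a few lines of Python. Everything here is simulated: the "population" is a list of made-up heights, which lets the sampling error be checked directly. In a real study the population mean would be unknown.

```python
import random
import statistics

random.seed(42)

# Hypothetical population of 10,000 heights in cm (normally unknown to the researcher).
population = [random.gauss(170, 10) for _ in range(10_000)]
population_mean = statistics.mean(population)

# Step a: draw a random sample so that it reflects the population.
sample = random.sample(population, 200)

# Step b: apply a tool to the sample (here, the mean and its standard error).
sample_mean = statistics.mean(sample)
standard_error = statistics.stdev(sample) / len(sample) ** 0.5

# Step c: use the sample result as an estimate for the whole population.
sampling_error = sample_mean - population_mean

print(f"Estimated population mean: {sample_mean:.2f} (standard error {standard_error:.2f})")
print(f"Actual sampling error in this simulation: {sampling_error:.2f}")
```

The standard error gives a sense of how large the sampling error is likely to be, even when the true population mean cannot be observed.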

Types
Let us go through the types of tools used under inferential statistics.

#1 – Regression Analysis
It measures the change in one variable with respect to the other variable. Linear
regression is popularly used in inferential statistics.

#2 – Hypothesis Testing Models
It requires stating the null and alternate hypotheses. Inferences are drawn by
considering the critical value, test statistic, and confidence interval. A hypothesis
test can be two-tailed, left-tailed, or right-tailed. The hypothesis testing models
consist of the following tools:
a) Z-test

Z-test is used when the sample size is greater than or equal to 30 and the data set
follows a normal distribution. The population variance is known to the researcher.
The formulas are given as follows:
Null hypothesis: H0: μ = μ0

Alternate hypothesis: H1: μ > μ0

Test statistic: z = (x̄ − μ0) / (σ / √n)

where,

• x̄ = sample mean
• μ = population mean (μ0 is its hypothesized value under H0)
• σ = standard deviation of the population
• n = sample size
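
A one-sample z-test divides the gap between the sample mean and the hypothesized mean by σ/√n. The sketch below computes that statistic and the matching p-value for a right-tailed, left-tailed, or two-tailed test. The function name `z_test`, its `tail` parameter, and the input numbers are all illustrative assumptions.

```python
import math
from statistics import NormalDist

def z_test(sample_mean, mu0, sigma, n, tail="right"):
    """One-sample z-test with known population standard deviation sigma."""
    z = (sample_mean - mu0) / (sigma / math.sqrt(n))
    cdf = NormalDist().cdf(z)
    if tail == "right":    # H1: mu > mu0
        p_value = 1 - cdf
    elif tail == "left":   # H1: mu < mu0
        p_value = cdf
    elif tail == "two":    # H1: mu != mu0
        p_value = 2 * min(cdf, 1 - cdf)
    else:
        raise ValueError("tail must be 'right', 'left', or 'two'")
    return z, p_value

# Hypothetical data: sample of 36 with mean 52, testing H0: mu = 50 with sigma = 6.
z, p_value = z_test(sample_mean=52.0, mu0=50.0, sigma=6.0, n=36)

print(f"z = {z:.2f}, right-tailed p-value = {p_value:.4f}")
print("Reject H0 at the 5% level:", p_value < 0.05)
```

The right-tailed version matches the alternate hypothesis H1: μ > μ0 used in this section; the other two options cover the left-tailed and two-tailed cases.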

b) T-test

T-test is used when the sample size is less than 30 and the population variance is not
known to the researcher. Under the null hypothesis, the test statistic follows a
t-distribution with n − 1 degrees of freedom, provided the data are approximately
normal. The formulas are given as follows:
Null Hypothesis: H0: μ = μ0

Alternate Hypothesis: H1: μ > μ0

Test statistic: t = (x̄ − μ0) / (s / √n)

The representations x̄, μ, and n are the same as stated for the z-test. The letter “s”
represents the standard deviation of the sample.
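
A one-sample t-test works like the z-test but uses the sample standard deviation s in place of σ. The sketch below computes the t statistic for a small hypothetical sample and compares it with a right-tailed 5% critical value read from a standard t table, since the Python standard library has no t-distribution. The function name `one_sample_t_statistic` is illustrative.

```python
import math
import statistics

def one_sample_t_statistic(sample, mu0):
    """Return the t statistic and degrees of freedom for H0: mu = mu0."""
    n = len(sample)
    s = statistics.stdev(sample)  # sample standard deviation (n - 1 in the denominator)
    t = (statistics.mean(sample) - mu0) / (s / math.sqrt(n))
    return t, n - 1

# Hypothetical sample of 10 measurements, testing H0: mu = 12.0 against H1: mu > 12.0.
sample = [12.1, 11.8, 12.4, 12.9, 12.3, 12.7, 12.0, 12.6, 12.5, 12.2]
t_stat, df = one_sample_t_statistic(sample, mu0=12.0)

# Right-tailed 5% critical value of t with 9 degrees of freedom (from a t table).
T_CRITICAL_RIGHT = 1.833
reject_null = t_stat > T_CRITICAL_RIGHT

print(f"t = {t_stat:.3f} with {df} degrees of freedom")
print("Reject H0 at the 5% level:", reject_null)
```

With a different sample size or significance level, the critical value would need to be looked up again for the matching degrees of freedom.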

c) F-test

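F-test checks whether a difference between the variances of two samples or
populations exists or not. The formulas are given as follows:
Null hypothesis: H0: σ1² = σ2²

Alternate hypothesis: H1: σ1² > σ2²

Test statistic: F = s1² / s2²

where,

• s1² = variance of the first sample
• s2² = variance of the second sample
• σ1², σ2² = variances of the two populations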
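
A common form of the F statistic for comparing two variances is the ratio of the two sample variances. The sketch below computes it for two made-up samples and compares it with an upper 5% critical value taken from a standard F table. The function name `f_statistic` and the data are hypothetical, and the test assumes both samples come from normally distributed populations.

```python
import statistics

def f_statistic(sample_1, sample_2):
    """Ratio of sample variances with its numerator and denominator degrees of freedom."""
    var_1 = statistics.variance(sample_1)
    var_2 = statistics.variance(sample_2)
    return var_1 / var_2, len(sample_1) - 1, len(sample_2) - 1

# Hypothetical measurements from two machines; machine A looks more variable.
machine_a = [10, 14, 9, 15, 8, 16, 11, 13, 7, 17]
machine_b = [11, 13, 12, 12, 11, 13, 12, 12, 10, 14]

f_stat, df_1, df_2 = f_statistic(machine_a, machine_b)

# Upper 5% critical value of F with (9, 9) degrees of freedom (from an F table).
F_CRITICAL = 3.18
variances_differ = f_stat > F_CRITICAL

print(f"F = {f_stat:.3f} with ({df_1}, {df_2}) degrees of freedom")
print("Variance of A significantly larger than B at the 5% level:", variances_differ)
```

Placing the sample expected to have the larger variance in the numerator keeps the test right-tailed.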

d) Confidence interval

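It gives the range of values within which the population parameter is estimated to
fall, based on the sample. A higher confidence level (for example, 99% instead of
95%) gives a wider interval. When a narrow interval is obtained at a high confidence
level, one can state more confidently that the sample results reflect the behavior of
the population.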

Example
Let us consider an example of inferential statistics.

Mr. A wants to open a coffee shop in New York, USA. To design the appropriate
menu, a survey is conducted on 300 residents with the aim of understanding their
tastes and preferences. The survey includes people of different age groups, genders,
and income classes. After applying the tools of inferential statistics, the results are
stated as follows:
• 70% of women like the caramel macchiato.
• 50% of the total residents like café mocha.
• Almost 100% of the adults like Americano coffee.
• 25% of teenagers like café latte.

With these outcomes, Mr. A is confident that including all the above varieties of
coffee will bring diverse customers to his shop. Moreover, Mr. A also wants to add
new, innovative flavors to give a rich drinking experience to his customers.
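
To show how a survey figure becomes an inference about all residents, the sketch below builds a 95% confidence interval around the café mocha result (50% of 300 respondents). It assumes the 300 residents are a simple random sample and uses the normal approximation. The function name `proportion_confidence_interval` is illustrative. The subgroup sizes (women, adults, teenagers) are not stated in the example, so the other percentages cannot be treated the same way here.

```python
import math
from statistics import NormalDist

def proportion_confidence_interval(p_hat, n, level=0.95):
    """Normal-approximation confidence interval for a population proportion."""
    z = NormalDist().inv_cdf(0.5 + level / 2)
    margin = z * math.sqrt(p_hat * (1 - p_hat) / n)
    # A proportion cannot fall outside [0, 1].
    return max(0.0, p_hat - margin), min(1.0, p_hat + margin)

# From the example: 50% of the 300 surveyed residents like café mocha.
low, high = proportion_confidence_interval(0.50, 300)

print(f"95% interval for the share of residents who like café mocha: {low:.3f} to {high:.3f}")
```

Under these assumptions, the interval indicates how far the share in the full population could plausibly differ from the 50% seen in the survey.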

Inferential Statistics vs Descriptive Statistics
The differences between inferential and descriptive statistics are listed as follows:

| Differentiators | Inferential Statistics | Descriptive Statistics |
|---|---|---|
| Definition | It helps make inferences about the population from which a representative sample has been drawn. | It describes the data set by showing a summary of the data points. |
| Tools for analysis | The tools used are regression analysis and hypothesis tests. | The tools used are measures of dispersion (range and standard deviation) and central tendency (mean, median, and mode). |
| Uncertainty | There is uncertainty as the behavior of the unknown population is predicted from the results of a known sample. This uncertainty is reflected in the sampling error. | There is no uncertainty as one describes the data points that have been actually measured. |
| Applicability | It is used when each data point of the population cannot be conveniently examined. | It is used when a numerical summary or graphical representation of the data points is required. |
