L1-D3 Concepts of Data Analysis

Uploaded by

Simar

Concepts of Data Analysis

www.infocepts.com
Data analysis
“Data analysis is a process of inspecting, cleansing, transforming, and modeling data
with the goal of discovering useful information, informing conclusions,
and supporting decision-making” -Wikipedia
Attributes: Quantitative vs. Qualitative

Usage
  Quantitative: Helpful in “answering questions of who, where, how many, how much, and what is the relationship between specific variables”.
  Qualitative: Provides naturally occurring information and assists in answering why and how questions.

Type of Data
  Quantitative: Hard data are collected, in the form of numbers, counts and other statistical formulae.
  Qualitative: Soft data are collected, in the form of words (texts, images, artefacts, narratives) and everything else.

Process
• Quantitative: Conventions for data analysis are clear and formulated, and the process is predictable.
  Qualitative: Methods of data analysis are not clearly formulated and the process is not predetermined.
• Quantitative: Data analysis is usually done at the end, in a linear fashion, when all data has been collected.
  Qualitative: Data is analyzed as it is collected, because data collection and analysis are interactive and occur in overlapping cycles.
• Quantitative: Not flexible; it is usually difficult to follow up on promising hunches.
  Qualitative: Flexible; allows adjustments during data collection through supplementary questions to gather additional data.

Approach
• Quantitative: Standardized data is collected by measuring either qualitative or quantitative variables.
  Qualitative: Huge amounts of data need to be summarized and interpreted.
• Quantitative: The analyst seeks to verify or test a theory, and the approach tends to be confirmatory.
  Qualitative: The analyst lets the data and its interpretation guide the analysis without any assumption, and the approach tends to be exploratory.

Focus
  Quantitative: Relationships between independent and dependent variables are of major concern (tends to be variable-centric).
  Qualitative: Focus on the meaning of events and actions as expressed by the participants (case-centric).

Tools & Techniques
  Quantitative: Statistical and probability techniques, mostly driven by mathematical and numerical methods.
  Qualitative: Non-mathematical or non-numerical methods such as content analysis, grounded theory, conversation analysis, etc.

Visual analysis of data - Scatter Plot

Sample data: order count per day

Day           1  2  3  4  5  6  7  8  9 10
Order Count  33 26 24 21 19 20 18 18 52 56

Day          11 12 13 14 15 16 17 18 19 20
Order Count  27 22 18 49 22 20 23 32 20 18

Day          21 22 23 24 25 26 27 28 29 30
Order Count  19 20 30 22 23 32 25 53 57 50

• Visual analysis helps in determining repeated patterns, high-level linear trends and outliers in data.
• Analysts use a scatter plot as the first visualization to see data patterns.
• Scatter plots are very helpful in understanding the relationship between two quantitative variables visually.
• They make it easy to identify clusters, patterns and outliers for further analysis.

Patterns and outliers in Scatter Plots

[Figure: example scatter plots showing a negative relationship, a positive relationship, a linear relationship, a non-linear relationship, an outlier and a cluster]
Introduction to Statistics
Descriptive Statistics

Population and Samples

In statistical analysis, the complete set of data to be analysed is termed the population.

The population size (N) may be known or unknown. E.g., for a swine-flu infection analysis the population can be everyone in the world infected by swine flu, which is difficult to quantify.
A sample is always a portion of the population, and its size (n) is always known.
• A population consists of the set of all entities for which the analysis is performed.
• A sample is a subset of entities selected from the population and measured.
• When population data is unrealistic to collect, analysts use a random sample of the data and infer results for the population (e.g., the spread of swine flu in the world).
• Sampling from the population is often done randomly, such that every possible sample of equal size (n) has an equal chance of being selected. Such a sample is called a simple random sample.
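A simple random sample can be sketched with Python's standard library. The population of 1000 numeric IDs, the sample size of 30 and the function name `simple_random_sample` are illustrative assumptions, not part of the slides.

```python
import random


def simple_random_sample(population, n, seed=None):
    """Draw n distinct entities so that every subset of size n is equally likely."""
    rng = random.Random(seed)          # a seed makes the draw repeatable
    return rng.sample(population, n)   # sampling without replacement


# Hypothetical population: entity IDs 1..1000 (N = 1000).
population = list(range(1, 1001))
sample = simple_random_sample(population, n=30, seed=42)

print(f"N = {len(population)}, n = {len(sample)}")
print(sorted(sample))
```

Sampling without replacement keeps the sample a true subset of the population, which matches the definition of a sample given above.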

Descriptive Statistics

• Descriptive statistics refers to statistical methods of describing and summarizing data using tabular, visual, and quantitative techniques.
• It provides a summary of numerical statistical measures that describe location, dispersion, and shape for sample data.
• A variable is a single characteristic of the data.

Descriptive statistics provide the following information about data:
• Central tendency is an average value of a distribution of data that best represents the middle. Also called centrality.
• Dispersion or variability describes the spread or scattering of data from its central location.

Measures of Central Tendency
• Mean is the average of all values of a variable in a sample.
• Median is the value at the centre of the ordered values of a variable in a sample.
• Mode is the most frequently repeated value in the sample. There can be more than one mode for a sample.

Measures of Dispersion
• Range is the difference between the maximum and minimum values.
• Interquartile range is the difference between the third and first quartiles (Q3 - Q1).
• Variance is the average of the squared deviations from the mean.
• Standard deviation is the square root of the variance.
Measures of Central Tendency

Mean
• The mean is the arithmetic average of data values.
• Mean = sum of values divided by the count of values.
• The mean is affected by extreme values (outliers).

Population mean (N = population size): $\mu = \frac{1}{N}\sum_{i=1}^{N} x_i = \frac{x_1 + x_2 + \cdots + x_N}{N}$
Sample mean (n = sample size): $\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i = \frac{x_1 + x_2 + \cdots + x_n}{n}$

Mode
• Mode is the value that is repeated most often, i.e. the value with the highest frequency or count of records.
• Mode is not affected by extreme values (outliers).
• It is used for either numerical or categorical data where mean or median has no meaning.
• There may be no mode or several modes in a data set. Hence, it is seldom used unless absolutely required.

Median
• In an ordered (sorted) data set, the median is the “middle” value, at about the n/2 (or N/2) location, i.e., the number that splits the data set in half.
• When n or N (the count of values) is even, the median is computed as the average of the two middle values.
• The median is not affected by extreme values (outliers).
• Useful when data are highly skewed.

Choosing Right Measure
• Mean is generally used, unless extreme values (outliers) exist.
• The median is often used when the data is highly skewed.
• Mode is used when mean or median are not useful or the data is categorical.

For example, when a television retailer decides how many of each screen size to stock, a mean screen size of 32.53 inches for the television sets sold will not help, because there is no television with a 32.53-inch screen. Knowing that the mode is 30 inches tells him the screen size that is sold most.
Sample Measure of Central Tendency
Illustration I

What is the average order receipt rate for September 2018?

Sample mean = sum of the 30 daily order counts / 30 = 869 / 30 = 28.97 orders/day ~ 29 orders per day

Median for even sample size = (sum of 15th and 16th ordered values) / 2

= (23 + 23) / 2 = 23

The average order receipt per day for September 2018 is 29 orders/day.

What percent of days was the minimum sale recorded?

Minimum value = 18

Mode = 18 and 20 (each value is repeated on four days)

Percent of month that had minimum business = 4/30 ≈ 0.13

About 13% of September 2018 had minimum business.
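The measures in Illustration I can be recomputed with Python's `statistics` module. This is a sketch that assumes the 30 daily order counts listed with the scatter plot are the sample in question; the variable names are illustrative.

```python
import statistics

# Daily order counts for the 30 days, copied from the Day / Order Count data.
orders = [33, 26, 24, 21, 19, 20, 18, 18, 52, 56,
          27, 22, 18, 49, 22, 20, 23, 32, 20, 18,
          19, 20, 30, 22, 23, 32, 25, 53, 57, 50]

mean = statistics.mean(orders)                 # sum of values / count of values
median = statistics.median(orders)             # average of the two middle ordered values
modes = sorted(statistics.multimode(orders))   # every value tied for the highest frequency

minimum = min(orders)
days_at_minimum = orders.count(minimum)
share_at_minimum = days_at_minimum / len(orders)

print(f"mean    = {mean:.2f} orders/day")
print(f"median  = {median}")
print(f"mode(s) = {modes}")
print(f"minimum = {minimum}, on {days_at_minimum} of {len(orders)} days "
      f"({share_at_minimum:.0%})")
```

`multimode` is used rather than `mode` because a sample can have several modes, as noted earlier.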

Measures of dispersion
Range

• Range is the span of values of a variable that appear in the data.

• It helps understand the boundaries of the spread of values of a variable in a sample.

• It is calculated as: Range = maximum value - minimum value.

• As it depends on the maximum and minimum values, it is sensitive to outliers.

Percentile

For any particular number r between 0 and 100, the r-th percentile is a value such that r percent of the observations in the data set fall at or below that value. E.g., the 95th percentile means 95% of the data is at or below that value.

The most common way to compute the r-th percentile is to:
1. Order the data values from smallest to largest.
2. Calculate the rank of the r-th percentile using the formula I = (r / 100) × n, where n is the number of values.
3. Round I to the nearest integer.
4. Locate the value at position I from the smallest end.
5. That value is the r-th percentile value.

Quartile & interquartile range

• The 25th, 50th and 75th percentiles are called quartiles.
• Quartiles divide the data set into 4 equal sets, each holding 25% of the records.
• The 25th and 75th percentiles are called the first and third quartiles respectively.
• The 0th quartile is the minimum value and the 4th quartile is the maximum value.
• The 50th percentile or 2nd quartile is called the median.
• The difference between the values at the third and first quartiles is called the interquartile range (IQR). It contains the middle 50% of the data. Hence it is not sensitive to outliers.
• IQR is used to find outliers, which are defined as values below Q1 - 1.5*IQR or above Q3 + 1.5*IQR.
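The percentile steps can be sketched as a small function. It takes the rank to be (r / 100) × n, the formula used for the quartile locations in Illustration II. Rounding halves upward and clamping the position to the range 1..n are assumptions made here, and the function name `percentile` is illustrative.

```python
def percentile(values, r):
    """Return the r-th percentile using the rank-and-locate steps."""
    ordered = sorted(values)                 # step 1: order smallest to largest
    n = len(ordered)
    rank = r / 100 * n                       # step 2: rank I = (r / 100) * n
    position = int(rank + 0.5)               # step 3: nearest integer (halves round up)
    position = min(max(position, 1), n)      # keep the position inside 1..n
    return ordered[position - 1]             # steps 4-5: value at position I


# Daily order counts for the 30 days, copied from the Day / Order Count data.
orders = [33, 26, 24, 21, 19, 20, 18, 18, 52, 56,
          27, 22, 18, 49, 22, 20, 23, 32, 20, 18,
          19, 20, 30, 22, 23, 32, 25, 53, 57, 50]

q1 = percentile(orders, 25)
q2 = percentile(orders, 50)
q3 = percentile(orders, 75)
print(f"Q1 = {q1}, Q2 = {q2}, Q3 = {q3}, IQR = {q3 - q1}")
```

Other percentile conventions interpolate between neighbouring values, so library functions may return slightly different quartiles for the same data.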
Sample Measure of Dispersion – Range & Quartile
Illustration II

[Figure: scatter plot of Order Count against Day of Month, with Q1, Q3 and the outliers marked]

Find the outliers in the data.

1st quartile location = 30 * 25/100 = 30 * 0.25 = 7.5 ~ 8th location
Q1 = value at 8th location = 20
3rd quartile location = 30 * 75/100 = 30 * 0.75 = 22.5 ~ 22nd location
Q3 = value at 22nd location = 32
Interquartile range = IQR = 32 – 20 = 12
Outlier value on minimum side = 20 – (1.5*12) = 20 – 18 = 2
Outlier value on maximum side = 32 + (1.5*12) = 32 + 18 = 50
So any value < 2 or > 50 is a probable outlier.

Any order count less than 2 or greater than 50 is a probable outlier and needs further detailed analysis.

What is the range of receipt of orders per day on days when business is higher than minimum?

On 50% of the days, the order receipt was between 20 and 32. So we can say that for half of the month, when business was more than minimum, the order receipts per day were in the range from 20 to 32 orders.
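The outlier check can be sketched as follows, taking Q1 = 20 and Q3 = 32 from the illustration as given inputs. The function name `iqr_fences` and the use of the listed 30 order counts are assumptions for illustration.

```python
def iqr_fences(q1, q3, k=1.5):
    """Return the (lower, upper) limits Q1 - k*IQR and Q3 + k*IQR."""
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


# Daily order counts for the 30 days, copied from the Day / Order Count data.
orders = [33, 26, 24, 21, 19, 20, 18, 18, 52, 56,
          27, 22, 18, 49, 22, 20, 23, 32, 20, 18,
          19, 20, 30, 22, 23, 32, 25, 53, 57, 50]

lower, upper = iqr_fences(q1=20, q3=32)

# Pair each value with its day number (1-based) and keep those outside the limits.
outliers = [(day, count) for day, count in enumerate(orders, start=1)
            if count < lower or count > upper]

print(f"lower limit = {lower}, upper limit = {upper}")
for day, count in outliers:
    print(f"day {day}: {count} orders is a probable outlier")
```

Values exactly on a limit are not flagged, because the rule uses strict "below" and "above" comparisons.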
Variance & Standard Deviation

• The spread of data from its mean is called the deviation from the mean.

• Deviation is calculated as: deviation = value - mean.

• The sum of all deviations in a population or sample is zero. Hence the deviations cannot simply be added up to give a single measure of spread for the entire sample or population.

• Hence, each deviation is squared and then averaged like the mean, which gives a single value for the population or sample called the variance, denoted by sigma squared ($\sigma^2$) for a population and $s^2$ for a sample.

• Variance is in squared units, unlike deviation. To describe data in its own units, the square root of the variance is defined as the standard deviation, denoted by sigma (σ) for a population and s for a sample.

• Standard deviation is used to state the approximate percentage of values that lie within k standard deviations of the mean of a data set (μ ± kσ), if the data are normally distributed. Generally k is 1, 2 or 3.

[Formulas: population variance and standard deviation; sample variance and standard deviation]
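In standard notation, the definitions above can be written as follows. The population form divides by N, matching the "squared and then averaged" description. The sample form with divisor n - 1 is the usual textbook convention and is an assumption here, since the text itself only speaks of averaging.

```latex
% Population variance and standard deviation
\sigma^2 = \frac{1}{N}\sum_{i=1}^{N} (x_i - \mu)^2,
\qquad
\sigma = \sqrt{\sigma^2}

% Sample variance and standard deviation (n - 1 divisor assumed)
s^2 = \frac{1}{n-1}\sum_{i=1}^{n} (x_i - \bar{x})^2,
\qquad
s = \sqrt{s^2}
```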

Sample Measure of Dispersion – Variance & standard deviation
Illustration III

What is the range of receipt of orders per day for 80% of September 2018?

μ = mean = 29

N = 30

Variance = σ² = (sum of squared deviations from the mean) / N = 4839 / 30 = 161.3

Standard deviation = σ = √161.3 = 12.7 ~ 13

For 80% of the month, the order receipts per day were in the range from 16 to 42 orders (29 ± 13).

From our plot, around 80% of values lie between 16 and 42.
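The figures in Illustration III can be checked with a short sketch. It assumes the 30 listed order counts are treated as a population (divisor N), as in the illustration, and uses the rounded mean and standard deviation to form the interval.

```python
import statistics

# Daily order counts for the 30 days, copied from the Day / Order Count data.
orders = [33, 26, 24, 21, 19, 20, 18, 18, 52, 56,
          27, 22, 18, 49, 22, 20, 23, 32, 20, 18,
          19, 20, 30, 22, 23, 32, 25, 53, 57, 50]

mu = statistics.mean(orders)
variance = statistics.pvariance(orders)   # population variance: divides by N
sigma = statistics.pstdev(orders)         # population standard deviation

# Interval of one standard deviation around the mean, using rounded values.
low = round(mu) - round(sigma)
high = round(mu) + round(sigma)
share = sum(low <= count <= high for count in orders) / len(orders)

print(f"variance = {variance:.1f}, standard deviation = {sigma:.1f}")
print(f"{share:.0%} of days fall between {low} and {high} orders")
```

The observed share is larger than the roughly 68% expected within one standard deviation for normally distributed data, which is consistent with the high outliers stretching the standard deviation.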

Statistical Distribution
Normal Distribution, Skewness, Kurtosis and
central limit theorem

Statistical Distribution

• A statistical distribution is a graphical depiction of frequency counts or probabilities for the various values of a variable that can occur.

• Distributions are important because most of the analyses done in business statistics are based on the characteristics of a particular distribution.

• In statistical experiments involving chance, outcomes occur randomly. Hence, the probability of occurrence of values is used to study a distribution.

• A random variable is a variable that contains the outcomes of a chance experiment.

• Experimental findings show that the most commonly seen distribution of random variable probabilities, in nature and in man-made things, is the normal distribution. Hence the normal distribution is the most commonly used distribution in statistics.

Characteristics of Normal Distribution

• The distribution is symmetric, so its measure of skewness is zero.

• The mean, median, and mode are all equal. Thus, half the area falls above the mean and half falls below it.

• The empirical rules apply exactly for the normal distribution:
  - Around 68.3% of observations fall within 1 standard deviation of the mean.
  - Around 95.4% of observations fall within 2 standard deviations of the mean.
  - Around 99.7% of observations fall within 3 standard deviations of the mean.
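The empirical rule percentages can be reproduced from the normal cumulative distribution function. This sketch uses `statistics.NormalDist` from the Python standard library (Python 3.8 or later); the helper name `share_within` is illustrative.

```python
from statistics import NormalDist

standard_normal = NormalDist(mu=0.0, sigma=1.0)


def share_within(k):
    """Probability that a normal value lies within k standard deviations of the mean."""
    return standard_normal.cdf(k) - standard_normal.cdf(-k)


for k in (1, 2, 3):
    print(f"within {k} standard deviation(s): {share_within(k):.1%}")
```

Because the shares depend only on k, the same percentages hold for any mean and standard deviation.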

Frequency Distribution

• Frequency is the number of occurrences of a value, a range of values, or a category in a data set.
• Categorical data is grouped into categories, and non-categorical numeric data can be grouped into ranges. Each category or group is called a class.
• A frequency table shows the frequency for each class, along with the limits of each class.
• A histogram is the graph that plots the data from a frequency table.
• A frequency distribution curve helps analysts study the distribution of data across classes and verify hypotheses about the data, in order to infer conclusions from it using inferential statistics.

[Figure: Frequency Table and Histogram of our sample data; axes: Class, Frequency]

Skewness
• Skewness is a measure of the degree of asymmetry of a frequency distribution.
• A coefficient of skewness of 0 indicates a symmetric distribution; negative values indicate a longer left tail and positive values a longer right tail. A median-based coefficient, (mean - median) / standard deviation, always lies between -1 and 1.

Kurtosis
• Kurtosis is a measure of the flatness or peakedness of a frequency distribution.
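A frequency table and a text histogram for the sample data can be sketched as follows. The class limits (width 10, starting at 10) are an assumption chosen for illustration and need not match the classes used in the original figure; the function name `frequency_table` is likewise illustrative.

```python
from collections import Counter

# Daily order counts for the 30 days, copied from the Day / Order Count data.
orders = [33, 26, 24, 21, 19, 20, 18, 18, 52, 56,
          27, 22, 18, 49, 22, 20, 23, 32, 20, 18,
          19, 20, 30, 22, 23, 32, 25, 53, 57, 50]


def frequency_table(values, start, width):
    """Count values per class [lower, lower + width); assumes all values >= start."""
    counts = Counter((value - start) // width for value in values)
    table = []
    for index in range(max(counts) + 1):
        lower = start + index * width
        table.append((lower, lower + width, counts.get(index, 0)))
    return table


table = frequency_table(orders, start=10, width=10)

# Text histogram: one '#' per observation in the class.
for lower, upper, frequency in table:
    print(f"{lower:>2} to {upper - 1:>2} | {'#' * frequency} {frequency}")
```

With these classes the long right-hand tail of high-order days shows up as a separate bar, which is the asymmetry that skewness measures.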
Central Limit Theorem

When sampling from a population with mean μ and finite standard deviation σ, the sampling distribution of the sample mean will tend to a normal distribution with mean μ and standard deviation $\sigma / \sqrt{n}$ as the sample size becomes large (n > 30).

[Figure: sampling distributions of the sample mean for n = 5, n = 20 and large n]

• The central limit theorem applies when sampling from a population of any distribution.

• Hence it is customary that a sample should consist of 30 or more observations.

• If it is not possible to have 30 or more observations, the sample must be tested for normal distribution.
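The theorem can be illustrated with a small simulation. The skewed exponential population (mean 1, standard deviation 1), the 2000 repeated samples and the fixed seed are all assumptions chosen for this sketch, and the function name `sample_means` is illustrative.

```python
import math
import random
import statistics


def sample_means(n, draws, rng):
    """Return `draws` sample means, each computed from n exponential(1) observations."""
    return [sum(rng.expovariate(1.0) for _ in range(n)) / n for _ in range(draws)]


rng = random.Random(0)     # fixed seed so the run is repeatable
mu, sigma = 1.0, 1.0       # mean and standard deviation of the exponential(1) population

results = {}
for n in (5, 30, 100):
    means = sample_means(n, draws=2000, rng=rng)
    results[n] = (statistics.mean(means), statistics.pstdev(means))
    observed_mean, observed_sd = results[n]
    print(f"n = {n:>3}: mean of sample means = {observed_mean:.3f}, "
          f"standard deviation = {observed_sd:.3f}, "
          f"sigma / sqrt(n) = {sigma / math.sqrt(n):.3f}")
```

Even though the population is strongly skewed, the standard deviation of the sample means should track sigma / sqrt(n) and shrink as n grows.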
Training &

Development

Thank You !
