Week 2 Cheat Sheet

This cheat sheet provides essential mathematical equations and Excel functions for statistics and data analysis, focusing on population vs. sample parameters, descriptive statistics, quartiles, percentiles, and histograms. Key Excel functions include AVERAGE, VAR.P, VAR.S, STDEV.P, STDEV.S, SKEW, KURT, QUARTILE, and PERCENTILE. It also explains how to visualize data distributions using histograms and offers guidance on determining the number of bins.

Uploaded by

raresdynu

Week 2 Cheat Sheet

Statistics and Data Analysis with Excel, Part 1

Charlie Nuttelman

Here, I provide the mathematical equations and some of the important Excel functions required to
perform various calculations in Week 2 of the course. The headings represent the screencasts in which
you will find those calculations and concepts. Not all screencasts are referenced below, just the ones
that have complex mathematical formulas or Excel formulas that are tricky to use.

Difference Between Population and Sample

The population size is denoted by 𝑁 and the sample size is denoted by 𝑛. Population mean (𝜇) can be
estimated using the sample mean (𝑥̅). Population variance (𝜎²) can be estimated using the sample
variance (𝑠²). Standard deviation is the square root of variance.

Formulas for these parameters are:

$$\mu = \frac{1}{N}\sum_{i=1}^{N} x_i$$

$$\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i$$

$$\sigma^2 = \frac{1}{N}\sum_{i=1}^{N} (x_i - \mu)^2$$

$$s^2 = \frac{1}{n-1}\sum_{i=1}^{n} (x_i - \bar{x})^2$$

Population and sample mean (average) can be calculated using the AVERAGE function in Excel.
Population and sample variances can be calculated using the VAR.P and VAR.S formulas, respectively,
and the population and sample standard deviations can be calculated using the STDEV.P and STDEV.S
formulas, respectively. The COUNT function is useful in counting the number of observations.
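For readers who want to check these Excel results outside a spreadsheet, the sketch below mirrors each function with Python's standard `statistics` module. The data values are made up for illustration and do not come from the course.

```python
import statistics

# Hypothetical data set, chosen only for illustration
data = [2, 4, 4, 4, 5, 5, 7, 9]

n = len(data)                            # COUNT
mean = statistics.mean(data)             # AVERAGE
var_pop = statistics.pvariance(data)     # VAR.P   (divides by N)
var_sample = statistics.variance(data)   # VAR.S   (divides by n - 1)
sd_pop = statistics.pstdev(data)         # STDEV.P (square root of VAR.P)
sd_sample = statistics.stdev(data)       # STDEV.S (square root of VAR.S)

print(n, mean, var_pop, var_sample, sd_pop, sd_sample)
```

The sample variance is always a little larger than the population variance for the same numbers, because it divides the same sum of squared deviations by n - 1 instead of N.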

The Summation Symbol

The summation symbol, Σ, or Greek letter sigma, is used as an indicator to sum over the expression that
follows the symbol. The integer below the symbol (typically written as some index variable equal to 1, or
other number) is the start value at which iteration and summation will begin. Above the summation
symbol is the stop value, or the number at which iteration and summation will end.
For example, if x = {1, 2, 3, 4, 5}, then:

$$\sum_{i=1}^{5} x_i = x_1 + x_2 + \cdots + x_5 = 1 + 2 + 3 + 4 + 5 = 15$$

The summation symbol is used in the definition and calculation of average and variance (see the formulas above).
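The same sum can be written as an explicit loop, which makes the roles of the start and stop values visible. This is only an illustrative sketch; the variable names are arbitrary.

```python
x = [1, 2, 3, 4, 5]

start, stop = 1, 5          # the values written below and above the sigma
total = 0
for i in range(start, stop + 1):
    total += x[i - 1]       # x_i in 1-based math notation is x[i - 1] in Python

print(total)                # 15
```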

Descriptive Statistics
Another common measure of spread in a set of data is the range of the data, which is just the maximum
value in the data set minus the minimum value. We can calculate the maximum value of a set of data in
Excel using the MAX function and the minimum value using the MIN function; the range is simply the
difference between those two values.

Skewness and kurtosis are sometimes used to describe how the shape of a set of data departs from
the normal distribution: skewness measures asymmetry, and kurtosis measures how heavy the tails are.
The SKEW and KURT functions in Excel can determine these parameters. For
more information on how to interpret these values, please visit support.microsoft.com.
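As a rough illustration, the sketch below computes the range and the bias-adjusted sample skewness and excess kurtosis in Python. The two formulas are written from memory of what Microsoft documents for SKEW and KURT, so treat them as an assumption and check them against the Excel help pages. The function names `data_range`, `excel_skew`, and `excel_kurt` are hypothetical.

```python
import statistics

def data_range(values):
    """Range = maximum minus minimum (MAX - MIN in Excel)."""
    return max(values) - min(values)

def excel_skew(values):
    """Bias-adjusted sample skewness (assumed to match Excel's SKEW)."""
    n = len(values)
    if n < 3:
        raise ValueError("skewness needs at least 3 values")
    mean = statistics.mean(values)
    s = statistics.stdev(values)          # sample standard deviation
    cubed = sum(((v - mean) / s) ** 3 for v in values)
    return n / ((n - 1) * (n - 2)) * cubed

def excel_kurt(values):
    """Bias-adjusted excess kurtosis (assumed to match Excel's KURT)."""
    n = len(values)
    if n < 4:
        raise ValueError("kurtosis needs at least 4 values")
    mean = statistics.mean(values)
    s = statistics.stdev(values)
    fourth = sum(((v - mean) / s) ** 4 for v in values)
    scale = n * (n + 1) / ((n - 1) * (n - 2) * (n - 3))
    correction = 3 * (n - 1) ** 2 / ((n - 2) * (n - 3))
    return scale * fourth - correction

data = [5, 9, 12, 14, 17, 18, 21, 22, 25]
print(data_range(data))                   # 20
print(excel_skew(data), excel_kurt(data))
```

A perfectly symmetric data set such as 1, 2, 3, 4, 5 gives a skewness of zero, and a negative kurtosis indicates lighter tails than the normal distribution.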

Quartiles and Percentiles

For either quartiles or percentiles, we first determine a rank, 𝑘, by using one of the formulas below. We
can either include the median or exclude the median (it is more common to exclude the median). The
parameter 𝑝 is the desired percentile; for quartiles, the first quartile is the same as the 25th percentile
(𝑝 = 0.25) and the third quartile is the 75th percentile (𝑝 = 0.75).

Including the median: 𝑘 = 𝑝 ∙ (𝑛 − 1) + 1

Excluding the median: 𝑘 = (𝑛 + 1) ∙ 𝑝

Once we have the rank, we can linearly interpolate between ordered values in our data. For example, if
our (ordered) data is: 5, 9, 12, 14, 17, 18, 21, 22, 25 (𝑛 = 9) and we wish to find the first quartile
including the median, we would calculate the rank as 𝑘 = 0.25 ∙ (9 − 1) + 1 = 3. Therefore, the first
quartile in this case is 12, the 3rd value of the ordered data. Similarly, the third quartile would be
calculated to be 21 (𝑘 = 7).

For the same data set, if we wished to find the first quartile excluding the median, we would calculate
the rank as 𝑘 = (9 + 1) ∙ 0.25 = 2.5. Therefore, we linearly interpolate 50% of the way between the 2nd
and 3rd values of the ordered data, and the first quartile is 9 + 0.5 x (12 – 9) = 10.5. Similarly, the third
quartile would be calculated to be 21.5 (𝑘 = 7.5).

Percentiles are calculated exactly the same but 𝑝 can be any continuous value between 0 and 1. For
example, for the above data set if we wanted to calculate the median-excluded 13th percentile, we
calculate the rank: 𝑘 = (9 + 1) ∙ 0.13 = 1.3. The 13th percentile is then 30% of the way between the 1st
and 2nd of the ordered values = 5 + 0.3 x (9 – 5) = 6.2.
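The rank-and-interpolate procedure can be sketched in a few lines of Python. The function name `percentile_by_rank` and its `include_median` flag are hypothetical names introduced for this illustration; the two rank formulas are the ones given above.

```python
import math

def percentile_by_rank(sorted_values, p, include_median=True):
    """Return the 100p-th percentile by interpolating at rank k (1-based)."""
    n = len(sorted_values)
    if include_median:
        k = p * (n - 1) + 1          # including the median
    else:
        k = (n + 1) * p              # excluding the median
    if k < 1 or k > n:
        raise ValueError("rank k falls outside 1..n for this p and n")
    lower = math.floor(k)            # rank of the value just below k
    fraction = k - lower             # how far to move toward the next value
    if lower == n:                   # k lands exactly on the last value
        return sorted_values[-1]
    low_value = sorted_values[lower - 1]
    high_value = sorted_values[lower]
    return low_value + fraction * (high_value - low_value)

data = [5, 9, 12, 14, 17, 18, 21, 22, 25]
print(percentile_by_rank(data, 0.25, include_median=True))    # 12.0
print(percentile_by_rank(data, 0.25, include_median=False))   # 10.5
print(percentile_by_rank(data, 0.75, include_median=False))   # 21.5
```

The data must already be sorted in ascending order. When the median is excluded, very small or very large values of 𝑝 can push the rank outside the data, which this sketch reports as an error rather than extrapolating.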

In Excel, we can use the QUARTILE(data,q), QUARTILE.INC(data,q), and QUARTILE.EXC(data,q) functions to
calculate quartiles, where q = 1 for the 1st quartile and q = 3 for the 3rd quartile. We can use the
PERCENTILE(data,p), PERCENTILE.INC(data,p), and PERCENTILE.EXC(data,p) functions in Excel to
calculate the 100pth percentile (for example, for the 67th percentile p would be 0.67). The .INC versions
include the median and the .EXC versions exclude it; QUARTILE and PERCENTILE without a suffix behave
like the .INC versions.
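Python's standard library offers a similar inclusive/exclusive choice through `statistics.quantiles` (Python 3.8 or later). For the nine-value data set used in the worked examples, it reproduces the quartiles calculated by hand. This is offered only as a cross-check; it is not a claim that the two tools agree for every input.

```python
import statistics

data = [5, 9, 12, 14, 17, 18, 21, 22, 25]

# n=4 asks for the three cut points that split the data into quarters
q_inc = statistics.quantiles(data, n=4, method="inclusive")
q_exc = statistics.quantiles(data, n=4, method="exclusive")

print(q_inc)   # [12.0, 17.0, 21.0]
print(q_exc)   # [10.5, 17.0, 21.5]
```

The first and last entries of each list are the first and third quartiles, and the middle entry is the median.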

Histograms
The best way to visualize the distribution of univariate data is the use of a histogram. In a histogram,
the data are sorted into “bins” of constant width and frequencies of each bin are plotted as a column
chart. We typically estimate a lower bound and an upper bound for the number of bins:

$$n_{\text{bins,lower}} = \mathrm{INT}\big(\mathrm{LOG}_2(n)\big) - 1$$

$$n_{\text{bins,upper}} = \sqrt{n} \quad \text{(typically rounded to the nearest integer)}$$

Here, 𝑛 is the number of observations or experimental measurements. I like to choose the actual
number of bins to be somewhere between the lower and upper estimates for number of bins.

Excel’s histogram tool (Data → Data Analysis → Histogram) is great for parsing the data into the bins,
but the user must provide the bin boundaries.
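The two bin-count estimates and a simple equal-width binning can be sketched as follows. The function names are hypothetical, and the binning convention (each bin includes its lower edge, and the maximum value goes into the last bin) is a simplification chosen for this sketch, not a reproduction of how Excel's histogram tool assigns values to bins.

```python
import math

def bin_count_bounds(n):
    """Lower and upper estimates for the number of bins, given n observations."""
    lower = int(math.log2(n)) - 1
    upper = round(math.sqrt(n))
    return lower, upper

def histogram_counts(values, num_bins):
    """Sort values into num_bins equal-width bins; assumes values are not all equal."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / num_bins
    edges = [lo + i * width for i in range(num_bins + 1)]
    counts = [0] * num_bins
    for v in values:
        # The maximum value would land one past the end, so clamp it
        index = min(int((v - lo) / width), num_bins - 1)
        counts[index] += 1
    return edges, counts

print(bin_count_bounds(100))             # (5, 10)

data = [5, 9, 12, 14, 17, 18, 21, 22, 25]
edges, counts = histogram_counts(data, 3)
print(counts)                            # [2, 4, 3]
```

The `edges` list returned here is the kind of bin-boundary list that would need to be supplied to a histogram tool that does not choose boundaries on its own.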
