Material DA 7

This document describes useful statistical measures for descriptive analytics using the built-in "iris" and "mtcars" data sets in R. It discusses count, mean, median, mode, range, quartiles, standard deviation, skewness, and kurtosis. For the iris data set, examples are given calculating the count, mean, median, range, quantiles, standard deviation, skewness and kurtosis of variables like Sepal Length. Bar plots and boxplots are also used to visualize variables like Species in the iris data.

Uploaded by

Aparna Singh


Descriptive Analytics on the “iris” and “mtcars” data sets

Useful Statistical Measures for Descriptive Analytics:

1. Count: Used for counting the number of observations, either overall or within each group. It is
mostly used for categorical columns. E.g., Species is a variable in the “iris” data set that is
built into R. It is a grouping variable because the column contains three species.
The following methods can be used to describe the Species variable:
• Frequency table of Species
• Comparison of the different species
• Bar or pie chart of the variable

# Usage of count for descriptive analytics

library(dplyr)

sp = iris %>% count(Species)   # count() is provided by the dplyr package
sp                             # print the frequency table

Output:

Species n
<fct> <int>
1 setosa 50
2 versicolor 50
3 virginica 50

“sp” is a data frame. It can be saved to the working directory as a CSV file with the following syntax:

write.csv(sp, "sp.csv")

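Bar plot for the variable Species in the iris data set:

counts = table(iris$Species)
# The x-axis shows the species; the bar heights are the observation counts
barplot(counts, main = "Species Distribution",
        xlab = "Species", ylab = "Number of observations")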
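
The frequency table and pie chart mentioned under Count can be sketched in base R as well. The example below uses the cyl column of the built-in mtcars data set as the grouping variable. That choice and the names cyl_counts and cyl_share are illustrative assumptions, not part of the original material.

```r
# Frequency table of the number of cylinders in mtcars
cyl_counts <- table(mtcars$cyl)
cyl_counts

# Share of each group as a percentage, rounded to one decimal place
cyl_share <- round(100 * prop.table(cyl_counts), 1)
cyl_share

# Pie chart of the same counts
pie(cyl_counts, main = "Cars by Number of Cylinders")
```

prop.table() divides each count by the total, so the shares always add up to 100% before rounding.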

2. Mean: Used to summarize a continuous variable and to compare two or more variables measured
on the same scale. E.g., the means of "Sepal.Length", "Sepal.Width", "Petal.Length" and
"Petal.Width" can be compared for further analysis.
3. Median: The middle value of the ordered observations. It suits ordinal data and skewed
numerical data, and can also be used to compare two or more variables measured on the same scale.
4. Mode: The most frequently occurring value. It is mainly used for categorical, specifically
nominal, variables.
5. Range: The simplest way to understand the spread of a distribution, given by the minimum and
maximum values of the variable.
6. Quartiles and other positional measures: These are the values found at given positions among
the ordered observations. Generally, for n observations the k-th quartile lies at position
Qk = k(n+1)/4, the k-th decile at Dk = k(n+1)/10 and the k-th percentile at Pk = k(n+1)/100.
7. Standard deviation: One of the most widely used statistical measures of variation. It
summarizes how far the values of a variable deviate from their mean. It is used for numerical
variables on which arithmetic operations are meaningful.
Formula for standard deviation: $\mathrm{Std} = \sqrt{\dfrac{\sum (x - \bar{x})^2}{n}}$
8. Skewness: Skewness is a measure of symmetry, or more precisely, the lack of symmetry.
A distribution, or data set, is symmetric if it looks the same to the left and right of the centre
point.
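Examples of skewness: a distribution with a longer right tail is positively skewed, and a distribution with a longer left tail is negatively skewed.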

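9. Kurtosis: Kurtosis is a parameter that describes the shape of a random variable’s probability
distribution, in particular how peaked it is and how heavy its tails are. For a normal
distribution the kurtosis is 3, so the excess kurtosis (kurtosis minus 3, which is the quantity
reported by kurtosis() in the e1071 package) is approximately zero. A positive excess value
indicates a higher peak; a negative value indicates a flatter distribution.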
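
Median, mode and range have no dedicated example in this material, so here is a minimal sketch. Base R has no function for the statistical mode, so stat_mode() is a hypothetical helper written for this illustration. Using mtcars$cyl as the example column is also an assumption.

```r
# Median and range of a numeric variable
x <- iris$Sepal.Length
median(x)          # middle value of the ordered observations
range(x)           # minimum and maximum
diff(range(x))     # the range expressed as a single number

# stat_mode() is a hypothetical helper, not part of base R.
# It returns the most frequent value(s) of a vector as character strings.
stat_mode <- function(v) {
  freq <- table(v)
  names(freq)[freq == max(freq)]
}

stat_mode(mtcars$cyl)     # most common number of cylinders
stat_mode(iris$Species)   # every species occurs 50 times, so all three are returned
```

When several values tie for the highest frequency, the helper returns all of them rather than picking one arbitrarily.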

# Mean of each numeric variable
mean(iris$Sepal.Length)
mean(iris$Sepal.Width)
mean(iris$Petal.Length)
mean(iris$Petal.Width)
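
To compare the means side by side, as suggested under Mean, the four calls can be collapsed into one. The sketch below also splits the means by species. The names overall_means and species_means are illustrative assumptions.

```r
# Means of all four numeric columns in one call (column 5 is Species)
overall_means <- colMeans(iris[, -5])
overall_means

# Mean of every numeric variable within each species
species_means <- aggregate(. ~ Species, data = iris, FUN = mean)
species_means
```

The formula `. ~ Species` tells aggregate() to summarize every remaining column, grouped by Species.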

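# Minimum, quartiles, median, mean and maximum of individual variables
summary(iris$Sepal.Length)
summary(iris$Sepal.Width)
summary(iris$Petal.Length)

# Summary of every column in the data frame
summary(iris)

# Drop the Species column (column 5) to keep only the numeric variables
irisn = iris[, -5]
irist = summary(irisn)

# Convert the summary table to a data frame and display it
irisst = as.data.frame(irist)
irisst

# Save the summary table to the working directory
write.csv(irist, "irist.csv")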
# Standard deviation and variance of Sepal.Length
sd(irisn$Sepal.Length)
var(irisn$Sepal.Length)
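
Note that the standard deviation formula given earlier divides by n, whereas R's sd() and var() divide by n - 1 (the sample form). The following sketch computes both so the difference is visible. The names sd_pop and sd_sample are assumptions introduced for illustration.

```r
x <- iris$Sepal.Length
n <- length(x)

# Population form: divide the sum of squared deviations by n
sd_pop <- sqrt(sum((x - mean(x))^2) / n)

# Sample form: divide by n - 1, which is what sd() computes
sd_sample <- sqrt(sum((x - mean(x))^2) / (n - 1))

sd_pop
sd_sample
all.equal(sd_sample, sd(x))   # TRUE
```

With 150 observations the two values differ only slightly, but the gap grows as the sample gets smaller.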

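# Boxplots of the four numeric variables (base R)
boxplot(irisn)

# Boxplot of Sepal.Length for each species (ggplot2)
library(ggplot2)

ggplot(iris, aes(y = Sepal.Length, x = Species, color = Species)) +
  geom_boxplot() +
  theme_classic()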

# Quartiles: 0%, 25%, 50%, 75% and 100%
quantile(iris$Sepal.Length)

# 30th and 45th percentiles
quantile(irisn$Sepal.Length, c(0.30, 0.45))
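
The textbook position rule places the k-th quartile at position k(n+1)/4 among the n ordered observations. A rough sketch of that rule follows, interpolating linearly when the position is not a whole number. The helper quartile_by_position() is hypothetical and written only for this illustration. quantile() uses a different rule by default, and type = 6 is the variant that works from p(n+1) positions.

```r
x <- sort(iris$Sepal.Length)
n <- length(x)

# quartile_by_position() is a hypothetical helper for illustration only
quartile_by_position <- function(k) {
  pos  <- k * (n + 1) / 4     # position among the ordered values
  lo   <- floor(pos)          # observation just below that position
  frac <- pos - lo            # fractional part used for interpolation
  x[lo] + frac * (x[lo + 1] - x[lo])
}

manual <- sapply(1:3, quartile_by_position)
manual

# quantile() with type = 6 is based on the same p(n + 1) positions
builtin <- quantile(x, probs = c(0.25, 0.50, 0.75), type = 6, names = FALSE)
all.equal(manual, builtin)
```

The helper does no bounds checking, so it is only meant for quartiles of samples large enough that the position falls between 1 and n.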

# skewness() and kurtosis() are provided by the e1071 package
library(e1071)

skewness(iris$Sepal.Length)
kurtosis(iris$Sepal.Length)
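
To see what these two measures compute, here is a base R sketch using the plain moment-based definitions. The variable names are assumptions. The e1071 functions accept a type argument that selects among several estimators, so their default output may differ slightly from the values below.

```r
x <- iris$Sepal.Length
n <- length(x)

# Central moments of order 2, 3 and 4 (each divided by n)
m2 <- sum((x - mean(x))^2) / n
m3 <- sum((x - mean(x))^3) / n
m4 <- sum((x - mean(x))^4) / n

# Moment-based skewness: 0 for a perfectly symmetric sample
skew_moment <- m3 / m2^(3 / 2)

# Moment-based excess kurtosis: close to 0 for normally distributed data
kurt_excess <- m4 / m2^2 - 3

skew_moment
kurt_excess
```

For Sepal.Length the skewness comes out positive (a slightly longer right tail) and the excess kurtosis negative (flatter than a normal distribution).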
