Plotting Statistical Distributions and Properties

Jan Anders

11/07/2023

There are multiple approaches to displaying a distribution and clarifying concepts to oneself. The approach
you are most likely already familiar with is simulating a random sample that follows a particular distribution
and then plotting it. In its simplest form, it looks like this:
# Simulate a sample of size n = 1000 (default: mean 0, variance 1)
x <- rnorm(1000)
hist(x, freq = FALSE)

[Figure: histogram of x; x-axis: x, y-axis: Density]

While in the real world, this approach is exactly the first thing we do when approaching a problem (plotting,
exploratory data analysis), we are presented with multiple problems when trying to explore a theoretical
concept:
• The empirical distribution does not match the true theoretical density closely enough (at least not
for reasonably small sample sizes). This will become especially problematic when you’re dealing with
distributions that are highly skewed or you are interested in the density of extreme values, since you
will have to simulate very often and at high sample sizes to estimate these properly.
• Out of the box, you will have to rely on binning to get an understanding of the shape of the density.
You will not have a proper value for the density at any given value of x.
• This is not a clean (mathematical) way to approach this problem.
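
The first point can be made concrete with a small sketch. It assumes a standard normal and a sample of size 100 (both are illustrative choices, as is the seed) and compares the simulated share of values above 3 with the exact tail probability, obtained from pnorm, one of the functions discussed further down.

```r
# Illustrative assumptions: standard normal, n = 100, fixed seed
set.seed(1)
x_small <- rnorm(100)

# Share of simulated values above 3 (at this sample size usually 0 or close to it)
emp_tail <- mean(x_small > 3)

# Exact probability mass above 3 under the standard normal
true_tail <- pnorm(3, lower.tail = FALSE)

c(empirical = emp_tail, theoretical = true_tail)
```

With so few draws the empirical value can only be a multiple of 1/100, so it cannot resolve a tail probability of roughly 0.1%.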

What is a better way to plot a density function?
There are three other functions for any given distribution, remember?
• d (dnorm, dbinom, dpois) The density (the probability mass for discrete distributions)
• p (pnorm, pbinom, ppois) The cumulative distribution function
• q (qnorm, qbinom, qpois) The quantile for a given cumulative probability
The documentation of these functions is arguably a bit lacking, so here’s what they do:

d
Get the density of any (well known) probability distribution for a given parametrisation. Very simple example:
The density (probability) of getting 0 in a Bernoulli experiment with p = 0.5 is 0.5.
dbinom(0, 1, 0.5)

## [1] 0.5
The highest density of the normal distribution is at its mean:
dnorm(0, mean = 0, sd = 1)

## [1] 0.3989423
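
One caveat worth checking by hand: for discrete distributions the d-function returns a probability, while for continuous ones it returns a density, which can exceed 1. A minimal sketch (the sd of 0.1 is an arbitrary choice for illustration):

```r
# Discrete case: the values of dbinom are probabilities and sum to 1
probs <- dbinom(0:10, size = 10, prob = 0.5)
total <- sum(probs)

# Continuous case: a density is not a probability and may be larger than 1
peak <- dnorm(0, mean = 0, sd = 0.1)

total
peak
```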
These functions (like most functions in R) are vectorized, so they can handle multiple values at once. We can
make use of this to create a plot:
x <- c(-1, -0.5, 0, 0.5, 1)
y <- dnorm(x, 0, 1)
plot(x, y, type = "l")
[Figure: line plot through the five points; x-axis: x, y-axis: y]

A bit bulky right now. Let’s use some helper functions to get this to look better. Most useful in this context is
probably the sequence function. It creates a sequence of x-values that you can then use to calculate the
densities:
seq(0, 1, by = 0.05)

## [1] 0.00 0.05 0.10 0.15 0.20 0.25 0.30 0.35 0.40 0.45 0.50 0.55 0.60 0.65 0.70
## [16] 0.75 0.80 0.85 0.90 0.95 1.00
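
If you would rather fix the number of points than the step width, seq also accepts length.out. A small sketch showing that both calls describe the same grid (the variable names are only illustrative):

```r
# Same grid, specified by step width or by number of points
x_by  <- seq(-5, 5, by = 0.05)
x_len <- seq(-5, 5, length.out = 201)

length(x_by)
all.equal(x_by, x_len)
```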
Let’s use this to create a plot of the normal density. Don’t forget to set the type of the plot to "l" (line). The
default just plots the points.
x <- seq(-5, 5, by = 0.05)
y <- dnorm(x, 0, 1)
plot(x, y, type = "l")
[Figure: standard normal density on a fine grid; x-axis: x, y-axis: y]

Much better.
Of course, sometimes we want multiple curves in the same diagram to compare different concepts. This is
likely what you’ll want to do most often, since in this course we’re mostly using R to build intuition.
The lines function comes in handy:
# create a plot of the standard normal density we just saw
x <- seq(-5, 5, by = 0.1)
y <- dnorm(x, 0, 1)
plot(x, y, type = "l")

# create new values for a different distribution

y2 <- dnorm(x, 1, 0.5)
lines(x, y2, col = "red")

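[Figure: standard normal density with the second density added in red; x-axis: x, y-axis: y]

Problem: The original plot dictates the ranges of the x- and y-axis, so the taller second curve is cut off at
the top. The quick fix is to manually specify ranges.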
plot(x, y, type = "l", ylim = c(0, max(y, y2)))

lines(x, y2, type = "l", col = "red")

[Figure: both densities after setting ylim = c(0, max(y, y2)); x-axis: x, y-axis: y]
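
With more than one curve, a legend helps. The sketch below recomputes both curves, derives the y-range from the data, and labels the lines with base R's legend function. The label texts and the name y_range are only illustrative.

```r
x  <- seq(-5, 5, by = 0.1)
y  <- dnorm(x, 0, 1)
y2 <- dnorm(x, 1, 0.5)

# y-range that fits both curves
y_range <- c(0, max(y, y2))

plot(x, y, type = "l", ylim = y_range, ylab = "density")
lines(x, y2, col = "red")

# One legend entry per curve, matched by colour
legend("topleft", legend = c("N(0, sd = 1)", "N(1, sd = 0.5)"),
       col = c("black", "red"), lty = 1)
```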

p
Let’s get back to working with distributions. There are two other functions left to look at. The
p-function works like the d-function, but returns the cumulative distribution function (cumulative
probability) instead of the density. It is really helpful when you need to calculate cumulative probabilities
(such as in typical probability theory, testing, ...). You can control the side of the cumulative probability
mass with the lower.tail argument (or just use 1 - p). Careful with discrete distributions, though: the upper
tail is P(X > x) and excludes x itself, so it is not the same as P(X ≥ x).
For example, we know that 50% of the probability mass of the normal distribution is left of the mean value.
If we plug in the value 0 for x we get the probability of obtaining 0 or a value smaller than that. In other
words, the probability mass left of 0:
pnorm(0, mean = 0, sd = 1, lower.tail = TRUE) # lower.tail = TRUE is the default

## [1] 0.5
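
For a discrete distribution, what needs care is whether the boundary value itself is included. A minimal sketch with a Binomial(3, 0.5) variable, chosen only because its probabilities are easy to verify by hand:

```r
# Upper tail via lower.tail = FALSE and via 1 - p agree: both give P(X > 1)
upper_a <- pbinom(1, size = 3, prob = 0.5, lower.tail = FALSE)
upper_b <- 1 - pbinom(1, size = 3, prob = 0.5)

# P(X >= 1) includes the value 1 itself, so shift the argument down by one
at_least_one <- pbinom(0, size = 3, prob = 0.5, lower.tail = FALSE)

c(upper_a, upper_b, at_least_one)
```

For a continuous distribution such as the normal, a single point carries no probability mass, so this distinction disappears.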
Let’s repeat the same procedure for plotting we saw with the density function here as well:
x <- seq(-5, 5, by = 0.05)
y_mass <- pnorm(x)
plot(x, y_mass, type = "l")

#adding the density for visualizing the relationship

lines(x, dnorm(x))
[Figure: cumulative distribution function of the standard normal with the density added as a second line; x-axis: x, y-axis: y_mass]

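
The relationship between the two curves can also be checked numerically: the cumulative probability at a point is the area under the density up to that point. A sketch using base R's integrate, with the evaluation point 1 picked arbitrarily:

```r
# Area under the standard normal density from -Inf to 1
area <- integrate(dnorm, lower = -Inf, upper = 1)$value

# Cumulative probability at 1
cum_prob <- pnorm(1)

c(area = area, pnorm = cum_prob)
```

The two numbers should agree up to the numerical error of the integration.
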
pbinom(5, 10, 0.5, lower.tail = FALSE) + 1/2 * dbinom(5, 10, 0.5)

## [1] 0.5
If you can explain why the result of the above is 0.5, then you’ve probably understood the probability and
density function (and the binomial distribution).

q
One function to go: the quantile function. This one is basically the inverse of the probability (cdf)
function. It gives you the corresponding value of x for a given cumulative probability. Simple example again: the 50%
quantile of the standard normal is at 0 and the 97.5% quantile is at 1.96 (rule of thumb: 2). By symmetry, the
2.5% quantile is at -1.96:
qnorm(0.5, 0, 1)

## [1] 0
qnorm(0.025, 0, 1)

## [1] -1.959964
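
That the q-function inverts the p-function can be checked directly. For discrete distributions the inverse is not exact: the q-function returns the smallest value whose cumulative probability reaches the requested level. A short sketch (the probabilities are arbitrary):

```r
p <- c(0.025, 0.5, 0.975)

# Continuous case: pnorm undoes qnorm
round_trip <- pnorm(qnorm(p))
all.equal(round_trip, p)

# Discrete case: smallest x with P(X <= x) >= 0.5 for Binomial(10, 0.5)
q_med <- qbinom(0.5, size = 10, prob = 0.5)
q_med
pbinom(q_med, size = 10, prob = 0.5)      # at least 0.5
pbinom(q_med - 1, size = 10, prob = 0.5)  # below 0.5
```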

This function is really helpful in confidence intervals and testing, where you often need to reach a certain
minimum probability and want to know the corresponding minimum value of your random variable to reach
it. For example, say we want to know the central 95% interval of a normally distributed variable with
µ = 50 and Var = 100, i.e. the range that contains 95% of its probability mass. We just get the 2.5% and
97.5% quantiles of this variable:
qnorm(c(0.025, 0.975), mean = 50, sd = sqrt(100))

## [1] 30.40036 69.59964

This way, you can get rid of the tedious transformation to the standard normal (at least when you’re allowed
to use R).
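
For comparison, the manual route via the standard normal looks as follows. This is a sketch of the usual transformation x = µ + σ·z, with variable names chosen for this example; it should agree with the direct call above.

```r
mu    <- 50
sigma <- sqrt(100)

# Quantiles of the standard normal, then rescale and shift
z <- qnorm(c(0.025, 0.975))
manual <- mu + sigma * z

# Direct call with mean and sd
direct <- qnorm(c(0.025, 0.975), mean = mu, sd = sigma)

all.equal(manual, direct)
```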
Plotting quantiles on a normal density:
plot(x, dnorm(x), type = "l")
abline(v = qnorm(c(0.025, 0.975), 0, 1))
[Figure: standard normal density with vertical lines at the 2.5% and 97.5% quantiles; x-axis: x, y-axis: dnorm(x)]

r
This one you are likely most familiar with. It simulates a random sample from the distribution. Arguably the
most fun out of the four functions.
rnorm(5)

## [1] 0.2159454 0.2187151 -2.2901460 0.2321641 -0.6583549
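
Because the draws are random, your numbers will differ from the ones shown. If you need reproducible results, fix the seed of the random number generator first. The seed value 42 below is arbitrary.

```r
set.seed(42)
a <- rnorm(5)

set.seed(42)
b <- rnorm(5)

# Same seed, same draws
identical(a, b)
```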

I would highly encourage you to try to use the proper function for a given task. Not only will it make your
code cleaner, you will also review probability theory while doing it.
One more note: A nicer way of plotting a function is the curve function. It takes away the need to define
a sequence of x with proper spacing (you can still control the number of points it calculates in the background
with the parameter n). As its first argument, curve expects a function, or an expression written in terms of x,
that returns a value for a given x, so you can plot any arbitrary function, including ones you define
yourself.
curve(dnorm(x), from = -5, to = 5)

[Figure: output of curve(dnorm(x), from = -5, to = 5); x-axis: x, y-axis: dnorm(x)]
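
As a sketch of plotting a self-defined function, here is an equal mixture of two normal densities. The function name mix_density and its parameters are made up for this example.

```r
# Hypothetical example function: equal mixture of N(-2, 1) and N(2, 1)
mix_density <- function(x) {
  0.5 * dnorm(x, mean = -2, sd = 1) + 0.5 * dnorm(x, mean = 2, sd = 1)
}

# n controls how many points curve evaluates in the background;
# curve invisibly returns the evaluated x and y values
pts <- curve(mix_density, from = -6, to = 6, n = 501)

# A density should integrate to 1
total_mass <- integrate(mix_density, lower = -Inf, upper = Inf)$value
total_mass
```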

There are many more distribution functions available in R. The nice thing about curve is that it also has an “add”
parameter, so you can stack as many plots as you want:
# unregister x
rm(x)
left <- -2
right <- 6

curve(dnorm, from = left, to = right, col = 1, ylim = c(0,1))

curve(dexp(x, 1), from = left, to = right, add = TRUE, col = 4)
curve(dgamma(x, 1, 0.5), from = left, to = right, add = TRUE, col = 5)
curve(dweibull(x, 2, 1), from = left, to = right, add = TRUE, col = 6)
[Figure: densities of the normal, exponential, gamma and Weibull distributions drawn with curve; x-axis: x, y-axis: dnorm(x)]

Of course, everything shown here can be done much more professionally: there are density objects and
professional plotting libraries like ggplot2 (in which you can of course also build line plots using the methods
from here). These are sometimes a bit overkill and can take quite long to get right, which is
tedious when you just want to quickly check something. Base R plotting will be sufficient for everything
we do in this course, so we’ll stick to it for now.
