University of Geneva GSEM

Statistics I Fall 2017

Prof. Eva Cantoni Practical 4

Review of Practicals 1-3 and a full data analysis

Goals: The objective of this practical is to review the descriptive statistics, plots and basic
R programming we have learned in the first three practicals. In addition, a new data
analysis is performed to consolidate what has been learned so far, while learning about a few
extra possibilities of R.

1 Revision
Revisit Practicals 1, 2 and 3 (the files Practical1.R, Practical2.R and Practical3.R are
provided in the corresponding folders in Chamilo). Make sure you understand what each
R command does.

2 A Full Data Analysis

Import the dataset Cities.csv in R and store it as a data frame called Cities. This
dataset contains information on the economic conditions in 48 cities around the world in
1991. The variables contained in the dataset are the following:

• “City” : City name

• “Work” : Weighted average of the number of working hours in 12 occupations

• “Price” : Index of the cost of 112 goods and services excluding rent (Zurich = 100)

• “Salary” : Index of hourly earnings in 12 occupations after deductions (Zurich = 100)
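A minimal sketch of the import step follows. Because Cities.csv itself is not reproduced here, the example writes a three-row stand-in file with made-up city names and values (assumptions, not real data) and reads it back with read.csv(). It also assumes a comma-separated file with a header row, which may not match the real file.

```r
# Stand-in for Cities.csv: three made-up rows written to a temporary file
# so the example runs anywhere. The layout of the real file may differ.
tmp <- tempfile(fileext = ".csv")
writeLines(c("City,Work,Price,Salary",
             "CityA,1800,60.5,40.2",
             "CityB,2000,45.0,NA",
             "CityC,1700,100.0,100.0"), tmp)

# header = TRUE and sep = "," are already the read.csv() defaults;
# they are spelled out here only to make the assumptions visible.
Cities <- read.csv(tmp, header = TRUE, sep = ",")

dim(Cities)   # number of rows and columns
head(Cities)  # first few rows
```

With the real file, the call would presumably reduce to Cities <- read.csv("Cities.csv"), provided the file sits in the current working directory (see getwd()).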

Since each summary statistic and plot we have learned so far is suited to a particular type
of variable, you need to know what kind of data you have in your file before summarizing or
visualizing it. You can check the type of the variables in the Cities dataset with:
str(Cities)
or with:
class(Cities$...)  # ... has to be replaced by a variable name
to get the type of each variable. Use
summary(Cities)

to get a basic description of the dataset.
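As an illustration of these three commands, the sketch below applies them to a small made-up data frame that only mimics the column names of Cities (the values are invented). The sapply() call is an addition that is not part of the practical; it simply collects the class of every column in one step.

```r
# Made-up stand-in for the Cities data frame (values are illustrative only)
Cities <- data.frame(
  City   = c("CityA", "CityB", "CityC"),
  Work   = c(1800, 2000, 1700),
  Price  = c(60.5, 45.0, 100.0),
  Salary = c(40.2, NA, 100.0)
)

str(Cities)                         # compact overview, one line per variable
class(Cities$Work)                  # type of a single variable
var_types <- sapply(Cities, class)  # types of all variables at once
var_types
summary(Cities)                     # quartiles and mean for numeric columns, plus NA counts
```

Depending on the R version and on how the data were imported, the City column may show up as character or as a factor.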

Look at the entire dataset by typing Cities in R. What do you observe?

The dataset contains some missing values, coded NA. Some functions (e.g. vioplot)
cannot handle missing values and need special treatment; see step 4 below.
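One possible way to locate and work around the missing values is sketched below, again on a made-up stand-in for Cities. It uses only base R: is.na() to flag missing entries, the na.rm argument to skip them in a summary, and na.omit() to drop them.

```r
# Made-up stand-in for the Cities data frame (values are illustrative only)
Cities <- data.frame(
  City   = c("CityA", "CityB", "CityC"),
  Work   = c(1800, NA, 1700),
  Price  = c(60.5, 45.0, 100.0),
  Salary = c(40.2, NA, 100.0)
)

na_per_column <- colSums(is.na(Cities))  # number of NA values in each variable
na_per_column

mean(Cities$Salary)                # NA: a single missing value propagates
mean(Cities$Salary, na.rm = TRUE)  # mean of the observed values only

salary_complete <- na.omit(Cities$Salary)  # vector with the NA entries dropped
length(salary_complete)
```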

To perform your data analysis, consider the following steps:

1. Provide summary statistics for the variables in the dataset that have a suitable type.
When appropriate, draw a kernel density plot to check whether their distributions are
symmetric or not.
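Step 1 could be approached along the following lines. The vector x is simulated (right-skewed, with a few NA values added) and only stands in for one numeric column of Cities; the distribution and its parameters are arbitrary assumptions. Marking the mean and median on the density plot is one optional way to judge symmetry, not something the practical prescribes.

```r
set.seed(1)
# Simulated stand-in for one numeric variable, NOT the real Cities data
x <- c(rexp(45, rate = 1 / 50), NA, NA, NA)

summary(x)            # min, quartiles, mean, max and the number of NAs
sd(x, na.rm = TRUE)   # standard deviation of the observed values
IQR(x, na.rm = TRUE)  # interquartile range of the observed values

# Kernel density estimate (Gaussian kernel by default);
# density() stops with an error on NA values unless na.rm = TRUE
dens <- density(x, na.rm = TRUE)
plot(dens, main = "Kernel density of x")
abline(v = mean(x, na.rm = TRUE), lty = 2)    # dashed line: mean
abline(v = median(x, na.rm = TRUE), lty = 3)  # dotted line: median
```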

2. Draw boxplots of all the numerical (continuous) variables in a single graphics
window. You can use the par() function with the option mfrow=c(nrows, ncols)
to create a matrix of nrows by ncols plots that is filled in by row. For example, if
you need the plots to be arranged horizontally, set nrows=1.
par(mfrow = c(1, 3))  # 3 figures arranged in a row
# col changes the default fill color, here to lightsalmon1
boxplot(Cities$Work, col = "lightsalmon1")
boxplot(Cities$Price, col = "mediumseagreen")
boxplot(Cities$Salary, col = "goldenrod2")
par(mfrow = c(1, 1))  # back to the default setting

What can you say about the distribution of each variable by looking only at the
boxplots?

3. Draw histograms of all the numerical (continuous) variables in a single graphics
window. Here as well, use the col parameter to change the default settings.
Describe the distribution of the variables with this new information.
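A possible layout for step 3 is sketched below, mirroring the boxplot code of step 2. The data frame is simulated with arbitrary parameters and only borrows the column names of Cities. Saving the value returned by par() and restoring it afterwards is an alternative to calling par(mfrow = c(1, 1)) by hand.

```r
set.seed(2)
# Simulated stand-in for the numeric columns of Cities (arbitrary parameters)
Cities <- data.frame(
  Work   = rnorm(48, mean = 1900, sd = 180),
  Price  = rnorm(48, mean = 70, sd = 20),
  Salary = rexp(48, rate = 1 / 40)
)

old_par <- par(mfrow = c(1, 3))  # 3 figures in a row; previous setting is saved
h_work   <- hist(Cities$Work,   col = "lightsalmon1",   main = "Work",   xlab = "Work")
h_price  <- hist(Cities$Price,  col = "mediumseagreen", main = "Price",  xlab = "Price")
h_salary <- hist(Cities$Salary, col = "goldenrod2",     main = "Salary", xlab = "Salary")
par(old_par)                     # restore the previous layout
```

hist() silently drops missing values, so no call to na.omit() is needed here.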

4. Draw violin plots of all the numerical (continuous) variables in a single graphics
window. Here you have to use the function na.omit() to eliminate the missing values.
library(vioplot)      # vioplot() comes from the vioplot package
par(mfrow = c(1, 3))  # 3 figures arranged in a row
vioplot(na.omit(Cities$Work))
vioplot(na.omit(Cities$Price))
vioplot(na.omit(Cities$Salary))
par(mfrow = c(1, 1))  # back to the default setting

Describe the distribution of the variables with this new information. Try to change
the bandwidth manually with the parameter h. What do you observe?
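To see what a bandwidth does without relying on the vioplot package, the sketch below uses the base R function density(), whose bw argument appears to play a role comparable to the h parameter of vioplot() (both set the standard deviation of the smoothing kernel). This is a swapped-in illustration on simulated data, not the violin plot itself, and the bandwidth values 0.1 and 1.5 are arbitrary choices.

```r
set.seed(3)
x <- rnorm(48)  # simulated data, NOT a variable from Cities

d_narrow  <- density(x, bw = 0.1)  # small bandwidth
d_default <- density(x)            # bandwidth chosen automatically
d_wide    <- density(x, bw = 1.5)  # large bandwidth

old_par <- par(mfrow = c(1, 3))    # 3 figures arranged in a row
plot(d_narrow,  main = "bw = 0.1")
plot(d_default, main = "default bandwidth")
plot(d_wide,    main = "bw = 1.5")
par(old_par)                       # restore the previous layout
```

In the practical itself, the corresponding experiment would be a call such as vioplot(na.omit(Cities$Work), h = ...) with a few different values of h.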

5. Compare, via QQ-plots, the empirical distribution of each of the variables Work, Price
and Salary with the Gaussian distribution, and draw a reference line on each plot. Does
the Gaussian distribution fit well?
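For step 5, a minimal sketch with the base R functions qqnorm() and qqline() is given below. The vector x is simulated and merely stands in for one of the three variables; the same two calls would be repeated for Work, Price and Salary, for instance inside a par(mfrow = c(1, 3)) layout.

```r
set.seed(4)
# Simulated stand-in for one numeric variable, NOT the real Cities data
x <- c(rexp(46, rate = 1 / 40), NA, NA)

# Sample quantiles against theoretical Gaussian quantiles;
# missing values are left out of the plot
qq <- qqnorm(x, main = "Normal QQ-plot of x")

# Reference line through the first and third quartiles
qqline(x, col = "red")
```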
