R Notes

Uploaded by

2tmjzgbwdm
Copyright
© © All Rights Reserved

Statistical inference

Load:
1. library(statsr)
2. library(dplyr)
3. library(ggplot2)

1. str(): use to return a list of objects and their structure.

2. Side-by-side box plots:
   boxplot(A ~ B, data = _, xlab = "A", ylab = "B")
   (A = first variable, B = second variable)

   To remove NA:
   sum(x, na.rm = TRUE)

To compare the mean of groups:
   data %>%
     group_by(A) %>%
     summarise(mean_weight = mean(weight))

   Output:
     habit   mean_weight
   1 _       1.23
   2 _       4.56
   3 NA      7.84

NA: no group given.
To compare groups:
   by(data$weight, data$habit, summary)
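A minimal sketch of the group comparison above, using a made-up data frame (`babies` and its values are hypothetical, standing in for the weight and habit columns). It shows `by()` next to `tapply()` with `na.rm = TRUE`, so a missing weight does not turn a group mean into NA.

```r
# Hypothetical toy data standing in for the weight/habit columns.
babies <- data.frame(
  weight = c(7.1, 6.8, NA, 7.9, 6.2, 7.4),
  habit  = c("nonsmoker", "smoker", "nonsmoker", "nonsmoker", "smoker", "smoker")
)

# Summary of weight within each habit group.
by(babies$weight, babies$habit, summary)

# Group means; na.rm = TRUE drops the missing weight before averaging.
group_means <- tapply(babies$weight, babies$habit, mean, na.rm = TRUE)
group_means
```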
3. To conduct hypothesis testing or estimate a confidence interval for a parameter (sample dist., null dist.):
   inference(y = _, x = _, data = _,
             statistic = "mean",        # or "median", "proportion"
             type = "ht",               # or "ci"
             null = 0,
             alternative = "less",      # or "greater"
             method = "simulation",     # or "theoretical"
             conf_level = 0.95)

   - inference(y = var, data = _, statistic = "mean", type = "ci",
               method = "theoretical", conf_level = 0.99)

tidyverse
Install:
   install.packages('dplyr')
   > library(dplyr)

Load (EDA):
1. library(statsr)
2. library(dplyr)
3. library(ggplot2)

1. Load data

2. Summarize data
   # summarize dataset
   summary(dataname)
   dim(dataname): to get the dimension of the dataset in terms of the number of rows and columns.

   Terms:
   1. Min: minimum value
   2. 1st Qu.: 25th percentile
   3. Median: median value
   4. Mean: mean value
   5. 3rd Qu.: 75th percentile
   6. Max: maximum value

3. Visualize the data (using ggplot2)
   # create histogram of a certain variable (to check skewness)
   > ggplot(dataname, aes(x = variable)) +
       geom_histogram(binwidth = 10)
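A small sketch of `summary()` and `dim()` on the built-in `mtcars` data, which here stands in for `dataname` (an assumption for illustration only).

```r
# mtcars ships with base R; it stands in for `dataname`.
summary(mtcars$mpg)          # Min, 1st Qu., Median, Mean, 3rd Qu., Max

dataset_dim <- dim(mtcars)   # c(number of rows, number of columns)
dataset_dim
```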

   # create scatterplot of x variable and y variable
   ggplot(data = dataname) +
     geom_point(mapping = aes(x = _, y = _))

# Create barplot
   ggplot(dataname, aes(x = _, y = _)) +
     geom_bar(stat = _)

   barplot(data, main = " ", xlab = " ", ylab = " ", beside = TRUE)

# Create boxplot
   > ggplot(data = dataname, aes(x = var1, y = var2)) +
       geom_boxplot(fill = "blue")

# Create correlation matrix (rounded to 2 d.p.)
   > round(cor(dataname[, c('v.1', 'v.2', 'v.3', ...)]), 2)
   • -1 indicates a perfectly negative linear correlation between two variables
   • 0 indicates no linear correlation between two variables
   • 1 indicates a perfectly positive linear correlation between two variables
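As a quick illustration of the rounded correlation matrix, the sketch below uses three columns of the built-in `mtcars` data (chosen only for the example, not taken from the notes).

```r
# Correlation matrix of three numeric columns, rounded to 2 decimal places.
cor_matrix <- round(cor(mtcars[, c("mpg", "wt", "hp")]), 2)
cor_matrix
```

The diagonal is always 1 (each variable against itself), and the matrix is symmetric.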

Using dplyr:
a. mutate(): add new variable
b. select(): select variables
c. filter(): select observations
d. arrange(): ordering of the rows
e. rename(): rename variable name
f. group_by(): groups data
g. summarise(): gives summary

# summarize(), remove NA values
   summarize(data, delay = mean(dep_delay, na.rm = TRUE))
   ##   delay
   ## 1  12.6

# summarize() with group_by()
   data %>%
     group_by(var1, v.2, v.3) %>%
     summarize(mean = mean(var4))

   From nycflights:
   rdu_flights %>%
     group_by(origin) %>%
     summarize(mean_dd = mean(dep_delay), sd_dd = sd(dep_delay), n = n())
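A hedged sketch of the `group_by()` + `summarise()` pattern. The flights data is not bundled with base R, so the built-in `mtcars` data stands in, with `cyl` as the grouping variable and `mpg` as the summarised variable (both chosen only for illustration). It assumes dplyr is installed.

```r
library(dplyr)

# One row per cylinder group: mean, standard deviation and group size.
mpg_by_cyl <- mtcars %>%
  group_by(cyl) %>%
  summarise(mean_mpg = mean(mpg, na.rm = TRUE),
            sd_mpg   = sd(mpg, na.rm = TRUE),
            n        = n())
mpg_by_cyl
```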

1. Measure of center
   1. Mean
   2. Median

2. Measure of spread
   1. Min & max
   2. Std dev (sd), var
   3. Percentile
   4. IQR
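The measures listed above map onto base R functions as sketched here; the vector `x` is made up for the example.

```r
x <- c(2, 4, 4, 5, 7, 9, 11)   # hypothetical sample

# Measures of center
center <- c(mean = mean(x), median = median(x))

# Measures of spread
spread <- c(min = min(x), max = max(x), sd = sd(x), var = var(x), IQR = IQR(x))

# Percentiles (25th and 75th)
quartiles <- quantile(x, probs = c(0.25, 0.75))

center
spread
quartiles
```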
R codes - Regression

1. Scatterplot
   a. ggplot(data = _, aes(x = _, y = _)) +
        geom_point()

2. Correlation coefficient
   a. dataname %>% summarise(cor(y, x))
   b. cor(data$_, data$_)
   (use when we want to compare which variable is the more significant predictor)

3. Run linear regression
   a. model_name <- lm(y ~ x, data = _)
      (y = dependent var, x = independent var; model name e.g. M_xx_xx)
   Then to show model output:
      summary(model_name), e.g. summary(M_xx_xx)

Intercept: mean value of the response variable when all of the predictor variables in the model = 0
         : value of y when x is 0
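A minimal sketch tying the `lm()` call to the meaning of the intercept. It uses the built-in `mtcars` data with `mpg` as y and `wt` as x (an assumption for illustration), and checks that the prediction at x = 0 equals the intercept.

```r
# Simple linear regression: mpg (dependent) on wt (independent).
model <- lm(mpg ~ wt, data = mtcars)
summary(model)

# The intercept is the fitted value of y when x is 0.
intercept    <- coef(model)[["(Intercept)"]]
pred_at_zero <- predict(model, newdata = data.frame(wt = 0))
```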

4. Plot regression line on scatterplot
   a. ggplot(data = _, aes(x = x, y = y)) +
        geom_point(shape = 1) +
        geom_smooth(method = "lm", se = FALSE)
      se = TRUE: to see the standard error of the regression
   b. > plot(_ ~ _)
      > abline(lm(_ ~ _))
4.1 Jitterplot (scatterplot)
   a. ggplot(data = _, aes(x = x, y = y)) +
        geom_point(position = position_jitter(w = _, h = _)) +   # w, h: e.g. 0.1 or NULL
        xlab("_") +
        ylab("_")

5. View data of fitted & residual
   a. new_df_name <- cbind(df, model_name$fitted.values, model_name$residuals)

6. Residual plot to check linearity
   a. ggplot(data = model_name, aes(x = x, y = .resid)) +
        geom_point()

7. Histogram for residual
   a. hist(model_name$residuals, breaks = ...)
      e.g. hist(M_xx_xx$residuals)

8. QQ plot to test normality of residual (qqnormsim)
   a. qqnorm(model_name$residuals)
      qqline(model_name$residuals)
      Dots off the line: not normal.
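Steps 5, 7 and 8 can be sketched in base R as below. The model (`mpg ~ wt` on the built-in `mtcars` data) and the column names `fitted` and `resid` are assumptions made for the example.

```r
model <- lm(mpg ~ wt, data = mtcars)

# Step 5: bind fitted values and residuals next to the original columns.
diagnostics <- cbind(mtcars[, c("mpg", "wt")],
                     fitted = model$fitted.values,
                     resid  = model$residuals)
head(diagnostics)

# Step 7: histogram of the residuals.
hist(model$residuals, breaks = 10)

# Step 8: normal QQ plot of the residuals with a reference line.
qqnorm(model$residuals)
qqline(model$residuals)
```

By construction, fitted value plus residual gives back the observed y for every row.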

9. Checking of multicollinearity
   a. corrplot(cor(potential_predictors), method = "number")
      (potential_predictors = the potential predictor variables)

10. Run multiple linear regression
   a. model_name <- lm(y ~ var1 + var2, data = _)
      (var1, var2 = independent variables)
      summary(model_name)
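A rough sketch of steps 9 and 10 on the built-in `mtcars` data (variables chosen only for illustration). To stay in base R it prints the plain correlation matrix of the predictors instead of drawing it with `corrplot()`; the numbers are the same ones the plot would show.

```r
# Step 9: correlations among the potential predictors.
predictors <- mtcars[, c("wt", "hp", "disp")]
round(cor(predictors), 2)

# Step 10: multiple linear regression, compared with a one-predictor model.
simple_model   <- lm(mpg ~ wt, data = mtcars)
multiple_model <- lm(mpg ~ wt + hp, data = mtcars)
summary(multiple_model)
```

Adding a predictor never lowers R-squared, so compare models with adjusted R-squared rather than R-squared alone.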
1. Filter some variable
   > abc <- dataname %>% filter(variable == " ")

   Histogram
   > hist(abc$variable, breaks = 5)

   Scatterplot
   ggplot(data = abc) +
     geom_point(mapping = aes(x = _, y = _))
   (if %>% does not work, just use +)
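A short sketch of the filter-then-plot pattern, using the built-in `iris` data in place of `dataname` (an assumption for the example; dplyr must be installed). Note that `filter()` needs `==` for comparison, not `=`.

```r
library(dplyr)

# Keep only the observations of one species.
setosa <- iris %>% filter(Species == "setosa")

# Histogram of one variable in the filtered data.
hist(setosa$Sepal.Length, breaks = 5)
```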

2. Bar chart: stacked (to find the largest number of a var.)
   a. ggplot(data = _) +
        geom_bar(mapping = aes(x = _, fill = _))
      geom_col(position = "stack"): use to create bar chart

   To separate the plot for each different variable, use facet_wrap:
   a. ggplot(data = _) +
        geom_bar(mapping = aes(x = _)) +
        facet_wrap(~ variable)
3. Variance test
   > var(abc$_)
   > var(abcd$_)
   > var.test(abc$_, abcd$_)

   p > 0.05: accept null (fail to reject)
   p < 0.05: reject null, accept H1
   If CI contains 1: no difference between arms of study
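A minimal sketch of the variance test and its decision rule. The two groups are taken from the built-in `mtcars` data (mpg of automatic vs manual cars), purely as an illustration.

```r
automatic <- mtcars$mpg[mtcars$am == 0]
manual    <- mtcars$mpg[mtcars$am == 1]

var(automatic)
var(manual)

# F test for equality of two variances (null: ratio of variances = 1).
vt <- var.test(automatic, manual)
vt

# The 95% CI for the ratio contains 1 exactly when p > 0.05.
contains_one <- vt$conf.int[1] <= 1 && vt$conf.int[2] >= 1
```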

4. Residual vs fitted plot
   a. > res <- resid(modelname)
      plot(fitted(modelname), res)
      abline(0, 0)

Hypothesis testing

1. T-test (independent two-sample t-test)
   > t.test(data$abc, data$bcd, alternative = "two.sided", mu = 0, conf.level = 0.95)
     (alternative can also be "greater")
   > t.test(data$abc, data$bcd)
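As a worked illustration of the two-sample t-test, the sketch below uses the built-in `sleep` data (chosen only for the example). The alternative must be spelled `"two.sided"`, with a dot.

```r
group1 <- sleep$extra[sleep$group == "1"]
group2 <- sleep$extra[sleep$group == "2"]

# Welch two-sample t-test (the default, var.equal = FALSE).
tt <- t.test(group1, group2, alternative = "two.sided", mu = 0, conf.level = 0.95)
tt
```

Here the p-value is about 0.079, so at the 0.05 level the null of equal means is not rejected, and the 95% confidence interval for the difference includes 0.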

2. ANOVA
   One way
   > one_way <- aov(Y ~ A, data = _)
   > summary(one_way)

   Two way
   > two_way <- aov(Y ~ A + B, data = _)
   > summary(two_way)
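A small sketch of both ANOVA forms on built-in datasets (`PlantGrowth` and `ToothGrowth`, picked only for illustration). In the two-way model `dose` is wrapped in `factor()` so it is treated as a grouping variable rather than a numeric predictor.

```r
# One-way ANOVA: plant weight by treatment group.
one_way <- aov(weight ~ group, data = PlantGrowth)
summary(one_way)

# Two-way ANOVA (additive): tooth length by supplement and dose.
two_way <- aov(len ~ supp + factor(dose), data = ToothGrowth)
summary(two_way)
```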

Correlation coefficient
   > round(cor(dataname[, c('abc', 'bcd', 'def')]), 2)
