R Language and Data Analysis
Objects and numeric processes
Entering Input
Type commands to the right of the R prompt (“>”). The symbol “<-” is the assignment operator.
R's grammar determines whether an expression is complete.
The # character indicates a comment. Anything to the right of the # (including the # itself) is ignored.
> x <- 1
> print(x)
[1] 1
> x
[1] 1
> msg <- "hello"
> x <- ## Incomplete expression
> x <- "5 ## Incomplete expression (unterminated string)
> x <- c(5,) ## Complete expression, but an error: argument 2 is empty
Naming rules for R objects
Case sensitive
• A and a are different
All alphanumeric symbols are allowed (A-Z, a-z, 0-9)
• So are “.” and “_”
Name must start with “.” or a letter; a leading “.” must not be followed by a digit.
Do not use reserved keywords
❑ Invalid names
■ 3x
■ 3_x
■ 3-x
■ 3.x
■ .3variable
❑ Valid names
■ x_3
■ x3
■ x.3
■ taiwan.taipei.x3
■ .variable
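The naming rules above can be checked from the console. A minimal sketch, assuming that a name counts as valid when base R's make.names() returns it unchanged; is_valid_name is a hypothetical helper introduced only for this illustration.

```r
# Sketch: treat a name as valid if make.names() leaves it unchanged.
# is_valid_name is a hypothetical helper, not a base R function.
is_valid_name <- function(name) make.names(name) == name

invalid <- c("3x", "3_x", "3-x", "3.x", ".3variable")
valid   <- c("x_3", "x3", "x.3", "taiwan.taipei.x3", ".variable")

is_valid_name(invalid)   # all FALSE
is_valid_name(valid)     # all TRUE
is_valid_name("if")      # FALSE: reserved keyword
```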
Evaluation
After a command is evaluated, its result is returned; it may be printed directly in the console window or stored in a variable.
> x <- 5 ## nothing printed
> x ## auto-printing occurs
[1] 5
> print(x) ## explicit printing
[1] 5
The [1] indicates that x is a vector and 5 is the first element.
Printing
The : operator is used to create integer sequences.
> x <- 1:20
> x
[1] 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15
[16] 16 17 18 19 20
Using R as a Calculator
> 3 - 4
[1] -1
> 5 * 6
[1] 30
> 7 / 8
[1] 0.875
> 1 + 2 * 3
[1] 7
> (1 + 2) * 3
[1] 9
> 15 / 4
[1] 3.75
> 15 %% 4
[1] 3
> 2^2
[1] 4
> 2^0.5
[1] 1.414214
> 2^4.3
[1] 19.69831
> 2^-0.5
[1] 0.7071068
> log(4) # natural log
> log10(4) # log in base 10
> log(4,10) # same as above
> sqrt(9) # square root
> abs(3-4) # absolute value
> exp(1) # exponential
Warnings and Errors
> squareroot(2)
Error: could not find function "squareroot"
> sqrt 2
Error: unexpected numeric constant in "sqrt 2"
> sqrt(-2)
[1] NaN
Warning message:
In sqrt(-2) : NaNs produced
> sqrt(2
+ )
[1] 1.414214
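Warnings and errors like these can also be handled in code rather than only read off the console. A minimal sketch using base R's tryCatch() and try(); safe_sqrt is a hypothetical helper name, and returning NA on a warning is just one possible choice.

```r
# Sketch: capture the warning from sqrt(-2) instead of letting it print.
# safe_sqrt is a hypothetical helper written for this illustration.
safe_sqrt <- function(x) {
  tryCatch(
    sqrt(x),
    warning = function(w) {
      message("warning caught: ", conditionMessage(w))
      NA_real_
    }
  )
}

safe_sqrt(2)    # 1.414214
safe_sqrt(-2)   # NA, after reporting the warning

# Calling a function that does not exist is an error; try() intercepts it.
res <- try(squareroot(2), silent = TRUE)
inherits(res, "try-error")   # TRUE
```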
Objects
R has five basic or “atomic” classes of objects:
· character
· numeric (real numbers)
· integer
· complex
· logical (TRUE/FALSE)
The most basic object is a vector
· A vector can only contain objects of the same class
· BUT: The one exception is a list, which is represented as a vector but can contain objects of
different classes (indeed, that’s usually why we use them)
Empty vectors can be created with the vector() function.
Numbers
· Numbers in R are generally treated as numeric objects (i.e. double-precision real numbers)
· If you explicitly want an integer, you need to specify the L suffix
· Ex: Entering 1 gives you a numeric object; entering 1L explicitly gives you an integer.
· There is also a special number Inf which represents infinity; e.g. 1 / 0; Inf can be used in
ordinary calculations; e.g. 1 / Inf is 0
· The value NaN represents an undefined value (“not a number”); e.g. 0 / 0; NaN can also be
thought of as a missing value (more on that later)
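The points about numbers can be tried directly. A short sketch using only base R functions:

```r
class(1)        # "numeric"
class(1L)       # "integer"
typeof(1)       # "double"
1 / 0           # Inf
1 / Inf         # 0
0 / 0           # NaN
is.nan(0 / 0)   # TRUE
is.na(0 / 0)    # TRUE: NaN also counts as missing
is.nan(NA)      # FALSE: NA is not NaN
```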
Creating Vectors
The c() function can be used to create vectors of objects.
> x <- c(0.5, 0.6) ## numeric
> x <- c(TRUE, FALSE) ## logical
> x <- c(T, F) ## logical
> x <- c("a", "b", "c") ## character
> x <- 9:29 ## integer
> x <- c(1+0i, 2+4i) ## complex
Using the vector() function
> x <- vector("numeric", length = 10)
> x
[1] 0 0 0 0 0 0 0 0 0 0
The c() function can also combine data vectors. For example:
> x <- c(74, 122, 235, 111, 292, 111, 211, 133, 156, 79)
> x1 <- c(74, 122, 235, 111, 292)
> x2 <- c(111, 211, 133, 156, 79)
> x_all <- c(x1, x2)
Mixing Objects
What about the following?
> y <- c(1.7, "a") ## character
> y <- c(TRUE, 2) ## numeric
> y <- c("a", TRUE) ## character
When different objects are mixed in a vector, coercion occurs so that every element in the vector is of
the same class.
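To see what coercion actually produced, the class and the values can be inspected. A small sketch; the names y1, y2 and y3 are chosen here only to keep the three results side by side.

```r
y1 <- c(1.7, "a")
y2 <- c(TRUE, 2)
y3 <- c("a", TRUE)

class(y1); y1   # "character": "1.7" "a"
class(y2); y2   # "numeric":   1 2
class(y3); y3   # "character": "a" "TRUE"
```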
Explicit Coercion
Objects can be explicitly coerced from one class to another using the as.* functions, if available.
> x <- 0:6
> class(x)
[1] "integer"
> as.numeric(x)
[1] 0 1 2 3 4 5 6
> as.logical(x)
[1] FALSE TRUE TRUE TRUE TRUE TRUE TRUE
> as.character(x)
[1] "0" "1" "2" "3" "4" "5" "6"
Explicit Coercion
Nonsensical coercion results in NAs.
> x <- c("a", "b", "c")
> as.numeric(x)
[1] NA NA NA
Warning message:
NAs introduced by coercion
> as.logical(x)
[1] NA NA NA
> as.complex(x)
[1] NA NA NA
Warning message:
NAs introduced by coercion
Attributes
R objects can have attributes
· names, dimnames
· dimensions (e.g. matrices, arrays)
· class
· length
· other user-defined attributes/metadata
Attributes of an object can be accessed using the attributes() function.
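A brief sketch of attributes() in use. The attribute name "unit" is made up here to stand in for user-defined metadata.

```r
x <- 1:6
attributes(x)             # NULL: a plain vector has no attributes

names(x) <- letters[1:6]
attr(x, "unit") <- "cm"   # "unit" is a hypothetical, user-defined attribute
attributes(x)             # a list with $names and $unit
attr(x, "unit")           # "cm"
length(x)                 # 6
```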
Matrices
Matrices are vectors with a dimension attribute. The dimension attribute is itself an integer vector of length 2
(nrow, ncol).
Matrices are constructed column-wise, so entries can be thought of starting in the “upper left” corner and
running down the columns.
> m <- 1:10
> m
[1] 1 2 3 4 5 6 7 8 9 10
> dim(m)
NULL
> attributes(m)
NULL
> dim(m) <- c(2, 5)
> m
     [,1] [,2] [,3] [,4] [,5]
[1,]    1    3    5    7    9
[2,]    2    4    6    8   10
> dim(m)
[1] 2 5
> attributes(m)
$dim
[1] 2 5
Matrices (cont’d)
Matrices are constructed column-wise, so entries can be thought of starting in the “upper left” corner and
running down the columns.
> m <- matrix(1:6, nrow = 2, ncol = 3)
> m
     [,1] [,2] [,3]
[1,]    1    3    5
[2,]    2    4    6
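Column-wise filling is the default; matrix() also accepts byrow = TRUE when the data should run along the rows instead. A short sketch contrasting the two (m_col and m_row are names chosen for this example):

```r
m_col <- matrix(1:6, nrow = 2, ncol = 3)                 # filled down the columns
m_row <- matrix(1:6, nrow = 2, ncol = 3, byrow = TRUE)   # filled along the rows

m_col[1, ]   # 1 3 5
m_row[1, ]   # 1 2 3
m_col[, 1]   # 1 2
m_row[, 1]   # 1 4
```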
cbind-ing and rbind-ing
Matrices can be created by column-binding or row-binding with cbind() and rbind().
> x <- 1:3
> y <- 10:12
> cbind(x, y)
     x  y
[1,] 1 10
[2,] 2 11
[3,] 3 12
> rbind(x, y)
  [,1] [,2] [,3]
x    1    2    3
y   10   11   12
Exercises
The number of whale beachings per year in Texas during the 1990s was
74 122 235 111 292 111 211 133 156 79
Store this data in R
> whales <- c(74, 122, 235, 111, 292, 111, 211, 133, 156, 79)
or
> whales <- scan()
1: 74 122 235 111 292 111 211 133 156 79
11:
Read 10 items
Exercises (cont’d)
using functions on a data vector
> sum(whales) # total number of beachings
[1] 1524
> length(whales) # length of data vector
[1] 10
> sum(whales)/length(whales) # average number of beachings
[1] 152.4
> mean(whales) # `mean` function finds average
[1] 152.4
> sort(whales) # the sorted values
[1] 74 79 111 111 122 133 156 211 235 292
> min(whales) # the minimum value
[1] 74
> max(whales) # the maximum value
[1] 292
> range(whales) # `range` returns both `min` and `max`
[1] 74 292
> diff(whales) # `diff` returns differences
[1] 48 113 -124 181 -181 100 -78 23 -77
> cumsum(whales) # cumulative sum
[1] 74 196 431 542 834 945 1156 1289 1445 1524
Exercises (cont’d)
The sample variance: the sum of squared distances from the mean, divided by n - 1
> mean(whales)
[1] 152.4
> xbar = mean(whales)
> whales - xbar # vectorized operation
[1] -78.4 -30.4 82.6 -41.4 139.6 -41.4 58.6 -19.4 3.6 -73.4
> (whales-xbar)^2
[1] 6146.56 924.16 6822.76 1713.96 19488.16 1713.96
[7] 3433.96 376.36 12.96 5387.56
> sum((whales-xbar)^2)
[1] 46020.4
> n = length(whales)
> n
[1] 10
> sum((whales-xbar)^2) / (n-1) # variance
[1] 5113.378
> var(whales) # variance
[1] 5113.378
> sqrt(sum((whales-xbar)^2) / (n-1)) # standard deviation
[1] 71.50789
> sqrt(var(whales)) # standard deviation
[1] 71.50789
> sd(whales) # standard deviation
[1] 71.50789
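The step-by-step calculation can be wrapped in a function and compared with the built-in var() and sd(). A minimal sketch; sample_var is a hypothetical helper written only for this illustration.

```r
whales <- c(74, 122, 235, 111, 292, 111, 211, 133, 156, 79)

# sample_var is a hypothetical helper: sum of squared deviations over n - 1.
sample_var <- function(x) {
  n <- length(x)
  sum((x - mean(x))^2) / (n - 1)
}

sample_var(whales)                                # 5113.378
all.equal(sample_var(whales), var(whales))        # TRUE
all.equal(sqrt(sample_var(whales)), sd(whales))   # TRUE
```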
Exercises (cont’d)
The number of whale beachings per year in Florida during the 1990s was
89 254 306 292 274 233 294 204 204 90
> whales.fla <- scan()
1: 89 254 306 292 274 233 294 204 204 90
11:
Read 10 items
> whales + whales.fla
[1] 163 376 541 403 566 344 505 337 360 169
> whales - whales.fla
[1] -15 -132 -71 -181 18 -122 -83 -71 -48 -11
> t.test(whales, whales.fla, var.equal = T)
Two Sample t-test
data: whales and whales.fla
t = -2.1193, df = 18, p-value = 0.04823
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
-142.5803558 -0.6196442
sample estimates:
mean of x mean of y
152.4 224.0
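The t statistic reported by t.test() with var.equal = T can be reproduced by hand from the pooled variance. A sketch under that equal-variance assumption; the variable names sp2, t_stat, df and p_value are introduced here for illustration.

```r
whales     <- c(74, 122, 235, 111, 292, 111, 211, 133, 156, 79)
whales.fla <- c(89, 254, 306, 292, 274, 233, 294, 204, 204, 90)

n1 <- length(whales)
n2 <- length(whales.fla)

# Pooled variance: weighted average of the two sample variances.
sp2 <- ((n1 - 1) * var(whales) + (n2 - 1) * var(whales.fla)) / (n1 + n2 - 2)

# Two-sample t statistic, degrees of freedom, and two-sided p-value.
t_stat  <- (mean(whales) - mean(whales.fla)) / sqrt(sp2 * (1 / n1 + 1 / n2))
df      <- n1 + n2 - 2
p_value <- 2 * pt(-abs(t_stat), df)

c(t = t_stat, df = df, p = p_value)   # compare with the t.test() output
```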
Creating structured data: `seq` & `rep`
simple sequences
> 1:10
[1] 1 2 3 4 5 6 7 8 9 10
> 10:1
[1] 10 9 8 7 6 5 4 3 2 1
> x <- 1:10
> x
[1] 1 2 3 4 5 6 7 8 9 10
> x <- rev(1:10)
> x
[1] 10 9 8 7 6 5 4 3 2 1
arithmetic sequences: a, a+h, a+2h, a+3h, …, a+(n-1)h
> a = 1; h = 4; n = 5;
> a + h*(0:(n-1))
[1] 1 5 9 13 17
the `seq()` function
> seq(1, 9, by = 2) # odd numbers
[1] 1 3 5 7 9
> seq(1, 10, by = 2) # as 11 > 10, 11 is not included
[1] 1 3 5 7 9
> seq(2, 10, by = 2) # even numbers
[1] 2 4 6 8 10
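Besides a step size, seq() can take the number of points, and base R has two related helpers, seq_len() and seq_along(). A short sketch:

```r
seq(0, 1, length.out = 5)   # 0.00 0.25 0.50 0.75 1.00
seq_len(5)                  # 1 2 3 4 5
seq_len(0)                  # integer(0), whereas 1:0 gives 1 0

x <- c(10, 20, 30)
seq_along(x)                # 1 2 3
```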
Creating structured data: `seq` & `rep`
the `rep()` function
> rep(1, 5) # repeat the number `1` five times
[1] 1 1 1 1 1
> rep(1:5, 3) # repeat the sequence of values 1-5, three times
[1] 1 2 3 4 5 1 2 3 4 5 1 2 3 4 5
> days=c("mon","tue","wed","thu","fri","sat","sun")
> rep(days, 2) # repeat the days of the week twice
[1] "mon" "tue" "wed" "thu" "fri" "sat" "sun" "mon" "tue" "wed"
[11] "thu" "fri" "sat" "sun"
Specifying pairs of equal-sized vectors: each element of the first is repeated the number of times given by the
corresponding element of the second
> rep(c("long", "short"), c(1,2)) # 1 long and 2 short
[1] "long" "short" "short"
> rep(1:4,c(2,2,2,2))
[1] 1 1 2 2 3 3 4 4
> rep(1:4,c(2,1,2,1))
[1] 1 1 2 3 3 4
> rep(1:8, 1:8)
[1] 1 2 2 3 3 3 4 4 4 4 5 5 5 5 5 6 6 6 6 6 6 7 7 7 7 7 7 7 8 8 8
[32] 8 8 8 8 8
> rep(rep(1:8, 1:8), 3)
[1] 1 2 2 3 3 3 4 4 4 4 5 5 5 5 5 6 6 6 6 6 6 7 7 7 7 7 7 7 8 8 8
[32] 8 8 8 8 8 1 2 2 3 3 3 4 4 4 4 5 5 5 5 5 6 6 6 6 6 6 7 7 7 7 7
[63] 7 7 8 8 8 8 8 8 8 8 1 2 2 3 3 3 4 4 4 4 5 5 5 5 5 6 6 6 6 6 6
[94] 7 7 7 7 7 7 7 8 8 8 8 8 8 8 8
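The same patterns can be written with rep()'s named arguments, which may read more clearly than positional ones. A short sketch:

```r
rep(1:3, times = 2)                # 1 2 3 1 2 3
rep(1:3, each = 2)                 # 1 1 2 2 3 3
rep(1:3, times = c(2, 1, 2))       # 1 1 2 3 3
rep(c("a", "b"), length.out = 5)   # "a" "b" "a" "b" "a"
```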
Accessing Data: vector
eBay’s Friday stock price in two months:
88.8 88.3 90.2 93.5 95.2 94.7 99.2 99.4 101.6
> ebay = scan()
1: 88.8 88.3 90.2 93.5 95.2 94.7 99.2 99.4 101.6
10:
Read 9 items
> length(ebay) # length of vector
[1] 9
> ebay[1] # get the first value
[1] 88.8
> ebay[9] # get the last value
[1] 101.6
> ebay[length(ebay)] # get the last value when the length is not known
[1] 101.6
Accessing Data: vector (cont’d)
eBay’s Friday stock price in two months:
88.8 88.3 90.2 93.5 95.2 94.7 99.2 99.4 101.6
> ebay[1:4] # get the first four values
[1] 88.8 88.3 90.2 93.5
> ebay[c(1,5,9)] # get the first, fifth, and ninth values
[1] 88.8 95.2 101.6
> ebay[-1] # all but the first
[1] 88.3 90.2 93.5 95.2 94.7 99.2 99.4 101.6
> ebay[-(1:4)] # all but the 1st to 4th
[1] 95.2 94.7 99.2 99.4 101.6
if `i` is between 1 and n (length of x), `x[i]` returns the i-th value of `x`
if `i` is bigger than n, a value of NA is returned
if `i` is negative and no less than -n, `x[i]` returns all but the |i|-th value of x.
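The indexing rules can be checked on the same data, including the out-of-range case. A short sketch; head() and tail() are base R shortcuts that the slides do not use.

```r
ebay <- c(88.8, 88.3, 90.2, 93.5, 95.2, 94.7, 99.2, 99.4, 101.6)
n <- length(ebay)

ebay[n + 1]     # NA: index beyond the end
ebay[-n]        # all but the last value
ebay[0]         # numeric(0): index 0 selects nothing
head(ebay, 3)   # 88.8 88.3 90.2
tail(ebay, 1)   # 101.6
```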
assigning values to data vector
`x[i] = a` : assign a value of `a` to the i-th element of x (i is positive)
if i is larger than the length of x, then x is enlarged
> ebay[1] = 88.0
> ebay
[1] 88.0 88.3 90.2 93.5 95.2 94.7 99.2 99.4 101.6
> ebay[10:13]=c(97.0, 99.3,102.0,101.8)
> ebay
[1] 88.0 88.3 90.2 93.5 95.2 94.7 99.2 99.4 101.6 97.0
[11] 99.3 102.0 101.8
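When the assigned index skips past the current end of the vector, the positions in between are filled with NA. A small sketch:

```r
x <- c(1, 2, 3)
x[6] <- 10   # positions 4 and 5 did not exist before
x            # 1 2 3 NA NA 10
length(x)    # 6
```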
Accessing Data: vector (cont’d)
which values of ebay are more than 100?
> ebay > 100
[1] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE TRUE FALSE
[11] FALSE TRUE TRUE
> ebay[ebay>100]
[1] 101.6 102.0 101.8
> which(ebay > 100)
[1] 9 12 13
> ebay[c(9,12,13)]
[1] 101.6 102.0 101.8
> ebay[ebay>1000]
numeric(0)
> sum(ebay > 100) # number bigger than 100
[1] 3
> sum(ebay > 100)/length(ebay) # proportion bigger
[1] 0.2307692
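Because TRUE counts as 1 and FALSE as 0, mean() of a logical vector gives the proportion directly, and conditions can be combined with &. A short sketch on the 13-value vector from the assignment slide:

```r
ebay <- c(88.0, 88.3, 90.2, 93.5, 95.2, 94.7, 99.2, 99.4, 101.6,
          97.0, 99.3, 102.0, 101.8)

mean(ebay > 100)               # same as sum(ebay > 100) / length(ebay)
ebay[ebay > 95 & ebay < 100]   # 95.2 99.2 99.4 97.0 99.3
which.max(ebay)                # 12
any(ebay > 1000)               # FALSE
```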
Accessing Data: matrix
> USPersonalExpenditure
                      1940   1945  1950 1955  1960
Food and Tobacco    22.200 44.500 59.60 73.2 86.80
Household Operation 10.500 15.500 29.00 36.5 46.20
Medical and Health   3.530  5.760  9.71 14.0 21.10
Personal Care        1.040  1.980  2.45  3.4  5.40
Private Education    0.341  0.974  1.80  2.6  3.64
> dim(USPersonalExpenditure)
[1] 5 5
> ncol(USPersonalExpenditure)
[1] 5
> nrow(USPersonalExpenditure)
[1] 5
> colnames(USPersonalExpenditure)
[1] "1940" "1945" "1950" "1955" "1960"
> rownames(USPersonalExpenditure)
[1] "Food and Tobacco" "Household Operation" "Medical and Health"
[4] "Personal Care" "Private Education"
Accessing Data: matrix (cont’d)
USPersonalExpenditure[1,]
looks at row no.1
USPersonalExpenditure[4:5,]
looks at rows 4 through 5
USPersonalExpenditure[3,1]
looks at row 3, column 1
USPersonalExpenditure[,2]
looks at column 2
> USPersonalExpenditure[1,]
1940 1945 1950 1955 1960
22.2 44.5 59.6 73.2 86.8
> USPersonalExpenditure[,2]
   Food and Tobacco Household Operation  Medical and Health       Personal Care   Private Education
             44.500              15.500               5.760               1.980               0.974
> USPersonalExpenditure[3,1]
[1] 3.53
> USPersonalExpenditure[4:5,]
                   1940  1945 1950 1955 1960
Personal Care     1.040 1.980 2.45  3.4 5.40
Private Education 0.341 0.974 1.80  2.6 3.64
> USPersonalExpenditure[1,3]
[1] 59.6
> USPersonalExpenditure["Food and Tobacco", "1950"]
[1] 59.6
> USPersonalExpenditure[1, "1950"]
[1] 59.6
Accessing Data: matrix (cont’d)
> USPersonalExpenditure[1, c(5, 3, 1)]
1960 1950 1940
86.8 59.6 22.2
> USPersonalExpenditure["Food and Tobacco", c("1960", "1950", "1940")]
1960 1950 1940
86.8 59.6 22.2
Accessing Data: matrix (cont’d)
> sum(USPersonalExpenditure[,1])
[1] 37.611
> sum(USPersonalExpenditure[,2])
[1] 68.714
> sum(USPersonalExpenditure[,3])
[1] 102.56
> sum(USPersonalExpenditure[,4])
[1] 129.7
> sum(USPersonalExpenditure[,5])
[1] 163.14
> sum(USPersonalExpenditure[,6])
Error in USPersonalExpenditure[, 6] : subscript out of bounds
> colSums(USPersonalExpenditure)
1940 1945 1950 1955 1960
37.611 68.714 102.560 129.700 163.140
> rowSums(USPersonalExpenditure)
   Food and Tobacco Household Operation  Medical and Health       Personal Care   Private Education
            286.300             137.700              54.100              14.270               9.355
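colSums() and rowSums() are special cases of applying a function over one margin of a matrix. A sketch using base R's apply() and prop.table(), neither of which appears in the slides; USPersonalExpenditure ships with R's datasets package, and the names m and shares are chosen for this example.

```r
m <- USPersonalExpenditure   # built-in dataset (datasets package)

apply(m, 2, sum)             # column totals, same values as colSums(m)
apply(m, 1, sum)             # row totals, same values as rowSums(m)

# Share of each category within its year; every column sums to 1.
shares <- prop.table(m, margin = 2)
round(shares["Food and Tobacco", ], 3)
```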
Generating random sequences
# random numbers drawn from a standard normal distribution
set.seed(689)
x = rnorm(1000)
head(x)
[1] 0.9684062 -2.1456719 0.5330228 -0.1597912 0.6806083 -0.7543219
hist(x)
# random numbers drawn from an exponential distribution
set.seed(689)
x = rexp(1000)
head(x)
[1] 0.6671585 1.3973498 0.4059822 1.1404633 1.2143525 0.2488164
hist(x)
# random numbers drawn from a uniform distribution
set.seed(689)
x = runif(1000)
head(x)
[1] 0.83357924 0.23074833 0.01594958 0.69043398 0.70299110 0.36182904
hist(x)
Central Limit Theorem
Each row of x.samples holds one sample of size 30. In every pair of plots, the left histogram shows the random
numbers themselves and the right histogram shows the sampling distribution of the mean.
# standard normal distribution
set.seed(9.2)
x.samples <- matrix(rnorm(10000*30), nrow = 10000)
par(mfrow=c(1,2))
hist(x.samples) # random numbers drawn from a standard normal distribution
hist(rowMeans(x.samples)) # sampling distribution of the mean
# exponential distribution
set.seed(9.2)
x.samples <- matrix(rexp(10000*30), nrow = 10000)
par(mfrow=c(1,2))
hist(x.samples) # random numbers drawn from an exponential distribution
hist(rowMeans(x.samples)) # sampling distribution of the mean
# uniform distribution
set.seed(9.2)
x.samples <- matrix(runif(10000*30), nrow = 10000)
par(mfrow=c(1,2))
hist(x.samples) # random numbers drawn from a uniform distribution
hist(rowMeans(x.samples)) # sampling distribution of the mean
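The histograms can be backed by a numeric check: for samples of size n, the standard deviation of the sample mean should be close to the population standard deviation divided by sqrt(n). A sketch assuming the default parameters of rnorm(), rexp() and runif() (population standard deviations 1, 1 and sqrt(1/12)); the seed and the variable names are chosen here for illustration.

```r
set.seed(9)
n <- 30   # sample size per row, as in the slides

means_norm <- rowMeans(matrix(rnorm(10000 * n), nrow = 10000))
means_exp  <- rowMeans(matrix(rexp(10000 * n),  nrow = 10000))
means_unif <- rowMeans(matrix(runif(10000 * n), nrow = 10000))

# Observed spread of the sample means next to the theoretical value.
c(observed = sd(means_norm), theory = 1 / sqrt(n))
c(observed = sd(means_exp),  theory = 1 / sqrt(n))
c(observed = sd(means_unif), theory = sqrt(1 / 12) / sqrt(n))
```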

[1062BPY12001] Data analysis with R / week 2

  • 2. Entering Input 在 R 的提⽰示符號 (prompt, “>”) 右⽅方鍵⼊入指令。 符號 “<-“ 為設定運算⼦子 (the assignment operator). R 語⾔言的語法決定指令是否完備 The # character indicates a comment. Anything to the right of the # (including the # itself) is ignored. > x <- 1 > print(x) [1] 1 > x [1] 1 > msg <- "hello" > x <- ## Incomplete expression > x <- “5 ## Incomplete expression > x <- c(5,) ## Incomplete expression
  • 3. R 物件命名規則 Case sensitive • A and a are different All alphanumeric symbols are allowed (A-Z, a-z, 0-9) • “.”, “_”. Name must start with “.” or a letter. Do not use reserved keywords ❑ 錯誤命名 ■ 3x ■ 3_x ■ 3-x ■ 3.x ■ .3variable ❑ 正確命名 ■ x_3 ■ x3 ■ x.3 ■ taiwan.taipei.x3 ■ .variable
  • 4. Evaluation 執⾏行指令 (evaluation) 之後,其結果會被回傳,可能會直接列印在 console 視窗,或者存⼊入某個變數 The [1] indicates that x is a vector and 5 is the first element. > x <- 5 > x [1] 5 > print(x) [1] 5 ## nothing printed ## auto-printing occurs ## explicit printing The : operator is used to create integer sequences. > x <- 1:20 > x [1] 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 [16] 16 17 18 19 20 Printing
  • 5. > 3 - 4 [1] -1 > 5 * 6 [1] 30 > 7 / 8 [1] 0.875 > 1 + 2 * 3 [1] 7 > (1 + 2) * 3 [1] 9 > 15 / 4 [1] 3.75 > 15 %% 4 [1] 3 > 2^2 [1] 4 > 2^0.5 [1] 1.414214 > 2^ 4.3 [1] 19.69831 > 2^-0.5 [1] 0.7071068 > log(4) # natural log > log10(4) # log in base 10 > log(4,10) # same as above > sqrt(9) # square root > abs(3-4) # absolute value > exp(1) # exponential Using R as a Calculator
  • 6. > squareroot(2) Error: could not find function “squareroot” > sqrt 2 Error: unexpected numeric constant in "sqrt 2" > sqrt(-2) [1] NaN Warning message: In sqrt(-2) : NaNs produced > sqrt(2 + ) [1] 1.414214 Warnings and Errors
  • 7. Objects R has five basic or “atomic” classes of objects: · character · numeric (real numbers) · integer · complex · logical (True/False) The most basic object is a vector · A vector can only contain objects of the same class · BUT: The one exception is a list, which is represented as a vector but can contain objects of different classes (indeed, that’s usually why we use them) Empty vectors can be created with the vector() function.
  • 8. Numbers · Numbers in R a generally treated as numeric objects (i.e. double precision real numbers) · If you explicitly want an integer, you need to specify the L suffix · Ex: Entering 1 gives you a numeric object; entering 1L explicitly gives you an integer. · There is also a special number Inf which represents infinity; e.g. 1 / 0; Inf can be used in ordinary calculations; e.g. 1 / Inf is 0 · The value NaN represents an undefined value (“not a number”); e.g. 0 / 0; NaN can also be thought of as a missing value (more on that later)
  • 9. Creating Vectors The c() function can be used to create vectors of objects. Using the vector() function > x <- c(0.5, 0.6) ## numeric > x <- c(TRUE, FALSE) ## logical > x <- c(T, F) ## logical > x <- c("a", "b", "c") ## character > x <- 9:29 ## integer > x <- c(1+0i, 2+4i) ## complex > x <- vector("numeric", length = 10) > x [1] 0 0 0 0 0 0 0 0 0 0 > x <- c(74, 122, 235, 111, 292, 111, 211, 133, 156, 79) > x1 <- c(74, 122, 235, 111, 292) > x2 <- c(111, 211, 133, 156, 79) > x_all <- c(x1, x2) The c() function can also combine data vectors. For example:
  • 10. Mixing Objects What about the following? When different objects are mixed in a vector, coercion occurs so that every element in the vector is of the same class. > y <- c(1.7, "a") ## character > y <- c(TRUE, 2) ## numeric > y <- c("a", TRUE) ## character
  • 11. Explicit Coercion Objects can be explicitly coerced from one class to another using the as.* functions, if available. > x <- 0:6 > class(x) [1] "integer" > as.numeric(x) [1] 0 1 2 3 4 5 6 > as.logical(x) [1] FALSE TRUE TRUE TRUE TRUE TRUE TRUE > as.character(x) [1] "0" "1" "2" "3" "4" "5" "6"
  • 12. Explicit Coercion Nonsensical coercion results in NAs. > x <- c("a", "b", "c") > as.numeric(x) [1] NA NA NA Warning message: NAs introduced by coercion > as.logical(x) [1] NA NA NA > as.complex(x) [1] 0+0i 1+0i 2+0i 3+0i 4+0i 5+0i 6+0i
  • 13. Attributes R objects can have attributes · names, dimnames · dimensions (e.g. matrices, arrays) · class · length · other user-defined attributes/metadata Attributes of an object can be accessed using the attributes() function.
  • 14. Matrices Matrices are vectors with a dimension attribute. The dimension attribute is itself an integer vector of length 2 (nrow, ncol). Matrices are constructed column-wise, so entries can be thought of starting in the “upper left” corner and running down the columns. > m <- 1:10 > m [1] 1 2 3 4 5 6 7 8 9 10 > dim(m) NULL > attributes(m) NULL > dim(m) <- c(2, 5) > m [,1] [,2] [,3] [,4] [,5] [1,] 1 3 5 7 9 [2,] 2 4 6 8 10 > dim(m) [1] 2 5 > attributes(m) $dim [1] 2 5
  • 15. Matrices (cont’d) Matrices are constructed column-wise, so entries can be thought of starting in the “upper left” corner and running down the columns. > m <- matrix(1:6, nrow = 2, ncol = 3) > m [,1] [,2] [,3] [1,] 1 3 5 [2,] 2 4 6
  • 16. cbind-ing and rbind-ing Matrices can be created by column-binding or row-binding with cbind() and rbind(). > x <- 1:3 > y <- 10:12 > cbind(x, y) x y [1,] 1 10 [2,] 2 11 [3,] 3 12 > rbind(x, y) [,1] [,2] [,3] x 1 2 3 y 10 11 12
  • 17. Exercises the number of whales beachings per year in Texas during the 1990s was 74 122 235 111 292 111 211 133 156 79 Store this data in R > whales <- scan() 1: 74 122 235 111 292 111 211 133 156 79 11: Read 10 items > whales <- c(74, 122, 235, 111, 292, 111, 211, 133, 156, 79) or
  • 18. Exercises (cont’d) using functions on a data vector > sum(whales) [1] 1524 > length(whales) [1] 10 > sum(whales)/length(whales) [1] 152.4 > mean(whales) [1] 152.4 > sort(whales) [1] 74 79 111 111 122 133 156 211 235 292 > min(whales) [1] 74 > max(whales) [1] 292 > range(whales) [1] 74 292 > diff(whales) [1] 48 113 -124 181 -181 100 -78 23 -77 > cumsum(whales) [1] 74 196 431 542 834 945 1156 1289 1445 1524 # total number of beachings # length of data vector # average number of beachings # `mean` function finds average # the sorted values # the minimum value # the maximum value # `ranges` returns both `min` and `max` # `diff` returns differences # cumulative sum
  • 19. Exercises (cont’d) The variance: the average squared distance from the mean > mean(whales) [1] 152.4 > xbar = mean(whales) > whales - xbar [1] -78.4 -30.4 82.6 -41.4 139.6 -41.4 58.6 -19.4 3.6 -73.4 > (whales-xbar)^2 [1] 6146.56 924.16 6822.76 1713.96 19488.16 1713.96 [7] 3433.96 376.36 12.96 5387.56 > sum((whales-xbar)^2) [1] 46020.4 > n = length(whales) > n [1] 10 > sum((whales-xbar)^2) / (n-1) [1] 5113.378 > var(whales) [1] 5113.378 > sqrt(sum((whales-xbar)^2) / (n-1)) [1] 71.50789 > sqrt(var(whales)) [1] 71.50789 > sd(whales) [1] 71.50789 # variance # variance # standard deviation # standard deviation # standard deviation # vectorized operation
  • 20. Exercises (cont’d) the number of whales beachings per year in Florida during the 1990s was 89 254 306 292 274 233 294 204 204 90 > whales.fla <- scan() 1: 89 254 306 292 274 233 294 204 204 90 11: Read 10 items > whales + whales.fla [1] 163 376 541 403 566 344 505 337 360 169 > whales - whales.fla [1] -15 -132 -71 -181 18 -122 -83 -71 -48 -11 > t.test(whales, whales.fla, var.equal = T) Two Sample t-test data: whales and whales.fla t = -2.1193, df = 18, p-value = 0.04823 alternative hypothesis: true difference in means is not equal to 0 95 percent confidence interval: -142.5803558 -0.6196442 sample estimates: mean of x mean of y 152.4 224.0
  • 21. Creating structured data: `seq` & `rep` simple sequences > 1:10 [1] 1 2 3 4 5 6 7 8 9 10 > 10:1 [1] 10 9 8 7 6 5 4 3 2 1 > x <- 1:10 > x [1] 1 2 3 4 5 6 7 8 9 10 > x <- rev(1:10) > x [1] 10 9 8 7 6 5 4 3 2 1 arithmetic sequences: a, a+h, a+2h, a+3h, …, a+(n-1)h > a = 1; h = 4; n = 5; > a + h*(0:(n-1)) [1] 1 5 9 13 17 > seq(1, 9, by =2) [1] 1 3 5 7 9 > seq(1, 10, by =2) [1] 1 3 5 7 9 > seq(2, 10, by =2) [1] 2 4 6 8 10 the `seq()` function # odd number # as 11 > 10, 11 is not included # even number
  • 22. Creating structured data: `seq` & `rep`
    The `rep()` function
    > rep(1, 5)                # repeat the number `1` five times
    [1] 1 1 1 1 1
    > rep(1:5, 3)              # repeat the sequence of values 1-5 three times
    [1] 1 2 3 4 5 1 2 3 4 5 1 2 3 4 5
    > days=c("mon","tue","wed","thu","fri","sat","sun")
    > rep(days, 2)             # repeat the days of the week twice
     [1] "mon" "tue" "wed" "thu" "fri" "sat" "sun" "mon" "tue" "wed"
    [11] "thu" "fri" "sat" "sun"
    Specifying pairs of equal-sized vectors: each term of the first is repeated the number of times given by the corresponding term of the second
    > rep(c("long", "short"), c(1,2))     # 1 long and 2 short
    [1] "long"  "short" "short"
    > rep(1:4,c(2,2,2,2))
    [1] 1 1 2 2 3 3 4 4
    > rep(1:4,c(2,1,2,1))
    [1] 1 1 2 3 3 4
    > rep(1:8, 1:8)
     [1] 1 2 2 3 3 3 4 4 4 4 5 5 5 5 5 6 6 6 6 6 6 7 7 7 7 7 7 7 8 8 8
    [32] 8 8 8 8 8
    > rep(rep(1:8, 1:8), 3)
     [1] 1 2 2 3 3 3 4 4 4 4 5 5 5 5 5 6 6 6 6 6 6 7 7 7 7 7 7 7 8 8 8
    [32] 8 8 8 8 8 1 2 2 3 3 3 4 4 4 4 5 5 5 5 5 6 6 6 6 6 6 7 7 7 7 7
    [63] 7 7 8 8 8 8 8 8 8 8 1 2 2 3 3 3 4 4 4 4 5 5 5 5 5 6 6 6 6 6 6
    [94] 7 7 7 7 7 7 7 8 8 8 8 8 8 8 8
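The pairwise form of `rep()` shown above has named-argument equivalents, `times` and `each`, which can make the intent easier to read. A minimal sketch (the variable names are illustrative only):

```r
# each = 2 repeats every element twice, like rep(1:4, c(2,2,2,2)).
pairs_each  <- rep(1:4, each = 2)
pairs_times <- rep(1:4, times = c(2, 2, 2, 2))

# each is applied first, then the whole result is repeated `times` times.
blocks <- rep(1:3, each = 2, times = 2)

# length.out recycles the input until the requested length is reached.
pattern <- rep(c("long", "short"), length.out = 5)

pairs_each
blocks
pattern
```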
  • 23. Accessing Data: vector
    eBay’s Friday stock price over two months: 88.8 88.3 90.2 93.5 95.2 94.7 99.2 99.4 101.6
    > ebay = scan()
    1: 88.8 88.3 90.2 93.5 95.2 94.7 99.2 99.4 101.6
    10:
    Read 9 items
    > length(ebay)            # length of vector
    [1] 9
    > ebay[1]                 # get the first value
    [1] 88.8
    > ebay[9]                 # get the last value
    [1] 101.6
    > ebay[length(ebay)]      # get the last value when the length is not known
    [1] 101.6
  • 24. Accessing Data: vector (cont’d)
    eBay’s Friday stock price over two months: 88.8 88.3 90.2 93.5 95.2 94.7 99.2 99.4 101.6
    > ebay[1:4]               # get the first four values
    [1] 88.8 88.3 90.2 93.5
    > ebay[c(1,5,9)]          # get the first, fifth, and ninth values
    [1]  88.8  95.2 101.6
    · if `i` is between 1 and n (the length of `x`), `x[i]` returns the i-th value of `x`
    · if `i` is bigger than n, a value of NA is returned
    · if `i` is negative and no less than -n, `x[i]` returns all but the |i|-th value of `x`
    > ebay[-1]                # all but the first
    [1]  88.3  90.2  93.5  95.2  94.7  99.2  99.4 101.6
    > ebay[-(1:4)]            # all but the 1st to 4th
    [1]  95.2  94.7  99.2  99.4 101.6
    Assigning values to a data vector
    · `x[i] = a` assigns the value `a` to the i-th element of `x` (`i` is positive)
    · if `i` is larger than the length of `x`, then `x` is enlarged
    > ebay[1] = 88.0
    > ebay
    [1]  88.0  88.3  90.2  93.5  95.2  94.7  99.2  99.4 101.6
    > ebay[10:13]=c(97.0, 99.3,102.0,101.8)
    > ebay
     [1]  88.0  88.3  90.2  93.5  95.2  94.7  99.2  99.4 101.6  97.0
    [11]  99.3 102.0 101.8
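The indexing rules above can be checked directly. The sketch below assumes `ebay` is entered with `c()` rather than `scan()`, and works on a copy `x` (an illustrative name) so the original vector is left unchanged.

```r
ebay <- c(88.8, 88.3, 90.2, 93.5, 95.2, 94.7, 99.2, 99.4, 101.6)
n <- length(ebay)

ebay[n + 1]        # index past the end returns NA
ebay[-n]           # negative index: all but the last value
head(ebay, -1)     # same result using head()

# Assigning past the end enlarges the vector; skipped positions become NA.
x <- ebay
x[12] <- 100
length(x)          # now 12
x[10:11]           # the two skipped positions are NA
```

Because enlargement happens silently, a mistyped index can leave unexpected `NA` values in the middle of a vector.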
  • 25. Accessing Data: vector (cont’d)
    Which values of ebay are more than 100?
    > ebay > 100
     [1] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE  TRUE FALSE
    [11] FALSE  TRUE  TRUE
    > ebay[ebay>100]
    [1] 101.6 102.0 101.8
    > which(ebay > 100)
    [1]  9 12 13
    > ebay[c(9,12,13)]
    [1] 101.6 102.0 101.8
    > ebay[ebay>1000]
    numeric(0)
    > sum(ebay > 100)                  # number bigger than 100
    [1] 3
    > sum(ebay > 100)/length(ebay)     # proportion bigger
    [1] 0.2308
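Since `TRUE` counts as 1 and `FALSE` as 0 in arithmetic, the count and the proportion above are just the sum and the mean of a logical vector. A short sketch, assuming the 13-value `ebay` vector from the previous slide is entered with `c()`; `above` and `mid_range` are illustrative names.

```r
ebay <- c(88.0, 88.3, 90.2, 93.5, 95.2, 94.7, 99.2, 99.4, 101.6,
          97.0, 99.3, 102.0, 101.8)

above <- ebay > 100      # logical vector, one entry per price

sum(above)               # how many prices exceed 100
mean(above)              # proportion, same as sum(above) / length(ebay)
which(above)             # positions of the TRUE entries

# Conditions combine elementwise with & (and) and | (or).
mid_range <- ebay[ebay > 95 & ebay < 100]
mid_range

any(ebay > 1000)         # FALSE: no price is that large
```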
  • 27. Accessing Data: matrix
    > USPersonalExpenditure
                          1940   1945  1950 1955  1960
    Food and Tobacco    22.200 44.500 59.60 73.2 86.80
    Household Operation 10.500 15.500 29.00 36.5 46.20
    Medical and Health   3.530  5.760  9.71 14.0 21.10
    Personal Care        1.040  1.980  2.45  3.4  5.40
    Private Education    0.341  0.974  1.80  2.6  3.64
    > dim(USPersonalExpenditure)
    [1] 5 5
    > ncol(USPersonalExpenditure)
    [1] 5
    > nrow(USPersonalExpenditure)
    [1] 5
    > colnames(USPersonalExpenditure)
    [1] "1940" "1945" "1950" "1955" "1960"
    > rownames(USPersonalExpenditure)
    [1] "Food and Tobacco"    "Household Operation" "Medical and Health"
    [4] "Personal Care"       "Private Education"
  • 28. Accessing Data: matrix (cont’d)
    · USPersonalExpenditure[1,] looks at row 1
    · USPersonalExpenditure[4:5,] looks at rows 4 through 5
    · USPersonalExpenditure[3,1] looks at row 3, column 1
    · USPersonalExpenditure[,2] looks at column 2
    > USPersonalExpenditure[1,]
    1940 1945 1950 1955 1960
    22.2 44.5 59.6 73.2 86.8
    > USPersonalExpenditure[,2]
       Food and Tobacco Household Operation  Medical and Health       Personal Care   Private Education
                 44.500              15.500               5.760               1.980               0.974
    > USPersonalExpenditure[3,1]
    [1] 3.53
    > USPersonalExpenditure[4:5,]
                       1940  1945 1950 1955 1960
    Personal Care     1.040 1.980 2.45  3.4 5.40
    Private Education 0.341 0.974 1.80  2.6 3.64
    > USPersonalExpenditure[1,3]
    [1] 59.6
    > USPersonalExpenditure["Food and Tobacco", "1950"]
    [1] 59.6
    > USPersonalExpenditure[1, "1950"]
    [1] 59.6
  • 29. Accessing Data: matrix (cont’d)
    > USPersonalExpenditure[1, c(5, 3, 1)]
    1960 1950 1940
    86.8 59.6 22.2
    > USPersonalExpenditure["Food and Tobacco", c("1960", "1950", "1940")]
    1960 1950 1940
    86.8 59.6 22.2
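Selecting a single row or column, as above, returns a plain named vector rather than a matrix. When the matrix shape is needed, `drop = FALSE` keeps it. The sketch below uses the built-in `USPersonalExpenditure` dataset; the names `row_vec`, `row_mat`, and `big_1960` are illustrative only.

```r
# Single-row selection drops to a named vector by default.
row_vec <- USPersonalExpenditure[1, ]
dim(row_vec)                       # NULL: no longer a matrix

# drop = FALSE keeps a 1 x 5 matrix, including the row name.
row_mat <- USPersonalExpenditure[1, , drop = FALSE]
dim(row_mat)

# Rows can also be chosen by a logical condition on one column.
big_1960 <- USPersonalExpenditure[USPersonalExpenditure[, "1960"] > 20, ]
big_1960
```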
  • 30. Accessing Data: matrix (cont’d)
    > sum(USPersonalExpenditure[,1])
    [1] 37.611
    > sum(USPersonalExpenditure[,2])
    [1] 68.714
    > sum(USPersonalExpenditure[,3])
    [1] 102.56
    > sum(USPersonalExpenditure[,4])
    [1] 129.7
    > sum(USPersonalExpenditure[,5])
    [1] 163.14
    > sum(USPersonalExpenditure[,6])
    Error in USPersonalExpenditure[, 6] : subscript out of bounds
    > colSums(USPersonalExpenditure)
       1940    1945    1950    1955    1960
     37.611  68.714 102.560 129.700 163.140
    > rowSums(USPersonalExpenditure)
       Food and Tobacco Household Operation  Medical and Health       Personal Care   Private Education
                286.300             137.700              54.100              14.270               9.355
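The column-by-column sums above generalize with `apply()`, which runs any function over rows (margin 1) or columns (margin 2). A minimal sketch on the built-in dataset; `by_apply`, `row_max`, and `shares` are illustrative names.

```r
# apply() over columns (margin 2) reproduces colSums().
by_apply <- apply(USPersonalExpenditure, 2, sum)
by_apply

# Any other function works the same way, e.g. the largest value in each row.
row_max <- apply(USPersonalExpenditure, 1, max)
row_max

# Each category's share of the yearly total: divide every column by its sum.
shares <- prop.table(USPersonalExpenditure, margin = 2)
round(100 * shares, 1)
```

For plain sums, `colSums()` and `rowSums()` remain the more direct choice; `apply()` is useful when the function is something else.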
  • 32. Random numbers drawn from a standard normal distribution
    set.seed(689)
    x = rnorm(1000)
    head(x)
    [1]  0.9684062 -2.1456719  0.5330228 -0.1597912  0.6806083 -0.7543219
    hist(x)
    Random numbers drawn from an exponential distribution
    set.seed(689)
    x = rexp(1000)
    head(x)
    [1] 0.6671585 1.3973498 0.4059822 1.1404633 1.2143525 0.2488164
    hist(x)
    Random numbers drawn from a uniform distribution
    set.seed(689)
    x = runif(1000)
    head(x)
    [1] 0.83357924 0.23074833 0.01594958 0.69043398 0.70299110 0.36182904
    hist(x)
  • 33. Central Limit Theorem
    Standard normal distribution
    set.seed(9.2)
    x.samples <- matrix(rnorm(10000*30), nrow = 10000)
    par(mfrow=c(1,2))
    hist(x.samples)               # random numbers drawn from a standard normal distribution
    hist(rowMeans(x.samples))     # sampling distribution of the mean
    Exponential distribution
    set.seed(9.2)
    x.samples <- matrix(rexp(10000*30), nrow = 10000)
    par(mfrow=c(1,2))
    hist(x.samples)               # random numbers drawn from an exponential distribution
    hist(rowMeans(x.samples))     # sampling distribution of the mean
    Uniform distribution
    set.seed(9.2)
    x.samples <- matrix(runif(10000*30), nrow = 10000)
    par(mfrow=c(1,2))
    hist(x.samples)               # random numbers drawn from a uniform distribution
    hist(rowMeans(x.samples))     # sampling distribution of the mean
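Besides the histograms, the Central Limit Theorem makes a numerical prediction that can be checked: the means of samples of size n have standard deviation close to sigma/sqrt(n). The sketch below does this for the exponential case, where the default rate of 1 gives mean 1 and standard deviation 1. The seed value and the names `n`, `reps`, and `xbar` are assumptions for illustration, not taken from the slides.

```r
set.seed(9)      # assumed seed, chosen only for reproducibility
n    <- 30       # size of each sample (one matrix row)
reps <- 10000    # number of samples

# Each row is one sample of n exponential draws.
x.samples <- matrix(rexp(reps * n), nrow = reps)
xbar <- rowMeans(x.samples)

mean(xbar)       # should be close to 1, the mean of the distribution
sd(xbar)         # should be close to 1 / sqrt(n)
1 / sqrt(n)      # theoretical standard deviation of the sample mean
```

The same check works for `rnorm()` and `runif()` once the theoretical mean and standard deviation of those distributions are substituted.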