R Tutorial
BUSINESS ANALYTICS (SOSE 2014)
Martin Wistuba
29/04/2014
4/29/2014
What is R?
Programming language
Software environment
Used by statisticians and data miners
Open-source implementation of the S language
Large number of built-in statistical functions
Easily configurable via packages
Getting R
Download it from:
http://cran.r-project.org/
Manual:
http://cran.r-project.org/doc/manuals/R-intro.pdf
Includes RGui - IDE
Type an expression at the prompt; R evaluates it and prints the result
> 1+1
[1] 2
> pi
[1] 3.141593
> sqrt(2)
[1] 1.414214
Basics - Variables
Variables are created and initialized via name <- value
> x <- 23
> y <- 4
> z <- log( sqrt(x) + y )
> print(z)
[1] 2.174278
> z
[1] 2.174278
Basics - Printing
print() : prints a single variable or data structure
cat() : prints several items in sequence, separated by spaces
> x <- 5
> y <- 4
> z <- sqrt(x+y)
> print(z)
[1] 3
> cat("Square root of", x, "plus", y, "is", z, "\n")
Square root of 5 plus 4 is 3
Basics - Workspace
The R session workspace stores all the created variables
> ls()
[1] "x" "y" "z"
> rm(x)
> ls()
[1] "y" "z"
> rm(y,z)
> ls()
character(0)
Basics - Vectors
A vector is an ordered collection of values of the same mode (e.g. numeric) and is created with c(value1, value2, ...)
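A minimal sketch of creating and using vectors with c(). The variable names and values are illustrative and do not come from the original slides.

```r
# Create a numeric vector with c() (values are illustrative)
v <- c(1.5, 2, 7, 10)
length(v)        # number of elements
v[2]             # indexing starts at 1, so this is the second element
v * 2            # arithmetic is applied element by element
mean(v)          # built-in statistical functions accept vectors

# Mixing modes coerces all elements to one common mode
w <- c(1, "a", TRUE)
mode(w)
```

Because all elements share one mode, mixing numbers and strings silently turns the numbers into strings.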
Basics - Sequences
Create a sequence of numbers via the : operator or seq(from, to, by)
> 1:14
[1] 1 2 3 4 5 6 7 8 9 10 11 12 13 14
> seq(-1, 2, 0.3)
[1] -1.0 -0.7 -0.4 -0.1 0.2 0.5 0.8 1.1 1.4 1.7 2.0
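A few further ways to build sequences, shown as a sketch. The variable names s1, s2, s3 and r are illustrative.

```r
# seq() with named arguments; 'by' is the step size
s1 <- seq(from = 0, to = 1, by = 0.25)
# 'length.out' fixes the number of points instead of the step
s2 <- seq(0, 1, length.out = 5)
# The colon operator also counts downwards
s3 <- 5:1
# rep() repeats a value or a whole vector
r <- rep(c(1, 2), times = 3)
print(s1)
print(s3)
print(r)
```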
Basics - Functions
function(param1, param2, ..., paramN)
{
expression 1
expression 2
return(value)
}
Conditional Execution
if(cond) expr else expr
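The function template and conditional execution can be combined as in the following sketch. The function absdiff is a hypothetical example, not part of the original slides.

```r
# Hypothetical example: absolute difference of two numbers
absdiff <- function(a, b)
{
  # conditional execution: choose the order of subtraction
  if (a > b) d <- a - b else d <- b - a
  return(d)
}

absdiff(3, 10)
absdiff(10, 3)
```

A function is an ordinary value in R, so it is stored in a variable with <- just like a number.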
Basics - Loops
while(condition) expression
> z <- 0
> while(z < 5){
+ z <- z + 2
+ print(z)
+ }
[1] 2
[1] 4
[1] 6
Homework: Search for and learn the for loop
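As a starting point for the homework, a minimal sketch of the for loop. The variables total, words, i and k are illustrative.

```r
# for iterates over the elements of a vector
total <- 0
for (i in 1:5) {
  total <- total + i
}
print(total)

# seq_along() yields the indices of a vector
words <- c("a", "b", "c")
for (k in seq_along(words)) {
  cat(k, words[k], "\n")
}
```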
> mode(lst[[2]])
[1] "character"
> lst2 <- list("Z", c(-21,5,7), list(3,"C"))
> print(lst2)
[[1]]
[1] "Z"
[[2]]
[1] -21 5 7
[[3]]
[[3]][[1]]
[1] 3
[[3]][[2]]
[1] "C"
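A short sketch of how list elements are accessed, reusing lst2 from above. The named list p is an illustrative addition.

```r
lst2 <- list("Z", c(-21, 5, 7), list(3, "C"))
lst2[[2]]          # double brackets return the element itself
lst2[2]            # single brackets return a sub-list of length 1
lst2[[3]][[2]]     # access inside a nested list
mode(lst2[[1]])

# Elements can be named and then accessed with $ (names are illustrative)
p <- list(name = "hans", grade = 1.7)
p$grade
```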
> A <- matrix(1:6,3,2)     # matrix(content,rows,cols)
> print(A)
     [,1] [,2]
[1,]    1    4
[2,]    2    5
[3,]    3    6
> A[3,1]
[1] 3
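# Here A must have at least 4 rows and 4 columns, e.g. matrix(1:16,4,4)
# (assumed; the definition of this matrix is not shown)
> A[1:2,3:4]
     [,1] [,2]
[1,]    9   13
[2,]   10   14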
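A sketch of common matrix indexing patterns, using the 3 x 2 matrix from the earlier example. The matrix B and the byrow argument are illustrative additions.

```r
A <- matrix(1:6, 3, 2)     # filled column by column
dim(A)                     # number of rows and columns
A[3, 1]                    # single element: row 3, column 1
A[2, ]                     # the whole second row
A[, 2]                     # the whole second column
t(A)                       # transpose, giving a 2 x 3 matrix

# byrow = TRUE fills the matrix row by row instead
B <- matrix(1:6, 3, 2, byrow = TRUE)
B[1, 2]
```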
Source: http://timkienthuc.blogspot.de
Data Frames
> newRow <- data.frame(names="josif", grades=2.0)
> scores <- rbind(scores,newRow)
> print(scores)
  names grades
1  hans    1.7
2   tim    2.0
3 lukas    3.0
4  jorg    1.3
5 josif    2.0
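The definition of scores is not shown above. The sketch below assumes it held the first four rows of the printed output and then repeats the rbind step, followed by a few typical queries.

```r
# Assumed definition of 'scores', reconstructed from the printed output
scores <- data.frame(
  names  = c("hans", "tim", "lukas", "jorg"),
  grades = c(1.7, 2.0, 3.0, 1.3)
)
newRow <- data.frame(names = "josif", grades = 2.0)
scores <- rbind(scores, newRow)

nrow(scores)                    # number of rows
scores$grades                   # one column as a vector
scores[scores$grades < 2, ]     # rows with a grade below 2
mean(scores$grades)
```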
Contents of table.csv:
label,lbound,ubound
low,0,0.674
mid,0.674,1.64
high,1.64,2.33
> tbl <- read.csv("table.csv")
> print(tbl)
  label lbound ubound
1   low  0.000  0.674
2   mid  0.674  1.640
3  high  1.640  2.330
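A self-contained sketch of the same round trip. It writes the three rows to a temporary file (the path is hypothetical, created by tempfile()) so that read.csv has something to read.

```r
# Write the file contents to a temporary location, then read them back
path <- tempfile(fileext = ".csv")
writeLines(c("label,lbound,ubound",
             "low,0,0.674",
             "mid,0.674,1.64",
             "high,1.64,2.33"), path)

tbl <- read.csv(path)          # the first line is used as column names
str(tbl)                       # structure: column names and types
tbl$ubound - tbl$lbound        # width of each interval

# Save a data frame without the extra row-name column
write.csv(tbl, path, row.names = FALSE)
```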
Strings
> name <- "josif"
> surname <- "grabocka"
> fullname <- paste(name,surname)
> print(fullname)
[1] "josif grabocka"
> nchar(fullname)
[1] 14
> substr(fullname,7,10)
[1] "grab"
> sub("a", "$", fullname)
[1] "josif gr$bocka"
> gsub("a", "$", fullname)
[1] "josif gr$bock$"
Dates
> Sys.Date()
[1] "2013-04-28"
> format(Sys.Date(), "%m/%d/%Y")
[1] "04/28/2013"
> s <- as.Date("2013-04-23")
> e <- as.Date("2013-04-30")
> seq(s,e,1)
[1] "2013-04-23" "2013-04-24" "2013-04-25" "2013-04-26"
"2013-04-27" "2013-04-28" "2013-04-29" "2013-04-30"
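Date objects also support arithmetic. The sketch below reuses the start and end dates from the Dates example. The variable weekly and the chosen format string are illustrative.

```r
s <- as.Date("2013-04-23")
e <- as.Date("2013-04-30")
e - s                          # difference between two dates, in days
as.numeric(e - s)              # the same difference as a plain number
format(s, "%d.%m.%Y")          # day.month.year

# seq() also accepts a step given as a unit of time
weekly <- seq(s, by = "week", length.out = 3)
print(weekly)
```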
Graphics - Introduction
Creating plots, charts and visual presentation of results
> Distance <- ds$dist #ds[,1]
> Speed <- ds$speed #ds[,2]
> plot(Distance,Speed)
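A sketch of a complete scatter plot. The data set behind ds is not shown in the slides, so the built-in cars data set (columns speed and dist) is used here as a stand-in. The axis labels, title and trend line are illustrative additions.

```r
# 'cars' ships with R; it stands in for the slides' 'ds' (an assumption)
ds <- cars
Distance <- ds$dist
Speed <- ds$speed

pdf(NULL)    # discard the drawing; remove this line to see the plot on screen
plot(Distance, Speed,
     xlab = "Distance", ylab = "Speed",
     main = "Speed vs. distance")
# Add a smoothed trend line on top of the points
lines(lowess(Distance, Speed), col = "red")
invisible(dev.off())
```

plot() opens a new figure, while lines() draws onto the figure that is already open.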
# output is a list
# output is a vector
                  Love             Statistics
      "Love is great!" "Statistics is great!"
vectors
lst <- lapply(matrix, function)
# output is list
vec <- sapply(matrix, function)
# output is vector
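The difference between the two return types can be seen in a small sketch. The input vector words and the helper greet are illustrative.

```r
words <- c("Love", "Statistics")
greet <- function(w) paste(w, "is great!")   # illustrative helper

lst <- lapply(words, greet)   # output is a list
vec <- sapply(words, greet)   # output is a vector, named after the inputs
print(lst)
print(vec)
```

sapply simplifies the result where possible, which is convenient when every call returns a single value.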
apply can also be used on data frames in the same way as on
matrices, provided that all columns of the data frame have
the same mode (data type)
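A sketch of apply on a matrix and on a data frame with numeric columns only. The second argument selects rows (1) or columns (2). All names and values are illustrative.

```r
A <- matrix(1:6, 3, 2)
rowTotals <- apply(A, 1, sum)     # 1 = apply the function to each row
colAverages <- apply(A, 2, mean)  # 2 = apply the function to each column

# Data frame whose columns all share the same mode (numeric)
df <- data.frame(a = c(1, 2, 3), b = c(10, 20, 30))
colMax <- apply(df, 2, max)       # result is named after the columns

print(rowTotals)
print(colAverages)
print(colMax)
```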