UNIT 2

The document provides a comprehensive guide on reading and writing files in R, including CSV, Excel, and text files, as well as input and output statements. It covers programming concepts such as condition statements, loops, functions, exception handling, and performance optimization techniques. Additionally, it discusses the use of packages for profiling and benchmarking in R.

READING FILES IN R

*CSV files-read.csv()
data<-read.csv("data.csv")

*Excel files-readxl package
library(readxl)
data<-read_excel("data.xlsx")

*Other delimited text files-read.delim()
data<-read.delim("datasetfile.text",header=FALSE)
data
*Other helpers-file.choose() to pick a file interactively; read_tsv() and read_file() from the readr package
Writing files in R
*CSV-write.csv()
data<-data.frame(Name=c("Meera","Vani"),Age=c(25,25))
write.csv(data,"output.csv",row.names=FALSE)
*Excel-xlsx package
write.xlsx(data,file="result.xlsx",sheetName="mydata",append=FALSE)
*Text-.txt
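A minimal sketch of a write-then-read round trip, assuming a temporary file so nothing real is overwritten. The data frame is the one used above; writeLines() and readLines() are base R functions that the notes do not mention and are shown here only as one possible way to handle plain .txt files.

```r
# Same small data frame as in the notes.
data <- data.frame(Name = c("Meera", "Vani"), Age = c(25, 25))

# tempfile() gives a throwaway path (an assumption made for this example).
csv_path <- tempfile(fileext = ".csv")
write.csv(data, csv_path, row.names = FALSE)  # row.names = FALSE drops the index column
back <- read.csv(csv_path)                    # read the same file back
print(back)

# Plain text: one element of the character vector per line.
txt_path <- tempfile(fileext = ".txt")
writeLines(c("first line", "second line"), txt_path)
lines <- readLines(txt_path)
print(lines)
```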
Programming with input and output statements
readline()-reads input as a string; convert it to the desired data type
Conversion functions
*n=readline()
*as.integer(n)
*as.numeric(n)
*as.complex(n)
*as.character(n)
*as.Date(n)
example
> n<-readline()
555
> n
[1] "555"
> n=as.integer(n)
> n
[1] 555
scan()
reads values from the console (or a file) into a vector
list=scan()
list
Reading file using scan()
sf=scan("fileString.txt",what="")
df=scan("fileDouble.txt",what=double())
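Console input cannot be shown in a script, so the sketch below assumes the text= argument of scan() to feed it an in-memory string instead of keyboard or file input. The string "555" stands in for what readline() would return; both stand-ins are assumptions made only for illustration.

```r
# scan() returns numeric values by default; quiet = TRUE hides the "Read N items" note.
nums <- scan(text = "10 20 30", quiet = TRUE)

# what = "" asks scan() for character values instead.
words <- scan(text = "red green blue", what = "", quiet = TRUE)

# readline() always returns a string, so convert before doing arithmetic.
n <- "555"               # stands in for n <- readline()
n_int <- as.integer(n)
total <- n_int + sum(nums)
print(total)             # 555 + 60 = 615
```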
OUTPUT STATEMENTS IN R

print()
cat()
message()
sprintf()
write.table()
> str="Hello Hi"
> num=123
> dec=34.4
> print(str)
[1] "Hello Hi"
> sprintf("%s is a string",str)
[1] "Hello Hi is a string"
> sprintf("%d is a number",num)
[1] "123 is a number"
> sprintf("%f is a float value",dec)
[1] "34.400000 is a float value"
> x<-42
> y<-"Hello"
> print(x)
[1] 42
> cat("the value of x is",x,"and y is",y,"\n")
the value of x is 42 and y is Hello
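The output functions listed above differ mainly in where the text goes and how it is decorated. The following is a small sketch with made-up values (x and name are illustrative only): print() shows the [1] index, cat() writes bare text, sprintf() only builds a string, and message() writes to the error stream.

```r
x <- 3.14159
name <- "pi"

print(x)                                   # value with its [1] index
cat("value of", name, "is", x, "\n")       # plain text, no index, no quotes

# sprintf() returns the formatted string; it prints nothing by itself.
line <- sprintf("%s rounded to 2 places is %.2f", name, x)
cat(line, "\n")

message("diagnostic text goes to stderr")  # separate channel from cat()/print()
```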
CONDITION STATEMENTS

*If statement
*if-else statement
*if -else if -else statement
*switch statement
if statement
executes the statement if condition is true
Syntax- if(condition)
{
-------code------
}

> x<-10
> if(x>5)
+{
+ cat("x is greater than 5\n")
+}
x is greater than 5
if-else statement
else block gets executed if the condition is false
else must follow the closing brace on the same line when typed at the console
syntax- if(condition)
{
---code---
}else{
---code---
}

> x<-3
> if(x>5)
+{
+ cat("x is greater than 5\n")
+ }else{
+ cat("x is not greater than 5\n")
+}
x is not greater than 5
if-else if-else statement
tests multiple conditions in order
syntax-if(condition1){
code---
}else if(condition2){
code---
}else{
code
}
> x<-8
> if(x>10){
+ cat("x is greater than 10 \n")
+ }else if(x>5){
+ cat("x is greater than 5 but not greater than 10\n")
+ }else{
+ cat("x is not greater than 5 \n")
+}
x is greater than 5 but not greater than 10
switch statement
multiway branch statement; compares an expression against several values
the default case is the last argument, written without a name
syntax-
switch(expression,
value1={
code---
},
value2={
code---
},
{
default code--
}
)
> day<-"sunday"
> switch(day,
+ "monday"=cat("2nd day of the week \n"),
+ "saturday"=cat("weekend \n"),
+ cat("regular day \n")
+ )
regular day
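A short sketch of switch() on a character value, wrapped in a hypothetical helper day_type() that is not part of the original notes. It assumes two documented behaviours of switch(): a case with an empty value falls through to the next one, and a final unnamed argument serves as the default.

```r
# day_type() is an illustrative helper, not a built-in function.
day_type <- function(day) {
  switch(day,
    "saturday" = ,                        # empty value: fall through to "sunday"
    "sunday"   = "weekend",
    "monday"   = "start of the work week",
    "regular day"                         # unnamed last argument is the default
  )
}

print(day_type("saturday"))  # "weekend"
print(day_type("monday"))    # "start of the work week"
print(day_type("friday"))    # "regular day"
```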
LOOPS
Repeats a block of code multiple times (iteration) until a condition is met.

*for loop
*while loop
*repeat loop
for loop
entry-controlled loop: the condition is tested first and then the body is executed

syntax-
for(var in vector)
{
statements;
}
(or)
for(var in seq){
code---
}
for loop 1 to 5
> for(x in 1:5)
+{
+ print(x^2)
+}
[1] 1
[1] 4
[1] 9
[1] 16
[1] 25
for loop over a vector built by concatenation c()
> for(x in c(-5,6,9,15,20))
+{
+ print(x)
+}
[1] -5
[1] 6
[1] 9
[1] 15
[1] 20
nested for loop
> for(x in 1:4)
+{
+ for(y in 1:x)
+{
+ print(x*y)
+}
+}
[1] 1
[1] 2
[1] 4
[1] 3
[1] 6
[1] 9
[1] 4
[1] 8
[1] 12
[1] 16
for loop for list
> for(x in list("INDIA","US","UK"))
+{
+ print(x)
+}
[1] "INDIA"
[1] "US"
[1] "UK"
for loop with break statement
> for(x in c(2,4,6,8,10,12))
+{
+ if(x==0)
+{
+ break
+}
+ print(x)
+}
[1] 2
[1] 4
[1] 6
[1] 8
[1] 10
[1] 12
for loop with next statement
> for(x in c(2,4,6,8,10,12))
+{
+ if(x==0)
+{
+ next
+}
+ print(x)
+}
[1] 2
[1] 4
[1] 6
[1] 8
[1] 10
[1] 12
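In the two loops above the test x==0 never matches, so break and next have no visible effect. The sketch below changes the test to x == 8 (an assumption made only to show the difference) and collects the values that get through.

```r
values <- c(2, 4, 6, 8, 10, 12)

# break: leave the loop entirely as soon as the condition is met.
until_eight <- c()
for (x in values) {
  if (x == 8) break
  until_eight <- c(until_eight, x)
}

# next: skip only the current iteration and carry on with the rest.
skip_eight <- c()
for (x in values) {
  if (x == 8) next
  skip_eight <- c(skip_eight, x)
}

print(until_eight)  # 2 4 6
print(skip_eight)   # 2 4 6 10 12
```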
for loop to create multiple histogram plots
> mat<-matrix(rnorm(50),ncol=2)
> par(mfrow=c(2,2))
> for(i in 1:2){
+ hist(mat[,i],main=paste("column",i),xlab="values",col="red")
+ }
while loop
used when the number of iterations is not known in advance
syntax
while(condition){
statements
}

> x<-1
> while(x<6)
+{
+ print(x)
+ x=x+1
+}
[1] 1
[1] 2
[1] 3
[1] 4
[1] 5
repeat loop
executes until a break statement is reached

no built-in exit condition


syntax
repeat{
statements
if(condition){
break
}
}
> count<-1
> repeat{
+ cat("count",count, "\n")
+ count<-count+1
+ if(count>5){
+ break
+}
+}
count 1
count 2
count 3
count 4
count 5
> data<-c("banglore")
> x<-1
> repeat{
+ print(data)
+ x<-x+1
+ if(x>5)
+{
+ break
+}
+}
[1] "banglore"
[1] "banglore"
[1] "banglore"
[1] "banglore"
[1] "banglore"
FUNCTIONS
*Perform task multiple times
*blocks of code can be called multiple times
defining a function

function()

syntax
function_name<-function(arg1,arg2...)
{ code---
return(result)
}
calling a function
> add<-function(a,b){
+ result<-a+b
+ return(result)
+}
> result<-add(3,5)
> cat("the result",result,"\n")
the result 8
Built-in functions
*sum()
*max()
*min()
*seq()
*mean()
> print(sum(5:10))
[1] 45
> print(max(5:10))
[1] 10
> print(min(5:10))
[1] 5
> print(seq(5,10))
[1] 5 6 7 8 9 10
> print(mean(5:10))
[1] 7.5
user defined functions

function arguments

named and unnamed arguments

res<-add(a=5,b=3)
res1<-add(5,3)
function with default values
power<-function(base,exponent=2){
res<-base^exponent
return(res)
}

example
res1<-power(2) # exponent defaults to 2, result 4
res2<-power(2,3) # exponent given as 3, result 8
function with single parameter
> area_square<-function(side){
+ area=side*side
+ return(area)
+}
> print(area_square(8))
[1] 64
multiple parameter
> si=function(p,r,t)
+{
+ res=(p*r*t)/100
+ return(res)
+}
> print(si(1000,10,2))
[1] 200
without parameter and without return

> fun=function()
+{
+ print("hello")
+}
> print(fun())
[1] "hello"
[1] "hello"
"hello" appears twice: fun() prints it once, and print() then prints the value that fun() returns
variable scoping
lexical scoping-variables created inside a function are local by default
<<- assigns to a variable in an enclosing environment (the global environment if none is found)
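A minimal sketch of the difference between local assignment and <<-, using a made-up counter variable and two illustrative functions (add_local and add_global are not from the notes).

```r
counter <- 0                  # variable defined outside the functions

add_local <- function() {
  counter <- counter + 1      # creates a local copy; the outer counter is untouched
  counter
}

add_global <- function() {
  counter <<- counter + 1     # updates the counter found in the enclosing environment
  counter
}

add_local()
after_local <- counter        # still 0
add_global()
after_global <- counter       # now 1
print(c(after_local, after_global))
```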

Inline functions
fun=function(x)x^2*5+x/3
print(fun(5))
print(fun(-4))
function()
lambda or anonymous functions
syntax
function(arg1,arg2,....){
code
}
sapply()
numbers<-c(1,2,3,4,5)
square<-sapply(numbers,function(x)x^2)
cat("squared numbers:",square,"\n")
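Anonymous functions are mostly useful as arguments to other functions. The sketch below repeats the sapply() call and adds Filter() and Reduce(), two base R functions not mentioned in the notes, purely as further illustrations of the same idea.

```r
numbers <- c(1, 2, 3, 4, 5)

# Apply an unnamed function to every element.
squares <- sapply(numbers, function(x) x^2)

# Keep only the elements for which the unnamed predicate is TRUE.
evens <- Filter(function(x) x %% 2 == 0, numbers)

# Combine all elements pairwise with an unnamed two-argument function.
total <- Reduce(function(a, b) a + b, numbers)

cat("squared numbers:", squares, "\n")
cat("even numbers:", evens, "\n")
cat("total:", total, "\n")
```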
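Exceptions
*runtime errors caused by conditions that arise during execution
*e.g. division by zero, file not found, index out of range (in R, numeric division by zero returns Inf rather than an error)
*exception handling is the mechanism for dealing with runtime errors or exceptional situations that may occur during execution of a program
*tryCatch()
*try()
*withCallingHandlers()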
try()
*Helps to continue with program execution even when an error occurs
*an expression that may raise an error is wrapped in try()
> x<-5
> x<6
[1] TRUE
> x>6
[1] FALSE
> try(x>7)
[1] FALSE
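The comparison above never fails, so try() has nothing to catch. A sketch with an expression that does raise an error, assuming log() of a character string as the failing call and silent = TRUE to suppress the printed error message:

```r
# log("a") raises "non-numeric argument to mathematical function".
result <- try(log("a"), silent = TRUE)

# On failure try() returns an object of class "try-error" instead of stopping.
if (inherits(result, "try-error")) {
  cat("the call failed, but the script keeps running\n")
  result <- NA
}

after <- "still running"
print(after)
```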
tryCatch()
Primary tool for handling conditions; controls what happens based on the condition raised
Main conditions-errors,warnings
structure
res<-tryCatch({
code...
},error=function(err){
code---
},finally={
code--
})
example
res<-tryCatch({
x<-5/"a" # raises an error; 5/0 would return Inf without one
},error=function(err){
cat("error",conditionMessage(err),"\n")
},finally={
cat("the operation is complete \n")
})
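A sketch of tryCatch() with separate warning and error handlers, wrapped in a hypothetical helper safe_log() that is not part of the notes. It assumes log() as the risky call: a negative number gives a warning, a character string gives an error, and the finally block runs in every case.

```r
# safe_log() is an illustrative wrapper, not a base R function.
safe_log <- function(x) {
  tryCatch({
    log(x)
  }, warning = function(w) {
    cat("warning:", conditionMessage(w), "\n")
    NA                       # value returned when a warning is caught
  }, error = function(e) {
    cat("error:", conditionMessage(e), "\n")
    NULL                     # value returned when an error is caught
  }, finally = {
    cat("finished log of", format(x), "\n")
  })
}

ok   <- safe_log(1)      # 0, no handler runs
warn <- safe_log(-1)     # NA, the warning handler runs
err  <- safe_log("a")    # NULL, the error handler runs
```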
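withCallingHandlers()
alternative to tryCatch(): establishes local (calling) handlers, whereas tryCatch() establishes exiting handlers
check<-function(expression){
on.exit(message("completed")) # withCallingHandlers() has no finally argument
withCallingHandlers(expression,
warning=function(w){
message("warning\n",conditionMessage(w))
},
error=function(e){
message("error\n",conditionMessage(e))
}
)
}
check({10/2})
check({10/0}) # returns Inf; no warning or error is signalled
check({10/'abc'}) # the handler runs, then the error still propagates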
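Because a calling handler runs while the failing code is still active, it can record a warning and let evaluation continue. The sketch below assumes a hypothetical helper log_warnings() and uses invokeRestart("muffleWarning") to stop the warning from being reported again after it is recorded.

```r
# log_warnings() is an illustrative helper, not a base R function.
log_warnings <- function(expr) {
  seen <- character(0)
  value <- withCallingHandlers(
    expr,
    warning = function(w) {
      seen <<- c(seen, conditionMessage(w))  # record the warning text
      invokeRestart("muffleWarning")         # resume evaluation of expr
    }
  )
  list(value = value, warnings = seen)
}

# as.integer("x") yields NA with a coercion warning; the result is still returned.
out <- log_warnings(as.integer(c("1", "x", "3")))
print(out$value)
print(out$warnings)
```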
Exception classes
every condition object has a class, e.g. simpleError, simpleWarning; check it with class() or inherits()

custom exception-carries more information


custerror<-function(message){
structure(
list(message=message,call=sys.call(-1)),
class=c("custerror","error","condition") # must inherit from "error" so stop() and error= handlers accept it
)}
tryCatch({
stop(custerror("custom message"))
},error=function(err){
cat("custom error msg",conditionMessage(err),"\n")
})
TIMINGS
Timing functions

Sys.time()-current date and time; subtract two readings to get elapsed time


system.time()-times one expression; returns an object with
user time, system time and elapsed time
proc.time()-CPU and elapsed time used by the R session so far; subtract two readings to time a sequence of expressions or a function call
> system.time(print(5>1))
[1] TRUE
user system elapsed
0 0 0
> start<-Sys.time()
> print(10>=5)
[1] TRUE
> Sys.time()-start
Time difference of 29.94285 secs
> start2<-proc.time()
> print(10>=5)
[1] TRUE
> proc.time()-start2
user system elapsed
0.05 0.08 28.75
> start2<-proc.time()
> si=function(p,r,t)
+{
+ res=(p*r*t)/100
+ return(res)
+}
> print(si(1000,10,2))
[1] 200
> proc.time()-start2
user system elapsed
0.11 0.21 50.95
> proc.time()-start2
user system elapsed
0.14 0.25 70.47
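When timings are taken by hand at the console, the elapsed value most likely includes the time spent typing between the two readings. The sketch below times a single call inside a script instead, using a made-up function slow_sum() as the workload; actual timings vary by machine, so none are shown.

```r
# slow_sum() is an illustrative workload, not part of the notes.
slow_sum <- function(n) {
  total <- 0
  for (i in 1:n) total <- total + i
  total
}

# system.time() wraps one expression; the assignment inside keeps the result.
timing <- system.time(result <- slow_sum(1e6))
print(timing["elapsed"])

# Sys.time() needs two readings taken around the code being measured.
start <- Sys.time()
invisible(slow_sum(1e6))
elapsed <- difftime(Sys.time(), start, units = "secs")
print(elapsed)
```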
OPTIMIZING TIMING
*To improve speed and efficiency
*Reduce execution time

*Vectorized operations-use vectorized functions instead of loops


*Avoid global variables-pass function arguments instead of using global variables
*Efficient data structures-data frames,matrix
*Memory management-data.table and ff for large data; use rm() and gc() to free memory
*Parallel processing-parallel package,foreach package
*Optimize functions-measure with profvis,microbenchmark,bench; remove duplicate calculations,simplify logic
*Avoid unnecessary data copies-data.table::set() modifies in place
*Use efficient packages-data.table
*Caching and memoization-reuse intermediate results
*Profiling-Rprof,profvis,system.time()
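A small sketch of the first point, vectorized operations versus an explicit loop, timed with system.time(). The function names and the vector size are assumptions chosen for illustration; the timings depend on the machine, so only the equality of the results is checked.

```r
x <- as.numeric(1:1e5)

# Loop version: preallocate the result instead of growing it element by element.
loop_squares <- function(v) {
  out <- numeric(length(v))
  for (i in seq_along(v)) out[i] <- v[i]^2
  out
}

# Vectorized version: one call operates on the whole vector.
vec_squares <- function(v) v^2

t_loop <- system.time(a <- loop_squares(x))
t_vec  <- system.time(b <- vec_squares(x))

same <- isTRUE(all.equal(a, b))   # both approaches give the same values
print(same)
print(rbind(loop = t_loop[1:3], vectorized = t_vec[1:3]))
```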
PACKAGES
Repositories of R programming packages

CRAN-Comprehensive R Archive Network


*network of FTP and web servers
Bioconductor-topic-specific repository
open source software for bioinformatics
GitHub-open source projects,unlimited space
Installing packages
install.packages("package name")

Update,check installed packages

check- installed.packages()
update-update.packages()

loading packages in R

library(dplyr)
require(dplyr)
Profvis package
Profiling R,visualizing code execution

> library(profvis)
> profvis({
+ for(i in 1:1000000){
+ sqrt(i)
+}
+})

Microbenchmark package
measuring execution time
Benchmarking with bench package
detailed information
Benchmarking with rbenchmark package
compares execution time of multiple functions
