R Manual
R is usually used together with RStudio, a popular free IDE for R. The
steps to set it up on your computer are:
2. Getting Started:
Follow these steps to set up and run a basic R script in RStudio.
Set the working directory of RStudio to the folder where you want to
store your code and data files; it also acts as the root for any relative
paths you supply. Do this by going to Session > Set Working Directory >
Choose Directory. (Verify the directory by running getwd() in the console
within the IDE.)
Create a script window by selecting File > New File > R Script. From here
on, write all commands in this script file rather than directly in the
console, so that the code can be saved and reused. Some useful shortcuts
are:
a. Ctrl + Enter: Run the current line or the selected fragment of code
(Ctrl + Alt + R runs the complete script)
b. Ctrl + S/O: Save/Open a file
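The working directory can also be set from code rather than from the menu. The sketch below is only an illustration: the folder name r_manual_demo and the file demo.csv are hypothetical, and a temporary folder is used so that nothing on your machine is overwritten.

```r
# Hypothetical project folder inside R's temporary directory.
project_dir <- file.path(tempdir(), "r_manual_demo")
dir.create(project_dir, showWarnings = FALSE)

# setwd() changes the working directory and returns the previous one.
old_dir <- setwd(project_dir)
print(getwd())

# Relative paths are now resolved against the working directory.
writeLines("id,value", "demo.csv")
print(file.exists("demo.csv"))

# Restore the previous working directory.
setwd(old_dir)
```

In your own project, replace project_dir with the path of the folder you chose in the Session menu.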
3. Basic syntax
Depending on your needs, you can either type commands at the R command
prompt or write your program in an R script file. For example,
at the R prompt the program looks as follows:
> myString <- "Hello, World!"
> print(myString)
[1] "Hello, World!"
In an R script file, the same program is written without the prompt:
myString <- "Hello, World!"
print(myString)
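A script file can also be run from the console with source(). The following is a minimal sketch, assuming a hypothetical script named hello.R that is written to a temporary folder for the purpose of the demonstration.

```r
# Write a two-line script to a temporary file (hello.R is a hypothetical name).
script_path <- file.path(tempdir(), "hello.R")
writeLines(c('myString <- "Hello, World!"',
             'print(myString)'),
           script_path)

# source() runs every command in the file; capture.output() collects
# what the script prints so that it can be inspected afterwards.
out <- capture.output(source(script_path))
print(out)
```

In RStudio you would normally just open the script and run it with the shortcuts; source() is useful when one script needs to run another.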
4. Data Types
Elements in R belong to different classes, for example,
Logical, Numeric, Integer, Complex, Character, and Raw.
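The class of any value can be checked with class(). The short sketch below creates one value of each class listed above; the variable names are chosen only for illustration.

```r
# One value of each basic class.
v_logical   <- TRUE
v_numeric   <- 23.5
v_integer   <- 2L               # the L suffix marks an integer
v_complex   <- 2 + 5i
v_character <- "TRUE"           # quoted, so this is text, not a logical
v_raw       <- charToRaw("Hello")

# class() reports the class of each value.
classes <- c(class(v_logical), class(v_numeric), class(v_integer),
             class(v_complex), class(v_character), class(v_raw))
print(classes)
```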
Vectors: Use the c() function, which combines elements into a vector.
For example,
# Create a vector.
apple <- c('red', 'green', 'yellow')
print(apple)
Matrices
# Create a 2 x 3 matrix, filled row by row.
M <- matrix(c('a','a','b','c','b','a'), nrow = 2, ncol = 3, byrow = TRUE)
print(M)
Arrays
# Create a 3 x 3 x 2 array; the two values are recycled to fill all cells.
a <- array(c('green','yellow'), dim = c(3,3,2))
print(a)
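Once created, these objects are accessed with square brackets, using one index per dimension. The sketch below repeats the three objects from above so that it runs on its own, and then reads single elements from each.

```r
apple <- c('red', 'green', 'yellow')
M <- matrix(c('a','a','b','c','b','a'), nrow = 2, ncol = 3, byrow = TRUE)
a <- array(c('green','yellow'), dim = c(3,3,2))

# Vectors take one index (indices start at 1).
second_colour <- apple[2]

# Matrices take a row index and a column index.
m_row2_col3 <- M[2, 3]

# Arrays take one index per dimension; dim() reports the sizes.
array_dims <- dim(a)
a_cell <- a[1, 2, 1]

print(second_colour)
print(m_row2_col3)
print(array_dims)
print(a_cell)
```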
5. R Package:
Check available R packages: get the library locations that contain R
packages
.libPaths()
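Beyond the library locations, it is often useful to see which packages are installed and which are currently loaded. A minimal sketch using base R functions only (the stats package is used as the example because it ships with R):

```r
# Folders in which R looks for installed packages.
lib_dirs <- .libPaths()
print(lib_dirs)

# Names of all installed packages.
pkgs <- rownames(installed.packages())
print(head(pkgs))

# Check for one package and load it.
has_stats <- "stats" %in% pkgs
library(stats)

# search() lists the packages and environments currently attached.
print(search())
```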
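# Example
# Assumes `data` is a data frame with `salary` and `dept` columns that was
# loaded earlier (for example with read.csv()).
# Get the max value (salary) from the data frame.
sal <- max(data$salary)
print(sal)
# Get the details of the person with the max salary.
retval <- subset(data, salary == max(salary))
print(retval)
# Get everyone in the IT department.
retval <- subset(data, dept == "IT")
print(retval)
# Get the people in IT whose salary is greater than 600.
info <- subset(data, salary > 600 & dept == "IT")
print(info)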
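To try the queries above without an input file, a small data frame can be built by hand. The names, salaries, and departments below are invented purely for illustration; only the column names salary and dept come from the example above.

```r
# Hypothetical employee data (values invented for this sketch).
data <- data.frame(
  name   = c("Asha", "Ben", "Chen", "Dana"),
  salary = c(620, 515, 843, 590),
  dept   = c("IT", "HR", "IT", "IT"),
  stringsAsFactors = FALSE
)

# Highest salary in the data frame.
max_salary <- max(data$salary)

# Row(s) belonging to the person with the highest salary.
top_earner <- subset(data, salary == max(salary))

# Everyone in IT, and everyone in IT earning more than 600.
it_staff     <- subset(data, dept == "IT")
well_paid_it <- subset(data, salary > 600 & dept == "IT")

print(max_salary)
print(top_earner)
print(it_staff)
print(well_paid_it)
```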
II. Excel file (file name: input.xlsx): First install the xlsx package
o install.packages("xlsx")
o Verify that the package is installed:
any(grepl("xlsx", installed.packages()))
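Once the package is installed, the file can be read with read.xlsx(). The sketch below assumes that input.xlsx sits in the working directory and that the data is on the first sheet; it checks both the package and the file first, so it does nothing harmful if either is missing. Note that the xlsx package depends on Java being available on your machine.

```r
# File name taken from the text; assumed to be in the working directory.
excel_path <- "input.xlsx"
excel_data <- NULL

if (requireNamespace("xlsx", quietly = TRUE) && file.exists(excel_path)) {
  # Read the first sheet into a data frame (sheetIndex = 1 is an assumption).
  excel_data <- xlsx::read.xlsx(excel_path, sheetIndex = 1)
  print(head(excel_data))
} else {
  message("Install the xlsx package and place input.xlsx in the working directory first.")
}
```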
o trim is used in mean() to drop a fraction of the observations from each
end of the sorted vector before the mean is computed.
o na.rm is used to remove the missing values (NA) from the input vector.
o Example:
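# Create a vector.
x <- c(12, 7, 3, 4.2, 18, 2, 54, -21, 8, -5)
# Find the mean.
result.mean <- mean(x)
print(result.mean)
# Find the trimmed mean: with trim = 0.3, the 3 smallest and 3 largest
# of the 10 sorted values are dropped.
result.mean <- mean(x, trim = 0.3)
print(result.mean)
# Create a vector that contains a missing value.
x <- c(12, 7, 3, 4.2, 18, 2, 54, -21, 8, -5, NA)
# Find the mean; the result is NA because na.rm is not set.
result.mean <- mean(x)
print(result.mean)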
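The effect of na.rm and trim can be checked by hand. The sketch below reuses the vector from the example, removes the missing value with na.rm = TRUE, and reproduces the trimmed mean manually; the variable names are chosen only for illustration.

```r
x <- c(12, 7, 3, 4.2, 18, 2, 54, -21, 8, -5, NA)

# Without na.rm the missing value propagates and the result is NA.
mean_with_na <- mean(x)

# With na.rm = TRUE the NA is dropped before averaging.
mean_no_na <- mean(x, na.rm = TRUE)

# Trimmed mean: drop 30% of the values (3 of 10) from each end.
mean_trimmed <- mean(x, trim = 0.3, na.rm = TRUE)

# The same trimmed mean computed by hand from the sorted values.
sorted_x    <- sort(x)              # sort() drops NA by default
manual_trim <- mean(sorted_x[4:7])  # keeps 3, 4.2, 7, 8

print(mean_with_na)
print(mean_no_na)
print(mean_trimmed)
print(manual_trim)
```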
8. R- Distributions
Normal distribution: R has four built-in functions for working with the
normal distribution. They are described below, where x is a vector of
numbers, p is a vector of probabilities, and n is the number of
observations (sample size):
dnorm(x, mean, sd): height of the probability density function at each
point for a given mean and standard deviation.
pnorm(x, mean, sd): probability that a normally distributed random
number is less than or equal to a given value. This is also called the
"Cumulative Distribution Function".
qnorm(p, mean, sd): takes a probability value and returns the number
whose cumulative probability matches it (the inverse of pnorm).
rnorm(n, mean, sd): generates random numbers whose distribution is
normal.
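The four functions can be tried on the standard normal distribution (mean 0, standard deviation 1), where the results are easy to check. In the sketch below, the seed value and the mean and standard deviation passed to rnorm() are arbitrary choices made for illustration.

```r
# Density at the mean of the standard normal: 1 / sqrt(2 * pi).
density_at_mean <- dnorm(0, mean = 0, sd = 1)

# Half of the probability mass lies below the mean.
prob_below_mean <- pnorm(0, mean = 0, sd = 1)

# qnorm() inverts pnorm(): the value below which 50% of the mass lies.
median_value <- qnorm(0.5, mean = 0, sd = 1)

# Round trip: qnorm(pnorm(z)) returns z.
round_trip <- qnorm(pnorm(1.5))

# Five random draws; set.seed() makes them reproducible.
set.seed(42)
draws <- rnorm(5, mean = 50, sd = 10)

print(density_at_mean)
print(prob_below_mean)
print(median_value)
print(round_trip)
print(draws)
```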
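Example: linear regression of weight (y) on height (x)
x <- c(151, 174, 138, 186, 128, 136, 179, 163, 152, 131) # value of height
y <- c(63, 81, 56, 91, 47, 57, 76, 72, 62, 48) # value of weight
# Fit the linear model of weight on height (assumed: lm(y ~ x), which
# matches the predictor name x used in predict() below).
relation <- lm(y ~ x)
print(relation)
print(summary(relation))
# Find weight of a person with height 170.
a <- data.frame(x = 170)
result <- predict(relation, a)
print(result)
# Give the chart file a name.
png(file = "linearregression.png")
# Plot the data with the fitted line (assumed plotting step), then close
# the file so that the chart is written to disk.
plot(x, y, xlab = "Height", ylab = "Weight")
abline(relation)
dev.off()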
Similarly, you can build programs for multiple regression and logistic
regression.
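As a rough illustration of both, the sketch below uses the mtcars data set that ships with R. The choice of data set, of the predictor columns, and of the hypothetical car in new_car are assumptions made for this example, not part of the manual.

```r
# Multiple regression: model fuel economy (mpg) from three predictors.
multi_model <- lm(mpg ~ disp + hp + wt, data = mtcars)
print(coef(multi_model))

# Logistic regression: model the transmission type (am is 0 or 1)
# with glm() and the binomial family.
logit_model <- glm(am ~ cyl + hp + wt, data = mtcars, family = binomial)
print(summary(logit_model))

# Predicted probability that a hypothetical car has am = 1.
new_car <- data.frame(cyl = 4, hp = 100, wt = 2.5)
prob_manual <- predict(logit_model, new_car, type = "response")
print(prob_manual)
```

The only structural change from simple linear regression is the formula: more predictors are joined with +, and for logistic regression lm() is replaced by glm() with family = binomial.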
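10. Distribution fit: Install the “fitdistrplus” package
library(fitdistrplus)
# Earthquake interevent time
Data <- c(5,6,9,10,11,13,14,23,38,49,50,65,66,66,76,90,105,109,132,175)
# Fit the four candidate distributions (assumed fitdist() calls, in the
# same order as the legend text below).
A <- fitdist(Data, "weibull")
B <- fitdist(Data, "lnorm")
C <- fitdist(Data, "gamma")
D <- fitdist(Data, "exp")
a <- c("Weibull", "lognormal", "gamma", "exponential")
denscomp(list(A,B,C,D), legendtext = a)
cdfcomp(list(A,B,C,D), legendtext = a)
qqcomp(list(A,B,C,D), legendtext = a)
qqcomp(C)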