R-Workshop: Training Program On R Programming Basic Concepts
R-Workshop
Avinash Mishra
(https://cran.r-project.org/bin/windows/base/)
(https://rstudio.com/products/rstudio/download/)
Introduction to R programming
The R programming language takes its name from the first letter of the first names of its two authors,
Robert Gentleman and Ross Ihaka. It is mainly used for data analysis and visualization. Its syntax is simple
enough to be learnt without prior programming knowledge. It is free of cost and easy to install on your
computer. R has strong graphics capabilities, which make data visualization easy. It has an active user
community and is among the fastest-growing programming languages. Because many researchers and
statisticians use it, new ideas and techniques often appear in the R community first. It has packages for a
wide range of problems. R was originally used for academic purposes and is now used in industry as well.
R can also be used for machine learning, where it is best suited to exploration and to building one-off
models. R is machine-independent and cross-platform, so it runs on many different operating systems.
RStudio
It is an integrated development environment that lets us interact with R more easily. The first time we
open RStudio, we see three panes; a fourth pane is hidden by default. We can open this hidden pane by
clicking the File drop-down menu, then New File and then R Script.
Change Directory
getwd()
## [1] "/home/gdt"
# setwd("Your Path")
Variables
Variables are used to store information. A valid variable name consists of letters, numbers, dots and
underscore characters. It must start with a letter, or with a dot that is not followed by a number.
Ex: var_name is valid, but -2var_name is invalid.
The assignment operator is used to assign a value to a variable.
Example:
1 of 21 29/07/21, 20:52
R-Workshop file:///home/gdt/Decode-Workshop-R.html
x=3
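The naming rules and assignment described above can be tried out with a short sketch. The names var_name, .total and fixed are illustrative only and not part of the workshop; make.names() is a base R function that converts a string into a syntactically valid name.

```r
# Assignment with <- (the conventional operator) and with =
var_name <- 3
x = 3

# A name may start with a dot as long as no digit follows the dot
.total <- var_name + x
print(.total)

# make.names() repairs an invalid name by prefixing it with "X"
fixed <- make.names("2var_name")
print(fixed)   # "X2var_name"
```

Both operators assign at the top level; <- is the form most R style guides prefer.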
Data Structure
R's basic data structures are the vector, list, matrix, data frame and factor.
Example:
x=c(1, 2, 3, 4, 5, 6)
Vector
new <- c("red", "green", "blue")
You can also create a vector using the : operator.
# Sequence of numbers from 1 to 10
1:10
## [1] 1 2 3 4 5 6 7 8 9 10
Alternatively, you can create a vector using the seq() function. It works like the : operator, except that you
can specify a different increment (step size).
seq(from=1,to=10,by=2)
## [1] 1 3 5 7 9
Naming a Vector
Each element of a vector can have a name. It allows you to access individual elements by names. You can
give a name to the vector element with the names() function.
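A minimal sketch of naming, assuming a made-up numeric vector; the values and the names first, second and third are illustrative and do not come from the workshop.

```r
# Create a vector and attach a name to each element
v <- c(10, 20, 30)
names(v) <- c("first", "second", "third")
print(v)

# Access an element by its name instead of its position
print(v["second"])

# names() without assignment returns the current names
print(names(v))
```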
Extracting a value from a vector: You can do this using square brackets []. Note that vector
positions start from 1.
v <- c("a","b","c","d","e","f")
v[3]
## [1] "c"
You can also select multiple elements at once by using a vector of indexes.
v[2:5]          # elements at positions 2 to 5
v[c(1,3,5,6)]   # elements at positions 1, 3, 5 and 6
v[-1]           # every element except the first
Modify Vector Elements: Use [] to access the element, and simply assign a new value.
v = c("a","b","c","d","e","f")
v
v[3] = "k"
v
# Add an element to the end of a vector
v = c(1,2,3)
v <- c(v,4)
# Combine several vectors into one
v1 <- c(1, 2, 3)
v2 <- c(4, 5, 6)
v3 <- c(7, 8, 9)
z <- c(v1, v2, v3)
# Vectors used in the arithmetic examples that follow
v1 <- c(11,12,13,14,15)
v2 <- c(1,2,3,4,5)
Addition
v1 + v2
## [1] 12 14 16 18 20
Subtraction
v1 - v2
## [1] 10 10 10 10 10
Multiplication
v1 * v2
## [1] 11 24 39 56 75
Division
v1 / v2
To find the square root or the natural logarithm of each element, use the sqrt() or log() function.
v <- c(1,2,3,4,5)
sqrt(v)
log(v)
For mean, median, standard deviation and variance, use the respective functions:
v <- c(1,2,3,4,5)
mean
mean(v)
## [1] 3
median
median(v)
## [1] 3
standard deviation
sd(v)
## [1] 1.581139
variance
var(v)
## [1] 2.5
Find a Vector Length: To find the total number of elements in a vector, use the length() function.
v <- c(1,2,3,4,5)
length(v)
## [1] 5
List:
A list can contain a mixture of data types. We can create a list with the list() function.
Create a list:
# A list of integers
# A list of characters
We can inspect the contents of a list using the structure function str(), which shows the internal structure
of the list.
## List of 4
## $ : num 1
## $ : chr "abc"
## $ : num 1.23
## $ : logi TRUE
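The code that produced this output is not preserved in the handout. A minimal sketch that would give the same structure, assuming the list is called lst (the name used in the extraction examples that follow):

```r
# A list mixing numeric, character and logical values
lst <- list(1, "abc", 1.23, TRUE)

# str() prints the type and value of every element
str(lst)
```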
Extract list elements by position: This is done using square brackets [].
lst[2]
## [[1]]
## [1] "abc"
lst[c(1,3,5)]
## [[1]]
## [1] 1
##
## [[2]]
## [1] 1.23
##
## [[3]]
## NULL
lst[c(-1,-3,-5)]
## [[1]]
## [1] "abc"
##
## [[2]]
## [1] TRUE
Add Elements to a List: You can use the same method for modifying elements and for adding new ones. If
the element is already present in the list, it is updated; otherwise, a new element is added to the list.
## List of 4
## $ : num 1
## $ : num 2
## $ : num 3
## $ : num 4
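The original code for this step is missing. One way to arrive at the four-element list shown above, assuming a starting list of three numbers named lst, is sketched here:

```r
lst <- list(1, 2, 3)

# Position 3 already exists, so this assignment updates it
lst[[3]] <- 3

# Position 4 does not exist yet, so this assignment adds a new element
lst[[4]] <- 4

str(lst)
```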
Using the append() function, you can append one or more elements to a list.
## List of 6
## $ : num 1
## $ : num 2
## $ : num 3
## $ : chr "a"
## $ : chr "b"
## $ : chr "c"
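A sketch that reproduces the six-element structure above, assuming the starting list held the numbers 1 to 3. The second call uses the after argument of append() with a made-up value "x" to show insertion at a chosen position; lst2 is an illustrative name.

```r
lst <- list(1, 2, 3)

# Append three character elements to the end of the list
lst <- append(lst, list("a", "b", "c"))
str(lst)

# 'after' inserts the new element after the given position
lst2 <- append(list(1, 2, 3), list("x"), after = 1)
str(lst2)
```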
Remove an Element from a List: To remove a list element, select it by position or by name, and then
assign NULL to it.
## List of 4
## $ : chr "a"
## $ : chr "b"
## $ : chr "d"
## $ : chr "e"
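The removal code itself is not preserved. Assuming the list started as the five letters "a" to "e", the output above would result from dropping the third element. The named list at the end is a made-up example of removal by name.

```r
lst <- list("a", "b", "c", "d", "e")

# Remove the third element by position
lst[[3]] <- NULL
str(lst)

# Remove an element by name (illustrative list)
named <- list(x = 1, y = 2)
named$y <- NULL
str(named)
```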
Combine Lists: The c() function does more than create vectors. It can also be used to combine lists into
a new list.
## List of 6
## $ : chr "a"
## $ : chr "b"
## $ : chr "c"
## $ : num 1
## $ : num 2
## $ : num 3
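A sketch consistent with the combined list shown above; the names lst1 and lst2 are assumptions, since the original code is missing.

```r
lst1 <- list("a", "b", "c")
lst2 <- list(1, 2, 3)

# c() concatenates the two lists into one list of six elements
lst <- c(lst1, lst2)
str(lst)

# length() also works on lists
print(length(lst))
```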
## [1] 4
Matrix:
A matrix is a collection of elements, all of the same type, arranged in a two-dimensional layout.
Create a Matrix: You can create a matrix using the matrix() function, specifying the data and the number
of rows and columns.
## [1] 2 3
## [,1] [,2]
## [1,] 1 4
## [2,] 2 5
## [3,] 3 6
## c1 c2 c3
## r1 1 3 5
## r2 2 4 6
m[2,]
## [1] 2 5 8
m[,1]
## [1] 1 2 3
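Most of the matrix code is missing from the handout. The sketch below is one set of calls that would produce the outputs shown in this section, assuming m is a 3 x 3 matrix of the numbers 1 to 9; m32 and m_named are illustrative names.

```r
# matrix() fills column by column by default
m32 <- matrix(1:6, nrow = 3, ncol = 2)
print(m32)

# Row and column names can be supplied through dimnames
m_named <- matrix(1:6, nrow = 2,
                  dimnames = list(c("r1", "r2"), c("c1", "c2", "c3")))
print(m_named)

# A 3 x 3 matrix for the indexing examples
m <- matrix(1:9, nrow = 3, ncol = 3)
print(dim(m))   # number of rows and columns
print(m[2, ])   # second row: 2 5 8
print(m[, 1])   # first column: 1 2 3
```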
Dataframe:
The easiest way to think of a data frame is as an Excel worksheet. Unlike a matrix, a data frame can hold a
different data type in each column, so you can store numeric data, character data, and so on. It is a table
made of rows and columns.
Create a dataframe: Ex: Let’s create a data frame to store employee records with the following information:
Max 26 Chicago
Sam 23 Seattle
3 rows
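The data.frame() call is not preserved, and Bob's age and city are missing from the surviving output. The sketch below therefore leaves those two values as NA rather than guessing; the column names name, age and city are taken from the later examples (df[1], df$age).

```r
# Employee records; Bob's age and city are unknown in the handout, so NA is used
df <- data.frame(
  name = c("Bob", "Max", "Sam"),
  age  = c(NA, 26, 23),
  city = c(NA, "Chicago", "Seattle"),
  stringsAsFactors = FALSE
)
print(df)

# Single brackets with one index return a one-column data frame
print(df[1])

# $ returns the column itself as a vector
print(df$name)
```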
df[1]
name
<chr>
Bob
Max
Sam
3 rows
Add New Rows and Columns to Data Frame: You can add new columns to a data frame using the cbind()
function and new rows using the rbind() function. For example, to add a gender column to the existing
data frame:
Max 26 Chicago M
Sam 23 Seattle F
3 rows
Max 26 Chicago
Sam 23 Seattle
4 rows
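A sketch of both operations under stated assumptions: Bob's values are unknown in the handout and kept as NA, and the new row (Amy, 30, Boston) is entirely hypothetical, since the row added in the workshop is not preserved.

```r
df <- data.frame(
  name = c("Bob", "Max", "Sam"),
  age  = c(NA, 26, 23),
  city = c(NA, "Chicago", "Seattle"),
  stringsAsFactors = FALSE
)

# cbind() adds a column; Bob's gender is not preserved, so NA is used
df_gender <- cbind(df, gender = c(NA, "M", "F"))
print(df_gender)

# rbind() adds a row; the new row must have the same column names
new_row <- data.frame(name = "Amy", age = 30, city = "Boston",
                      stringsAsFactors = FALSE)   # hypothetical record
df_more <- rbind(df, new_row)
print(df_more)
```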
Sorting dataframe: Ex: Let’s sort the data frame by age (By default, sorting is ascending in R)
df[order(df$age),]
3 Sam 23 Seattle
2 Max 26 Chicago
3 rows
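For a descending sort, order() accepts a decreasing argument. This sketch reuses the employee data with Bob's unknown values as NA; by default order() places NA values last in either direction.

```r
df <- data.frame(
  name = c("Bob", "Max", "Sam"),
  age  = c(NA, 26, 23),
  city = c(NA, "Chicago", "Seattle"),
  stringsAsFactors = FALSE
)

# Ascending by age (the default)
asc <- df[order(df$age), ]
print(asc)

# Descending by age
desc <- df[order(df$age, decreasing = TRUE), ]
print(desc)
```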
Reading a file
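library(readr)      # provides read_csv()
read_csv("CSV File")
library(readxl)     # provides read_excel()
read_excel("XLSX File")
library(openxlsx)   # provides write.xlsx()
write.xlsx(mval, file = "file.xlsx", row.names = FALSE, sep = ",", overwrite = TRUE)
View(dataframe)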
PLOTS
Install & Import Libraries
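The install and import commands are not preserved in the handout. The sketch below assumes the plotting sections need the ggplot2, gapminder and dplyr packages, since those are the ones used in the later code. It checks which are installed and attaches only those, so it runs even on a machine where some are missing.

```r
# Run once if the packages are not installed (needs an internet connection):
# install.packages(c("ggplot2", "gapminder", "dplyr"))

pkgs <- c("ggplot2", "gapminder", "dplyr")

# requireNamespace() reports whether a package is installed without attaching it
installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
print(installed)

# Attach the packages that are available
for (p in pkgs[installed]) {
  library(p, character.only = TRUE)
}
```

In an interactive session, plain library(ggplot2), library(gapminder) and library(dplyr) calls do the same job once the packages are installed.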
GGPlot syntax
Part 1 : Setup GGPLOT
ggplot(gapminder)
ggplot(gapminder, aes(x=year))
ggplot(gapminder) +
geom_point(aes(x=year, y=lifeExp, color=continent))
ggplot(gapminder) +
geom_point(aes(x=year, y=lifeExp, color=continent)) +
geom_smooth(aes(x=year, y=lifeExp, color=continent))
Part 5 : Faceting
ggplot(gapminder, aes(x=year, y=lifeExp)) +
geom_point()+
labs(title = "LIFE EXPECTANCY", x="YEAR", y="LIFE AGE EXPECTANCY")+
facet_wrap(~continent)
Bar Plot
Filter Data for India only
library(dplyr)
##
## Attaching package: 'dplyr'
India=gapminder%>%
filter(country=="India")
head(India)
6 rows
Box Plot
THANKS