R-Workshop: Training Program On R Programming Basic Concepts

R is a programming language used for data analysis and visualization, and it can be used for machine learning as well. To use R, you need to install R and RStudio. R has various data structures, such as vectors, lists, matrices, and data frames. Vectors are the basic data structure in R and store elements of the same type. Common operations on vectors include extracting, modifying, and adding elements, and performing arithmetic. Lists are more flexible than vectors because they can contain different data types; elements can be extracted from and added to lists.

Uploaded by

adepoju kola

R-Workshop file:///home/gdt/Decode-Workshop-R.html

R-Workshop
Avinash Mishra

Training program on R programming basic concepts
You have to install two things, i.e., R and RStudio. Download links:

R: https://cran.r-project.org/bin/windows/base/

RStudio: https://rstudio.com/products/rstudio/download/

Introduction to R programming
The language is named R after the first letter of the first names of its two authors, Robert
Gentleman and Ross Ihaka. It is mainly used for data analysis and visualization. Its syntax is simple, so
it can be learnt without prior programming knowledge. It is free of cost and easy to install on your
computer. R has strong graphics capabilities, which make data visualization easy. It has an active
community and is one of the fastest growing programming languages. Because many researchers and
statisticians use it, new ideas and technologies often appear in the R community first. It has many
packages for solving all sorts of problems. R was originally used for academic purposes; it is now used in
industry as well. R can also be used for machine learning, where it is best suited to exploration and to
building one-off models. R is machine-independent and cross-platform, so it can be used on many
different operating systems.

RStudio
RStudio is an integrated development environment that lets us interact with R more easily. The first time
we open RStudio, we see three windows. The fourth window is hidden by default. We can open it by
clicking the File drop-down menu, then New File and then R Script.

Change Directory

getwd()

## [1] "/home/gdt"

# setwd("Your Path")

Variables
Variables are used to store information. A valid variable name consists of letters, numbers, dots and
underscores. It must start with a letter, or with a dot that is not followed by a number. Ex: var_name is
valid, but 2var_name is invalid. The assignment operator (<- or =) assigns a value to a variable.
Example:

1 of 21 29/07/21, 20:52

x=3
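The naming rules above can be tried out with a short sketch. The variable names here are made up for illustration; make.names() is a base R helper that shows how R would repair an invalid name.

```r
# Valid names: start with a letter, or with a dot not followed by a number
x <- 3
my_var.2 <- "text"
.hidden <- TRUE

# Both <- and = assign at the top level; <- is the more common style
y = x + 1
y
## [1] 4

# make.names() turns an invalid name into a valid one by prefixing "X"
fixed <- make.names("2var_name")
fixed
## [1] "X2var_name"
```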

Data Structure
R's main data structures are the vector, list, matrix, data frame and factor.

Vector: A vector is a collection of elements, all of the same type.


When using R, you will frequently encounter the four basic vector types:
logical, character, integer and double (decimal numbers; integer and double are together called numeric).
Create a vector: We can create a vector using the c() function.

Example:

x=c(1, 2, 3, 4, 5, 6)

Vector
new <- c("red", "green", "blue")
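To make the four basic types concrete, the following sketch checks them with typeof() and shows what happens when types are mixed in one vector. The variable name mixed is only illustrative.

```r
typeof(TRUE)
## [1] "logical"
typeof("red")
## [1] "character"
typeof(1L)          # the L suffix makes an integer
## [1] "integer"
typeof(c(1, 2, 3))  # plain numbers are stored as double
## [1] "double"

# A vector holds one type only, so mixed input is coerced (here to character)
mixed <- c(1, "a", TRUE)
mixed
## [1] "1" "a" "TRUE"
```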

You can also create a vector using the : operator. Sequence of numbers from 1 to 10:

1:10

## [1] 1 2 3 4 5 6 7 8 9 10

Or you can create a vector using the seq() function. It works the same as the : operator, except that you
can specify a different increment (step size).

seq(from=1,to=10,by=2)

## [1] 1 3 5 7 9
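A few further variations, given here as a sketch rather than part of the original workshop: seq() can also take the number of points instead of a step, and both : and seq() can count downwards.

```r
# Ask for 5 evenly spaced values instead of giving a step size
s1 <- seq(0, 1, length.out = 5)
s1
## [1] 0.00 0.25 0.50 0.75 1.00

# The : operator counts down when the first number is larger
s2 <- 10:6
s2
## [1] 10 9 8 7 6

# A negative step counts down with seq()
s3 <- seq(10, 1, by = -3)
s3
## [1] 10 7 4 1
```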

Naming a Vector

Each element of a vector can have a name. It allows you to access individual elements by names. You can
give a name to the vector element with the names() function.

v = c("Apple", "Banana", "Cherry")


names(v) = c("A", "B", "C")

Extracting a value from a vector: You can do this using square brackets []. Note that vector
positions start from 1.

v <- c("a","b","c","d","e","f")

Select 3rd element

v[3]


## [1] "c"

You can also select multiple elements at once by using a vector of indexes. Select elements from index 2
to 5

v[2:5]

## [1] "b" "c" "d" "e"

Select the 1st, 3rd, 5th and 6th elements

v[c(1,3,5,6)]

## [1] "a" "c" "e" "f"

Omit the first element

v[-1]

## [1] "b" "c" "d" "e" "f"
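Besides positions, square brackets also accept a logical condition. This is not covered in the original text; the sketch below uses a made-up numeric vector n to show the idea.

```r
n <- c(5, 12, 8, 20)

# A comparison returns one TRUE/FALSE per element
n > 8
## [1] FALSE TRUE FALSE TRUE

# Using it inside [] keeps only the elements where it is TRUE
big <- n[n > 8]
big
## [1] 12 20

# which() gives the matching positions instead of the values
pos <- which(n > 8)
pos
## [1] 2 4
```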

Extract element by name

v = c("A"="Apple", "B"="Banana", "C"="Cherry")
v["A"]

## A
## "Apple"

Modify Vector Elements: You use the [] to access the element, and simply assign a new value.

v = c("a","b","c","d","e","f")
v

## [1] "a" "b" "c" "d" "e" "f"

v[3] = "k"
v

## [1] "a" "b" "k" "d" "e" "f"

Add Elements to a Vector: This is done using c() function.

v = c(1,2,3)

Add a single value to v

v <- c(v,4)

Combine Multiple Vectors:


Combine three vectors


v1 <- c(1, 2, 3)
v2 <- c(4, 5, 6)
v3 <- c(7, 8, 9)
z <-c(v1, v2, v3)

Vector arithmetic operations:

v1 <- c(11,12,13,14,15)
v2 <- c(1,2,3,4,5)

Addition

v1 + v2

## [1] 12 14 16 18 20

Subtraction

v1 - v2

## [1] 10 10 10 10 10

Multiplication

v1 * v2

## [1] 11 24 39 56 75

Division

v1 / v2

## [1] 11.000000 6.000000 4.333333 3.500000 3.000000
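The operations above work element by element on vectors of equal length. As a hedged addition, the sketch below shows what R does when the lengths differ: the shorter vector is recycled. The names a and b are illustrative only.

```r
a <- c(1, 2, 3, 4)
b <- c(10, 20)

# A single number is applied to every element
a * 2
## [1] 2 4 6 8

# b is recycled as 10, 20, 10, 20 to match the length of a
a + b
## [1] 11 22 13 24
```

If the longer length is not a multiple of the shorter one, R still recycles but issues a warning.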

To compute the square root or the natural logarithm of each element, use the sqrt() or log() function.

v <- c(1,2,3,4,5)

sqrt(v)

## [1] 1.000000 1.414214 1.732051 2.000000 2.236068

log(v)

## [1] 0.0000000 0.6931472 1.0986123 1.3862944 1.6094379

For mean, median, standard deviation and variance, use the respective functions:

v <- c(1,2,3,4,5)

mean


mean(v)

## [1] 3

median

median(v)

## [1] 3

standard deviation

sd(v)

## [1] 1.581139

variance

var(v)

## [1] 2.5
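To see where these numbers come from, the sketch below recomputes the variance by hand. R's var() and sd() use the sample formulas, which divide by n - 1; the name manual_var is made up for this example.

```r
v <- c(1, 2, 3, 4, 5)
n <- length(v)

# Sample variance: squared deviations from the mean, divided by n - 1
manual_var <- sum((v - mean(v))^2) / (n - 1)
manual_var
## [1] 2.5

# The standard deviation is the square root of the variance
sqrt(manual_var)
## [1] 1.581139

# Both agree with the built-in functions
all.equal(manual_var, var(v))
## [1] TRUE
```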

Find a Vector Length: To find the total number of elements in a vector, use the length() function.

v <- c(1,2,3,4,5)
length(v)

## [1] 5

List:
A list can contain a mixture of data types. We can create a list with the list() function.
Create a list:
A list of numbers

lst <- list(1, 2, 3)

A list of characters

lst <- list("red", "green", "blue")

A list of mixed datatypes

lst <- list(1, "abc", 1.23, TRUE)

We can inspect the contents of a list using the structure function str(). It shows the internal structure of the list.


lst <- list(1, "abc", 1.23, TRUE)


str(lst)

## List of 4
## $ : num 1
## $ : chr "abc"
## $ : num 1.23
## $ : logi TRUE

Extract list elements by position: This is done using square brackets [].

lst <- list(1, "abc", 1.23, TRUE, 1:3)

extract 2nd element

lst[2]

## [[1]]
## [1] "abc"

Select the 1st, 3rd and 5th elements

lst[c(1,3,5)]

## [[1]]
## [1] 1
##
## [[2]]
## [1] 1.23
##
## [[3]]
## [1] 1 2 3

exclude 1st, 3rd and 5th element

lst[c(-1,-3,-5)]

## [[1]]
## [1] "abc"
##
## [[2]]
## [1] TRUE
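Note that single brackets always return a list, even for one element. As a brief sketch beyond the original examples, double brackets [[ ]] return the element itself, and named elements can be reached with $. The list person is made up for illustration.

```r
lst <- list(1, "abc", 1.23, TRUE, 1:3)

# [ ] returns a (smaller) list
class(lst[2])
## [1] "list"

# [[ ]] returns the element itself
lst[[2]]
## [1] "abc"

# The result can be indexed further: 2nd value of the 5th element
lst[[5]][2]
## [1] 2

# Named elements can be extracted with $ or [[ ]]
person <- list(name = "Bob", age = 25)
person$name
## [1] "Bob"
person[["age"]]
## [1] 25
```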

Add Elements to a List: You can use the same method for modifying elements and for adding new ones. If
the element is already present in the list, it is updated; otherwise, a new element is added to the list.

lst <- list(1, 2, 3)


lst[[4]] <- 4
str(lst)


## List of 4
## $ : num 1
## $ : num 2
## $ : num 3
## $ : num 4

Using the append() function, you can append one or more elements to the list.

lst <- list(1, 2, 3)


lst <- append(lst,c("a","b","c"))
str(lst)

## List of 6
## $ : num 1
## $ : num 2
## $ : num 3
## $ : chr "a"
## $ : chr "b"
## $ : chr "c"

Remove an Element from a List: To remove a list element, select it by position or by name, and then
assign NULL to it.

lst <- list("a","b","c","d","e")


lst[[3]] <- NULL
str(lst)

## List of 4
## $ : chr "a"
## $ : chr "b"
## $ : chr "d"
## $ : chr "e"

Combine Lists: The c() function does a lot more than just create vectors. It can be used to combine lists
into a new list as well.

lst1 <- list("a","b","c")


lst2 <- list(1,2,3)
lst <- c(lst1, lst2)
str(lst)

## List of 6
## $ : chr "a"
## $ : chr "b"
## $ : chr "c"
## $ : num 1
## $ : num 2
## $ : num 3

To find the length of a list, use the length() function.


lst <- list(5, 10, 15, 20)


length(lst)

## [1] 4

Matrix:
A matrix is a collection of elements, all the same type, arranged in a two-dimensional layout.

Create a Matrix: You can create a matrix using the matrix() function, specifying the data and the number
of rows and columns.

m <- matrix(1:6, nrow=2, ncol=3)

Create a character matrix

letters <- c("a","b","c","d","e","f")


m <- matrix(letters, nrow=2, ncol=3)
m

##      [,1] [,2] [,3]
## [1,] "a"  "c"  "e"
## [2,] "b"  "d"  "f"
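As the output above shows, matrix() fills the matrix column by column by default. A minimal sketch of the alternative, using the byrow argument of matrix() (the variable name m_row is illustrative):

```r
# Fill row by row instead of column by column
m_row <- matrix(1:6, nrow = 2, ncol = 3, byrow = TRUE)
m_row
##      [,1] [,2] [,3]
## [1,]    1    2    3
## [2,]    4    5    6

# t() transposes a matrix, swapping rows and columns
dim(t(m_row))
## [1] 3 2
```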

Print the dimension of a matrix:

m <- matrix(1:6, nrow=2, ncol=3)


dim(m)

## [1] 2 3

Make it a 3x2 matrix

dim(m) <- c(3,2)


m

## [,1] [,2]
## [1,] 1 4
## [2,] 2 5
## [3,] 3 6

Naming matrix rows and columns:

m <- matrix(1:6, nrow=2, ncol=3)


rownames(m) <- c("r1","r2")
colnames(m) <- c("c1","c2","c3")
m


## c1 c2 c3
## r1 1 3 5
## r2 2 4 6

Return the elements at the specified positions:

m <- matrix(1:9, nrow=3, ncol=3)


m

##      [,1] [,2] [,3]
## [1,]    1    4    7
## [2,]    2    5    8
## [3,]    3    6    9

Extract only 2nd row elements

m[2,]

## [1] 2 5 8

Extract only 1st column elements

m[,1]

## [1] 1 2 3
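The same [row, column] notation also selects single cells and sub-matrices. The following sketch extends the examples above; the drop = FALSE argument is a standard option of [ that keeps the result as a matrix.

```r
m <- matrix(1:9, nrow = 3, ncol = 3)

# Single element: row 2, column 3
m[2, 3]
## [1] 8

# Sub-matrix: rows 1 to 2, columns 1 and 3
sub <- m[1:2, c(1, 3)]
sub
##      [,1] [,2]
## [1,]    1    7
## [2,]    2    8

# By default a single row is returned as a plain vector;
# drop = FALSE keeps it as a 1 x 3 matrix
dim(m[2, , drop = FALSE])
## [1] 1 3
```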

Dataframe:
The easiest way to think of a data frame is as an Excel worksheet: a table of rows and columns. Unlike a
matrix, its columns may have different data types, so one column can hold numeric data and another
character data.

Create a dataframe: Let's create a data frame to store employee records with the following information:

name <- c("Bob", "Max", "Sam")


age <- c(25,26,23)
city <- c("New York", "Chicago", "Seattle")
df <- data.frame(name, age, city)
df

name  age   city
<chr> <dbl> <chr>
Bob   25    New York
Max   26    Chicago
Sam   23    Seattle

3 rows

Extract specific column: Extract 1st column


df[1]

name
<chr>
Bob
Max
Sam

3 rows
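Note that df[1] returns a one-column data frame. As a sketch of other common ways to extract data (not shown in the original), $ and [[ ]] return the column as a vector, and a logical condition selects rows. The data frame is rebuilt here so the block runs on its own; stringsAsFactors = FALSE is set explicitly so text stays character in older R versions too.

```r
df <- data.frame(name = c("Bob", "Max", "Sam"),
                 age = c(25, 26, 23),
                 city = c("New York", "Chicago", "Seattle"),
                 stringsAsFactors = FALSE)

# A column as a plain vector
df$age
## [1] 25 26 23
df[["city"]]
## [1] "New York" "Chicago" "Seattle"

# A single cell: row 2, column "city"
df[2, "city"]
## [1] "Chicago"

# Rows that satisfy a condition (all columns kept)
older <- df[df$age > 24, ]
older$name
## [1] "Bob" "Max"
```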

Add New Rows and Columns to a Data Frame: You can add new columns to a data frame using the cbind()
function. For example, to add a gender column to the existing data frame:

gender <- factor(c("M", "M", "F"))


cbind(df, gender)

name  age   city     gender
<chr> <dbl> <chr>    <fct>
Bob   25    New York M
Max   26    Chicago  M
Sam   23    Seattle  F

3 rows
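One detail worth spelling out: cbind() returns a new data frame and leaves df unchanged unless the result is assigned. The sketch below, which rebuilds df so it runs on its own, shows this and an alternative using $.

```r
df <- data.frame(name = c("Bob", "Max", "Sam"),
                 age = c(25, 26, 23),
                 city = c("New York", "Chicago", "Seattle"),
                 stringsAsFactors = FALSE)
gender <- factor(c("M", "M", "F"))

# cbind() returns a new data frame; df itself still has 3 columns
ncol(cbind(df, gender))
## [1] 4
ncol(df)
## [1] 3

# Either reassign the result, or add the column directly with $
df$gender <- gender
names(df)
## [1] "name" "age" "city" "gender"
```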

To add new rows (observations) to a data frame, use rbind() function.

row <- data.frame(name = "Sam", age = 22, city = "New York")


rbind(df, row)

name  age   city
<chr> <dbl> <chr>
Bob   25    New York
Max   26    Chicago
Sam   23    Seattle
Sam   22    New York

4 rows

Sorting a dataframe: Let's sort the data frame by age (order() sorts in ascending order by default).

df[order(df$age),]

  name  age   city
  <chr> <dbl> <chr>
3 Sam   23    Seattle
1 Bob   25    New York
2 Max   26    Chicago

3 rows
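For completeness, here is a sketch of sorting in descending order and by more than one column. It uses the decreasing argument of order() and a minus sign on a numeric column; the four-row data frame df4 is made up to match the rbind() example above.

```r
df4 <- data.frame(name = c("Bob", "Max", "Sam", "Sam"),
                  age = c(25, 26, 23, 22),
                  city = c("New York", "Chicago", "Seattle", "New York"),
                  stringsAsFactors = FALSE)

# Descending by age
by_age_desc <- df4[order(df4$age, decreasing = TRUE), ]
by_age_desc$age
## [1] 26 25 23 22

# By city (ascending), then by age (descending) within each city
by_city_age <- df4[order(df4$city, -df4$age), ]
by_city_age$age
## [1] 26 25 22 23
```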

Reading a file

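library(readr)     # provides read_csv()
library(readxl)    # provides read_excel()
library(openxlsx)  # provides write.xlsx()

mval <- read_csv("CSV File")      # replace "CSV File" with the path to your file
mval <- read_excel("XLSX File")   # replace "XLSX File" with the path to your file
write.xlsx(mval, file = "file.xlsx", rowNames = FALSE, overwrite = TRUE)
View(mval)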
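If the packages above are not installed, CSV files can also be handled with base R alone. The following is a minimal sketch using write.csv() and read.csv() with a temporary file, so it does not touch any real data; the data frame and variable names are illustrative.

```r
df <- data.frame(name = c("Bob", "Max"), age = c(25, 26))

# Write to a temporary CSV file (row.names = FALSE avoids an extra index column)
path <- tempfile(fileext = ".csv")
write.csv(df, path, row.names = FALSE)

# Read it back into a new data frame and inspect it
df_back <- read.csv(path)
str(df_back)

# Clean up the temporary file
unlink(path)
```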

PLOTS
Install & Import Libraries
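The original gives no code for this step. As a sketch, and assuming the plots below use the ggplot2 package, the gapminder data package and (for the bar plot) dplyr, the following checks which of them are available. The install and library lines are left commented out so the check itself runs without network access.

```r
# Packages assumed to be needed by the plotting examples
needed <- c("ggplot2", "gapminder", "dplyr")

# TRUE for each package that is already installed
installed <- vapply(needed, requireNamespace, logical(1), quietly = TRUE)
installed

# Install whatever is missing (uncomment to run; needs an internet connection)
# install.packages(needed[!installed])

# Load the packages before running the plots
# library(ggplot2)
# library(gapminder)
# library(dplyr)
```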

GGPlot syntax
Part 1 : Setup GGPLOT

ggplot(gapminder)

ggplot(gapminder, aes(x=year))


ggplot(gapminder, aes(x=year, y=lifeExp ))


ggplot(gapminder, aes(x=year, y=lifeExp, color=continent))

Part 2 : Add layers

ggplot(gapminder, aes(x=year, y=lifeExp, color=continent)) +


geom_point()


ggplot(gapminder) +
geom_point(aes(x=year, y=lifeExp, color=continent))


ggplot(gapminder) +
geom_point(aes(x=year, y=lifeExp, color=continent)) +
geom_smooth(aes(x=year, y=lifeExp, color=continent))

## `geom_smooth()` using method = 'loess' and formula 'y ~ x'


Part 3 : Add Labels


ggplot(gapminder, aes(x=year, y=lifeExp, color=continent)) +
geom_point()+
labs(title = "LIFE EXPECTANCY", x="YEAR", y="LIFE AGE EXPECTANCY")


Part 4 : Set the Theme


ggplot(gapminder, aes(x=year, y=lifeExp, color=continent)) +
geom_point()+
labs(title = "LIFE EXPECTANCY", x="YEAR", y="LIFE AGE EXPECTANCY")+
theme(plot.title=element_text(size=20, face="bold"),
axis.text.x=element_text(size=10),
axis.text.y=element_text(size=10),
axis.title.x=element_text(size=15),
axis.title.y=element_text(size=15))


Part 5 : Faceting
ggplot(gapminder, aes(x=year, y=lifeExp)) +
geom_point()+
labs(title = "LIFE EXPECTANCY", x="YEAR", y="LIFE AGE EXPECTANCY")+
facet_wrap(~continent)


Bar Plot
Filter Data for India only

library(dplyr)

##
## Attaching package: 'dplyr'

## The following objects are masked from 'package:stats':
##
##     filter, lag

## The following objects are masked from 'package:base':
##
##     intersect, setdiff, setequal, union

India=gapminder%>%
filter(country=="India")
head(India)

country continent year  lifeExp pop       gdpPercap
<fct>   <fct>     <int> <dbl>   <int>     <dbl>
India   Asia      1952  37.373  372000000 546.5657
India   Asia      1957  40.249  409000000 590.0620
India   Asia      1962  43.605  454000000 658.3472
India   Asia      1967  47.193  506000000 700.7706
India   Asia      1972  50.651  567000000 724.0325
India   Asia      1977  54.208  634000000 813.3373

6 rows

ggplot(India, aes(x=year, y=lifeExp)) +


geom_bar(stat="identity",position="dodge")+
labs(title = "INDIAN LIFE EXPECTANCY", x="YEAR", y="LIFE AGE EXPECTANCY")+
theme(plot.title=element_text(size=20, face="bold"),
axis.text.x=element_text(size=10),
axis.text.y=element_text(size=10),
axis.title.x=element_text(size=15),
axis.title.y=element_text(size=15))

Box Plot


ggplot(gapminder, aes(x=continent, y=gdpPercap)) +


geom_boxplot()+
labs(title = "GDP FLOAT DATA", x="Continents", y="GDP")+
theme(plot.title=element_text(size=20, face="bold"),
axis.text.x=element_text(size=10),
axis.text.y=element_text(size=10),
axis.title.x=element_text(size=15),
axis.title.y=element_text(size=15))

THANKS
