Introduction to R Programming
A Session
By
Vaibhav Kumar
Dept. of CSE
DIT University, Dehradun
Vaibhav Kumar, DIT University, Dehradun
R
• R is a programming language and software environment for statistical
analysis, graphics representation and reporting.
• R was created by Ross Ihaka and Robert Gentleman at the University
of Auckland, New Zealand.
• R is freely available.
• It is named R after the first letter of the first names of its two
authors (Robert Gentleman and Ross Ihaka).
Features of R
• R is a well-developed, simple and effective programming language
which includes conditionals, loops, user defined recursive functions
and input and output facilities.
• R has an effective data handling and storage facility.
• R provides a suite of operators for calculations on arrays, lists, vectors
and matrices.
• R provides a large, coherent and integrated collection of tools for data
analysis.
• R provides graphical facilities for data analysis and display, either
directly on screen or printed on paper.
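As a minimal sketch of the language features listed above (a conditional, a loop and a user-defined recursive function), the following may help. The function name fact is an illustrative choice, not something from the slides.

```r
# Recursive factorial: the conditional separates the base case from the recursion
fact = function(n) {
  if (n <= 1) {
    return(1)
  }
  n * fact(n - 1)
}

# A for loop that prints the factorials of 1 to 5
for (i in 1:5) {
  print(fact(i))
}
```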
A Simple Example
• A simple program to write "Hello" can be written in R as:
>print("Hello")
• To add two numbers, a program can be written as:
>print(2+3)
(R is case sensitive, so the function must be written as print, not Print.)
The first program can also be written as:
>message="Hello"
>print(message)
Data Types and Objects in R
• In many programming languages, we must declare the data type of each
variable, that is, which type of data the variable will store. In R, a
variable is not declared with a type; it takes the type of the object
assigned to it.
• Some popularly used data types in R are: Logical, Numeric, Integer,
Complex, Character, Raw.
• Some frequently used objects in R are: Vectors, Lists, Matrices, Arrays,
Factors, Data Frames.
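A short sketch of the six data types named above, using class() to report the type of each value. The variable names are illustrative only.

```r
# One value of each basic type; class() reports the type of the stored object
v_logical   = TRUE
v_numeric   = 23.5
v_integer   = 2L                  # the L suffix marks an integer literal
v_complex   = 2+5i
v_character = "Hello"
v_raw       = charToRaw("Hello")  # the bytes of the string

print(class(v_logical))
print(class(v_numeric))
print(class(v_integer))
print(class(v_complex))
print(class(v_character))
print(class(v_raw))
```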
Vectors
• The function c() is used to combine elements into a vector.
Example:
fruits=c("Apple", "Orange", "Banana")
print(fruits)
• When we execute the above code, we will get the following output:
[1] "Apple" "Orange" "Banana"
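Once a vector exists, its elements can be counted, indexed and used in arithmetic. The sketch below reuses the fruits vector from the slide; nums is an illustrative vector added here.

```r
fruits = c("Apple", "Orange", "Banana")
print(length(fruits))   # number of elements
print(fruits[2])        # indexing starts at 1, so this is the second element

nums = c(2, 3, 5)
print(nums * 2)         # arithmetic is applied element by element
```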
Lists
• A list is an R object which can contain elements of many different types,
such as vectors, functions and even other lists.
Example:
list1=list(c("Apple", "Orange", "Banana"), c(2, 3, 5), 14.5)
print(list1)
When we execute the above code, we will get the following output:
[[1]]
[1] "Apple"  "Orange" "Banana"

[[2]]
[1] 2 3 5

[[3]]
[1] 14.5
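Elements of a list are usually extracted with double brackets. A small sketch using the same list1 as the slide:

```r
list1 = list(c("Apple", "Orange", "Banana"), c(2, 3, 5), 14.5)

print(list1[[1]])        # double brackets return the element itself (a vector)
print(list1[[2]][3])     # third value of the second element
print(class(list1[2]))   # single brackets return a list holding that element
```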
Matrices
• A matrix in R can be created by passing a vector to the matrix()
function. By default, the matrix is filled column by column.
Example:
M=matrix(c(1, 2, 3, 4, 5, 6, 7, 8, 9), ncol=3, nrow=3)
print(M)
When we execute the above code, we will get the following output:
     [,1] [,2] [,3]
[1,]    1    4    7
[2,]    2    5    8
[3,]    3    6    9
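The output above is filled column by column. As a sketch of the alternative, the byrow argument of matrix() fills row by row instead. N is an illustrative name added here.

```r
M = matrix(c(1, 2, 3, 4, 5, 6, 7, 8, 9), nrow=3, ncol=3)
N = matrix(c(1, 2, 3, 4, 5, 6, 7, 8, 9), nrow=3, ncol=3, byrow=TRUE)

print(N)         # rows are 1 2 3, 4 5 6, 7 8 9
print(dim(M))    # number of rows and columns
print(M[2, 3])   # element in row 2, column 3
```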
Data Frames
• Data frames are tabular data objects.
• Unlike in a matrix, each column of a data frame can contain a different mode of data.
• Data Frames are created using the data.frame() function.
Example:
>BMI=data.frame(
Name=c("Vaibhav", "Nitin", "Aakash"),
Height=c(170, 169, 175),
Weight=c(80, 75, 78),
Age=c(30, 30, 29))
>print(BMI)
When we run the above code, we will get the following output:
     Name Height Weight Age
1 Vaibhav    170     80  30
2   Nitin    169     75  30
3  Aakash    175     78  29
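A data frame can be inspected, extended and filtered after it is created. The sketch below adds a derived column named Index (a hypothetical name) and assumes that Height is in centimetres and Weight in kilograms, which the slide does not state.

```r
BMI = data.frame(
  Name=c("Vaibhav", "Nitin", "Aakash"),
  Height=c(170, 169, 175),
  Weight=c(80, 75, 78),
  Age=c(30, 30, 29))

str(BMI)   # shows the type of each column

# Hypothetical derived column: weight / height^2, assuming cm and kg
BMI$Index = BMI$Weight / (BMI$Height / 100)^2

# Keep only the rows where Age is 30
print(BMI[BMI$Age == 30, ])
```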
R-Excel File
• Microsoft Excel is the most widely used spreadsheet program which
stores data in the .xls or .xlsx format.
• R can read directly from these files using Excel-specific packages.
• We have to run the following code to install and load a package that
lets R access Excel files:
install.packages("xlsx")
library("xlsx")
(Note: a Java environment must be installed before running this code)
Reading the Excel File
• Suppose we have an Excel file, marks.xlsx, in the current working directory*.
We can read this file with the following code:
data=read.xlsx("marks.xlsx", sheetIndex=1)
print(data)
When we execute the above code, we can see the data of the entire file, which is
loaded into the data frame data.
• To make a sub data frame from the main data frame, we can run the
following code:
NameMarks=data.frame(data$Name, data$Final)
(* We can see the current working directory through the function getwd().)
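Since marks.xlsx is not available here, the sketch below uses a small stand-in data frame in place of the result of read.xlsx(). The column names Name, ClassTest and Final follow the slides; all values are made up for illustration.

```r
# Stand-in for data=read.xlsx("marks.xlsx", sheetIndex=1); values are made up
data = data.frame(
  Name=c("A", "B", "C", "D"),
  ClassTest=c(12, 15, 9, 18),
  Final=c(55, 68, 47, 80))

print(head(data, 2))   # first two rows only

# Selecting columns by name keeps the original column names
NameMarks = data[, c("Name", "Final")]
print(NameMarks)
```

Selecting columns by name is one alternative to data.frame(data$Name, data$Final), which names the resulting columns data.Name and data.Final.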
Statistical Operations in R
• Let us consider a vector of elements:
values=c(4, 5, 8, 9, 2, 5, 3, 6, 9, 8, 1, 4)
• Mean: mean(values)
• Median: median(values)
• Mode: base R has no built-in function for the statistical mode. mode(values)
returns the storage mode of the object ("numeric"), not the most frequent value.
• Returning to the previous example of marks, the mean and median of the
students' Final marks are given by mean(data$Final) and median(data$Final).
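A minimal sketch of computing the statistical mode from a frequency table. stat_mode is a hypothetical helper name, not a base R function; it returns every value tied for the highest count.

```r
values = c(4, 5, 8, 9, 2, 5, 3, 6, 9, 8, 1, 4)

# Hypothetical helper: the most frequent value(s) of a numeric vector
stat_mode = function(v) {
  counts = table(v)   # frequency of each distinct value
  as.numeric(names(counts)[counts == max(counts)])
}

print(mean(values))
print(median(values))
print(stat_mode(values))   # 4, 5, 8 and 9 each occur twice
```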
Regression Analysis
• Regression analysis is a very widely used statistical tool for establishing a
relationship model between two variables: a predictor and a response.
• The general mathematical equation for a linear regression is:
y=ax+b
where y is the response variable, x is the predictor variable, and a and b
are constants known as the regression coefficients.
• In R, the lm() function is used to create a relationship model between
these two variables.
Example of Regression Analysis
• Let us take the example of the marks of students.
• Suppose we want to analyze the relation between the class test marks and the final
marks of the students.
• Let y=data$Final, x=data$ClassTest
Then the relation can be created through the code:
relation=lm(y~x)
We can see the relation by running the following code:
print(relation)
• A summary of the relation can be seen through: summary(relation)
(Note: since we are working with a very small amount of data, the fitted values may
not be reliable)
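As a self-contained sketch of the steps above, the vectors x and y below hold made-up marks in place of data$ClassTest and data$Final. coef() extracts the fitted intercept (b) and slope (a).

```r
x = c(12, 15, 9, 18, 14)    # made-up class test marks
y = c(55, 68, 47, 80, 62)   # made-up final marks

relation = lm(y ~ x)

print(coef(relation))                 # named vector: (Intercept) and x
print(summary(relation)$r.squared)    # proportion of variance explained
```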
Graphical Visualization of Regression
• The regression analysis in the previous example can be visualized graphically as:
>png(file="MarksRegression.png")
>plot(x, y, col="Blue", main="Class Test and Final Marks",
cex=1.3, pch=16, xlab="Class Test", ylab="Final Marks")
>abline(lm(y~x))
>dev.off()
Running the above code saves the plot to the file MarksRegression.png, in which we
can see the regression line of the relation between class test and final marks.
Prediction
• Using regression analysis, we can predict the value of the response variable for
a new predictor value through the predict() function.
• Consider the previous example, and suppose we need to predict the final marks of a
student on the basis of the student's marks in the class test.
Let us predict the final marks when the class test marks are 10.
>a=data.frame(x=10)
>result=predict(relation, a)
>print(result)
(Note: the prediction is more reliable when a large data set is used to create
the model)
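A small sketch of the prediction step. The data here are made up and chosen to lie exactly on the line y = 2x + 1, so the prediction can be checked by hand. Note that the column name in the new data frame must match the predictor name used in the model (x).

```r
x = c(1, 2, 3, 4, 5)
y = 2 * x + 1            # made-up, exactly linear data

relation = lm(y ~ x)

a = data.frame(x=10)     # column name must match the predictor name
result = predict(relation, a)
print(result)            # 2 * 10 + 1 = 21, up to floating-point rounding
```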
Multiple Regression
• Multiple regression is an extension of linear regression to
relationships between more than two variables.
• In simple linear regression we have one predictor and one response
variable, but in multiple regression we have more than one predictor
variable and one response variable.
• It can be expressed as:
Y=a+b1X1+b2X2+...+bnXn
where Y is the response variable, a, b1, b2, ..., bn are the coefficients
and X1, X2, ..., Xn are the predictor variables.
Multiple Regression in R
• Let us consider an example where the result of students consists of Mid-Term Exams,
Class Tests, Quiz and Final Marks.
• Suppose we want to create a relation to analyze how the Final marks depend on the
Mid-Term Exams, Class Tests and Quiz.
Suppose we have another data set, NewData, which contains all these marks. Then a
relation can be created as:
Mul_Regr=lm(Final~MidTerm+ClassTest+Quiz, data=NewData)
(Because data=NewData is given, the columns can be named directly in the formula.)
We can see this relation by:
print(Mul_Regr)
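A self-contained sketch of the multiple regression above. NewData is built here from made-up marks, and new_student is a hypothetical name for the data frame passed to predict().

```r
# Made-up marks standing in for the NewData data set
NewData = data.frame(
  MidTerm=c(20, 25, 18, 30, 22, 27),
  ClassTest=c(12, 15, 9, 18, 14, 13),
  Quiz=c(5, 7, 4, 9, 8, 6),
  Final=c(55, 68, 47, 80, 62, 66))

Mul_Regr = lm(Final ~ MidTerm + ClassTest + Quiz, data=NewData)
print(coef(Mul_Regr))   # intercept a and coefficients b1, b2, b3

# Predict the final marks of a hypothetical new student
new_student = data.frame(MidTerm=24, ClassTest=14, Quiz=6)
print(predict(Mul_Regr, new_student))
```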
Pie Chart
• In R, a pie chart is created using the pie() function.
• Example:
x=c(20, 10, 40, 30)
labels=c("Dehradun", "Roorkee", "Delhi", "Ghaziabad")
png(file="PieChart.png")
pie(x, labels)
dev.off()
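Slice labels are often more useful when they show each share. The sketch below builds such labels from the same values; cities, pct and lbls are illustrative names added here.

```r
x = c(20, 10, 40, 30)
cities = c("Dehradun", "Roorkee", "Delhi", "Ghaziabad")

# Share of each slice as a whole-number percentage
pct = round(100 * x / sum(x))
lbls = paste0(cities, " (", pct, "%)")
print(lbls)
```

The resulting vector can be passed to pie() in place of the plain labels, for example pie(x, lbls) between the png() and dev.off() calls.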
Bar Chart
• Consider the final marks of the students. They can be plotted as a bar
chart:
png(file="BarChart.png")
barplot(data$Final)
dev.off()
Histogram
• Consider the example of marks again. Suppose we want to plot the
histogram of the final marks.
>png(file="Histogram.png")
>hist(data$Final, xlab="Final Marks", col="Blue",
border="Red")
>dev.off()
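To see what a histogram counts without drawing it, hist() can be called with plot=FALSE. This sketch uses the values vector from the Statistical Operations slide rather than the marks file; the break points are an illustrative choice.

```r
values = c(4, 5, 8, 9, 2, 5, 3, 6, 9, 8, 1, 4)

# Bins are (0,3], (3,6] and (6,9]; plot=FALSE returns the counts without drawing
h = hist(values, breaks=c(0, 3, 6, 9), plot=FALSE)
print(h$breaks)
print(h$counts)   # 3 values up to 3, 5 values in (3,6], 4 values in (6,9]
```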
Thank You