Unit 1 Introduction to R.pptx

This document serves as an introduction to R, a programming language designed for statistical computing and data analysis. It covers the key features of R, typical applications, installation resources, and the integrated development environment (IDE) of RStudio. The document also discusses data types, operators, and how to get help within R.

Unit 1 : Introduction to R

Statistical Programming in R
(FT-305B)

SPR/FT 305B/JK 1
Business Analytics Specialization sem 3
• Introduction to BA

• Predictive Modelling

• Statistical Programming in R

CONTENT
• What is R?
• Why R?
• R resources
• Installing base R and RStudio
• R cosmetics
• The four panes in RStudio
• Directory, library, packages

What is R?
R is a programming language and environment
specifically designed for statistical computing, data
analysis, and graphical representation. It is widely used
among statisticians, data scientists, and researchers for
various types of data manipulation, analysis, and
visualization tasks.

Why R?
• Open source
• Interactive
• Operates across various platforms (Windows,
Mac, Unix, Linux)
• Scripts and objects can be shared across various
platforms
• Large community for support
• Add-ons are easy
• Versatile and powerful for analytics and visualizations

Key Features of R

1. Statistical Analysis: R supports methods such as regression, hypothesis
testing, time series analysis, and machine learning.
2. Data Manipulation: R includes packages like dplyr and tidyr for efficient data
wrangling and manipulation, allowing users to clean, transform, and prepare data
for analysis.
3. Data Visualization: R has strong visualization capabilities, particularly with
libraries like ggplot2, which allow users to create complex, publication-quality
visualizations with ease.
4. Reproducible Research: R supports reproducible research, which means
the code can be rerun to yield the same results, a crucial aspect for
validation in research and data science.
5. Extensibility with Packages: R has a vast ecosystem of packages
(libraries) that extend its functionality. The Comprehensive R Archive
Network (CRAN) hosts thousands of packages contributed by the R
community, covering everything from basic data manipulation to advanced
machine learning algorithms.
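
As a small illustration of the statistical analysis feature, the sketch below fits a linear regression in base R. The built-in mtcars data set and the variables mpg and wt are assumptions chosen for illustration; they do not come from the slides.

```r
# Fit a simple linear regression of fuel economy (mpg) on weight (wt)
# using mtcars, a data set that ships with R (illustrative choice).
fit <- lm(mpg ~ wt, data = mtcars)

# Estimated intercept and slope
coefs <- coef(fit)
print(coefs)

# Full model summary: coefficient table, R-squared, F statistic
print(summary(fit))
```

Because the script contains every step, rerunning it reproduces the same estimates, which is the point made under "Reproducible Research".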

Typical Applications of R
• Statistical Analysis and Hypothesis Testing: Researchers use R
for conducting tests and drawing inferences from data.

• Machine Learning: R has packages like caret and randomForest
that support machine learning tasks such as classification,
regression, and clustering.

• Data Visualization: Researchers and data analysts create detailed,
customizable visualizations with packages like ggplot2 and plotly.

• Big Data Analytics: With packages like sparklyr and dplyr, R can
handle larger datasets and connect with big data tools.

• Data Wrangling: Libraries like dplyr, tidyr, and data.table simplify
tasks such as filtering, reshaping, and summarizing data.

R Resources
1. Installing R: two software packages are needed, base R and RStudio. Base R is
the basic software that contains the R programming language. RStudio is
software that makes R programming easier. Both are free and open source.
Base R is distributed through CRAN (the Comprehensive R Archive Network).

http://cran.r-project.org/bin/windows/base/
http://www.rstudio.com/products/rstudio/download/
2. Online resources:
http://www.r-bloggers.com/
http://blog.revolutionanalytics.com/
www.stackoverflow.com
3. Books:
R for Data Science by Garrett Grolemund and Hadley Wickham
R for beginners by Sandeep Rakshit, TMH publication

R Console - Cosmetics

RStudio - Cosmetics

INTEGRATED DEVELOPMENT ENVIRONMENT- IDE
• An IDE consolidates the different aspects of writing a computer
program and thus enhances productivity
• A single application covers editing source code, building executables,
and debugging.
– Editing source code: syntax highlighting (visual cues such as smart
indentation and color coding) and autocomplete features
– Building executables: compiling and executing code
– Debugging: debugging tools and hints such as code completion

• IDEs for R: IntelliJ, Eclipse, Visual Studio, RStudio

The four panes in R Studio

Source: Notepad for code

• Write the script
• Edit the script
• Save the script
• Run or rerun commands to get output
• Execute line by line, a selected section, or the
complete script
• Key bindings: Ctrl+S (save), Ctrl+Enter (run the current line or
selection), Ctrl+Shift+Enter (run the whole script), Ctrl+Shift+N (new script)

Console: R’s Heart

• R evaluates the code here
• Code can be typed and executed directly in the
console, but code typed there is not saved in a script
• The general practice is to write code in the source pane and send
it to the console with the Run command, where it
generates output

Environment/History

• Environment:
Lists the data objects defined in the current R
session
Data objects hold information (for example, observations
stored in rows)

• History: R code previously evaluated in the console
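
The contents of the Environment pane can also be inspected from code. The sketch below uses the base functions ls(), rm(), and exists(); the object names sales_total and region_names are made up for illustration.

```r
# Create two objects; in RStudio they would appear in the Environment pane.
sales_total  <- 1500
region_names <- c("North", "South")

# List the objects defined in the current session
print(ls())

# Remove one object and check whether it still exists
rm(sales_total)
still_there <- exists("sales_total")
print(still_there)
```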

Files / Plots / Packages / Help

Files:
• The Files panel gives you access to the file directory on
your hard drive.
• A useful feature of the “Files” panel is that you can use it to set
your working directory: once you navigate to the folder
you want to read files from and save files to, click “More” and then
“Set As Working Directory.”
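
The same thing can be done from code with getwd() and setwd(). This is a minimal sketch; the temporary folder returned by tempdir() is only a stand-in for whatever project folder you would actually use.

```r
# Remember the current working directory so it can be restored later
old_dir <- getwd()

# Switch to another folder (tempdir() is used here purely for illustration)
setwd(tempdir())
print(getwd())        # the new working directory
print(list.files())   # files R can now read by name alone

# Restore the original working directory
setwd(old_dir)
```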

Plots:
• Displays the graphs you plot. There are buttons for opening the plot
in a separate window and for exporting or saving the
plot as a PDF or JPEG

Packages:
• Lists the R packages installed on your hard drive
• Packages loaded in the current session are shown
checked
Help:
• Help menu for R functions. You can
either type the name of a function in the
search window, or use code to search for a
function by name:
• ?hist
• ?t.test
• Vignettes: tutorial-style help on how to use an R package
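
The distinction between installed and loaded packages can also be checked from the console. A brief sketch using the base functions installed.packages() and search():

```r
# Names of all packages installed in the library paths R knows about
installed <- rownames(installed.packages())
print(head(installed))

# Packages (and other environments) attached in the current session;
# these correspond to the checked entries in the Packages pane
attached <- search()
print(attached)
```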

Help, Demonstration, Examples, Packages and
Libraries

To start R, double-click the R icon.
The following GUI (graphical user interface) window then opens.

HELP:

1) Start R and click the Help button in the toolbar of
the R GUI (graphical user interface) window.

2. Search for help in Google www.google.com

3. If you need help with a function, type a question mark
followed by the name of the function. For example,
?read.table to get help for the function read.table.

The help page gives the details and explanations of all arguments.

4. Sometimes you want to search by the subject on which you
need help (e.g. data input). In such a case, type
help.search("data input")

Clicking a link in the search results gives the required information.

4. Use 'help()' for online help,
or 'help.start()' for an HTML browser interface to help.

5) Other useful functions are find and apropos.

6) The find function tells us what package something is in.

For example
> find("lowess") returns
[1] "package:stats"

Getting Help in R
7) The apropos function returns a character vector giving the names of all
objects in the search list that match your enquiry. For example,
apropos("lm") returns the names of all objects whose names contain "lm".
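
Both functions can be tried directly in the console. A short sketch (the variable names where_lowess and matches are illustrative):

```r
# Which attached package provides lowess?
where_lowess <- find("lowess")
print(where_lowess)

# All visible object names that contain "lm"
matches <- apropos("lm")
print(head(matches))
```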

• To see a worked example, pass the function name
to example(), e.g., lm for linear models:

• example(lm)

• and we see the printed and graphical output
produced by the lm function.
How to quit in R

Type 'q()' to quit R.

Libraries in R
R provides many functions, and users can also write their own.
Functions and datasets are organised into libraries.

To use a library, call the library function with the
name of the library in brackets:

library(name)

For example, to load the spatial library type:

library(spatial)
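
A runnable sketch of the same pattern follows. It loads tools instead of spatial, only because tools is part of every R installation; that substitution is an assumption made so the example runs anywhere.

```r
# Attach a library; its functions can then be called by name
library(tools)

# Confirm that the library is now on the search path
is_attached <- "package:tools" %in% search()
print(is_attached)

# Use a function that the library provides: file_ext() extracts
# the extension from a file name
ext <- file_ext("analysis.R")
print(ext)
```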

Libraries in R (continued)
Examples of libraries that come as part of a standard R installation:

MASS: package associated with Venables and Ripley’s book
Modern Applied Statistics with S-PLUS.

mgcv: generalized additive models.

It is easy to use the help function to discover the contents of
library packages.

Here is how we find out about the contents of the spatial library:
library(help=spatial) returns

Information on package ‘spatial’
Description:
Package: spatial
Priority: recommended
Version: 7.3-8

followed by a list of all the functions and data sets.
The base R package contains programs for basic operations.

It does not contain some of the libraries necessary for advanced
statistical work.

Specific requirements are met by special packages.

These packages are simple to download.

To install any package,

• run the R program,

• then, on the command line, use the install.packages
function to download the libraries we want.

Examples :
• The package rmeta contains statistical tools for
meta-analysis.
• The package Agreement contains statistical tools for
measuring agreement.

The packages rmeta or Agreement can be installed by

install.packages("rmeta")

install.packages("Agreement")
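
In scripts it is common to install a package only when it is missing. The helper below is a hypothetical sketch, not part of R itself: install_if_missing is a made-up name, and the call to install.packages inside it needs internet access and a configured CRAN mirror.

```r
# Hypothetical helper: install a package only if it is not already available
install_if_missing <- function(pkg) {
  if (!requireNamespace(pkg, quietly = TRUE)) {
    install.packages(pkg)   # downloads from CRAN; requires a network connection
  }
  # Report (invisibly) whether the package is available afterwards
  invisible(requireNamespace(pkg, quietly = TRUE))
}

# "stats" ships with R, so nothing is downloaded here
ok <- install_if_missing("stats")
print(ok)
```

With a network connection, install_if_missing("rmeta") would behave like the install.packages("rmeta") call above, but only on the first run.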
Data types in R
Data Type           Example                              R Code
Character           "good", "TRUE", "23.4"               name <- "John"
Double (numeric)    12.3, 5, 999                         x <- 5.67
Integer (numeric)   2L, 34L, 0L                          y <- 3L
Logical             TRUE (T) / FALSE (F)                 is_valid <- TRUE
Complex             3 + 2i                               Z <- 3+2i
Factor              Gender, classes, educational level   gender <- factor(c("Male", "Female", "Male", "Female"))
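
The type of any object can be checked with class() and typeof(). The sketch below reuses the assignments from the table:

```r
name     <- "John"
x        <- 5.67
y        <- 3L
is_valid <- TRUE
Z        <- 3+2i
gender   <- factor(c("Male", "Female", "Male", "Female"))

# class() gives the high-level type, typeof() the internal storage type
print(class(name))      # character
print(class(x))         # numeric
print(typeof(x))        # double
print(typeof(y))        # integer
print(class(is_valid))  # logical
print(class(Z))         # complex
print(class(gender))    # factor

# A factor stores its distinct categories as levels (sorted alphabetically)
print(levels(gender))
```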

Objects in R

Movie Data

movie                  year  boxoffice  genre            time  rating
Whatever Works         2009       35.0  Comedy             92  PG-13
It Follows             2015       15.0  Horror             97  R
Love and Mercy         2015       15.0  Drama             120  R
The Goonies            1985       62.0  Adventure          90  PG
Jiro Dreams of Sushi   2012        3.0  Documentary        81  G
There Will be Blood    2007       10.0  Drama             158  R
Moon                   2009      321.0  Science Fiction    97  R
Spice World            1988       79.0  Comedy            -84  PG-13
Serenity               2005       39.0  Science Fiction   119  PG-13
Finding Vivian Maier   2014        1.5  Documentary        84  Unrated
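
A table like this is usually stored in R as a data frame object. The sketch below builds one from the first, second, and fourth rows of the table; the object name movies and the choice of rows are illustrative assumptions.

```r
# Build a data frame: each column is a vector, each row is one movie
movies <- data.frame(
  movie     = c("Whatever Works", "It Follows", "The Goonies"),
  year      = c(2009, 2015, 1985),
  boxoffice = c(35.0, 15.0, 62.0),
  genre     = c("Comedy", "Horror", "Adventure"),
  time      = c(92, 97, 90),
  rating    = c("PG-13", "R", "PG"),
  stringsAsFactors = FALSE
)

# Inspect the structure: column names, types, and first values
str(movies)

# Select the titles of movies released after 2000
print(movies[movies$year > 2000, "movie"])

# Average running time across the three rows
mean_time <- mean(movies$time)
print(mean_time)
```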
Arithmetic Operators in R

Operator   Description
+          Addition
-          Subtraction
*          Multiplication
/          Division
^ or **    Exponentiation
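
A quick sketch of these operators in use (the values 7 and 2 are arbitrary):

```r
a <- 7
b <- 2

print(a + b)   # addition
print(a - b)   # subtraction
print(a * b)   # multiplication
print(a / b)   # division (the result is not truncated)
print(a ^ b)   # exponentiation
print(a ** b)  # same as ^
```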

Comparison (Relational) Operators in R

Operator   Name                        Example
==         Equal                       x == y
!=         Not equal                   x != y
>          Greater than                x > y
<          Less than                   x < y
>=         Greater than or equal to    x >= y
<=         Less than or equal to       x <= y
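
Comparisons return logical values and work element by element on vectors. A small sketch with made-up values (the pass mark of 50 is an assumption for illustration):

```r
x <- 5
y <- 8

print(x == y)
print(x != y)
print(x < y)

# Applied to a vector, a comparison yields one logical value per element
scores <- c(45, 72, 88)
passed <- scores >= 50
print(passed)
```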

Logical Operators in R

Operator   Description
&          Element-wise logical AND. Returns TRUE for each position where
           both elements are TRUE
&&         Logical AND for single (length-one) values. Returns TRUE if both
           statements are TRUE
|          Element-wise logical OR. Returns TRUE for each position where at
           least one of the elements is TRUE
||         Logical OR for single (length-one) values. Returns TRUE if at least
           one of the statements is TRUE
!          Logical NOT. Returns FALSE if the statement is TRUE, and TRUE if it
           is FALSE
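
The difference between the element-wise and the single-value forms is easiest to see side by side. A minimal sketch with illustrative values:

```r
a <- c(TRUE, FALSE, TRUE)
b <- c(TRUE, TRUE, FALSE)

# Element-wise operators compare the vectors position by position
and_vec <- a & b
or_vec  <- a | b
not_vec <- !a
print(and_vec)
print(or_vec)
print(not_vec)

# && and || combine single TRUE/FALSE conditions, e.g. inside if()
x <- 5
in_range <- (x > 0) && (x < 10)
either   <- (x < 0) || (x == 5)
print(in_range)
print(either)
```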

Boolean Values in R
• TRUE (T)
• FALSE (F)

Miscellaneous Operators in R

Operator   Description                                    Example
:          Creates a series of numbers in a sequence      x <- 1:10
%in%       Finds out if an element belongs to a vector    x %in% y
%*%        Matrix multiplication                          x <- Matrix1 %*% Matrix2
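
A short sketch of the three operators; the matrix m and the other values are made up for illustration:

```r
# Sequence operator: the integers from 1 to 10
x <- 1:10
print(x)

# Membership test: is 3 one of the elements of x?
is_member <- 3 %in% x
print(is_member)

# Matrix multiplication (not element-wise): a 2 x 2 matrix times itself
m <- matrix(1:4, nrow = 2)   # filled column by column
prod_m <- m %*% m
print(prod_m)
```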
