Practical Session Lecture 3 - Set Up For R Programming Language For Data Analytics

Uploaded by

hoangha43kd
Copyright
© © All Rights Reserved

R programming Language for Data Analytics

Learning Objectives
I. Setting up the computer.
II. Resources needed for practical sessions.
III. Installing packages and loading libraries.
IV. How to get the datasets.

Resources you will need.


I. RStudio installation or RStudio Cloud
II. Jupyter notebook for R (Project Jupyter | Try Jupyter)
III. Any other R IDE.
IV. Datasets that come with each library.

Install R and RStudio, then open the RStudio interface.

Setting up the folder/directory (see below)

Code to set the working directory [this did not work on the online system at the time of this tutorial]

setwd("C:/Users/ebenu/Downloads/COMP1810Web AnalyticsLectures/Lecture Slide 1810/Practicals-Weekly")
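A minimal sketch of the same step that can be run on any machine. The folder name `practical_dir` and its location inside the temporary directory are assumptions for illustration only; replace them with your own folder.

```r
# Hypothetical folder for the practical sessions (created in a temp location)
practical_dir <- file.path(tempdir(), "Practicals-Weekly")
dir.create(practical_dir, showWarnings = FALSE, recursive = TRUE)

# setwd() changes the working directory and invisibly returns the previous one
old_dir <- setwd(practical_dir)

getwd()        # confirm where R now reads and writes files
list.files()   # list the files in the working directory

# Go back to the original working directory
setwd(old_dir)
```

Use forward slashes (/) in Windows paths, as in the setwd() call above, because a single backslash is an escape character in R strings.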

Importing data into R


Reading Data Files
Reading or importing data into RStudio is quite simple.
Reading a CSV File
Use the code:
data <- read.csv(file = "data.csv", header = TRUE, sep = ",")
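The following sketch shows the same read.csv() call end to end. It writes a small, made-up CSV file to a temporary location first, so it runs without the session's data.csv; the column names x, x2 and x3 and all values are assumptions, not the real file.

```r
# Create a small hypothetical CSV file so the example is self-contained
csv_path <- tempfile(fileext = ".csv")
writeLines(c("x,x2,x3",
             "1,a,2.5",
             "2,b,3.5"), csv_path)

# header = TRUE: the first line holds the column names
# sep = ",":     values are separated by commas
data <- read.csv(file = csv_path, header = TRUE, sep = ",")

str(data)    # column names and types
head(data)   # first rows of the data frame
```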

Below is the data.csv file that will be loaded into RStudio for this practical session

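Reading an Excel File
If the file is in Excel format, use the code below. The read.xlsx() call shown here is assumed to come from the xlsx package, which must be installed and loaded first.
library(xlsx)
data <- read.xlsx(file = "data.xlsx", 1)

file is the location of the Excel file. 1 refers to sheet number 1.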

///////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////

View the data by typing its name: data

Selecting Data
You can select a few columns “x” and “x3” from the data using a vector:
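A minimal sketch of this selection, assuming a data frame with columns named x, x2 and x3 (the values are made up for illustration):

```r
# Hypothetical data frame standing in for the imported data
data <- data.frame(x  = 1:3,
                   x2 = c("a", "b", "c"),
                   x3 = c(2.5, 3.5, 4.5))

# A character vector of column names picks those columns, in that order
selected <- data[, c("x", "x3")]
selected
```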

Installing packages and loading libraries.
[tidyverse is a widely used collection of packages for data analytics]
General form:
install.packages("packagename")
install.packages("tidyverse")

install.packages("vioplot")
install.packages("babynames")
Install more than one package
install.packages(c("vioplot", "MASS"))
To remove a package
remove.packages("vioplot")
Loading a package
library(dplyr)
library(babynames)
To list all functions and objects exported by a loaded package
ls("package:babynames")
ls("package:dplyr")
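Re-running install.packages() every session is slow. One possible pattern, sketched below, installs a package only when it is missing. The helper name install_if_missing is hypothetical and not part of R; requireNamespace() and install.packages() are standard R functions.

```r
# Hypothetical helper: install only the packages that are not yet available
install_if_missing <- function(pkgs) {
  is_available <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
  to_install <- pkgs[!is_available]
  if (length(to_install) > 0) {
    install.packages(to_install)
  }
  invisible(to_install)   # return the names that had to be installed
}

# Packages that ship with R are always available, so nothing is installed here
needed <- install_if_missing(c("stats", "utils"))
```

After this check, load each package with library() as shown above.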

How to get the datasets.


I. Most datasets may be provided by your tutor.
II. Many packages come with datasets; you will be shown how to access them. To list the datasets in the currently attached packages, use
data()
III. To list all datasets in a particular package, e.g. ggplot2:
try(data(package = "ggplot2"))
IV. You may download datasets from the University of California, Irvine (UCI) Machine Learning Repository:
https://archive.ics.uci.edu/ml/datasets.php
V. You may download datasets from Kaggle:
https://www.kaggle.com/datasets
VI. Others.
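As a small illustration of item II, the sketch below lists and loads a dataset from the datasets package, which ships with R. The choice of mtcars is only an example.

```r
# List the datasets in the built-in "datasets" package
available <- data(package = "datasets")
head(available$results[, c("Item", "Title")])

# Load one dataset by name and inspect it
data("mtcars")
head(mtcars)   # first rows
dim(mtcars)    # number of rows and columns
```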

Online Jupyter IDE

Go to https://jupyter.org/try

Getting Help
To get help with a function, dataset, etc., use

help(datasets) or help(function_name) or ?dataset_name

When dplyr is loaded, it masks (overrides) some functions from other packages, such as filter() and lag() from stats.
The nycflights13 package is not installed as part of the tidyverse. Install and load it to access the flights dataset used below:

install.packages("nycflights13")
library(nycflights13)
flights

filter(flights, month == 1, day == 1)

//////////////////////////////////////////////////////////////////////////////////////////////////////////////

(dec25 <- filter(flights, month == 12, day == 25))

Using dplyr

The five key dplyr functions can be remembered as “ssmaf” (select, summarize, mutate, arrange, filter):

• Pick observations by their values (filter()).
• Reorder the rows (arrange()).
• Pick variables by their names (select()).
• Create new variables with functions of existing variables (mutate()).
• Collapse many values down to a single summary (summarize()).
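The five functions listed above can be sketched on a tiny, made-up data frame. The data frame toy and its columns carrier, distance and air_time are assumptions for illustration; the same calls apply to flights.

```r
library(dplyr)

# Hypothetical toy data standing in for a real dataset
toy <- data.frame(
  carrier  = c("AA", "UA", "AA", "DL"),
  distance = c(500, 1200, 800, 300),   # miles
  air_time = c(90, 180, 120, 60)       # minutes
)

long_flights <- filter(toy, distance > 400)                     # pick rows by value
by_distance  <- arrange(toy, desc(distance))                    # reorder rows
two_cols     <- select(toy, carrier, distance)                  # pick columns by name
with_speed   <- mutate(toy, speed = distance / air_time * 60)   # add a new column
overall      <- summarize(toy, mean_distance = mean(distance))  # one summary row
```

Each function takes a data frame as its first argument and returns a new data frame, leaving the original unchanged.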

///////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////

Comparison Operators

R comparison operators are >, >=, <, <=, != (not equal), and == (equal).
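A short sketch of these operators on a made-up numeric vector. Each comparison is applied element by element and returns a logical vector. The last two lines show a common pitfall with == on decimal numbers.

```r
x <- c(1, 5, 10)   # hypothetical values

gt  <- x > 5    # greater than
gte <- x >= 5   # greater than or equal to
lt  <- x < 5    # less than
lte <- x <= 5   # less than or equal to
neq <- x != 5   # not equal
eq  <- x == 5   # equal

# Floating-point results are approximate, so == can surprise you
exact_match <- sqrt(2)^2 == 2                     # FALSE
close_match <- isTRUE(all.equal(sqrt(2)^2, 2))    # TRUE
```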

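Find the flights that departed in November or December

filter(flights, month == 11 | month == 12)

The following two lines of code produce the same result:

filter(flights, month == 11 | month == 12)

nov_dec <- filter(flights, month %in% c(11, 12))

The following line is NOT equivalent. The == operator recycles c(11, 12) along the rows (row 1 is compared with 11, row 2 with 12, row 3 with 11, and so on), so it silently misses many November and December flights:

(y <- filter(flights, month == c(11, 12)))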
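The difference between %in% and == with a vector on the right-hand side can be checked on a small, made-up vector of months (the values are assumptions for illustration):

```r
month <- c(11, 12, 12, 11, 3, 12)   # hypothetical month values

# Membership test: TRUE wherever the month is 11 or 12
with_in <- month %in% c(11, 12)

# Same result written with | (or)
with_or <- month == 11 | month == 12

# == recycles c(11, 12) as 11, 12, 11, 12, 11, 12 and compares position by position
with_eq <- month == c(11, 12)

sum(with_in)   # 5 matches
sum(with_eq)   # only 3 matches
```

Use %in% whenever you want to test a column against a set of values.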

Find flights that weren't delayed (on arrival or departure) by more than two hours.

These two lines of code produce the same result.

filter(flights, !(arr_delay > 120 | dep_delay > 120))

filter(flights, arr_delay <= 120, dep_delay <= 120)
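The two filters agree because of De Morgan's law: !(a | b) is the same as !a & !b. A minimal check on made-up delay values (in minutes, including missing values) is sketched below; the vectors are assumptions, not taken from flights.

```r
# Hypothetical delays in minutes; NA marks a missing value
arr_delay <- c(10, 150, 30, NA, 200)
dep_delay <- c(5, 20, 130, 60, NA)

not_either <- !(arr_delay > 120 | dep_delay > 120)
both_small <- arr_delay <= 120 & dep_delay <= 120

# The two logical vectors match position by position, NA included
same <- identical(not_either, both_small)
same
```

filter() keeps only the rows where the condition is TRUE, so rows where the condition is NA are dropped in both versions.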
