Practical Session Lecture 3 - Set Up for R Programming Language for Data Analytics
Learning Objectives
I. Setting up the computers.
II. Resources needed for practical sessions.
III. Installing packages and loading libraries.
IV. How to get the datasets.
Setting up the folder/directory (see below)
Code to set the working directory [this did not work on the online system at the time of this tutorial]
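A minimal sketch of setting the working directory. The folder name r_practical is hypothetical and is created under a temporary location so the example runs anywhere; replace it with the path to your own folder.

```r
# Hypothetical project folder; replace with your own path,
# e.g. "C:/Users/you/r_practical"
project_dir <- file.path(tempdir(), "r_practical")

# Create the folder if it does not exist yet
dir.create(project_dir, showWarnings = FALSE, recursive = TRUE)

setwd(project_dir)   # make the folder the working directory
getwd()              # confirm the change
list.files()         # files R can now find by bare name (none yet)
```

Once the working directory is set, files in that folder can be referred to by name alone, without the full path.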
Below is the data.csv file that will be loaded into RStudio for this practical session
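The contents of data.csv are not reproduced in these notes, so the sketch below writes a small stand-in file first. The columns x, x2 and x3 and all values are hypothetical. It then reads the file with read.csv().

```r
# Stand-in for data.csv: the real file's contents are not shown here,
# so these columns and values are made up for illustration.
csv_path <- file.path(tempdir(), "data.csv")
writeLines(c("x,x2,x3",
             "1,10,a",
             "2,20,b",
             "3,30,c"), csv_path)

# Read the file into a data frame; header = TRUE takes column names from row 1
data <- read.csv(csv_path, header = TRUE)

str(data)    # column names and types
head(data)   # first few rows
```

With the working directory set to the folder holding the file, read.csv("data.csv") is enough.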
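Read in an Excel file
If the file is in Excel format, use the code below (read.xlsx with a file argument is assumed here to come from the xlsx package, which must be installed first)
library(xlsx)
data <- read.xlsx(file = "data.xlsx", sheetIndex = 1)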
file is the location (path) of the Excel file. 1 is the sheet index, i.e. the first sheet in the workbook.
Selecting Data
You can select a few columns “x” and “x3” from the data using a vector:
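A minimal sketch of selecting columns by name. The data frame is a hypothetical stand-in with columns x, x2 and x3; only the names x and x3 come from the text above.

```r
# Hypothetical data frame with columns x, x2 and x3
data <- data.frame(x = 1:3, x2 = c(10, 20, 30), x3 = c("a", "b", "c"))

# A character vector of column names selects those columns
selected <- data[, c("x", "x3")]
selected

# List-style indexing also works and always returns a data frame
data[c("x", "x3")]
```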
Installing packages and loading libraries.
[tidyverse is a widely used collection of packages for data analytics]
install.packages("packagename")  # general form; replace packagename with the package you need
install.packages("tidyverse")
install.packages("vioplot")
install.packages("babynames")
Install more than one package
install.packages(c("vioplot", "MASS"))
To remove a package
remove.packages("vioplot")
Loading a package
library(dplyr)
library(babynames)
To list all functions in a loaded package
ls("package:babynames")
ls("package:dplyr")
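One possible way to avoid reinstalling packages that are already present is to install only the missing ones. The sketch below uses stats and utils, which ship with R, so nothing is downloaded; the variable names are illustrative.

```r
# Packages to check. "stats" and "utils" ship with R, so nothing is downloaded.
# Swap in names such as "tidyverse" or "vioplot" for real use.
wanted <- c("stats", "utils")

# Names of the packages already installed in the library paths
installed <- rownames(installed.packages())

# Keep only the ones that are not installed yet
missing_pkgs <- setdiff(wanted, installed)
if (length(missing_pkgs) > 0) {
  install.packages(missing_pkgs)   # needs an internet connection and a CRAN mirror
}

# List the functions that an attached package provides
stats_functions <- ls("package:stats")
head(stats_functions)
```

Note that ls("package:...") only works after the package has been attached with library().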
I. Many packages come with datasets; you will be shown how to access them. To list the datasets in the currently attached packages:
data()
II. To list all datasets in a particular package, e.g. ggplot2:
try(data(package = "ggplot2"))
III. You may download datasets from the University of California, Irvine (UCI) Machine Learning Repository:
https://archive.ics.uci.edu/ml/datasets.php
IV. You may download datasets from Kaggle:
https://www.kaggle.com/datasets
V. Others.
Type https://jupyter.org/try in your browser.
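As a small illustration of accessing package datasets, the sketch below uses the datasets package, which is part of every R installation, and the mtcars data it contains. The choice of mtcars is only an example.

```r
# Names of the datasets shipped in the "datasets" package (part of base R)
available <- data(package = "datasets")$results[, "Item"]
head(available)

# Load one of them into the workspace and inspect it
data("mtcars", package = "datasets")
dim(mtcars)    # number of rows and columns
head(mtcars)   # first few rows
```

A file downloaded from UCI or Kaggle is usually a CSV and can be read with read.csv() instead.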
Getting Help
To get help with a function, dataset, etc., use ? or help() followed by its name.
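A short sketch of common ways to get help, using read.csv and mtcars as example topics because both ship with R. Any function or dataset name can be substituted.

```r
# Help page for a function
?read.csv

# Datasets have help pages too
help("mtcars")

# Find objects whose names match a pattern when the exact name is forgotten
matches <- apropos("^read\\.csv")
matches

# Show a function's arguments without opening the full help page
args(read.csv)
```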
dplyr masks some functions
When dplyr is loaded (for example as part of tidyverse), it masks some functions from other packages, such as filter() and lag() from stats.
nycflights13 is not part of tidyverse; install and load it to have access to the flights dataset used here (see below)
install.packages("nycflights13")
library(nycflights13)
flights
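To make the idea of masking concrete, here is a sketch using only base R. It assumes a fresh session in which dplyr has not been attached. It shows that filter initially belongs to stats and that the package::function form always reaches a specific package's version.

```r
# In a fresh session (dplyr not attached), `filter` is the
# time-series filter from the stats package
environmentName(environment(filter))

# package::function calls that package's version, whatever else is attached
smoothed <- stats::filter(1:5, rep(1/3, 3))   # 3-point moving average
smoothed

# After library(dplyr) the same form lets you be explicit, e.g.
# dplyr::filter(flights, month == 1)
# (left as a comment because it needs dplyr and nycflights13 installed)

# Names currently defined in more than one attached package
conflicts()
```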
filter(flights, month == 1, day == 1)
Comparison Operators
R comparison operators are > (greater than), >= (greater than or equal), < (less than), <= (less than or equal), != (not equal), and == (equal).
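A quick sketch of the comparison operators applied to a vector. The delay values are hypothetical. Each comparison returns a logical vector of the same length.

```r
dep_delay <- c(-5, 0, 15, 130)   # hypothetical delays in minutes

dep_delay > 0      # FALSE FALSE  TRUE  TRUE
dep_delay >= 0     # FALSE  TRUE  TRUE  TRUE
dep_delay < 0      #  TRUE FALSE FALSE FALSE
dep_delay <= 0     #  TRUE  TRUE FALSE FALSE
dep_delay != 0     #  TRUE FALSE  TRUE  TRUE
dep_delay == 0     # FALSE  TRUE FALSE FALSE

# Logical vectors can index the original vector.
# Use == to test equality; a single = assigns.
late <- dep_delay[dep_delay > 0]
late
```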
Find the flights departing in November or December
Find flights that weren’t delayed (on arrival or departure) by more than two
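The two exercises above could be approached as sketched below. The sketch runs in base R on a tiny made-up data frame that borrows the column names month, dep_delay and arr_delay from flights, with the dplyr equivalents shown in comments. It assumes that "two" means two hours, i.e. 120 minutes, since delays in flights are recorded in minutes.

```r
# Tiny made-up stand-in for flights (column names borrowed from nycflights13)
mini <- data.frame(
  month     = c(1, 11, 12, 12),
  dep_delay = c(5, 130, 10, -3),
  arr_delay = c(0, 140, 125, -10)
)

# Flights departing in November or December
# dplyr equivalent: filter(flights, month %in% c(11, 12))
nov_dec <- subset(mini, month %in% c(11, 12))

# Flights not delayed by more than 120 minutes on arrival or departure
# (assumption: "two" is read as two hours)
# dplyr equivalent: filter(flights, arr_delay <= 120, dep_delay <= 120)
on_time <- subset(mini, arr_delay <= 120 & dep_delay <= 120)

nov_dec
on_time
```

In filter(), conditions separated by commas are combined with "and", which is why the base R version uses &.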