(Tutorial) The 10 Most Important Packages in R For Data Science - DataCamp

The document discusses the 10 most important packages in R for data science. These include ggplot2 for data visualization, data.table for fast data manipulation of large datasets, dplyr for data wrangling, tidyr for tidying data, shiny for building web apps, plotly for interactive graphs, knitr for reporting, mlr3 for machine learning workflows, xgboost for gradient boosting, and caret for predictive modeling tools. Instructions are provided on installing and loading each package.


1/1/2021 (Tutorial) The 10 Most Important Packages in R for Data Science - DataCamp


Olivia Smith
August 30th, 2020

R PROGRAMMING

The 10 Most Important Packages in R for Data Science
Learn about different packages in R used for data science, including
how to load them and the resources you can use to advance
your skills with them.

R is one of the most popular languages for data science, and it offers many packages
for different tasks. For example, dplyr and data.table handle data manipulation,
ggplot2 handles data visualization, and tidyr handles data cleaning. Shiny is used to
create web applications and knitr to generate reports, while mlr3, xgboost, and caret
are used for machine learning.

1. ggplot2
ggplot2 is a popular data visualization library based on the "Grammar of Graphics".
It can build graphs of one, two, or three variables, using both categorical and
numerical data. Grouping can be expressed through symbol, size, color, and so on.
Interactive graphics can be made with the help of plotly, and 3D plots can be made
with plot3D.

You can easily install the package ggplot2 in R's console as seen below:

install.packages("ggplot2")

You can easily load the package ggplot2 by using the following syntax:

library(ggplot2)
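
As a rough illustration of the grouping described above, the sketch below maps two extra variables to color and size in a scatter plot. It assumes the built-in mtcars dataset, which the original tutorial does not use; the variable choices are only for demonstration.

```r
library(ggplot2)

# mtcars ships with base R; cyl is converted to a factor so it maps to discrete colours
p <- ggplot(mtcars, aes(x = wt, y = mpg, colour = factor(cyl), size = hp)) +
  geom_point() +
  labs(
    x = "Weight (1000 lbs)",
    y = "Miles per gallon",
    colour = "Cylinders",
    size = "Horsepower"
  )

# Printing the object draws the plot on the active graphics device
print(p)
```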

The following DataCamp tutorials cover 'ggplot2' in more detail.

1. Data Visualization with ggplot2 (Part 1)

2. Data Visualization with ggplot2 (Part 2)

3. Data Visualization with ggplot2 (Part 3)

2. data.table
data.table is a very fast package for manipulating large amounts of data. It is widely
used in health care for genomic data and in fields like business for predictive
analytics. It is typically applied to data sizes ranging from about 10 GB to 100 GB.

You can easily install the package data.table in R's console as seen below:

install.packages("data.table")

You can easily load the package data.table in R as seen below:

library(data.table)
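
A minimal sketch of the general DT[i, j, by] form follows. It assumes the small built-in mtcars dataset purely for illustration; real data.table workloads are usually far larger.

```r
library(data.table)

# Convert a built-in data frame to a data.table, keeping row names as a column
dt <- as.data.table(mtcars, keep.rownames = "model")

# DT[i, j, by]: filter rows (i), compute summaries (j), per group (by)
res <- dt[hp > 100, .(mean_mpg = mean(mpg), n = .N), by = cyl]

print(res)
```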

You can consult the following DataCamp tutorials:

1. Data Analysis in R, the data.table Way.


2. A data.table R Tutorial: Intro to DT[i, j, by].

3. dplyr
dplyr is a package for data manipulation that provides a consistent set of verbs
such as select() , arrange() , filter() , summarise() , and mutate() . It can also
work with computational backends like dbplyr , sparklyr , and dtplyr .

1. You can install dplyr by installing the tidyverse package, which includes
dplyr .

install.packages("tidyverse")

2. Alternatively, you can install dplyr using the following command.

install.packages("dplyr")

3. You can load the package by using the following command.

library(dplyr)
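
The verbs listed above can be chained into a pipeline. The following is a small sketch on the built-in mtcars dataset (an assumption for illustration); group_by() is added here so that summarise() works per group.

```r
library(dplyr)

result <- mtcars %>%
  filter(hp > 100) %>%                   # keep rows matching a condition
  select(mpg, cyl, wt) %>%               # keep only the columns of interest
  mutate(wt_kg = wt * 453.592) %>%       # wt is recorded in units of 1000 lbs
  group_by(cyl) %>%                      # summarise per cylinder count
  summarise(mean_mpg = mean(mpg), mean_wt_kg = mean(wt_kg)) %>%
  arrange(desc(mean_mpg))                # sort groups by average mileage

print(result)
```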

The following DataCamp courses provide detailed knowledge of dplyr .

1. Data Manipulation with dplyr

2. Joining Data with dplyr

3. Introduction to the Tidyverse

4. tidyr
tidyr helps to create tidy data. A significant amount of analysis work goes into
cleaning and tidying the data. Tidy data is a dataset in which every cell holds a
single value, every row is an observation, and every column is a variable.

You can install tidyr using the following command.


install.packages("tidyr")

You can load tidyr using the following command.

library(tidyr)
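
To make the idea of tidy data concrete, here is a small sketch that reshapes a wide table into one row per observation with pivot_longer(). The data frame and its column names are made up for this example.

```r
library(tidyr)

# Hypothetical wide table: one column per year, so a row holds two observations
wide <- data.frame(
  country = c("A", "B"),
  `2019` = c(10, 20),
  `2020` = c(15, 25),
  check.names = FALSE
)

# Tidy form: every row is one (country, year) observation, every column a variable
long <- pivot_longer(
  wide,
  cols = c(`2019`, `2020`),
  names_to = "year",
  values_to = "cases"
)

print(long)
```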

The following DataCamp course covers tidyr in detail:
Cleaning Data in R

5. Shiny
Shiny can be used to build web applications without requiring JavaScript. It can be
combined with htmlwidgets, JavaScript actions, and CSS themes for extended
features. It can also be used to build dashboards as well as standalone web
applications.

You can install the Shiny package by the following command.

install.packages("shiny")

You can load Shiny using the following command.

library(shiny)
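
A Shiny app pairs a user interface definition with a server function. The sketch below is one minimal way to set that up, assuming the built-in faithful dataset; the input and output ids ("bins", "hist") are arbitrary names chosen for this example.

```r
library(shiny)

# UI: a slider input and a placeholder for the plot
ui <- fluidPage(
  titlePanel("Histogram demo"),
  sliderInput("bins", "Number of bins:", min = 5, max = 50, value = 20),
  plotOutput("hist")
)

# Server: redraws the histogram whenever the slider value changes
server <- function(input, output) {
  output$hist <- renderPlot({
    hist(faithful$waiting, breaks = input$bins,
         main = NULL, xlab = "Waiting time (min)")
  })
}

app <- shinyApp(ui = ui, server = server)

# Launch only in an interactive session, since runApp() blocks until stopped
if (interactive()) runApp(app)
```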

You can visit the link mentioned below to learn more about Shiny.
Shiny Fundamentals with R

6. plotly
plotly is a graphing library for creating interactive graphs. It is built on the
JavaScript library plotly.js .

You can install the plotly package by the following command.

install.packages("plotly")


You can load plotly using the following command.

library(plotly)
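
As a brief sketch, an interactive scatter plot can be created with plot_ly(). The mtcars dataset is assumed here for illustration only.

```r
library(plotly)

# Formulas (~) refer to columns of the data frame
fig <- plot_ly(
  data = mtcars,
  x = ~wt,
  y = ~mpg,
  color = ~factor(cyl),
  type = "scatter",
  mode = "markers"
)

# In an interactive session, printing opens the widget in the viewer or browser
if (interactive()) print(fig)
```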

You can visit the link mentioned below to learn more about plotly.
Intermediate Interactive Data Visualization with plotly in R

7. knitr
knitr is a package for dynamic report generation, widely used in research because it
makes reports reproducible. It integrates with various document formats such as
LaTeX, HTML, Markdown, and LyX. It was inspired by Sweave and extends it with
features from add-on packages such as weaver, animation, and cacheSweave.

You can install the knitr package by the following command.

install.packages("knitr")

You can load knitr using the following command.

library(knitr)
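
The following sketch shows the core idea: knitr runs the code chunks embedded in a document and weaves the results into the output. The tiny R Markdown source is built in memory here just for demonstration; in practice it would usually live in an .Rmd file.

```r
library(knitr)

# Build the chunk delimiter programmatically to keep this listing easy to read
fence <- strrep("`", 3)

# A tiny R Markdown document: a heading followed by one R code chunk
rmd <- c(
  "# Report",
  "",
  paste0(fence, "{r}"),
  "summary(cars$speed)",
  fence
)

# knit() evaluates the chunk and returns Markdown with code and output woven in
md <- knit(text = rmd, quiet = TRUE)
cat(md)
```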

You can visit the link mentioned below to learn more about knitr.
Reporting with R Markdown

8. mlr3
mlr3 is a package for machine learning. It is efficient and object-oriented,
providing 'R6' objects for the building blocks of a machine learning workflow.
It is an extensible framework for clustering, regression, classification, and
survival analysis.

You can install the mlr3 package by the following command.

install.packages("mlr3")


You can load mlr3 using the following command.

library(mlr3)
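
A minimal sketch of an mlr3 workflow is shown below, using the R6 objects mentioned above. It assumes the built-in "iris" task and the "classif.rpart" learner that ship with mlr3; the 80/20 split and the seed are arbitrary choices for this example.

```r
library(mlr3)

set.seed(1)

task <- tsk("iris")                  # example classification task
learner <- lrn("classif.rpart")      # decision tree learner (uses rpart)

# Simple hold-out split on the task's row ids
train_ids <- sample(task$row_ids, size = 0.8 * task$nrow)
test_ids <- setdiff(task$row_ids, train_ids)

# R6 methods modify the learner in place and return results as objects
learner$train(task, row_ids = train_ids)
prediction <- learner$predict(task, row_ids = test_ids)

acc <- prediction$score(msr("classif.acc"))
print(acc)
```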

You can visit the link mentioned below to learn more about mlr3.
mlr3Book

9. XGBoost
XGBoost is an implementation of the gradient boosting framework. It provides an
R interface, and the model is also available through R's caret package. It is often
reported to be faster than the gradient boosting implementations in H2O, Spark, and
Python. The package's primary use cases are machine learning tasks like
classification, ranking, and regression.

You can install the XGBoost package by the following command.

install.packages('xgboost')

You can load XGBoost using the following command.

library(xgboost)
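
As a hedged sketch of a classification task, the code below trains a small boosted model on the agaricus (mushroom) example data bundled with the xgboost package. The parameter values and number of rounds are illustrative assumptions, not tuned settings.

```r
library(xgboost)

# Example data shipped with the package: sparse features plus 0/1 labels
data(agaricus.train, package = "xgboost")
data(agaricus.test, package = "xgboost")

dtrain <- xgb.DMatrix(agaricus.train$data, label = agaricus.train$label)
dtest <- xgb.DMatrix(agaricus.test$data, label = agaricus.test$label)

# Binary classification with shallow trees
params <- list(objective = "binary:logistic", max_depth = 2)
bst <- xgb.train(params = params, data = dtrain, nrounds = 10)

# Predictions are probabilities; threshold at 0.5 to get class labels
pred <- predict(bst, dtest)
err <- mean((pred > 0.5) != agaricus.test$label)
print(err)
```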

You can visit the link mentioned below to learn more about XGBoost.
Extreme Gradient Boosting with XGBoost

10. Caret
caret is short for Classification And Regression Training. It is used for
predictive modeling and provides tools for the following steps.

1. Pre-processing: the data is pre-processed and missing data is checked.
caret provides preProcess() for this task.

2. Data splitting: the data is split into training and test sets that preserve a
similar distribution of the outcome classes.


3. Feature selection: a suitable technique, such as recursive feature elimination,
can be used.

4. Training model: caret provides a common interface to many machine learning algorithms.

5. Resampling for model tuning: the model can be tuned using k-fold or repeated
k-fold cross-validation, among others. The number of candidate parameter values
can be set using 'tuneLength'.

6. Variable importance estimation: varImp() can be used with a fitted model to
obtain variable importance estimates.

You can install the caret package by the following command.

install.packages('caret')

You can load caret using the following command.

library(caret)
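
The steps listed above can be sketched end to end as follows. This is only an illustration on the built-in iris dataset; the choice of the "rpart" method, the split ratio, and the resampling settings are assumptions made for this example.

```r
library(caret)

set.seed(42)

# Data splitting: stratified 80/20 split on the outcome
idx <- createDataPartition(iris$Species, p = 0.8, list = FALSE)
training <- iris[idx, ]
testing <- iris[-idx, ]

# Resampling: 5-fold cross-validation repeated twice
ctrl <- trainControl(method = "repeatedcv", number = 5, repeats = 2)

# Training with pre-processing (centering and scaling) and a small tuning grid
fit <- train(
  Species ~ .,
  data = training,
  method = "rpart",
  preProcess = c("center", "scale"),
  trControl = ctrl,
  tuneLength = 3
)

# Variable importance and hold-out accuracy
imp <- varImp(fit)
acc <- mean(predict(fit, testing) == testing$Species)
print(imp)
print(acc)
```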

You can visit the link mentioned below to learn more about caret from its author,
Max Kuhn.
Machine Learning with caret in R

Congratulations
Congratulations, you have made it to the end of this tutorial!

In this tutorial, you've learned about different packages in R used in the data science
process. The tutorial focused on installing and loading each package and, finally, on
pointing to DataCamp resources for learning more about them.



https://www.datacamp.com/community/tutorials/top-ten-most-important-packages-in-r-for-data-science 8/8
