RP LAB
install.packages("packageName")
Example:
install.packages("ggplot2")
library(packageName)
DS504PC: R PROGRAMMING LAB R22 B.Tech. CSE (Data Science)
Example:
library(ggplot2)
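A script can check whether a package is present before trying to install or load it. The sketch below uses requireNamespace() for that check. The helper name is_installed is an assumption made for illustration, and the install and load calls are left commented out because installing needs network access.

```r
# Hypothetical helper: TRUE if the package can be loaded, FALSE otherwise
is_installed <- function(pkg) {
  isTRUE(requireNamespace(pkg, quietly = TRUE))
}

# "stats" ships with R, so this check succeeds on any installation
print(is_installed("stats"))

# Install ggplot2 only if it is missing, then load it
# if (!is_installed("ggplot2")) install.packages("ggplot2")
# library(ggplot2)
```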
Updating Packages
Updating packages in R is an important practice to ensure you're
working with the latest features, bug fixes, and security
improvements.
Why Update Packages?
New Features: Updated packages often come with new
functionality that can enhance your workflow.
Bug Fixes: Updates fix bugs in older versions, improving the
stability and performance of your code.
Security Patches: Some updates include important security
patches.
Compatibility: New versions of R may deprecate or change
certain functions, and updated packages ensure compatibility.
1. Update All Packages
The update.packages() function in R checks for updates for all
installed packages and installs the latest versions from CRAN.
Command: update.packages()
What Happens:
R will go through all the installed packages and compare the
versions with those available on CRAN.
If a newer version of a package is available, R will prompt you
to update it.
You can choose to either update all packages automatically or
manually decide which ones to update.
If a package is dependent on other packages, it will ensure those
dependencies are also up-to-date.
2. Update a Specific Package
If you want to update a specific package instead of all of them, you
can use the install.packages() function and specify the name of the
package you wish to update.
Command: install.packages("packageName")
What Happens:
This command will download and install the latest version of
the specified package from CRAN.
If the package is already installed, it will simply overwrite the
old version with the new one.
Optional Arguments in update.packages()
You can customize the update process using optional arguments:
update.packages(ask = FALSE)
The update.packages(ask = FALSE) command is a useful way to
update all installed packages in R without manually confirming
each update. By default, when you run update.packages(), R will
ask for confirmation before updating each package. Using ask =
FALSE automates the process, so R will update all outdated
packages automatically.
Command:
update.packages(ask = FALSE)
Explanation:
ask = FALSE: This argument tells R not to ask for user
confirmation and automatically updates all packages that have
newer versions available.
Effect: All installed packages that have updates will be updated
to the latest versions from CRAN without any prompts, making
the process faster and more convenient.
When to Use This:
If you have many packages installed and want to update all of
them at once without manually confirming each one.
If you are running the command in a script or non-interactive
session where you don’t want prompts.
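For a script or scheduled job, the update can be wrapped in a small function. This is only a sketch: the function name update_all_quietly and the repository URL are assumptions, and the call itself is commented out because it needs network access.

```r
# Sketch of a non-interactive update routine.
# The function name and the repository URL are assumptions for illustration.
update_all_quietly <- function(repo = "https://cloud.r-project.org") {
  # List installed packages that have a newer version in the repository
  outdated <- old.packages(repos = repo)
  if (is.null(outdated)) {
    message("All packages are up to date.")
    return(invisible(character(0)))
  }
  # ask = FALSE installs every available update without prompting
  update.packages(repos = repo, ask = FALSE)
  invisible(rownames(outdated))
}

# Call it from a script when network access is available:
# updated <- update_all_quietly()
```

Checking old.packages() first lets the script report when nothing needs updating.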
update.packages(ask = TRUE)
The update.packages(ask = TRUE) command is used to update all
installed packages in R, but with manual confirmation for each
package that has an update available.
Command:
update.packages(ask = TRUE)
Explanation:
ask = TRUE: This argument tells R to prompt you before
updating each package. For each outdated package, R will ask
if you want to update it. You can choose whether to update a
specific package or skip it.
Effect: You get more control over the update process, allowing
you to selectively update certain packages while skipping
others.
When to Use This:
If you want to review and decide which packages to update
individually.
If you are concerned about package dependencies or want to
avoid updating certain packages that you may not need in their
latest version.
Interaction:
When you run this command, R will ask for confirmation similar to
this:
o Change the library path to a directory where you have
write permission.
Package Not Available for R Version:
o Check if the package is compatible with your version of
R.
o Consider updating R to the latest version.
Internet Connectivity Issues:
o Ensure you have a stable internet connection.
o setInternet2(TRUE) was only relevant on Windows with R versions
before 3.3.0; it has no effect in later versions.
Corrupted Package Installation:
o Remove the problematic package and reinstall it.
To uninstall or remove an installed package in R, you can use the
remove.packages() function. This will completely uninstall the
specified package from your R environment.
Syntax:
remove.packages("packageName")
Example:
remove.packages("ggplot2")
Explanation:
remove.packages("packageName"): This function will
delete the specified package from your R library.
Once removed, the package will no longer be available for use
until you reinstall it using install.packages().
Notes:
If you want to remove multiple packages, you can pass a character
vector of package names to a single remove.packages() call.
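As a rough sketch of removing several packages in one step, the example below filters a list of names down to those that are actually installed before calling remove.packages(). The package names are hypothetical placeholders, so on a normal installation nothing is removed.

```r
# Hypothetical package names, used only for illustration
to_remove <- c("hypotheticalPkgA", "hypotheticalPkgB")

# Keep only the names that are actually installed
installed <- rownames(installed.packages())
present <- intersect(to_remove, installed)

# remove.packages() accepts a character vector, so one call is enough
if (length(present) > 0) {
  remove.packages(present)
} else {
  message("None of the listed packages are installed.")
}
```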
DEFAULT PACKAGES
When you install R, it comes with a set of packages that provide
essential functionality for working with data, performing statistical
analyses, and generating graphics. They fall into two groups.
The base packages (items 1 to 14 below) are part of R itself. Of
these, base, stats, graphics, grDevices, utils, datasets, and methods
are attached automatically when R starts; the others must be loaded
with library().
The recommended packages (items 15 onward) are included in standard
binary distributions of R but are not attached by default, so load
them with library() before use.
None of these packages normally needs to be installed separately.
List of the packages included with R:
1. base
Provides fundamental functions for R programming, including
basic mathematical operations, data input/output, and
environment management.
2. stats
Includes a wide range of statistical functions such as probability
distributions, statistical tests, and linear modeling.
3. graphics
Contains functions for creating base R graphics such as plot(),
hist(), and other visualizations.
4. grDevices
Handles graphics devices and provides functions for managing
graphical parameters, colors, and fonts.
5. utils
Provides various utility functions, including help
documentation, package management, reading/writing data,
and workspace management.
6. datasets
Contains preloaded datasets such as iris, mtcars, airquality, and
more for use in examples and demonstrations.
7. methods
Provides functions for object-oriented programming using S4
classes and methods in R.
8. tools
Includes tools for package development, including checking
and building R packages, and other development utilities.
9. compiler
Contains tools for byte-code compilation, which can help speed
up the execution of R code.
10. parallel
Provides functions for parallel computing, enabling multi-core
processing by running code across multiple worker processes.
11. tcltk
Offers an interface to the Tcl/Tk GUI toolkit for building
graphical user interfaces (GUIs) in R.
12. grid
Implements a grid graphics system, which allows for the
creation of advanced, highly customizable graphical layouts.
13. splines
Provides functions for spline interpolation and smoothing,
useful for curve fitting and modeling.
14. stats4
Includes functions for maximum likelihood estimation using
S4 classes and methods.
15. lattice
A package for creating advanced multi-panel plots, particularly
useful for displaying multivariate data.
16. Matrix
Contains functions for working with dense and sparse matrices,
including matrix algebra and operations.
17. foreign
Allows importing data from various external formats, including
SPSS, Stata, and SAS.
18. survival
Includes functions for survival analysis, particularly useful in
time-to-event data modeling.
19. boot
Provides functions for bootstrapping, including resampling
techniques for statistical accuracy estimation.
20. cluster
Contains methods for cluster analysis, including hierarchical
clustering and partitioning methods such as PAM (k-medoids).
21. nlme
Used for fitting and analyzing linear and nonlinear mixed-
effects models.
22. rpart
Offers functions for recursive partitioning and regression trees.
23. mgcv
Provides functions for generalized additive models (GAMs)
with smooth terms.
24. nnet
Includes functions for training neural networks, particularly for
classification and regression tasks.
25. codetools
Helps with code analysis and debugging by providing tools for
static analysis of R code.
26. MASS
Includes functions and datasets from the book "Modern
Applied Statistics with S," providing tools for statistical
modeling.
27. spatial
Contains functions for spatial statistics, including spatial data
modeling and analysis.
28. KernSmooth
Provides functions for kernel smoothing and density
estimation.
29. class
Provides functions for classification, including k-nearest
neighbour methods.
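To see which of these packages are attached in a session and which are merely installed, something like the following can be used. It relies on the priority argument of installed.packages(); the variable names are illustrative, and the output depends on the local installation.

```r
# Packages and environments attached in the current session
print(search())

# Packages shipped with R, grouped by priority
base_pkgs <- rownames(installed.packages(priority = "base"))
rec_pkgs  <- rownames(installed.packages(priority = "recommended"))
print(base_pkgs)
print(rec_pkgs)

# A recommended package such as MASS must be loaded before use
# library(MASS)
```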
Additional Tips
Check Package Documentation:
o Use help(package = "packageName") to view the
package documentation.
Install Older Versions of Packages:
o Use the remotes package to install specific versions.
install.packages("remotes")
remotes::install_version("packageName", version = "x.y.z")
Some essential packages in R that are widely used for data analysis,
visualization, and statistical tasks:
1. ggplot2
Purpose: Data visualization.
Description: A powerful package for creating static graphics
using the grammar of graphics. Extension packages such as
gganimate and plotly add animation and interactivity.
Install: install.packages("ggplot2")
2. dplyr
Purpose: Data manipulation.
Description: A popular package for data wrangling, it
provides functions to filter, select, arrange, mutate, and
summarize data frames efficiently.
Install: install.packages("dplyr")
3. tidyr
Purpose: Data tidying.
Description: Helps tidy data by converting it from wide to
long format or vice versa, and handling missing values.
Install: install.packages("tidyr")
4. readr
Purpose: Data import.
Description: Provides fast and friendly functions for reading
data into R, such as CSV files and flat files.
Install: install.packages("readr")
5. stringr
Purpose: String manipulation.
Description: Provides a set of consistent and easy-to-use
functions to work with strings (e.g., pattern matching,
splitting, and replacing).
Install: install.packages("stringr")
6. lubridate
Purpose: Date and time manipulation.
Description: Simplifies working with date-time data, offering
easy parsing, manipulation, and arithmetic for dates and
times.
Install: install.packages("lubridate")
7. data.table
Purpose: Data manipulation.
Description: Fast and efficient data manipulation on large
datasets, an alternative to dplyr.
Install: install.packages("data.table")
8. caret
Purpose: Machine learning.
Description: A comprehensive package for building machine
learning models, providing tools for pre-processing, training,
and evaluating models.
Install: install.packages("caret")
9. shiny
Purpose: Web application development.
Description: Allows you to build interactive web applications
directly in R.
Install: install.packages("shiny")
10. plotly
Purpose: Interactive visualizations.
Description: Enables the creation of interactive web-based
visualizations, integrating seamlessly with ggplot2.
Install: install.packages("plotly")
11. forecast
Purpose: Time series analysis.
Description: Provides tools for forecasting time series data,
including ARIMA models, exponential smoothing, and more.
Install: install.packages("forecast")
12. xts
Purpose: Time series manipulation.
Description: Handles irregular time-series data and helps in
working with high-frequency data.
Install: install.packages("xts")
13. knitr
Purpose: Report generation.
Description: Simplifies the process of creating dynamic
reports, combining R code and its output with text (Markdown
or LaTeX).
Install: install.packages("knitr")
14. rmarkdown
Purpose: Dynamic documents.
Description: Allows you to create dynamic reports in various
formats (HTML, PDF, Word) by integrating code and
narratives.
Install: install.packages("rmarkdown")
15. xgboost
Purpose: Machine learning.
Description: An efficient and scalable implementation of
gradient boosting for classification and regression.
Install: install.packages("xgboost")
16. tibble
Purpose: Data frames.
Description: Provides a modern take on data frames, offering
more consistent handling of column data.
Install: install.packages("tibble")
17. MASS
Purpose: Statistical functions.
Description: Contains functions and datasets from the book
"Modern Applied Statistics with S."
Install: Usually included with R as a recommended package; run
install.packages("MASS") only if it is missing.
18. httr
Purpose: HTTP requests.
Description: Simplifies working with URLs and APIs,
making it easier to download data from the web.
Install: install.packages("httr")
19. jsonlite
Purpose: JSON handling.
Description: Provides easy-to-use tools for parsing and
creating JSON data.
Install: install.packages("jsonlite")
20. sf
Purpose: Spatial data.
Description: Handles simple features for geographical data,
used for spatial data manipulation.
Install: install.packages("sf")
WEEK-2
Data Types in R
1. Numeric: Real numbers, decimals, and whole numbers.
2. Integer: Whole numbers, written with an L suffix (e.g., 5L).
3. Character: Text strings.
4. Logical: Boolean values TRUE and FALSE.
5. Complex: Numbers with real and imaginary parts.
6. Raw: Binary data (rarely used).
7. Factor: Categorical data with levels.
8. Date/Time: Date and time values.
class():
In R, the class() function is used to determine the class or type of an
object. Understanding the class of an object is crucial for knowing
how to work with it and what functions are applicable to it.
Syntax
class(object)
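As a quick illustration, class() can be applied to one value of each data type listed above. The variable names are arbitrary and chosen only for this example.

```r
num_val  <- 3.14
int_val  <- 5L
chr_val  <- "hello"
lgl_val  <- TRUE
cplx_val <- 2 + 3i
raw_val  <- as.raw(255)
fct_val  <- factor(c("low", "high", "low"))
date_val <- as.Date("2024-01-15")

print(class(num_val))   # "numeric"
print(class(int_val))   # "integer"
print(class(chr_val))   # "character"
print(class(lgl_val))   # "logical"
print(class(cplx_val))  # "complex"
print(class(raw_val))   # "raw"
print(class(fct_val))   # "factor"
print(class(date_val))  # "Date"
```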
1. Numeric
The numeric data type is used to store real numbers, including
integers and decimals. R automatically treats numbers as numeric
unless explicitly specified otherwise.
Usage:
Numbers with decimals are treated as numeric.
Whole numbers written without a decimal point are also treated
as numeric unless defined as integers.
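A short example of the two usage points above, with illustrative variable names:

```r
price <- 10.5      # number with a decimal part
count <- 42        # no L suffix, so still stored as numeric

print(class(price))       # "numeric"
print(class(count))       # "numeric"
print(is.integer(count))  # FALSE
print(price + count)      # 52.5
```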
2. Integer
An integer is a numeric data type that specifically stores whole
numbers. You can create integers by adding the letter L after the
number.
Usage and Syntax:
Use L after the number to specify an integer.
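The following sketch shows the L suffix in use, along with as.integer() for conversion. The variable names are illustrative.

```r
age <- 25L
print(class(age))        # "integer"
print(is.integer(age))   # TRUE

# as.integer() drops the fractional part (truncates toward zero)
converted <- as.integer(7.9)
print(converted)         # 7

# Integer arithmetic stays integer only when both operands are integers
print(class(age + 1L))   # "integer"
print(class(age + 1))    # "numeric"

# Division with / always returns a double
print(class(age / 5L))   # "numeric"
```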
3. Character
The character data type is used for storing strings of text. Strings
must be enclosed in quotes (either single or double quotes).
Usage and Syntax:
Create a character string by enclosing text in quotation marks.
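For instance, the following creates character values with both quote styles and applies a few base R string functions. The variable names are illustrative.

```r
name     <- "Alice"
greeting <- 'Hello'    # single quotes work the same way

print(class(name))     # "character"
print(nchar(name))     # 5
print(toupper(name))   # "ALICE"

# Digits inside quotes are text, not numbers, until converted
num_text <- "123"
print(class(num_text))           # "character"
print(as.numeric(num_text) + 1)  # 124
```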
Combining Strings:
# paste() joins its arguments, separated by a space by default
full_name <- paste("First", "Last")
print(full_name)  # "First Last"
4. Logical
The logical data type is used to store Boolean values (TRUE or
FALSE).
Usage and Syntax:
Logical values are TRUE or FALSE. The abbreviations T and F
also work, but they are ordinary variables that can be
reassigned, so the full names are safer.
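A brief sketch of logical values in practice, including why the abbreviations T and F need care. The variable names are illustrative.

```r
is_valid <- TRUE
print(class(is_valid))  # "logical"

# Comparisons produce logical values
print(5 > 3)            # TRUE
print(!is_valid)        # FALSE

# TRUE counts as 1 and FALSE as 0 in arithmetic
flags <- c(TRUE, FALSE, TRUE)
print(sum(flags))       # 2

# T and F are ordinary variables, so they can be overwritten
T <- FALSE
print(T)                # FALSE
rm(T)                   # remove the local copy; T refers to TRUE again
print(T)                # TRUE
```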
Combining Logical Values:
is_raining <- TRUE
is_sunny <- FALSE
# & is the logical AND operator; TRUE & FALSE gives FALSE
weather_check <- is_raining & is_sunny
print(weather_check)  # FALSE
Understanding Double in R
In R, the term "double" refers to a numeric type that stores values
in double-precision floating-point format. It is the most common
numeric type in R and is often referred to simply as numeric, because
R stores numbers as doubles by default.
A double can hold both whole numbers and decimal numbers, and it
stores them with higher precision than a single-precision float.
as.numeric() is identical to as.double(), and class() reports a
double as "numeric".
Double vs Integer
Even though a value like 42 looks like an integer, R stores it as a
double by default unless it is explicitly written as an integer
(using L after the number).
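The difference can be checked with typeof(), as in the sketch below. It also shows a common consequence of floating-point storage: exact equality tests on decimals can fail, and all.equal() is the usual tolerance-based alternative.

```r
x <- 42     # double by default
y <- 42L    # integer because of the L suffix

print(typeof(x))        # "double"
print(typeof(y))        # "integer"
print(x == y)           # TRUE  (values are equal)
print(identical(x, y))  # FALSE (storage types differ)

# Doubles are binary approximations of decimal fractions
print(0.1 + 0.2 == 0.3)                   # FALSE
print(isTRUE(all.equal(0.1 + 0.2, 0.3)))  # TRUE
```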
is.*() functions
In R, the is.*() family of functions checks whether an object belongs
to a specific class or type. These functions are useful for ensuring
that data is in the expected format before performing operations on
it.
1. is.numeric(): Checks for numeric type (includes double and
integer).
2. is.integer(): Checks specifically for integer type.
3. is.double(): Checks specifically for double type.
4. is.character(): Checks for character type.
5. is.logical(): Checks for logical type.
6. is.factor(): Checks for factor type.
7. is.data.frame(): Checks for data frame type.
8. is.list(): Checks for list type.
9. is.matrix(): Checks for matrix type.
10. is.null(): Checks if an object is NULL.
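The checks listed above can be tried on a few sample objects. The last part sketches how such a check might guard a function's input; safe_mean is a hypothetical helper written for this example, not a base R function.

```r
value <- 5L
print(is.numeric(value))  # TRUE  (integers count as numeric)
print(is.integer(value))  # TRUE
print(is.double(value))   # FALSE

print(is.character("5"))  # TRUE
print(is.logical(FALSE))  # TRUE
print(is.factor(factor(c("a", "b"))))  # TRUE

df <- data.frame(id = 1:3)
print(is.data.frame(df))  # TRUE
print(is.list(df))        # TRUE  (a data frame is a list of columns)
print(is.matrix(matrix(1:4, nrow = 2)))  # TRUE
print(is.null(NULL))      # TRUE

# Hypothetical helper: stop early if the input is not numeric
safe_mean <- function(x) {
  if (!is.numeric(x)) stop("x must be numeric")
  mean(x)
}
print(safe_mean(c(1, 2, 3)))  # 2
```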
typeof() function
In R, the typeof() function is used to determine the internal type or
storage mode of an object. It provides more detailed information
about the underlying representation of the data compared to the
class() function, which returns the object's class or type in a more
general sense.
typeof(): Returns the internal storage mode of an object, such
as "double", "integer", "character", "logical", "complex", "list",
and "NULL". A data frame is stored as a list, so typeof() returns
"list" for it.
class(): Provides a high-level description of the object's class,
which may include multiple classes.
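The contrast is easiest to see on objects whose class differs from their storage type. The objects below are illustrative examples.

```r
df <- data.frame(a = 1:3)
print(typeof(df))  # "list"
print(class(df))   # "data.frame"

f <- factor(c("x", "y"))
print(typeof(f))   # "integer" (factors store integer codes)
print(class(f))    # "factor"

d <- as.Date("2024-01-15")
print(typeof(d))   # "double" (days since 1970-01-01)
print(class(d))    # "Date"

print(typeof(NULL))  # "NULL"
print(typeof(mean))  # "closure" (an ordinary R function)
```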
double: A type of numeric data that includes decimal points.
typeof() returns "double".
numeric: A general term for numeric data that defaults to
double precision. typeof() returns "double".
integer: Whole numbers stored as integers. typeof() returns
"integer".
5. Complex
The complex data type is used to store complex numbers,
which have both a real and an imaginary component.
Usage and Syntax:
A complex number is represented by adding i after the
imaginary part.
# Complex data type: 3 is the real part, 2i the imaginary part
comp_val <- 3 + 2i
class(comp_val)  # "complex"
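Base R also provides functions for taking a complex number apart. The sketch below uses a separate illustrative value, z, so that the modulus comes out as a whole number.

```r
z <- 3 + 4i

print(Re(z))    # 3  (real part)
print(Im(z))    # 4  (imaginary part)
print(Mod(z))   # 5  (modulus: sqrt(3^2 + 4^2))
print(Conj(z))  # 3-4i (complex conjugate)

# Arithmetic follows the usual rules, with i^2 = -1
print(z * (1 + 2i))  # -5+10i

# sqrt() of a negative number needs a complex input to return a result
print(sqrt(as.complex(-1)))  # 0+1i
```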