UNIT 1

unit 1 R programming

Uploaded by rajkumaramirtha3
HISTORY AND OVERVIEW OF R PROGRAMMING:

This section traces the history of R and of its predecessor, the S language.

1. Origins of R:
o R was conceived in 1992 by Ross Ihaka and Robert Gentleman at the University of Auckland, New Zealand.
o The name R comes partly from the first letters of its creators' first names and partly as a play on the name of S.
o R's design grew out of limitations its authors encountered with the S language; R itself was not publicly released until 1995.
2. S Language:
o The S language was developed by John Chambers and others at Bell Telephone Laboratories (originally part of AT&T Corp).
o Initially implemented as Fortran libraries, S was an internal statistical analysis environment.
o Early versions lacked functions for statistical modeling.
o In 1988, S was rewritten in C (Version 3) and began to resemble the system in use today.
o The book Statistical Models in S documented its statistical analysis functionality.
3. S Evolution and Ownership:
o Since the early '90s, the life of S has taken a winding path.
o In 1993, Bell Labs granted StatSci (later Insightful Corp.) an exclusive license for S.
o In 2004, Insightful purchased S from Lucent for $2 million.
o Insightful sold its implementation as S-PLUS (with additional features).
o In 2008, Insightful was acquired by TIBCO, which now owns and develops the S language.
4. S Philosophy:
o The S philosophy emphasizes ease of data analysis.
o It allows users to start interactively without consciously thinking of themselves as programmers.
o As needs grow, users can gradually transition into programming.

In summary, R is an open-source implementation of the S language that has become one of the most widely used languages for statistics and data science because of its flexibility, power, and expressiveness.

DATA STRUCTURES :

R provides several core data structures for organizing and manipulating data. The key ones are:

1. Vectors:
o A vector is an ordered collection of basic data types of a given length.
o All elements in a vector must be of the same data type (homogeneous).
o Vectors are generally created using the c() function.
o Example:
      # Creating a numeric vector
      x <- c(1, 5, 4, 9, 0)
      typeof(x) # Returns "double"
      length(x) # Returns 5

      # Creating a character vector
      y <- c("apple", "banana", "cherry")
      typeof(y) # Returns "character"
2. Lists:
o A list is a generic object consisting of an ordered collection of objects.
o Lists can hold different data types (heterogeneous).
o Example:
      # Creating a list
      empId <- c(1, 2, 3, 4)
      empName <- c("Debi", "Sandeep", "Subham", "Shiba")
      numberOfEmp <- 4
      empList <- list(empId, empName, numberOfEmp)
      print(empList)
3. Dataframes:
o Dataframes are two-dimensional tabular data structures.
o They are heterogeneous (different columns can hold different data types) and are commonly used for data analysis.
o Each column must have the same number of items, and each item in a column must be of the same data type.
o Example:
      # Creating a dataframe
      Name <- c("Amiya", "Raj", "Asish")
      Language <- c("R", "Python", "Java")
      Age <- c(22, 25, 45)
      df <- data.frame(Name, Language, Age)
      print(df)
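
The difference between homogeneous and heterogeneous structures can be seen in a small sketch. The variable names below are illustrative only: combining mixed values with c() coerces them to one common type, while list() keeps each element's own type.

```r
# c() coerces mixed values to a single common type (here: character)
mixed_vector <- c(1, "apple", TRUE)
typeof(mixed_vector)   # "character"

# list() keeps each element's original type
mixed_list <- list(1, "apple", TRUE)
element_types <- sapply(mixed_list, typeof)
print(element_types)   # "double" "character" "logical"

# A data frame is organised by column: each column has its own type
people <- data.frame(Name = c("Amiya", "Raj"), Age = c(22, 25))
column_types <- sapply(people, class)
print(column_types)
```

This is why a vector is the natural choice for a single measured variable, and a list or data frame for a record that mixes text and numbers.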

These data structures are the foundation for organizing and analyzing data in R.

FUNCTIONS:

Functions are central to R programming. The key points are:

1. What Is a Function in R?
o A function is a set of statements organized together to perform a specific task.
o It’s an object that executes a predefined sequence of commands when called.
o Functions can be built-in (provided by R) or user-defined (created by you).
2. Built-in Functions:
o R offers many helpful built-in functions for various tasks:
▪ min(), max(), mean(), median(): Compute statistics.
▪ sum(): Calculate the sum of a numeric vector.
▪ range(): Find the minimum and maximum values.
▪ abs(): Get the absolute value of a number.
▪ str(): Display the structure of an R object.
▪ length(): Count items in a vector or list.
▪ sort(): Sort a vector.
▪ exists(): Check if a variable exists.
o Example:
      scores <- c(3, 5, 2, 3, 1, 4)
      print(min(scores))
      print(mean(scores))
      print(median(scores))
      print(sum(scores))
      print(range(scores))
      print(length(scores))
      print(sort(scores, decreasing = TRUE))
      print(exists("scores")) # The name is passed as a quoted string
3. Creating User-Defined Functions:
o To create your own function, use the function keyword and assign the result to a name.
o Syntax:
      my_function <- function(parameters) {
        # Function body
        # Perform specific tasks
      }
o Example:
      # Custom function to add two numbers
      add_numbers <- function(a, b) {
        result <- a + b
        return(result)
      }
      # Calling the function
      print(add_numbers(10, 20)) # Output: 30
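
As a further sketch, a user-defined function can combine several of the built-in functions listed above. The function name describe_values and its digits parameter are hypothetical, chosen only for illustration. The example also shows a default argument value and the fact that R returns the value of the last evaluated expression when return() is not called.

```r
# Hypothetical helper: summarise a numeric vector with built-in functions
describe_values <- function(values, digits = 2) {
  # Stop early with a clear message if the input is not numeric
  if (!is.numeric(values)) {
    stop("values must be numeric")
  }
  # The last expression is the return value; no explicit return() is needed
  c(
    min    = min(values),
    max    = max(values),
    mean   = round(mean(values), digits),
    median = median(values)
  )
}

result <- describe_values(c(3, 5, 2, 3, 1, 4))
print(result)
```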

Functions help modularize code, improve readability, and avoid repetition.

SESSION:

This section covers R sessions and the workspace.

1. What Is an R Session?
o An R session refers to the environment where you interact with R.
o During a session, you can execute R commands, create and manipulate objects, and perform data analysis.
o The session includes any user-defined objects (such as vectors, matrices, data frames, lists, and functions).
2. Workspace and Saving Sessions:
o The workspace is your current R working environment within a session.
o It holds all the objects you create or load during your work.
o At the end of an R session, you can save an image of the current workspace (by default to a file named .RData in the working directory). R reloads this image automatically the next time it starts in that directory.
o Saving the workspace allows you to continue where you left off, with all your variables and data intact.
3. Exiting an R Session:
o To exit an R session, you can:
▪ Close the R console or RStudio.
▪ Use the q() function (type q() and press Enter).
4. Listing Objects and Removing Them:
o To list all objects in the current session, use ls().
o To remove an object, use rm(object_name).
5. Setting Working Directories:
o You can set the current working directory using setwd("path/to/directory").
o To get the current working directory, use getwd().
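
The session functions above can be tried together in a short sketch. The object names are made up for the example, and it saves to a temporary file with save() rather than writing a full workspace image, so that nothing is left behind in the working directory.

```r
# Create two objects in the current environment
session_a <- c(1, 2, 3)
session_b <- "temporary"

# ls() returns the names of the objects that currently exist
objects_before <- ls()

# Save one object to a temporary file, then remove both objects
saved_file <- tempfile(fileext = ".RData")
save(session_a, file = saved_file)
rm(session_a, session_b)
objects_after <- ls()

# load() restores the saved object under its original name
load(saved_file)
print(session_a)

# getwd() reports the directory R uses for relative file paths
print(getwd())
```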

Managing sessions helps organize your work and ensures continuity from one R session to the next.

VARIABLES :

Variables are how an R program stores and refers to data.

1. What Are Variables in R?


o A variable is a name that refers to an object (a data value) stored in memory.
o Variables can hold various types of data, such as numbers, text, vectors, or even entire datasets.
o In R, variables are created dynamically when you assign a value to them.
2. Creating Variables in R:
o R does not require explicit variable declaration.
o You can create a variable by assigning a value using the <- or = operator.
o Examples:
      # Using the = operator
      age = 30
      # Using the <- operator (the more common convention in R)
      name <- "John"

      # Creating a numeric vector
      numbers <- c(1, 2, 3, 4, 5)
3. Rules for Naming Variables:
o A valid variable name in R:
▪ Consists of letters, numbers, dot (.), and underscore (_) characters.
▪ Starts with a letter or a dot (not followed by a number).
▪ Cannot start with a number or an underscore.
▪ Cannot be a reserved word (e.g., TRUE, FALSE, NULL, if, function).
o Examples:
▪ Valid: var1, .my_var, age_group
▪ Invalid: 2var, _count, var$1
4. Checking Variable Types:
o Use the class() function to determine the class of a variable; typeof() reports its internal storage type.
o Example:
      my_var <- "Hello, R!"
      print(class(my_var)) # Output: "character"
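
The naming rules can be checked programmatically. The sketch below relies on the base function make.names(), which converts any string into a syntactically valid name; a candidate that comes back unchanged was already valid. The helper is_valid_name is a hypothetical name introduced for this example.

```r
# Hypothetical helper: a name is valid if make.names() leaves it unchanged
is_valid_name <- function(candidate) {
  candidate == make.names(candidate)
}

valid_check   <- is_valid_name(c("var1", ".my_var", "age_group"))
invalid_check <- is_valid_name(c("2var", "_count", "var$1", "TRUE"))

print(valid_check)    # TRUE TRUE TRUE
print(invalid_check)  # FALSE FALSE FALSE FALSE
```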

Variables let you store values once and reuse them throughout an analysis.

DATA TYPES :
R has six fundamental (atomic) data types:

1. Numeric:
o Represents real numbers (with or without decimal points).
o Default type for numbers in R, stored in double precision.
o Example:
      x <- 5.6
      typeof(x) # Output: "double"
2. Integer:
o Represents whole numbers (integers).
o You can use the L suffix to explicitly declare an integer.
o Example:
      y <- 5L
      typeof(y) # Output: "integer"
3. Logical:
o Represents Boolean values (TRUE or FALSE).
o Used for logical operations and conditions.
o Example:
      z <- TRUE
      typeof(z) # Output: "logical"
4. Complex:
o Represents complex numbers (with real and imaginary parts).
o Written as a + bi, where i is the imaginary unit.
o Example:
      w <- 1 + 2i
      typeof(w) # Output: "complex"
5. Character:
o Represents text or strings.
o Enclosed in double or single quotes.
o Example:
      name <- "Hello, R!"
      typeof(name) # Output: "character"
6. Raw:
o Represents raw bytes (binary data).
o Created using the as.raw() function.
o Example:
      raw_value <- as.raw(255)
      typeof(raw_value) # Output: "raw"
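
The six types can be inspected side by side, and values can be converted between types explicitly. The following is a minimal sketch using only base R; the variable names are illustrative.

```r
values <- list(5.6, 5L, TRUE, 1 + 2i, "Hello, R!", as.raw(255))

# typeof() reports the internal storage type of each value
storage_types <- sapply(values, typeof)
print(storage_types)
# "double" "integer" "logical" "complex" "character" "raw"

# Explicit conversion between types with the as.*() functions
as_int  <- as.integer("42")     # character -> integer
as_text <- as.character(3.14)   # double -> character
as_num  <- as.numeric(TRUE)     # logical -> double (TRUE becomes 1)

# is.*() functions test for a type
print(is.numeric(5L))           # TRUE: integers count as numeric
print(is.character(as_text))    # TRUE
```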

Choosing an appropriate data type helps control memory usage and avoids unintended type coercion in computations.

VECTORS :

Vectors are the fundamental data structure in R: they store and manipulate elements of the same data type.

1. What Is a Vector in R?
o A vector is a one-dimensional array that holds elements of the same data type.
o It can store numeric values, characters, logical values, and more.
o Vectors are the building blocks of R and play a crucial role in various data manipulation and analysis tasks.
2. Creating Vectors:
o Vectors are generally created using the c() function (which stands for “combine” or “concatenate”).
o Example:
      # Creating a numeric vector
      x <- c(1, 5, 4, 9, 0)
      typeof(x) # Output: "double"
      length(x) # Output: 5

      # Creating a character vector
      y <- c("apple", "banana", "cherry")
      typeof(y) # Output: "character"
3. Accessing Elements of a Vector:
o Elements of a vector can be accessed using vector indexing.
o Vector indexing starts from 1 (unlike many programming languages, where it starts from 0).
o Example:
      # Accessing specific elements
      x[3] # Returns the 3rd element (4)
      x[c(2, 4)] # Returns the 2nd and 4th elements (5, 9)

      # Using negative indexing to exclude elements
      x[-1] # Returns all elements except the 1st one
4. Creating Sequences with seq():
o The seq() function generates sequences with specific step sizes or lengths.
o Example:
      seq(1, 3, by = 0.2) # Generates a sequence from 1 to 3 with a step size of 0.2
      seq(1, 5, length.out = 4) # Generates a sequence from 1 to 5 with 4 elements
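
The indexing and sequence operations above can be combined in one runnable sketch. It also shows logical indexing and assignment through an index, which follow the same bracket notation; the variable names are illustrative.

```r
x <- c(1, 5, 4, 9, 0)

third      <- x[3]          # 4
second_4th <- x[c(2, 4)]    # 5 9
without_1  <- x[-1]         # 5 4 9 0
large      <- x[x > 3]      # logical indexing keeps 5 4 9

# Assigning through an index replaces that element
x[5] <- 7

steps  <- seq(1, 3, by = 0.2)         # 11 values from 1.0 to 3.0
spaced <- seq(1, 5, length.out = 4)   # 4 evenly spaced values from 1 to 5
```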

Vectors are versatile and efficient for storing and manipulating data in R.

SCALARS:

In R, a scalar is a single value. R has no separate scalar type: a scalar is simply a vector of length one.
1. Definition:
o A scalar is the simplest object you can create in R; internally it is a vector of length one.
o It represents a single value, such as a number or a name.
2. Examples of Scalars:
o Numeric: Decimal numbers (e.g., 1.5) or whole numbers (integers).
      x <- 1.5
      typeof(x) # Output: "double"
o Character: Strings (sequences of characters enclosed in quotes).
      name <- "John"
      typeof(name) # Output: "character"
o Logical: Boolean values (TRUE or FALSE).
      flag <- TRUE
      typeof(flag) # Output: "logical"
3. Remember:
o Scalars are fundamental for computations and data manipulation in R.
o They serve as the building blocks for more complex data structures like vectors and matrices.
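
A short sketch makes the relationship between scalars and vectors concrete: a single value already behaves like a vector of length one, and it is recycled when combined with a longer vector. The variable names are illustrative.

```r
x <- 1.5

scalar_length <- length(x)      # 1
scalar_is_vec <- is.vector(x)   # TRUE
first_element <- x[1]           # 1.5, the same value as x

# Arithmetic recycles the scalar across the longer vector
scaled <- c(10, 20, 30) * 2     # 20 40 60

# c() extends a scalar into a longer vector
grown <- c(x, 2.5)              # 1.5 2.5
```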


CONCLUSION:

In an R-based analysis or project, the conclusion summarizes the results and findings of the work.

1. Concluding an Analysis:
o When you perform data analysis or statistical modeling in R, you often arrive at insights or results.
o The conclusion is where you summarize these findings and draw meaningful insights.
o It’s essential to communicate your conclusions clearly and concisely.
2. Steps for a Good Conclusion:
o Summarize Results: Briefly state the key findings from your analysis.
o Interpretation: Explain what these findings mean in the context of your problem or research question.
o Recommendations: If applicable, provide recommendations or next steps based on your results.
3. Example: Suppose you analyzed a dataset on customer churn for a telecom company using R. Your conclusion might be:

“After analyzing the data, we found that customers with longer contract durations and higher monthly charges are less likely to
churn. We recommend focusing on retention strategies for high-value customers.”

A well-crafted conclusion helps stakeholders understand the implications of your work.

DATA FRAMES :

A data frame is the most widely used structure for tabular data in R.

1. What Is a Data Frame?


o A data frame is a two-dimensional tabular structure.
o It resembles a spreadsheet or a database table.
o Each column can contain different types of data (numeric, character, logical, etc.).
o Rows represent observations or records, and columns represent variables or attributes.
2. Creating Data Frames:
o You can create a data frame using the data.frame() function.
o Example:
      # Creating a simple data frame
      df <- data.frame(
        Name = c("Alice", "Bob", "Charlie"),
        Age = c(25, 30, 22),
        Score = c(85, 92, 78)
      )
      print(df)
3. Accessing Data Frames:
o Use indexing to access specific rows or columns.
o Examples:
      # Accessing columns
      df$Name # Access the 'Name' column
      df[, "Age"] # Access the 'Age' column

      # Accessing rows
      df[2, ] # Access the second row
4. Manipulating Data Frames:
o Add rows using rbind() and columns using cbind() or $ assignment.
o Remove rows or columns using indexing.
o Example:
      # Adding a new row (a one-row data frame keeps each column's type)
      new_row <- data.frame(Name = "David", Age = 28, Score = 90)
      df <- rbind(df, new_row)

      # Adding a new column
      df$Gender <- c("F", "M", "M", "M")

      # Removing a column
      df <- df[, -3] # Remove the 'Score' column (the third column)
5. Summary and Analysis:
o Use summary() to get summary statistics.
o Perform further analysis (e.g., regression, visualization) on data frames.
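
The steps above can be put together in a self-contained sketch. One detail worth seeing in practice: a new row is best supplied as a one-row data frame, because a row built with c() mixes text and numbers and is therefore coerced to character. The variable names mean_age and top_names are illustrative.

```r
df <- data.frame(
  Name  = c("Alice", "Bob", "Charlie"),
  Age   = c(25, 30, 22),
  Score = c(85, 92, 78)
)

# Appending a one-row data frame keeps Age and Score numeric
df <- rbind(df, data.frame(Name = "David", Age = 28, Score = 90))

str(df)             # structure: column names and types
summary(df$Score)   # minimum, quartiles, mean, and maximum of one column

mean_age  <- mean(df$Age)             # 26.25
top_names <- df$Name[df$Score > 80]   # names with a score above 80
```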

Data frames are essential for data manipulation, exploration, and statistical modeling in R.

LISTS :
A list is a flexible container that can hold objects of different types.

1. What Is a List in R?
o A list is an ordered collection of objects (elements).
o Unlike vectors, lists can contain elements of different data types (heterogeneous).
o You can think of a list as a container that holds various data objects together.
2. Creating Lists:
o To create a list, use the list() function.
o Example:
      # Creating a simple list
      my_list <- list(
        Name = "Alice",
        Age = 30,
        Scores = c(85, 92, 78)
      )
      print(my_list)
3. Accessing List Components:
o You can access list components by name (with $ or [[ ]]) or by position (with [[ ]]).
o Examples:
      # Accessing by name
      my_list$Name # Access the 'Name' component
      my_list$Scores # Access the 'Scores' component

      # Accessing by index
      my_list[[2]] # Access the second component (Age)
4. Named List Components:
o Naming list components makes it easier to access them.
o Example:
      named_list <- list(
        name = "Sudheer",
        age = 25,
        city = "Delhi"
      )
      print(named_list)
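
A point that often causes confusion is the difference between single and double brackets on a list. The sketch below, with illustrative variable names, shows that [[ ]] extracts the component itself while [ ] returns a smaller list, and how components are added, changed, and removed.

```r
my_list <- list(Name = "Alice", Age = 30, Scores = c(85, 92, 78))

age_value   <- my_list[["Age"]]   # the component itself: 30
age_sublist <- my_list["Age"]     # a list of length 1 containing Age

# Add, change, and remove components
my_list$City   <- "Delhi"
my_list$Age    <- 31
my_list$Scores <- NULL            # assigning NULL removes the component

remaining <- names(my_list)       # "Name" "Age" "City"
```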

Lists are powerful for storing heterogeneous data and organizing complex structures in R.

MATRICES:

A matrix is a two-dimensional arrangement of values of a single data type in rows and columns.
The key points about matrices are:

1. Creating a Matrix in R:
o To create a matrix in R, use the matrix() function.
o You need to provide the set of elements (values) in the vector, along with the desired number of rows and columns.
o By default, matrices are filled column-wise; pass byrow = TRUE to fill them row by row.
o Example:
      # Creating a 3x3 matrix, filled row by row
      A <- matrix(
        c(1, 2, 3, 4, 5, 6, 7, 8, 9),
        nrow = 3,
        ncol = 3,
        byrow = TRUE
      )
      print(A)
2. Special Matrices in R:
o R allows you to create various types of special matrices:
▪ Constant Matrix: Filled with a single constant value.
▪ Diagonal Matrix: Non-diagonal elements are zeros.
▪ Identity Matrix: Diagonal elements are ones, and others are zeros.
o Examples:
      # Constant matrix (filled with 5)
      B <- matrix(5, 3, 3)
      print(B)

      # Diagonal matrix (with elements 5, 3, 3 on the diagonal)
      C <- diag(c(5, 3, 3), 3, 3)
      print(C)

      # Identity matrix (3x3)
      D <- diag(1, 3, 3)
      print(D)
3. Matrix Metrics:
o You can obtain information about a matrix:
▪ Number of rows: nrow(A)
▪ Number of columns: ncol(A)
▪ Dimensions: dim(A)
▪ Total number of elements: length(A)
o Example:
      cat("Number of rows:", nrow(A), "\n")
      cat("Number of columns:", ncol(A), "\n")
      cat("Total elements:", length(A), "\n")
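
The effect of the fill order, and the difference between element-wise and matrix multiplication, can be checked with a small sketch. The matrices below are illustrative and separate from the examples above.

```r
by_col <- matrix(1:6, nrow = 2)                 # default: filled column-wise
by_row <- matrix(1:6, nrow = 2, byrow = TRUE)   # filled row by row

first_row_col <- by_col[1, ]   # 1 3 5
first_row_row <- by_row[1, ]   # 1 2 3

M  <- matrix(c(1, 2, 3, 4), nrow = 2)   # columns are (1, 2) and (3, 4)
I2 <- diag(2)                           # 2x2 identity matrix

elementwise <- M * M       # multiplies entry by entry
product     <- M %*% I2    # matrix product; multiplying by the identity returns M
transposed  <- t(M)        # swaps rows and columns
```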

Matrices are essential for linear algebra, statistical modeling, and data manipulation in R.

ARRAY:
Arrays store and manipulate data of a single type in any number of dimensions.

1. What Is an Array in R?
o An array is a multi-dimensional data structure that can hold elements of the same data type.
o Vectors are one-dimensional and matrices are two-dimensional; arrays generalize both to any number of dimensions.
o Arrays are useful for data indexed along several dimensions, such as a stack of tables with the same rows and columns.
2. Creating Arrays:
o You can create an array using the array() function.
o Specify the data elements, dimensions (rows, columns, and matrices), and optionally provide names for dimensions.
o Example:
      # Creating a 3x3x2 array (18 elements: two 3x3 matrices)
      my_array <- array(
        data = 1:18,
        dim = c(3, 3, 2),
        dimnames = list(c("Row1", "Row2", "Row3"), c("Col1", "Col2", "Col3"), c("Matrix1", "Matrix2"))
      )
      print(my_array)
3. Accessing Array Elements:
o Use indexing to access specific elements within the array.
o Example:
      # Accessing specific elements
      my_array[2, 3, 1] # Access the element in the second row, third column, and first matrix
4. Related Data Structures:
o Matrices: A two-dimensional array (a special case of an array).
o Data Frames: Not arrays; a data frame is a list of equal-length columns that may differ in type (used for tabular data).
o Lists: Not arrays; a list is a one-dimensional container whose elements can have different data types.
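
Working with an array usually means slicing it along one of its dimensions. The sketch below builds an unnamed 3x3x2 array (illustrative data) and shows single-element access, extraction of one layer as a matrix, and a per-layer summary with apply().

```r
arr <- array(1:18, dim = c(3, 3, 2))

dims         <- dim(arr)       # 3 3 2
single       <- arr[2, 3, 1]   # 8: values fill column by column
second_layer <- arr[, , 2]     # a 3x3 matrix holding 10 to 18

# Sum over each layer (the third dimension)
layer_sums <- apply(arr, 3, sum)   # 45 126
```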

Arrays are powerful for handling multi-dimensional data, especially in scientific computing and data analysis.

CLASSES:

Classes are the basis of object-oriented programming in R. R offers several class systems:

1. S3 Class:
o S3 is the most common and straightforward class system in R.
o An S3 object is created by attaching a class attribute to an existing object, such as a list.
o S3 classes are used by many built-in R functions and packages.
o Example:
      # Creating an S3 class object
      student1 <- list(name = "John", age = 21, GPA = 3.5)
      class(student1) <- "Student_Info"
2. S4 Class:
o S4 class provides a more formal and structured approach to object-oriented programming.
o You define classes explicitly using the setClass() function.
o S4 classes have slots (member variables) with defined data types.
o Example:
      # Creating an S4 class
      setClass("Student_Info", slots = list(name = "character", age = "numeric", GPA = "numeric"))
      student2 <- new("Student_Info", name = "Alice", age = 22, GPA = 3.8)
3. Reference Class:
o Reference classes (also known as RC or R5) are a more recent addition to R.
o They provide a more traditional object-oriented programming experience.
o Reference class objects have mutable state (unlike S3 and S4 objects).
o Example:
      # Creating a reference class; setRefClass() returns a generator object
      Person <- setRefClass("Person", fields = list(name = "character", age = "numeric"))
      person1 <- Person$new(name = "Bob", age = 30)
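
Two behaviours distinguish these systems in practice: S3 selects a method from the object's class attribute, and reference class objects are modified in place. The sketch below illustrates both. The generic describe and the class Counter are hypothetical names introduced only for this example.

```r
library(methods)  # provides setRefClass(); normally attached already

# S3: a generic function dispatches on the class attribute
describe <- function(obj, ...) UseMethod("describe")
describe.Student_Info <- function(obj, ...) {
  paste(obj$name, "has GPA", obj$GPA)
}

student1 <- list(name = "John", age = 21, GPA = 3.5)
class(student1) <- "Student_Info"
s3_text <- describe(student1)   # "John has GPA 3.5"

# Reference class: objects are mutable and shared by reference
Counter <- setRefClass(
  "Counter",
  fields  = list(count = "numeric"),
  methods = list(
    increment = function() {
      count <<- count + 1   # <<- updates the field of this object
      invisible(.self)
    }
  )
)

first  <- Counter$new(count = 0)
second <- first                 # no copy is made: both names refer to one object
second$increment()
shared_count <- first$count     # 1
```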
In summary, classes in R allow you to define custom data structures and methods, making your code more organized and reusable. Each class
system has its own features, so choose the one that best fits your needs!

DATA INPUT/OUTPUT:

Reading and writing data is part of almost every data analysis or statistical modeling task.
The key aspects of data input and output are:

1. Data Input:
o Reading Data from External Sources:
▪ R allows you to read data from various formats, including:
▪ Text Files: Use functions like read.table(), read.csv(), or read.delim() to read data from plain text files.
▪ Excel Files: Use packages like readxl or openxlsx to read data from Excel spreadsheets.
▪ Database Connections: Connect to databases (e.g., MySQL, PostgreSQL) using packages like RMySQL,
RPostgreSQL, or odbc.
▪ Web Services: Fetch data from APIs or web services using packages like httr or jsonlite.
o Interactive Input:
▪ Use functions like readline() or scan() to read input directly from the user via the console.
2. Data Output:
o Writing Data to External Files:
▪ Save your results or data to external files:
▪ Text Files: Use functions like write.table() or write.csv() to save data to plain text files.
▪ Excel Files: Use packages like writexl or openxlsx to write data to Excel files.
▪ Other Formats: Save data in formats like JSON, XML, or HDF5 using relevant packages.
o Printing Output:
▪ Use functions like print(), cat(), or writeLines() to display output in the console.
▪ You can also redirect output to a file using sink().
3. Example:
o Reading data from a CSV file:
      # Read data from a CSV file
      my_data <- read.csv("my_data.csv")
o Writing data to a text file:
      # Save results to a tab-separated text file (my_results is assumed to be an existing data frame)
      write.table(my_results, "output.txt", sep = "\t", row.names = FALSE)
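
Because the file names in the example above are placeholders, here is a self-contained round trip that can be run as written. It writes a small illustrative data frame to a temporary CSV file, reads it back, and reports the result with cat().

```r
results <- data.frame(Name = c("Amiya", "Raj", "Asish"), Age = c(22, 25, 45))

csv_path <- tempfile(fileext = ".csv")   # temporary file for the demonstration
write.csv(results, csv_path, row.names = FALSE)

restored <- read.csv(csv_path)
print(restored)

# cat() prints values without the index prefix that print() adds
cat("Rows read:", nrow(restored), "\n")

unlink(csv_path)   # delete the temporary file
```

Setting row.names = FALSE prevents write.csv() from adding an extra column of row numbers to the file.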

Reliable data input and output is essential for effective data analysis and reporting in R.

DATA STORAGE FORMATS:


Data storage formats determine how data is saved and exchanged. Common formats in R include:

1. Textual Formats:
o CSV (Comma-Separated Values):
▪ CSV files store tabular data with values separated by commas.
▪ Widely used for data exchange between different software.
▪ Read using functions like read.csv() or read.table().
o TSV (Tab-Separated Values):
▪ Similar to CSV, but values are separated by tabs.
▪ Useful when data contains commas.
▪ Read using functions like read.delim().
2. Object Representation Formats:
o These functions write R objects in a textual form (as R code):
▪ dput(): Deparses a single R object into R code; read it back with dget().
▪ dump(): Writes one or more R objects to a file as R code; read it back with source().
3. Binary Formats:
o RDS (R Data Serialization):
▪ Binary format for saving a single R object, written with saveRDS().
▪ Efficient and preserves metadata.
▪ Read using readRDS().
4. Other Formats:
o Excel Files: Use packages like readxl or openxlsx.
o Stata Files: Use read.dta() and write.dta() from the foreign package.
o JSON, XML, HDF5: For specialized data storage needs.
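
The textual and binary object formats can be compared with a round trip through each. The sketch below uses only base R and temporary files; the object record is illustrative.

```r
record <- list(name = "Alice", scores = c(85, 92, 78))

# Binary: saveRDS() writes one object, readRDS() reads it back
rds_path <- tempfile(fileext = ".rds")
saveRDS(record, rds_path)
from_rds <- readRDS(rds_path)

# Textual: dput() writes the object as R code, dget() evaluates it again
txt_path <- tempfile(fileext = ".R")
dput(record, file = txt_path)
from_text <- dget(txt_path)

same_binary <- identical(record, from_rds)
same_text   <- isTRUE(all.equal(record, from_text))

unlink(c(rds_path, txt_path))   # delete the temporary files
```

The textual file can be opened and edited in any text editor, which is convenient for version control; the binary file is not human-readable but is typically more compact.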

Choose the format based on your use case, compatibility requirements, and efficiency needs.

SUBSETTING OBJECTS :

Subsetting extracts specific elements from an object (such as a vector, data frame, or matrix).
There are several ways to subset, depending on the type of object and your requirements. Common methods include:

1. Using Square Brackets ([]):


o The [ ] operator is versatile and widely used for subsetting.
o You can use it to access elements from vectors, data frames, and matrices.
o Examples:
▪ Subsetting a vector:
      x <- 1:10
      x[3] # Access the third element (value: 3)
      x[2:5] # Access elements 2 to 5 (values: 2, 3, 4, 5)
▪ Subsetting a data frame:
      # Suppose 'df' is a data frame with columns 'Name', 'Age', and 'Score'
      df[1:5, "Name"] # Access the first 5 names
      df[df$Age > 30, ] # Access rows where Age is greater than 30
2. Using the Dollar Sign ($):
o Named components, such as the columns of a data frame or the elements of a named list, can be accessed with the $ operator.
o Example:
      # Access the 'Age' column from the data frame 'df'
      df$Age
3. Using Functions (e.g., subset()):
o The subset() function allows you to create conditional or logical subsets.
o Example:
      # Select rows where 'Score' is greater than 80
      high_scores <- subset(df, Score > 80)
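
The three methods can be compared on one concrete data frame. The data below is made up for the example, since the text above only assumes that df has the columns Name, Age, and Score.

```r
df <- data.frame(
  Name  = c("Alice", "Bob", "Charlie"),
  Age   = c(25, 35, 42),
  Score = c(85, 92, 78)
)

older   <- df[df$Age > 30, ]   # rows for Bob and Charlie, all columns
ages    <- df$Age              # a plain numeric vector
age_col <- df["Age"]           # still a data frame, with one column

# subset() can filter rows and choose columns in a single call
high <- subset(df, Score > 80, select = c(Name, Score))
```

Note the trailing comma in df[df$Age > 30, ]: the part before the comma selects rows, and the empty part after it keeps every column.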

Subsetting lets you extract exactly the information you need from a data object.

VECTORIZATION:

Vectorization means performing an operation on an entire vector or array at once, rather than writing an explicit loop over its elements.

1. What Is Vectorization?
o Vectorization refers to the practice of applying an operation to an entire vector (or array) of data elements simultaneously.
o Instead of using explicit loops (like for loops), vectorized functions take advantage of optimized low-level code to process data
efficiently.
o Vectorization is a key feature of R, making code concise, readable, and computationally efficient.
2. Examples of Vectorized Operations:
o Element-Wise Arithmetic:
▪ You can perform arithmetic operations (addition, subtraction, multiplication, division) on entire vectors without explicit loops.
▪ Example:
      x <- 1:5
      y <- 2:6
      z <- x + y # Element-wise addition
o Logical Operations:
▪ Logical operations (e.g., &, |, ==, !=) work element-wise on vectors.
▪ Example:
      is_even <- x %% 2 == 0 # Check if elements are even
o Math Functions:
▪ Functions like sqrt(), log(), sin(), etc., operate element-wise on vectors.
▪ Example:
      sin_values <- sin(x)
3. Advantages of Vectorization:
o Speed: Vectorized operations are typically faster than equivalent explicit loops.
o Readability: Code is concise and easier to understand.
o Efficiency: The looping is done in R's compiled C/Fortran code rather than in interpreted R code.
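
To make the contrast concrete, the sketch below computes the same result with an explicit loop and with a vectorized expression, then shows two other common vectorized idioms. The variable names are illustrative, and no timings are claimed here.

```r
x <- 1:5

# Explicit loop: compute squares one element at a time
loop_squares <- numeric(length(x))
for (i in seq_along(x)) {
  loop_squares[i] <- x[i]^2
}

# Vectorized: one expression for the whole vector
vec_squares <- x^2

# Recycling: the single value 10 is added to every element
shifted <- x + 10

# ifelse() is a vectorized conditional
labels <- ifelse(x %% 2 == 0, "even", "odd")
```

Both approaches give the same values; the vectorized form is shorter and leaves the iteration to R's internals.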

Whenever possible, prefer vectorized operations to explicit loops in R.
