lab-record

Uploaded by

vyastanay30

lOMoARcPSD|41453364

Lab Record 21BCG10126

Computer Science (Anna University)

Downloaded by Tanay Vyas ([email protected])
VIT Bhopal University

NAS1001 – Associative Data Analytics (LTP-4)

Slot: B11+B12+B13+B14
Class ID: BL2023241000207
FALL SEMESTER 2023-2024

Course Instructor: Dr. D Lakshmi

Name of the Student: Aniket Shrivastava

List of Experiments

List of Challenging Experiments (Indicative)
SLO: 1, 2, 5, 9, 12

1. Understanding of R System and installation and configuration of R Environment and R-Studio, Understanding R Packages, their installation and management

2. Understanding of nuts and bolts of R:

a. R program Structure
b. R Data Type, Command Syntax and Control Structures
c. File Operations in R

3. Excel and R integration with R connector.

4. Preparing Data in R

a. Data Cleaning
b. Data imputation
c. Data conversion

5. Outliers detection using R

6. Correlation and Regression Analysis in R

7. Clustering Algorithms implementation using R

8. Classification Algorithm implementation using R: Classification (Spam/Not spam)

9. Case study on Stock Market Analysis and applications. Stock data can be obtained from Yahoo! Finance, Google Finance. A team of students can apply statistical modeling on the stock data to uncover hidden patterns. R provides tools for moving averages, auto regression and time-series analysis, which form the crux of financial applications.

10. Detect credit card fraudulent transactions. The dataset can be obtained from Kaggle. The team will use a variety of machine learning algorithms to distinguish fraudulent from non-fraudulent transactions.

Experiment No: 1

Aim: Understanding of R System and installation and configuration of R Environment and R-Studio, Understanding R Packages, their installation and management

Data Description: R is a programming language for statistical computing and graphics, supported by the R Core Team and the R Foundation for Statistical Computing.
Designed by: Ross Ihaka, Robert Gentleman

Installing R:

Download R:

1. Go to the R Project's official website: https://www.r-project.org/

2. Click on the "CRAN" link, pick a mirror, and under the "Download and Install R" section choose the installer for your operating system.
3. For Windows: Double-click the downloaded executable file and follow the installation
instructions.
4. For macOS: Double-click the downloaded package file and follow the installation
instructions.
5. For Linux: Follow the installation instructions specific to your Linux distribution.

Installing RStudio:

Download RStudio:

1. Go to the RStudio download page: https://www.rstudio.com/products/rstudio/download/

2. Under "RStudio Desktop," click the appropriate download link for your operating system
(Windows, macOS, or Linux).

Install RStudio:

1. For Windows: Double-click the downloaded installer and follow the installation
instructions.
2. For macOS: Double-click the downloaded disk image (.dmg) file, drag the RStudio icon
to the Applications folder, and then open RStudio from the Applications folder.
3. For Linux: Follow the installation instructions specific to your Linux distribution.

Installing R packages
Installing packages is a fundamental part of working with R. R packages contain pre-built functions, data sets, and
documentation that extend the capabilities of the R programming language. Here are the steps
to install R packages using the R console within RStudio:

Open RStudio:
Launch RStudio on your computer.

Open R Console:

Once RStudio is open, you'll see several panels. By default, the R Console is in the left pane
(the lower-left pane once a script is open in the editor). This is where you can directly
interact with R by typing commands.

Install a Package:
To install an R package, use the install.packages() function with the name of the
package you want to install. For example, to install the "ggplot2" package, type the following
command in the R Console and press Enter:

install.packages("ggplot2")

Load the Package:

After installing a package, you need to load it into your R session to use its functions. Use the
library() function for this purpose. For example, to load the "ggplot2" package, type:

library(ggplot2)
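
Re-running install.packages() every time a script starts is slow, so a common pattern is to install a package only when it is missing. The following is a minimal sketch of that pattern; the helper name ensure_package is hypothetical (it is not part of R), and the base package "stats" is used only so the example runs without downloading anything.

```r
# Hypothetical helper: install a package only if it is not already available.
ensure_package <- function(pkg) {
  if (!requireNamespace(pkg, quietly = TRUE)) {
    install.packages(pkg)
  }
  # Report (invisibly) whether the package can now be loaded
  invisible(requireNamespace(pkg, quietly = TRUE))
}

# "stats" ships with R, so nothing is downloaded here
ok <- ensure_package("stats")
print(ok)
```

To use it for the package from the steps above, one would call ensure_package("ggplot2") and then library(ggplot2).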

Experiment No: 2

Aim: Understanding of nuts and bolts of R:

a. R program Structure
b. R Data Type, Command Syntax and Control Structures
c. File Operations in R

Data Description

a. R Program Structure: An R program consists of a series of commands
that are executed sequentially. These commands can be typed directly into
the R console or saved in a script file with a .R extension.

b. R Data Types, Command Syntax, and Control Structures: R
supports various data types, including numeric, character, logical, factor, and
more. Here's a quick overview:

Numeric: Used for storing numeric values (integers or decimals).
Character: Used for storing text data.
Logical: Represents the binary values TRUE or FALSE.
Factor: Represents categorical data with levels or categories.

c. File Operations in R: R provides functions to read and write files in several
formats, including text, CSV, and Excel.

R Code

a. R Program Structure:

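# Load the packages the script needs (package_name is a placeholder)
library(package_name)

# Define functions
my_function <- function(arg1, arg2) {
  result <- arg1 + arg2  # placeholder computation
  return(result)
}

# Call the function and display the result (value1 and value2 are placeholders)
result <- my_function(value1, value2)
print(result)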

b. R Data Types, Command Syntax, and Control Structures:

x <- 5
name <- "John"
is_valid <- TRUE
sum_result <- 3 + 7
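
The assignments above cover data types and basic syntax; the control structures named in the aim can be illustrated in the same way. This is a minimal sketch using only base R; the variable names size, total, and countdown are made up for the example.

```r
# Inspect the type of a few values
x <- 5
name <- "John"
is_valid <- TRUE
types <- c(class(x), class(name), class(is_valid))
print(types)

# if / else: choose a label based on a condition
if (x > 3) {
  size <- "large"
} else {
  size <- "small"
}

# for loop: add up the integers 1 to 5
total <- 0
for (i in 1:5) {
  total <- total + i
}

# while loop: count down until the condition becomes FALSE
countdown <- 3
while (countdown > 0) {
  countdown <- countdown - 1
}

print(c(size = size, total = total, countdown = countdown))
```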

c. File Operations in R:
Reading files

# Reading text files
data <- read.table("data.txt", header = TRUE)

# Reading CSV files
data <- read.csv("data.csv")

# Reading Excel files (requires 'readxl' package)
library(readxl)
data <- read_excel("data.xlsx")

Writing files

# Writing data to text file
write.table(data, "output.txt", sep = "\t", row.names = FALSE)

# Writing data to CSV file
write.csv(data, "output.csv", row.names = FALSE)

# Writing data to Excel file (requires 'openxlsx' package)
library(openxlsx)
write.xlsx(data, "output.xlsx")
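
The read and write calls above depend on files that may not exist on a given machine. As a self-contained check, the sketch below writes a small data frame to a temporary CSV file and reads it back, using only base R. The data frame contents are made up for illustration.

```r
# Made-up data frame, used only for this illustration
df <- data.frame(id = 1:3, score = c(90.5, 82, 77.25))

# Write to a temporary file so nothing in the working directory is touched
path <- tempfile(fileext = ".csv")
write.csv(df, path, row.names = FALSE)

# Confirm the file exists, then read it back
file_written <- file.exists(path)
df_back <- read.csv(path)

# Compare the original and the re-read data
same <- isTRUE(all.equal(df, df_back))
print(same)

# Clean up the temporary file
unlink(path)
```

Passing row.names = FALSE matters here: without it, write.csv() adds a column of row names that would come back as an extra column on reading.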

Experiment No: 3

Aim: Excel and R integration with R connector.

Data Description:
In this example, the CSV file has two columns:
experience_years: This column represents the number of years of experience each person
has.
salary: This column contains the corresponding salary for each person based on their
experience.
Sample rows and columns

R Code
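# read.csv() ships with base R, so no extra package needs to be installed
# file.choose() opens a dialog to pick the CSV file; header = TRUE uses the first row as column names
Salary_Dataset <- read.csv(file.choose(), header = TRUE)
Salary_Dataset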

Sample Input and Output

Experiment No: 4

Aim: Preparing Data in R

a. Data Cleaning
b. Data imputation
c. Data conversion

Data Description

In this example, the CSV file has two columns:

experience_years: This column represents the number of years of experience each person
has.
salary: This column contains the corresponding salary for each person based on their
experience.

Sample rows and columns

R Code
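# Load libraries
library(dplyr)
library(missForest)

# Read dataset
data <- read.csv("data.csv")

# Data Cleaning: drop duplicate rows and an unneeded column
# (Irrelevant_Column is a placeholder; use a column from your own data)
cleaned_data <- data %>%
  distinct() %>%
  select(-Irrelevant_Column)

# Data Conversion (if needed): missForest() expects numeric or factor columns,
# so categorical columns are converted before imputation
# (Categorical_Column is a placeholder name)
cleaned_data$Categorical_Column <- as.factor(cleaned_data$Categorical_Column)

# Check for missing values
missing_values <- sum(is.na(cleaned_data))

if (missing_values > 0) {
  # Data Imputation: missForest() returns a list; the imputed data is in $ximp
  imputed_data <- missForest(cleaned_data, verbose = TRUE)$ximp
} else {
  imputed_data <- cleaned_data
}

# Display prepared dataset
print(imputed_data)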
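
For a two-column data set like the one described above, the three preparation steps can also be done in base R without extra packages. The sketch below swaps in simple mean imputation instead of missForest; the sample values are made up, and the choice to store experience as whole years is only an example of a conversion.

```r
# Made-up sample in the shape described above (values are illustrative only)
salary_df <- data.frame(
  experience_years = c(1, 2, 2, 3, NA, 5),
  salary = c(30000, 35000, 35000, NA, 50000, 60000)
)

# Data cleaning: drop exact duplicate rows
cleaned <- unique(salary_df)

# Data imputation: replace each NA with the mean of its column
imputed <- cleaned
for (col in names(imputed)) {
  col_mean <- mean(imputed[[col]], na.rm = TRUE)
  imputed[[col]][is.na(imputed[[col]])] <- col_mean
}

# Data conversion: store experience as whole years (integer type)
imputed$experience_years <- as.integer(round(imputed$experience_years))

print(imputed)
```

Mean imputation is easy to follow but shrinks the variance of the column, which is one reason model-based methods such as missForest are often preferred on larger data sets.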

Sample Input and Output

Experiment No: 5

Aim: Outliers detection using R

Data Description
In this example, the CSV file has two columns:
experience_years: This column represents the number of years of experience each person
has.
salary: This column contains the corresponding salary for each person based on their
experience.

Sample rows and columns

R Code
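
One common way to flag outliers in a numeric column such as salary is the 1.5 x IQR rule: values more than 1.5 interquartile ranges below the first quartile or above the third quartile are treated as outliers. The following is a minimal sketch of that rule in base R, not necessarily the method used in this experiment; the salary values are made up, and the last one is deliberately extreme.

```r
# Made-up salaries; the last value is deliberately extreme
salary <- c(39000, 46000, 37000, 43000, 39500, 56000, 60000, 54000, 64000, 250000)

# First and third quartiles and the interquartile range
q1 <- quantile(salary, 0.25, names = FALSE)
q3 <- quantile(salary, 0.75, names = FALSE)
iqr <- q3 - q1

# Fences of the 1.5 x IQR rule
lower <- q1 - 1.5 * iqr
upper <- q3 + 1.5 * iqr

# Values outside the fences are flagged as outliers
outliers <- salary[salary < lower | salary > upper]
print(outliers)
```

A box plot of the same column, drawn with boxplot(salary), gives a quick visual check of the flagged points.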

Sample Input and Output

Experiment No: 6

Aim: Correlation and Regression Analysis in R

Data Description

In this example, the CSV file has two columns:

experience_years: This column represents the number of years of experience each person
has.
salary: This column contains the corresponding salary for each person based on their
experience.

Sample rows and columns

R Code
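
For the two columns described above, correlation can be measured with cor() and a simple linear regression fitted with lm(), both from base R. The sketch below assumes a data frame with the columns experience_years and salary; the values are made up for illustration and are not the experiment's data.

```r
# Made-up sample (values are illustrative only)
salary_df <- data.frame(
  experience_years = c(1, 2, 3, 4, 5, 6, 7, 8),
  salary = c(32000, 36500, 41000, 44000, 50500, 54000, 59500, 63000)
)

# Pearson correlation between experience and salary
r <- cor(salary_df$experience_years, salary_df$salary)
print(r)

# Simple linear regression: salary as a function of experience
model <- lm(salary ~ experience_years, data = salary_df)
print(coef(model))

# Predict the salary for a hypothetical person with 10 years of experience
predicted <- predict(model, newdata = data.frame(experience_years = 10))
print(predicted)
```

With a single predictor, the squared correlation equals the R-squared reported by summary(model), which links the two analyses.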

Experiment No: 7

Aim: Clustering Algorithms implementation using R

Sample rows and columns

R Code
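
Two widely used clustering algorithms, k-means and hierarchical clustering, are available in base R as kmeans() and hclust(). The sketch below is an illustration only: it assumes the built-in iris data set, which may differ from the data used in this experiment, and the choice of three clusters is likewise an assumption.

```r
# iris ships with R; use its four numeric measurements, standardized
features <- scale(iris[, 1:4])

# k-means starts from random centers, so fix the seed for repeatability
set.seed(42)
km <- kmeans(features, centers = 3, nstart = 25)

# Cluster sizes and a cross-tabulation against the known species
print(km$size)
print(table(cluster = km$cluster, species = iris$Species))

# Hierarchical clustering on the same data, cut into three groups
hc <- hclust(dist(features), method = "ward.D2")
hc_groups <- cutree(hc, k = 3)
print(table(hc_groups))
```

Standardizing with scale() keeps a variable with a large numeric range from dominating the distance calculation.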

Sample Input and Output

Experiment No: 8

Aim: Classification Algorithm implementation using R

Classification (Spam/Not spam)

R Code

# Load required libraries

library(tm) # Text mining
library(e1071) # For Naive Bayes classifier
library(caret) # For model evaluation

# Load the SpamAssassin dataset (replace with your actual file path)
spam_data <- read.csv("path/to/spamassassin_data.csv", stringsAsFactors = FALSE)

# Preprocess the text data

corpus <- Corpus(VectorSource(spam_data$text))
corpus <- tm_map(corpus, content_transformer(tolower))
corpus <- tm_map(corpus, removePunctuation)
corpus <- tm_map(corpus, removeNumbers)
corpus <- tm_map(corpus, removeWords, stopwords("en"))
corpus <- tm_map(corpus, stripWhitespace)

# Create a document-term matrix

dtm <- DocumentTermMatrix(corpus)

# Convert the document-term matrix to a data frame

spam_df <- as.data.frame(as.matrix(dtm))
colnames(spam_df) <- make.names(colnames(spam_df))

# Combine with labels (stored as a factor, which naiveBayes() and confusionMatrix() expect)

spam_df$label <- factor(spam_data$label)

# Split data into training and testing sets

set.seed(123)
train_indices <- sample(1:nrow(spam_df), 0.7 * nrow(spam_df))
train_data <- spam_df[train_indices, ]
test_data <- spam_df[-train_indices, ]

# Train a Naive Bayes classifier

naive_bayes_model <- naiveBayes(label ~ ., data = train_data)

# Make predictions
predictions <- predict(naive_bayes_model, newdata = test_data, type = "class")

# Evaluate the model

conf_matrix <- confusionMatrix(predictions, test_data$label)
print(conf_matrix)

Sample Input and Output

Experiment No: 9

Aim: Case study on Stock Market Analysis and applications. Stock data can be obtained from
Yahoo! Finance, Google Finance. A team of students can apply statistical modeling on the stock
data to uncover hidden patterns. R provides tools for moving averages, auto regression and
time-series analysis, which form the crux of financial applications.

Data Description

Stock data imported from Yahoo! Finance.

R Code
# Load required libraries
library(dplyr)
library(lubridate)  # date parsing with ymd()
library(TTR)        # SMA() for moving averages
library(forecast)   # auto.arima()
library(ggplot2)    # plotting

# Read the stock data CSV file (or load data from API)
stock_data <- read.csv("stock_data.csv")

# Convert date column to Date format
stock_data$Date <- ymd(stock_data$Date)

# Calculate 50-day and 200-day moving averages
stock_data$MA_50 <- SMA(stock_data$Close, n = 50)
stock_data$MA_200 <- SMA(stock_data$Close, n = 200)

# Convert data to time series format
# (252 approximates the trading days in a year; daily stock data has no weekend rows)
stock_ts <- ts(stock_data$Close, frequency = 252)

# Fit an ARIMA model, with the order selected automatically
ar_model <- auto.arima(stock_ts)

# Decompose time series into trend, seasonal, and residual components
# (decompose() needs at least two full periods of data)
decomposed <- decompose(stock_ts)

# Plot decomposed components
plot(decomposed)

# Create a time series plot of stock prices and moving averages
ggplot(stock_data, aes(x = Date)) +
  geom_line(aes(y = Close, color = "Stock Price")) +
  geom_line(aes(y = MA_50, color = "50-day MA")) +
  geom_line(aes(y = MA_200, color = "200-day MA")) +
  labs(title = "Stock Price and Moving Averages", y = "Price") +
  scale_color_manual(values = c("Stock Price" = "blue", "50-day MA" = "red",
                                "200-day MA" = "green"))

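A simple moving average of window n is the mean of the current value and the n - 1 values before it, so the first n - 1 positions have no result. The sketch below computes it in base R with stats::filter() to make that definition concrete; the helper name moving_average and the closing prices are made up for illustration.

```r
# Made-up closing prices, used only for illustration
close <- c(10, 11, 12, 13, 14, 15, 16)

# Trailing moving average: each value is the mean of the current
# observation and the n - 1 observations before it (sides = 1)
moving_average <- function(x, n) {
  as.numeric(stats::filter(x, rep(1 / n, n), sides = 1))
}

ma_3 <- moving_average(close, 3)
print(ma_3)
```

The leading NA values are also why the 200-day moving average line in a price plot starts later than the price series itself.
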
Sample Input and Output

Experiment No: 10

Aim: Detect credit card fraudulent transactions - The dataset can be obtained from Kaggle. The
team will use a variety of machine learning algorithms that will be able to discern fraudulent
from non-fraudulent one.

Data Description
The dataset was obtained from Kaggle

R Code
# Load required libraries
library(randomForest)

# Load the credit card fraud dataset downloaded from Kaggle
# (the file name is an assumption; replace with your actual file path)
CreditCardFraud <- read.csv("creditcard.csv")

# Treat the target as a factor so randomForest() performs classification
CreditCardFraud$Class <- as.factor(CreditCardFraud$Class)

# Split data into training and testing sets (70% training, 30% testing)
set.seed(123)
train_indices <- sample(1:nrow(CreditCardFraud), 0.7 * nrow(CreditCardFraud))
train_data <- CreditCardFraud[train_indices, ]
test_data <- CreditCardFraud[-train_indices, ]

# Build Random Forest model
rf_model <- randomForest(Class ~ ., data = train_data, ntree = 100)

# Make predictions
predictions <- predict(rf_model, newdata = test_data)

# Calculate accuracy
accuracy <- sum(predictions == test_data$Class) / nrow(test_data)
print(paste("Accuracy score on Test Data:", accuracy))
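
Accuracy alone can be misleading when fraudulent transactions are rare, because a model that never predicts fraud still scores highly. The sketch below shows this with made-up labels (990 legitimate, 10 fraudulent) and a deliberately useless predictor; the numbers are illustrative and not taken from the Kaggle data.

```r
# Made-up labels: 990 legitimate (0) and 10 fraudulent (1) transactions
actual <- factor(c(rep(0, 990), rep(1, 10)), levels = c(0, 1))

# A useless "model" that labels every transaction as legitimate
predicted <- factor(rep(0, 1000), levels = c(0, 1))

# Confusion table: rows are predictions, columns are true classes
conf <- table(predicted = predicted, actual = actual)
print(conf)

# Accuracy: share of all transactions labelled correctly
accuracy <- sum(diag(conf)) / sum(conf)

# Recall: share of actual frauds that were caught
recall <- conf["1", "1"] / sum(conf[, "1"])

print(c(accuracy = accuracy, recall = recall))
```

Reporting recall (and precision) for the fraud class alongside accuracy gives a fairer picture of how a classifier behaves on imbalanced data.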

Sample Input and Output
