0% found this document useful (0 votes)

4 views

Data Analytics Lab Manual Students.docx

The document outlines the vision and mission of the P. R. Pote Patil College of Engineering & Management's Department of Artificial Intelligence & Data Science for the academic year 2024-25, emphasizing academic excellence, ethical values, and professional development. It details program outcomes, educational objectives, and specific outcomes for graduates, focusing on technical proficiency, leadership, and lifelong learning. Additionally, it provides guidelines for teachers and instructions for students regarding practical laboratory work, assessment criteria, and skills development.

Uploaded by

amankokate01

We take content rights seriously. If you suspect this is your content, claim it here.

Available Formats

Download as PDF, TXT or read online on Scribd

0% found this document useful (0 votes)

4 views

Data Analytics Lab Manual Students.docx

Uploaded by

amankokate01

We take content rights seriously. If you suspect this is your content, claim it here.

Available Formats

Download as PDF, TXT or read online on Scribd

You are on page 1/ 33

P. R. POTE PATIL
College of Engineering & Management, Amravati.

DEPARTMENT OF ARTIFICIAL INTELLIGENCE & DATA

SCIENCE

Year: 2024-25 Semester: VI

Course: Data Analytics Laboratory

Institute Vision Mission

Vision
To flourish as a center of excellence for producing the skilled technocrats
and committed human beings.
Mission
● To create conducive environment for teaching &learning.
● To impart quality education through demanding academic programs.
● To enhance career opportunities by exposure to Industries & recent
technologies.
● To develop professionals with strong ethics and human values for the
betterment of society.

Department of Artificial Intelligence & Data Science

Vision
To achieve excellence in education, research & innovation for steering
ethical, impactful, and globally competitive engineers
Mission
● To provide academic excellence, ensuring students equipped with
cutting-edge knowledge and skills.
● To promote a vibrant research ecosystem encouraging faculties and
students contributing to the advancement of knowledge and innovation.
● To inculcate ethical values and a sense of responsibility in students and
preparing them to be ethical leaders who prioritize societal well-being and
equity.

Program Outcomes:
Engineering Graduate will be able to:
1. Engineering Knowledge: Apply the knowledge of mathematics, science,
engineering fundamentals, and an engineering specialization to the solution of
complex engineering problems.
2. Problem Analysis: Identify, formulate, research literature, and analyze
complex engineering problems reaching substantiated conclusions using first
principles of mathematics, natural sciences, and engineering sciences.
3. Design/development of solutions: Design solutions for complex
engineering problems and design system components or processes that meet
the specified needs with appropriate consideration for the public health and
safety, and the cultural, societal, and environmental considerations.
4. Conduct investigations of complex problems: Use research-based
knowledge and research methods including design of experiments, analysis
and interpretation of data, and synthesis of the information to provide valid
conclusions.
5. Modern tool usage: Create, select, and apply appropriate techniques,
resources, and modern engineering and IT tools including prediction and
modeling to complex engineering activities with an understanding of the
limitations.
6. The engineer and society: Apply reasoning informed by the contextual
knowledge to assess societal, health, safety, legal and cultural issues and the
consequent responsibilities relevant to the professional engineering practice.
7. Environment and sustainability: Understand the impact of the professional
engineering solutions in societal and environmental contexts, and
demonstrate the knowledge of, and need for sustainable development.
8. Ethics: Apply ethical principles and commit to professional ethics and
responsibilities and norms of the engineering practice.
9. Individual and team work: Function effectively as an individual, and as a
member or leader in diverse teams, and in multidisciplinary settings.
10.Communication: Communicate effectively on complex engineering activities
with the engineering community and with society at large, such as, being able
to comprehend and write effective reports and design documentation, make
effective presentations, and give and receive clear instructions.
11.Project management and finance: Demonstrate knowledge and
understanding of the engineering and management principles and apply
these to one’s own work, as a member and leader in a team, to manage
projects and in multidisciplinary environments.
12.Life-long learning: Recognize the need for, and have the preparation and
ability to engage in independent and life-long learning in the broadest context
of technological change.

Program Educational Objectives (PEO):

To prepare the Engineering graduates to
PEO1:Technical Proficiency: Graduates will demonstrate proficiency in core
areas of computer science, including algorithms, data structures, software
engineering, and database systems, with a specialized focus on artificial
intelligence and data science techniques.
PEO2: Leadership and Innovation: Graduates will exhibit leadership qualities and
innovative thinking, contributing to the development and implementation
of AI-driven solutions that address societal, industrial, and environmental
challenges.
PEO3: Ethical and Professional Responsibility: Graduates will adhere to ethical
and professional standards in their engineering practice, demonstrating
integrity, accountability, and a commitment to societal well-being in the
design and deployment of AI and DS systems.
PEO4: Lifelong Learning: Graduates will adopt a culture of lifelong learning and
professional development, empowering individuals to adapt to emerging
technologies, navigate complex socio-technical landscapes, and contribute
meaningfully to the advancement of AIDS throughout their careers.

Program Specific Outcome (PSO):

PSO1:Apply Intelligent Systems Development: Develop intelligent systems and
applications integrating AI and DS technologies to enhance functionality
and performance.
PSO2: Problem-Solving Skills: Possess strong analytical and problem-solving
skills, enabling them to identify, formulate, and solve complex engineering
problems using AI and DS methodologies in diverse application domains.y

GUIDELINES FOR TEACHERS

Teachers shall discuss the following points with students before start of practical of the
subject.
1. Learning Overview: To develop better understanding of importance of the subject.
To know related skills to be developed such as intellectual and motor skills.
2. Know you’re Laboratory Work: To understand the layout of laboratory, specifications
of equipment / instruments /materials, procedure, working in groups, planning time
etc. also to know total amount of work to be done in the laboratory.
3. Teacher shall ensure that required equipment is in working condition before start of
each experiment, also keep operating instruction manual available.
4. Explain prior concepts to the students before starting of each experiment.
5. Evolve student’s activity at the time of conduct of each experiment.
6. While taking reading / observation each student (from batch of 20 students) shall be
given a chance to perform / observe the experiment.
7. Teacher shall assess the performance of students continuously.
8. Teacher is expected to share the skills to be developed in the students.
9. Teacher should ensure that the respective skills are developed in the students after
the completion of the practical exercise.
10.Teacher may provide additional knowledge and skills to the students even though not
covered in the manual but are expected from students by the industries.
11.Teacher may suggest the students to refer additional related literature of the technical
papers / reference books / Seminar Proceedings, etc.
12.Focus should be given on development of enlisted skills rather than theoretical /
codified knowledge.
13.During assessment teacher is expected to ask questions to the students to tap their
achievements regarding related knowledge and skills.
14.Teacher should give more focus on hands on skills.

INSTRUCTIONS FOR STUDENTS

1. Students shall read the points given below for understanding the theoretical
concepts and practical applications.
2. Listen carefully to the lecture given by teacher about importance of subject,
curriculum philosophy, learning structure, skills to be developed, information about
equipment, instruments, procedure, method of continuous assessment, tentative
plan of work in laboratory and total amount of works to be done in a semester.
3. Student shall undergo study visit of the laboratory for types of equipment, and
material to be used, before performing experiments.
4. Read the write up of each experiment to be performed, a day in advance.
5. Organize the work in the group and make a record of all observations.
6. Understand the purpose of experiment and its practical implications.
7. Student should not hesitate to ask any difficulty faced during conduct of practical
/exercise.
8. Write the answers of the questions allotted by the teacher during practical hours if
possible or afterwards, but immediately.
9. Student should develop the habit of pear discussion / group discussion related to
experiments / exercise so that exchanges of knowledge / skills could take place.
10.Students shall attempt to develop related hands-on-skills and gain confidence.
11.Student shall focus on development of skills rather than theoretical or codified
knowledge.
12.Student shall insist for the completions of recommended Laboratory Work, answers
to the given question etc.
13.Student shall develop the habit of evolving more ideas, innovations, skills etc. that
included in the scope of the manual.
14.Student shall refer technical magazines, proceedings of the Seminars, refer website
related to the scope of the subjects and update their knowledge and skills.
15.Student should develop the habit of not depend totally on teachers but to develop
self-learning techniques.
16.Student should develop the habit to interact with the teacher without hesitation with
respect to academic involved.
17.Student should develop habit to submit the practical’s exercise continuously and
progressively on the schedule dates and should get the assessment done.
18.Student should be well prepared while submitting the write up of the exercise. This
will develop the continuity of the studies and he will not be overloaded at the end of
the term.

P. R. POTE PATIL
COLLEGE OF ENGINEERING & MANAGEMENT, AMRAVATI.

DEPARTMENT OF ELECTRICAL ENGINEERING

Certificate
This is to certify that Mr./Ms…………………………………………... of
….... Semester of Bachelor of Technology in Artificial Intelligence &
Data Science of P. R. Pote Patil College of Engineering &
Management, Amravati, has completed the term work satisfactory of
course…………………. for the academic year 20 - 20 as
prescribed in the curriculum.

Date………………… Roll No……………………

Subject Teacher Head of the Department

LIST OF (PRACTICALS / EXPERIMENTS)& PROGRESSIVE ASSESSMENT FOR TERM WORK

Academic Year :2024-25 Course……………………………….

Course Code: 6AD01 Semester: VI
Name of Faculty: Prof. S. C. Pakhale
Name of Student: ………………………………………
Roll No………………………………

SN. Title of the Date of Date of

Practical / Experiment Performance Submission Sign of Teacher
1 Install and configure Hadoop framework
2 Working with Hadoop distributed file system
3 Implement the Map reduce method in
hadoop and write the Word count program
4 Develop the Program for Apriori Algorithm
5 Installation of R Studio and write a simple
program for it
6 To construct a Data Frame and develop an R
program for data frame
7 Construct a program for Manipulating &
Processing Data in R.
To Generate Graphs Using Plot(), Hist(),
8 Linechart(), Pie(), Boxplot(), and
Scatterplots() Develop an R program

Signature of Faculty
Course Outcomes
After successful completion of laboratory course, the students will able to
SN Outcomes
After successful completion of this course, students will be able to
1 To understand Data Analytics Life Cycle and Business Challenges
2 To understand Analytical Techniques and Statically Models
3 To understand Statically Modeling Language.
Rubrics used for continuous Assessment in every lab session
Skills Allocated Parameters High Medium Low
Marks
1. Handle equipment/ tools/ Most Partially Below
commands correctly or satisfactory successful expectation
Logic Formation (5) (4-5) (3) (0-2)
2. Work cohesively in team Exceptional (2) Satisfactory Unsatisfactory
(2) (1) (0)
Process
Related 15
1. Integrate system & Partially Incorrect or
Skills measure parameters correct (2-3) unsatisfactory
correctly or Debugging Highly
satisfactory (4) (0-1)
Ability(4) *Completed *Completed
2. Completed experiment as In time (4) but delayed with 50%
per schedule (4) (2) delayed (1)
Obtain correct results, Highly Partially Incorrect (0-1)
Interpret results (4) Accurate (4) correct (2-3)
Product Highly accurate Unsatisfactory
Related 10 Draw conclusion (3) (3) Partially (2) (1)
Skills
Answer practical related Highly Moderate Unsatisfactory
questions & submit the satisfactory (3) satisfactory (1)
write up of expt on time (3) (2)
Total Marks 25 Marks
Assessment
Marks Rubrics: 25 = a (05) + b (02) + c (04) + d (04) + e (04) + f (03) + g (03)
a: Handle equipment/ tools/ commands correctly or Logic Formation
b: Work cohesively in team
c: Integrate system & measure parameters correctly or Debugging Ability
d: Completed experiment as per schedule
e: Obtain correct results, Interpret results
f: Draw conclusion
g: Answer practical related questions & submit the write up of experiment on time
SN. Title of the (a) (b) (c) (d) (e) (f) (g) Total
25
Practical / Experiment 05 02 04 04 04 03 03 Marks
1 Install and configure
Hadoop framework
2 Working with Hadoop
distributed file system
Implement the Map reduce
3 method in hadoop and write
the Word count program
4 Develop the Program for
Apriori Algorithm
5 Installation of R Studio and
write a simple program for it
To construct a Data Frame
6 and develop an R program
for data frame
Construct a program for
7 Manipulating & Processing
Data in R.
To Generate Graphs Using
Plot(), Hist(), Linechart(),
8 Pie(), Boxplot(), and
Scatterplots() Develop an R
program
EXPERIMENT NO: 01
Title: Install and configure Hadoop framework
Objective: to enable the use of a distributed computing environment for handling
large volumes of data efficiently
Software Details:
SN Name of Software/Tools Specification Qty Required
01 Hadoop 3.12.4 01
02 Java 8 version 01
Theory:
Hadoop software can be installed in three modes of
Hadoop is a Java-based programming framework that supports the processing and
storage of extremely large data sets on a cluster of inexpensive machines. It was the first
major open source project in the big data playing field and is sponsored by the Apache
Software Foundation.
Hadoop-2.7.3 is comprised of four main layers:
● Hadoop Common is the collection of utilities and libraries that support other
hadoop modules.
● HDFS, which stands for Hadoop Distributed File System, is responsible for
persisting data to disk.
● YARN, short for Yet Another Resource Negotiator, is the "operating system" for
HDFS.
● Map Reduce is the original processing model for Hadoop clusters. It distributes
work within the cluster or map, then organizes and reduces the results from the
nodes into a response to a query. Many other processing models are available for
the 2.x version of Hadoop.
Hadoop clusters are relatively complex to set up, so the project includes a stand-alone
mode which is suitable for learning about Hadoop, performing simple operations, and
debugging.
Procedure:
We’ll install Hadoop in stand-alone mode and run one of the example example Map
Reduce programs it includes verifying the installation.
Prerequisites:
Step1: Installing Java 8 version.
Open jdk version "1.8.0_91"
Open JDK Runtime Environment (build1.8.0_91-8u91-b14-3ubuntu1~16.04.1-b14)
Open JDK 64-Bit Server VM (build 25.91-b14, mixed mode)
This output verifies that Open JDK has been successfully installed.
Note: To set the path for environment variables. i.e. JAVA_HOME
Step2: Installing Hadoop
With Java in place, we'll visit the Apache Hadoop Releases page to find the most recent
stable release. Follow the binary for the current release:
Procedure:
Step1: Installing Java 8 version.

Download Hadoop from www.hadoop.apache.org

Conclusion:
Assessment Scheme:
Process Related Skills Product Related Skills Total Signature of
(15-M) (10-M) (25-M) Faculty

EXPERIMENT NO: 02
Title: Working with Hadoop distributed file system
Objective: To efficiently store and manage large volumes of data across a
distributed environment
Theory:
Working with the Hadoop Distributed File System (HDFS) involves interacting with a
distributed storage system designed to handle large amounts of data across multiple
machines. HDFS is part of the Apache Hadoop ecosystem and provides fault-tolerant
storage with high throughput.
Here’s an overview of how to work with HDFS, including both the Hadoop command-line
interface (CLI) and programmatic approaches (Java and Python).
1. HDFS Overview
HDFS has two key components:
1. NameNode: This is the master node that manages the filesystem namespace and
regulates access to files.
2. DataNode: These are the worker nodes that store the actual data blocks.
HDFS stores large files by splitting them into blocks (default block size is 128 MB or 256
MB) and distributing them across different nodes in the cluster.
2. Using the HDFS Command-Line Interface (CLI)
You can interact with HDFS using the Hadoop CLI. Some basic commands include:
a. Listing files in HDFS:
hdfs dfs -ls /user/hadoop/
This lists all the files and directories in the /user/hadoop/ directory.
b. Creating directories in HDFS:
hdfs dfs -mkdir /user/hadoop/new_dir
This creates a new directory in HDFS at /user/hadoop/new_dir.
c. Copying files from local filesystem to HDFS:
hdfs dfs -put localfile.txt /user/hadoop/
This uploads localfile.txt from your local file system to HDFS under /user/hadoop/.
d. Copying files from HDFS to local file system:
hdfs dfs -get /user/hadoop/testfile.txt /path/to/local/
This retrieves the testfile.txt from HDFS to your local machine.
e. Reading a file from HDFS:
hdfs dfs -cat /user/hadoop/testfile.txt
This prints the content of testfile.txt from HDFS to the terminal.
f. Deleting files from HDFS:
hdfs dfs -rm /user/hadoop/testfile.txt
This deletes the file testfile.txt from HDFS.
g. Checking the status of HDFS:
hdfs dfsadmin -report
This provides an overview of the HDFS cluster's status, including the amount of data
stored and available space.
Program:-

Result:-

Conclusion:
Assessment Scheme:
Process Related Skills Product Related Skills Total Signature of
(15-M) (10-M) (25-M) Faculty

EXPERIMENT NO: 03
Title: Implement the Map Reduce method in hadoop and write the Word count program
Objective: To leverage the distributed computing capabilities of Hadoop to efficiently
process large datasets.
Theory:
Map Reduce is a core component of Hadoop that enables distributed data
processing. It allows you to perform operations like filtering, aggregation, and
transformation of large datasets.
Example Program: Word Count Program This is a classic example where you
count the number of occurrences of each word in a dataset. Here’s how it works:
Mapper: Reads the input and splits it into words.
Reducer: Aggregates the word counts.
Code (MapReduce in Java):
Program:-

Result:-

Conclusion:

Assessment Scheme:
Process Related Skills Product Related Skills Total Signature of
(15-M) (10-M) (25-M) Faculty
EXPERIMENT NO: 04
Title: Develop the Program for Apriori Algorithm
Objective: To implement an efficient method for identifying frequent itemsets and
generating association rules from a transactional dataset.
Theory:
The Apriori algorithm is a classic data mining technique used to find frequent itemsets in
a transaction dataset and derive association rules. It was introduced by Rakesh Agrawal
and Ramakrishnan Srikant in 1994 and is mainly applied in market basket analysis.
Key Concepts:
Frequent Itemsets: A set of items that appear together in a transaction dataset with
frequency above a specified threshold.
Association Rules: These rules express relationships between items, showing how the
presence of one item in a transaction affects the presence of another item. For example,
"If a customer buys bread, they are likely to also buy butter."
Example:
Let's say you have a transaction dataset:
Transaction ID Items Bought
1 Milk, Bread
2 Milk, Butter
3 Milk, Bread, Butter
4 Bread, Butter
Step-by-step:
Step 1: Find frequent 1-itemsets (e.g., Milk, Bread, Butter) by counting the frequency of
individual items.
Step 2: Generate frequent 2-itemsets (e.g., Milk & Bread, Milk & Butter, Bread & Butter).
Step 3: Generate rules like "If Milk is bought, then Bread is also bought."
Program: -
Result: -
Conclusion:
Assessment Scheme:
Process Related Skills Product Related Skills Total Signature of
(15-M) (10-M) (25-M) Faculty
EXPERIMENT NO: 05
Title: Installation of R Studio and write a simple program for it
Objective: The objective of installing R is to set up an environment for statistical
computing and data analysis
Theory:
R is a programming language and free software developed by Ross Ihaka and Robert
Gentleman in 1993.
R possesses an extensive catalog of statistical and graphical methods. It includes
machine Learning algorithms, linear regression, time series, statistical inference to name
a few. Most of the R Libraries are written in R, but for heavy computational tasks, C, C++
and FORTRAN codes are Preferred.
R is not only entrusted by academic, but many large companies also use R programming
Language, including Uber, Google, Airbnb, Facebook and so on.
Data analysis with R is done in a series of steps; programming, transforming, discovering,
modeling and communicate the results.
Program: R is a clear and accessible programming tool
Transform: R is made up of a collection of libraries designed specifically for data science
Discover: Investigate the data, refine your hypothesis and analyze them
Model: R provides a wide array of tools to capture the right model for your data
Communicate: Integrate codes, graphs, and outputs to a report with R Markdown or build
Shiny apps to share with the world
What is R used for?
Statistical inference
Data analysis
Machine learning algorithm
Procedure:
Installation of R-Studio on windows:
Step – 1: With R-base installed, let’s move on to installing RStudio. To begin, go to
download RStudio and click on the download button for RStudio desktop
Step – 2: Click on the link for the windows version of RStudio and save the .exe file.
Step – 3: Run the .exe and follow the installation instructions.
1. Click next on the welcome window

• Enter/Browse the path to the installation folder and click Next to proceed.

• Select the folder for the start menu shortcut or click on do not create shortcuts and then
click Next.
Wait for the installation process to complete.

• Click Finish to end the installation.

Install the R Packages:-

In R Studio, if you require a particular library, then you can go through the
following instructions:
First, run R Studio.
After clicking on the packages tab, click on install. The following dialog box will
appear.
• In the Install Packages dialog, write the package name you want to install under
• The Packages field and then click install. This will install the package you
searched for or give you a list of matching packages based on your package text.
Installing Packages:-
The most common place to get packages from is CRAN. To install packages from CRAN
you use install. Packages("package name"). For instance, if you want to install the ggplot2
package, which is a very popular visualization package, you would type the following in
the console:-
Syntax:-
# install package from CRAN
install.packages("ggplot2")
Loading Packages:-
Once the package is downloaded to your computer you can access the functions and
resources provided by the package in two different ways:
# load the package to use in the current R session
Library (packagename)
Getting Help on Packages:-
For more direct help on packages that are installed on your computer you can use the
help and vignette functions. Here we can get help on the ggplot2 package with the
following:
Help(package = "ggplot2") # provides details regarding contents of a package
vignette(package = "ggplot2") # list vignettes available for a specific package
vignette("ggplot2-specs") # view specific vignette
Vignette() # view all vignettes on your computer

Program:

Result: -
Conclusion:

Assessment Scheme:
Process Related Skills Product Related Skills Total Signature of
(15-M) (10-M) (25-M) Faculty

EXPERIMENT NO: 06
Title: To construct a Data Frame and develop an R program for data frame
Objective: To create a Data Frame is to understand how to construct and manipulate a
structured collection of data.
Theory:
In R, a data frame is a two-dimensional, tabular data structure that can hold different
types of data (like numeric, character, factor, etc.) in columns. Each column can have
different data types, similar to a spreadsheet or SQL table. Data frames are a key
component in R for data manipulation and analysis.
Creating a Data Frame
You can create a data frame using the data frame () function.
Example:
# creating a simple data frame
df<- data.frame(
Name = c("Alice", "Bob", "Charlie"),
Age = c(25, 30, 35),
Height = c(5.5, 6.0, 5.8)
)
print(df)
# Output:
# Name Age Height
# 1 Alice 25 5.5
# 2 Bob 30 6.0
# 3 Charlie 35 5.8

Accessing Data Frame Components

You can access data frame components using various methods:
Accessing Columns
You can access columns by their name using the $ operator or by using square brackets.
# Access a column using $
ages<- df$Age
print(ages) # Output: [1] 25 30 35

# Access a column using square brackets

heights<- df[, "Height"]
print(heights) # Output: [1] 5.5 6.0 5.8

Accessing Rows
You can access rows using square brackets with row and column indices.
# Access the first row
first_row<- df[1, ]
print(first_row)

# Output:
# Name Age Height
# 1 Alice 25 5.5

# Access specific rows and columns (e.g., first row, second column)
age_of_first_person<- df[1, 2]
print(age_of_first_person) # Output: [1] 25

Accessing Multiple Rows and Columns

You can also access multiple rows and columns.
# Access multiple rows (1st and 3rd) and specific columns (Name and Height)
subset_df<- df[c(1, 3), c("Name", "Height")]
print(subset_df)

# Output:
# Name Height
# 1 Alice 5.5
# 3 Charlie 5.8

Adding and Modifying Columns

You can add a new column or modify existing columns easily.
# Adding a new column
df$Weight<- c(120, 150, 180)
print(df)
# Output:
# Name Age Height Weight
# 1 Alice 25 5.5 120
# 2 Bob 30 6.0 150
# 3 Charlie 35 5.8 180

# Modifying an existing column (e.g., increase age by 1)

df$Age<- df$Age + 1
print(df)
# Output:
# Name Age Height Weight
# 1 Alice 26 5.5 120
# 2 Bob 31 6.0 150
# 3 Charlie 36 5.8 180

Deleting Rows and Columns

You can delete rows or columns from a data frame.
Deleting a Column:
# Remove the Weight column
df$Weight<- NULL
print(df)
# Output:
# Name Age Height
# 1 Alice 26 5.5
# 2 Bob 31 6.0
# 3 Charlie 36 5.8

Deleting a Row:
# Remove the second row
df<- df[-2, ] # or df = df[-which(df$Name == "Bob"), ]
print(df)
# Output:
# Name Age Height
# 1 Alice 26 5.5
# 3 Charlie 36 5.8

Basic Statistics and Summary

You can perform basic statistics on data frame columns.
# Summary statistics
Summary (df)

# Output:
# Name Age Height
# Length:2 Min. :26 Min. :5.5
# Class :character 1st Qu.:26 1st Qu.:5.5
# Mode :character Median :31 Median :5.6
# Mean :31 Mean :5.6
# 3rd Qu.:36 3rd Qu.:5.8
# Max.:36 Max. :5.8

Importing and Exporting Data Frames

You often need to import data from CSV or Excel files and export data frames to these
formats.
Importing Data
# Importing data from a CSV file
df_imported<- read.csv("filename.csv")

Exporting Data
# Exporting data frame to a CSV file
write.csv(df, "output.csv", row.names = FALSE)

Program:-
Result: -

Conclusion:

Assessment Scheme:
Process Related Skills Product Related Skills Total Signature of
(15-M) (10-M) (25-M) Faculty

EXPERIMENT NO: 07
Title: Construct a program for Manipulating & Processing Data in R.

Objective: To provide an efficient framework for performing essential data operations,
such as filtering, sorting, transforming, aggregating, and summarizing datasets.
Theory:
Data manipulation and processing are fundamental tasks in data analysis, and R is a
powerful tool for handling these operations. By constructing a program for
manipulating and processing data in R, you can automate common data preparation
tasks such as cleaning, transforming, summarizing, and analyzing datasets. This
allows for efficient and reproducible workflows that support data-driven
decision-making.
Steps Involved in Constructing a Data Manipulation Program in R
Import Data: Load the dataset into R using appropriate functions based on the file
type (e.g., read.csv() for CSV files, read_excel() for Excel files).
Inspect the Data:
Check the structure, dimensions, and summary statistics of the data using functions
like head(), str(), summary(), and glimpse().
This helps in identifying the types of variables and spotting potential issues like
missing values or incorrect formats.
Clean Data:
Handle missing values (NAs) by removing them or replacing them with suitable
values (mean, median, or other imputed values).
Remove duplicates using distinct () and correct data types if needed (e.g., convert a
character column to a factor or numeric).
Transform Data:
Create new columns using the mutate() function. This can involve mathematical
operations or string manipulation.
For example, creating a new column to categorize a numerical variable into
categories (e.g., converting scores into letter grades).
Filter and Sort Data:
Use filter() to extract specific rows based on conditions.
Use arrange() to sort the data in a desired order, such as by score in descending
order.
Group and Aggregate Data:
Use group_by() to group the data by categorical variables (e.g., department or
region).
Use summarise() to compute aggregate values (mean, median, sum, etc.) for each
group.
Summarize the Data:
Summarize the dataset using statistical measures, and get insights into its
distribution or central tendency.
Export Processed Data:
Save the processed data to a new file using write.csv() or other appropriate
functions.
Program:-
Result:

Conclusion:

Assessment Scheme:
Process Related Skills Product Related Skills Total Signature of
(15-M) (10-M) (25-M) Faculty
EXPERIMENT NO: 08
Title: To Generate Graphs Using Plot(), Hist(), Linechart(), Pie(), Boxplot(), and
Scatterplots() Develop an R program
Objective: To model and analyze relationships and connections between
different entities or elements.
Theory:
Graphs using Graph functions: Plot(), Hist(),Line chart(), Pie(), Box plot(), Scatter plots()
in r programming
In R, various functions are provided for creating different types of graphs and
visualizations. Below are examples of how to use the most commonly used plotting
functions, including plot (), hist(), lines(), pie(), boxplot(), and scatterplot().

1. Basic Plotting with plot ()

The plot () function is a versatile function for creating basic scatter plots and line graphs.
Example: Scatter Plot
# Create sample data
x <- c(1, 2, 3, 4, 5)
y <- c(2, 3, 5, 6, 5)

# Basic scatter plot

Plot (x, y, main = "Scatter Plot", x lab = "X-axis Label", y lab = "Y-axis Label", col = "blue",
pch = 19)
2. Histogram with hist()
The hist() function is used to create histograms to visualize the distribution of a dataset.
Example: Histogram
# Create sample data
data<- rnorm(1000) # Generate 1000 random numbers from a normal distribution
# Create histogram
hist(data, main = "Histogram", xlab = "Value", col = "lightblue", border = "black", breaks
= 30)
3. Line Chart with lines ()
After creating a basic plot, you can add lines using the lines() function.
Example: Line Chart
# Create data for line chart
x <- seq(1, 10, by = 0.1)
y <- sin(x)

# Basic plot
plot(x, y, type = "n", main = "Line Chart", x lab = "X-axis", y lab = "Y-axis") # "type = 'n'"
does not plot points
lines (x, y, col = "red", lwd = 2)
# Adding line

4.Pie Chart with pie ()

The pie () function is used to create pie charts.
Example: Pie Chart
# Create sample data
values<- c(10, 20, 30, 40)
labels<- c("A", "B", "C", "D")
# create pie chart
Pie (values, labels = labels, main = "Pie Chart", col = rainbow(length(values)))

5. Box plot with box plot()

The box plot () function creates box-and-whisker plots to visually summarize a dataset.
Example: Box plot
# create sample data
data<- list(A = r norm(100), B = r norm(100, mean = 1), C = r norm(100, mean = 2))
# create box plot
Box plot(data, main = "Box plot", x lab = "Groups", y lab = "Values", col = c("light green",
"light coral", "light blue"))

6. Scatter Plots with plot ()

You can also use the plot() function to create scatter plots (as demonstrated above), or for
more complex scatter plots, you can utilize additional arguments.
Example: Enhanced Scatter Plot
# Create sample data
set.seed(42) # For reproducibility
x <- r norm (100)
y <- r norm (100)

# Enhanced scatter plot

plot(x, y, main = "Enhanced Scatter Plot", x lab = "X Value", y lab = "Y Value", col = "dark
blue", pch = 19, cex = 1.5)
abline(lm(y ~ x), col = "red", lwd = 2)
# Adding a regression line

Program: -
Result: -

Conclusion:
Assessment Scheme:
Process Related Skills Product Related Skills Total Signature of
(15-M) (10-M) (25-M) Faculty

DevOps
No ratings yet
DevOps
50 pages
Updaed Lab Manual-PEII.docx
No ratings yet
Updaed Lab Manual-PEII.docx
38 pages
AI Manual-2021-2022 (Even) - Lab Manual
100% (1)
AI Manual-2021-2022 (Even) - Lab Manual
37 pages
Aiml Front Page Print
No ratings yet
Aiml Front Page Print
10 pages
Ai 1131
No ratings yet
Ai 1131
65 pages
Lab manual
No ratings yet
Lab manual
41 pages
BDA Lab Manual AI&DS
No ratings yet
BDA Lab Manual AI&DS
60 pages
3161608-Ai Gecm
No ratings yet
3161608-Ai Gecm
45 pages
Is Sneha
No ratings yet
Is Sneha
78 pages
Web development - student Copy
No ratings yet
Web development - student Copy
34 pages
LP-II AI
No ratings yet
LP-II AI
39 pages
Dbms Lab Manual
No ratings yet
Dbms Lab Manual
68 pages
Cryptography
No ratings yet
Cryptography
99 pages
AI&ML Handouts
No ratings yet
AI&ML Handouts
10 pages
1 Vision Mission (AI&DS)
No ratings yet
1 Vision Mission (AI&DS)
3 pages
Lab Manual-3
No ratings yet
Lab Manual-3
74 pages
R25 DEPARTMENT VISION & MISSION
No ratings yet
R25 DEPARTMENT VISION & MISSION
3 pages
Aiml FPP
No ratings yet
Aiml FPP
227 pages
Screenshot 2024-05-28 at 12.25.15 PM
No ratings yet
Screenshot 2024-05-28 at 12.25.15 PM
53 pages
21ai66 ML Lab Manual
No ratings yet
21ai66 ML Lab Manual
41 pages
Final_ASD
No ratings yet
Final_ASD
46 pages
Course Handout Software Engineering
No ratings yet
Course Handout Software Engineering
59 pages
Big Data Analytics Lab Manual
No ratings yet
Big Data Analytics Lab Manual
90 pages
First Page
No ratings yet
First Page
8 pages
V11-2021-175 - Credits Scheme and Syllabus
No ratings yet
V11-2021-175 - Credits Scheme and Syllabus
92 pages
Web Development (1)o (2)
No ratings yet
Web Development (1)o (2)
72 pages
Deep Learning Laboratory
No ratings yet
Deep Learning Laboratory
69 pages
FRONT PAGE (2)
No ratings yet
FRONT PAGE (2)
6 pages
Vishal Ai
No ratings yet
Vishal Ai
57 pages
Dbms Lab 4th Sem 2024-25
No ratings yet
Dbms Lab 4th Sem 2024-25
54 pages
AI lab Manual Om
No ratings yet
AI lab Manual Om
62 pages
DL_Intro
No ratings yet
DL_Intro
7 pages
Tsa Lab Record - Cse
No ratings yet
Tsa Lab Record - Cse
61 pages
DAA Course File
No ratings yet
DAA Course File
47 pages
Laboratory Manual Data Warehousing and Mining Lab: Department of Computer Science and Engineering
No ratings yet
Laboratory Manual Data Warehousing and Mining Lab: Department of Computer Science and Engineering
234 pages
Soft Computing Lab 16 03
No ratings yet
Soft Computing Lab 16 03
38 pages
DDCO Lab Manual
No ratings yet
DDCO Lab Manual
56 pages
OS Lab Manual
No ratings yet
OS Lab Manual
42 pages
LecturePlan_CS220_20CSF-432 (1)
No ratings yet
LecturePlan_CS220_20CSF-432 (1)
6 pages
SEMESTER new curriculum
No ratings yet
SEMESTER new curriculum
44 pages
ADA LAB MANUAL(BCSL404) 23-24 (1) (1)
No ratings yet
ADA LAB MANUAL(BCSL404) 23-24 (1) (1)
46 pages
AIML - 2022 Scheme 160 Credits 16 11 2023
No ratings yet
AIML - 2022 Scheme 160 Credits 16 11 2023
94 pages
Final - A&R Updated With BOS
No ratings yet
Final - A&R Updated With BOS
50 pages
BE02000041-Fundamental of Assignments
No ratings yet
BE02000041-Fundamental of Assignments
12 pages
DV-LAB Manual
No ratings yet
DV-LAB Manual
27 pages
Data Structures Lab
No ratings yet
Data Structures Lab
141 pages
BDA Lab Manual
No ratings yet
BDA Lab Manual
62 pages
Annexure I - B.Tech CSE (AI-DS) Syllabus 2023-24 - New
No ratings yet
Annexure I - B.Tech CSE (AI-DS) Syllabus 2023-24 - New
32 pages
r20 Aim Final Syllabus 6.0
No ratings yet
r20 Aim Final Syllabus 6.0
161 pages
Daa Labmanual 2024-25
No ratings yet
Daa Labmanual 2024-25
35 pages
Course File Durga Front Pages Java
No ratings yet
Course File Durga Front Pages Java
8 pages
EI Syl 175 3rd 8th Withmathdip 16082022 Final (12)
No ratings yet
EI Syl 175 3rd 8th Withmathdip 16082022 Final (12)
167 pages
DATA VISUALIZATION lab manual based on syllabus
No ratings yet
DATA VISUALIZATION lab manual based on syllabus
46 pages
75AI
No ratings yet
75AI
67 pages
Artificial Intelligence 2807203
No ratings yet
Artificial Intelligence 2807203
52 pages
CS3381-Oops Lab - Rubrics Final
No ratings yet
CS3381-Oops Lab - Rubrics Final
5 pages
Siddh Ds
No ratings yet
Siddh Ds
121 pages
Lab Manual PS2
No ratings yet
Lab Manual PS2
38 pages
Educational Technology
From Everand
Educational Technology
KHRITISH SWARGIARY
No ratings yet
Elearning Theories & Designs: Between Theory & Practice. a Guide for Novice Instructional Designers
From Everand
Elearning Theories & Designs: Between Theory & Practice. a Guide for Novice Instructional Designers
Awatef Bouledroua
No ratings yet
The Information Age - EDITED
No ratings yet
The Information Age - EDITED
26 pages
AI Portfolio 29june
No ratings yet
AI Portfolio 29june
43 pages
Full Download Machine Learning For Computer Scientists and Data Analysts: From An Applied Perspective Setareh Rafatirad PDF
No ratings yet
Full Download Machine Learning For Computer Scientists and Data Analysts: From An Applied Perspective Setareh Rafatirad PDF
49 pages
Seminar Report Part 2
No ratings yet
Seminar Report Part 2
18 pages
(Ebooks PDF) Download SQL Server Analytical Toolkit: Using Windowing, Analytical, Ranking, and Aggregate Functions For Data and Statistical Analysis 1st Edition Angelo Bobak Full Chapters
100% (2)
(Ebooks PDF) Download SQL Server Analytical Toolkit: Using Windowing, Analytical, Ranking, and Aggregate Functions For Data and Statistical Analysis 1st Edition Angelo Bobak Full Chapters
69 pages
Re imagining University Assessment in a Digital World Margaret Bearman all chapter instant download
100% (5)
Re imagining University Assessment in a Digital World Margaret Bearman all chapter instant download
65 pages
Chapter 1
No ratings yet
Chapter 1
46 pages
Word Embeddings
No ratings yet
Word Embeddings
55 pages
Unit 1
No ratings yet
Unit 1
63 pages
CAAI Trans on Intel Tech - 2024 - Wang - Deep learning on medical image analysis
No ratings yet
CAAI Trans on Intel Tech - 2024 - Wang - Deep learning on medical image analysis
35 pages
Amazon Laloo Offer Letter
No ratings yet
Amazon Laloo Offer Letter
2 pages
Memahami Deep Learning
100% (1)
Memahami Deep Learning
109 pages
Abhay Seminar - Proposal
No ratings yet
Abhay Seminar - Proposal
2 pages
Course Outline - NIA - Intelligent Government Based On AI - Vfinal
No ratings yet
Course Outline - NIA - Intelligent Government Based On AI - Vfinal
6 pages
AI-based Legacy Data Extraction and processing tool
No ratings yet
AI-based Legacy Data Extraction and processing tool
4 pages
Generative AI in Modern Marketing Module 2 1
No ratings yet
Generative AI in Modern Marketing Module 2 1
14 pages
Exercise 1, LAW609, Opinion Writing, Semester March 2024 To July
No ratings yet
Exercise 1, LAW609, Opinion Writing, Semester March 2024 To July
8 pages
Data Science Cheatsheet
100% (1)
Data Science Cheatsheet
1 page
5. DEEP UNIT 3 F (1)
No ratings yet
5. DEEP UNIT 3 F (1)
51 pages
Literature Review Traffic Management
100% (2)
Literature Review Traffic Management
5 pages
Instant download The Unexplained Intellect Complexity Time and the Metaphysics of Embodied Thought 1st Edition Christopher Mole pdf all chapter
100% (12)
Instant download The Unexplained Intellect Complexity Time and the Metaphysics of Embodied Thought 1st Edition Christopher Mole pdf all chapter
60 pages
AWS Certified Machine Learning Specialty Exam Guide
No ratings yet
AWS Certified Machine Learning Specialty Exam Guide
7 pages
Perceptrons and SVMS: Cs771: Introduction To Machine Learning Nisheeth
No ratings yet
Perceptrons and SVMS: Cs771: Introduction To Machine Learning Nisheeth
18 pages
7545 Dcap305 Principles of Software Engineering
No ratings yet
7545 Dcap305 Principles of Software Engineering
299 pages
1733774781939
No ratings yet
1733774781939
3 pages
Model Fine Tuning Documentation
No ratings yet
Model Fine Tuning Documentation
11 pages
Impact of Technological Advancements On Labor Markets
No ratings yet
Impact of Technological Advancements On Labor Markets
4 pages
Sep - 2024
No ratings yet
Sep - 2024
1 page
High Speed Machining Info
No ratings yet
High Speed Machining Info
4 pages
Final_report(Saie)
No ratings yet
Final_report(Saie)
38 pages