Geethanjali College of Engineering and Technology (Ugc Autonomous Institution)

This document provides information about the Data Analytics Lab course offered at Geethanjali College of Engineering and Technology. The course is offered to fourth year Bachelor of Technology students specializing in Computer Science and Engineering. The document outlines the course objectives, which include understanding big data elements, mathematical models, processing technologies, analytical concepts using tools like R and Python, and data visualization. It also lists the 12 experiments to be conducted in the lab over 12 weeks covering topics like Hadoop, MapReduce, machine learning algorithms, clustering, and data visualization. Formats for student certificates and mapping of the course to program educational objectives and outcomes are also included.

Uploaded by

18R11A0530 MUSALE AASHISH

Geethanjali College of Engineering and Technology

(UGC AUTONOMOUS INSTITUTION)

(Accredited by NBA and NAAC with ‘A’ grade, Approved by AICTE
New Delhi and Affiliated to JNTUH)
Cheeryal (V), Keesara (M), Medchal (Dist), Telangana – 501 301.

DATA ANALYTICS LAB

(18CS41L1)
Laboratory Manual

IV Year B.Tech. CSE I Semester

DEPARTMENT OF COMPUTER SCIENCE AND ENGINEERING

2021-2022

Lab - Incharge HOD-CSE

CERTIFICATE

This is to certify that Mr. / Miss ___________________________________ has satisfactorily
completed ________ number of experiments in the Data Analytics Laboratory.

Roll No: ________ Branch: ________ Section: ________

Year: ________ Academic Year: ________

Head, Dept. of CSE          Faculty In-charge

Internal Examiner           External Examiner

18CS41L1-DATA ANALYTICS LAB

IV Year B.Tech. (CSE) – I Sem

L T P/D C

- - 2/- 1

Prerequisite(s):

● 18CS2102 - Object Oriented Programming using Java

● 18CS2203 - Database Management Systems

Course Objectives:
Develop ability to
1. Know the basic elements of Big Data and Data Science needed to handle huge amounts of data.
2. Gain knowledge of the basic mathematics behind Big Data.
3. Understand the different Big Data processing technologies.
4. Apply the analytical concepts of Big Data using R and Python.
5. Visualize Big Data using different tools.

Course Outcomes (COs):

At the end of the course, the student would be able to:

CO1: Observe Big Data elements and Architectures.
CO2: Apply different mathematical models for Big Data.
CO3: Demonstrate their Big Data skills by developing different applications.
CO4: Apply each learning model for different datasets.
CO5: Analyze needs, challenges and techniques for big data visualization.

LIST OF EXPERIMENTS

Week 1: Installation, Configuration, and Running of Hadoop and HDFS.

Week 2: Implementation of Word Count / Frequency Programs using MapReduce.

Week 3: Implementation of MR Program that processes a Weather Dataset.

Week 4: Implementation of Linear and Logistic Regression.

Week 5: Implementation of SVM Classification Technique.

Week 6: Implementation of Decision Tree Classification Technique.

Week 7: Implementation of Hierarchical Clustering.

Week 8: Implementation of Partitioning Clustering.

Week 9: Data Visualization using Pie, Bar, Boxplot Chart Plotting Framework.

Week 10: Data Visualization using Histogram Plotting Framework.

Week 11: Data Visualization using Line Graph Plotting, Scatterplot Plotting Framework.

Week 12: Application to analyze Stock Market Data using R Language.

Mission of the Department

● To be a center of excellence in instruction, innovation in research and scholarship, and
service to the stakeholders, the profession, and the public.

● To prepare graduates to enter a rapidly changing field as competent computer science
engineers.

● To prepare graduates who are capable in all phases of software development, possess a firm
understanding of hardware technologies, have the strong mathematical background
necessary for scientific computing, and are sufficiently well versed in general theory to
allow growth within the discipline as it advances.

● To prepare graduates to assume leadership roles by possessing good communication
skills, the ability to work effectively as team members, and an appreciation for their social
and ethical responsibility in a global setting.

PROGRAM EDUCATIONAL OBJECTIVES

● To provide graduates with a good foundation in mathematics, sciences and engineering
fundamentals required to solve engineering problems that will facilitate them to find
employment in industry and / or to pursue postgraduate studies with an appreciation for
lifelong learning.

● To provide graduates with analytical and problem solving skills to design algorithms, other
hardware / software systems, and inculcate professional ethics, inter-personal skills to
work in a multi-cultural team.

● To facilitate graduates to get familiarized with state-of-the-art software / hardware tools,
imbibing creativity and innovation that would enable them to develop cutting-edge
technologies of multi-disciplinary nature for societal development.

PROGRAM OUTCOMES (POs)

Program Outcomes (POs) describe what students are expected to know and be able to do by
the time of graduation to accomplish Program Educational Objectives (PEOs). The Program
Outcomes for Computer Science and Engineering graduates are:

Engineering Graduates would be able to:

PO 1: Engineering knowledge: Apply the knowledge of mathematics, science, engineering
fundamentals, and an engineering specialization to the solution of complex engineering
problems.

PO 2: Problem analysis: Identify, formulate, review research literature, and analyze
complex engineering problems reaching substantiated conclusions using first principles of
mathematics, natural sciences, and engineering sciences.

PO 3: Design/development of solutions: Design solutions for complex engineering
problems and design system components or processes that meet the specified needs with
appropriate consideration for the public health and safety, and the cultural, societal, and
environmental considerations.

PO 4: Conduct investigations of complex problems: Use research-based knowledge and
research methods including design of experiments, analysis and interpretation of data, and
synthesis of the information to provide valid conclusions.

PO 5: Modern tool usage: Create, select, and apply appropriate techniques, resources, and
modern engineering and IT tools including prediction and modeling to complex engineering
activities with an understanding of the limitations.

PO 6: The engineer and society: Apply reasoning informed by the contextual knowledge to
assess societal, health, safety, legal and cultural issues and the consequent responsibilities
relevant to the professional engineering practice.

PO 7: Environment and sustainability: Understand the impact of the professional
engineering solutions in societal and environmental contexts, and demonstrate the knowledge
of, and need for sustainable development.

PO 8: Ethics: Apply ethical principles and commit to professional ethics and responsibilities
and norms of the engineering practice.

PO 9: Individual and team work: Function effectively as an individual, and as a member or
leader in diverse teams, and in multidisciplinary settings.

PO 10: Communication: Communicate effectively on complex engineering activities with
the engineering community and with society at large, such as, being able to comprehend and
write effective reports and design documentation, make effective presentations, and give and
receive clear instructions.

PO 11: Project management and finance: Demonstrate knowledge and understanding of
the engineering and management principles and apply these to one’s own work, as a member
and leader in a team, to manage projects and in multidisciplinary environments.

PO 12: Life-long learning: Recognize the need for, and have the preparation and ability to
engage in independent and life-long learning in the broadest context of technological change.

PROGRAM SPECIFIC OUTCOMES (PSOs)

PSO 1: To identify and define the computing requirements appropriate for its solution under
given constraints.

PSO 2: To follow the best practices, namely, SEI-CMM levels and 6-sigma, which vary
from time to time, for software development projects using open-ended programming
environments to produce software deliverables as per customer needs.

Mapping of Lab Course with Programme Educational Objectives

Course Code   Course                PEOs                POs & PSOs
18CS41L1      DATA ANALYTICS LAB    PEO1, PEO2, PEO3    PO1, PO2, PO3, PO4, PO5, PO11, PO12, PSO1, PSO2

Mapping of Lab Course outcomes with Programme outcomes:

Course Outcomes - Program Outcomes and Program Specific Outcomes

DATA ANALYTICS (18CS41L1)

CO    PO1  PO2  PO3  PO4  PO5  PO6  PO7  PO8  PO9  PO10  PO11  PO12  PSO1  PSO2  Course Outcome
CO1   2    1    1    1    1    -    -    -    -    -     2     2     1     2     Observe Big Data elements and Architectures.
CO2   1    1    2    3    2    -    -    -    -    -     2     2     1     2     Apply different mathematical models for Big Data.
CO3   2    1    1    2    1    -    -    -    -    -     1     2     1     2     Demonstrate their Big Data skills by developing different applications.
CO4   1    1    1    2    1    -    -    -    -    -     1     2     1     2     Apply each learning model for different datasets.
CO5   2    1    1    2    1    -    -    -    -    -     1     2     1     2     Analyze needs, challenges and techniques for big data visualization.

INSTRUCTIONS TO THE STUDENTS:

1. Students are required to attend all labs.
2. Students should be dressed in formals when attending the laboratory sessions.
3. Students will work individually in computer laboratories.
4. Bring the observation book, workbook, and related materials to every lab session.
5. Before coming to the lab, prepare the pre-lab questions and read through the lab
experiment to familiarize yourself with it.
6. Use the 3-hour session properly to perform the experiment and note down the
outputs.
7. If the experiment is not completed in the prescribed time, the pending work
has to be done in the leisure hour or extended hours.
8. You will be expected to submit the completed workbook according to the
deadlines set by your instructor.

INSTRUCTIONS TO LABORATORY TEACHERS:

• Observation book and lab records submitted for the lab work are to be checked and
signed before the next lab session.
• Students should be instructed to switch ON the power supply after the connections are
checked by the lab assistant / teacher.
• Insist on prompt submission and award marks accordingly.
• Ask viva questions at the end of the experiment.
• Do not admit students who arrive late to the lab class.
• Encourage the students to do the experiments innovatively.
• Fill in the continuous evaluation sheet on a regular basis.
• Ensure that the students are dressed in formals.

Scheme of Lab Exam Evaluation:

Evaluation of Internal Marks:

a) 15 Marks are awarded for day-to-day work:

1) Record and Observation book ----- 5 Marks
2) Attendance and behavior of student ----- 5 Marks
3) Viva and performance ----- 5 Marks

b) 15 Marks are awarded for conducting the laboratory test as follows:

1) Write up and program ----- 5 Marks
2) Execution of Program ----- 5 Marks
3) Viva and performance ----- 5 Marks

Evaluation of External Marks:

70 Marks are awarded for conducting the laboratory test as follows:

1) Algorithm ----- 25 Marks
2) Write up and program ----- 15 Marks
3) Execution of Program ----- 15 Marks
4) Viva ----- 15 Marks

PERFORMANCE INDICATOR

S.No.   Name of Experiment   Date of Exp.   Date of Submission   Marks   Signature   Remarks

WEEK 1
IMPLEMENT BASIC PROGRAMS IN R BY USING DATA STRUCTURES
# Assign a string to a variable and print it
myString <- "Hello, World!"
print(myString)

Output

[1] "Hello, World!"

Vectors:

# A character vector built with c()
apple <- c('red', 'green', "yellow")
print(apple)
print(class(apple))

Output

[1] "red"    "green"  "yellow"
[1] "character"

Lists:

# A list can hold elements of different types, including functions
list1 <- list(c(2, 5, 3), 21.3, sin)
print(list1)

Output

[[1]]
[1] 2 5 3

[[2]]
[1] 21.3

[[3]]
function (x)  .Primitive("sin")

Matrices:

# A 2 x 3 character matrix filled row by row
M <- matrix(c('a', 'a', 'b', 'c', 'b', 'a'), nrow = 2, ncol = 3, byrow = TRUE)
print(M)

Output

     [,1] [,2] [,3]
[1,] "a"  "a"  "b"
[2,] "c"  "b"  "a"

Arrays:

# A 3 x 3 x 1 array; the two values are recycled column by column
a <- array(c('green', 'yellow'), dim = c(3, 3, 1))
print(a)

Output

, , 1

     [,1]     [,2]     [,3]
[1,] "green"  "yellow" "green"
[2,] "yellow" "green"  "yellow"
[3,] "green"  "yellow" "green"

Data Frames:

# A data frame stores columns of equal length, possibly of different types
BMI <- data.frame(
  gender = c("Male", "Male", "Female"),
  height = c(152, 171.5, 165),
  weight = c(81, 93, 78),
  Age = c(42, 38, 26)
)
print(BMI)

Output

  gender height weight Age
1   Male  152.0     81  42
2   Male  171.5     93  38
3 Female  165.0     78  26
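
Once a structure exists, its elements can be selected by position, by name, or by a logical condition. The following is a minimal sketch that reuses the BMI data frame from above. The derived column `bmi` and the reading of height as centimetres and weight as kilograms are assumptions made for illustration; the manual does not state the units.

```r
BMI <- data.frame(
  gender = c("Male", "Male", "Female"),
  height = c(152, 171.5, 165),
  weight = c(81, 93, 78),
  Age = c(42, 38, 26)
)

# Assumption: height is in centimetres and weight is in kilograms
BMI$bmi <- BMI$weight / (BMI$height / 100)^2

# Select rows with a logical condition and columns by name
over30 <- BMI[BMI$Age > 30, c("gender", "bmi")]
print(over30)

# Select a single element of a vector by position
apple <- c("red", "green", "yellow")
print(apple[2])
```

The same bracket notation works for matrices and arrays, with one index per dimension.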
WEEK 2
1. Write an R program to take input from the user (name and age) and display the
values. Also print the version of the R installation.

name = readline(prompt="Input your name: ")

age = readline(prompt="Input your age: ")
print(paste("My name is",name, "and I am",age ,"years old."))
print(R.version.string)

Output

Input your name: abc

Input your age: 25
[1] "My name is abc and I am 25 years old."
[1] "R version 3.4.4 (2018-03-15)"

2. Write an R program to get the details of the objects in memory.

name = "Python";
n1 = 10;
n2 = 0.5
nums = c(10, 20, 30, 40, 50, 60)
print(ls())
print("Details of the objects in memory:")
print(ls.str())

Output

[1] "n1" "n2" "name" "nums"

[1] "Details of the objects in memory:"
n1 : num 10
n2 : num 0.5
name : chr "Python"
nums : num [1:6] 10 20 30 40 50 60

3. Write an R program to create a sequence of numbers from 20 to 50 and find the mean
of numbers from 20 to 60 and the sum of numbers from 51 to 91.

print("Sequence of numbers from 20 to 50:")

print(seq(20,50))
print("Mean of numbers from 20 to 60:")
print(mean(20:60))
print("Sum of numbers from 51 to 91:")
print(sum(51:91))

Output

[1] "Sequence of numbers from 20 to 50:"

[1] 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44
[26] 45 46 47 48 49 50
[1] "Mean of numbers from 20 to 60:"
[1] 40
[1] "Sum of numbers from 51 to 91:"
[1] 2911

4. Write an R program to create a vector which contains 10 random integer values
between -50 and +50.

v = sample(-50:50, 10, replace=TRUE)

print("Content of the vector:")
print("10 random integer values between -50 and +50:")
print(v)

Output

[1] "Content of the vector:"

[1] "10 random integer values between -50 and +50:"
[1] 31 -13 -21 42 49 -39 20 12 39 -2

5. Write an R program to get the first 10 Fibonacci numbers.

Fibonacci <- numeric(10)

Fibonacci[1] <- Fibonacci[2] <- 1
for (i in 3:10) Fibonacci[i] <- Fibonacci[i - 2] + Fibonacci[i - 1]
print("First 10 Fibonacci numbers:")
print(Fibonacci)

Output

[1] "First 10 Fibonacci numbers:"

[1] 1 1 2 3 5 8 13 21 34 55

6. Write an R program to get all prime numbers up to a given number (based on the sieve
of Eratosthenes).

prime_numbers <- function(n) {
  if (n >= 2) {
    x = seq(2, n)
    prime_nums = c()
    for (i in seq(2, n)) {
      if (any(x == i)) {
        # i survived all earlier filters, so it is prime
        prime_nums = c(prime_nums, i)
        # drop the multiples of i from the candidates, keeping i itself
        x = c(x[(x %% i) != 0], i)
      }
    }
    return(prime_nums)
  } else {
    stop("Input number should be at least 2.")
  }
}
prime_numbers(12)

Output

[1] 2 3 5 7 11
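
The function above repeatedly filters the candidate vector by divisibility. A closer rendering of the classical sieve marks multiples in a logical vector instead. The following is a minimal sketch of that variant; the name `sieve_primes` is a hypothetical helper chosen for illustration and is not part of the original manual.

```r
# Sieve of Eratosthenes with a logical vector (illustrative sketch).
# sieve_primes is a hypothetical name, not taken from the manual.
sieve_primes <- function(n) {
  if (n < 2) return(integer(0))
  is_prime <- rep(TRUE, n)
  is_prime[1] <- FALSE            # 1 is not a prime
  i <- 2
  while (i * i <= n) {
    if (is_prime[i]) {
      # cross out the multiples of i, starting at i squared
      is_prime[seq(i * i, n, by = i)] <- FALSE
    }
    i <- i + 1
  }
  which(is_prime)                 # positions still marked TRUE are the primes
}

print(sieve_primes(12))
```

Starting the crossing-out at i squared is enough because smaller multiples of i were already removed by smaller primes.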

7. Write an R program to print the numbers from 1 to 100, printing "Fizz" for multiples
of 3, "Buzz" for multiples of 5, and "FizzBuzz" for multiples of both.

for (n in 1:100) {
  if (n %% 3 == 0 && n %% 5 == 0) {
    print("FizzBuzz")
  } else if (n %% 3 == 0) {
    print("Fizz")
  } else if (n %% 5 == 0) {
    print("Buzz")
  } else {
    print(n)
  }
}

Output

[1] 1
[1] 2
[1] "Fizz"
[1] 4
[1] "Buzz"
[1] "Fizz"
[1] 7
[1] 8
[1] "Fizz"
[1] "Buzz"
[1] 11
[1] "Fizz"
[1] 13
[1] 14
[1] "FizzBuzz"
[1] 16
[1] 17
[1] "Fizz"
[1] 19
[1] "Buzz"
[1] "Fizz"
[1] 22
[1] 23
[1] "Fizz"
[1] "Buzz"
[1] 26
[1] "Fizz"
[1] 28
[1] 29
[1] "FizzBuzz"
[1] 31
[1] 32
[1] "Fizz"
[1] 34
[1] "Buzz"
[1] "Fizz"
[1] 37
[1] 38
[1] "Fizz"
[1] "Buzz"
[1] 41
[1] "Fizz"
[1] 43
[1] 44
[1] "FizzBuzz"
[1] 46
[1] 47
[1] "Fizz"
[1] 49
[1] "Buzz"
[1] "Fizz"
[1] 52
[1] 53
[1] "Fizz"
[1] "Buzz"
[1] 56
[1] "Fizz"
[1] 58
[1] 59
[1] "FizzBuzz"
[1] 61
[1] 62
[1] "Fizz"
[1] 64
[1] "Buzz"
[1] "Fizz"
[1] 67
[1] 68
[1] "Fizz"
[1] "Buzz"
[1] 71
[1] "Fizz"
[1] 73
[1] 74
[1] "FizzBuzz"
[1] 76
[1] 77
[1] "Fizz"
[1] 79
[1] "Buzz"
[1] "Fizz"
[1] 82
[1] 83
[1] "Fizz"
[1] "Buzz"
[1] 86
[1] "Fizz"
[1] 88
[1] 89
[1] "FizzBuzz"
[1] 91
[1] 92
[1] "Fizz"
[1] 94
[1] "Buzz"
[1] "Fizz"
[1] 97
[1] 98
[1] "Fizz"
[1] "Buzz"

8. Write an R program to extract the first 10 English letters in lower case, the last 10
letters in upper case, and the 22nd to 24th letters in upper case.

print("First 10 letters in lower case:")
t = head(letters, 10)
print(t)
print("Last 10 letters in upper case:")
t = tail(LETTERS, 10)
print(t)
print("Letters between 22nd to 24th letters in upper case:")
# indexing alone selects positions 22 to 24; no tail() call is needed
e = LETTERS[22:24]
print(e)

Output

[1] "First 10 letters in lower case:"

[1] "a" "b" "c" "d" "e" "f" "g" "h" "i" "j"
[1] "Last 10 letters in upper case:"
[1] "Q" "R" "S" "T" "U" "V" "W" "X" "Y" "Z"
[1] "Letters between 22nd to 24th letters in upper case:"
[1] "V" "W" "X"
9. Write an R program to find the factors of a given number.

print_factors = function(n) {
  print(paste("The factors of", n, "are:"))
  for (i in 1:n) {
    # i is a factor when it divides n without remainder
    if ((n %% i) == 0) {
      print(i)
    }
  }
}
print_factors(4)
print_factors(7)
print_factors(12)

Output

[1] "The factors of 4 are:"

[1] 1
[1] 2
[1] 4
[1] "The factors of 7 are:"
[1] 1
[1] 7
[1] "The factors of 12 are:"
[1] 1
[1] 2
[1] 3
[1] 4
[1] 6
[1] 12

10. Write an R program to find the maximum and the minimum value of a given vector.

nums = c(10, 20, 30, 40, 50, 60)

print('Original vector:')
print(nums)
print(paste("Maximum value of the said vector:",max(nums)))
print(paste("Minimum value of the said vector:",min(nums)))

Output

[1] "Original vector:"

[1] 10 20 30 40 50 60
[1] "Maximum value of the said vector: 60"
[1] "Minimum value of the said vector: 10"
11. Write an R program to get the unique elements of a given string and the unique numbers
of a vector.

str1 = "The quick brown fox jumps over the lazy dog."

print("Original vector(string)")
print(str1)
print("Unique elements of the said vector:")
print(unique(tolower(str1)))
nums = c(1, 2, 2, 3, 4, 4, 5, 6)
print("Original vector(number)")
print(nums)
print("Unique elements of the said vector:")
print(unique(nums))

Output

[1] "Original vector(string)"

[1] "The quick brown fox jumps over the lazy dog."
[1] "Unique elements of the said vector:"
[1] "the quick brown fox jumps over the lazy dog."
[1] "Original vector(number)"
[1] 1 2 2 3 4 4 5 6
[1] "Unique elements of the said vector:"
[1] 1 2 3 4 5 6
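
Because str1 is a character vector of length one, unique() returns the whole sentence unchanged, as the output above shows. If the unique characters of the sentence are wanted instead, one possible approach is to split the string into single characters first. The following is a sketch under that reading of the exercise; the variable names `chars`, `unique_chars` and `unique_letters` are illustrative.

```r
str1 <- "The quick brown fox jumps over the lazy dog."

# Split the lower-cased sentence into single characters
chars <- strsplit(tolower(str1), "")[[1]]

# Unique characters, including the space and the full stop
unique_chars <- unique(chars)
print(unique_chars)

# Keep only the letters; the sentence is a pangram, so all 26 appear
unique_letters <- unique_chars[unique_chars %in% letters]
print(length(unique_letters))
```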

12. Write an R program to create three vectors a, b, c with 3 integers each. Combine the three
vectors into a 3×3 matrix where each column represents a vector. Print the content of
the matrix.

a<-c(1,2,3)
b<-c(4,5,6)
c<-c(7,8,9)
m<-cbind(a,b,c)
print("Content of the said matrix:")
print(m)

Output

[1] "Content of the said matrix:"

     a b c
[1,] 1 4 7
[2,] 2 5 8
[3,] 3 6 9

WEEK-3

SUMMARIZING THE STATISTICS

1. summary(iris)

Output:

  Sepal.Length    Sepal.Width     Petal.Length    Petal.Width          Species
 Min.   :4.300   Min.   :2.000   Min.   :1.000   Min.   :0.100   setosa    :50
 1st Qu.:5.100   1st Qu.:2.800   1st Qu.:1.600   1st Qu.:0.300   versicolor:50
 Median :5.800   Median :3.000   Median :4.350   Median :1.300   virginica :50
 Mean   :5.843   Mean   :3.057   Mean   :3.758   Mean   :1.199
 3rd Qu.:6.400   3rd Qu.:3.300   3rd Qu.:5.100   3rd Qu.:1.800
 Max.   :7.900   Max.   :4.400   Max.   :6.900   Max.   :2.500

2. str(mtcars)

Output:

'data.frame': 32 obs. of 11 variables:

$ mpg : num 21 21 22.8 21.4 18.7 18.1 14.3 24.4 22.8 19.2 ...

$ cyl : num 6 6 4 6 8 6 8 4 4 6 ...

$ disp: num 160 160 108 258 360 ...

$ hp : num 110 110 93 110 175 105 245 62 95 123 ...

$ drat: num 3.9 3.9 3.85 3.08 3.15 2.76 3.21 3.69 3.92 3.92 ...

$ wt : num 2.62 2.88 2.32 3.21 3.44 ...

$ qsec: num 16.5 17 18.6 19.4 17 ...

$ vs : num 0 0 1 1 0 1 0 1 1 1 ...

$ am : num 1 1 1 0 0 0 0 0 0 0 ...

$ gear: num 4 4 4 3 3 3 3 4 4 4 ...

$ carb: num 4 4 1 1 2 1 4 2 2 4 ...

3. head(mtcars)

Output:

                   mpg cyl disp  hp drat    wt  qsec vs am gear carb
Mazda RX4         21.0   6  160 110 3.90 2.620 16.46  0  1    4    4
Mazda RX4 Wag     21.0   6  160 110 3.90 2.875 17.02  0  1    4    4
Datsun 710        22.8   4  108  93 3.85 2.320 18.61  1  1    4    1
Hornet 4 Drive    21.4   6  258 110 3.08 3.215 19.44  1  0    3    1
Hornet Sportabout 18.7   8  360 175 3.15 3.440 17.02  0  0    3    2
Valiant           18.1   6  225 105 2.76 3.460 20.22  1  0    3    1

4. tail(mtcars)

Output:

                mpg cyl  disp  hp drat    wt qsec vs am gear carb
Porsche 914-2  26.0   4 120.3  91 4.43 2.140 16.7  0  1    5    2
Lotus Europa   30.4   4  95.1 113 3.77 1.513 16.9  1  1    5    2
Ford Pantera L 15.8   8 351.0 264 4.22 3.170 14.5  0  1    5    4
Ferrari Dino   19.7   6 145.0 175 3.62 2.770 15.5  0  1    5    6
Maserati Bora  15.0   8 301.0 335 3.54 3.570 14.6  0  1    5    8
Volvo 142E     21.4   4 121.0 109 4.11 2.780 18.6  1  1    4    2

5. names(mtcars)

Output:

[1] "mpg" "cyl" "disp" "hp" "drat" "wt" "qsec" "vs" "am" "gear" "carb"

6. nrow(mtcars)

Output:

[1] 32
7. aggregate(Sepal.Length~Species,iris,mean)

Output:

Species Sepal.Length

1 setosa 5.006

2 versicolor 5.936

3 virginica 6.588
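
In the aggregate() call above, the formula Sepal.Length~Species reads as "summarize Sepal.Length within each level of Species". As a cross-check, the same group means can be obtained with tapply(). This is a short sketch; the variable names `by_formula` and `by_tapply` are illustrative.

```r
# Formula interface: response ~ grouping variable
by_formula <- aggregate(Sepal.Length ~ Species, data = iris, FUN = mean)

# Same group means with tapply(values, groups, function)
by_tapply <- tapply(iris$Sepal.Length, iris$Species, mean)
print(by_tapply)

# Both approaches should agree
print(all.equal(by_formula$Sepal.Length, as.vector(by_tapply)))
```

aggregate() returns a data frame, which is convenient for further processing, while tapply() returns a named array.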

8. fix(iris)

Output:

(fix() opens the data set in an interactive editor; nothing is printed to the console.)

9. sepalsub <- subset(iris, Sepal.Length > 7)

sepalsub

Output:

    Sepal.Length Sepal.Width Petal.Length Petal.Width   Species var6
103          7.1         3.0          5.9         2.1 virginica <NA>
106          7.6         3.0          6.6         2.1 virginica <NA>
108          7.3         2.9          6.3         1.8 virginica <NA>
110          7.2         3.6          6.1         2.5 virginica <NA>
118          7.7         3.8          6.7         2.2 virginica <NA>
119          7.7         2.6          6.9         2.3 virginica <NA>
123          7.7         2.8          6.7         2.0 virginica <NA>
126          7.2         3.2          6.0         1.8 virginica <NA>
130          7.2         3.0          5.8         1.6 virginica <NA>
131          7.4         2.8          6.1         1.9 virginica <NA>
132          7.9         3.8          6.4         2.0 virginica <NA>
136          7.7         3.0          6.1         2.3 virginica <NA>

WEEK 4

BINOMIAL DISTRIBUTION

1. choose(10,3)*((1/6)^3*(5/6)^7)

Output:

[1] 0.1550454

2. dbinom(3,size=10,prob=(1/6))

Output:

[1] 0.1550454

3. choose(10,0)*((1/6)^0*(5/6)^10) +

choose(10,1)*((1/6)^1*(5/6)^9)+

choose(10,2)*((1/6)^2*(5/6)^8)+

choose(10,3)*((1/6)^3*(5/6)^7)

Output:

[1] 0.9302722

4. pbinom(3,size=10,prob=(1/6),lower.tail=TRUE)

Output:

[1] 0.9302722

5. pbinom(3,size=10,prob=(1/6),lower.tail=FALSE)

Output:

[1] 0.06972784

6. dbinom(4,size=12,prob=0.2)

Output:

[1] 0.1328756
7. dbinom(0,size=12,prob=0.2)+

dbinom(1,size=12,prob=0.2)+

dbinom(2,size=12,prob=0.2)+

dbinom(3,size=12,prob=0.2)+

dbinom(4,size=12,prob=0.2)

Output:

[1] 0.9274445

8. pbinom(4,size=12,prob=0.2)

Output:

[1] 0.9274445

9. x <- pbinom(26,51,0.5)

print(x)

Output:

[1] 0.610116

10. x <- seq(0,50,by = 1)

y <- dbinom(x,50,0.5)
plot(x,y)
Output:
11. x <- qbinom(0.25,51,1/2)

print(x)

Output:

[1] 23

12. x <- rbinom(8,150,.4)

print(x)

Output:

[1] 60 71 57 60 62 62 50 59
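
The exercises above compute the same probabilities both by hand and with the built-in functions. The relationships between dbinom (point probability), pbinom (cumulative probability) and qbinom (quantile) can be checked directly. The following is a minimal sketch; the variable names are illustrative.

```r
n <- 10
p <- 1 / 6

# P(X <= 3) as the sum of the point probabilities P(X = 0), ..., P(X = 3)
cdf_by_sum <- sum(dbinom(0:3, size = n, prob = p))
cdf_builtin <- pbinom(3, size = n, prob = p)
print(all.equal(cdf_by_sum, cdf_builtin))

# The upper tail P(X > 3) is the complement of the lower tail
upper <- pbinom(3, size = n, prob = p, lower.tail = FALSE)
print(all.equal(cdf_builtin + upper, 1))

# qbinom inverts pbinom: it returns the smallest k with P(X <= k) >= 0.25
k <- qbinom(0.25, size = 51, prob = 0.5)
print(pbinom(k - 1, 51, 0.5) < 0.25 && pbinom(k, 51, 0.5) >= 0.25)
```

rbinom draws random values, so its output differs from run to run unless set.seed() is called first.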

POISSON DISTRIBUTION

1. ppois(16,lambda = 12)

Output:

[1] 0.898709

2. ppois(16,lambda = 12,lower.tail=FALSE)

Output:

[1] 0.101291

3. rpois(16,lambda = 12)

Output:

[1] 12 8 12 10 13 8 11 13 12 11 15 12 10 11 14 11

4. dpois(16,lambda = 12)

Output:

[1] 0.05429334
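
The value returned by dpois can be reproduced from the Poisson probability mass function, P(X = k) = exp(-lambda) * lambda^k / k!, and ppois is the running sum of those point probabilities. A short sketch with illustrative variable names:

```r
lambda <- 12
k <- 16

# Point probability written out from the formula
pmf_manual <- exp(-lambda) * lambda^k / factorial(k)
print(all.equal(pmf_manual, dpois(k, lambda)))

# Cumulative probability P(X <= 16) as a sum of point probabilities
cdf_by_sum <- sum(dpois(0:k, lambda))
print(all.equal(cdf_by_sum, ppois(k, lambda)))
```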
NORMAL DISTRIBUTION

1. pnorm(84,mean=72,sd=15.2,lower.tail=FALSE)

Output:

[1] 0.2149176
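
The same upper-tail probability can be obtained by standardizing first: with z = (x - mean) / sd, the probability under the standard normal distribution is identical. A minimal sketch, with illustrative variable names:

```r
x <- 84
mu <- 72
sigma <- 15.2

# Standardized score
z <- (x - mu) / sigma
print(z)

# Upper-tail probability computed directly and via the standard normal
p_direct <- pnorm(x, mean = mu, sd = sigma, lower.tail = FALSE)
p_standard <- pnorm(z, lower.tail = FALSE)
print(all.equal(p_direct, p_standard))
```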

2. x <- seq(-10, 10, by = .1)

y <- dnorm(x, mean = 2.5, sd = 0.5)

plot(x,y)

Output:

3. x <- seq(-10,10,by = .2)

y <- pnorm(x, mean = 2.5, sd = 2)

plot(x,y)

Output:
4. x <- seq(0, 1, by = 0.02)

y <- qnorm(x, mean = 2, sd = 1)

plot(x,y)

Output:

5. y <- rnorm(50)

hist(y, main = "Normal Distribution")

Output:

LINEAR REGRESSION

Write a program to implement linear and logistic regression

head(cars)

Output:

speed dist

1 4 2

2 4 10

3 7 4

4 7 22

5 8 16

6 9 10

scatter.smooth(x=cars$speed,y=cars$dist,main="dist~speed")

Output:

linearMod<-lm(dist~speed,data=cars)

print(linearMod)

Output:
Coefficients:

(Intercept) speed

-17.579 3.932

summary(linearMod)

Output:

Residuals:

Min 1Q Median 3Q Max

-29.069 -9.525 -2.272 9.215 43.201

Coefficients:

Estimate Std. Error t value Pr(>|t|)

(Intercept) -17.5791 6.7584 -2.601 0.0123 *

speed 3.9324 0.4155 9.464 1.49e-12 ***

---

Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 15.38 on 48 degrees of freedom

Multiple R-squared: 0.6511, Adjusted R-squared: 0.6438

F-statistic: 89.57 on 1 and 48 DF, p-value: 1.49e-12
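
For a single predictor, the coefficients reported by lm() match the closed-form least-squares estimates: slope = cov(x, y) / var(x) and intercept = mean(y) - slope * mean(x). The sketch below checks this and then uses the fitted model for a prediction. The speed of 20 mph is a hypothetical input chosen for illustration.

```r
linearMod <- lm(dist ~ speed, data = cars)

# Closed-form least-squares estimates for one predictor
slope <- cov(cars$speed, cars$dist) / var(cars$speed)
intercept <- mean(cars$dist) - slope * mean(cars$speed)
print(c(intercept = intercept, slope = slope))
print(all.equal(unname(coef(linearMod)), c(intercept, slope)))

# Predicted stopping distance for a hypothetical speed of 20 mph
new_speed <- data.frame(speed = 20)
pred <- predict(linearMod, newdata = new_speed)
print(pred)
```

The prediction is simply intercept + slope * 20, so it can also be verified by hand from the coefficient table above.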

LOGISTIC REGRESSION

input<-mtcars[,c("am","cyl","hp","wt")]

print(input)

Output:

am cyl hp wt

Mazda RX4 1 6 110 2.620

Mazda RX4 Wag 1 6 110 2.875

Datsun 710 1 4 93 2.320

Hornet 4 Drive 0 6 110 3.215

Hornet Sportabout 0 8 175 3.440

Valiant 0 6 105 3.460

Duster 360 0 8 245 3.570

Merc 240D 0 4 62 3.190

Merc 230 0 4 95 3.150

Merc 280 0 6 123 3.440

Merc 280C 0 6 123 3.440

Merc 450SE 0 8 180 4.070

Merc 450SL 0 8 180 3.730

input<-mtcars[,c("am","cyl","hp","wt")]

am.data=glm(formula=am~cyl+hp+wt,data=input,family=binomial)

print(summary(am.data))

Output:

Deviance Residuals:

Min 1Q Median 3Q Max

-2.17272 -0.14907 -0.01464 0.14116 1.27641

Coefficients:

Estimate Std. Error z value Pr(>|z|)

(Intercept) 19.70288 8.11637 2.428 0.0152 *

cyl 0.48760 1.07162 0.455 0.6491

hp 0.03259 0.01886 1.728 0.0840 .

wt -9.14947 4.15332 -2.203 0.0276 *

---

Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

(Dispersion parameter for binomial family taken to be 1)

Null deviance: 43.2297 on 31 degrees of freedom

Residual deviance: 9.8415 on 28 degrees of freedom

AIC: 17.841

Number of Fisher Scoring iterations: 8
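
The fitted logistic model can be turned into class predictions by computing the fitted probability of am = 1 for each car and applying a cutoff. The following is a minimal sketch; the 0.5 cutoff and the variable names `prob`, `pred` and `conf_table` are assumptions made for illustration, not part of the original manual.

```r
input <- mtcars[, c("am", "cyl", "hp", "wt")]
am.data <- glm(am ~ cyl + hp + wt, data = input, family = binomial)

# Fitted probability that each car has a manual transmission (am = 1)
prob <- predict(am.data, type = "response")

# Classify with a 0.5 cutoff (an assumption chosen for illustration)
pred <- ifelse(prob > 0.5, 1, 0)

# Confusion table of actual versus predicted classes
conf_table <- table(actual = input$am, predicted = pred)
print(conf_table)
```

Because the model is evaluated on the same data it was fitted to, the table describes the fit rather than out-of-sample accuracy.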

KCSE 2008 Agriculture P1 E
5 pages
Systemair SYSHRW 60 L (1)
No ratings yet
Systemair SYSHRW 60 L (1)
3 pages
ccna
No ratings yet
ccna
15 pages
CCNP Interview Questions
No ratings yet
CCNP Interview Questions
4 pages
Yulianto, 2021
No ratings yet
Yulianto, 2021
4 pages
Pneumatic Hack Sawmachine
No ratings yet
Pneumatic Hack Sawmachine
93 pages
Assignment 1
No ratings yet
Assignment 1
27 pages
Typography
100% (1)
Typography
3 pages


This is to certify that Mr. / Miss ___________________________________

has satisfactorily completed________ number of experiments in the Computer Networks and

Roll No: ____________ Branch: _______ Section: _____

Year: _______________ Academic Year: ______________

Dept. of CSE In-charge

Internal Examiner External Examiner

18CS41L1-DATA ANALYTICS LAB

IV Year B.Tech. (CSE) – I Sem

● 18CS2102 - Object Oriented Programming using Java

Course Outcomes (COs):

At the end of the course, student would be able to:

Week 1: Installation, Configuration, and Running of Hadoop and HDFS.

Week 2: Implementation of Word Count / Frequency Programs using MapReduce.

Week 3: Implementation of MR Program that processes a Weather Dataset.

Week 4: Implementation of Linear and Logistic Regression.

Week 5: Implementation of SVM Classification Technique.

Week 7: Implementation of Hierarchical Clustering.

Week 8: Implementation of Partitioning Clustering.

Week 10: Data Visualization using Histogram Plotting Framework.

Week 12: Application to analyze Stock Market Data using R Language.

● To be a center of excellence in instruction, innovation in research and scholarship, and

● To prepare graduates to enter a rapidly changing field as a competent computer science

● To prepare graduates capable in all phases of software development, possess a firm

● To prepare graduates to assume leadership roles by possessing good communication

PROGRAM EDUCATIONAL OBJECTIVES

● To provide graduates with a good foundation in mathematics, sciences and engineering

Engineering Graduates would be able to:

PO 2: Problem analysis: Identify, formulate, review research literature, and analyze

PO 3: Design/development of solutions: Design solutions for complex engineering

PO 4: Conduct investigations of complex problems: Use research-based knowledge and

PO 7: Environment and sustainability: Understand the impact of the professional

PO 10: Communication: Communicate effectively on complex engineering activities with

PO 11: Project management and finance: Demonstrate knowledge and understanding of

PROGRAM SPECIFIC OUTCOMES (PSOs)

6. Course Objectives and Course Outcomes

Course Outcomes (COs):

At the end of the course, student would be able to:

Course Course PEOs POs & PSOs

Mapping of Lab Course outcomes with Programme outcomes:

Course Outcomes - Program Outcomes and Program Specific Outcomes

CO1. Observe Big Data 2 1 1 1 1 - - - - - 2 2 1 2

CO2. Apply different 1 1 2 3 2 - - - - - 2 2 1 2

CO3. Demonstrate their Big 2 1 1 2 1 - - - - - 1 2 1 2

CO4. Apply each learning 1 1 1 2 1 - - - - - 1 2 1 2

CO5. Analyze needs, 2 1 1 2 1 - - - - - 1 2 1 2

● 18CS2102 - Object Oriented Programming using Java

INSTRUCTIONS TO THE STUDENTS:

1. Students are required to attend all labs.

INSTRUCTIONS TO LABORATORY TEACHERS:

Evaluation of Internal Marks:

a) 15 Marks are awarded for day-to-day work:

1) Record and observation book: 5 Marks

2) Attendance and behavior of student: 5 Marks

3) Viva and performance: 5 Marks

b) 15 Marks are awarded for the laboratory test, as follows:

1) Write-up and program: 5 Marks

2) Execution of program: 5 Marks

3) Viva and performance: 5 Marks

Evaluation of External Marks:

70 Marks are awarded for the laboratory test, as follows:

1) Algorithm: 25 Marks

2) Write-up and program: 15 Marks

3) Execution of program: 15 Marks

4) Viva: 15 Marks

S.No. Name of Experiment Date of Date of Marks Signature Remarks

name = readline(prompt="Input your name: ")

Input your name: abc
