Geethanjali College of Engineering and Technology (Ugc Autonomous Institution)

This document provides information about the Data Analytics Lab course offered at Geethanjali College of Engineering and Technology. The course is offered to fourth year Bachelor of Technology students specializing in Computer Science and Engineering. The document outlines the course objectives, which include understanding big data elements, mathematical models, processing technologies, analytical concepts using tools like R and Python, and data visualization. It also lists the 12 experiments to be conducted in the lab over 12 weeks covering topics like Hadoop, MapReduce, machine learning algorithms, clustering, and data visualization. Formats for student certificates and mapping of the course to program educational objectives and outcomes are also included.

Geethanjali College of Engineering and Technology

(UGC AUTONOMOUS INSTITUTION)


(Accredited by NBA and NAAC with ‘A’ grade, Approved by AICTE
New Delhi and Affiliated to JNTUH)
Cheeryal (V), Keesara (M), Medchal (Dist), Telangana – 501 301.

DATA ANALYTICS LAB


(18CS41L1)
Laboratory Manual

IV Year B.Tech. CSE I Semester

DEPARTMENT OF COMPUTER SCIENCE AND ENGINEERING

2021-2022

Lab - Incharge HOD-CSE


Geethanjali College of Engineering and Technology

(UGC AUTONOMOUS INSTITUTION)

(Accredited by NBA and NAAC with ‘A’ grade, Approved by AICTE New
Delhi and Affiliated to JNTUH)

Cheeryal (V), Keesara (M), Medchal (Dist), Telangana – 501 301.

CERTIFICATE

This is to certify that Mr. / Miss ___________________________________

has satisfactorily completed ________ number of experiments in the Data Analytics Laboratory.

Roll No: ____________ Branch: _______ Section: _____

Year: _______________ Academic Year: ______________

Head Faculty

Dept. of CSE In-charge

Internal Examiner External Examiner


GEETHANJALI COLLEGE OF ENGINEERING AND TECHNOLOGY

(Autonomous)

Cheeryal (V), Keesara (M), Medchal Dist., Telangana-501301

18CS41L1-DATA ANALYTICS LAB

IV Year B.Tech. (CSE) – I Sem


L T P/D C

- - 2/- 1

Prerequisite(s):

● 18CS2102 - Object Oriented Programming using Java
● 18CS2203 - Database Management Systems

Course Objectives:
Develop the ability to:
1. Know the basic elements of Big Data and data science needed to handle huge amounts of data.
2. Gain knowledge of the basic mathematics behind Big Data.
3. Understand the different Big Data processing technologies.
4. Apply the analytical concepts of Big Data using R and Python.
5. Visualize Big Data using different tools.

Course Outcomes (COs):

At the end of the course, the student will be able to:

CO1: Observe Big Data elements and Architectures.
CO2: Apply different mathematical models for Big Data.
CO3: Demonstrate their Big Data skills by developing different applications.
CO4: Apply each learning model for different datasets.
CO5: Analyze needs, challenges and techniques for big data visualization.

LIST OF EXPERIMENTS

Week 1: Installation, Configuration, and Running of Hadoop and HDFS.

Week 2: Implementation of Word Count / Frequency Programs using MapReduce.

Week 3: Implementation of MR Program that processes a Weather Dataset.

Week 4: Implementation of Linear and Logistic Regression.

Week 5: Implementation of SVM Classification Technique.


Week 6: Implementation of Decision Tree Classification Technique.

Week 7: Implementation of Hierarchical Clustering.

Week 8: Implementation of Partitioning Clustering.

Week 9: Data Visualization using Pie, Bar, Boxplot Chart Plotting Framework.

Week 10: Data Visualization using Histogram Plotting Framework.

Week 11: Data Visualization using Line Graph Plotting, Scatterplot Plotting Framework.

Week 12: Application to analyze Stock Market Data using R Language.


Mission of the Department

● To be a center of excellence in instruction, innovation in research and scholarship, and service to the stakeholders, the profession, and the public.

● To prepare graduates to enter a rapidly changing field as competent computer science engineers.

● To prepare graduates who are capable in all phases of software development, possess a firm understanding of hardware technologies, have the strong mathematical background necessary for scientific computing, and are sufficiently well versed in general theory to allow growth within the discipline as it advances.

● To prepare graduates to assume leadership roles by possessing good communication skills, the ability to work effectively as team members, and an appreciation for their social and ethical responsibility in a global setting.

PROGRAM EDUCATIONAL OBJECTIVES

● To provide graduates with a good foundation in mathematics, sciences and engineering fundamentals required to solve engineering problems, which will help them find employment in industry and / or pursue postgraduate studies with an appreciation for lifelong learning.

● To provide graduates with analytical and problem-solving skills to design algorithms and other hardware / software systems, and to inculcate professional ethics and inter-personal skills for working in a multi-cultural team.

● To help graduates become familiar with state-of-the-art software / hardware tools, imbibing creativity and innovation that would enable them to develop cutting-edge technologies of a multi-disciplinary nature for societal development.

PROGRAM OUTCOMES (POs)

Program Outcomes (POs) describe what students are expected to know and be able to do by
the time of graduation to accomplish Program Educational Objectives (PEOs). The Program
Outcomes for Computer Science and Engineering graduates are:

Engineering Graduates would be able to:


PO 1: Engineering knowledge: Apply the knowledge of mathematics, science, engineering fundamentals, and an engineering specialization to the solution of complex engineering problems.

PO 2: Problem analysis: Identify, formulate, review research literature, and analyze complex engineering problems reaching substantiated conclusions using first principles of mathematics, natural sciences, and engineering sciences.

PO 3: Design/development of solutions: Design solutions for complex engineering problems and design system components or processes that meet the specified needs with appropriate consideration for the public health and safety, and the cultural, societal, and environmental considerations.

PO 4: Conduct investigations of complex problems: Use research-based knowledge and research methods including design of experiments, analysis and interpretation of data, and synthesis of the information to provide valid conclusions.

PO 5: Modern tool usage: Create, select, and apply appropriate techniques, resources, and modern engineering and IT tools including prediction and modeling to complex engineering activities with an understanding of the limitations.

PO 6: The engineer and society: Apply reasoning informed by the contextual knowledge to assess societal, health, safety, legal and cultural issues and the consequent responsibilities relevant to the professional engineering practice.

PO 7: Environment and sustainability: Understand the impact of the professional engineering solutions in societal and environmental contexts, and demonstrate the knowledge of, and need for sustainable development.

PO 8: Ethics: Apply ethical principles and commit to professional ethics and responsibilities and norms of the engineering practice.

PO 9: Individual and team work: Function effectively as an individual, and as a member or leader in diverse teams, and in multidisciplinary settings.

PO 10: Communication: Communicate effectively on complex engineering activities with the engineering community and with society at large, such as, being able to comprehend and write effective reports and design documentation, make effective presentations, and give and receive clear instructions.

PO 11: Project management and finance: Demonstrate knowledge and understanding of the engineering and management principles and apply these to one’s own work, as a member and leader in a team, to manage projects and in multidisciplinary environments.

PO 12: Life-long learning: Recognize the need for, and have the preparation and ability to engage in independent and life-long learning in the broadest context of technological change.

PROGRAM SPECIFIC OUTCOMES (PSOs)


PSO 1: To identify and define the computing requirements appropriate for its solution under
given constraints.

PSO 2: To follow the best practices, namely, SEI-CMM levels and 6-sigma which varies
from time to time for software development projects using open-ended programming
environments to produce software deliverables as per customer needs.

Mapping of Lab Course with Programme Educational Objectives

Course Code   Course                PEOs                POs & PSOs
18CS41L1      DATA ANALYTICS LAB    PEO1, PEO2, PEO3    PO1, PO2, PO3, PO4, PO5, PO11, PO12, PSO1, PSO2

Mapping of Lab Course outcomes with Programme outcomes:

Course Outcomes - Program Outcomes and Program Specific Outcomes

DATA ANALYTICS (18CS41L1)                                                     1   2   3   4   5   6   7   8   9   10  11  12  PSO1  PSO2
CO1. Observe Big Data elements and Architectures.                             2   1   1   1   1   -   -   -   -   -   2   2   1     2
CO2. Apply different mathematical models for Big Data.                        1   1   2   3   2   -   -   -   -   -   2   2   1     2
CO3. Demonstrate their Big Data skills by developing different applications.  2   1   1   2   1   -   -   -   -   -   1   2   1     2
CO4. Apply each learning model for different datasets.                        1   1   1   2   1   -   -   -   -   -   1   2   1     2
CO5. Analyze needs, challenges and techniques for big data visualization.     2   1   1   2   1   -   -   -   -   -   1   2   1     2


INSTRUCTIONS TO THE STUDENTS:

1. Students are required to attend all labs.
2. Students should be dressed in formals when attending the laboratory sessions.
3. Students will work individually in computer laboratories.
4. Bring the observation book, workbook, etc. to every lab session.
5. Before coming to the lab, prepare the pre-lab questions and read through the lab experiment to familiarize yourself with it.
6. Use the 3-hour session efficiently to perform the experiment and note down the outputs.
7. If the experiment is not completed in the prescribed time, the pending work has to be done in the leisure hour or extended hours.
8. You are expected to submit the completed workbook by the deadlines set by your instructor.

INSTRUCTIONS TO LABORATORY TEACHERS:

• Observation books and lab records submitted for the lab work are to be checked and signed before the next lab session.
• Students should be instructed to switch ON the power supply only after the connections are checked by the lab assistant / teacher.
• Insist on prompt submission and award marks accordingly.
• Ask viva questions at the end of the experiment.
• Do not admit students who come late to the lab class.
• Encourage the students to do the experiments innovatively.
• Fill in the continuous evaluation sheet on a regular basis.
• Ensure that the students are dressed in formals.

Scheme of Lab Exam Evaluation:

Evaluation of Internal Marks:

a) 15 Marks are awarded for day-to-day work:

1) Record and observation book - 5 Marks
2) Attendance and behavior of student - 5 Marks
3) Viva and performance - 5 Marks

b) 15 Marks are awarded for conducting the laboratory test as follows:

1) Write up and program - 5 Marks
2) Execution of program - 5 Marks
3) Viva and performance - 5 Marks

Evaluation of External Marks:

70 Marks are awarded for conducting the laboratory test as follows:

1) Algorithm - 25 Marks
2) Write up and program - 15 Marks
3) Execution of program - 15 Marks
4) Viva - 15 Marks


PERFORMANCE INDICATOR

S.No.   Name of Experiment   Date of Exp.   Date of Submission   Marks   Signature   Remarks

WEEK 1
IMPLEMENT BASIC PROGRAMS IN R BY USING DATA STRUCTURES
myString <- "Hello, World!"
print(myString)
[1] "Hello, World!"

Vectors:
apple <- c('red','green',"yellow")
print(apple)
print(class(apple))
[1] "red" "green" "yellow"
[1] "character"

Lists:
list1 <- list(c(2,5,3),21.3,sin)
print(list1)
[[1]]
[1] 2 5 3
[[2]]
[1] 21.3
[[3]]
function (x) .Primitive("sin")

Matrices:
M <- matrix(c('a','a','b','c','b','a'), nrow=2, ncol=3, byrow=TRUE)
print(M)
     [,1] [,2] [,3]
[1,] "a"  "a"  "b"
[2,] "c"  "b"  "a"

Arrays:
a <- array(c('green','yellow'),dim = c(3,3,1))
print(a)
, , 1

     [,1]     [,2]     [,3]
[1,] "green"  "yellow" "green"
[2,] "yellow" "green"  "yellow"
[3,] "green"  "yellow" "green"

Data Frames:
BMI <- data.frame(
  gender = c("Male", "Male", "Female"),
  height = c(152, 171.5, 165),
  weight = c(81, 93, 78),
  Age = c(42, 38, 26)
)
print(BMI)
  gender height weight Age
1   Male  152.0     81  42
2   Male  171.5     93  38
3 Female  165.0     78  26
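
Once the structures above are created, the usual next step is to pull values out of them. The following is a minimal sketch of indexing each structure, reusing the objects defined above; the variable names on the left-hand side (second_colour, over_30 and so on) are illustrative and not part of the original exercise.

```r
# Vector: positional and logical indexing
apple <- c("red", "green", "yellow")
second_colour <- apple[2]                 # element at position 2
long_names <- apple[nchar(apple) > 3]     # elements with more than 3 characters

# List: [[ ]] extracts the element itself, [ ] would return a sub-list
list1 <- list(c(2, 5, 3), 21.3, sin)
first_vector <- list1[[1]]
sin_of_zero <- list1[[3]](0)              # the third element is the sin function

# Matrix: [row, column]
M <- matrix(c("a", "a", "b", "c", "b", "a"), nrow = 2, ncol = 3, byrow = TRUE)
bottom_left <- M[2, 1]

# Data frame: $ selects a column, [rows, columns] filters rows and columns
BMI <- data.frame(
  gender = c("Male", "Male", "Female"),
  height = c(152, 171.5, 165),
  weight = c(81, 93, 78),
  Age = c(42, 38, 26)
)
mean_height <- mean(BMI$height)
over_30 <- BMI[BMI$Age > 30, c("gender", "Age")]

print(second_colour)
print(long_names)
print(first_vector)
print(bottom_left)
print(mean_height)
print(over_30)
```

Double brackets matter for lists: list1[1] is still a list of length one, while list1[[1]] is the numeric vector stored inside it.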
WEEK 2
1. Write an R program to take input from the user (name and age) and display the values. Also print the version of the R installation.

name = readline(prompt="Input your name: ")
age = readline(prompt="Input your age: ")
print(paste("My name is", name, "and I am", age, "years old."))
print(R.version.string)

Output

Input your name: abc


Input your age: 25
[1] "My name is abc and I am 25 years old."
[1] "R version 3.4.4 (2018-03-15)"

2. Write an R program to get the details of the objects in memory.

name = "Python"
n1 = 10
n2 = 0.5
nums = c(10, 20, 30, 40, 50, 60)
print(ls())
print("Details of the objects in memory:")
print(ls.str())

Output

[1] "n1" "n2" "name" "nums"
[1] "Details of the objects in memory:"
n1 : num 10
n2 : num 0.5
name : chr "Python"
nums : num [1:6] 10 20 30 40 50 60

3. Write an R program to create a sequence of numbers from 20 to 50, and find the mean of the numbers from 20 to 60 and the sum of the numbers from 51 to 91.

print("Sequence of numbers from 20 to 50:")
print(seq(20,50))
print("Mean of numbers from 20 to 60:")
print(mean(20:60))
print("Sum of numbers from 51 to 91:")
print(sum(51:91))

Output

[1] "Sequence of numbers from 20 to 50:"


[1] 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44
[26] 45 46 47 48 49 50
[1] "Mean of numbers from 20 to 60:"
[1] 40
[1] "Sum of numbers from 51 to 91:"
[1] 2911

4. Write an R program to create a vector which contains 10 random integer values between -50 and +50.

v = sample(-50:50, 10, replace=TRUE)
print("Content of the vector:")
print("10 random integer values between -50 and +50:")
print(v)

Output

[1] "Content of the vector:"


[1] "10 random integer values between -50 and +50:"
[1] 31 -13 -21 42 49 -39 20 12 39 -2

5. Write an R program to get the first 10 Fibonacci numbers.

Fibonacci <- numeric(10)
Fibonacci[1] <- Fibonacci[2] <- 1
for (i in 3:10) Fibonacci[i] <- Fibonacci[i - 2] + Fibonacci[i - 1]
print("First 10 Fibonacci numbers:")
print(Fibonacci)

Output

[1] "First 10 Fibonacci numbers:"


[1] 1 1 2 3 5 8 13 21 34 55

6. Write an R program to get all prime numbers up to a given number (based on the sieve of Eratosthenes).

prime_numbers <- function(n) {
  if (n >= 2) {
    x = seq(2, n)          # candidates that have not been crossed out yet
    prime_nums = c()
    for (i in seq(2, n)) {
      if (any(x == i)) {   # i survived every earlier pass, so it is prime
        prime_nums = c(prime_nums, i)
        # drop all multiples of i, but keep i itself
        x = c(x[(x %% i) != 0], i)
      }
    }
    return(prime_nums)
  } else {
    stop("Input number should be at least 2.")
  }
}
prime_numbers(12)

Output

[1] 2 3 5 7 11
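
The program above rebuilds the candidate vector on every pass. As a point of comparison, the sketch below follows the more common formulation of the sieve of Eratosthenes, which keeps one logical flag per number and crosses out multiples starting from i * i. The function name sieve is an assumption made for this illustration and is not part of the original manual.

```r
# Sieve of Eratosthenes using a logical vector (illustrative helper name)
sieve <- function(n) {
  if (n < 2) return(integer(0))
  is_prime <- rep(TRUE, n)   # is_prime[k] is TRUE while k is still a candidate
  is_prime[1] <- FALSE       # 1 is not prime
  i <- 2
  while (i * i <= n) {
    if (is_prime[i]) {
      # Smaller multiples of i were already crossed out by smaller primes
      is_prime[seq(i * i, n, by = i)] <- FALSE
    }
    i <- i + 1
  }
  which(is_prime)            # positions still flagged TRUE are the primes
}

print(sieve(12))
```

For the same input, sieve(12) returns the same primes as prime_numbers(12) above. Unlike prime_numbers, this version returns an empty vector for n < 2 instead of raising an error; that is a design choice of the sketch.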

7. Write an R program to print the numbers from 1 to 100, printing "Fizz" for multiples of 3, "Buzz" for multiples of 5, and "FizzBuzz" for multiples of both.

for (n in 1:100) {
  if (n %% 3 == 0 && n %% 5 == 0) {
    print("FizzBuzz")
  } else if (n %% 3 == 0) {
    print("Fizz")
  } else if (n %% 5 == 0) {
    print("Buzz")
  } else {
    print(n)
  }
}

Output

[1] 1
[1] 2
[1] "Fizz"
[1] 4
[1] "Buzz"
[1] "Fizz"
[1] 7
[1] 8
[1] "Fizz"
[1] "Buzz"
[1] 11
[1] "Fizz"
[1] 13
[1] 14
[1] "FizzBuzz"
[1] 16
[1] 17
[1] "Fizz"
[1] 19
[1] "Buzz"
[1] "Fizz"
[1] 22
[1] 23
[1] "Fizz"
[1] "Buzz"
[1] 26
[1] "Fizz"
[1] 28
[1] 29
[1] "FizzBuzz"
[1] 31
[1] 32
[1] "Fizz"
[1] 34
[1] "Buzz"
[1] "Fizz"
[1] 37
[1] 38
[1] "Fizz"
[1] "Buzz"
[1] 41
[1] "Fizz"
[1] 43
[1] 44
[1] "FizzBuzz"
[1] 46
[1] 47
[1] "Fizz"
[1] 49
[1] "Buzz"
[1] "Fizz"
[1] 52
[1] 53
[1] "Fizz"
[1] "Buzz"
[1] 56
[1] "Fizz"
[1] 58
[1] 59
[1] "FizzBuzz"
[1] 61
[1] 62
[1] "Fizz"
[1] 64
[1] "Buzz"
[1] "Fizz"
[1] 67
[1] 68
[1] "Fizz"
[1] "Buzz"
[1] 71
[1] "Fizz"
[1] 73
[1] 74
[1] "FizzBuzz"
[1] 76
[1] 77
[1] "Fizz"
[1] 79
[1] "Buzz"
[1] "Fizz"
[1] 82
[1] 83
[1] "Fizz"
[1] "Buzz"
[1] 86
[1] "Fizz"
[1] 88
[1] 89
[1] "FizzBuzz"
[1] 91
[1] 92
[1] "Fizz"
[1] 94
[1] "Buzz"
[1] "Fizz"
[1] 97
[1] 98
[1] "Fizz"
[1] "Buzz"

8. Write an R program to extract the first 10 English letters in lower case, the last 10 letters in upper case, and the 22nd to 24th letters in upper case.

print("First 10 letters in lower case:")
t = head(letters, 10)
print(t)
print("Last 10 letters in upper case:")
t = tail(LETTERS, 10)
print(t)
print("Letters between 22nd to 24th letters in upper case:")
e = LETTERS[22:24]
print(e)

Output

[1] "First 10 letters in lower case:"


[1] "a" "b" "c" "d" "e" "f" "g" "h" "i" "j"
[1] "Last 10 letters in upper case:"
[1] "Q" "R" "S" "T" "U" "V" "W" "X" "Y" "Z"
[1] "Letters between 22nd to 24th letters in upper case:"
[1] "V" "W" "X"

9. Write an R program to find the factors of a given number.

print_factors = function(n) {
  print(paste("The factors of", n, "are:"))
  for (i in 1:n) {
    if ((n %% i) == 0) {
      print(i)
    }
  }
}
print_factors(4)
print_factors(7)
print_factors(12)

Output

[1] "The factors of 4 are:"


[1] 1
[1] 2
[1] 4
[1] "The factors of 7 are:"
[1] 1
[1] 7
[1] "The factors of 12 are:"
[1] 1
[1] 2
[1] 3
[1] 4
[1] 6
[1] 12

10. Write an R program to find the maximum and the minimum value of a given vector.

nums = c(10, 20, 30, 40, 50, 60)
print('Original vector:')
print(nums)
print(paste("Maximum value of the said vector:", max(nums)))
print(paste("Minimum value of the said vector:", min(nums)))

Output

[1] "Original vector:"


[1] 10 20 30 40 50 60
[1] "Maximum value of the said vector: 60"
[1] "Minimum value of the said vector: 10"

11. Write an R program to get the unique elements of a given string and the unique numbers of a vector.

str1 = "The quick brown fox jumps over the lazy dog."
print("Original vector(string)")
print(str1)
print("Unique elements of the said vector:")
# str1 is a character vector of length 1, so unique() returns the whole string
print(unique(tolower(str1)))
nums = c(1, 2, 2, 3, 4, 4, 5, 6)
print("Original vector(number)")
print(nums)
print("Unique elements of the said vector:")
print(unique(nums))

Output

[1] "Original vector(string)"


[1] "The quick brown fox jumps over the lazy dog."
[1] "Unique elements of the said vector:"
[1] "the quick brown fox jumps over the lazy dog."
[1] "Original vector(number)"
[1] 1 2 2 3 4 4 5 6
[1] "Unique elements of the said vector:"
[1] 1 2 3 4 5 6
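
In the program above, unique() sees the sentence as a single element, which is why the output repeats the whole string. If the intent is to list the distinct characters instead, the string first has to be split. The following is a minimal sketch under that assumption; the variable names chars, unique_chars and letters_only are illustrative.

```r
str1 <- "The quick brown fox jumps over the lazy dog."

# strsplit() with an empty pattern breaks the string into single characters;
# it returns a list, so [[1]] takes the character vector for the first string
chars <- strsplit(tolower(str1), "")[[1]]

# unique() now works element by element
unique_chars <- unique(chars)
print(unique_chars)

# Keep only alphabetic characters (drops the space and the full stop)
letters_only <- sort(unique_chars[unique_chars %in% letters])
print(letters_only)
print(length(letters_only))
```

Because the sentence is a pangram, letters_only ends up containing every letter of the alphabet.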

12. Write an R program to create three vectors a, b, c with 3 integers each. Combine the three vectors into a 3×3 matrix where each column represents a vector. Print the content of the matrix.

a<-c(1,2,3)
b<-c(4,5,6)
c<-c(7,8,9)
m<-cbind(a,b,c)
print("Content of the said matrix:")
print(m)

Output

[1] "Content of the said matrix:"
     a b c
[1,] 1 4 7
[2,] 2 5 8
[3,] 3 6 9

WEEK 3

SUMMARIZING THE STATISTICS

1. summary(iris)

Output:

  Sepal.Length    Sepal.Width     Petal.Length    Petal.Width          Species
 Min.   :4.300   Min.   :2.000   Min.   :1.000   Min.   :0.100   setosa    :50
 1st Qu.:5.100   1st Qu.:2.800   1st Qu.:1.600   1st Qu.:0.300   versicolor:50
 Median :5.800   Median :3.000   Median :4.350   Median :1.300   virginica :50
 Mean   :5.843   Mean   :3.057   Mean   :3.758   Mean   :1.199
 3rd Qu.:6.400   3rd Qu.:3.300   3rd Qu.:5.100   3rd Qu.:1.800
 Max.   :7.900   Max.   :4.400   Max.   :6.900   Max.   :2.500

2. str(mtcars)

Output:

'data.frame': 32 obs. of 11 variables:

$ mpg : num 21 21 22.8 21.4 18.7 18.1 14.3 24.4 22.8 19.2 ...

$ cyl : num 6 6 4 6 8 6 8 4 4 6 ...

$ disp: num 160 160 108 258 360 ...

$ hp : num 110 110 93 110 175 105 245 62 95 123 ...

$ drat: num 3.9 3.9 3.85 3.08 3.15 2.76 3.21 3.69 3.92 3.92 ...

$ wt : num 2.62 2.88 2.32 3.21 3.44 ...

$ qsec: num 16.5 17 18.6 19.4 17 ...

$ vs : num 0 0 1 1 0 1 0 1 1 1 ...

$ am : num 1 1 1 0 0 0 0 0 0 0 ...

$ gear: num 4 4 4 3 3 3 3 4 4 4 ...

$ carb: num 4 4 1 1 2 1 4 2 2 4 ...


3. head(mtcars)

Output:

                   mpg cyl disp  hp drat    wt  qsec vs am gear carb
Mazda RX4         21.0   6  160 110 3.90 2.620 16.46  0  1    4    4
Mazda RX4 Wag     21.0   6  160 110 3.90 2.875 17.02  0  1    4    4
Datsun 710        22.8   4  108  93 3.85 2.320 18.61  1  1    4    1
Hornet 4 Drive    21.4   6  258 110 3.08 3.215 19.44  1  0    3    1
Hornet Sportabout 18.7   8  360 175 3.15 3.440 17.02  0  0    3    2
Valiant           18.1   6  225 105 2.76 3.460 20.22  1  0    3    1

4. tail(mtcars)

Output:

                mpg cyl  disp  hp drat    wt qsec vs am gear carb
Porsche 914-2  26.0   4 120.3  91 4.43 2.140 16.7  0  1    5    2
Lotus Europa   30.4   4  95.1 113 3.77 1.513 16.9  1  1    5    2
Ford Pantera L 15.8   8 351.0 264 4.22 3.170 14.5  0  1    5    4
Ferrari Dino   19.7   6 145.0 175 3.62 2.770 15.5  0  1    5    6
Maserati Bora  15.0   8 301.0 335 3.54 3.570 14.6  0  1    5    8
Volvo 142E     21.4   4 121.0 109 4.11 2.780 18.6  1  1    4    2

5. names(mtcars)

Output:

[1] "mpg" "cyl" "disp" "hp" "drat" "wt" "qsec" "vs" "am" "gear" "carb"

6. nrow(mtcars)

Output:

[1] 32

7. aggregate(Sepal.Length ~ Species, iris, mean)

Output:

     Species Sepal.Length
1     setosa        5.006
2 versicolor        5.936
3  virginica        6.588
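
The group-wise mean computed by aggregate() can be cross-checked with tapply(), and the formula interface extends to all numeric columns at once. The sketch below assumes an unmodified copy of the built-in iris dataset; the names by_species, via_tapply and all_means are illustrative.

```r
# Mean sepal length per species, as in the exercise above
by_species <- aggregate(Sepal.Length ~ Species, data = iris, FUN = mean)

# Same computation with tapply(): values, grouping factor, function
via_tapply <- tapply(iris$Sepal.Length, iris$Species, mean)

print(by_species)
print(via_tapply)

# "." on the left-hand side means "every column not used on the right",
# so this gives the mean of all four measurements per species
all_means <- aggregate(. ~ Species, data = iris, FUN = mean)
print(all_means)
```

aggregate() returns a data frame, which is convenient for further filtering or merging, while tapply() returns a named vector.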

8. fix(iris)

Output:

(fix() opens iris in a spreadsheet-style data editor and prints nothing to the console.)

9. sepalsub <- subset(iris, Sepal.Length > 7)
sepalsub

Output:

    Sepal.Length Sepal.Width Petal.Length Petal.Width   Species var6
103          7.1         3.0          5.9         2.1 virginica <NA>
106          7.6         3.0          6.6         2.1 virginica <NA>
108          7.3         2.9          6.3         1.8 virginica <NA>
110          7.2         3.6          6.1         2.5 virginica <NA>
118          7.7         3.8          6.7         2.2 virginica <NA>
119          7.7         2.6          6.9         2.3 virginica <NA>
123          7.7         2.8          6.7         2.0 virginica <NA>
126          7.2         3.2          6.0         1.8 virginica <NA>
130          7.2         3.0          5.8         1.6 virginica <NA>
131          7.4         2.8          6.1         1.9 virginica <NA>
132          7.9         3.8          6.4         2.0 virginica <NA>
136          7.7         3.0          6.1         2.3 virginica <NA>

WEEK 4

BINOMIAL DISTRIBUTION

1. choose(10,3)*((1/6)^3*(5/6)^7)

Output:

[1] 0.1550454

2. dbinom(3,size=10,prob=(1/6))

Output:

[1] 0.1550454

3. choose(10,0)*((1/6)^0*(5/6)^10) +

choose(10,1)*((1/6)^1*(5/6)^9)+

choose(10,2)*((1/6)^2*(5/6)^8)+

choose(10,3)*((1/6)^3*(5/6)^7)

Output:

[1] 0.9302722

4. pbinom(3,size=10,prob=(1/6),lower.tail=TRUE)

Output:

[1] 0.9302722

5. pbinom(3,size=10,prob=(1/6),lower.tail=FALSE)

Output:

[1] 0.06972784

6. dbinom(4,size=12,prob=0.2)

Output:

[1] 0.1328756
7. dbinom(0,size=12,prob=0.2)+

dbinom(1,size=12,prob=0.2)+

dbinom(2,size=12,prob=0.2)+

dbinom(3,size=12,prob=0.2)+

dbinom(4,size=12,prob=0.2)

Output:

[1] 0.9274445

8. pbinom(4,size=12,prob=0.2)

Output:

[1] 0.9274445

9. x <- pbinom(26,51,0.5)

print(x)

Output:

[1] 0.610116

10. x <- seq(0,50,by = 1)
y <- dbinom(x,50,0.5)
plot(x,y)

Output:

11. x <- qbinom(0.25,51,1/2)

print(x)

Output:

[1] 23

12. x <- rbinom(8,150,.4)

print(x)

Output:

[1] 60 71 57 60 62 62 50 59
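
Exercises 1 to 5 above show that the hand-written binomial formula and the built-in functions agree. The sketch below makes those relationships explicit in one place: dbinom() is the probability mass function, pbinom() is its cumulative sum, lower.tail = FALSE gives the complement, and qbinom() inverts pbinom(). The helper name binom_pmf and the seed value are assumptions made for illustration.

```r
# Parameters from exercises 1-5: 10 trials, success probability 1/6
n <- 10
p <- 1 / 6

# Binomial probability mass function written out by hand
binom_pmf <- function(k, n, p) choose(n, k) * p^k * (1 - p)^(n - k)

exact_3     <- dbinom(3, size = n, prob = p)                      # P(X = 3)
at_most_3   <- pbinom(3, size = n, prob = p)                      # P(X <= 3)
more_than_3 <- pbinom(3, size = n, prob = p, lower.tail = FALSE)  # P(X > 3)

# Each comparison should print TRUE
print(all.equal(exact_3, binom_pmf(3, n, p)))
print(all.equal(at_most_3, sum(dbinom(0:3, size = n, prob = p))))
print(all.equal(at_most_3 + more_than_3, 1))

# qbinom() returns the smallest q with P(X <= q) >= the given probability
q25 <- qbinom(0.25, size = 51, prob = 0.5)
print(pbinom(q25, 51, 0.5) >= 0.25 && pbinom(q25 - 1, 51, 0.5) < 0.25)

# rbinom() draws random values, so its output changes from run to run;
# set.seed() (arbitrary seed here) makes a run repeatable
set.seed(42)
draws <- rbinom(8, size = 150, prob = 0.4)
print(draws)
```

This also explains why the rbinom() output in exercise 12 will generally differ from the values printed in the manual unless the same seed is used.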

POISSON DISTRIBUTION

1. ppois(16,lambda = 12)

Output:

[1] 0.898709

2. ppois(16,lambda = 12,lower.tail=FALSE)

Output:

[1] 0.101291

3. rpois(16,lambda = 12)

Output:

[1] 12 8 12 10 13 8 11 13 12 11 15 12 10 11 14 11

4. dpois(16,lambda = 12)

Output:

[1] 0.05429334
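
The Poisson functions follow the same d/p/q/r naming pattern as the binomial ones. As a minimal sketch, the code below checks dpois() against the formula exp(-lambda) * lambda^k / k! and shows that ppois() is the running sum of dpois(). The helper name pois_pmf is an assumption made for illustration.

```r
# Parameters from the exercises above
lambda <- 12
k <- 16

# Poisson probability mass function written out by hand
pois_pmf <- function(k, lambda) exp(-lambda) * lambda^k / factorial(k)

exactly_k   <- dpois(k, lambda = lambda)                      # P(X = 16)
at_most_k   <- ppois(k, lambda = lambda)                      # P(X <= 16)
more_than_k <- ppois(k, lambda = lambda, lower.tail = FALSE)  # P(X > 16)

# Each comparison should print TRUE
print(all.equal(exactly_k, pois_pmf(k, lambda)))
print(all.equal(at_most_k, sum(dpois(0:k, lambda = lambda))))
print(all.equal(at_most_k + more_than_k, 1))
```

The lower and upper tail values printed in exercises 1 and 2 (0.898709 and 0.101291) add up to 1 for the same reason.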
NORMAL DISTRIBUTION

1. pnorm(84,mean=72,sd=15.2,lower.tail=FALSE)

Output:

[1] 0.2149176
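
The upper-tail probability above can also be obtained by standardizing first, using z = (x - mean) / sd, and then consulting the standard normal distribution. The sketch below shows that both routes agree and that qnorm() recovers the original value; the variable names are illustrative.

```r
# Values from exercise 1 above
mu <- 72
sigma <- 15.2
x <- 84

# Standardize: how many standard deviations x lies above the mean
z <- (x - mu) / sigma

upper_direct     <- pnorm(x, mean = mu, sd = sigma, lower.tail = FALSE)
upper_via_z      <- pnorm(z, lower.tail = FALSE)   # standard normal, mean 0, sd 1
upper_complement <- 1 - pnorm(x, mean = mu, sd = sigma)

# Each comparison should print TRUE
print(all.equal(upper_direct, upper_via_z))
print(all.equal(upper_direct, upper_complement))

# qnorm() is the inverse of pnorm(): it maps the probability back to x
x_back <- qnorm(upper_direct, mean = mu, sd = sigma, lower.tail = FALSE)
print(all.equal(x_back, x))
```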

2. x <- seq(-10, 10, by = .1)

y <- dnorm(x, mean = 2.5, sd = 0.5)

plot(x,y)

Output:

3. x <- seq(-10,10,by = .2)

y <- pnorm(x, mean = 2.5, sd = 2)

plot(x,y)

Output:
4. x <- seq(0, 1, by = 0.02)

y <- qnorm(x, mean = 2, sd = 1)

plot(x,y)

Output:

5. y <- rnorm(50)

hist(y, main = "Normal Distribution")

Output:

LINEAR REGRESSION

Write a program to implement linear and logistic regression.

head(cars)

Output:

speed dist

1 4 2

2 4 10

3 7 4

4 7 22

5 8 16

6 9 10

scatter.smooth(x=cars$speed,y=cars$dist,main="dist~speed")

Output:

linearMod<-lm(dist~speed,data=cars)

print(linearMod)

Output:

Coefficients:
(Intercept)        speed
    -17.579        3.932

summary(linearMod)

Output:

Residuals:
    Min      1Q  Median      3Q     Max
-29.069  -9.525  -2.272   9.215  43.201

Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept) -17.5791     6.7584  -2.601   0.0123 *
speed         3.9324     0.4155   9.464 1.49e-12 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 15.38 on 48 degrees of freedom
Multiple R-squared: 0.6511, Adjusted R-squared: 0.6438
F-statistic: 89.57 on 1 and 48 DF, p-value: 1.49e-12
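
For a single predictor, the coefficients reported by lm() have a closed form: the slope is cov(x, y) / var(x) and the intercept is mean(y) - slope * mean(x). The sketch below checks this against the fitted model and then uses predict() on new speeds. The new speed values (10 and 20) and the variable names are assumptions chosen for illustration.

```r
# Closed-form least-squares estimates for dist = intercept + slope * speed
slope <- cov(cars$speed, cars$dist) / var(cars$speed)
intercept <- mean(cars$dist) - slope * mean(cars$speed)

# Same model fitted with lm(), as in the exercise above
linearMod <- lm(dist ~ speed, data = cars)

# Should print TRUE: both routes give the same coefficients
print(all.equal(unname(coef(linearMod)), c(intercept, slope)))

# Predict stopping distance for speeds that need not appear in the data;
# the column name must match the predictor used in the formula
new_speeds <- data.frame(speed = c(10, 20))
predicted <- predict(linearMod, newdata = new_speeds)
print(predicted)

# The prediction is just the fitted line evaluated at the new speeds
print(all.equal(unname(predicted), intercept + slope * new_speeds$speed))
```

Predictions far outside the observed speed range should be treated with caution, since the fitted intercept is negative and a straight line is only an approximation of the data.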

LOGISTIC REGRESSION

input<-mtcars[,c("am","cyl","hp","wt")]

print(input)

Output:

                  am cyl  hp    wt
Mazda RX4          1   6 110 2.620
Mazda RX4 Wag      1   6 110 2.875
Datsun 710         1   4  93 2.320
Hornet 4 Drive     0   6 110 3.215
Hornet Sportabout  0   8 175 3.440
Valiant            0   6 105 3.460
Duster 360         0   8 245 3.570
Merc 240D          0   4  62 3.190
Merc 230           0   4  95 3.150
Merc 280           0   6 123 3.440
Merc 280C          0   6 123 3.440
Merc 450SE         0   8 180 4.070
Merc 450SL         0   8 180 3.730

input<-mtcars[,c("am","cyl","hp","wt")]

am.data=glm(formula=am~cyl+hp+wt,data=input,family=binomial)

print(summary(am.data))

Output:

Deviance Residuals:
     Min        1Q    Median        3Q       Max
-2.17272  -0.14907  -0.01464   0.14116   1.27641

Coefficients:
            Estimate Std. Error z value Pr(>|z|)
(Intercept) 19.70288    8.11637   2.428   0.0152 *
cyl          0.48760    1.07162   0.455   0.6491
hp           0.03259    0.01886   1.728   0.0840 .
wt          -9.14947    4.15332  -2.203   0.0276 *
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

(Dispersion parameter for binomial family taken to be 1)

Null deviance: 43.2297 on 31 degrees of freedom
Residual deviance: 9.8415 on 28 degrees of freedom
AIC: 17.841

Number of Fisher Scoring iterations: 8
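
The summary reports coefficients on the log-odds scale. To turn the fitted model into probabilities and class labels, predict() can be used as sketched below. The 0.5 cut-off and the variable names prob, eta and predicted_class are assumptions made for illustration; the manual does not specify a threshold.

```r
input <- mtcars[, c("am", "cyl", "hp", "wt")]
am.data <- glm(am ~ cyl + hp + wt, data = input, family = binomial)

# type = "link" gives the linear predictor (log-odds);
# type = "response" gives the probability that am == 1
eta <- predict(am.data, type = "link")
prob <- predict(am.data, type = "response")

# plogis() is the logistic function 1 / (1 + exp(-x)); should print TRUE
print(all.equal(prob, plogis(eta)))

# Convert probabilities to class labels with an assumed 0.5 cut-off
predicted_class <- ifelse(prob > 0.5, 1, 0)

# Confusion table: actual transmission type against the predicted one
print(table(actual = input$am, predicted = predicted_class))
```

Because the table is computed on the same 32 cars used for fitting, it describes how well the model fits the training data rather than how it would perform on unseen cars.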
