
Practical (Bda)


Faculty Of Engineering & Technology

BIG-DATA ANALYSIS(203105348)
B.Tech CSE 4th Year 7th Semester

FACULTY OF ENGINEERING AND TECHNOLOGY

Big Data Analytics (203105348)


7th SEMESTER

7A9 (CSE)

Name: Aashutosh.S.yadav

Year/Sem: 4th

Enrolment No. 2203051057108

Course: B.Tech (CSE)


CERTIFICATE

This is to certify that

Mr./Ms..............................................................................................................
with enrolment no. ................................................................ has
successfully completed his/her laboratory experiments in the Big Data
Analytics (203105348) From the Department of

...................................................................................................

during the academic year ............................................

Date of Submission: -........................... Staff In charge: -...........................


INDEX

SR NO  PRACTICAL LIST                                    DATE OF START  DATE OF COMPLETION
1      To understand the overall programming             15-06-2024     15-06-2024
       architecture using Map Reduce API.
2      Write a program of Word Count in Map Reduce       22-06-2024     29-06-2024
       over HDFS.
3      Basic CRUD operations in MongoDB.
4      Store the basic information about students such
       as roll no, name, date of birth, and address of
       student using various collection types such as
       List, Set and Map.
5      Basic commands available for the Hadoop
       Distributed File System.
6      Basic commands available for HIVE Query Language.
7      Basic commands of HBASE Shell.
8      Creating the HDFS tables and loading them in
       Hive and learn joining of tables in Hive.


Practical-1
AIM: To understand the overall programming architecture using Map Reduce API.

A MapReduce task is divided into two phases: the Map phase and the Reduce phase.
1. The same ideas appear in Python as map(), filter(), and reduce().
2. These functions are most commonly used with lambda functions.
1. map():
"A map function executes the function provided to it on every item of an iterable, which could be a list, tuple, set, etc."
SYNTAX:
map(function, iterable)
EXAMPLE:
items = [1, 2, 3, 4, 5]
a = list(map(lambda x: x ** 3, items))
print(a)  # prints [1, 8, 27, 64, 125]

2. filter():
"A filter function in Python tests each element of an iterable against a user-defined condition and returns an iterator over the elements that satisfy the condition, in other words, those for which the function returns true."


SYNTAX:
filter(function, iterable)

EXAMPLE:
a = [1, 2, 3, 4, 5]
b = [2, 5, 0, 7, 3]
c = list(filter(lambda x: x in a, b))
print(c)  # prints [2, 5, 3]

3. reduce():
"The reduce function applies a two-argument function cumulatively to the items of an iterable and returns a single value as the result."
The reduce function must be imported from the functools module using the statement: from functools import reduce
SYNTAX:
reduce(function, iterable)
EXAMPLE:
from functools import reduce
a = reduce(lambda x, y: x * y, [1, 2, 3, 4])
print(a)  # prints 24
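
The three functions are often chained, which mirrors the map-then-reduce flow described at the start of this practical. The following is a minimal sketch and not part of the original practical; the helper name sum_of_even_squares and the sample list numbers are chosen here purely for illustration.

```python
from functools import reduce


def sum_of_even_squares(numbers):
    """Keep the even numbers, square them, then add the squares together."""
    evens = filter(lambda x: x % 2 == 0, numbers)      # select matching items
    squares = map(lambda x: x ** 2, evens)             # transform each item
    return reduce(lambda acc, x: acc + x, squares, 0)  # collapse to one value


numbers = [1, 2, 3, 4, 5]
print(sum_of_even_squares(numbers))  # prints 20
```

The third argument to reduce() is the initial value, so an empty input returns 0 instead of raising an error.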


Practical-2

Aim: Write a program of Word Count in Map Reduce over HDFS.


Description:
MapReduce is a framework for processing large datasets using a large number of computers
(nodes), collectively referred to as a cluster. It processes data stored in a distributed file system
(HDFS) by distributing the computation across multiple nodes, so that each node processes the
data that is stored at that node.
It consists of two main phases:
1. Mapper phase
2. Reducer phase

[Figure: Input file -> Map tasks -> Reduce -> HDFS]

The input data set is split into independent blocks that are processed in parallel. Each input split is
converted into key-value pairs. The mapper logic processes each key-value pair and produces
intermediate key-value pairs, which can be of a different type from the input key-value pairs.
The output of the mapper is the input to the reducer. The intermediate key-value pairs are sorted
and grouped by key, the reducer logic is applied to each group, and the output is produced in the
desired format. The output is stored in HDFS.
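
The flow described above can be imitated in a few lines of plain Python. This is only an in-memory sketch of the map, shuffle/sort, and reduce steps for word count; it does not use Hadoop or HDFS, and the function names and the sample input lines are assumptions made for illustration.

```python
from collections import defaultdict


def map_phase(line):
    """Emit an intermediate (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle_phase(pairs):
    """Group the intermediate values by key, with the keys in sorted order."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return dict(sorted(grouped.items()))


def reduce_phase(key, values):
    """Collapse all the values collected for one key into a single count."""
    return key, sum(values)


def word_count(lines):
    """Run the three steps in sequence over an iterable of text lines."""
    intermediate = [pair for line in lines for pair in map_phase(line)]
    grouped = shuffle_phase(intermediate)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


sample_lines = ["Deer Bear River", "Car Car River", "Deer Car Bear"]
print(word_count(sample_lines))
# prints {'bear': 2, 'car': 3, 'deer': 2, 'river': 2}
```

On a real cluster each call to map_phase would run on the node that holds the input split, and the grouping step would move data between nodes; here everything runs in one process.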


The overall MapReduce word count process

Python Code
import urllib.request

# URL of the text file whose words are counted
story = 'http://sixty-north.com/c/t.txt'
request = urllib.request.Request(story)
response = urllib.request.urlopen(request)

each_word = []   # every word in the file, in order
same_words = {}  # lower-cased word -> number of occurrences

# Loop over the entire file and collect all the words into a list
for line in response:
    # urlopen yields bytes, so decode each line before splitting it
    line_words = line.decode('utf-8').split()
    for word in line_words:
        each_word.append(word)

# Count the occurrences of each word, ignoring case
for words in each_word:
    if words.lower() not in same_words.keys():
        same_words[words.lower()] = 1
    else:
        same_words[words.lower()] = same_words[words.lower()] + 1

# Print every distinct word with its count
for each in same_words.keys():
    print("word = ", each, "count = ", same_words[each])


Output:-


Practical-3

Aim: Basic CRUD operations in MongoDB

1. Create Operations

The create or insert operations are used to add new documents to a collection. If the
collection does not exist, an insert operation creates it in the database.
You can perform create operations using the following methods provided by MongoDB:

Method                        Description
db.collection.insertOne()     It is used to insert a single document in the collection.
db.collection.insertMany()    It is used to insert multiple documents in the collection.
db.createCollection()         It is used to create an empty collection.

2. Read Operations
The read operations are used to retrieve documents from a collection; in other
words, read operations query a collection for documents.
You can perform read operations using the following method provided by MongoDB:

Method                  Description
db.collection.find()    It is used to retrieve documents from the collection.


Note: The pretty() method formats the result so that it is easy to read.

3. Update Operations
The update operations are used to modify existing documents in a collection.
You can perform update operations using the following methods provided by
MongoDB:


Method                        Description
db.collection.updateOne()     It is used to update a single document in the collection
                              that satisfies the given criteria.
db.collection.updateMany()    It is used to update multiple documents in the collection
                              that satisfy the given criteria.
db.collection.replaceOne()    It is used to replace a single document in the collection
                              that satisfies the given criteria.

Update Operations Example


Let’s look at some examples of the update operation from CRUD in MongoDB.

Example 1: In this example, we are updating the age of Submit in the student collection
using db.collection.updateOne() method.

Example 2: In this example, we are updating the year of course in all the documents
in the student collection using db.collection.updateMany() method.

4. Delete Operations
The delete operations are used to remove documents from a collection. You
can perform delete operations using the following methods provided by MongoDB:

Method                        Description
db.collection.deleteOne()     It is used to delete a single document from the collection
                              that satisfies the given criteria.
db.collection.deleteMany()    It is used to delete multiple documents from the collection
                              that satisfy the given criteria.

Delete Operations Examples


Let’s look at some examples of delete operation from CRUD in MongoDB.
Example 1: In this example, we are deleting a document from the student collection
using db.collection.deleteOne() method.

Example 2: In this example, we are deleting all the documents from the student
collection using db.collection.deleteMany() method.
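
The One and Many variants of the update and delete methods differ only in how many matching documents they affect. The following pure-Python sketch imitates that behaviour on a list of dictionaries. It does not connect to MongoDB; the helper names (matches, update_docs, delete_docs), the many flag, and the sample student records are all assumptions made for illustration.

```python
def matches(doc, criteria):
    """Return True if every field in criteria has the same value in doc."""
    return all(doc.get(field) == value for field, value in criteria.items())


def update_docs(collection, criteria, changes, many=False):
    """Apply changes to the first match (like updateOne) or to every match
    (like updateMany). Returns how many documents the changes were applied to."""
    updated = 0
    for doc in collection:
        if matches(doc, criteria):
            doc.update(changes)
            updated += 1
            if not many:
                break
    return updated


def delete_docs(collection, criteria, many=False):
    """Remove the first match (like deleteOne) or every match (like deleteMany).
    Returns the number of removed documents."""
    deleted = 0
    kept = []
    for doc in collection:
        if matches(doc, criteria) and (many or deleted == 0):
            deleted += 1
        else:
            kept.append(doc)
    collection[:] = kept  # modify the caller's list in place
    return deleted


students = [
    {"name": "Asha", "course": "CSE", "year": 3},
    {"name": "Ravi", "course": "CSE", "year": 3},
    {"name": "Meena", "course": "ECE", "year": 3},
]
print(update_docs(students, {"course": "CSE"}, {"year": 4}))  # prints 1
print(update_docs(students, {}, {"year": 4}, many=True))      # prints 3
print(delete_docs(students, {"course": "CSE"}))               # prints 1
print(delete_docs(students, {}, many=True))                   # prints 2
```

An empty criteria dictionary matches every document, which corresponds to passing an empty filter {} to the MongoDB methods.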


MongoDB Code:

Output:


• MongoDB Database Example:

db.orders.insertMany( [
{ id: 0, name: "Pepperoni", size: "small", price: 19,
quantity: 10, date: ISODate( "2021-03-13T08:14:30Z" ) },
{ id: 1, name: "Pepperoni", size: "medium", price: 20,
quantity: 20, date : ISODate( "2021-03-13T09:13:24Z" ) },
{ id: 2, name: "Pepperoni", size: "large", price: 21,
quantity: 30, date : ISODate( "2021-03-17T09:22:12Z" ) },
{ id: 3, name: "Cheese", size: "small", price: 12,
quantity: 15, date : ISODate( "2021-03-13T11:21:39.736Z" ) },
{ id: 4, name: "Cheese", size: "medium", price: 13,
quantity:50, date : ISODate( "2022-01-12T21:23:13.331Z" ) },
{ id: 5, name: "Cheese", size: "large", price: 14,
quantity: 10, date : ISODate( "2022-01-12T05:08:13Z" ) },
{ id: 6, name: "Vegan", size: "small", price: 17,
quantity: 10, date : ISODate( "2021-01-13T05:08:13Z" ) },
{ id: 7, name: "Vegan", size: "medium", price: 18,
quantity: 10, date : ISODate( "2021-01-13T05:10:13Z" ) }
])

db.orders.find({size: "medium"});

db.orders.insertOne({id: 9, name: "Vegan", size: "medium", price: 8,
quantity: 5, date: ISODate( "2021-01-22T05:10:13Z" )})
db.orders.updateMany({name:'Vegan'},{$set:{name:'Veg'}})
db.orders.find({name: 'Veg'});
db.orders.deleteMany({name:'Pepperoni'})
// Returns no documents: a single name cannot equal both 'Veg' and 'Cheese'
db.orders.find({ $and: [ {name: 'Veg'}, { name: 'Cheese'} ] })
//db.orders.find()
db.orders.find({ $or: [ {quantity: 10}, { name: 'Cheese'} ] })
//db.orders.find()
// quantity is stored as a number, so compare with the number 20, not the string '20'
db.orders.find( { "quantity": { $not: { $gt: 20 } } } )

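
The logical operators in the find() queries above can be read as ordinary boolean logic applied to each document in turn. The sketch below evaluates the same three conditions in plain Python. It does not use MongoDB; the function names are hypothetical, the rows are a hand-picked subset of the orders data as it would look after the rename to 'Veg', and quantity is compared as a number.

```python
orders = [
    {"id": 3, "name": "Cheese", "size": "small", "quantity": 15},
    {"id": 4, "name": "Cheese", "size": "medium", "quantity": 50},
    {"id": 6, "name": "Veg", "size": "small", "quantity": 10},
    {"id": 7, "name": "Veg", "size": "medium", "quantity": 10},
]


def find(collection, predicate):
    """Return the ids of the documents for which predicate is true."""
    return [doc["id"] for doc in collection if predicate(doc)]


def not_greater_than(doc, field, limit):
    """Mimic {$not: {$gt: limit}}: true when the field is missing or <= limit."""
    value = doc.get(field)
    return value is None or not value > limit


# $and: both conditions must hold for the same document
both = find(orders, lambda d: d["name"] == "Veg" and d["name"] == "Cheese")
# $or: at least one condition must hold
either = find(orders, lambda d: d["quantity"] == 10 or d["name"] == "Cheese")
# $not with $gt: quantity is not greater than 20
not_gt_20 = find(orders, lambda d: not_greater_than(d, "quantity", 20))

print(both)       # prints []
print(either)     # prints [3, 4, 6, 7]
print(not_gt_20)  # prints [3, 6, 7]
```

The empty result for the first query shows why two equality tests on the same field are normally combined with $or (or $in) and not with $and.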
