Practical (Bda)
BIG-DATA ANALYSIS(203105348)
B.Tech CSE 4th Year 7th Semester
7A9 (CSE)
Name: Aashutosh.S.yadav
Year/Sem: 4th
Course: B-tech(CSE)
2203051057108 1
Faculty Of Engineering& Technology
CERTIFICATE
Mr./Ms..............................................................................................................
with enrolment no. ................................................................ has
successfully completed his/her laboratory experiments in the Big Data
Analytics (203105348) From the Department of
...................................................................................................
INDEX
SR NO. PRACTICAL (DATE OF START, DATE OF COMPLETION)
1. To understand the overall programming architecture using Map Reduce API. (Start: 15-06-2024, Completion: 15-06-2024)
2. Write a program of Word Count in Map Reduce over HDFS. (Start: 22-06-2024, Completion: 29-06-2024)
3. Basic CRUD operations in MongoDB.
4. Store the basic information about students such as roll no, name, date of birth, and address of student using various collection types such as List, Set and Map.
5. Basic commands available for the Hadoop Distributed File System.
6. Basic commands available for HIVE Query Language.
7. Basic commands of HBASE Shell.
8. Creating the HDFS tables, loading them in Hive, and learning to join tables in Hive.
Practical-1
AIM: To understand the overall programming architecture using Map Reduce API.
A MapReduce task is divided into two main phases: the map phase and the reduce phase.
1. map(), filter(), and reduce() in Python.
2. These functions are most commonly used with lambda functions.
1. map():
"A map function executes the instructions or functionality provided to it on every item of an iterable, which could be a list, tuple, set, etc."
SYNTAX:
map(function, iterable)
EXAMPLE:
items = [1, 2, 3, 4, 5]
a = list(map(lambda x: x ** 3, items))
print(a)  # prints [1, 8, 27, 64, 125]
2. filter():
"A filter function in Python tests a user-defined condition against each element of an iterable and returns an iterator over the elements that satisfy the condition, in other words, those for which the function returns True."
SYNTAX:
filter(function, iterable)
EXAMPLE:
a = [1, 2, 3, 4, 5]
b = [2, 5, 0, 7, 3]
c = list(filter(lambda x: x in a, b))
print(c)  # prints [2, 5, 3]
3. reduce():
"The reduce function applies a two-argument function cumulatively to the items of an iterable and gives back a single value as the result."
The reduce function has to be imported from the functools module using the statement from functools import reduce.
SYNTAX:
reduce(function, iterable)
EXAMPLE:
from functools import reduce
a = reduce(lambda x, y: x * y, [1, 2, 3, 4])
print(a)  # prints 24
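The three functions are often chained, which mirrors the map and reduce phases named in the aim. The following is a minimal sketch only; the sample list and the variable names are illustrative and are not part of the original practical.

```python
from functools import reduce

numbers = [1, 2, 3, 4, 5, 6]

# map: square every item of the list
squares = map(lambda x: x ** 2, numbers)

# filter: keep only the even squares
even_squares = filter(lambda x: x % 2 == 0, squares)

# reduce: add the remaining values into a single result (0 is the start value)
total = reduce(lambda x, y: x + y, even_squares, 0)

print(total)  # prints 56
```

Both map() and filter() return lazy iterators, so no work is done until reduce() consumes them.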
Practical-2
AIM: Write a program of Word Count in Map Reduce over HDFS.
The input data set is split into independent blocks that are processed in parallel. Each input split is
converted into key-value pairs. The mapper logic processes each key-value pair and produces
intermediate key-value pairs based on the implementation logic. The resulting key-value pairs can
be of a different type from the input key-value pairs. The output of the mapper is the input of the
reducer. The intermediate key-value pairs are sorted by key, the reducer logic is applied to them,
and the output is produced in the desired format. The output is stored in HDFS.
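The flow described above can be traced on a small in-memory example. This is a sketch of the three steps (map, sort and group, reduce) in plain Python, not Hadoop code; the function names and the two input splits are hypothetical.

```python
from collections import defaultdict

def map_phase(split):
    """Emit an intermediate (word, 1) pair for every word in one input split."""
    return [(word.lower(), 1) for word in split.split()]

def shuffle_phase(pairs):
    """Group the intermediate values by key, in sorted key order."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return dict(sorted(groups.items()))

def reduce_phase(key, values):
    """Combine all values of one key into a single output pair."""
    return key, sum(values)

# Two hypothetical input splits; on a cluster each would go to its own mapper.
splits = ["big data is big", "data is everywhere"]

intermediate = [pair for split in splits for pair in map_phase(split)]
grouped = shuffle_phase(intermediate)
result = dict(reduce_phase(key, values) for key, values in grouped.items())

print(result)  # prints {'big': 2, 'data': 2, 'everywhere': 1, 'is': 2}
```

Note that the input values are lines of text while the output values are integers, which illustrates that the mapper output may differ in type from its input.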
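Python Code
import urllib.request
from operator import itemgetter

same_words = {}  # maps each lower-cased word to its count
story = 'http://sixty-north.com/c/t.txt'
request = urllib.request.Request(story)
response = urllib.request.urlopen(request)
each_word = []  # every word of the text, in reading order

# The loop headers are missing from the source; the structure below is assumed.
for line in response:
    line_words = line.decode('utf-8').split()
    for word in line_words:
        each_word.append(word)

for words in each_word:
    if words.lower() not in same_words:
        same_words[words.lower()] = 1
    else:
        same_words[words.lower()] = same_words[words.lower()] + 1

# Print the words from most to least frequent (assumed use of itemgetter).
for word, count in sorted(same_words.items(), key=itemgetter(1), reverse=True):
    print(word, count)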
Output:-
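Because the aim is word count over HDFS, the same logic is commonly split into a separate mapper and reducer that exchange tab-separated records, which is the convention used by Hadoop Streaming. The sketch below is an assumption about how that could look, not the code used in this practical; the function names and the sample lines are hypothetical, and the sorted() call stands in for the sort that Hadoop performs between the two steps.

```python
from itertools import groupby

def mapper(lines):
    """Yield one 'word<TAB>1' record per word."""
    for line in lines:
        for word in line.split():
            yield f"{word.lower()}\t1"

def reducer(sorted_records):
    """Sum the counts of consecutive records that share the same word."""
    parsed = (record.split("\t") for record in sorted_records)
    for word, group in groupby(parsed, key=lambda pair: pair[0]):
        yield word, sum(int(count) for _, count in group)

# Hypothetical input lines; a real job would read them from files in HDFS.
sample_lines = ["Hadoop stores data", "hadoop processes data"]

records = sorted(mapper(sample_lines))  # stands in for Hadoop's sort step
counts = dict(reducer(records))
print(counts)  # prints {'data': 2, 'hadoop': 2, 'processes': 1, 'stores': 1}
```

In a real streaming job each function would live in its own script, read from standard input, and write to standard output. The reducer relies on its input being sorted by key, which is why groupby() is sufficient here.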
Practical-3
AIM: Basic CRUD operations in MongoDB.
1. Create Operations
The create or insert operations are used to add new documents to a
collection. If the collection does not exist, MongoDB creates it in the
database.
You can perform create operations using the following methods provided by
MongoDB:
Method                        Description
db.collection.insertOne()     Inserts a single document into the collection.
db.collection.insertMany()    Inserts multiple documents into the collection.
2. Read Operations
The read operations are used to retrieve documents from a collection; in other
words, they query a collection for documents.
You can perform read operations using the following method provided by
MongoDB:
Method                        Description
db.collection.find()          Retrieves the documents that match a query.
3. Update Operations
The update operations are used to modify existing documents in a
collection. You can perform update operations using the following methods provided
by MongoDB:
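Method                        Description
db.collection.updateOne()     Updates the first document that matches the filter.
db.collection.updateMany()    Updates all documents that match the filter.
db.collection.replaceOne()    Replaces the first document that matches the filter.
Example 1: In this example, we update the age of Submit in the student collection
using the db.collection.updateOne() method.
Example 2: In this example, we update the year of course in all the documents
in the student collection using the db.collection.updateMany() method.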
4. Delete Operations
The delete operations are used to remove documents from a collection. You
can perform delete operations using the following methods provided by MongoDB:
Method                        Description
db.collection.deleteOne()     Deletes the first document that matches the filter.
db.collection.deleteMany()    Deletes all documents that match the filter.
Example 2: In this example, we delete all the documents from the student
collection using the db.collection.deleteMany() method.
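The same four operations can also be driven from Python. The sketch below assumes the PyMongo driver, whose collection methods are named insert_one, find, update_one, update_many, and delete_many. The helper names, the field names (name, age, year), and the way the collection is obtained are hypothetical and are not taken from this practical. Each helper receives the collection as an argument, so nothing here connects to a server.

```python
def create_student(collection, student):
    """Create: insert one document (mongosh: insertOne)."""
    return collection.insert_one(student)

def read_students(collection, query):
    """Read: return the documents that match the query (mongosh: find)."""
    return list(collection.find(query))

def update_age(collection, name, age):
    """Update: set the age in the first document with this name (updateOne)."""
    return collection.update_one({"name": name}, {"$set": {"age": age}})

def update_year_for_all(collection, year):
    """Update: set the course year in every document (updateMany)."""
    return collection.update_many({}, {"$set": {"year": year}})

def delete_all_students(collection):
    """Delete: remove every document from the collection (deleteMany)."""
    return collection.delete_many({})
```

With a running server, a collection could be obtained with pymongo.MongoClient()["school"]["student"], where the database name school is an assumption, and then passed to these helpers. An empty filter {} matches every document, which is why it is used for the update-all and delete-all cases.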
MongoDB Code:
Output:
db.orders.insertMany([
   { id: 0, name: "Pepperoni", size: "small", price: 19,
     quantity: 10, date: ISODate("2021-03-13T08:14:30Z") },
   { id: 1, name: "Pepperoni", size: "medium", price: 20,
     quantity: 20, date: ISODate("2021-03-13T09:13:24Z") },
   { id: 2, name: "Pepperoni", size: "large", price: 21,
     quantity: 30, date: ISODate("2021-03-17T09:22:12Z") },
   { id: 3, name: "Cheese", size: "small", price: 12,
     quantity: 15, date: ISODate("2021-03-13T11:21:39.736Z") },
   { id: 4, name: "Cheese", size: "medium", price: 13,
     quantity: 50, date: ISODate("2022-01-12T21:23:13.331Z") },
   { id: 5, name: "Cheese", size: "large", price: 14,
     quantity: 10, date: ISODate("2022-01-12T05:08:13Z") },
   { id: 6, name: "Vegan", size: "small", price: 17,
     quantity: 10, date: ISODate("2021-01-13T05:08:13Z") },
   { id: 7, name: "Vegan", size: "medium", price: 18,
     quantity: 10, date: ISODate("2021-01-13T05:10:13Z") }
])
db.orders.find({ size: "medium" });
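To make the effect of the find query concrete, the filter on size can be mirrored in plain Python over the same sample orders. This is only an in-memory illustration of equality matching, not MongoDB code; the date field is left out for brevity and the variable names are illustrative.

```python
# The same eight orders as in the insertMany call, without the date field.
orders = [
    {"id": 0, "name": "Pepperoni", "size": "small", "price": 19, "quantity": 10},
    {"id": 1, "name": "Pepperoni", "size": "medium", "price": 20, "quantity": 20},
    {"id": 2, "name": "Pepperoni", "size": "large", "price": 21, "quantity": 30},
    {"id": 3, "name": "Cheese", "size": "small", "price": 12, "quantity": 15},
    {"id": 4, "name": "Cheese", "size": "medium", "price": 13, "quantity": 50},
    {"id": 5, "name": "Cheese", "size": "large", "price": 14, "quantity": 10},
    {"id": 6, "name": "Vegan", "size": "small", "price": 17, "quantity": 10},
    {"id": 7, "name": "Vegan", "size": "medium", "price": 18, "quantity": 10},
]

# Keep only the documents whose size field equals "medium".
medium_orders = [order for order in orders if order["size"] == "medium"]

for order in medium_orders:
    print(order["id"], order["name"], order["price"])
# prints:
# 1 Pepperoni 20
# 4 Cheese 13
# 7 Vegan 18
```

Three of the eight documents match, one for each pizza name.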