Big Data Past Questions

The document is an examination paper from Tribhuvan University for Big Data Technologies, covering various topics such as Big Data characteristics, Google File System (GFS) architecture, MapReduce framework, NoSQL databases, and HBase. It includes questions on data analytics processes, fault tolerance, and the components of Hadoop. Candidates are required to answer all questions in their own words and assume suitable data if necessary.

Uploaded by

sharproentgen3

TRIBHUVAN UNIVERSITY Exam.
INSTITUTE OF ENGINEERING Full Marks 80
Examination Control Division
2080 Chaitra

Subject: - Big Data Technologies

1. What are the 5 Vs of Big Data? Explain with examples. Justify how Distributed System plays a vital role in Big Data. [4+5]
2. List the various types of NoSQL database with their data structure and usage. [8]
3. Let us assume that YouTube as a client is still using GFS. Few 100 videos in YouTube are really popular and 10,000 of users are accessing it simultaneously. How do you think GFS handles this read scenario? Explain this process along with GFS architecture.
4. What are the main components of Map-Reduce Job? Explain in brief Data Flow technique of Map-Reduce Framework. What happens if Data Node fails during write Process? [4+4+3]
5. You are appointed by Ministry of Health as an Engineer to support its plan to perform data analytics of all governmental hospitals using a columnar NoSQL database; HBase. Explain the architecture of HBase and discuss how you design and deploy to fully exploit its features in this scenario. [9]
6. Explain the architecture of Lucene and its role in a typical text search application. Also, describe different analyzers available in Lucene in brief. [6+5]
7. Compare and contrast the different modes of Hadoop environment setup. Can we set up Hadoop in the cloud? If so, how, and explain the scenarios when these different modes are preferable? [4+4]
8. Write short notes on: [4x4]
a) Functional Programming
b) Fault tolerance in GFS
c) Recent Trends in Big Data
d) MongoDB
***
TRIBHUVAN UNIVERSITY Exam.
INSTITUTE OF ENGINEERING
Examination Control Division Programme BEI, BCT Pass Marks
2079 Chaitra Year / Part IV / II Time 3 hrs.

Subject: - Big Data Technologies (Elective II) (CT 76507)

✓ Candidates are required to give their answers in their own words as far as practicable.
✓ Attempt All questions.
✓ The figures in the margin indicate Full Marks.
✓ Assume suitable data if necessary.
1. What are the common challenges of Big Data? Explain the Data Analytics Process. [5+5]
2. Draw the architecture of GFS. Explain the process of data reading and data writing in GFS. [2+8]
3. What is the Map-Reduce Framework and how does it work? Explain with examples. [10]
4. Differentiate SQL vs NoSQL. Define data consistency, availability and partition tolerance in terms of NoSQL. Explain which NoSQL supports of the above terms. [3+3+4]
5. Define Lucene. Describe the typical components involved in the search application. [10]
6. Briefly describe the daemons of Hadoop. Explain the role of HDFS (Hadoop Distributed File System) in Hadoop and write down the syntax for file upload, download, list and view content of file commands for HDFS. [10]
7. Write short notes on: [4x5]
a) Functional Programming
b) Cassandra
c) Elasticsearch
d) Amazon cloud
***


TRIBHUVAN UNIVERSITY
INSTITUTE OF ENGINEERING Level BE Full Marks 80
Examination Control Division Programme BCT
2078 Chaitra Year / Part IV / II

✓ Candidates are required to give their answers in their own words as far as practicable.
✓ Attempt All questions.
✓ The figures in the margin indicate Full Marks.
✓ Assume suitable data if necessary.

1. What is big data? Explain 5 V's of big data by relating with real world use cases. [2+8]
2. Why was Google File System [illegible]? What are the assumptions made by GFS? Explain operations performed by [illegible] in GFS along with architecture diagram. [2+2+6]
   Explain in detail how failures are handled in GFS.
3. Why do we require Map-Reduce framework? [illegible] Map-Reduce framework along with an appropriate example.
4. List down the categories of NoSQL databases. Give an overview of architecture of HBase; a distributed columnar database. [6+6]
5. List the different analyzers used in data indexing along with explanation. Explain the data indexing process. [5+5]
6. List down the different components installed in the Hadoop cluster. Explain its workflow. How fault tolerance and scalability is handled by the Hadoop cluster? [2+6+4]
7. Write short notes on: [4x4]
a) Functional Programming
b) Structured, Semi-structured and Unstructured data
c) Apache Cassandra
d) Distributed Searching
***
TRIBHUVAN UNIVERSITY Exam. Regular
INSTITUTE OF ENGINEERING Level BE Full Marks 80
Examination Control Division Programme BEX, BCT Pass Marks 32
2077 Chaitra Year / Part IV/II Time 3 hrs.

✓ Candidates are required to give their answers in their own words as far as practicable.
✓ Attempt All questions.
✓ Assume suitable data if necessary.

1. What do you know about the term "Big Data" and what are the five V's of Big Data? How Big Data are helpful in increasing business revenue? [6+6]
2. Explain GFS Architecture. Why single master is not a bottleneck in GFS cluster? [8+5]
3. Explain how MapReduce works with suitable example. Explain distributed execution in MapReduce with example. [6+6]
4. Explain HBase with architecture. How can you model RDBMS table in MongoDB? Give an example. [4+6]
5. Explain the process of indexing and searching in Lucene with proper diagrams. [8]
6. Explain various components of Hadoop in brief. [10]
7. Write short notes on: [3x5]
a) Elastic Search
b) CAP Theorem
c) JSON and XML
***
NATIONAL COLLEGE OF ENGINEERING

QUESTION BANK
Big Data Technologies
(Elective II) (CT 76507)

BCT IV/II

TRIBHUVAN UNIVERSITY Exam. Regular / Back
INSTITUTE OF ENGINEERING Level BE Full Marks 80
Examination Control Division Programme BCT, BEX Pass Marks 32
2076 Bhadra Year / Part IV/II Time 3 hrs.

Subject: - Big Data Technologies (Elective II) (CT 76507)

Candidates are required to give their answers in their own words as far as practicable.
Attempt All questions.
The figures in the margin indicate Full Marks.
Assume suitable data if necessary.

1. What are the main characteristics of big data? [5]
2. Define HDFS? How client reads data from HDFS? Explain with the help of suitable block diagram. [10]
3. What are the differences between structured and unstructured data? Explain with suitable examples. [10]
4. How traditional data differ from big data? List out five distinct differences? [5]
5. Explain GFS architecture. How data and control messages flow in GFS architecture. Explain with suitable flow diagram. [10]
6. What is elastic search? What indexes will be used during elastic search. Explain with suitable example. [10]
7. What are the various components of HBase database. Explain with a suitable block diagram. [10]
8. Why HBase is called column-oriented NoSQL database built on top of HDFS? What are the commands to STORE, SELECT, MODIFY, and DELETE records from a table of HBase. [10]
9. Write short notes on: [2x5]
a) Master-Slave architecture
b) Zookeeper
c) Client-Server architecture
d) Hadoop Map reduce
e) Application of Big data analytics
***
25 F TRIBHUVAN UNIVERSITY
INSTITUTE OF ENGINEERING
Examination Control Division Programme
2075 Bhadra IV/II

Subject: - Big Data Technologies (Elective II) (CT 76507)
✓ Candidates are required to give their answers in their own words as far as practicable.
✓ Attempt All questions.
✓ The figures in the margin indicate Full Marks.
✓ Assume suitable data if necessary.

1. Why distributed computing is necessary for big data? [5]
2. Define DFS. How client writes data in HDFS? Explain with the help of suitable block diagram. [10]
3. The data in big data warehouse is called hybrid data. Explain with suitable examples. [10]
4. How GFS differ from other File Systems? List out five distinct differences. [5]
5. What is the main role of GFS Master during read and write processes? How data and control messages flow in GFS architecture. Explain with suitable flow diagram. [10]
6. Map Reduce is the heart of Hadoop eco-system? Define work flow of Map reduce with suitable examples. [10]
7. Clock synchronization in DFS may be the big challenge. How this clock synchronization problem can be solved? [10]
8. HBase, Cassandra and MongoDB are called column-oriented NoSQL database? How row-oriented database differ from column-oriented database? Explain with suitable examples. [10]
9. Write short notes on: [5x2]
a) Sqoop and Flume
b) Zookeeper
c) Oozie
d) Pig and Hive
e) Client-Server and Master-Slave architecture
***
35F TRIBHUVAN UNIVERSITY
INSTITUTE OF ENGINEERING
Examination Control Division
2074 Bhadra

Subject: - Big Data Technologies
✓ Candidates are required to give their answers in their own words as far as practicable.
✓ Attempt All questions.
✓ The figures in the margin indicate Full Marks.
✓ Assume suitable data if necessary.

1. a) Explain with example about the distributed system in Big Data. [8]
   b) What is the role of Data Scientist? [4]
2. a) Explain the architecture of Google File System (GFS). [8]
   b) What is availability and fault tolerance in Google File System? [5]
3. a) Explain in brief Data Flow technique of Map-Reduce Framework. [8]
   b) What is Optimization and Data Locality in Map Reduce? [4]
4. Differentiate between structured and unstructured data and discuss the Taxonomy of NoSQL. [8]
5. Explain the components of Indexing and searching. [8]
6. a) Explain in brief five daemons of Hadoop. [8]
   b) What is the role of Hadoop Distributed File System in Hadoop? [4]
7. Write short notes on: [5x3]
i) Elastic Search
ii) HBase Architecture
iii) Functional Programming
***

35 F TRIBHUVAN UNIVERSITY
INSTITUTE OF ENGINEERING
Examination Control Division
2073 Magh IV/II Time

Subject: - Big Data Technologies (Elective II) (CT 76507)

✓ Candidates are required to give their answers in their own words as far as practicable.
✓ Attempt All questions.
✓ The figures in the margin indicate Full Marks.
✓ Assume suitable data if necessary.

1. Why do we need data analytics process? Explain the role of Distributed computing in Big data. [5+5]
2. Why do we have large and fixed sized Chunks in GFS? What can be the demerits of that design? [10]
3. How is MapReduce library designed to tolerate different machines (map/reduce nodes) failure while executing MapReduce job? [10]
4. For following data, list the input to/output from both the map and reduce functions for getting maximum marks of each college. [10]

Student Name    College Name    Final Marks in %
Ram             ABC             70
Sita            ABC             80
Hari            ABC             60
Gita            XYZ             90
Rita            XYZ             80
Shyam           PQR             90
Laxmi           PQR             70
Gopal           PQR             60

OR
What is the combiner function in mapreduce? Explain its purpose with suitable example. [10]
5. Explain the term NO-SQL. Explain CAP theorem with suitable block diagram. [3+7]
6. Describe the typical components involved in search application. [10]
7. What are different daemons in HADOOP cluster? Explain each in details. [3+7]
8. Write short notes on any two of following. [2x5]
a) Shadow Master and Cluak services
b) Analyzers available in Lucene
c) Vertical and Horizontal Scalability
***
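Question 4 of the 2073 Magh paper (maximum marks per college) can be traced with a small in-memory simulation. The Python sketch below is not Hadoop code: `map_fn`, `shuffle` and `reduce_fn` are hypothetical names chosen for illustration, and Shyam's college is read as PQR, which is an assumption about the scanned table.

```python
from collections import defaultdict

# Records copied from the question's table: (student, college, final marks in %).
# Assumption: Shyam's college is PQR.
RECORDS = [
    ("Ram", "ABC", 70), ("Sita", "ABC", 80), ("Hari", "ABC", 60),
    ("Gita", "XYZ", 90), ("Rita", "XYZ", 80),
    ("Shyam", "PQR", 90), ("Laxmi", "PQR", 70), ("Gopal", "PQR", 60),
]


def map_fn(record):
    """Map: take one input record and emit a (college, marks) pair."""
    _student, college, marks = record
    return [(college, marks)]


def shuffle(pairs):
    """Group mapper output by key, imitating the framework's shuffle phase."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_fn(college, marks_list):
    """Reduce: keep only the maximum marks seen for one college."""
    return college, max(marks_list)


def max_marks_per_college(records):
    mapped = [pair for record in records for pair in map_fn(record)]
    grouped = shuffle(mapped)  # e.g. "ABC" -> [70, 80, 60]
    return dict(reduce_fn(k, v) for k, v in grouped.items())


result = max_marks_per_college(RECORDS)
print(result)  # {'ABC': 80, 'XYZ': 90, 'PQR': 90}
```

Because `max` is associative and commutative, the same `reduce_fn` could also run as a combiner on each mapper's local output, which is one way to approach the alternative question on combiner functions.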
35 F TRIBHUVAN UNIVERSITY Exam.
INSTITUTE OF ENGINEERING
Examination Control Division
2073 Bhadra

Subject: - Big Data Technologies (Elective II) (CT 76507)

✓ Candidates are required to give their answers in their own words as far as practicable.
✓ Attempt All questions.
✓ The figures in the margin indicate Full Marks.
✓ Assume suitable data if necessary.

1. What are the current trends in big data analytics? What are the technical challenges and characteristics of big data? [10]
2. Explain the GFS Architecture. Why single master is not a bottleneck in GFS cluster. [5+5]
3. How does MAP-REDUCE work? Explain each step with suitable example. [5+5]
4. Discuss the architecture of HBase in short. Explain eventual consistency and tunable consistency in context of Cassandra. [10]
5. Explain LUCENE architecture and its data indexing approach. [10]
6. What are the components of Hadoop? Explain each in brief. [10]
7. How do you find max and min occurrence of the words in a given text document. Explain. [10]
8. Write short notes on: (any two) [2x5]
a) CAP theorem
b) Role of Data Scientist in Big data
c) Amazon cloud
***
35F TRIBHUVAN UNIVERSITY
INSTITUTE OF ENGINEERING
Examination Control Division
2074 Magh

Subject: - Big Data Technologies

Candidates are required to give their answers in their own words as far as practicable.
Attempt All questions.
The figures in the margin indicate Full Marks.
Assume suitable data if necessary.

1. How big data differ from traditional data? List out five distinct differences? [1+5]
2. What are the sources of structured, semi-structured and un-structured data in real-world? [10]
3. Define DFS. How client writes data in HDFS? Explain with help of suitable block diagram. [10]
4. Clock synchronization in DFS may be the big challenge. How this clock synchronization problem can be solved? [1+5]
5. How data and control messages flow in GFS architecture. Explain with suitable flow diagram. [10]
6. How GFS provides fault tolerance. How it allows tolerating chunk servers failures? [10]
7. How word count job is performed for the following file in HDFS using a Map-reduce flow chart? [10]
File.txt (file size: 200MB)
Hi how are you
How is your job
How is your family
How is your brother
How is your sister
What is the time now
What is the strength of Hadoop
8. What are the differences between row and column oriented database? Why HBase, Cassandra and MongoDB are called column oriented NoSQL database? [10]
9. Write short notes on: [4x2]
i) Zookeeper and Oozie
ii) Pig and Hive
***
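The word count job in question 7 of the 2074 Magh paper can be followed step by step with a toy simulation. This Python sketch only imitates the map, shuffle and reduce phases in memory; it does not use Hadoop, the function names are hypothetical, and folding words to lower case is an assumption (the question does not say whether "How" and "how" count as the same word). On a real cluster a 200 MB file would span several HDFS blocks, each typically handled by its own map task, which the sketch ignores.

```python
from collections import defaultdict

# The seven lines of File.txt as given in the question.
FILE_LINES = [
    "Hi how are you",
    "How is your job",
    "How is your family",
    "How is your brother",
    "How is your sister",
    "What is the time now",
    "What is the strength of Hadoop",
]


def map_fn(line):
    """Map: emit (word, 1) for every word of one input line (case folded)."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle(pairs):
    """Group the mapper output by word, as the shuffle/sort phase would."""
    grouped = defaultdict(list)
    for word, count in pairs:
        grouped[word].append(count)
    return grouped


def reduce_fn(word, counts):
    """Reduce: sum the partial counts of one word."""
    return word, sum(counts)


def word_count(lines):
    pairs = [pair for line in lines for pair in map_fn(line)]
    return dict(reduce_fn(w, c) for w, c in shuffle(pairs).items())


counts = word_count(FILE_LINES)
for word in sorted(counts):
    print(word, counts[word])
```

With case folding, "is" is counted 6 times, "how" 5 times and "your" 4 times; a case-sensitive variant would simply drop the `.lower()` call in `map_fn`.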
TRIBHUVAN UNIVERSITY Exam. Regular
INSTITUTE OF ENGINEERING Level BE Full Marks 80
Examination Control Division Programme BEX, BCT Pass Marks 32
2072 Ashwin Year / Part IV/II Time 3 hrs.

Subject: - Big Data Technologies (Elective II) (CT 76507)

✓ Candidates are required to give their answers in their own words as far as practicable.
✓ Attempt All questions.
✓ The figures in the margin indicate Full Marks.
✓ Assume suitable data if necessary.

1. What are the technical challenges and characteristics of a big data? Who are the data scientists, list out their roles and skills. [6+6]
2. With diagram, explain general architecture of Google File System. [10]
OR
a) Why do we have single master in a GFS and millions of chunk servers? [4]
b) A cluster contains 1500 machines, each having 500 GB disc capacity. Calculate approximate the number of the chunk servers, the blocks and the total available size if default chunk replica is 3 and 5 respectively. [6]
3. a) What is a map reduce? Explain the execution overview of the map reduce. [6]
   b) Draw the output of mapreduce of the following lines: [4]
"big users big volume data cloud contributes bid data"
"facebook has big users facebook operates big data"
4. a) Explain a CAP theorem. [5]
   b) Differentiate between a RDBMS and a NoSQL Databases. [3]
5. Explain taxonomy of a NoSQL databases. Explain Cassandra database in brief. [10]
OR
Using a MongoDB database,
a) Create a collection named "posts", insert following records: [3]
title: MongoDB, description: MongoDB is a NoSQL database, by: Tom,
Comments: We use MongoDB for unstructured data, likes: 100

i) Now write a query to search title of the post written by Tom. [3]
ii) Write mapReduce function to count number of posts created by various users. [4]
6. What is the Lucene? Describe the typical components involved in the search application. [10]
7. Explain various components of Hadoop in brief. [10]
8. Write short notes on: (any two) [5x2]
i) Combiner Functions
ii) Fault tolerant systems
iii) JSON
iv) Unstructured data
***
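The capacity calculation in the alternative to question 2 of the 2072 Ashwin paper (1500 machines, 500 GB each, replication 3 and 5) comes down to simple arithmetic once a few things are fixed. The Python sketch below assumes that every machine acts as a chunkserver, that the chunk size is 64 MB, that 1 GB = 1024 MB, and that metadata and file system overhead are ignored; none of these are stated in the question, and `gfs_capacity` is a hypothetical helper name.

```python
def gfs_capacity(machines, disk_gb, replicas, chunk_mb=64):
    """Estimate chunk counts and usable space for a GFS-like cluster.

    Assumptions (not given in the question): all machines are chunkservers,
    chunks are chunk_mb megabytes, 1 GB = 1024 MB, and overhead is ignored.
    """
    raw_gb = machines * disk_gb                # total raw disk space
    chunk_slots = raw_gb * 1024 // chunk_mb    # chunk replicas that fit on disk
    return {
        "chunkservers": machines,
        "raw_gb": raw_gb,
        "chunk_slots": chunk_slots,
        "distinct_chunks": chunk_slots // replicas,  # each chunk stored `replicas` times
        "usable_gb": raw_gb // replicas,             # space left for unique data
    }


for r in (3, 5):
    print(r, gfs_capacity(1500, 500, r))
```

Under these assumptions the raw space is 750,000 GB in both cases; replication 3 leaves about 250,000 GB (4,000,000 distinct chunks) and replication 5 leaves about 150,000 GB (2,400,000 distinct chunks).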
15F TRIBHUVAN UNIVERSITY Exam.
INSTITUTE OF ENGINEERING Level BE Full Marks 80
Examination Control Division Programme BEX, BCT Pass Marks 32
2072 Magh Year / Part IV/II Time 3 hrs.

Subject: - Big Data Technologies (Elective II) (CT 76507)

Candidates are required to give their answers in their own words as far as practicable.
Attempt All questions.
The figures in the margin indicate Full Marks.
Assume suitable data if necessary.

1. [illegible] distributed systems help [illegible] Big Data [illegible]
2. Explain how master implements garbage collection and detects stale replica in a GFS.
3. Why do we have large and fixed sized Chunks in a GFS? What are the demerits of that design? [10]
4. How a MapReduce library designed to tolerate different machines (map/reduce nodes) failure while executing MapReduce job? [8]
5. For following data, list the input to/output from both the map and reduce functions for getting maximum marks of each college. [10]

Student Name    College Name    Final Marks in %
Ram             ABC             70
Sita            ABC             80
Hari            ABC             60
Gita            XYZ             90
Rita            XYZ             80
Shyam           PQR             90
Laxmi           PQR             70
Gopal           PQR             60

OR
What is the combiner function in mapreduce? Explain its purpose with suitable example.
6. What is the difference between a structured and unstructured data. Explain the eventual consistency and tunable consistency in context of Cassandra. [10]
7. What is an elastic search? Explain various types of analyzers.
8. What are the components of the Hadoop? For a Hadoop cluster with 128 MB block size, how many mappers will Hadoop [illegible] executing mapper function on 1 GB of data? Justify with explanation. [10]
***
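The last question of the paper above (a Hadoop cluster with 128 MB block size processing 1 GB of data) is usually answered by counting input splits. The Python sketch below assumes one map task per input split, a split size equal to the block size, a splittable input file, and 1 GB = 1024 MB; `estimate_map_tasks` is a hypothetical helper, not a Hadoop API.

```python
import math


def estimate_map_tasks(input_mb, split_mb=128):
    """Estimate the number of map tasks as ceil(input size / split size).

    Assumes one map task per split and split size == HDFS block size.
    """
    return math.ceil(input_mb / split_mb)


print(estimate_map_tasks(1024))      # 1 GB with 128 MB blocks -> 8
print(estimate_map_tasks(200))       # a 200 MB file with 128 MB blocks -> 2
print(estimate_map_tasks(200, 64))   # the same file with 64 MB blocks -> 4
```

The rounding up matters: a final partial block still needs its own split, so a 200 MB file with 128 MB blocks is estimated at two map tasks, not one.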

358 TRIBHUVAN UNIVERSITY Exam. Regular / Back
INSTITUTE OF ENGINEERING Level BE Full Marks 80
Examination Control Division Programme BEX, BCT Pass Marks 32
2071 Bhadra Year / Part IV/II Time 3 hrs.

Subject: - Big Data Technologies (Elective II)

Candidates are required to give their answers in their own words as far as practicable.
Attempt All questions.
The figures in the margin indicate Full Marks.
Assume suitable data if necessary.

1. What are the BIG-DATA challenges? Explain the data analytics process in terms of BIG-DATA. [3+7]
2. a) Explain the control flow of write mutation with diagram. [7]
   b) Explain the meta data stored by GFS master. [8]
3. a) Explain garbage collection implemented by GFS. Explain its purpose against implementing eager deletion for storage reallocation. [7]
   b) Explain CAP theorem and Eventual consistency. Also, explain the reason why some NoSQL databases like Cassandra sacrifice absolute consistency for absolute availability. [8]
4. How map-reduce works in distributed fashion? Describe the parallel efficiency of map-reduce with suitable block diagram. [3+7]
5. List out the HADOOP daemons. How HADOOP and GFS are similar in terms of design architecture. [2+8]
6. Explain the term "NO-SQL". Justify "for distributed scenario normalization contradict the data availability". [3+7]
OR
Write down the map-reduce program to find the word frequency. [10]
7. What are the data indexing steps? Describe the components of search application. [3+7]
***
