
BDAA Semester Question Bank


Department of Computer Science and Engineering

Subject Name: BIG DATA ANALYTICS AND APPLICATIONS


Subject Code: MR20-1CS0307
Year & Semester: IV-I
Unit-Wise Question Bank

Q. No Question Marks Section UNIT


1 What is Big Data? Explain the data types of Big Data with examples. 12 Section-I 1
2 Explain the 5 V's of Big Data. 12 Section-I 1
3 What is Hadoop? Explain the various modules of Hadoop. 12 Section-I 1
4 Explain the Hadoop architecture with a neat sketch. 12 Section-I 1
5 What is HDFS? Explain the main components of HDFS. 12 Section-I 1
6 Explain the importance of HDFS in the Hadoop framework. 12 Section-I 1
7 Explain the HDFS architecture and write the advantages and disadvantages of HDFS. 12 Section-I 1
8 What is the difference between a regular file system and HDFS? Why is HDFS fault-tolerant? 12 Section-I 1
9 How is node failure handled in Hadoop? Discuss. 12 Section-I 1
10 Explain the role of the NameNode and the DataNode in HDFS. 12 Section-I 1
11 Explain the components of the Hadoop framework and their use cases. 12 Section-II 2
12 Explain Hadoop Cluster Architecture and its advantages. 12 Section-II 2
13 What is Hadoop clustering? Explain multi-node clustering with a neat sketch. 12 Section-II 2
14 Write short notes on: (a) Data Integrity (b) Compression 12 Section-II 2
15 Explain compression and its properties in Hadoop I/O. 12 Section-II 2
16 Explain serialization and the Writable class hierarchy in Hadoop I/O. 12 Section-II 2
17 What is MapReduce? Explain the key components of MapReduce. 12 Section-II 2
18 What is Hadoop Streaming? Differentiate between Hadoop Streaming and Hadoop Pipes. 12 Section-II 2
19 Explain data flow & interfaces in HDFS. 12 Section-II 2
20 How does the sort and shuffle process work in MapReduce? Explain. 12 Section-II 2
21 What is MRUnit, and why is it used in MapReduce application development? 12 Section-III 3
22 Describe the steps to set up MRUnit for unit testing in MapReduce. 12 Section-III 3
23 How does Hadoop handle failures in the classic MapReduce model? 12 Section-III 3
24 What happens during the shuffle and sort phase in MapReduce? 12 Section-III 3
25 Explain how MapReduce works: the anatomy of a MapReduce job run. 12 Section-III 3
26 What are the key stages in a MapReduce job, and how do they interact? 12 Section-III 3
27 What improvements does YARN introduce over classic MapReduce for failure handling? 12 Section-III 3
28 How is job scheduling managed in YARN? 12 Section-III 3
29 What are the main differences between classic MapReduce and YARN? 12 Section-III 3
30 What are the common failure scenarios in MapReduce, and how are they mitigated? 12 Section-III 3
31 What are the main differences between NoSQL and relational databases? 12 Section-IV 4
32 What are the different types of NoSQL databases? Classify and describe each type in detail. 12 Section-IV 4
33 Describe the detailed architecture of Hive and explain its components. 12 Section-IV 4
34 Explain the data models used in HBase architecture and describe their structure. 12 Section-IV 4
35 What types of data can Hive handle? Discuss the different data types it supports. 12 Section-IV 4
36 Design and explain the detailed architecture of Hive. 12 Section-IV 4
37 Explain User Defined Functions in Hive with examples. 12 Section-IV 4
38 Write a short note on the following: (i) Sqoop, (ii) Flume, (iii) JSON, (iv) HiveQL 12 Section-IV 4
39 What are Sqoop connectors? Provide an explanation of their purpose and functionality. 12 Section-IV 4
40 Describe the architecture of Flume and explain its main components. 12 Section-IV 4
41 Explain the architecture of Apache Spark, including the key components and installation steps. 12 Section-V 5
42 Explain in detail about RDD in Apache Spark. 12 Section-V 5
43 Briefly explain the anatomy of a Spark job run. 12 Section-V 5
44 Explain shared variables in Spark with two examples. What are the benefits of using Apache Spark? 12 Section-V 5
45 What is Pig in the context of Big Data? Describe its features and use cases. 12 Section-V 5
46 Provide an overview of Scala. What are its key features, and which frameworks are commonly used with Scala? 12 Section-V 5
47 What is Scala? Is it considered a language or a platform? Additionally, discuss whether Scala is a purely object-oriented programming language. 12 Section-V 5
48 Explain Scala. What are its key features and frameworks? 12 Section-V 5
49 What software and tools need to be installed to set up a Scala development environment? 12 Section-V 5
50 Explain pattern matching and functions in Scala with two examples. 12 Section-V 5
