BDA Assignment - 231012 - 151952

The document contains questions on big data and the Hadoop Distributed File System (HDFS) across three assignments. It covers the characteristics of big data, HDFS architecture, MapReduce, YARN, and big data technologies such as Apache Spark and Kafka. It also includes multiple-choice questions to test knowledge of big data concepts.

Uploaded by

Arun Chaudhari
Copyright: © All Rights Reserved
Big Data Analytics

Question Bank
Assignment I
1. What is Big Data? Explain the characteristics of Big Data.
2. What is Big Data Analytics? Explain the 5 ‘V’s of Big Data.
3. Explain any 4 big data distribution packages.
4. List the top big data technologies.
5. Explain applications of big data.
Assignment II
1. With a neat sketch, explain HDFS.
2. Describe the working of MapReduce with a relevant example.
3. Discuss the architecture and application workflow of Hadoop YARN in detail.
4. What is ZooKeeper? Explain the architecture of ZooKeeper in detail.
5. Illustrate the architecture of the big data stack.
6. Illustrate the Paxos algorithm and its working in detail.
7. Explain the HBase architecture.
Assignment III
1. Explain the different types of big data pipeline architecture with suitable diagrams.
2. Explain the Spark Streaming architecture with a neat diagram.
3. What are the major components of Kafka? Explain with a neat diagram.
MCQ
1. ___________ is a collection of data that is huge in volume, yet growing exponentially with time.
A. Big Database
B. Big DBMS
C. Big Datafile
D. Big Data

2. Identify the incorrect big data technology.

A. Apache Pytorch
B. Apache Kafka
C. Apache Hadoop
D. Apache Spark

3. Choose the primary characteristics of big data from the following.


A. Value
B. Variety
C. Volume
D. All of the above

4. Identify the different features of Big Data Analytics.


A. Open-source
B. Data recovery
C. Scalability
D. All of the above

5. Identify the node which acts as a checkpoint node in HDFS.


A. Secondary Name Node
B. Secondary data node
C. Name node
D. Data node

6. What is the minimum amount of data that a disk can read or write in HDFS?
A. Byte size
B. Block size
C. Heap
D. None of the above

7. Data whose size is measured in ___________ bytes is called Big Data.


A. Tera
B. Giga
C. Peta
D. Meta

8. How many V's of Big Data are there?


A. 2
B. 3
C. 4
D. 5

9. The transaction data of a bank is:

A. Structured data
B. Unstructured data
C. Both A and B
D. None of the above

10. In how many forms can Big Data be found?


A. 2
B. 3
C. 4
D. 5

11. Which of the following are benefits of Big Data processing?


A. Businesses can utilize outside intelligence while making decisions
B. Improved customer service
C. Better operational efficiency
D. All of the above

12. What are the main components of Big Data?


A. MapReduce
B. HDFS
C. YARN
D. All of the above

13. Which step is executed by the data scientist after obtaining the data?
A. Data Replication
B. Data Integration
C. Data Cleansing
D. All of these
E. None of these
