Question Bank For PUT

Uploaded by

beyou782
Copyright
© All Rights Reserved

Question bank for PUT

2 marks questions

1. List the characteristics of big data. (2 marks, CO: 1, B.T.: K1)

2. Discuss one difference between MapReduce and YARN. (2 marks, CO: 2, B.T.: K2)

3. Describe the concept of data partitioning in Hadoop. (2 marks, CO: 3, B.T.: K1)

4. Define Hadoop and discuss its significance in handling big data. (2 marks, CO: 1, B.T.: K1)

5. Write down any four industry examples of big data. (2 marks, CO: 1, B.T.: K2)

6. What is HDFS? (2 marks, CO: 3, B.T.: K1)

7. Discuss the different types of data that can be handled with Hive. (2 marks, CO: 5, B.T.: K2)

8. What are the key components of the Hadoop ecosystem? Briefly explain each.
(2 marks, CO: 2, B.T.: K3)

9. Describe the role of NameNode and DataNode in HDFS. (2 marks, CO: 3, B.T.: K1)

10. Discuss two main uses of Pig. (2 marks, CO: 5, B.T.: K2)

11. Differentiate between structured, semi-structured, and unstructured data.
(2 marks, CO: 1, B.T.: K2)

12. Compare and contrast NoSQL with Relational Databases. (2 marks, CO: 4, B.T.: K2)

13. Discuss the advantages and disadvantages of using Hadoop for big data processing.
(2 marks, CO: 4, B.T.: K2)

14. What is MapReduce? (2 marks, CO: 2, B.T.: K1)

15. Explain the concept of shuffling and sorting in the MapReduce framework.
(2 marks, CO: 2, B.T.: K1)

16. Compare and contrast Hadoop 1.x with Hadoop 2.x. (2 marks, CO: 2, B.T.: K2)

17. Name two types of nodes in Hadoop. (2 marks, CO: 3, B.T.: K1)

18. Discuss the concept of data locality in Hadoop and its importance in distributed
computing. (2 marks, CO: 3, B.T.: K1)

19. Describe the role of ZooKeeper in the Hadoop ecosystem and its importance for
distributed coordination. (2 marks, CO: 4, B.T.: K2)

20. Explain the concept of data replication in HDFS and its significance for fault tolerance.
(2 marks, CO: 3, B.T.: K1)

10 marks questions

1. Analyze and explain the Hadoop ecosystem in detail.

2. Discuss the advantages and disadvantages of using Apache Kafka as a real-time data
streaming platform in a big data ecosystem.

3. Examine the process of reading and writing data in HDFS by the client.

4. Analyze the role of Apache Spark in processing big data, highlighting its components
and workflow.

5. Draw and explain the detailed architecture of Hive.

6. With examples, illustrate the use cases and benefits of implementing Apache HBase as
a NoSQL database in a big data environment.

7. With the help of a suitable example, explain how CRUD operations are performed in
MongoDB.

8. Examine the architecture of Apache Cassandra and explain how it ensures high
availability and scalability in distributed database systems.

9. Illustrate the architecture of MapReduce.

10. Analyze the architecture of Apache Storm in processing streaming data and provide a
comparison with Apache Flink.

11. Analyze and discuss in detail the different forms of big data.

12. Discuss the significance of data governance in big data environments and outline its key
components.

13. Elaborate on the various components of big data architecture.

14. Explore the role of data lakes in modern big data architectures and compare them with
traditional data warehouses.

15. Explain the detailed architecture of MapReduce.

16. Analyze the impact of data replication strategies on fault tolerance and data availability
in distributed systems like Hadoop.

17. Differentiate between scale-up and scale-out. Explain with an example how Hadoop uses
the scale-out feature to improve performance.

18. Compare and contrast the CAP theorem and the BASE principles in the context of
distributed database systems.

19. Demonstrate the design and concepts of HDFS in detail.

20. Write about the benefits and challenges of HDFS.

21. Classify in detail the different types of NoSQL databases.

22. Summarise the role of MongoDB using an example.

23. Explain the various execution models of Pig.

24. Design and explain the detailed architecture of Hive.

25. Explain the concept of data sharding and its importance in achieving scalability and
performance in NoSQL databases like Apache Cassandra.
