Chapter 1
Question 1:
Question 2:
Question 3:
Which processing framework is known for its support of interactive analysis in Hadoop?
a) MapReduce.
b) Apache Spark.
c) Apache Flink.
d) b and c.
Question 4:
d) Batch processing.
Question 5:
Which processing pattern is suitable for machine learning algorithms?
a) Batch processing.
b) Interactive SQL.
c) Stream processing.
d) Search.
Question 6:
Which characteristic of Big Data refers to its ability to predict future events?
a) Scale.
b) Distribution.
c) Diversity.
d) Timeliness.
Question 7:
Question 9:
Question 10:
Question 11:
Question 12:
___________ is a collection of data that is huge in volume, yet growing exponentially
with time.
a) Big Database
b) Big DBMS
c) Big Datafile
d) Big Data
Question 13:
Question 14:
Question 15:
Among the following options, choose the one that gives the correct reason why
big data analysis is difficult to optimize.
Question 16:
What is the primary use case for MapReduce?
A) Real-time processing
B) Interactive analysis
C) Batch processing
D) Stream processing
Question 17:
A) MapReduce
B) Apache Spark
C) Apache Flink
D) Batch processing
Question 18:
A) Batch processing
C) Real-time processing
D) Interactive analysis
Question 19:
Which processing model is not supported by MapReduce?
A) Real-time processing
B) Interactive analysis
C) Batch processing
D) Stream processing
Question 20:
A) Batch processing
B) Interactive analysis
C) Real-time processing
D) Stream processing
Question 21:
A) MapReduce
B) Apache Spark
C) Apache Flink
D) Batch processing
Question 23:
Which processing model provides native support for iterative processing?
A) MapReduce
B) Apache Spark
C) Apache Flink
D) Real-time processing
Question 24:
D) Indexing documents
Question 25:
C) MapReduce
D) Hive
Question 26:
Which of the following processing models is not suitable for interactive analysis?
A) MapReduce
B) Real-time processing
C) Apache Flink
D) Interactive analysis
Question 27:
Which processing pattern allows for low-latency responses to SQL queries on Hadoop?
A) Batch processing
B) Interactive SQL
C) Iterative processing
D) Stream processing
Question 28:
A) Batch processing
B) Stream processing
C) Iterative processing
D) Search
Question 29:
A) Interactive SQL
B) Iterative processing
C) Stream processing
D) Search
Question 30:
What type of processing pattern is associated with the use of Solr on a Hadoop cluster
for indexing and search?
A) Interactive SQL
B) Iterative processing
C) Stream processing
D) Search
Question 31:
What is the primary use case for the Iterative processing pattern?
A) Real-time processing
Question 32:
A) Interactive SQL
B) Iterative processing
C) Stream processing
D) Search
Question 33:
C) Real-time processing
Question 34:
In which processing pattern does Storm, Spark Streaming, or Samza play a role?
A) Interactive SQL
B) Iterative processing
C) Stream processing
D) Search
Question 35:
What is the benefit of using a distributed query engine in the Interactive SQL pattern?
A) High-latency responses
B) Low-latency responses
Question 36:
Question 37:
A) Structured data
B) Semi-structured data
C) Unstructured data
D) Relational data
Question 38:
What is a major advantage of using MapReduce for analyzing web server logs?
Question 39:
Why are web server log files well suited for analysis with Hadoop?
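
For readers who want a concrete picture of the log analysis that Questions 38 and 39 refer to, here is a minimal single-process sketch of the map, shuffle, and reduce steps, counting requests per HTTP status code. The sample log lines, the field position used for the status code, and all function names are illustrative assumptions, not taken from the chapter. A real Hadoop job would run these phases across many machines.

```python
from collections import defaultdict

# Hypothetical sample records in Common Log Format (made up for illustration).
LOG_LINES = [
    '10.0.0.1 - - [01/Jan/2024:00:00:01 +0000] "GET /index.html HTTP/1.1" 200 1043',
    '10.0.0.2 - - [01/Jan/2024:00:00:02 +0000] "GET /missing HTTP/1.1" 404 512',
    '10.0.0.1 - - [01/Jan/2024:00:00:03 +0000] "POST /login HTTP/1.1" 200 87',
]


def map_line(line):
    """Map step: emit one (status_code, 1) pair per log record."""
    fields = line.split()
    status = fields[-2]  # assumption: status code is the second-to-last field
    yield status, 1


def shuffle(pairs):
    """Shuffle step: group all emitted values by their key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_counts(key, values):
    """Reduce step: sum the counts collected for one key."""
    return key, sum(values)


def count_by_status(lines):
    """Run map, shuffle, and reduce in sequence over the given lines."""
    pairs = (pair for line in lines for pair in map_line(line))
    grouped = shuffle(pairs)
    return dict(reduce_counts(key, values) for key, values in grouped.items())


if __name__ == "__main__":
    print(count_by_status(LOG_LINES))
```

Each record is handled independently in the map step, which is why this kind of job can be split across many workers without coordination between them.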
Question 40:
A) MapReduce scales linearly with the data size and cluster size
B) SQL queries scale linearly with the data size but not cluster size
D) SQL queries scale linearly with the data size but not cluster size
Question 41:
In what direction are Hadoop systems like Hive evolving with respect to features?
Question 42:
Question 43:
Why does MapReduce suit applications where data is written once and read many times?
Question 45:
What distinguishes Hadoop from Grid Computing with respect to data flow management?
Question 46:
B) Expensive resources
C) Data locality
Question 48:
D) Shared-nothing architecture
Question 49:
What architecture is MapReduce based on, making it easier for programmers to handle failure?
A) Shared-everything
B) Shared-something
C) Shared-nothing
D) Shared-all
Question 50:
What distinguishes MPI programs from MapReduce in terms of checkpointing and recovery?
Question 51: