DA Practice Questions - Unit - 1
Q16. Consider the following sample data. Draw the MapReduce process to find the number of customers
from each city.
Q17. Consider the following sample data. Draw the MapReduce process to find the number of employees
from each category of marital status.
Q18. Discuss Hadoop ecosystem by outlining each component.
Q19. Draw a diagram illustrating a multi-threaded parallel distributed system.
Q20. Discuss the similarities and differences between ELT and ETL.
Q21. Complete the following diagram with the incorporation of value.
Q22. You are planning the marketing strategy for a new product in your business. Identify and list some
limitations of structured data related to this work.
Q23. In what ways does analyzing Big Data help organizations prevent fraud?
Q24. The design considerations for distributed systems include: No global clock, Geographical
distribution, No shared memory, Independence and heterogeneity, Fail-over mechanism, and Security
concerns. Explain each of these terms.
Q25. The reasons why a system should be built as a distributed system, and not just a parallel one,
include Scalability, Reliability, Data sharing, Resource sharing, Heterogeneity and modularity,
Geographic construction, and Economics. Explain each of these terms in detail.
*** The End ***