DAR Question Bank 1
Unit-1: Introduction to Big Data Analytics and Data Analytics Life Cycle
1. List and explain the three characteristics of Big Data.
2. What are the main considerations in processing Big Data?
3. What is an analytic sandbox and why is it important? Explain in detail.
4. Differentiate between Business Intelligence and Data Science with a suitable graph.
5. Describe the challenges of the current analytical architecture for data scientists.
6. What are the key skill sets and behavioral characteristics of a data scientist? Discuss.
7. Define Big Data as per the McKinsey Report. Highlight the sources of the Big Data deluge and
explain any two in detail.
8. Explain the types of data structures used to represent Big Data. Give examples of each.
9. Mention the types of data repositories from an analyst’s perspective.
10. With a neat diagram, describe the architecture of the emerging Big Data ecosystem.
11. What are the key roles of a new Data Ecosystem? Explain in detail.
12. Outline the three sets of recurring activities to be performed by Data Scientists.
13. Briefly explain the main phases of the Data Analytics Lifecycle.
14. Explain the current data analytical architecture with a suitable diagram.
15. Outline the limitations of traditional data architecture.
16. Describe the sources of Big Data and data evolution with the necessary graph.
17. Discuss the emerging Big Data ecosystem with a suitable diagram.
18. Describe the key roles of the new Big Data ecosystem in detail.
19. Mention the three sets of activities that data scientists perform in a Big Data ecosystem.
20. Mention the features of the Hadoop framework in Big Data analytics.
21. Summarize the key stakeholders and their roles in implementing a successful analytical project.
22. Mention the steps involved in Phase 1 of the Data Analytics Lifecycle. Explain any three steps in detail.
23. How is an analytic sandbox developed? Discuss in detail.
24. Summarize the processes of ETLT and API.
25. Define data conditioning. Mention the additional questions and points to be considered during the data
conditioning step.
26. Mention the guidelines and considerations recommended during the data visualization step.
27. Outline the commonly used tools for the Data Preparation Phase.
28. List the activities to be carried out in the model planning phase.
29. Discuss the process involved in data exploration and essential variable selection during the Data Preparation
Phase.
30. List the tools available to assist the model planning phase.
Unit-2: Data Analytics Life Cycle and Review of Basic Data Analytic Methods using R
1. Discuss the questions to be considered while creating robust models to meet the objectives of the
project.
2. Outline the commonly used commercial and open-source tools for the Model Building Phase.
3. Explain the key outputs expected by each of the main stakeholders of an analytics project.
4. In which phase would the team expect to invest most of the project time? Why? Where would the
team expect to spend the least time?
5. What are the benefits of doing a pilot program before a full-scale rollout of a new analytical
methodology?
6. What kinds of tools would be used in the following phases, and for which kinds of use scenarios?
a. Phase 2: Data preparation
b. Phase 4: Model building
7. How do you import and export CSV and other text files? Illustrate with examples.
8. How do you establish a connection between a database and an R program and submit an SQL query
through the R program? Explain with an example.
9.