Basic Concepts of BDA
Que. 1
A Define Big Data. What are the key skill sets and behavioural 6M
characteristics of a data scientist?
B Consider Facebook, a social networking application, and write your 6M
observations on the following:
a. Type of data
b. Characteristics of data
OR
Que. 2
A What are the three characteristics of Big Data, and what are the main 6M
considerations in processing Big Data?
B What is an analytic sandbox, and why is it important? 6M
Que. 3
A Explain the different phases of the data analytics lifecycle. 6M
B In which phase would the team expect to invest most of the project 6M
time? Why? Where would the team expect to spend the least time?
Justify your answer.
OR
Que. 4
A What are the benefits of doing a pilot program before a full-scale 6M
rollout of a new analytical methodology? Discuss this in the context of
the mini case study.
B Design and explain the different phases of the data analytics lifecycle for 6M
‘YouTube’.
Que. 5
A How is the value of k selected in the k-means clustering algorithm? Explain. 6M
B Suppose everyone who visits a retail website gets one promotional offer or 6M
no promotion at all. We want to see if making a promotional offer makes a
difference. What statistical method would you recommend for this analysis?
OR
Que. 6
A What is a type I error? What is a type II error? Is one always more 6M
serious than the other? Why?
B Describe how logistic regression can be used as a classifier. 6M
Que. 7
A Explain the basic building blocks of Hadoop with a neat sketch. 6M
B Comment on failures in classic Hadoop. 6M
OR
Que. 8
A Explain shuffle and sort. 6M
B How is network distance calculated in Hadoop? Explain with an 6M
example.
Que. 9
A What is Hive? List the features of Hive. 6M
B Explain the architecture of Sqoop with a neat sketch. 6M
OR
Que. 10
A Compare and contrast Hadoop and Pig. List strengths and weaknesses 6M
of each tool set.
B Explain the architecture of ZooKeeper with a neat sketch. 6M
*****