Basic Concepts of BDA

This document is a semester-end exam for a Big Data Analytics course. It contains 10 questions assessing key concepts in big data and related technologies. Students are instructed to write only their enrolment number on the question paper, use only a non-programmable scientific calculator, leave margins, and start each new question on a new page. The questions cover defining big data; the types and characteristics of data from Facebook; the characteristics of big data and analytic sandboxes; the phases of the data analytics lifecycle; the benefits of pilot programs; k-means clustering; statistical methods; type I and type II errors; logistic regression; Hadoop architecture; failures in Hadoop; shuffle and sort; network distance in Hadoop; features of Hive; Sqoop architecture; a comparison of Hadoop and Pig; and ZooKeeper architecture.

Uploaded by

Sushanth Reddy

Enrolment No.

Semester End Examinations May-2022


MIT School of Engineering
VII Semester B. Tech (FINAL YEAR) Computer Science & Engineering
18BTIS702: Big Data Analytics
Date : Max. Marks: 60
Time :

Instructions for the Students:


1. Assume suitable data if necessary.
2. Use of a non-programmable scientific calculator is allowed.
3. Do not write anything other than the Enrolment Number on the Question Paper.
4. Figures to the right indicate the marks allotted to the questions.
5. Leave enough margin on all sides and start each new question on a new page.

Que. 1

A Define Big Data. What are the key skill sets and behavioural 6M
characteristics of a data scientist?
B Consider Facebook, a social network application, and write your 6M
observations on the following:
a. Type of data
b. Characteristics of data

OR
Que. 2
A What are the three characteristics of Big Data, and what are the main 6M
considerations in processing Big Data?
B What is an analytic sandbox, and why is it important? 6M
Que. 3
A Explain the different phases of the data analytics lifecycle. 6M
B In which phase would the team expect to invest most of the project 6M
time? Why? Where would the team expect to spend the least time?
Justify your answer.

OR
Que. 4
A What are the benefits of doing a pilot program before a full-scale 6M
rollout of a new analytical methodology? Discuss this in the context of
the mini case study.

B Design and explain the different phases of the data analytics lifecycle for 6M
‘YouTube’.

Que. 5
A How is the k value selected in the k-means clustering algorithm? Explain. 6M
B Suppose everyone who visits a retail website gets one promotional offer or 6M
no promotion at all. We want to see if making a promotional offer makes a
difference. What statistical method would you recommend for this analysis?

OR
Que. 6
A What is a type I error? What is a type II error? Is one always more 6M
serious than the other? Why?
B Describe how logistic regression can be used as a classifier. 6M

Que. 7
A Explain the basic building blocks of Hadoop with a neat sketch. 6M
B Comment on failures in classic Hadoop. 6M
OR
Que. 8
A Explain shuffle and sort. 6M
B How is network distance calculated in Hadoop? Explain with an 6M
example.

Que. 9
A What is Hive? List the features of Hive. 6M
B Explain the architecture of Sqoop with a neat sketch. 6M
OR
Que. 10
A Compare and contrast Hadoop and Pig. List strengths and weaknesses 6M
of each tool set.
B Explain the architecture of ZooKeeper with a neat sketch. 6M

*****
