Question Bank (4-5-6)

Uploaded by

Prajwal Parab

Module 4 (Mining Data Streams)

Q.1) What are the challenges in clustering data streams? Explain a stream clustering algorithm in
detail.

Q.2) What is meant by counting distinct elements in a stream? Illustrate, with an example, the
working of the Flajolet-Martin algorithm used to count the number of distinct elements.

(one numerical based on FM algorithm)
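
As a rough illustration of the kind of numerical asked for in Q.2, the sketch below applies the Flajolet-Martin idea to a small stream. The stream, the hash function h(x) = (6x + 1) mod 5, and the convention that a hash value of 0 contributes zero trailing zeros are all assumptions chosen for this example, not part of the question bank.

```python
def trailing_zeros(value: int) -> int:
    """Count trailing zero bits; a value of 0 is treated as having none (assumed convention)."""
    if value == 0:
        return 0
    count = 0
    while value & 1 == 0:
        value >>= 1
        count += 1
    return count


def example_hash(x: int) -> int:
    """Hypothetical hash function used only for this worked example."""
    return (6 * x + 1) % 5


def fm_estimate(stream, hash_fn) -> int:
    """Estimate the number of distinct elements as 2**R,
    where R is the largest trailing-zero count seen over all hashed items."""
    max_r = 0
    for item in stream:
        max_r = max(max_r, trailing_zeros(hash_fn(item)))
    return 2 ** max_r


stream = [1, 3, 2, 1, 2, 3, 4, 3, 1, 2, 3, 1]
print(fm_estimate(stream, example_hash))  # 4
```

Here h(3) = 4 (binary 100) gives the largest trailing-zero count, R = 2, so the estimate is 2^2 = 4, which happens to equal the true number of distinct elements in this stream. In practice the estimates from many hash functions are combined, because a single hash function gives a coarse result.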

Q.3) Explain the DGIM algorithm for counting ones in a stream, with an example. (10 marks)

Q.4) How is a Bloom filter used for big data analytics? Explain with an example.
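
A minimal sketch of a Bloom filter may help when preparing an example for Q.4. The class name, the bit-array size, the number of hash functions, and the use of salted SHA-256 digests as hash functions are assumptions made for illustration only.

```python
import hashlib


class BloomFilter:
    """Toy Bloom filter: set membership with possible false positives, never false negatives."""

    def __init__(self, num_bits: int, num_hashes: int):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = [False] * num_bits

    def _positions(self, item: str):
        # Derive several bit positions by salting one hash with a seed (assumed scheme).
        for seed in range(self.num_hashes):
            digest = hashlib.sha256(f"{seed}:{item}".encode("utf-8")).hexdigest()
            yield int(digest, 16) % self.num_bits

    def add(self, item: str) -> None:
        for pos in self._positions(item):
            self.bits[pos] = True

    def might_contain(self, item: str) -> bool:
        # True means "possibly seen"; False means "definitely not seen".
        return all(self.bits[pos] for pos in self._positions(item))


seen_urls = BloomFilter(num_bits=1024, num_hashes=3)
seen_urls.add("example.com/a")
seen_urls.add("example.com/b")
print(seen_urls.might_contain("example.com/a"))  # True
```

The filter answers "definitely not seen" or "possibly seen" using a fixed amount of memory, which is why it suits filtering very large streams such as URLs or email addresses.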

Q.5) What is a stream management system? Explain with a block diagram.

Q.6) With respect to data stream querying, give an example of each of the following:

a) One-time queries
b) Continuous queries
c) Predefined queries
d) Ad hoc queries
e) Standing queries

Q.7) Consider stock market stream data. Justify the data stream features and draw the model of
data stream management for the mentioned system. Give two examples each of a one-time query and
a continuous query on the stock market stream.

Stream Features:

Stream data sets are Continuous, Massive, Unbounded and Possibly infinite. It is fast
changing and requires fast, real-time response. (2 marks)

Justification of how stock data is an example of stream processing with respect to the features
of stream data. (2 marks)

Examples (4 marks)

One-time queries:

Opening value of the stock
Closing value of the stock

Continuous queries:

Max value of the stock in a day/week/month
Min value of the stock in a day/week/month
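
The difference between the two query types above can be sketched in code. The function names and the price values below are hypothetical; the one-time query is answered once over the ticks seen so far, while the continuous query produces an updated answer every time a new tick arrives.

```python
def one_time_query(ticks):
    """Evaluated once: return the (opening, closing) price of the ticks collected so far."""
    return ticks[0], ticks[-1]


def continuous_max_min(tick_stream):
    """Standing query: yield the running (max, min) price after each new tick."""
    running_max = None
    running_min = None
    for price in tick_stream:
        running_max = price if running_max is None else max(running_max, price)
        running_min = price if running_min is None else min(running_min, price)
        yield running_max, running_min


prices = [101.5, 103.0, 99.75, 102.25]  # made-up ticks for one trading day

print(one_time_query(prices))  # (101.5, 102.25)
for current_max, current_min in continuous_max_min(iter(prices)):
    print(current_max, current_min)
```

The continuous query keeps only two numbers of state, so it works even though the stream is unbounded and cannot be stored in full.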

Module 5 (Real Time Big Data Models)


Q.1) Describe the Girvan-Newman algorithm. For the following graph, show how the Girvan-Newman
algorithm finds the different communities.

(Refer to this link for more details: https://youtu.be/LtQoPEKKRYM)

Q.2) What is a community in a social network graph? Explain any one algorithm for finding
communities in a social graph (e.g. Girvan-Newman).

Q.3) For the graph given above, use the betweenness factor to find all communities.

(Refer to this link for more details: https://youtu.be/X8VFpttCpP4)

Q.4) Explain a collaborative filtering based recommendation system. How is it different from
content-based recommendation?

Q.5) How would you get the features of a document in a content-based system? Explain document
similarity.

Q.6) What is the use of a recommender system? How is a classification algorithm used in a
recommendation system?

Q.7) Explain the design of a recommender system used to recommend movies to users.

(https://youtu.be/uqTxqvqvjC8 )

Q.8) How is recommendation done based on the properties of a product? Explain with a suitable example.

Q.9) What are recommendation systems? Clearly explain two applications of recommendation
systems.

Q.10) Explain the Clique Percolation Method (CPM) used for direct community detection in a social
graph, with an example. (Refer to this link for more details: https://youtu.be/1EHCAq9ZFYY)

Q.11) Write the algorithm for the Clique Percolation Method. Apply it to find the communities
in the following graph. (Show the stepwise execution of the algorithm.)

Q.12) Compare content-based recommendation systems with collaborative recommendation. Give an
example of a utility matrix for a popular movie recommendation system, covering the user profile
and the item profile, and mention the methods by which you can find similar users.
(https://youtu.be/OtrEk___TSY)

Comparison with 4 points: 4 marks

Movie Recommendation System

User profile matrix: 2 marks (it will have a few users and a list of movies with ratings)
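
To make the utility matrix and the "similar users" part of Q.12 concrete, here is a small sketch. The users, movies, and ratings are invented for illustration, and cosine similarity is only one of the possible methods (Jaccard similarity is another); missing ratings are treated as 0, which is an assumption of this sketch.

```python
import math

# Hypothetical utility matrix: rows are users, columns are movies, blanks are unrated.
ratings = {
    "user_a": {"Movie 1": 5, "Movie 2": 4, "Movie 3": 1},
    "user_b": {"Movie 1": 4, "Movie 2": 5},
    "user_c": {"Movie 2": 1, "Movie 3": 5},
}


def cosine_similarity(u: dict, v: dict) -> float:
    """Cosine of the angle between two rating vectors, with unrated movies counted as 0."""
    common = set(u) & set(v)
    dot = sum(u[movie] * v[movie] for movie in common)
    norm_u = math.sqrt(sum(r * r for r in u.values()))
    norm_v = math.sqrt(sum(r * r for r in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def most_similar(user: str) -> str:
    """Return the other user whose rating vector is closest to the given user's."""
    others = [name for name in ratings if name != user]
    return max(others, key=lambda name: cosine_similarity(ratings[user], ratings[name]))


print(most_similar("user_a"))  # user_b
```

In a collaborative system, the movies that the most similar user rated highly but the target user has not yet seen become the candidate recommendations.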
(For your information: definitions of clique and community.)

Module 6 (Data Analytics with R)

5-mark questions
1. Visualization is an excellent medium to analyze, comprehend and share information. Justify
this statement.
2. List and discuss basic features of R.
3. The following table shows the number of units of different products sold on different
days:

Product         Monday  Tuesday  Wednesday  Thursday  Friday
Bread               12        3          5        11       9
Milk                21       27         18        20      15
Cola Cans           10        1         33         6      12
Chocolate bars       6        7          4        13      12
Detergent            5        8         12        20      23

Create five sample numeric vectors from this data.

4. Which function is used to concatenate text values in R? Write a script to concatenate the
following text and numerical values in R:
Text 1: Ram has scored
Text 2: 89
Text 3: marks
Text 4: in Mathematics
5. Which function is used to construct a vector in R? Write a script to generate the following
list of numerical values with spaces:
3 5 6 9 11 34
6. List and explain operators used to form data subsets in R.
7. List the functions provided by R to combine different sets of data.
8. Suppose you have two datasets A and B. Dataset A has the following data: 1 2 4 5. Dataset B
has the following data: 6 7 8 9. Which function is used to combine the data from both
datasets into dataset C? Demonstrate the function with the input values and write the
output.
9. What are the advantages of using functions over scripts?

10-mark questions
1. List and discuss various types of data visualizations.
2. Discuss any five applications of data visualizations.
3. List and explain various functions that allow users to handle data in R workspace with
appropriate examples.
4. List and discuss various types of data structures in R.
5. Discuss the syntax of defining a function in R.
