Question Bank (4-5-6)
Q.1) What are the challenges in clustering data streams? Explain a stream clustering algorithm in
detail.
Q.2) What do you mean by counting distinct elements in a stream? Illustrate with an example the
working of the Flajolet-Martin algorithm used to count the number of distinct elements.
Q.3) Explain the DGIM algorithm for counting ones in a stream with an example. (10 marks)
Q.4) How is a Bloom filter used for big data analytics? Explain with an example.
Q.7) Consider stock market stream data. Justify the data stream features and draw the model of
data stream management for the mentioned system. Give two examples of one-time queries and
continuous queries on the stock market stream.
Stream Features:
Stream data sets are continuous, massive, unbounded, and possibly infinite. The data changes
rapidly and requires a fast, real-time response. (2 marks)
Justification of how stock data is an example of stream processing with respect to the features
of stream data. (2 marks)
Examples (4 marks)
Continuous Query
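As a rough illustration of the two query types for a stock stream, the sketch below assumes a hypothetical stream of (symbol, price) ticks. The tick values and the names `ticks`, `one_time_max_price`, and `continuous_moving_average` are made up for this example and are not part of the question bank. A one-time query is evaluated once over the data seen so far; a continuous query produces a fresh answer each time a new tick arrives.

```python
from collections import deque

# Hypothetical stock ticks (symbol, price); values are illustrative only.
ticks = [("ABC", 100.0), ("XYZ", 50.0), ("ABC", 102.0), ("ABC", 101.0), ("XYZ", 55.0)]


def one_time_max_price(history, symbol):
    """One-time query: evaluated once against a snapshot of the stream so far."""
    prices = [price for sym, price in history if sym == symbol]
    return max(prices) if prices else None


def continuous_moving_average(stream, symbol, window=2):
    """Continuous query: yields an updated answer for every matching tick."""
    recent = deque(maxlen=window)  # keeps only the last `window` prices
    for sym, price in stream:
        if sym == symbol:
            recent.append(price)
            yield sum(recent) / len(recent)


# One-time query: "What is the highest price of ABC seen so far?"
snapshot_answer = one_time_max_price(ticks, "ABC")

# Continuous query: "Report the 2-tick moving average of ABC as ticks arrive."
running_answers = list(continuous_moving_average(ticks, "ABC"))
```

In a real data stream management system the continuous query would run indefinitely over an unbounded stream; the finite list here only stands in for that stream.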
Q.2) What is a community in a social network graph? Explain any one algorithm for finding
communities in a social graph (Girvan-Newman).
Q.3) For the graph given above, use the betweenness factor to find all communities.
Q.5) How would you get the features of a document in a content-based system? Explain document
similarity.
Q.6) What is the use of a recommender system? How is a classification algorithm used in a
recommendation system?
Q.7) Explain the design of a recommender system used to recommend movies to users.
(https://youtu.be/uqTxqvqvjC8 )
Q.9) What are recommendation systems? Clearly explain two applications of recommendation
systems.
Q.10) Explain the Clique Percolation Method (CPM) used in direct community detection in a social
graph with an example. (Refer to https://youtu.be/1EHCAq9ZFYY for more details.)
Q.11) Write the algorithm for the Clique Percolation Method. Apply the same to find the communities
User profile matrix: it will have a few users and a list of movies with ratings. (2 marks)
For your information: definitions of clique and community
Module 6 : Data Analytics with R
5 marks questions
1. Visualization is an excellent medium to analyze, comprehend and share information. Justify
this statement.
2. List and discuss basic features of R.
3. The following table shows the number of units of different products sold on different
days:
4. Which function is used to concatenate text values in R? Write a script to concatenate text
and numerical values in R.
Text 1: Ram has scored
Text 2: 89
Text 3: marks
Text 4: in Mathematics
5. Which function is used to construct a vector in R? Write a script to generate the following
list of numerical values with spaces:
3 5 6 9 11 34
6. List and explain operators used to form data subsets in R.
7. List the functions provided by R to combine different sets of data.
8. Suppose you have two datasets A and B. Dataset A has the following data: 1 2 4 5. Dataset B
has the following data: 6 7 8 9. Which function is used to combine the data from both
datasets into dataset C? Demonstrate the function with the input values and write the
output.
9. What are the advantages of using functions over scripts?
10 marks questions
1. List and discuss various types of data visualizations.
2. Discuss any five applications of data visualizations.
3. List and explain various functions that allow users to handle data in R workspace with
appropriate examples.
4. List and discuss various types of data structures in R.
5. Discuss the syntax of defining a function in R.