TOPIC WISE DSA QUESTIONS
MODULE 1
Data Visualization
1. What is data visualization? Explain the bar chart and the line chart. (8)
2. Explain data visualization and recognize its use. Sketch a Python code segment to
visualize a line chart and a scatterplot with an example. (6)
3. Using matplotlib, explain the simple line chart and the bar chart. (8)
4. Write a short note on data visualization. (6)
5. Describe the process of creating a bar chart using matplotlib. What information is
typically conveyed by a bar chart? (4)
6. Explain the concept of correlation and its significance in data analysis. Discuss
Simpson’s Paradox and other correlational caveats with examples. (8)
7. Explain the matplotlib library in Python with an example. (6)
8. Draw a scatter plot to illustrate the relationship between the number of friends and
the number of minutes spent every day. (4)
9. Develop a Python program to plot a bar chart for the given data. Draw the bar chart
and label the x and y axes. (8)
10. Develop a Python program to plot a line chart for the given data. Explain the various
attributes of the line chart. Draw the line chart. (6)
Find the standard deviation of the salary of employees in each department of a company
and identify the department with the highest standard deviation. (7)
Find the mean and median salary of employees in each department of the
company. (7)
Simpson’s Paradox
Correlation vs Causation
1. Explain the difference between correlation and causation. Why is it incorrect to infer
causation from correlation alone? Describe an example where correlation does not
imply causation. (7)
2. Describe the statement “Correlation is not Causation” with an example in detail. (6)
Data Science
1. Define Data Science. Explain the Venn diagram of Data Science. (6)
2. What is Data Science? Write a short note on data visualization. (6)
3. What is Data Science? Explain the role of a data scientist with an example. (8)
4. Who is a data scientist? Draw the data science life cycle in detail. (8)
Random Variables
Gradient Descent:
Web Scraping:
1. What is Simple Linear Regression? How is error calculated in the Linear Regression
model? How would you detect overfitting in a linear model?
2. Explain the mathematical intuition of Multiple Linear Regression. Explain the steps.
3. Explain how gradient descent is used to fit parameterized models.
Dimensionality Reduction:
Miscellaneous:
1. Predict the genre of the ‘Barbie’ movie with IMDB=7.4 and duration 114 using KNN,
considering k=3.
1. Explain the simple linear regression model in detail and write a Python program to
illustrate gradient descent for a simple linear regression model.
2. Write a note on simple linear regression using gradient descent.
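As a starting point for question 1 above, the following is a minimal sketch of batch gradient descent for a simple linear regression model y = slope * x + intercept, minimizing mean squared error. The function name `fit_line`, the learning rate, the number of epochs, and the toy data are illustrative assumptions, not values prescribed by the question.

```python
def fit_line(xs, ys, learning_rate=0.01, epochs=5000):
    """Fit y = slope * x + intercept by batch gradient descent on mean squared error.

    learning_rate and epochs are illustrative defaults (assumptions).
    """
    slope, intercept = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        # Prediction error for every point under the current parameters.
        errors = [(slope * x + intercept) - y for x, y in zip(xs, ys)]
        # Partial derivatives of the mean squared error.
        grad_slope = 2 / n * sum(e * x for e, x in zip(errors, xs))
        grad_intercept = 2 / n * sum(errors)
        # Step against the gradient.
        slope -= learning_rate * grad_slope
        intercept -= learning_rate * grad_intercept
    return slope, intercept


# Toy data generated from y = 2x + 1 (illustrative only).
xs = [0, 1, 2, 3, 4]
ys = [2 * x + 1 for x in xs]
slope, intercept = fit_line(xs, ys)
```

On this noise-free toy data the fitted slope and intercept should approach 2 and 1. A learning rate that is too large makes the updates diverge, which is worth mentioning when explaining the algorithm.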
1. What is feature extraction, and why is it important in machine learning? Explain the
difference between feature extraction and feature selection.
2. Write a short note on feature extraction and selection.
3. Illustrate the process of feature extraction and selection in machine learning. Why is
this step important, and what techniques are commonly used?
Support Vector Machines (SVM)
Iris Dataset
1. Describe the Iris dataset and its significance in machine learning. What are the
features and target variables in the Iris dataset? How is the Iris dataset typically used
to demonstrate classification algorithms?
2. What is the Iris Dataset? Build a model that can predict the class from the first four
measurements.
3. Write a Python program to build a K-nearest neighbor model that can predict the class
from the Iris dataset.
1. What is a model in the context of machine learning? Explain the difference between
supervised and unsupervised learning models.
2. Discuss the need for fitting the model in Multiple Regression.
Digression
Regularization
Decision Trees
1. Illustrate the working of a decision tree and explain the importance of entropy in
decision trees.
2. Can decision trees handle continuous data? If so, how is entropy used to handle
continuous data in decision trees? What are the limitations of decision trees?
3. Discuss decision trees in detail and provide a Python program to create a decision
tree.
4. Describe the decision tree process with Python and demonstrate the ID3 algorithm.
5. Consider the following dataset. Write a program to demonstrate the working of the
decision tree based ID3 algorithm.
6. Explain the role of entropy and entropy partition in creating a decision tree with
explanation and Python code.
7. Describe how entropy is used to create a decision tree and provide an example to
illustrate the process.
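For the entropy questions above, a minimal sketch of the calculations involved might look like the following. It computes Shannon entropy in bits and the weighted entropy of a partition, which is the quantity ID3 tries to reduce when choosing a split. The function names and the toy labels are illustrative assumptions.

```python
import math
from collections import Counter


def entropy(class_probabilities):
    """Shannon entropy (in bits) of an iterable of class probabilities."""
    return sum(-p * math.log2(p) for p in class_probabilities if p > 0)


def data_entropy(labels):
    """Entropy of a list of class labels."""
    total = len(labels)
    return entropy(count / total for count in Counter(labels).values())


def partition_entropy(subsets):
    """Weighted average entropy after splitting the labels into subsets."""
    total = sum(len(subset) for subset in subsets)
    return sum(data_entropy(subset) * len(subset) / total for subset in subsets)


# Toy labels (illustrative only): an even two-class mix is maximally uncertain.
mixed = ["yes", "no", "yes", "no"]
before_split = data_entropy(mixed)
# A split that separates the classes perfectly leaves no uncertainty.
after_split = partition_entropy([["yes", "yes"], ["no", "no"]])
```

The difference between the entropy before a split and the partition entropy after it is the information gain; a decision tree built with ID3 picks the attribute with the largest gain at each node.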
1. Define a feedforward neural network and explain the backpropagation method for
training neural networks.
2. Describe the basic architecture of a feedforward neural network and explain the
concept of a loss function.
3. Discuss the role of the backpropagation algorithm in training neural networks.
4. Explain layer abstraction in deep learning and provide a Python program to compute
loss and optimization in deep learning.
5. Illustrate K-Nearest Neighbors with code.
6. Define neural networks and explain how to implement the AND function using the
perceptron algorithm.
7. Illustrate the backpropagation algorithm, its importance in training neural networks,
and how gradients are computed and weights are updated.
Deep Learning vs. Machine Learning
1. Define an optimization algorithm and explain its role in training deep learning
models. Describe gradient descent and its variants.
2. Define entropy and write code for entropy calculation.
3. Write a function to compute gradients for backpropagation.
4. Write code to train a network that computes XOR using a new framework.
5. Write code to generate any number of clusters by performing the appropriate number
of unmerges.
6. Explain the process of training a neural network on the MNIST dataset, including
architecture, input preprocessing, evaluation metrics, and a summary of the network's
performance.
Miscellaneous
Recommender Systems
1. Illustrate the PageRank algorithm and its application in directed graphs. How does it
work and what is its significance in network analysis?
2. Develop a Python function for the PageRank algorithm for a directed graph.
3. Compare PageRank with the Hypertext Induced Topic Selection (HITS) algorithm in terms of
their underlying principles and use cases.
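For question 2 above, one possible sketch of PageRank on a directed graph uses simple power iteration. It assumes the graph is given as a dictionary mapping each node to the list of nodes it links to; the function name, the damping factor of 0.85, and the fixed iteration count are illustrative choices rather than part of the question.

```python
def pagerank(graph, damping=0.85, iterations=100):
    """Approximate PageRank by power iteration.

    graph: dict mapping each node to a list of nodes it links to (assumed format).
    """
    # Include nodes that only appear as link targets.
    nodes = set(graph) | {t for targets in graph.values() for t in targets}
    n = len(nodes)
    ranks = {node: 1 / n for node in nodes}
    for _ in range(iterations):
        # Rank held by nodes with no outgoing links is spread evenly over all nodes.
        dangling = sum(ranks[node] for node in nodes if not graph.get(node))
        new_ranks = {
            node: (1 - damping) / n + damping * dangling / n for node in nodes
        }
        # Every other node shares its rank equally among the nodes it links to.
        for source, targets in graph.items():
            if targets:
                share = damping * ranks[source] / len(targets)
                for target in targets:
                    new_ranks[target] += share
        ranks = new_ranks
    return ranks


# Toy graph (illustrative only): both "a" and "b" link to "c".
ranks = pagerank({"a": ["c"], "b": ["c"], "c": []})
```

In this toy graph "c" receives links from both other nodes and ends up with the highest rank, while the ranks of all nodes sum to 1.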
Additional Topics