DL QB With Ans
UNIT I
2. What are the main differences between AI, Machine Learning, and Deep Learning?
AI stands for Artificial Intelligence. It is a technique which enables machines to mimic
human behavior.
Machine Learning is a subset of AI which uses statistical methods to enable machines
to improve with experiences.
Deep Learning is a subset of Machine Learning that makes the computation of multi-layer
neural networks feasible. It takes advantage of neural networks to simulate human-like
decision making.
3. Differentiate supervised and unsupervised deep learning procedures.
Supervised learning is a system in which both input and desired output data are
provided. Input and output data are labeled to provide a learning basis for future data
processing.
Unsupervised learning does not need explicitly labeled data, and its operations can be
carried out without labels. The common unsupervised learning method is cluster analysis.
It is used for exploratory data analysis to find hidden patterns or groupings in data.
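As a small illustration of cluster analysis on unlabeled data, the sketch below groups synthetic 2-D points with k-means; scikit-learn is assumed to be available, and the data and the choice of two clusters are made up for illustration.

```python
# Minimal k-means clustering sketch (assumes scikit-learn is installed).
import numpy as np
from sklearn.cluster import KMeans

# Hypothetical unlabeled data: two loose groups of 2-D points.
rng = np.random.default_rng(0)
points = np.vstack([rng.normal(0, 0.5, (50, 2)),
                    rng.normal(3, 0.5, (50, 2))])

kmeans = KMeans(n_clusters=2, n_init=10, random_state=0).fit(points)
print(kmeans.labels_[:10])        # cluster assignment for the first 10 points
print(kmeans.cluster_centers_)    # discovered group centres (no labels were given)
```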
We usually identify the elements of a matrix by using its name in italics but not in bold,
and the subscripts are listed with separating commas.
Tensors: In some cases, we’ll need an array with more than two axes. In the general
case, an array of numbers arranged on a regular grid with a varying number of axes is
called a tensor. We denote a tensor named “A” using a special bold typeface: A.
Random variables may be discrete or continuous. A discrete random variable is one that
has a finite or countably infinite number of states. Note that these states are not
necessarily the integers; they can also just be named states that are not considered to
have any numerical value. A continuous random variable is associated with a real value.
• ∀x ∈ x, 0 ≤ P(x) ≤ 1. An impossible event has probability 0 and no state can be less
probable than that. Likewise, an event that is guaranteed to happen has probability 1,
and no state can have a greater chance of occurring.
• ∑_{x∈x} P(x) = 1. We refer to this property as being normalized. Without this property, we
could obtain probabilities greater than one by computing the probability of one of many
events occurring.
We can control whether a model is more likely to overfit or underfit by altering its
capacity
Most often, the existing methods of finding the parameters of large populations are
unrealistic. For example, when finding the average age of kids attending kindergarten, it will
be impossible to collect the exact age of every kindergarten kid in the world. Instead, a
statistician can use the point estimator to make an estimate of the population parameter.
Consistency tells us how close the point estimator stays to the true value of the parameter as
the sample size increases. A point estimator requires a large sample size for it to be more
consistent and accurate.
You can also check if a point estimator is consistent by looking at its corresponding
expected value and variance. For the point estimator to be consistent, the expected value
should move toward the true value of the parameter.
3. Most efficient or unbiased
The most efficient point estimator is the one with the smallest variance among all the unbiased
and consistent estimators. The variance measures the level of dispersion of the estimate,
and the estimator with the smallest variance varies the least from one sample to another.
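A small simulation sketch of efficiency, assuming normally distributed data: both the sample mean and a single observation are unbiased estimators of the population mean, but the sample mean has a much smaller variance and is therefore the more efficient estimator. The numbers below are illustrative.

```python
# Sketch: comparing two unbiased estimators of a population mean.
import numpy as np

rng = np.random.default_rng(42)
true_mean, n_samples, n_trials = 5.0, 100, 10_000

estimates_mean = np.empty(n_trials)
estimates_first = np.empty(n_trials)
for t in range(n_trials):
    sample = rng.normal(true_mean, 2.0, size=n_samples)
    estimates_mean[t] = sample.mean()   # sample-mean estimator
    estimates_first[t] = sample[0]      # single-observation estimator

print("variance of sample mean:      ", estimates_mean.var())
print("variance of first observation:", estimates_first.var())
```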
Slow Convergence: SGD may require more iterations to converge to the minimum
since it updates the parameters for each training example one at a time.
Sensitivity to Learning Rate: The choice of learning rate can be critical in SGD since
using a high learning rate can cause the algorithm to overshoot the minimum, while a
low learning rate can make the algorithm converge slowly.
Less Accurate: Due to the noisy updates, SGD may not converge to the exact global
minimum and can result in a suboptimal solution. This can be mitigated by using
techniques such as learning rate scheduling and momentum-based updates
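A minimal sketch of per-example SGD with momentum on a toy linear-regression problem; the learning rate and momentum values are illustrative, not tuned.

```python
# Sketch of per-example SGD with momentum on synthetic linear-regression data.
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
true_w = np.array([1.5, -2.0, 0.5])
y = X @ true_w + 0.1 * rng.normal(size=200)

w = np.zeros(3)
velocity = np.zeros(3)
lr, momentum = 0.01, 0.9

for epoch in range(20):
    for i in rng.permutation(len(X)):          # one example at a time (noisy updates)
        grad = 2 * (X[i] @ w - y[i]) * X[i]    # gradient of squared error for this example
        velocity = momentum * velocity - lr * grad
        w = w + velocity

print("learned weights:", w)   # should approach [1.5, -2.0, 0.5]
```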
In its simplest form, a feed-forward neural network reduces to a single-layer
perceptron.
This model multiplies inputs with weights as they enter the layer. Afterward, the
weighted input values get added together to get the sum. As long as the sum of the
values rises above a certain threshold, set at zero, the output value is usually 1, while if it
falls below the threshold, it is usually -1.
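A minimal sketch of the single-layer perceptron rule just described, with made-up inputs and weights: a weighted sum of the inputs is thresholded at zero to give +1 or -1.

```python
# Sketch of the thresholded weighted sum used by a single-layer perceptron.
import numpy as np

def perceptron_output(inputs, weights, bias=0.0):
    """Return +1 if the weighted sum exceeds the zero threshold, else -1."""
    total = np.dot(inputs, weights) + bias
    return 1 if total > 0 else -1

# Hypothetical inputs and weights for illustration.
x = np.array([0.5, -1.0, 2.0])
w = np.array([0.4, 0.3, 0.8])
print(perceptron_output(x, w))   # 0.5*0.4 - 1.0*0.3 + 2.0*0.8 = 1.5 > 0 -> +1
```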
Tanh:
Squashes its input into the range (-1, 1); large negative inputs map close to -1 and large
positive inputs map close to 1.
28. What is Regularization?
Regularization is a technique used in machine learning and deep learning to prevent
overfitting and improve the generalization performance of a model. It involves adding a
penalty term to the loss function during training. This penalty discourages the model
from becoming too complex or having large parameter values, which helps in
controlling the model’s ability to fit noise in the training data. Regularization methods
include L1 and L2 regularization, dropout, early stopping, and more.
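A hedged sketch of L2 regularization: a penalty proportional to the squared weights is added to a mean-squared-error loss, discouraging large parameter values. The data and the regularization strength are illustrative.

```python
# Sketch: adding an L2 penalty term to a data loss.
import numpy as np

def mse_loss(w, X, y):
    return np.mean((X @ w - y) ** 2)

def regularized_loss(w, X, y, reg_strength=0.01):
    # reg_strength (lambda) is a hypothetical hyperparameter value.
    return mse_loss(w, X, y) + reg_strength * np.sum(w ** 2)

rng = np.random.default_rng(1)
X = rng.normal(size=(50, 4))
y = rng.normal(size=50)
w = rng.normal(size=4)
print("plain loss:      ", mse_loss(w, X, y))
print("regularized loss:", regularized_loss(w, X, y))
```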
31. How does splitting a dataset into train, dev and test sets help identify overfitting?
• Overfitting: the model fits the training set so much that it does not generalize well.
• Low training error and high dev error can be used to identify this
• Must ensure that the distribution of train and dev is the same/similar!
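A minimal sketch of this diagnostic, using synthetic data and NumPy polynomial fitting: the high-degree fit typically shows low training error but noticeably higher dev error, i.e. overfitting. The data and polynomial degrees are illustrative.

```python
# Sketch: diagnosing overfitting from the gap between train and dev error.
import numpy as np

rng = np.random.default_rng(0)
x = rng.uniform(-1, 1, 40)
y = np.sin(3 * x) + 0.3 * rng.normal(size=40)
x_train, y_train = x[:30], y[:30]      # train split
x_dev, y_dev = x[30:], y[30:]          # dev split (same distribution as train)

for degree in (3, 12):
    coeffs = np.polyfit(x_train, y_train, degree)
    train_err = np.mean((np.polyval(coeffs, x_train) - y_train) ** 2)
    dev_err = np.mean((np.polyval(coeffs, x_dev) - y_dev) ** 2)
    print(f"degree {degree}: train error {train_err:.3f}, dev error {dev_err:.3f}")
```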
PART B
1. Develop short notes on following with respect to deep learning with
Examples.
i) Scalar and Vectors. (6)
ii) Matrices. (7)
2. Explicate Probability Mass function and Probability Density function (13)
3. Describe Gradient-based optimization in deep learning.
4. Explain in detail the linear regression machine learning algorithm. (13)
5. Describe Stochastic Gradient Descent in detail. (13)
6. Explain in detail the different regularization techniques in Deep Learning. (13)
7. Briefly explain how regularization helps reduce overfitting. (13)
8. Analyse and write short notes on Dataset Augmentation. (13)
9. Point out and explain different set of layers in Feed forward networks.
10. Describe Deep feed forward networks with neat diagram. (13)
PART C
1. Assess the following with respect to deep learning examples.
i) Random Variables. (6)
ii) Probability. (7)
2. Explain briefly on Estimators, Bias and Variance that are useful for generalization,
underfitting and overfitting.
3. Briefly explain an example of a fully functioning feed forward network on a simple
task.
4. Assess the difference between linear models and neural networks. (15)
UNIT II
CONVOLUTIONAL NEURAL NETWORKS
Convolution Operation -- Sparse Interactions -- Parameter Sharing -- Equivariance -- Pooling --
Convolution Variants: Strided -- Tiled -- Transposed and dilated convolutions; CNN Learning:
Computation.
Part A
Apply a 2x2 filter to the input to get the first convolutional layer (a feature map).
2) Another hyperparameter is the stride, which defines how much we slide the
filter over the data. For example, if the stride is 1, we move the window
by 1 pixel at a time over the image (when our input is an image). When
we use larger stride values of 2 or 3, we jump 2 or 3 pixels at a
time, which significantly reduces the output size.
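A hedged sketch of sliding a 2x2 filter with different strides, showing how a larger stride shrinks the feature map; the input values and the filter are made up for illustration.

```python
# Sketch: 2-D convolution of a 2x2 filter over a small input, at two strides.
import numpy as np

def conv2d(image, kernel, stride=1):
    kh, kw = kernel.shape
    out_h = (image.shape[0] - kh) // stride + 1
    out_w = (image.shape[1] - kw) // stride + 1
    out = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            patch = image[i*stride:i*stride+kh, j*stride:j*stride+kw]
            out[i, j] = np.sum(patch * kernel)   # dot product of filter and patch
    return out

image = np.arange(16).reshape(4, 4)
kernel = np.array([[1, 0], [0, -1]])
print(conv2d(image, kernel, stride=1).shape)   # (3, 3) feature map
print(conv2d(image, kernel, stride=2).shape)   # (2, 2): larger stride shrinks the output
```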
8. Write the formula to find how many neurons fit for a network?
The spatial size of the output volume is computed as a function of the input volume size (W),
the receptive field size of the Conv layer neurons (F), the stride with which they are applied
(S), and the amount of zero padding used on the border (P). The formula for calculating how
many neurons “fit” is given by
(W − F + 2P)/S + 1.
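A quick numeric check of the formula with a few illustrative settings:

```python
# Worked examples of the output-size formula (W - F + 2P)/S + 1.
def conv_output_size(W, F, S, P):
    return (W - F + 2 * P) // S + 1

print(conv_output_size(W=32, F=5, S=1, P=2))   # 32: "same" padding keeps the size
print(conv_output_size(W=32, F=3, S=2, P=1))   # 16: stride 2 halves the size
print(conv_output_size(W=7,  F=3, S=1, P=0))   # 5
```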
Max Pooling
Average Pooling
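A minimal sketch of 2x2 max pooling and average pooling with stride 2 on a small, made-up feature map:

```python
# Sketch: max pooling vs average pooling over 2x2 windows with stride 2.
import numpy as np

def pool2d(fmap, size=2, stride=2, mode="max"):
    out_h = (fmap.shape[0] - size) // stride + 1
    out_w = (fmap.shape[1] - size) // stride + 1
    out = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            window = fmap[i*stride:i*stride+size, j*stride:j*stride+size]
            out[i, j] = window.max() if mode == "max" else window.mean()
    return out

fmap = np.array([[1, 3, 2, 4],
                 [5, 6, 1, 2],
                 [7, 2, 9, 1],
                 [3, 4, 8, 6]])
print(pool2d(fmap, mode="max"))       # [[6, 4], [7, 9]]
print(pool2d(fmap, mode="average"))   # [[3.75, 2.25], [4.0, 6.0]]
```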
15. What is the difference between normal convolution and transposed convolution?
Traditional convolution determines each output value as the dot product between the filter
and the input; by moving the filter kernel two pixels at every step, the input is
downsampled by a factor of two. For transposed convolution, the input value determines the
filter values that will be written to the output.
In transposed convolution, the stride and padding do not correspond to the number of zeros
added around the image and the amount of shift in the kernel when sliding it across the input, as they
would in a standard convolution operation.
2. Classification
Binary cross-entropy
Categorical cross-entropy
Cost Function:
A cost function, on the other hand, is the average loss over the entire training dataset.
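To make the loss/cost distinction concrete, here is a hedged NumPy sketch using binary cross-entropy on made-up predictions; the cost is simply the mean of the per-example losses.

```python
# Sketch: per-example loss (binary cross-entropy) vs cost (average over the dataset).
import numpy as np

def binary_cross_entropy(y_true, y_pred, eps=1e-12):
    y_pred = np.clip(y_pred, eps, 1 - eps)   # avoid log(0)
    return -(y_true * np.log(y_pred) + (1 - y_true) * np.log(1 - y_pred))

y_true = np.array([1, 0, 1, 1])
y_pred = np.array([0.9, 0.2, 0.6, 0.8])

per_example_loss = binary_cross_entropy(y_true, y_pred)
cost = per_example_loss.mean()          # cost = average loss over the dataset
print(per_example_loss)
print("cost:", cost)
```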
25. What are the commonly used non-linearity functions in CNNs?
1.Rectified Linear Unit (ReLU)
2. Leaky ReLU
3. Sigmoid
4. Hyperbolic Tangent (Tanh)
5. Softmax
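To make the list concrete, here is a minimal NumPy sketch of the first four non-linearities; softmax, which turns a score vector into class probabilities, is worked through under question 27 below.

```python
# Minimal NumPy definitions of common CNN non-linearities.
import numpy as np

def relu(x):
    return np.maximum(0.0, x)

def leaky_relu(x, alpha=0.01):
    return np.where(x > 0, x, alpha * x)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

# np.tanh is used directly for the hyperbolic tangent.
x = np.array([-2.0, -0.5, 0.0, 1.0, 3.0])
for name, fn in [("ReLU", relu), ("Leaky ReLU", leaky_relu),
                 ("Sigmoid", sigmoid), ("Tanh", np.tanh)]:
    print(name, fn(x))
```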
26. Why is it important to place non-linearities between the layers of neural networks?
Non-linearity introduces more degrees of freedom to the model. It lets it capture more
complex representations which can be used towards the task at hand. A deep neural
network without non-linearities is essentially a linear regression.
27. Following the last FC-3 layer of your network, what activation must be applied?
Given a vector a = [0.3, 0.3, 0.3], what is the result of using your activation on this
vector?
Softmax is the one that is used as it can output class probabilities. Output is [0.33, 0.33,
0.33]
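A quick check of this answer: softmax of equal logits returns equal probabilities. A numerically stable NumPy version is sketched below.

```python
# Sketch verifying softmax([0.3, 0.3, 0.3]) gives uniform class probabilities.
import numpy as np

def softmax(z):
    z = z - np.max(z)              # subtract the max for numerical stability
    e = np.exp(z)
    return e / e.sum()

a = np.array([0.3, 0.3, 0.3])
print(softmax(a))                  # [0.3333... 0.3333... 0.3333...]
```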
PART B
PART C
1. Explain how to build a CNN model from scratch for any real-time application.
2. Analyse why the Adam optimizer is preferred over Gradient Descent or Stochastic
Gradient Descent for training a CNN model.
3. With an example, explain the layers of a CNN by running a convnet on an image of
dimension 32x32x3.
4. Consider an image and apply the convolution layer, activation layer, and pooling
layer operations to extract the features inside it.
UNIT III
DEEP LEARNING ALGORITHMS FOR AI
Artificial Neural Networks – Linear Associative Networks – Perceptrons - The
Backpropagation Algorithm - Hopfield Nets - Boltzmann Machines - Deep RBMs - Variational
Autoencoders - Deep Backprop Networks - Autoencoders
PART A
1. Draw a simplified taxonomy of artificial neural network.
12. Brief about the relationship between LANs, perceptrons and the backpropagation
algorithm.
The relationship between these concepts is as follows: LANs (Linear Associative Networks)
are an early type of neural network model that inspired the development of more complex
architectures like perceptrons.
Perceptrons, in turn, laid the foundation for multilayer neural networks, which are
trained using the backpropagation algorithm to learn complex mappings between inputs
and outputs. The backpropagation algorithm revolutionized the field of neural networks
and played a crucial role in their widespread adoption and success in various
applications.
A deep Boltzmann machine (DBM) is a model with multiple hidden layers and undirected
connections between the nodes. A DBM learns features hierarchically
from the raw data, and the features extracted in one layer are applied as hidden variables
that serve as input to the subsequent layer.
PART B
1. Differentiate Artificial neurons with Biological neurons (13)
2. How do the artificial neurons work? Explain with neat diagram (13)
3. Explain with diagram the architecture of linear associative network (13)
4. Describe how does perceptron works in artificial neural network. (13)
5. Explain in detail on Perceptron function, inputs, activation function and outputs of
perceptron (13)
6. How does the backpropagation algorithm work for a neural network? Explain in detail. (13)
7. Why is the backpropagation algorithm used for neural networks? Explain in detail. (13)
8. Elucidate briefly on structure and architecture of Hopfield network. (13)
9. Describe in detail on Deep Restricted Boltzmann Machines (RBMs). (13)
10. Explain in detail on the architecture of autoencoders and how to train autoencoders?
(13)
PART C
3. Consider the following problem. We are required to create a Discrete Hopfield Network
in which the input vector [1 1 1 -1] (bipolar representation), or [1 1 1 0] in the case of
binary representation, is stored in the network. Test the Hopfield network with missing
entries in the first and second components of the stored vector (i.e. [0 0 1 0]). (15)
4. Assess the working principle of how restricted Boltzmann machines work with suitable
example. (15)
UNIT IV
DATA SCIENCE AND DEEP LEARNING
Data science fundamentals and responsibilities of a data scientist - life cycle of data science –
Data science tools - Data modeling, and featurization - How to work with data variables and
data science tools - How to visualize the data - How to work with machine learning algorithms
and Artificial Neural Networks
PART A
Data science is the science of analyzing raw data using statistics and machine learning
techniques with the purpose of drawing conclusions about that information.
A data scientist is someone who integrates the skills of a software programmer, statistician
and storyteller/artist to extract the nuggets of gold hidden under mountains of data.
Management
Analytics
Strategy/Design
Collaboration
Knowledge
It helps convert large quantities of raw and unstructured data into meaningful
insights.
It can assist in making predictions, for example in surveys, elections, etc.
It also helps in automating transportation, such as developing self-driving cars, which we
can say is the future of transportation.
Companies are shifting towards data science and adopting this technology. Amazon,
Netflix, etc., which cope with huge quantities of data, use data science
algorithms to deliver a better customer experience.
Apache Hadoop
Apache Spark
Data Robot
Tableau
BigML
TensorFlow
Jupyter
Data modeling is one of the most rewarding parts of the process and has become a center
of attention for data learners. It is not just about taking a function from a package
and applying it to the available data; there is more to it than that.
It is a process that converts a nested JSON object into a flat vector
of scalar values, which is the basic requirement for the analysis process.
JSON is a lightweight format for data that machines can easily write
and read. The main reason for using JSON is that it interacts easily and reliably with
different languages and platforms such as JavaScript, R, Python, etc. Much of the software
used to interact with stored data works with JSON-based formats.
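As a hedged sketch of the JSON flattening just described, pandas' json_normalize can turn a nested record into flat scalar columns; the record and field names here are hypothetical.

```python
# Sketch: flattening a nested JSON object into a flat record for analysis.
import json
import pandas as pd

raw = '{"user": {"id": 7, "location": {"city": "Chennai", "pin": "600001"}}, "score": 0.82}'
record = json.loads(raw)

flat = pd.json_normalize(record, sep="_")
print(flat.columns.tolist())   # columns like 'user_id', 'user_location_city', 'score'
print(flat.iloc[0].to_dict())
```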
a) Exploration:
Investigate the dataset and identify some of its main features, laying the
foundation for more thorough analysis. At this stage, visualizations can make it
easier to get a sense of what’s in your dataset and to spot any noteworthy trends or
anomalies.
b) Explanation
Once you’ve conducted your analysis and have figured out what the data is
telling you, you’ll want to share these insights with others—key business
stakeholders who can take action based on the data.
Get an initial understanding of your data by making trends, patterns, and outliers
easily visible to the naked eye
Communicate insights and findings to non-data experts, making your data accessible
and actionable
Tell a meaningful and impactful story, highlighting only the most relevant
information for a given context
Hierarchical visualizations
Network visualizations
Multidimensional or 3D visualizations
Geospatial visualizations
Charts
Tables
Graphs
Maps
Infographics
Dashboards
Keep it simple
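As a small, hedged illustration of the exploration stage described above, a simple matplotlib line chart of made-up monthly sales:

```python
# Sketch of a simple exploratory chart; the sales figures are illustrative only.
import matplotlib.pyplot as plt

months = ["Jan", "Feb", "Mar", "Apr", "May", "Jun"]
sales = [120, 135, 128, 160, 172, 165]

plt.figure(figsize=(6, 3))
plt.plot(months, sales, marker="o")
plt.title("Monthly sales (illustrative data)")
plt.xlabel("Month")
plt.ylabel("Units sold")
plt.tight_layout()
plt.show()
```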
19. What are the similarities between data science and AI?
Both are interdisciplinary fields that draw from computer science, mathematics, and
statistics.
Featurization is the process of converting varied forms of data into numerical data that can
be used by basic ML algorithms. The data can be text, images, videos, graphs, various
database tables, time series, categorical features, etc.
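A minimal featurization sketch, assuming pandas is available: a categorical text column is converted into numeric indicator columns with one-hot encoding; the table contents are made up.

```python
# Sketch: one-hot encoding a categorical column so it becomes numeric features.
import pandas as pd

df = pd.DataFrame({
    "city": ["Chennai", "Delhi", "Chennai", "Mumbai"],
    "age": [23, 31, 27, 45],
})

features = pd.get_dummies(df, columns=["city"])   # text category -> numeric indicator columns
print(features)
```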
Technologies that come under the umbrella of AI include machine learning and deep
learning. Machine learning enables software applications to become more accurate at
predicting outcomes without being explicitly programmed to do so. Machine learning
algorithms use historical data as input to predict new output values.
PART – B
1. Explicate in detail about data science life cycle with neat diagram? (13)
2. Briefly explain the most commonly used data science tools with example? (13)
4. How to visualize the data and explain the visualization techniques with suitable example.
(13)
5. Compare and Contrast on Machine learning and Artificial Neural Network in detail. (13)
6. How to work with data variables and data science tools with example? (13)
7. How do the artificial neurons work? Explain with neat diagram (13)
8. Discuss step by step working of the Artificial Neural Network in detail. (13)
PART C
3. Explicate the steps involved for machine learning using algorithms that automatically
help the system to gather and use data to learn more with real time example. (15)
UNIT V
APPLICATIONS OF DEEP LEARNING
Detection in chest X-ray images -object detection and classification -RGB and depth image
fusion - NLP tasks - dimensionality estimation - time series forecasting -building electric
power grid for controllable energy resources - guiding charities in maximizing donations and
robotic control in industrial environments.
Bounding box, which identifies the edges of the object, tagged with a clear-cut
quadrilateral, typically either a square or a rectangle.
Label of the object, whether it be a person, a car, or a dog, to describe the target
object. Bounding boxes can overlap to showcase multiple objects in a given shot, as
long as the model has prior knowledge of the items it is tagging.
This is the prediction of the class of an item in an image. Image classification can show
that a particular object exists in the image, but it involves one primary object and does not
provide the location of the object within the visual.
Object localization seeks to identify the location of one or more objects in an image,
whereas object detection identifies all objects and their borders without much focus on
placement.
R-CNN
Fast R-CNN
Faster R-CNN
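As a hedged sketch (not a prescribed answer), a pre-trained Faster R-CNN from torchvision can be run on a single image roughly as follows; torchvision 0.13 or newer is assumed, and "street.jpg" is a hypothetical file path.

```python
# Sketch: inference with a pre-trained Faster R-CNN detector from torchvision.
import torch
from torchvision.io import read_image
from torchvision.models.detection import fasterrcnn_resnet50_fpn

model = fasterrcnn_resnet50_fpn(weights="DEFAULT")
model.eval()

img = read_image("street.jpg").float() / 255.0     # CxHxW tensor scaled to [0, 1]
with torch.no_grad():
    predictions = model([img])[0]                  # dict with boxes, labels, scores

keep = predictions["scores"] > 0.8                 # keep only confident detections
print(predictions["boxes"][keep])
print(predictions["labels"][keep])
```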
7. What is NLP?
NLP (Natural Language Processing) is a branch of AI that enables computers to understand,
interpret, and generate human language. Common NLP tasks include:
Speech Recognition
Machine Translation
Document Summarization
Time series forecasting is a technique for the prediction of events through a sequence of
time. It predicts future events by analyzing the trends of the past, on the assumption that
future trends will hold similar to historical trends. It is used across many fields of study in
various applications including: Astronomy.
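A minimal sketch of the idea that future values are predicted from past trends: a synthetic series is framed into sliding windows and a least-squares model forecasts the next step. The window size and data are illustrative.

```python
# Sketch: sliding-window framing of a time series plus a least-squares forecast.
import numpy as np

series = np.sin(np.arange(120) * 0.2) + 0.05 * np.random.default_rng(0).normal(size=120)
window = 6

# Each row of X holds the previous `window` observations; y is the next value.
X = np.array([series[i:i + window] for i in range(len(series) - window)])
y = series[window:]

coeffs, *_ = np.linalg.lstsq(np.c_[X, np.ones(len(X))], y, rcond=None)
next_value = np.r_[series[-window:], 1.0] @ coeffs
print("forecast for the next step:", next_value)
```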
10. Mention the deep learning architectures specialized in time series forecasting?
N-BEATS (ElementAI)
DeepAR (Amazon)
Spacetimeformer
Multiple time-series
Interpretability
The model would consider both temporal and spatial relationships. This is the core idea
of Spacetimeformer.
14. What is the difference between depth image and RGB image?
A depth image represents a 24-bit integer depth value for each pixel with a fixed resolution of
1/256 of a millimeter: each unit of the blue channel represents 1/256 millimeter, each unit of
the green channel represents 1 millimeter, and each unit of the red channel represents 256 millimeters.
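A small decoding sketch, assuming the channel weighting described above (blue = 1/256 mm, green = 1 mm, red = 256 mm); the pixel value used is made up.

```python
# Sketch: decoding a per-pixel depth (in millimeters) from a 24-bit encoding
# where blue carries 1/256 mm, green 1 mm and red 256 mm per unit -- this
# weighting is taken from the description above and is an assumption here.
import numpy as np

def decode_depth_mm(rgb):
    """rgb: HxWx3 uint8 array with channels ordered R, G, B."""
    r = rgb[..., 0].astype(np.float64)
    g = rgb[..., 1].astype(np.float64)
    b = rgb[..., 2].astype(np.float64)
    return 256.0 * r + 1.0 * g + b / 256.0

pixel = np.array([[[1, 250, 128]]], dtype=np.uint8)   # one hypothetical pixel
print(decode_depth_mm(pixel))   # 256 + 250 + 0.5 = 506.5 mm
```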
The main purpose of the RGB color model is for the sensing, representation, and display of
images in electronic systems, such as televisions and computers, though it has also been used
in conventional photography.
18. What are the approaches used for dimensionality estimation in deep learning?
Probabilistic forecasting
The main driver behind the use of deep learning in robotics is that it is more general than
most other learning algorithms. Deep networks have been shown to be capable of abstraction
at a high level.
The arm has a controller which is the “brain” of the system. The controller holds the
programming code and receives signals from the system (input), processes the signals, and
then sends signals out to the system (output) to control the robot.
PRM (Probabilistic Roadmap) Algorithm
PRMs are used in complex planning systems and also to find low-cost paths around
obstacles. A PRM samples random points in the space where the robot can
possibly move, connects nearby samples into a roadmap, and then computes the shortest path.
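A compact, hedged sketch of the PRM idea in an empty 2-D space (no obstacle checking, made-up sampling parameters): random points are sampled, nearby points are connected into a roadmap, and Dijkstra's algorithm finds the shortest path between start and goal.

```python
# Sketch of the PRM idea: sample points, link nearby ones, run Dijkstra.
import heapq
import numpy as np

rng = np.random.default_rng(3)
points = np.vstack([[0.0, 0.0], rng.uniform(0, 10, (40, 2)), [10.0, 10.0]])
start, goal, radius = 0, len(points) - 1, 3.0

# Build the roadmap: an edge between every pair of points closer than `radius`.
dist = np.linalg.norm(points[:, None] - points[None, :], axis=-1)
graph = {i: [(j, dist[i, j]) for j in range(len(points))
             if i != j and dist[i, j] < radius] for i in range(len(points))}

# Dijkstra's algorithm over the roadmap.
best = {start: 0.0}
queue = [(0.0, start)]
while queue:
    d, node = heapq.heappop(queue)
    if node == goal:
        print("shortest roadmap path length:", round(d, 2))
        break
    if d > best.get(node, float("inf")):
        continue
    for nxt, w in graph[node]:
        nd = d + w
        if nd < best.get(nxt, float("inf")):
            best[nxt] = nd
            heapq.heappush(queue, (nd, nxt))
else:
    print("roadmap is disconnected for this sample; re-sample or raise `radius`")
```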
PART B
1. Explain in detail on how the object is detected and classified using deep learning
concepts.(13)
2. Detailed overview on building electric power grid for controllable energy resources in
deep learning concept. (13)
3. Explain in detail on the process involved for prediction on historical time dependent data
using neural network. (13)
N-BEATS
DeepAR
Spacetimeformer
Temporal Fusion Transformer
PART C
1. Build a deep learning model for guiding charities in maximizing donations (15)
2. Build a deep learning model for detection in chest X-ray images using TensorFlow. (15)
3. Implement a model for object detection of traffic images using python. (15)