DL_4

The document outlines the course details for 'Programming for Data Analytics' taught by Dr. Kumod Kumar Gupta at the Noida Institute of Engineering and Technology. It includes faculty information, evaluation schemes, syllabus, course objectives, outcomes, and applications of data analytics across various industries. Additionally, it covers specific topics such as Recurrent Neural Networks and their applications in fields like healthcare and finance.


Noida Institute of Engineering and Technology, Greater Noida

Programming for Data Analytics, Unit 4

Dr. Kumod Kumar Gupta
Associate Professor, CS (AI) Department
Course Details: B.Tech 7th Sem / 3rd Year

05/17/2025
Dr. Kumod Kumar Gupta received his Ph.D. from Banasthali University, Jaipur. He is presently working as an Assistant Professor at NIET, Greater Noida (India). He has more than 16 years of teaching and research experience. His research interests include VLSI Design, Machine Learning, and Artificial Neural Networks. He has published more than 17 papers in refereed international journals and has attended various training programs in the area of electronics and communication engineering.

Table of Contents

1. Name of Subject with Code, Course and Subject Teacher
2. Brief Introduction of Faculty Member with Photograph
3. Evaluation Scheme
4. Subject Syllabus
5. Branch-wise Applications
6. Course Objectives (Point-wise)
7. Course Outcomes (COs)
8. Program Outcomes (POs) (headings only)
9. COs and POs Mapping
10. Program Specific Outcomes (PSOs)
Table of Contents

11. COs and PSOs Mapping
12. Program Educational Objectives (PEOs)
13. Result Analysis (Department Result, Subject Result and Individual Faculty Result)
14. End Semester Question Paper Templates (Offline Pattern/Online Pattern)
15. Prerequisite/Recap
16. Brief Introduction about the Subject with Videos
17. Unit Content
18. Unit Objective
19. Topic Objective/Topic Outcome
20. Lecture Related to Topic
21. Daily Quiz
22. Weekly Assignment
Table of Contents

23. Topic Links
24. MCQ (End of Unit)
25. Glossary Questions
26. Old Question Papers (Sessional + University)
27. Expected Questions
28. Recap of Unit
Evaluation Scheme
Syllabus
Branch-wise Applications

Data analytics is used in most sectors of business. Here are some primary areas where data analytics does its magic:
1. The banking and e-commerce industries use data analytics to detect fraudulent transactions.
2. The healthcare sector uses data analytics to improve patient health by detecting diseases before they happen; it is commonly used for cancer detection.
3. Inventory management uses data analytics to keep track of different items.
4. Logistics companies use data analytics to ensure faster delivery of products by optimizing vehicle routes.
5. Marketing professionals use analytics to reach the right customers and perform targeted marketing to increase ROI.
6. Data analytics can be used in city planning to build smart cities.
Course Objectives

• Demonstrate knowledge of statistical data analysis techniques utilized in business decision making.
• Apply principles of Data Science to the analysis of business problems.
• Use data mining software to solve real-world problems.
• Employ cutting-edge tools and technologies to analyze Big Data.
Course Outcomes

After completion of this course, students will be able to:

CO1: Install, code, and use the Python and R programming languages in the RStudio IDE to perform basic tasks on vectors, matrices, and data frames.
CO2: Implement the concept of R packages.
CO3: Understand the basic concepts of MongoDB.
CO4: Understand and apply the concepts of RNNs and TensorFlow.
CO5: Understand and evaluate the concept of Keras in deep learning.
Program Outcomes

1. Engineering knowledge
2. Problem analysis
3. Design/development of solutions
4. Conduct investigations of complex problems
5. Modern tool usage
6. The engineer and society
7. Environment and sustainability
8. Ethics
9. Individual and team work
10. Communication
11. Project management and finance
12. Life-long learning
CO-PO Mapping

Mapping of Course Outcomes and Program Outcomes:

       PO1  PO2  PO3  PO4  PO5  PO6  PO7  PO8  PO9  PO10  PO11  PO12
CO.1    3    3    3    2    1    2    3
CO.2    3    3    3    2    1    2    3
CO.3    3    3    3    2    1    2    3
CO.4    3    3    3    2    1    2    3
CO.5    3    3    3    2    1    2    3
Program Specific Outcomes

On successful completion of the graduation degree, Engineering graduates will be able to:
• PSO1: Design innovative intelligent systems for the welfare of the people using machine learning and its applications.
• PSO2: Demonstrate ethical, professional and team-oriented skills while providing innovative solutions in Artificial Intelligence and Machine Learning for life-long learning.
CO-PSO Mapping

Mapping of Course Outcomes and Program Specific Outcomes:

       PSO1  PSO2
CO.1    2     3
CO.2    2     3
CO.3    2     3
CO.4    2     3
CO.5    2     3
Program Educational Objectives

PEO1: Pursue higher education and a professional career to excel in the field of Artificial Intelligence and Machine Learning.
PEO2: Lead by example in innovative research and entrepreneurial zeal for 21st-century skills.
PEO3: Proactively provide innovative solutions for societal problems to promote life-long learning.
End Semester Question Paper Templates (Offline Pattern/Online Pattern)
Prerequisite/Recap

Students must know computer programming and related programming paradigms.
About the Subject with Videos

Data is being generated at a massive rate, by the minute, and organizations are trying to explore every opportunity to make sense of it. This is where data analytics becomes crucial to running a business successfully; it is commonly used in companies to drive profit and business growth.

Links:
Unit 1: https://www.ibm.com/cloud/blog/python-vs-r
Unit 2: https://www.youtube.com/watch?v=C5R5SdYzQBI
Unit 3: https://hevodata.com/learn/data-engineering-and-data-engineers/
Unit 4: https://www.youtube.com/watch?v=IjEZmH7byZQ
Unit 5: https://www.youtube.com/watch?v=pWp3PhYI-OU
UNIT CONTENT

• Pandas data structures: Series and DataFrame
• Data wrangling using Pandas
• Statistics with Pandas
• Mathematical computing using NumPy
• Data visualization with Python
• Descriptive and inferential statistics
• Introduction to model building
• Probability and hypothesis testing
• Sensitivity analysis
• Regular expressions: the re package
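As a starting point for the Pandas topics above, here is a minimal sketch of the two core data structures, Series and DataFrame. The labels and values are made up purely for illustration:

```python
import pandas as pd

# A Series is a labeled one-dimensional array.
s = pd.Series([10, 20, 30], index=["a", "b", "c"])

# A DataFrame is a table of labeled columns (hypothetical sample data).
df = pd.DataFrame({"name": ["Asha", "Ravi"], "score": [82, 91]})

print(s["b"])              # look up a value by label
print(df["score"].mean())  # column-wise statistics come built in
```

The same label-based indexing and column operations carry over to the data wrangling and statistics topics later in the unit.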
Unit Objectives

• Learn what data analytics is and its various applications.
• Study the different types of data analytics and their process steps.
• Perform data analytics using Python's NumPy, Pandas, and Matplotlib libraries.
Topic Objectives

• Understand the fundamentals of the Pandas library in Python and how it is used to handle data.
• Learn how to work with arrays, queries, and DataFrames.
• Learn how to use the groupby, merge, and join methods in Pandas.
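The groupby and merge methods mentioned above can be sketched with a toy dataset (the regions, amounts, and manager names below are invented for illustration):

```python
import pandas as pd

# Hypothetical sales records and a lookup table of regional managers.
sales = pd.DataFrame({"region": ["N", "N", "S"], "amount": [100, 150, 80]})
heads = pd.DataFrame({"region": ["N", "S"], "manager": ["Meera", "Arun"]})

# groupby: aggregate the amounts within each region.
totals = sales.groupby("region", as_index=False)["amount"].sum()

# merge: SQL-style join of the totals with the manager table on a shared key.
report = totals.merge(heads, on="region")
print(report)
```

join works similarly to merge but aligns on the index rather than a key column by default.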
Introduction to TensorFlow and AI

Day 1 (17/02/23)
• About TensorFlow
• Basics of TensorFlow
• Activation functions
• ANN theory
• Assignment Day 1
Second half:
• CNN
• CNN program

Day 2 (19/02/23)
• Revise ANN and CNN
• ANN program
• TensorBoard
Second half:
• RNN
• One-hot encoding program
• Sequence program
• Understanding basic RNNs
• Assignment Day 2
Recurrent Neural Networks

Several kinds of neural networks can help solve different business problems. Let's look at a few of them:
• Feed-forward neural network: used for general regression and classification problems.
• Convolutional neural network: used for object detection and image classification.
• Deep belief network: used in the healthcare sector for cancer detection.
• RNN: used for speech recognition, voice recognition, time series prediction, and natural language processing.
Recurrent Neural Networks

What Is a Recurrent Neural Network (RNN)?
• An RNN works on the principle of saving the output of a particular layer and feeding it back to the input in order to predict the output of the layer.
• Below is how you can convert a feed-forward neural network into a recurrent neural network:

Fig: Simple Recurrent Neural Network
Recurrent Neural Networks

The nodes in the different layers of the neural network are compressed to form a single layer of recurrent neural networks. A, B, and C are the parameters of the network.

Fig: Fully Connected Recurrent Neural Network
Recurrent Neural Networks

• Here, "x" is the input layer, "h" is the hidden layer, and "y" is the output layer. A, B, and C are the network parameters used to improve the output of the model.
• At any given time t, the current state combines the input at x(t) with the state carried over from x(t-1). The output at any given time is fed back into the network to improve the output.

Fig: Fully Connected Recurrent Neural Network
Recurrent Neural Networks

Why Recurrent Neural Networks?
RNNs were created because feed-forward neural networks have a few limitations:
• They cannot handle sequential data.
• They consider only the current input.
• They cannot memorize previous inputs.
The solution to these issues is the RNN. An RNN can handle sequential data, accepting both the current input and previously received inputs, and it can memorize previous inputs thanks to its internal memory.
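The "internal memory" described above can be demonstrated with a single recurrent unit. This is a minimal sketch with hand-picked scalar weights (w_x and w_h are arbitrary illustrative values, not learned parameters):

```python
import numpy as np

w_x, w_h = 0.5, 0.8   # input weight and recurrent weight (illustrative)

def rnn_states(seq):
    """Run one recurrent unit over a sequence and record its hidden state."""
    h = 0.0
    states = []
    for x in seq:
        h = np.tanh(w_x * x + w_h * h)  # new state mixes the input with memory
        states.append(h)
    return states

a = rnn_states([1.0, 0.0, 0.0])
b = rnn_states([0.0, 0.0, 0.0])
# At t=2 both sequences receive the same input (0.0), yet the states differ:
# the unit still "remembers" the 1.0 seen at t=0 through its hidden state.
print(a[2], b[2])
```

A feed-forward unit given the same input at t=2 would produce identical outputs for both sequences, which is exactly the limitation listed above.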
Recurrent Neural Networks

How Do Recurrent Neural Networks Work?
In recurrent neural networks, the information cycles through a loop to the middle hidden layer.

Fig: Working of a Recurrent Neural Network
Recurrent Neural Networks

• The input layer 'x' takes in the input to the neural network, processes it, and passes it on to the middle layer.
• The middle layer 'h' can consist of multiple hidden layers, each with its own activation functions, weights, and biases. If the various parameters of these hidden layers are not affected by the previous layer, i.e., the neural network has no memory, then you can use a recurrent neural network.
• The recurrent neural network will standardize the different activation functions, weights, and biases so that each hidden layer has the same parameters. Then, instead of creating multiple hidden layers, it creates one and loops over it as many times as required.
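The "one layer, looped with the same parameters" idea above can be sketched in NumPy. The sizes and random inputs are arbitrary placeholders:

```python
import numpy as np

rng = np.random.default_rng(0)
n_in, n_h = 3, 4                                 # input and hidden sizes (illustrative)
Wx = rng.normal(scale=0.1, size=(n_h, n_in))     # shared input-to-hidden weights
Wh = rng.normal(scale=0.1, size=(n_h, n_h))      # shared recurrent weights
b = np.zeros(n_h)

def forward(xs):
    """One hidden layer, looped over the sequence with the SAME parameters."""
    h = np.zeros(n_h)
    states = []
    for x in xs:                                 # loop instead of stacking layers
        h = np.tanh(Wx @ x + Wh @ h + b)
        states.append(h)
    return np.array(states)

seq = rng.normal(size=(5, n_in))                 # a sequence of 5 input vectors
H = forward(seq)
print(H.shape)                                   # one hidden state per timestep
```

Every timestep reuses Wx, Wh, and b, which is what the slide means by standardizing the parameters across the "unrolled" hidden layers.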
Recurrent Neural Networks

Applications of Recurrent Neural Networks

Image Captioning
RNNs are used to caption an image by analyzing the activities present in it.

Time Series Prediction
Any time series problem, like predicting the prices of stocks in a particular month, can be solved using an RNN.
Recurrent Neural Networks

Natural Language Processing
Text mining and sentiment analysis can be carried out using an RNN for Natural Language Processing (NLP).
Recurrent Neural Networks

Machine Translation
Given an input in one language, RNNs can be used to translate it into different languages as output.
Recurrent Neural Networks

Types of Recurrent Neural Networks
There are four types of recurrent neural networks:
1. One to One
2. One to Many
3. Many to One
4. Many to Many
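The difference between the last two types above comes down to which hidden states you read out. This minimal NumPy sketch (all weights and sizes are arbitrary illustrative values) shows a many-to-one readout using only the final state versus a many-to-many readout using every state:

```python
import numpy as np

rng = np.random.default_rng(1)
Wx = rng.normal(scale=0.1, size=(4, 3))   # input weights (hypothetical sizes)
Wh = rng.normal(scale=0.1, size=(4, 4))   # recurrent weights
Wy = rng.normal(scale=0.1, size=(2, 4))   # output weights

def run(xs):
    """Collect the hidden state at every timestep."""
    h = np.zeros(4)
    hs = []
    for x in xs:
        h = np.tanh(Wx @ x + Wh @ h)
        hs.append(h)
    return np.array(hs)

seq = rng.normal(size=(6, 3))             # 6 input vectors of size 3
hs = run(seq)

many_to_one = Wy @ hs[-1]                 # e.g. sentiment: one label per sequence
many_to_many = hs @ Wy.T                  # e.g. translation-style: output per step
print(many_to_one.shape, many_to_many.shape)
```

One-to-many architectures instead feed a single input (such as image features) and keep looping to emit a sequence, as in image captioning.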
Recurrent Neural Networks

One to One RNN
This type of neural network is known as the Vanilla Neural Network. It is used for general machine learning problems that have a single input and a single output.
Recurrent Neural Networks

One to Many RNN
This type of neural network has a single input and multiple outputs. An example of this is image captioning.
Recurrent Neural Networks

Many to One RNN
This RNN takes a sequence of inputs and generates a single output. Sentiment analysis is a good example of this kind of network, where a given sentence can be classified as expressing a positive or negative sentiment.
Recurrent Neural Networks

Many to Many RNN
This RNN takes a sequence of inputs and generates a sequence of outputs. Machine translation is one example.
Recurrent Neural Networks

Two Issues of Standard RNNs

1. Vanishing Gradient Problem
• Recurrent neural networks enable you to model time-dependent and sequential data problems, such as stock market prediction, machine translation, and text generation. You will find, however, that RNNs are hard to train because of the gradient problem.
Recurrent Neural Networks

• RNNs suffer from the problem of vanishing gradients. The gradients carry information used in the RNN, and when a gradient becomes too small, the parameter updates become insignificant. This makes learning long data sequences difficult.
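Why the gradient shrinks can be seen with simple arithmetic: backpropagating through T timesteps multiplies roughly T per-step factors together, and if each factor is below 1 the product decays exponentially. The scalar factor here is an illustrative stand-in for the per-step Jacobian norm:

```python
# Illustrative-only: a per-step gradient factor below 1 shrinks exponentially
# as it is multiplied across timesteps during backpropagation through time.
factor = 0.5
for T in (5, 20, 50):
    grad = factor ** T   # product of T identical per-step factors
    print(T, grad)
```

After 50 steps the gradient is below 1e-15, so the weight update contributed by early timesteps is effectively zero, which is exactly why long sequences are hard to learn.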
Recurrent Neural Networks

2. Exploding Gradient Problem
• While training a neural network, if the slope tends to grow exponentially instead of decaying, this is called an exploding gradient.
• This problem arises when large error gradients accumulate, resulting in very large updates to the neural network model's weights during the training process.
Long training times, poor performance, and bad accuracy are the major issues caused by gradient problems.

Gradient Problem Solutions
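One common remedy for exploding gradients (not named on the slide, but standard practice) is gradient clipping: if the gradient's norm exceeds a threshold, rescale it while keeping its direction. A minimal NumPy sketch, with made-up gradient values:

```python
import numpy as np

def clip_by_norm(grad, max_norm):
    """Rescale the gradient vector if its L2 norm exceeds max_norm."""
    norm = np.linalg.norm(grad)
    if norm > max_norm:
        grad = grad * (max_norm / norm)
    return grad

g = np.array([30.0, 40.0])      # an "exploded" gradient with norm 50
clipped = clip_by_norm(g, 5.0)
print(np.linalg.norm(clipped))  # magnitude capped, direction preserved
```

The most popular solution to the vanishing gradient problem, the LSTM, is discussed next.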
Long Short-Term Memory Networks (LSTMs)

Now, let's discuss the most popular and efficient way to deal with gradient problems: Long Short-Term Memory networks (LSTMs).
First, let's understand long-term dependencies.
Suppose you want to predict the last word in the text: "The clouds are in the ______."
The most obvious answer is "sky." We do not need any further context to predict the last word in the above sentence.
Now consider this sentence: "I have been staying in Spain for the last 10 years… I can speak fluent ______."
The word you predict will depend on the previous few words in context. Here, you need the context of Spain to predict the last word, and the most suitable answer is "Spanish." The gap between the relevant information and the point where it is needed may become very large. LSTMs help you solve this problem.
Long Short-Term Memory Networks (LSTMs)

• LSTM (Long Short-Term Memory) is a type of recurrent neural network specifically designed to address the vanishing gradient problem.
• The vanishing gradient problem occurs in RNNs when the gradients of the loss function with respect to the model parameters become very small, or even zero, as the network iterates over long sequences of data. This can make it difficult for the network to learn long-term dependencies between inputs and outputs.
• LSTMs address the vanishing gradient problem by using three gates to control the flow of information through the network:
• The forget gate determines how much of the previous cell state is forgotten at each time step.
• The input gate determines how much of the current input is added to the cell state.
• The output gate determines how much of the cell state is output to the next time step.
The forget gate is particularly important for preventing the vanishing gradient problem. It can selectively discard irrelevant information from the previous cell state while retaining what matters for long-term dependencies. This allows the LSTM to learn long-term dependencies without the gradients becoming too small.
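The three gates above can be sketched as one LSTM cell step in NumPy. The weight sizes and random initialization are illustrative placeholders, and each gate acts on the concatenation of the previous hidden state and the current input:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(2)
n_in, n_h = 3, 4   # input and hidden sizes (illustrative)
# One weight matrix per gate, acting on [h_prev, x].
Wf, Wi, Wo, Wc = (rng.normal(scale=0.1, size=(n_h, n_h + n_in)) for _ in range(4))
bf = bi = bo = bc = np.zeros(n_h)

def lstm_step(x, h_prev, c_prev):
    z = np.concatenate([h_prev, x])
    f = sigmoid(Wf @ z + bf)          # forget gate: keep/drop old cell state
    i = sigmoid(Wi @ z + bi)          # input gate: admit new information
    o = sigmoid(Wo @ z + bo)          # output gate: expose the cell state
    c_tilde = np.tanh(Wc @ z + bc)    # candidate values in [-1, 1]
    c = f * c_prev + i * c_tilde      # new cell state
    h = o * np.tanh(c)                # new hidden state
    return h, c

h = c = np.zeros(n_h)
for x in rng.normal(size=(5, n_in)):
    h, c = lstm_step(x, h, c)
print(h.shape, c.shape)
```

Note the cell-state update c = f*c_prev + i*c_tilde is additive: gradients can flow through the f*c_prev term without being repeatedly squashed, which is the mechanism behind the resistance to vanishing gradients.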
Long Short-Term Memory Networks (LSTMs)

• In addition to the forget gate, LSTMs use a sigmoid activation function for the input and output gates.
• Sigmoid activation functions have a range of [0, 1], which means they can control the flow of information more gradually than other activation functions, such as the rectified linear unit (ReLU).
• This helps prevent the gradients from becoming too large or too small, which can also contribute to the vanishing gradient problem.
• As a result of these design features, LSTMs are able to learn long-term dependencies much more effectively than RNNs without them.
• This makes them well suited for tasks that require the network to remember information over long periods of time, such as machine translation, speech recognition, and natural language processing.
• However, it is important to note that LSTMs do not completely solve the vanishing gradient problem. In some cases, the gradients in an LSTM can still become very small, making it difficult for the network to learn. LSTMs are nevertheless much more resistant to the vanishing gradient problem than RNNs without these features; the remaining difficulty can be further addressed with GRUs.
Long Short-Term Memory Networks (LSTMs)

Backpropagation Through Time
Backpropagation through time is when we apply the backpropagation algorithm to a recurrent neural network that has time series data as its input.
In a typical RNN, one input is fed into the network at a time, and a single output is obtained. But in backpropagation through time, you use the current as well as the previous inputs. This is called a timestep, and one timestep will consist of many time series data points entering the RNN simultaneously.
Once the neural network has trained on a set of timesteps and given you an output, that output is used to calculate and accumulate the errors. After this, the network is rolled back up, and the weights are recalculated and updated keeping the errors in mind.

https://towardsdatascience.com/understanding-rnns-lstms-and-grus-ed62eb584d90
Long Short-Term Memory Networks (LSTMs)

LSTMs are a special kind of RNN, capable of learning long-term dependencies; remembering information for long periods is their default behavior.
All RNNs take the form of a chain of repeating modules of a neural network. In standard RNNs, this repeating module has a very simple structure, such as a single tanh layer.

Fig: Long Short-Term Memory Network (showing the forget gate, input gate, and output gate)
Long Short-Term Memory Networks (LSTMs)

LSTMs also have a chain-like structure, but the repeating module is different: instead of a single neural network layer, there are four interacting layers communicating in a special way. Multiple activation functions refine the information, and a final tanh activation makes the previous output suitable for the next layer.
Long Short-Term Memory Networks (LSTMs)

Workings of LSTMs in RNNs
LSTMs work in a three-step process.
Long Short-Term Memory Networks (LSTMs)

Fig: Forget gate, input gate, and output gate in the LSTM cell
Long Short-Term Memory Networks (LSTMs)

Forget Gate
• Information that is no longer useful in the cell state is removed with the forget gate.
• Two inputs, x_t (the input at the particular time) and h_t-1 (the previous cell output), are fed to the gate and multiplied by weight matrices, followed by the addition of a bias.
• The result is passed through an activation function which gives a binary output. If, for a particular cell state, the output is 0, the piece of information is forgotten; for an output of 1, the information is retained for future use.

Fig: Forget gate in the LSTM cell
Long Short-Term Memory Networks (LSTMs)

Input Gate
• The addition of useful information to the cell state is done by the input gate.
• First, the information is regulated using the sigmoid function, which filters the values to be remembered, similar to the forget gate, using the inputs h_t-1 and x_t. Then a vector is created using the tanh function, which gives an output from -1 to +1 and contains all the possible values from h_t-1 and x_t.
• Finally, the values of the vector and the regulated values are multiplied to obtain the useful information.

Fig: Input gate in the LSTM cell
Long Short-Term Memory Networks (LSTMs)

Output Gate
• The task of extracting useful information from the current cell state to be presented as output is done by the output gate.
• First, a vector is generated by applying the tanh function to the cell state. Then the information is regulated using the sigmoid function and filtered by the values to be remembered, using the inputs h_t-1 and x_t.
• Finally, the values of the vector and the regulated values are multiplied and sent as an output, and as an input to the next cell.

Fig: Output gate in the LSTM cell
Long Short-Term Memory Networks (LSTMs)

Step 1: Decide How Much Past Data It Should Remember
The first step in the LSTM is to decide which information should be omitted from the cell in that particular time step. The sigmoid function determines this: it looks at the previous state h(t-1) along with the current input x(t) and computes the function.
Long Short-Term Memory Networks (LSTMs)

Consider the following two sentences:
Let the output of h(t-1) be "Alice is good in Physics. John, on the other hand, is good at Chemistry."
Let the current input at x(t) be "John plays football well. He told me yesterday over the phone that he had served as the captain of his college football team."
The forget gate realizes there might be a change in context after encountering the first full stop. It compares this with the current input sentence at x(t). The next sentence talks about John, so the information on Alice is deleted, and the position of the subject is vacated and assigned to John.
Long Short-Term Memory Networks (LSTMs)

Step 2: Decide How Much This Unit Adds to the Current State
The second layer has two parts: a sigmoid function and a tanh function. The sigmoid function decides which values to let through (0 or 1), and the tanh function gives weightage to the values that are passed, deciding their level of importance (-1 to 1).

With the current input at x(t), the input gate analyzes the important information: John plays football, and the fact that he was the captain of his college team is important. "He told me yesterday over the phone" is less important; hence it is forgotten. This process of adding new information is done via the input gate.
Long Short-Term Memory Networks (LSTMs)

Step 3: Decide What Part of the Current Cell State Makes It to the Output
The third step is to decide what the output will be. First, we run a sigmoid layer, which decides which parts of the cell state make it to the output. Then we put the cell state through tanh to push the values to be between -1 and 1 and multiply it by the output of the sigmoid gate.
Long Short-Term Memory Networks (LSTMs)

Let's consider this example to predict the next word in the sentence: "John played tremendously well against the opponent and won for his team. For his contributions, brave ____ was awarded player of the match."
There could be many choices for the empty space. The current input, "brave," is an adjective, and adjectives describe a noun. So "John" could be the best output after "brave."
Long Short-Term Memory Networks (LSTMs)

Drawbacks of Using LSTM Networks
LSTMs have a few drawbacks, which are discussed below:
1. LSTMs became popular because they could mitigate the problem of vanishing gradients, but it turns out they fail to remove it completely. The problem lies in the fact that the data still has to move from cell to cell for its evaluation. Moreover, the cell has become quite complex, with additional features (such as forget gates) being brought into the picture.
2. They require a lot of resources and time to train and become ready for real-world applications. In technical terms, they need high memory bandwidth because of the linear layers present in each cell, which the system usually fails to provide. Thus, hardware-wise, LSTMs become quite inefficient.
3. With the rise of data mining, developers are looking for models that can remember past information for longer than LSTMs. The source of inspiration for such models is the human habit of dividing a given piece of information into small parts for easy remembrance.
4. LSTMs are affected by different random weight initializations and hence behave quite similarly to a feed-forward neural net; they prefer small weight initialization instead.
5. LSTMs are prone to overfitting, and it is difficult to apply the dropout algorithm to curb this issue. Dropout is a regularization method where input and recurrent connections to LSTM units are probabilistically excluded from activation and weight updates while training a network.
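The dropout mechanism mentioned in drawback 5 can be sketched in NumPy. This is the standard "inverted dropout" formulation (an assumption of this sketch, since the slide does not specify a variant): units are zeroed with a given probability and the survivors are rescaled so the expected activation is unchanged:

```python
import numpy as np

rng = np.random.default_rng(3)

def dropout(x, rate):
    """Inverted dropout: zero units with probability `rate`, rescale the rest."""
    keep = rng.random(x.shape) >= rate
    return np.where(keep, x / (1.0 - rate), 0.0)

h = np.ones(10000)          # a large activation vector, all ones for clarity
out = dropout(h, 0.5)
print(out.mean())           # close to 1.0: expected activation is preserved
```

Applying such a mask to the recurrent connections at every timestep is what disrupts the LSTM's memory, which is why recurrent dropout needs special treatment (e.g. reusing the same mask across timesteps).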
Long Short-Term Memory Networks (LSTMs)

Applications of LSTM Networks
LSTM models need to be trained with a training dataset prior to their employment in real-world applications. Some of the most demanding applications are discussed below:
1. Language modeling, or text generation, involves the computation of words when a sequence of words is fed as input. Language models can operate at the character level, n-gram level, sentence level, or even paragraph level.
2. Image processing involves analyzing a picture and summarizing its content in a sentence. This requires a dataset comprising a good number of pictures with their corresponding descriptive captions. A model that has already been trained is used to predict the features of the images in the dataset; this is the photo data. The dataset is then processed so that only the most suggestive words remain; this is the text data. Using these two types of data, we fit a model whose job is to generate a descriptive sentence for a picture, one word at a time, taking as input both the words it previously predicted and the image.
3. Speech and handwriting recognition.
4. Music generation, which is quite similar to text generation: LSTMs predict musical notes instead of text by analyzing a combination of given notes fed as input.
5. Language translation involves mapping a sequence in one language to a sequence in another. Similar to image processing, a dataset containing phrases and their translations is first cleaned, and only a part of it is used to train the model. An encoder-decoder LSTM model is used, which first converts the input sequence to its vector representation (encoding) and then outputs its translated version (decoding).
Recurrent Neural Networks

• Recurrent neural networks are a powerful and widely used class of neural
network architectures for modeling sequence data.
• The basic idea behind RNN models is that each new element in the
sequence contributes some new information, which updates the
current state of the model.

• RNN models are also based on this notion of chain structure, and vary in
how exactly they maintain and update information. As their name
implies, recurrent neural nets apply some form of “loop.”

Recurrent Neural Networks

• As seen in Figure 2, at some point in time t, the network observes an input
x_t (a word in a sentence) and updates its “state vector” from the previous
h_(t-1) to h_t.
• When we process the next input (the next word), it is done in a manner that
depends on h_t, and thus on the history of the sequence (the previous
words we’ve seen affect our understanding of the current word).

Figure 2. Recurrent neural networks updating with new information received over time.
Recurrent Neural Networks

• As seen in the illustration, this recurrent structure can simply be viewed as
one long unrolled chain, with each node in the chain performing the same
kind of processing “step” based on the “message” it obtains from the
output of the previous node.
• This, of course, is closely related to Markov chain models.
• Mathematically, the basic RNN update is h_t = tanh(W_x·x_t + W_h·h_(t-1) + b),
where tanh is the hyperbolic tangent function, which has its range in [–1, 1]
and is strongly connected to the sigmoid function,
• and x_t and h_t are the input and state vectors as defined previously.
• Finally, the hidden state vector is multiplied by another set of weights,
yielding the outputs that appear in Figure 2.
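The update rule described above can be sketched in a few lines of NumPy. This is a minimal illustration with made-up dimensions; the weights here are randomly initialized, not trained:

```python
import numpy as np

rng = np.random.default_rng(0)

input_dim, hidden_dim, output_dim = 8, 16, 4   # illustrative sizes

# Randomly initialized (untrained) parameters
W_x = rng.normal(scale=0.1, size=(hidden_dim, input_dim))
W_h = rng.normal(scale=0.1, size=(hidden_dim, hidden_dim))
b = np.zeros(hidden_dim)
W_y = rng.normal(scale=0.1, size=(output_dim, hidden_dim))

def rnn_step(x_t, h_prev):
    """One RNN time step: h_t = tanh(W_x x_t + W_h h_{t-1} + b)."""
    h_t = np.tanh(W_x @ x_t + W_h @ h_prev + b)
    y_t = W_y @ h_t  # output read off the hidden state
    return h_t, y_t

# Run a sequence of 5 random "word vectors" through the recurrence;
# the same weights are reused at every step, and h carries the history.
h = np.zeros(hidden_dim)
for x in rng.normal(size=(5, input_dim)):
    h, y = rnn_step(x, h)

print(h.shape, y.shape)  # (16,) (4,)
```

Each call consumes one element of the sequence, and the state vector h carries information from all previous steps forward.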

RNN

RNNs are widely used in NLP; Google Translate is a well-known example of a
system that was built on this kind of sequential model.
RNN

• A recurrent neural network (RNN) is a variation of a basic neural network.
• RNNs are good for processing sequential data such as natural language and audio signals.
• They had, until recently, suffered from short-term memory problems.
• In this section we will explain:
(1) what an RNN is,
(2) the vanishing gradient problem, and
(3) the solutions to this problem, known as long short-term memory (LSTM) and
gated recurrent units (GRU).

RNN
What is an RNN?
First, let's cover the basic neural network architecture. Neural nets are trained
with three basic steps:

(1) A forward pass that makes a prediction.
(2) A comparison of the prediction to the ground truth using a loss function.
The loss function outputs an error value.
(3) Using that error value, perform backpropagation, which calculates the
gradients for each node in the network.
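The three steps above can be illustrated with a toy example: fitting a one-weight linear model in NumPy with a manually derived gradient. The data here is hypothetical, and real frameworks compute gradients automatically; this is just a sketch of the training loop:

```python
import numpy as np

# Hypothetical data: y = 3x plus a little noise
rng = np.random.default_rng(1)
x = rng.normal(size=100)
y = 3.0 * x + 0.1 * rng.normal(size=100)

w = 0.0   # single trainable weight
lr = 0.1  # learning rate

for step in range(50):
    pred = w * x                         # (1) forward pass: make a prediction
    loss = np.mean((pred - y) ** 2)      # (2) compare to ground truth (MSE loss)
    grad = np.mean(2 * (pred - y) * x)   # (3) backprop: d(loss)/dw via the chain rule
    w -= lr * grad                       # gradient-descent update

print(round(w, 2))  # close to 3.0
```

After a few dozen updates the weight settles near the true slope of 3, which is the whole training process in miniature.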

RNN

In contrast, an RNN contains a hidden state that feeds it information
from previous states:

• The concept of a hidden state amounts to integrating sequential data in
order to make a more accurate prediction.
• Consider how hard it is to predict the motion of a ball if all your data
is a single still shot of the ball in motion:

RNN

With no sequence information, it is impossible to predict where it is
moving; in contrast, if you know the previous locations, predictions become
much more accurate. The same logic applies to estimating the next word in a
sentence, or the next piece of audio in a song. This information is the hidden
state, which is a representation of previous inputs.
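The ball intuition can be made concrete with a tiny sketch (hypothetical coordinates): from two previous positions we can estimate a velocity and extrapolate, which a single snapshot cannot provide:

```python
# Hypothetical 2D positions of the ball at two consecutive frames
p_prev = (1.0, 4.0)
p_now = (2.0, 4.5)

# Velocity estimated from the last two observations (the "hidden state"
# here is simply the memory of where the ball was before)
v = (p_now[0] - p_prev[0], p_now[1] - p_prev[1])

# Linear extrapolation to the next frame
p_next = (p_now[0] + v[0], p_now[1] + v[1])
print(p_next)  # (3.0, 5.0)
```

An RNN's hidden state plays the role of `p_prev` here: a compact summary of the past that makes the next prediction possible.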

RNN

Vanishing Gradient Problem

• However, this becomes problematic. To train an RNN, you use an application of
back-propagation called back-propagation through time (BPTT).
• Since the weights at each layer are tuned via the chain rule, their gradient
values shrink exponentially as the error propagates back through each time step,
eventually “vanishing”:
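The effect is easy to demonstrate numerically: the chain rule multiplies one per-step derivative for every time step, so if each factor has magnitude below 1, the product shrinks exponentially. The factor of 0.5 below is illustrative, not taken from a real network:

```python
# Chain rule across T time steps multiplies T per-step derivatives.
# If each factor is, say, 0.5, the gradient decays as 0.5 ** T.
factor = 0.5
for T in (1, 5, 10, 20):
    grad = factor ** T
    print(T, grad)
# By T = 20 the gradient is below 1e-6: effectively "vanished"
```

The same product explodes instead of vanishing when the per-step factor is above 1, which is the mirror-image exploding gradient problem.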

RNN

To illustrate this phenomenon in an NLP application:

Vanishing gradient: by output 5, the information from “What” and
“time” has all but disappeared. How well do you think you can predict
what comes after “is” and “it” without it?

LSTM
LSTM and GRU as solutions
• LSTMs and GRUs were created as a solution to the vanishing gradient
problem. They have internal mechanisms called gates that can regulate
the flow of information.
• For the LSTM, there is a main cell state, or “conveyor belt”, and several
gates that control whether new information can pass onto the belt.

For example, suppose we want to determine the gender of the speaker in a new
sentence. We would have to selectively forget certain things about the
previous states (say, who Bob is and whether he likes apples) and remember
other things (that Alice is a woman and that she likes oranges).

LSTM

Zooming in, the gates in an LSTM carry this out as a four-step process:

(1) Decide what to forget (from the state)
(2) Decide what to remember (in the state)
(3) Perform the actual “forgetting” and update of the state
(4) Produce the output
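These gate steps can be sketched as a single LSTM time step in NumPy. This is a minimal, untrained illustration: the dimensions and weight values are made up, and each gate acts on the concatenation of the previous hidden state and the current input:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(0)
n_in, n_hid = 4, 8
# One weight matrix and bias per gate ('f'orget, 'i'nput, 'o'utput, 'c'andidate)
W = {g: rng.normal(scale=0.1, size=(n_hid, n_hid + n_in)) for g in "fioc"}
b = {g: np.zeros(n_hid) for g in "fioc"}

def lstm_step(x_t, h_prev, c_prev):
    z = np.concatenate([h_prev, x_t])
    f = sigmoid(W["f"] @ z + b["f"])        # (1) forget gate: what to drop from the cell state
    i = sigmoid(W["i"] @ z + b["i"])        # (2) input gate: what to remember
    c_cand = np.tanh(W["c"] @ z + b["c"])   #     candidate new content
    c = f * c_prev + i * c_cand             # (3) forget and update the cell state
    o = sigmoid(W["o"] @ z + b["o"])        # (4) output gate
    h = o * np.tanh(c)                      #     produce the output / next hidden state
    return h, c

h, c = lstm_step(rng.normal(size=n_in), np.zeros(n_hid), np.zeros(n_hid))
print(h.shape, c.shape)  # (8,) (8,)
```

The key design point is the additive update of the cell state `c`: gradients can flow along the "conveyor belt" without being repeatedly squashed, which is what counters the vanishing gradient.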

GRU

To wrap up, in an LSTM:

• the forget gate (1) decides what is relevant to keep from prior steps;
• the input gate (2) decides what information is relevant to add from the
current step;
• the output gate (3) determines what the next hidden state should be.
The GRU, a newer generation of recurrent unit, is quite similar to the LSTM,
except that GRUs get rid of the cell state and use the hidden state to transfer
information. A GRU also has only two gates: a reset gate and an update gate.

GRU
(1) The update gate acts similarly to the combined forget and input gates of an
LSTM: it decides what information to keep, what to throw away, and what new
information to add.
The candidate hidden state is computed much as in the LSTM; the difference is
that, inside the candidate computation, parts of the previous hidden state are
scaled down by the reset gate (a vector of values between 0 and 1, produced by
a linear transformation followed by a sigmoid).
(2) The reset gate is used to decide how much of the past information to
forget.

GRU

• GRU (Gated Recurrent Unit) is a type of RNN that is specifically designed to address the
vanishing gradient problem.
• GRUs are similar to LSTMs in that they use gates to control the flow of information
through the network. However, GRUs do not have an output gate or a separate cell state,
which makes them simpler and more efficient than LSTMs.
• GRUs address the vanishing gradient problem by using a reset gate and an update gate.
The reset gate determines how much of the previous hidden state is suppressed at each time
step. The update gate determines how much of the new candidate state is mixed into the
hidden state.
• The update gate is particularly important for preventing the vanishing gradient problem.
When it keeps most of the previous hidden state, information, and the gradient that flows
along with it, can pass through many time steps almost unchanged. This allows the GRU to
learn long-term dependencies without the gradients becoming too small.
• In addition to the sigmoid-activated gates, GRUs use a tanh activation for the candidate
hidden state. Tanh has a range of [-1, 1], which keeps the state values bounded and helps
prevent the gradients from becoming too large or too small, both of which can contribute
to unstable training.
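For comparison with the LSTM, one GRU time step can be sketched in NumPy. Again this is a minimal, untrained illustration with made-up dimensions:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(0)
n_in, n_hid = 4, 8
# One weight matrix and bias per component: update 'z', reset 'r', candidate 'h'
W = {g: rng.normal(scale=0.1, size=(n_hid, n_hid + n_in)) for g in "zrh"}
b = {g: np.zeros(n_hid) for g in "zrh"}

def gru_step(x_t, h_prev):
    z_in = np.concatenate([h_prev, x_t])
    z = sigmoid(W["z"] @ z_in + b["z"])   # update gate: how much old state to keep
    r = sigmoid(W["r"] @ z_in + b["r"])   # reset gate: how much of the past to forget
    # Candidate state: tanh applied after the reset gate scales h_prev
    h_cand = np.tanh(W["h"] @ np.concatenate([r * h_prev, x_t]) + b["h"])
    return z * h_prev + (1 - z) * h_cand  # interpolate old state and candidate

h = gru_step(rng.normal(size=n_in), np.zeros(n_hid))
print(h.shape)  # (8,)
```

Note there is no separate cell state and no output gate: the interpolation in the last line is what lets the previous hidden state pass through nearly unchanged when the update gate is close to 1.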

Transformer

https://towardsdatascience.com/transformers-141e32e69591

https://medium.com/inside-machine-learning/what-is-a-transformer-d07dd1fbec04

• Transformers are a type of neural network architecture that has been
gaining popularity.
• Transformers were recently used by OpenAI in their language models, and
were also used by DeepMind for AlphaStar, their program that defeated a
top professional StarCraft player.

Transformer

• Transformers were developed to solve the problem of
sequence transduction, or neural machine translation: any
task that transforms an input sequence into an output
sequence. This includes speech recognition, text-to-speech
transformation, and so on.

Figure 1. Sequence transduction: the input is represented in green, the model
in blue, and the output in purple.

• For models to perform sequence transduction, it is necessary to have
some sort of memory. For example, let's say that we are translating the
following sentence into another language (French):

Transformer

“The Transformers” are a Japanese [[hardcore punk]] band. The band was formed
in 1968, during the height of Japanese music history.”

• In this example, the word “the band” in the second sentence refers to the band
“The Transformers” introduced in the first sentence.
• When you read about the band in the second sentence, you know that it is
referring to “The Transformers”. That may be important for translation.
• There are many examples where words in a sentence refer to words in
previous sentences.
• For translating sentences like that, a model needs to figure out these sorts of
dependencies and connections.
• Recurrent Neural Networks (RNNs) and Convolutional Neural Networks
(CNNs) have been used to deal with this problem because of their properties.
• Let's go over these two architectures and their drawbacks.

Transformer

Recurrent Neural Networks


Recurrent Neural Networks have loops in them, allowing information to
persist.

The input is represented as x_t.

In the figure above, we see part of the neural network, A, processing some
input x_t and outputting h_t. A loop allows information to be passed from one
step to the next.

Transformer
• The loops can be thought of in a different way: a Recurrent Neural
Network can be seen as multiple copies of the same network, A,
each copy passing a message to a successor.
• Consider what happens if we unroll the loop:

An unrolled recurrent neural network


This chain-like nature shows that recurrent neural networks are clearly related to
sequences and lists. In that way, if we want to translate some text, we can set each
input as the word in that text. The Recurrent Neural Network passes the information of
the previous words to the next network that can use and process that information.

Transformer

• The following picture shows how a sequence-to-sequence model typically
works using Recurrent Neural Networks.
• Each word is processed separately, and the resulting sentence is generated
by passing a hidden state to the decoding stage, which then generates the
output.

The problem of long-term dependencies

Consider a language model that is trying to predict the next word based on
the previous ones. If we are trying to complete the sentence “the clouds are
in the …”, we don't need further context: it's pretty obvious that the next
word is going to be “sky”.

Transformer

• In cases like this, where the gap between the relevant information and
the place where it is needed is small, RNNs can learn to use past information
and figure out the next word for this sentence.

• But there are cases where we need more context. For example, let's
say that you are trying to predict the last word of the text: “I grew up
in France… I speak fluent …”.
• Recent information suggests that the next word is probably a
language, but if we want to narrow down which language, we need the
context of France, which is further back in the text.

Transformer

• RNNs become very ineffective when the gap between the relevant information
and the point where it is needed becomes very large. That is because the
information is passed along at each step, and the longer the chain, the more
probable it is that the information is lost along the way.

Transformer

In theory, RNNs could learn these long-term dependencies. In practice, they
don't seem to learn them. LSTM, a special type of RNN, tries to solve this kind
of problem.
Transformers
• To address these problems, together with the lack of parallelization in
recurrent models, Transformers use encoders and decoders combined with
attention models.
• Attention boosts the speed at which the model can translate from one
sequence to another.
• Let's take a look at how the Transformer works. The Transformer is a model
that uses attention to boost speed; more specifically, it uses self-attention.

The Transformer.

Transformer
Internally, the Transformer has a similar kind of architecture to the previous
models above, but it consists of six encoders and six decoders.

Transformer

• Each encoder is very similar to the others: all encoders have the same
architecture. Decoders share the same property, i.e. they are also very
similar to each other.
• Each encoder consists of two layers: self-attention and a feed-forward
neural network.

Transformer
• The encoder’s inputs first flow through a self-attention layer. It helps the
encoder look at other words in the input sentence as it encodes a specific
word.
• The decoder has both those layers, but between them is an attention layer
that helps the decoder focus on relevant parts of the input sentence.

Transformer

Self-Attention
Let's start by looking at the various vectors/tensors and how they flow between
these components to turn the input of a trained model into an output. As is the
case in NLP applications in general, we begin by turning each input word into a
vector using an embedding algorithm.

Each word is embedded into a vector of size 512. We'll represent those vectors
with simple boxes.
The embedding only happens in the bottom-most encoder. The abstraction that
is common to all the encoders is that they receive a list of vectors, each of
size 512.

Transformer
In the bottom encoder that would be the word embeddings, but in other
encoders, it would be the output of the encoder that’s directly below. After
embedding the words in our input sequence, each of them flows through each
of the two layers of the encoder.

Transformer
• Here we begin to see one key property of the Transformer, which is that
the word in each position flows through its own path in the encoder.
• There are dependencies between these paths in the self-attention layer.
• The feed-forward layer does not have those dependencies, however, and
thus the various paths can be executed in parallel while flowing through
the feed-forward layer.

Next, we’ll switch up the example to a shorter sentence and we’ll look at
what happens in each sub-layer of the encoder.

Self-Attention
Let’s first look at how to calculate self-attention using vectors, then
proceed to look at how it’s actually implemented — using matrices.

Transformer

Figuring out the relations of words within a sentence and giving the right
attention to them.

Transformer

• The first step in calculating self-attention is to create three vectors from each of
the encoder's input vectors (in this case, the embedding of each word). So for
each word, we create a Query vector, a Key vector, and a Value vector. These
vectors are created by multiplying the embedding by three matrices that are
learned during training.
• Notice that these new vectors are smaller in dimension than the embedding vector.
Their dimensionality is 64, while the embedding and encoder input/output vectors
have a dimensionality of 512. They don't HAVE to be smaller; this is an
architecture choice to make the computation of multi-headed attention (mostly)
constant.

Transformer

• Multiplying x1 by the WQ weight matrix produces q1, the “query” vector


associated with that word. We end up creating a “query”, a “key”, and a “value”
projection of each word in the input sentence.
• What are the “query”, “key”, and “value” vectors?
• They’re abstractions that are useful for calculating and thinking about attention.
Once you proceed with reading how attention is calculated below, you’ll know
pretty much all you need to know about the role each of these vectors plays.

Transformer

The second step in calculating self-attention is to calculate a score. Say we’re


calculating the self-attention for the first word in this example, “Thinking”. We
need to score each word of the input sentence against this word. The score
determines how much focus to place on other parts of the input sentence as we
encode a word at a certain position.
• The score is calculated by taking the dot product of the query vector with the
key vector of the respective word we’re scoring. So if we’re processing the
self-attention for the word in position #1, the first score would be the dot
product of q1 and k1. The second score would be the dot product of q1 and
k2.

Transformer

The third and fourth steps are to divide the scores by 8 (the square root of the
dimension of the key vectors used in the paper, 64; this leads to more stable
gradients, and while other values are possible, this is the default), then pass
the result through a softmax operation. Softmax normalizes the scores so
they're all positive and add up to 1.
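The steps described so far (query/key/value projections, dot-product scores, scaling by the square root of the key dimension, softmax, then a weighted sum of the value vectors) can be written compactly in NumPy. The sizes below are toy values chosen for readability, not the 512/64 used in the paper, and the weights are random rather than learned:

```python
import numpy as np

rng = np.random.default_rng(0)
seq_len, d_model, d_k = 3, 16, 8  # toy sizes

X = rng.normal(size=(seq_len, d_model))          # word embeddings
W_Q = rng.normal(scale=0.1, size=(d_model, d_k))
W_K = rng.normal(scale=0.1, size=(d_model, d_k))
W_V = rng.normal(scale=0.1, size=(d_model, d_k))

Q, K, V = X @ W_Q, X @ W_K, X @ W_V              # step 1: query/key/value projections

scores = Q @ K.T                                 # step 2: dot-product scores
scores /= np.sqrt(d_k)                           # step 3: scale by sqrt(d_k)
weights = np.exp(scores)
weights /= weights.sum(axis=-1, keepdims=True)   # step 4: softmax over each row
out = weights @ V                                # weighted sum of value vectors

print(out.shape)             # (3, 8)
print(weights.sum(axis=-1))  # each row sums to 1
```

Each row of `weights` says how much attention one position pays to every position in the sequence, and the whole computation is a handful of matrix multiplications, which is what makes it so easy to parallelize.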

Transformer

For more information click on this link

https://towardsdatascience.com/transformers-141e32e69591

DAILY QUIZ

Weekly Assignment

TOPIC LINKS

Suggested video link

Testing of hypothesis
https://nptel.ac.in/content/storage2/courses/103106120/LectureNotes/Lec3_1.pdf

Links:
http://www.numpy.org/
https://www.scipy.org/scipylib/
http://pandas.pydata.org/

MCQ

1. Python has a built-in package called?


A. reg
B. regex
C. re
D. regx
2. Which function returns a list containing all matches?
A. findall
B. search
C. split
D. find
3. Which character stand for Starts with in regex?
A. &
B. ^
C. $
D. #

MCQ

4. In Regex, [a-n] stands for?


A. Returns a match for any digit between 0 and 9
B. Returns a match for any lower case character, alphabetically between a and n
C. Returns a match for any two-digit numbers from 00 and 59
D. Returns a match for any character EXCEPT a, r, and n

5. The expression a{5} will match _____________ characters with the previous
regular expression.
A. 5 or less
B. exactly 5
C. 5 or more
D. exactly 4

MCQ
6 A statement about a population developed for the purpose of testing is called:
(a) Hypothesis
(b) Hypothesis testing
(c) Level of significance
(d) Test-statistic

7. Which of the following statements about the P value do you believe to be true?
a)The P value is the probability that the null hypothesis is true.
b)The P value is the probability that the alternative hypothesis is true.
c)The P value is the probability of obtaining the observed or more extreme results if the
alternative hypothesis is true.
d)The P value is the probability of obtaining the observed results or results which are
more extreme if the null hypothesis is true.
e)The P value is always less than 0.05.

8. By taking a level of significance of 5% it is the same as saying


a)We are 5% confident the results have not occurred by chance
b) We are 95% confident that the results have not occurred by chance
c) We are 95% confident that the results have occurred by chance

Old University Question Paper

Not Applicable (First Batch)

Expected Questions
1. What do we mean by data aggregation? [CO1]
2. What are some of the essential features provided by Python Pandas?
[CO1]
3. What is the reason behind importing Pandas library in Python? [CO1]
4. What Is Groupby Function In Pandas? [CO1]
5. What are the different ways of creating Dataframe In Pandas. [CO1]
6. How to Delete Indices, Rows or Columns From a Pandas Data Frame?
[CO1]
7. Write a program for changing the dimension of a NumPy array. [CO1]
8. How do you convert Pandas DataFrame to a NumPy array? [CO1]
9. What is the difference between indexing and slicing in NumPy? [CO1]
10. Can you create a plot in NumPy? [CO1]

Recap of Unit

Hence, we observe that NumPy and Pandas make matrix manipulation easy.
This flexibility makes them very useful in Machine Learning model development.

References

Text Books:

(1) Glenn J. Myatt, Making Sense of Data: A Practical Guide to Exploratory Data Analysis
and Data Mining, John Wiley Publishers, 2007.
(2) Tom Hope, Yehezkel S. Resheff, Itay Lieder, Learning TensorFlow, O'Reilly Media, Inc.
(3) Advanced Deep Learning with TensorFlow 2 and Keras: Apply DL, GANs, VAEs, deep RL,
unsupervised learning, object detection and segmentation, and more, 2nd Edition.
Reference Books:
(4) Boris Lublinsky, Kevin T. Smith, Alexey Yakubovich, Professional Hadoop Solutions,
1st Edition, Wrox, 2013.
(5) Chris Eaton, Dirk Deroos et al., Understanding Big Data, Indian Edition, McGraw
Hill, 2015.
(6) Tom White, HADOOP: The Definitive Guide, 3rd Edition, O'Reilly, 2012.

Thank You
