DL_4
Unit: 4
Dr. Kumod Kumar Gupta, Ph.D. from Banasthali University, Jaipur. Presently,
he is working as an Assistant Professor at NIET, Greater Noida (INDIA). He has
more than 16 years of teaching & research experience. His research interests
include VLSI Design, Machine Learning and Artificial Neural Network. He
has published more than 17 papers in refereed international journals. He has
attended various training programs in the area of electronics & communication
engineering.
CO–PO Mapping:
CO.1 3 3 3 2 1 2 3
CO.2 3 3 3 2 1 2 3
CO.3 3 3 3 2 1 2 3
CO.4 3 3 3 2 1 2 3
CO.5 3 3 3 2 1 2 3

CO–PSO Mapping:
     PSO1 PSO2
CO.1  2    3
CO.2  2    3
CO.3  2    3
CO.4  2    3
CO.5  2    3
Unit 2 https://www.youtube.com/watch?v=C5R5SdYzQBI
Unit 3 https://hevodata.com/learn/data-engineering-and-data-engineers/
Unit 4 https://www.youtube.com/watch?v=IjEZmH7byZQ
Unit 5 https://www.youtube.com/watch?v=pWp3PhYI-OU
Learn what data analytics is and its various applications. Study the different
types of data analytics and the steps of the analytics process, and perform data
analytics using Python’s NumPy, Pandas, and Matplotlib libraries.
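As a minimal, illustrative sketch of this workflow (the file name sales.csv and its columns are hypothetical placeholders, not part of the course material):

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

# Load a dataset into a DataFrame and inspect it
df = pd.read_csv("sales.csv")        # hypothetical file
print(df.describe())                 # summary statistics (count, mean, std, ...)

# Simple NumPy computation on one column
revenue = df["revenue"].to_numpy()   # hypothetical column
print("Mean revenue:", np.mean(revenue))

# Visualise the column with Matplotlib
plt.hist(revenue, bins=20)
plt.xlabel("Revenue")
plt.ylabel("Frequency")
plt.title("Revenue distribution")
plt.show()
```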
Several neural networks can help solve different business problems. Let’s
look at a few of them.
• Feed-Forward Neural Network: used for general regression and classification problems.
• Convolutional Neural Network: used for object detection and image classification.
• Deep Belief Network: used in healthcare sectors, e.g. for cancer detection.
• RNN (Recurrent Neural Network): used for speech recognition, voice recognition, time-series prediction, and natural language processing.
In a recurrent neural network, the nodes in the different layers of the network
are compressed to form a single recurrent layer. In the usual unrolled diagram,
A, B, and C denote the shared parameters of the network.
• The input layer ‘x’ takes the input to the neural network, processes it, and
passes it on to the middle layer.
• The middle layer ‘h’ can consist of multiple hidden layers, each with its own
activation functions, weights, and biases. In a plain feed-forward network these
hidden layers are independent of one another, so the network has no memory;
when the task requires memory of previous inputs, a recurrent neural network
can be used instead.
• The recurrent neural network standardizes the activation functions, weights,
and biases so that each hidden layer has the same parameters. Then, instead of
creating multiple hidden layers, it creates one and loops over it as many times
as required (see the sketch below).
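To make the shared-parameter loop concrete, here is a minimal NumPy sketch of a vanilla RNN step applied repeatedly over a sequence; the dimensions and random initialisation are illustrative assumptions, not values from the notes.

```python
import numpy as np

# Minimal vanilla RNN: one hidden layer with shared weights, looped over time.
input_size, hidden_size, seq_len = 8, 16, 5
rng = np.random.default_rng(0)

W_xh = rng.normal(scale=0.1, size=(hidden_size, input_size))   # input -> hidden
W_hh = rng.normal(scale=0.1, size=(hidden_size, hidden_size))  # hidden -> hidden (the "loop")
b_h  = np.zeros(hidden_size)

x_seq = rng.normal(size=(seq_len, input_size))  # dummy input sequence
h = np.zeros(hidden_size)                       # initial hidden state

for x_t in x_seq:
    # The same weights are reused at every time step; h carries the "memory".
    h = np.tanh(W_xh @ x_t + W_hh @ h + b_h)

print("final hidden state:", h[:4], "...")
```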
Machine Translation
Given an input in one language, RNNs can be used to translate the input into
different languages as output.
Now, let’s discuss the most popular and efficient way to deal with gradient
problems, i.e., Long Short-Term Memory Network (LSTMs).
First, let’s understand Long-Term Dependencies.
Suppose you want to predict the last word in the text: “The clouds are in the
______.”
The most obvious answer to this is the “sky.” We do not need any further
context to predict the last word in the above sentence.
Consider this sentence: “I have been staying in Spain for the last 10 years…I
can speak fluent ______.”
The word you predict will depend on the previous few words in context. Here,
you need the context of Spain to predict the last word in the text, and the most
suitable answer to this sentence is “Spanish.” The gap between the relevant
information and the point where it's needed may have become very large.
LSTMs help you solve this problem.
• LSTM (Long Short-Term Memory) is a type of recurrent neural network (RNN) that is
specifically designed to address the vanishing gradient problem.
• The vanishing gradient problem occurs in RNNs when the gradients of the loss
function with respect to the model parameters become very small or even zero, as the
network iterates over long sequences of data. This can make it difficult for the network
to learn long-term dependencies between inputs and outputs.
• LSTMs solve the vanishing gradient problem by using three gates to control the flow
of information through the network:
• The forget gate determines how much of the previous cell state is forgotten at each
time step.
•The input gate determines how much of the current input is added to the cell state.
•The output gate determines how much of the cell state is output to the next time step.
The forget gate is particularly important for preventing the vanishing gradient problem.
It allows the network to selectively discard information from the previous cell state
while retaining the parts that matter for long-term dependencies. This lets the LSTM
learn long-term dependencies without the gradients becoming too small.
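For reference, the standard LSTM update equations are sketched below. The notation is an assumption on our part (the notes do not fix one); σ denotes the sigmoid function and ⊙ element-wise multiplication.

```latex
\begin{aligned}
f_t &= \sigma(W_f\,[h_{t-1}, x_t] + b_f) && \text{forget gate} \\
i_t &= \sigma(W_i\,[h_{t-1}, x_t] + b_i) && \text{input gate} \\
\tilde{c}_t &= \tanh(W_c\,[h_{t-1}, x_t] + b_c) && \text{candidate cell state} \\
c_t &= f_t \odot c_{t-1} + i_t \odot \tilde{c}_t && \text{new cell state} \\
o_t &= \sigma(W_o\,[h_{t-1}, x_t] + b_o) && \text{output gate} \\
h_t &= o_t \odot \tanh(c_t) && \text{new hidden state / output}
\end{aligned}
```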
• The forget, input, and output gates all use a sigmoid activation function.
• Sigmoid activation functions have a range of [0, 1], which means that they can be
used to control the flow of information in a more gradual way than other activation
functions, such as the rectified linear unit (ReLU) activation function.
• This helps to prevent the gradients from becoming too large or too small, which can
also contribute to the vanishing gradient problem.
• As a result of these design features, LSTMs are able to learn long-term dependencies
much more effectively than RNNs without these features.
• This makes them well-suited for tasks that require the network to remember
information over long periods of time, such as machine translation, speech
recognition, and natural language processing.
• However, it is important to note that LSTMs do not completely solve the
vanishing gradient problem. In some cases the gradients in an LSTM can still
become very small, which makes learning difficult. Even so, LSTMs are much more
resistant to the vanishing gradient problem than plain RNNs, and the remaining
difficulty can be further addressed with GRUs (discussed later).
https://towardsdatascience.com/understanding-rnns-lstms-and-grus-ed62eb584d90
[Figure: LSTM cell showing the forget gate, input gate, and output gate]
LSTMs also have a chain-like structure, but the repeating module has a
different structure: instead of a single neural network layer, there are four
layers interacting in a special way.
Forget Gate
• The information that is no longer useful in the cell state is removed with the
forget gate.
• Two inputs x_t (input at the particular time) and h_t-1 (previous cell output) are
fed to the gate and multiplied with weight matrices followed by the addition of
bias.
• The result is passed through a sigmoid activation function, which gives an
output between 0 and 1. If the output for a particular element of the cell state
is (close to) 0, that piece of information is forgotten; if it is (close to) 1,
the information is retained for future use.
Step 2: Decide How Much This Unit Adds to the Current State
The second layer has two parts. The sigmoid part decides which values to let
through (values between 0 and 1), and the tanh part assigns a weight to the
values that are passed, deciding their level of importance (between -1 and 1).
With the current input at x(t), the input gate analyzes the important
information — John plays football, and the fact that he was the captain of his
college team is important.
“He told me yesterday over the phone” is less important; hence it's forgotten.
This process of adding some new information can be done via the input gate.
Step 3: Decide What Part of the Current Cell State Makes It to the
Output
The third step is to decide what the output will be. First, we run a sigmoid
layer, which decides what parts of the cell state make it to the output. Then,
we put the cell state through tanh to push the values to be between -1 and 1
and multiply it by the output of the sigmoid gate.
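A minimal NumPy sketch of a single LSTM step, following the three steps above; all dimensions and the random initialisation are illustrative assumptions.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

hidden_size, input_size = 16, 8
rng = np.random.default_rng(0)

W_f, W_i, W_c, W_o = (rng.normal(scale=0.1, size=(hidden_size, hidden_size + input_size))
                      for _ in range(4))
b_f, b_i, b_c, b_o = (np.zeros(hidden_size) for _ in range(4))

def lstm_step(x_t, h_prev, c_prev):
    z = np.concatenate([h_prev, x_t])     # [h_{t-1}, x_t]
    f = sigmoid(W_f @ z + b_f)            # step 1: forget gate
    i = sigmoid(W_i @ z + b_i)            # step 2: input gate (sigmoid part)
    c_tilde = np.tanh(W_c @ z + b_c)      # step 2: candidate values (tanh part)
    c = f * c_prev + i * c_tilde          # updated cell state
    o = sigmoid(W_o @ z + b_o)            # step 3: output gate
    h = o * np.tanh(c)                    # hidden state / output for this step
    return h, c

h = np.zeros(hidden_size)
c = np.zeros(hidden_size)
x_t = rng.normal(size=input_size)
h, c = lstm_step(x_t, h, c)
print(h.shape, c.shape)                   # (16,) (16,)
```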
Let’s consider this example to predict the next word in the sentence: “John
played tremendously well against the opponent and won for his team. For
his contributions, brave ____ was awarded player of the match.”
There could be many choices for the empty space. The current input brave
is an adjective, and adjectives describe a noun. So, “John” could be the best
output after brave.
• Recurrent neural networks are a powerful and widely used class of neural
network architectures for modeling sequence data.
• The basic idea behind RNN models is that each new element in the
sequence contributes some new information, which updates the
current state of the model.
• RNN models are also based on this notion of chain structure, and vary in
how exactly they maintain and update information. As their name
implies, recurrent neural nets apply some form of “loop.”
• The new state is computed as h_t = tanh(W_x·x_t + W_h·h_{t-1} + b), where tanh(·)
is the hyperbolic tangent function, which has its range in [–1, 1] and is strongly
connected to the sigmoid function, and x_t and h_t are the input and state vectors
as defined previously.
• Finally, the hidden state vector is multiplied by another set of weights,
yielding the outputs of the network.
Vanishing Gradient
In a sequence such as “What time is it?”, you can see that by the fifth output
the information from “What” and “time” has all but disappeared. How well do you
think the network can predict what comes after “is” and “it” without it?
In this kind of problem, suppose we want to determine the gender of the speaker
in a new sentence. We would have to selectively forget certain things about the
previous states (for example, who Bob is and whether he likes apples) and
remember other things, such as that Alice is a woman and that she likes oranges.
• GRU (Gated Recurrent Unit) is a type of RNN that is specifically designed to address the
vanishing gradient problem.
• GRUs are similar to LSTMs in that they use gates to control the flow of information
through the network. However, GRUs do not have an output gate, which makes them
simpler and more efficient than LSTMs.
• GRUs solve the vanishing gradient problem by using a reset gate and an update gate.
The reset gate determines how much of the previous hidden state is reset at each time
step. The update gate determines how much of the current input is added to the hidden state.
• The reset gate is particularly important for preventing the vanishing gradient problem.
It can selectively reset information from the previous hidden state while retaining the
parts that matter for long-term dependencies. This allows the GRU to learn long-term
dependencies without the gradients becoming too small.
• In addition to the reset and update gates, which use sigmoid activations, GRUs use
a tanh activation function for the candidate hidden state. Tanh activation functions
have a range of [-1, 1], so the candidate values stay in a bounded range and the flow
of information can be controlled in a gradual way. This helps to prevent the gradients
from becoming too large or too small, which can also contribute to the vanishing
gradient problem. A sketch of a single GRU step is given below.
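A minimal NumPy sketch of one GRU step in the standard formulation (reset and update gates use sigmoid; the candidate hidden state uses tanh); dimensions and initialisation are illustrative assumptions.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

hidden_size, input_size = 16, 8
rng = np.random.default_rng(0)
W_z, W_r, W_h = (rng.normal(scale=0.1, size=(hidden_size, hidden_size + input_size))
                 for _ in range(3))

def gru_step(x_t, h_prev):
    z_in = np.concatenate([h_prev, x_t])
    z = sigmoid(W_z @ z_in)                                      # update gate
    r = sigmoid(W_r @ z_in)                                      # reset gate
    h_tilde = np.tanh(W_h @ np.concatenate([r * h_prev, x_t]))   # candidate state
    return (1 - z) * h_prev + z * h_tilde                        # new hidden state

h = np.zeros(hidden_size)
h = gru_step(rng.normal(size=input_size), h)
print(h[:4])
```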
https://towardsdatascience.com/transformers-141e32e69591
https://medium.com/inside-machine-learning/what-is-a-transformer-d07dd1fbec04
• In this example, the word “the band” in the second sentence refers to the band
“The Transformers” introduced in the first sentence.
• When you read about the band in the second sentence, you know that it is
referring to the “The Transformers” band. That may be important for
translation.
• There are many examples where words in one sentence refer to words in
previous sentences.
• To translate sentences like that, a model needs to figure out this sort of
dependency and connection.
• Recurrent Neural Networks (RNNs) and Convolutional Neural Networks
(CNNs) have been used to deal with this problem because of their properties.
• Let’s go over these two architectures and their drawbacks.
• In cases where the gap between the relevant information and the place where
it is needed is small, RNNs can learn to use past information and figure out
the next word for the sentence.
• But there are cases where we need more context. For example, let’s
say that you are trying to predict the last word of the text: “I grew up
in France… I speak fluent …”.
• The recent information suggests that the next word is probably a
language, but if we want to narrow down which language, we need the
context of France, which appears further back in the text.
• RNNs become very ineffective when the gap between the relevant information
and the point where it is needed becomes very large. This is because the
information is passed along at each step, and the longer the chain is, the more
likely it is that the information is lost along the way.
In theory, RNNs could learn these long-term dependencies. In practice, they do
not seem to learn them. LSTM, a special type of RNN, tries to solve this kind
of problem.
Transformers
• To address the lack of parallelization in recurrent models, Transformers use
encoders and decoders together with attention mechanisms.
• Attention boosts the speed at which the model can translate from one
sequence to another.
• Let’s take a look at how the Transformer works. The Transformer is a model
that uses attention to boost this speed; more specifically, it uses self-attention.
The Transformer.
• The encoders are all very similar to each other; all encoders have the same
architecture. The decoders share the same property, i.e. they are also very
similar to each other.
• Each encoder consists of two layers: a self-attention layer and a feed-forward
neural network.
Self-Attention
Let’s start to look at the various vectors/tensors and how they flow between
these components to turn the input of a trained model into an output. As is the
case in NLP applications in general, we begin by turning each input word into a
vector using an embedding algorithm.
Each word is embedded into a vector of size 512. We’ll represent those vectors
with these simple boxes.
The embedding only happens in the bottom-most encoder. The abstraction that
is common to all the encoders is that they receive a list of vectors each of the
size 512.
Next, we’ll switch up the example to a shorter sentence and we’ll look at
what happens in each sub-layer of the encoder.
Self-Attention
Let’s first look at how to calculate self-attention using vectors, then
proceed to look at how it’s actually implemented — using matrices.
• The first step in calculating self-attention is to create three vectors from each of
the encoder’s input vectors (in this case, the embedding of each word). So for
each word, we create a Query vector, a Key vector, and a Value vector. These
vectors are created by multiplying the embedding by three matrices that we
trained during the training process.
• Notice that these new vectors are smaller in dimension than the embedding vector.
Their dimensionality is 64, while the embedding and encoder input/output vectors
have dimensionality of 512. They don’t HAVE to be smaller, this is an
architecture choice to make the computation of multiheaded attention (mostly)
constant.
The second step is to calculate a score for each word by taking the dot product of
its query vector with the key vector of every word in the sentence. The third and
fourth steps are to divide the scores by 8 (the square root of the dimension of the
key vectors used in the paper, 64, which leads to more stable gradients; other values
are possible, but this is the default) and then pass the result through a softmax
operation. Softmax normalizes the scores so they are all positive and add up to 1.
These steps are sketched in code below.
https://towardsdatascience.com/transformers-141e32e69591
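A minimal NumPy sketch of single-head self-attention for a toy three-word sentence. The dimensions d_model = 512 and d_k = 64 come from the text; the input embeddings and projection matrices are randomly initialised here purely for illustration, and the final weighted sum of value vectors follows the remaining step described in the referenced article.

```python
import numpy as np

d_model, d_k, n_words = 512, 64, 3
rng = np.random.default_rng(0)

X   = rng.normal(size=(n_words, d_model))    # one embedding (row) per word
W_Q = rng.normal(scale=0.05, size=(d_model, d_k))
W_K = rng.normal(scale=0.05, size=(d_model, d_k))
W_V = rng.normal(scale=0.05, size=(d_model, d_k))

Q, K, V = X @ W_Q, X @ W_K, X @ W_V          # step 1: query, key, value vectors

scores = Q @ K.T                             # step 2: dot-product scores
scores = scores / np.sqrt(d_k)               # step 3: divide by sqrt(64) = 8

# step 4: softmax over each row, so the weights are positive and sum to 1
weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
weights = weights / weights.sum(axis=-1, keepdims=True)

Z = weights @ V                              # weighted sum of the value vectors
print(Z.shape)                               # (3, 64): one output vector per word
```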
Testing of hypothesis
https://nptel.ac.in/content/storage2/courses/103106120/LectureNotes/Lec3_1.pdf
Links:
http://www.numpy.org/
https://www.scipy.org/scipylib/
http://pandas.pydata.org/
5. The expression a{5} will match _____________ characters with the previous
regular expression.
A. 5 or less
B. exactly 5
C. 5 or more
D. exactly 4
7. Which of the following statements about the P value do you believe to be true?
a) The P value is the probability that the null hypothesis is true.
b) The P value is the probability that the alternative hypothesis is true.
c) The P value is the probability of obtaining the observed or more extreme results if the alternative hypothesis is true.
d) The P value is the probability of obtaining the observed results, or results which are more extreme, if the null hypothesis is true.
e) The P value is always less than 0.05.
Text Books:
(1) Glenn J. Myatt, Making Sense of Data: A Practical Guide to Exploratory Data Analysis and Data Mining, John Wiley Publishers, 2007.
(2) Tom Hope, Yehezkel S. Resheff, Itay Lieder, Learning TensorFlow, O'Reilly Media, Inc.
(3) Advanced Deep Learning with TensorFlow 2 and Keras: Apply DL, GANs, VAEs, Deep RL, Unsupervised Learning, Object Detection and Segmentation, and More, 2nd Edition.
Reference Books:
(4) Boris Lublinsky, Kevin T. Smith, Alexey Yakubovich, “Professional Hadoop Solutions”, 1st Edition, Wrox, 2013.
(5) Chris Eaton, Dirk Deroos et al., “Understanding Big Data”, Indian Edition, McGraw Hill, 2015.
(6) Tom White, “Hadoop: The Definitive Guide”, 3rd Edition, O'Reilly, 2012.