Unit 3

The document discusses the fundamentals of neural networks and artificial neural networks (ANNs), including their architecture, key concepts, and training methods such as gradient descent and backpropagation. It highlights various applications of neural networks, their advantages and limitations, as well as the challenges of overfitting and the use of recurrent neural networks (RNNs) for sequential data processing. Additionally, it covers techniques to introduce nonlinearity in models and the importance of managing model complexity to improve generalization.

Uploaded by

Shivani Bhagat

Artificial Intelligence for Big Data Mining

Unit III Neural networks for big data

Fundamentals of neural networks and artificial neural networks, perceptron and linear models,
nonlinearities in models, feedforward neural networks, gradient descent and backpropagation, overfitting,
recurrent neural networks

Fundamentals of Neural Networks:

Neural networks are computing systems inspired by the biological neural networks of animal brains. They
are composed of interconnected nodes, called neurons, which process information using a connectionist
approach to computation. Neural networks can adapt and learn from data, making them particularly
useful for tasks such as pattern recognition, classification, and regression.

Artificial Neural Networks (ANNs):

Artificial Neural Networks are computational models designed to loosely simulate the way the human brain
analyzes and processes information. ANNs consist of layers of interconnected nodes, with each node
computing a weighted sum of its inputs and applying an activation function. Information flows through the
network, and the network learns by adjusting the weights of the connections between nodes to reduce the
error it makes on the training data.

Key Concepts:
- Neurons: The basic building blocks of neural networks, neurons receive input, process it, and pass the
output to the next layer.
- Layers: Neurons are organized into layers, including an input layer, one or more hidden layers, and an
output layer.
- Weights and Biases: Each connection between neurons has a weight that determines the strength of the
connection. Biases are additional parameters that allow neurons to have different activation thresholds.
- Activation Function: An activation function determines the output of a neuron based on its input.
Common activation functions include sigmoid, tanh, and ReLU.
- Feedforward and Backpropagation: Feedforward is the process of passing input data through the
network to get an output. Backpropagation computes how much each weight contributed to the error in the
output; an optimizer such as gradient descent then uses these gradients to adjust the weights, allowing
the network to learn from the data.
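The neuron, weight, bias, and activation function concepts above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation; the function names and the input, weight, and bias values are assumptions chosen for the example.

```python
import math


def sigmoid(z):
    # Squashes any real number into the range (0, 1).
    return 1.0 / (1.0 + math.exp(-z))


def tanh(z):
    # Squashes any real number into the range (-1, 1).
    return math.tanh(z)


def relu(z):
    # Passes positive values through and clips negative values to 0.
    return max(0.0, z)


def neuron_output(inputs, weights, bias, activation):
    # Weighted sum of the inputs plus the bias, followed by the activation.
    z = sum(w * x for w, x in zip(weights, inputs)) + bias
    return activation(z)


# Illustrative values only (assumed for this sketch).
inputs = [1.0, 2.0]
weights = [0.5, -0.25]
bias = 0.0

for name, fn in [("sigmoid", sigmoid), ("tanh", tanh), ("relu", relu)]:
    print(name, neuron_output(inputs, weights, bias, fn))
```

Changing the bias shifts the point at which the neuron starts to respond, which is what "different activation thresholds" refers to.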
Applications:
Neural networks have a wide range of applications, including:
- Image and speech recognition
- Natural language processing
- Medical diagnosis
- Financial forecasting
- Autonomous vehicles
Limitations:
- Requires a large amount of data for training
- Prone to overfitting
- Computationally intensive

Perceptron and Linear Models:

The perceptron is a type of artificial neuron that makes binary decisions. It is the simplest form of a
neural network and is based on a linear model. In a perceptron, the inputs are multiplied by weights and
summed, and the weighted sum is compared to a threshold by a step activation function: the output is 1 if
the sum exceeds the threshold and 0 otherwise.

Perceptrons can be arranged in a single layer or stacked into multiple layers (a multilayer perceptron).
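As a rough sketch of the perceptron described above, the following Python code trains a single perceptron on the logical AND function using the classic perceptron learning rule. The learning rule, the AND data set, the learning rate, and the number of epochs are assumptions made for illustration; the notes themselves do not specify a training procedure.

```python
def predict(weights, bias, x):
    # Step activation: the threshold is folded into the bias term,
    # so the output is 1 when the weighted sum plus bias is positive.
    z = sum(w * xi for w, xi in zip(weights, x)) + bias
    return 1 if z > 0 else 0


def train_perceptron(samples, epochs=20, learning_rate=1.0):
    # Start from zero weights and nudge them after every mistake.
    weights = [0.0] * len(samples[0][0])
    bias = 0.0
    for _ in range(epochs):
        for x, target in samples:
            error = target - predict(weights, bias, x)
            weights = [w + learning_rate * error * xi
                       for w, xi in zip(weights, x)]
            bias += learning_rate * error
    return weights, bias


# Truth table of logical AND (a linearly separable problem).
and_gate = [([0, 0], 0), ([0, 1], 0), ([1, 0], 0), ([1, 1], 1)]

weights, bias = train_perceptron(and_gate)
predictions = [predict(weights, bias, x) for x, _ in and_gate]
print(predictions)
```

A single perceptron can only learn problems whose classes can be separated by a straight line (or hyperplane), which is why AND works here while XOR would not.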

Linear models, on the other hand, are mathematical models that assume a linear relationship between
the input variables and the output. They are used for regression and classification tasks. Linear models can
be simple, like linear regression, or more complex, like logistic regression for classification.

Nonlinearities in Models:
Nonlinearities in models refer to the introduction of nonlinearity into the relationship between the input
and output variables. In linear models, the relationship is linear, which means the output is a linear
combination of the input variables. However, in many real-world scenarios, the relationship is more
complex and cannot be captured by a linear model.
To introduce nonlinearity, various techniques can be used, such as:
- Adding polynomial features to the input variables
- Using nonlinear activation functions in neural networks, such as sigmoid, tanh, or ReLU
- Using kernel methods, such as the kernel trick in support vector machines, to map the input variables
into a higher-dimensional space where they can be separated linearly
Adding nonlinearity to models allows them to capture more complex relationships in the data, making
them more flexible and capable of modeling a wider range of phenomena.
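To illustrate the first technique in the list above, the sketch below shows how one added polynomial feature (the product x1 * x2) lets a linear classifier reproduce XOR, which no linear classifier can do on the raw inputs alone. The weights and bias are hand-picked assumptions for the example rather than learned values.

```python
def linear_classifier(features, weights, bias):
    # A plain linear decision rule: weighted sum compared against zero.
    z = sum(w * f for w, f in zip(weights, features)) + bias
    return 1 if z > 0 else 0


def add_product_feature(x1, x2):
    # Polynomial feature expansion: keep x1 and x2, append x1 * x2.
    return [x1, x2, x1 * x2]


# Hand-picked parameters (assumed): z = x1 + x2 - 2*x1*x2 - 0.5
weights = [1.0, 1.0, -2.0]
bias = -0.5

xor_inputs = [(0, 0), (0, 1), (1, 0), (1, 1)]
xor_outputs = [
    linear_classifier(add_product_feature(a, b), weights, bias)
    for a, b in xor_inputs
]
print(xor_outputs)
```

The classifier is still linear in the expanded features; the nonlinearity comes entirely from the feature x1 * x2.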

Feedforward Neural Networks:
Feedforward neural networks, of which the multilayer perceptron (MLP) is the most common example, are a
type of artificial neural network where connections between nodes do not form a cycle. They are called
"feedforward" because information flows in one direction, from the input nodes through the hidden nodes
(if any) to the output nodes.
Architecture:
- Input Layer: The input layer receives the initial data or features.
- Hidden Layers: One or more hidden layers process the inputs using weights that are adjusted during
training.
- Output Layer: The output layer produces the final output, such as class probabilities in a classification
task.

Activation Functions:
Each neuron in a feedforward neural network uses an activation function to introduce nonlinearity into
the model, allowing it to learn complex patterns in the data. Common activation functions include
sigmoid, tanh, and ReLU (Rectified Linear Unit).
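The architecture and activation functions described above can be sketched as a forward pass through a small network with two inputs, three hidden neurons, and two outputs. The layer sizes, the weight values, the choice of ReLU in the hidden layer, and the softmax output are all assumptions made for this illustration.

```python
import math


def relu(z):
    return max(0.0, z)


def dense(inputs, weights, biases):
    # One fully connected layer: each row of weights feeds one neuron.
    return [
        sum(w * x for w, x in zip(row, inputs)) + b
        for row, b in zip(weights, biases)
    ]


def softmax(scores):
    # Turns raw scores into probabilities that sum to 1.
    # Subtracting the maximum keeps exp() numerically stable.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def forward(x, params):
    # Input layer -> hidden layer (ReLU) -> output layer (softmax).
    hidden = [relu(z) for z in dense(x, params["W1"], params["b1"])]
    logits = dense(hidden, params["W2"], params["b2"])
    return softmax(logits)


# Illustrative, untrained parameters (assumed values).
params = {
    "W1": [[0.5, -0.2], [0.1, 0.4], [-0.3, 0.8]],  # 3 hidden x 2 inputs
    "b1": [0.0, 0.1, -0.1],
    "W2": [[0.3, -0.5, 0.2], [-0.4, 0.6, 0.1]],    # 2 outputs x 3 hidden
    "b2": [0.0, 0.0],
}

probs = forward([1.0, 2.0], params)
print(probs)
```

Without the nonlinear activation in the hidden layer, the two layers would collapse into a single linear transformation.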
Training:
Feedforward neural networks are trained using a process called backpropagation. This involves feeding
the input forward through the network, comparing the output to the desired output, and then adjusting
the weights of the connections using an optimization algorithm (e.g., gradient descent) to minimize the
error.
Applications:
Feedforward neural networks are used in a wide range of applications, including:
- Image and speech recognition
- Natural language processing
- Financial forecasting
- Recommendation systems
- Robotics
Advantages:
- Can model complex relationships in data
- Can learn from large amounts of data
- Can generalize well to unseen data when trained on enough data with appropriate regularization
Disadvantages:
- Require a large amount of data for training
- Prone to overfitting, especially with deep architectures
- Can be computationally expensive to train and deploy, especially with large networks

Gradient Descent:
Gradient descent is an optimization algorithm used to minimize a function by iteratively moving in the
direction of the steepest descent of the function. In machine learning, it is commonly used to minimize
the loss function of a model by adjusting its parameters (weights and biases) based on the gradient of the
loss function with respect to the parameters.
The basic idea behind gradient descent is to update the parameters in the opposite direction of the
gradient of the loss function, multiplied by a small value called the learning rate. The learning rate controls
how big of a step we take in each iteration. If the learning rate is too small, the algorithm may take a long
time to converge. If it is too large, the algorithm may overshoot the minimum and fail to converge.
There are different variants of gradient descent, such as batch gradient descent, stochastic gradient
descent, and mini-batch gradient descent, which differ in how they update the parameters and how they
use the training data.
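The update rule and the effect of the learning rate can be sketched on a one-parameter toy problem. The loss function (w - 3)^2, the starting point, the step count, and the three learning rates below are assumptions chosen for illustration.

```python
def gradient_descent(grad, w0, learning_rate, steps):
    # Repeatedly step against the gradient, scaled by the learning rate.
    w = w0
    for _ in range(steps):
        w = w - learning_rate * grad(w)
    return w


def loss(w):
    # Toy loss with its minimum at w = 3.
    return (w - 3.0) ** 2


def loss_grad(w):
    # Derivative of the toy loss with respect to w.
    return 2.0 * (w - 3.0)


# Same starting point and step count, three different learning rates.
w_good = gradient_descent(loss_grad, w0=0.0, learning_rate=0.1, steps=100)
w_slow = gradient_descent(loss_grad, w0=0.0, learning_rate=0.001, steps=100)
w_diverged = gradient_descent(loss_grad, w0=0.0, learning_rate=1.1, steps=100)

print("lr=0.1  :", w_good)      # ends close to the minimum
print("lr=0.001:", w_slow)      # still far from the minimum
print("lr=1.1  :", w_diverged)  # overshoots further on every step
```

In the batch, stochastic, and mini-batch variants the update rule is the same; what changes is how many training examples are used to estimate the gradient at each step.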

Backpropagation:
Backpropagation is a method used to calculate the gradient of the loss function of a neural network with
respect to its weights. It is a key algorithm for training feedforward neural networks.
The basic idea behind backpropagation is to propagate the error backwards through the network, starting
from the output layer and moving towards the input layer. At each layer, the error is used to calculate the
gradient of the loss function with respect to the weights of that layer, using the chain rule of calculus.
These gradients are then used to update the weights of the network using gradient descent or its variants.
Backpropagation allows neural networks to learn from data by iteratively adjusting their weights to
minimize the error between the predicted output and the actual output. It is an essential algorithm for
training deep neural networks, enabling them to learn complex patterns in data.
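The chain-rule computation described above can be sketched on a deliberately tiny network: one input, one tanh hidden neuron, and one linear output. The network shape, the squared-error loss, and the numeric values are assumptions for illustration. The finite-difference check at the end is only a sanity test and is not part of backpropagation itself.

```python
import math


def forward(x, w1, w2):
    # Hidden activation h, then linear output y.
    h = math.tanh(w1 * x)
    y = w2 * h
    return h, y


def loss(x, target, w1, w2):
    _, y = forward(x, w1, w2)
    return 0.5 * (y - target) ** 2


def backprop(x, target, w1, w2):
    # Forward pass first, keeping the intermediate values.
    h, y = forward(x, w1, w2)
    # Backward pass: apply the chain rule from the output towards the input.
    dL_dy = y - target                   # derivative of 0.5 * (y - t)^2
    dL_dw2 = dL_dy * h                   # y = w2 * h
    dL_dh = dL_dy * w2                   # error passed back to the hidden unit
    dL_dw1 = dL_dh * (1.0 - h * h) * x   # tanh'(z) = 1 - tanh(z)^2
    return dL_dw1, dL_dw2


def numerical_grad(f, w, eps=1e-6):
    # Central finite difference, used only to check the analytic gradient.
    return (f(w + eps) - f(w - eps)) / (2.0 * eps)


x, target, w1, w2 = 0.5, 1.0, 0.8, -0.3
g1, g2 = backprop(x, target, w1, w2)
n1 = numerical_grad(lambda w: loss(x, target, w, w2), w1)
n2 = numerical_grad(lambda w: loss(x, target, w1, w), w2)
print(g1, n1)
print(g2, n2)
```

The two printed pairs should agree closely, which confirms that the backward pass computes the gradients that gradient descent then uses to update the weights.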

Overfitting:
Overfitting is a common problem in machine learning where a model fits the training data so closely
that its performance on new, unseen data suffers. In other words, the model becomes too complex and
starts capturing noise in the training data, rather than the underlying patterns.
Causes of Overfitting:
- Model Complexity: A model that is too complex for the given data can lead to overfitting. This is often
the case with models that have too many parameters relative to the amount of training data.
- Insufficient Training Data: When there is not enough training data available, the model may memorize
the training examples instead of learning the underlying patterns. This can lead to overfitting, especially
with complex models.
- Noise in the Data: If the training data contains noise or irrelevant features, the model may learn to fit the
noise rather than the true underlying patterns.
Effects of Overfitting:
- Poor Generalization: An overfitted model may perform well on the training data but poorly on new,
unseen data, because it has memorized the training examples rather than learned the underlying
patterns.
- Reduced Model Interpretability: Overly complex models can be difficult to interpret, making it hard to
understand how they are making predictions.
Methods to Prevent Overfitting:
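- Cross-validation: Holding out part of the data for validation, or rotating the held-out part across
several folds (k-fold cross-validation), measures the model's performance on unseen data, so overfitting
can be detected and the model or its hyperparameters adjusted. A separate test set is kept for the final
evaluation.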
- Regularization: Techniques like L1 and L2 regularization can help prevent overfitting by adding a penalty
term to the loss function that discourages large weights.
- Simplifying the Model: Using a simpler model with fewer parameters can help prevent overfitting,
especially when there is limited training data.
- Feature Selection: Removing irrelevant features or reducing the dimensionality of the data can help
prevent overfitting by focusing on the most important features.
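As one concrete form of the cross-validation idea in the list above, the sketch below generates the index splits for k-fold cross-validation. The function name and the choice of 10 samples and 3 folds are assumptions for illustration; model training and scoring are left out.

```python
def k_fold_splits(n_samples, k):
    # Divide the indices 0..n_samples-1 into k folds of near-equal size.
    # Each fold serves once as the validation set, while the remaining
    # folds together form the training set.
    indices = list(range(n_samples))
    fold_sizes = [
        n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)
    ]
    splits = []
    start = 0
    for size in fold_sizes:
        validation = indices[start:start + size]
        training = indices[:start] + indices[start + size:]
        splits.append((training, validation))
        start += size
    return splits


splits = k_fold_splits(n_samples=10, k=3)
for training, validation in splits:
    print("train:", training, "validate:", validation)
```

In practice the data is usually shuffled before splitting, and the model's validation scores are averaged over the k folds. A large gap between the training score and the validation score is a sign of overfitting.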

Recurrent Neural Networks (RNNs):
Recurrent Neural Networks (RNNs) are a type of artificial neural network designed to work with
sequential data. They are particularly effective for tasks where the input data is a sequence, such as time
series forecasting, natural language processing (NLP), and speech recognition.
Architecture:
Unlike feedforward neural networks, where information flows in one direction (from input to output),
RNNs have connections that form a directed cycle, allowing information to persist. This cyclic structure
enables RNNs to maintain a "memory" of previous inputs, making them suitable for tasks that require
understanding context or temporal dependencies.
Key Components:
- Hidden State: At each time step, an RNN produces an output and updates its hidden state. The hidden
state is a representation of the network's memory at that time step, incorporating information from
previous time steps.
- Recurrent Connections: Recurrent connections allow information to flow from one time step to the next,
enabling the network to process sequences of inputs.
- Activation Function: RNNs typically use a nonlinear activation function, such as tanh or ReLU, to
introduce nonlinearity into the model.
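The hidden state and recurrent connection described above can be sketched with a scalar RNN, where the input, hidden state, and weights are single numbers. The update formula h_t = tanh(w_x * x_t + w_h * h_prev + b) is the standard simple-RNN form; the weight values and input sequences are assumptions, and for brevity the hidden state doubles as the output.

```python
import math


def rnn_step(x_t, h_prev, w_x, w_h, b):
    # The new hidden state mixes the current input with the previous state.
    return math.tanh(w_x * x_t + w_h * h_prev + b)


def run_rnn(sequence, w_x, w_h, b, h0=0.0):
    # Apply the same weights at every time step, carrying the state forward.
    h = h0
    states = []
    for x_t in sequence:
        h = rnn_step(x_t, h, w_x, w_h, b)
        states.append(h)
    return states


# Two sequences that differ only in their first element.
states_a = run_rnn([1.0, 0.0, 0.0], w_x=1.0, w_h=0.5, b=0.0)
states_b = run_rnn([0.0, 0.0, 0.0], w_x=1.0, w_h=0.5, b=0.0)

print(states_a)
print(states_b)
```

Both sequences end with the same two inputs, yet the final hidden states differ because the first input is still "remembered" through the recurrent connection.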
Training:
RNNs are trained using the backpropagation through time (BPTT) algorithm, which is an extension of the
backpropagation algorithm for feedforward neural networks. BPTT calculates the gradient of the loss
function with respect to the weights of the network, taking into account the sequential nature of the data.
Applications:
RNNs are used in a variety of applications, including:
- Language Modeling: Predicting the next word in a sentence.
- Machine Translation: Translating text from one language to another.
- Speech Recognition: Converting spoken language into text.
- Time Series Prediction: Forecasting future values in a time series.
Challenges:
RNNs are prone to the vanishing gradient problem, where gradients become very small as they are
propagated back in time, making it difficult for the network to learn long-range dependencies. To address
this issue, variants of RNNs, such as Long Short-Term Memory (LSTM) and Gated Recurrent Unit (GRU),
have been developed, which are better at capturing long-term dependencies.
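The vanishing gradient problem can be made concrete with a scalar tanh RNN. The sketch below tracks how strongly the hidden state at each time step still depends on the initial hidden state; the recurrent weight of 0.5, the constant input, and the sequence length of 50 are assumptions for illustration.

```python
import math


def gradient_through_time(sequence, w_x, w_h, b=0.0):
    # Returns |d h_t / d h_0| for every time step t of a scalar tanh RNN.
    h = 0.0
    grad = 1.0
    magnitudes = []
    for x_t in sequence:
        h = math.tanh(w_x * x_t + w_h * h + b)
        # Chain rule for one step: d h_t / d h_{t-1} = w_h * (1 - h_t^2).
        grad *= w_h * (1.0 - h * h)
        magnitudes.append(abs(grad))
    return magnitudes


magnitudes = gradient_through_time([0.5] * 50, w_x=1.0, w_h=0.5)
print("after  1 step :", magnitudes[0])
print("after 10 steps:", magnitudes[9])
print("after 50 steps:", magnitudes[-1])
```

Each step multiplies the gradient by a factor of at most 0.5 in magnitude here, so the product shrinks exponentially with the number of time steps. This is why early inputs have almost no influence on the weight updates in long sequences.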

(refer Diagram as well)
