
Unit 1: Deep Learning Fundamentals
Deep learning is a powerful branch of artificial intelligence that enables
machines to learn from data and make predictions. This presentation delves into
the fundamental concepts of deep learning, covering essential mathematical
foundations, probability distributions, optimization techniques, and key
challenges. We will explore the core data structures, the role of probability in
machine learning, and the importance of optimization algorithms for training
deep learning models.
Linear Algebra: The Foundation of Deep Learning
Scalars, Vectors, Matrices, and Tensors

Scalars are single numbers, representing 0th-order tensors. A scalar is denoted x ∈ ℝ, indicating that it belongs to the set of real numbers.

Vectors are ordered arrays of numbers, representing 1st-order tensors. They are elements of vector spaces, which encompass all possible vectors of a specific length or dimension.

Matrices are rectangular arrays of numbers, representing 2nd-order tensors.

Tensors generalize scalars, vectors, and matrices, encompassing higher-order entities.
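
For illustration, a minimal NumPy sketch of these four objects (the particular shapes are arbitrary examples, not part of the slides):

import numpy as np

scalar = np.float64(3.5)            # 0th-order tensor: a single real number, x ∈ ℝ
vector = np.array([1.0, 2.0, 3.0])  # 1st-order tensor: ordered array, shape (3,)
matrix = np.array([[1.0, 2.0],
                   [3.0, 4.0]])     # 2nd-order tensor: rectangular array, shape (2, 2)
tensor = np.zeros((2, 3, 4))        # 3rd-order tensor: generalizes the above to higher orders

print(scalar.ndim, vector.ndim, matrix.ndim, tensor.ndim)  # prints: 0 1 2 3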
Probability Distributions: Describing Random Variables

1. Discrete Variables: Discrete variables can take on a finite number of values. Their probability distributions are described using probability mass functions (PMFs).

2. Continuous Variables: Continuous variables can take on any value within a range. Their probability distributions are described using probability density functions (PDFs).
Gradient-Based Optimization: Minimizing Loss Functions

1. Role of an Optimizer: Optimizers update the parameters of a neural network, such as its weights and biases, to minimize the loss function.

2. Gradient Descent: Gradient descent is a fundamental optimization algorithm that iteratively updates parameters in the direction of steepest descent to minimize the loss function.

3. Stochastic Gradient Descent (SGD): SGD is a variant of gradient descent that uses a single random training example (or a very small batch) to calculate the gradient and update parameters, making it computationally efficient for large datasets.

4. Mini-Batch Gradient Descent: Mini-batch gradient descent combines the benefits of batch gradient descent and stochastic gradient descent by using a small batch of data for each update, striking a balance between computational efficiency and stability. A comparison of the three update schemes is sketched below.
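
The three schemes differ only in how much data is used per update. A minimal NumPy sketch for linear regression with a mean squared error loss (the model, data, and learning rate are illustrative assumptions):

import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 3))                   # 1000 examples, 3 features
true_w = np.array([2.0, -1.0, 0.5])
y = X @ true_w + 0.1 * rng.normal(size=1000)

def gradient(Xb, yb, w):
    """Gradient of the mean squared error loss on the batch (Xb, yb)."""
    return 2.0 * Xb.T @ (Xb @ w - yb) / len(yb)

w = np.zeros(3)
lr = 0.05

# Batch gradient descent: the full dataset per update (stable but expensive).
for _ in range(100):
    w -= lr * gradient(X, y, w)

# Stochastic gradient descent: one random example per update (cheap but noisy).
for _ in range(1000):
    i = rng.integers(len(y))
    w -= lr * gradient(X[i:i+1], y[i:i+1], w)

# Mini-batch gradient descent: a small batch per update (the usual compromise).
for _ in range(200):
    idx = rng.choice(len(y), size=32, replace=False)
    w -= lr * gradient(X[idx], y[idx], w)

print(w)   # should end up close to true_w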
Machine Learning Basics: Capacity, Overfitting, and Underfitting

Capacity: Capacity refers to the range of functions a machine learning model can select as a possible solution. Overfitting and underfitting are two common problems that arise from a mismatch between model capacity and the complexity of the task.

Overfitting: Overfitting occurs when a model learns the training data too well, including noise and irrelevant patterns, leading to poor generalization on unseen data.

Underfitting: Underfitting occurs when a model is too simple to capture the underlying patterns in the data, resulting in poor performance on both training and test data.
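
One common way to see capacity, underfitting, and overfitting in action is to vary the degree of a polynomial fit. A sketch using scikit-learn (the synthetic data and the degrees 1, 4, and 15 are illustrative choices):

import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error

rng = np.random.default_rng(0)
X = np.sort(rng.uniform(0, 1, size=30)).reshape(-1, 1)
y = np.sin(2 * np.pi * X).ravel() + 0.2 * rng.normal(size=30)
X_test = np.linspace(0, 1, 100).reshape(-1, 1)
y_test = np.sin(2 * np.pi * X_test).ravel()

for degree in (1, 4, 15):        # low, moderate, and high capacity
    model = make_pipeline(PolynomialFeatures(degree), LinearRegression())
    model.fit(X, y)
    train_err = mean_squared_error(y, model.predict(X))
    test_err = mean_squared_error(y_test, model.predict(X_test))
    # degree 1 typically underfits (both errors high); degree 15 typically
    # overfits (low training error, high test error); degree 4 generalizes best.
    print(degree, round(train_err, 3), round(test_err, 3))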
Estimators: Quantifying Guesses in Machine Learning

Estimator / Description:

Sufficient: Estimates the population parameter from a limited dataset, making use of all the information the data contains about that parameter.

Unbiased: Produces estimates whose expected value equals the true parameter, neither systematically underestimating nor overestimating it.
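
Unbiasedness is easy to see by simulation: the sample mean is an unbiased estimator of the population mean, while the variance computed with a 1/n factor systematically underestimates the population variance. A NumPy sketch (the distribution parameters are illustrative):

import numpy as np

rng = np.random.default_rng(0)
true_mean, true_var = 5.0, 4.0

means, biased_vars = [], []
for _ in range(10_000):
    sample = rng.normal(true_mean, np.sqrt(true_var), size=10)
    means.append(sample.mean())             # unbiased estimator of the mean
    biased_vars.append(sample.var(ddof=0))  # 1/n variance: biased (too small)

print(np.mean(means))        # ≈ 5.0: on average neither under- nor overestimates
print(np.mean(biased_vars))  # ≈ 3.6, below 4.0: systematically underestimates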
Bias and Variance in Machine Learning: Understanding Prediction Errors

High Bias, Low Variance: Predictions are consistent but inaccurate on average, indicating underfitting.

High Bias, High Variance: Predictions are inconsistent and inaccurate on average, indicating a model that is not learning well.

Bias-Variance Trade-Off: Finding a balance between bias and variance is crucial for building accurate machine learning models. A model with low bias and low variance is ideal.
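
For squared-error loss, this trade-off can be stated precisely with the standard decomposition (given here for reference; ŷ(x) is the model's prediction at input x and σ² is the irreducible noise in the data):

Expected test error at x  =  Bias[ŷ(x)]²  +  Var[ŷ(x)]  +  σ²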
Stochastic Gradient Descent (SGD): Optimizing for Large Datasets

Computational Efficiency: SGD is computationally efficient, especially for large datasets, as it updates parameters using a single example or a small batch.

Memory Efficiency: SGD is memory-efficient, as it processes data one example (or one small batch) at a time, making it suitable for datasets that cannot fit into memory.

Avoidance of Local Minima: The noisy updates in SGD can help it escape from local minima and move toward better (possibly global) minima.
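
These properties are why mini-batch SGD is the default optimizer in most deep learning frameworks. A minimal sketch of how this looks in PyTorch (assuming PyTorch is installed; the model, synthetic data, and hyperparameters are placeholders):

import torch
from torch import nn
from torch.utils.data import DataLoader, TensorDataset

# Synthetic data; the DataLoader streams it in small batches, so the full
# dataset never needs to be processed (or held in accelerator memory) at once.
X = torch.randn(10_000, 20)
y = torch.randn(10_000, 1)
loader = DataLoader(TensorDataset(X, y), batch_size=64, shuffle=True)

model = nn.Sequential(nn.Linear(20, 64), nn.ReLU(), nn.Linear(64, 1))
optimizer = torch.optim.SGD(model.parameters(), lr=0.01, momentum=0.9)
loss_fn = nn.MSELoss()

for epoch in range(5):
    for xb, yb in loader:          # one noisy parameter update per mini-batch
        optimizer.zero_grad()
        loss = loss_fn(model(xb), yb)
        loss.backward()
        optimizer.step()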
Challenges in Deep Learning: Data, Hardware, and Interpretability

1. Data Requirements: Deep learning algorithms require large amounts of data to train effectively, making data collection and processing a significant challenge.

2. Hardware Requirements: Training deep learning models can be computationally intensive, requiring high-performance hardware such as GPUs, which can be costly and power-consuming.

3. Interpretability: Deep learning models are often considered black boxes, making it difficult to understand how they arrive at their predictions, posing challenges for interpretability and trust.
Deep Networks: Feed-Forward Process in Deep Neural Networks
Deep neural networks consist of multiple layers, including an input layer, hidden
layers, and an output layer. The feed-forward process involves passing data
through these layers, applying activation functions, and generating predictions.
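
A compact NumPy sketch of the feed-forward computation for a single hidden layer (the layer sizes, random weights, and the choice of ReLU and softmax activations are illustrative assumptions):

import numpy as np

def relu(z):
    return np.maximum(0.0, z)

def softmax(z):
    e = np.exp(z - z.max(axis=1, keepdims=True))   # numerically stabilized
    return e / e.sum(axis=1, keepdims=True)

rng = np.random.default_rng(0)
x = rng.normal(size=(1, 4))                        # input layer: 4 features

W1, b1 = rng.normal(size=(4, 8)), np.zeros(8)      # input  -> hidden layer
W2, b2 = rng.normal(size=(8, 3)), np.zeros(3)      # hidden -> output layer

h = relu(x @ W1 + b1)           # hidden layer: affine transform + activation
y_hat = softmax(h @ W2 + b2)    # output layer: predicted class probabilities

print(y_hat, y_hat.sum())       # the probabilities sum to 1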
