Application of Deep Learning and Intro to Autoencoders

Natural language processing (NLP) is a field that develops technologies to allow computers to understand, interpret, and generate human language. NLP has many applications including speech recognition, language translation, sentiment analysis, and question answering. Deep learning techniques like neural networks have improved NLP by learning word embeddings from large datasets. Recurrent neural networks are commonly used for NLP tasks since they can model sequential data and context. Speech recognition uses NLP techniques to convert speech to text by preprocessing audio data into spectrograms and then recognizing characters from short sounds using neural networks.

Uploaded by

Bhavani G


Natural Language Processing

An Application of Deep Learning

What is NLP?
● Natural language processing is a field at the intersection of computer science, artificial
intelligence and linguistics.
● Goal: for computers to process or “understand” natural language in order to perform
useful tasks such as answering queries.
● Applications of NLP:
○ Spell checking
○ Searching keywords
○ Extracting information from internet sources
○ Sentiment analysis
○ Complex question answering
○ Language translations
○ Speech recognition
● Voice-enabled applications such as Alexa, Siri, and Google Assistant use
NLP and Machine Learning (ML) to answer our questions, add activities to
our calendars and call the contacts that we state in our voice commands.
How is NLP applied?
● Translating languages is more complex than a simple word-to-word
replacement method. Since each language has grammar rules, the
challenge of translating a text is to do so without changing its meaning
and style. Since computers do not understand grammar, they need a
process in which they can deconstruct a sentence, then reconstruct it in
another language in a way that makes sense.
● Speech recognition is a machine’s ability to identify and interpret phrases
and words from spoken language and convert them into a machine-
readable format. It uses NLP to allow computers to simulate human
interaction, and ML to respond in a way that mimics human responses.
● Sentiment analysis uses NLP and ML to interpret and analyze
emotions in subjective data like news articles and tweets. Positive,
negative, and neutral opinions can be identified to determine a
customer’s sentiment towards a brand, product, or service. Sentiment
analysis is used to gauge public opinion, monitor brand reputation,
and better understand customer experiences.
● Automatic text summarization is the task of condensing a piece of text
to a shorter version, by extracting its main ideas and preserving the
meaning of content. This application of NLP is used in news headlines,
result snippets in web search, and bulletins of market reports.
Advantages and Disadvantages of NLP
● Advantages:
○ Very efficient and less expensive
○ Faster customer service for organizations
○ Easy to implement as many trained models are already
available
● Disadvantages:
○ Training is time consuming
○ Not reliable all the time
● NLP is constantly evolving.
● With the recent popularity and success of word embeddings (low
dimensional, distributed representations), neural-based models have
achieved superior results on various language-related tasks as
compared to traditional machine learning models like SVM or logistic
regression.
● Word embeddings: Distributional vectors, also called word
embeddings, are based on the distributional hypothesis:
words appearing in similar contexts possess similar meanings. Word
embeddings are pre-trained on a task where the objective is to predict a
word based on its context, typically using a shallow neural network.
● A wide variety of NLP tasks, such as sentiment analysis and sentence
compositionality, are built on word embeddings.
● One of the challenges with word embedding methods is when we want to
obtain vector representations for phrases such as “hot potato” or “Boston
Globe”. We can’t just simply combine the individual word vector
representations since these phrases don’t represent the combination of
meaning of the individual words. And it gets even more complicated
when longer phrases and sentences are considered.
● Word2vec is a technique for natural language processing published in
2013. The word2vec algorithm uses a neural network model to learn word
associations from a large corpus of text. Once trained, such a model can
detect synonymous words or suggest additional words for a partial
sentence.
● To address these drawbacks, convolutional neural networks (CNNs) are used.
● A CNN is basically a neural-based approach which represents a feature
function that is applied to constituting words or n-grams to extract
higher-level features. The resulting abstract features have been effectively
used for sentiment analysis, machine translation, and question
answering, among other tasks.
● In early CNN-based NLP work, the goal was to transform words into a vector
representation via a lookup table, which resulted in a primitive word
embedding approach that learns weights during the training of the
network.
● In order to perform sentence modeling with a
basic CNN, sentences are first tokenized into
words, which are further transformed into a
word embedding matrix (i.e., input embedding
layer) of d dimension.
● Then, convolutional filters are applied on this
input embedding layer which consists of
applying a filter of all possible window sizes to
produce what’s called a feature map.
● This is then followed by a max-pooling
operation which applies a max operation on
each filter to obtain a fixed length output and
reduce the dimensionality of the output.
● One shortcoming of basic CNNs is
their inability to model long-distance
dependencies, which is important for various
NLP tasks. To address this problem, CNNs have
been coupled with time-delayed neural
networks (TDNN) which enable larger contextual
range at once during training.
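
The basic CNN sentence-modeling steps described above (tokenize, look up embeddings, convolve, max-pool) can be sketched roughly as follows. This is a minimal NumPy illustration, not a trained model: the toy vocabulary, the embedding dimension d = 4, the random weights, and the window sizes 2 and 3 are all assumptions made for the example.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy vocabulary and embedding lookup table (d = 4).
vocab = {"the": 0, "movie": 1, "was": 2, "really": 3, "good": 4}
d = 4
embedding_table = rng.normal(size=(len(vocab), d))

def sentence_features(tokens, filters):
    """Embed tokens, slide each filter over word windows, then max-pool."""
    # Lookup: one d-dimensional row per word gives an (n_words, d) matrix.
    emb = embedding_table[[vocab[t] for t in tokens]]
    pooled = []
    for filt in filters:                      # filt has shape (window, d)
        window = filt.shape[0]
        # Feature map: one value per window position in the sentence.
        feature_map = np.array([
            np.sum(emb[i:i + window] * filt)
            for i in range(len(tokens) - window + 1)
        ])
        pooled.append(feature_map.max())      # max-pooling over positions
    return np.array(pooled)

# Two random (untrained) filters for each of the window sizes 2 and 3.
filters = [rng.normal(size=(w, d)) for w in (2, 2, 3, 3)]

short = sentence_features(["the", "movie", "was", "good"], filters)
longer = sentence_features(["the", "movie", "was", "really", "good"], filters)
print(short.shape, longer.shape)
```

Both sentences produce a vector with one entry per filter, which is the fixed-length output that max-pooling is meant to provide regardless of sentence length.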
NLP using RNN
● The main strength of an RNN is the capacity to memorize the results of
previous computations and use that information in the current computation.
This makes RNN models suitable to model context dependencies in inputs of
arbitrary length so as to create a proper composition of the input. RNNs have
been used to study various NLP tasks such as machine translation, image
captioning, and language modeling, among others.
● Compared with a CNN model, an RNN model can be similarly effective or
even better at specific natural language tasks, but it is not necessarily superior. This
is because the two model very different aspects of the data, so each is
effective only when it matches the semantics required by the task at hand.
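
As a rough sketch of how an RNN carries the results of previous computations forward, the loop below applies a simple tanh recurrence to sequences of different lengths. The dimensions and the random, untrained weights (W_xh, W_hh, b_h) are assumptions for illustration only.

```python
import numpy as np

rng = np.random.default_rng(0)
input_dim, hidden_dim = 3, 5

# Hypothetical, randomly initialised (untrained) weights.
W_xh = rng.normal(scale=0.5, size=(hidden_dim, input_dim))
W_hh = rng.normal(scale=0.5, size=(hidden_dim, hidden_dim))
b_h = np.zeros(hidden_dim)

def rnn_encode(sequence):
    """Run a simple tanh RNN over a sequence and return the final state."""
    h = np.zeros(hidden_dim)
    for x_t in sequence:
        # The new state depends on the current input and the previous state.
        h = np.tanh(W_xh @ x_t + W_hh @ h + b_h)
    return h

seq_short = rng.normal(size=(4, input_dim))
seq_long = rng.normal(size=(9, input_dim))
h_short = rnn_encode(seq_short)
h_long = rnn_encode(seq_long)
print(h_short.shape, h_long.shape)
```

The same weights handle inputs of arbitrary length, and the final state has the same size in both cases because each step folds the new input into the running state.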
Speech Recognition
The first step in speech recognition is obvious — we need to feed sound waves into a computer.

Sound waves are one-dimensional. At every moment in time, they have a single value based on the
height of the wave. Let’s zoom in on one tiny part of the sound wave and take a look:
To turn this sound wave into numbers, we just record the height of the wave at equally-spaced
points:

This is called sampling. We are taking a reading thousands of times a second and recording a number
representing the height of the sound wave at that point in time.
“CD Quality” audio is sampled at 44.1 kHz (44,100 readings per second). But for speech recognition, a
sampling rate of 16 kHz (16,000 samples per second) is enough to cover the frequency range of human
speech.
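
A minimal sketch of sampling, assuming a synthetic 440 Hz tone stands in for a real recording (the tone and the helper name sample_wave are illustrative, not part of the original slides):

```python
import numpy as np

def sample_wave(wave_fn, duration_s, sample_rate_hz):
    """Record the wave's height at equally spaced points in time."""
    n_samples = int(round(duration_s * sample_rate_hz))
    times = np.arange(n_samples) / sample_rate_hz
    return times, wave_fn(times)

def tone(t):
    # Hypothetical stand-in for a recorded sound: a 440 Hz sine wave.
    return np.sin(2 * np.pi * 440.0 * t)

_, speech_rate = sample_wave(tone, 1.0, 16_000)
_, cd_rate = sample_wave(tone, 1.0, 44_100)
print(len(speech_rate), len(cd_rate))  # 16000 44100
```

One second of audio becomes an array of 16,000 numbers at the speech-recognition rate and 44,100 numbers at the CD rate.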
Pre-processing our Sampled Sound Data
We now have an array of numbers with each number representing the sound wave’s amplitude at
1/16,000th of a second intervals.
We could feed these numbers right into a neural network. But trying to recognize speech patterns by
processing these samples directly is difficult. Instead, we can make the problem easier by doing some
pre-processing on the audio data.
Let’s start by grouping our sampled audio into 20-millisecond-long chunks. Here’s our first 20
milliseconds of audio
Plotting those numbers as a simple line graph gives us a rough approximation of the original sound
wave for that 20 millisecond period of time:

This recording is only 1/50th of a second long. But even this short recording is a complex mish-mash
of different frequencies of sound. There are some low sounds, some mid-range sounds, and even some
high-pitched sounds sprinkled in. But taken all together, these different frequencies mix together to
make up the complex sound of human speech.
To make this data easier for a neural network to process, we are going to break apart this complex
sound wave into its component parts. We’ll break out the low-pitched parts, the next-lowest-pitched
parts, and so on. Then by adding up how much energy is in each of those frequency bands (from low
to high), we create a fingerprint of sorts for this audio snippet.
You can see that our 20 millisecond sound snippet

If we repeat this process on every 20 millisecond chunk of audio, we end up with a
spectrogram (each column from left-to-right is one 20ms chunk):

A neural network can find patterns in this kind of data more easily than raw sound waves.
So this is the data representation we’ll actually feed into our neural network.
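
The pre-processing described above can be sketched as follows: cut the samples into 20 ms chunks, measure the energy at each frequency with an FFT, and add up the energy per band. The FFT, the choice of 20 bands, and the 200 Hz test tone are assumptions for illustration; the slides do not say how the bands are computed.

```python
import numpy as np

SAMPLE_RATE = 16_000
CHUNK = SAMPLE_RATE * 20 // 1000      # 320 samples per 20 ms chunk

def spectrogram(samples, n_bands=20):
    """Energy per frequency band for each 20 ms chunk (one column per chunk)."""
    n_chunks = len(samples) // CHUNK
    chunks = samples[:n_chunks * CHUNK].reshape(n_chunks, CHUNK)
    # Squared FFT magnitude: energy at each frequency bin from 0 to 8 kHz.
    power = np.abs(np.fft.rfft(chunks, axis=1)) ** 2
    # Add up the energy inside each band, ordered from low to high.
    bands = np.array_split(power, n_bands, axis=1)
    return np.stack([b.sum(axis=1) for b in bands], axis=0)

t = np.arange(SAMPLE_RATE) / SAMPLE_RATE      # one second of audio
low_tone = np.sin(2 * np.pi * 200.0 * t)      # hypothetical test signal
spec = spectrogram(low_tone)
print(spec.shape)             # (20, 50): 20 bands, 50 chunks
print(spec[:, 0].argmax())    # 0: the energy sits in the lowest band
```

Each column of the result is the "fingerprint" of one 20 ms chunk, so one second of audio yields 50 columns.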
Recognizing Characters from Short Sounds
Now that we have our audio in a format that’s easy to process, we will feed it into a deep neural
network. The input to the neural network will be 20 millisecond audio chunks. For each little audio
slice, it will try to figure out the letter that corresponds to the sound currently being spoken.
We’ll use a recurrent neural network — that is, a neural network that has a memory that influences future
predictions. That’s because each letter it predicts should affect the likelihood of the next letter it will predict
too. For example, if we have said “HEL” so far, it’s very likely we will say “LO” next to finish out the word
“Hello”. It’s much less likely that we will say something unpronounceable next like “XYZ”. So having that
memory of previous predictions helps the neural network make more accurate predictions going forward.
After we run our entire audio clip through the neural network (one chunk at a time), we’ll end up with a
mapping of each audio chunk to the letters most likely spoken during that chunk. Here’s what that mapping
looks like for me saying “Hello”:
Our neural net is predicting that one likely thing I said was “HHHEE_LL_LLLOOO”. But it also
thinks that it was possible that I said “HHHUU_LL_LLLOOO” or even “AAAUU_LL_LLLOOO”.
First, we’ll replace any repeated characters with a single character, then we’ll remove any blanks:
● HE_L_LO becomes HELLO
● HU_L_LO becomes HULLO
● AU_L_LO becomes AULLO
That leaves us with three possible transcriptions — “Hello”, “Hullo” and “Aullo”. If you say them out
loud, all of these sound similar to “Hello”.
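
The clean-up rule above (merge repeated characters, then drop blanks) is small enough to write out directly. The function name collapse and the use of "_" as the blank symbol follow the slide's notation and are otherwise illustrative.

```python
from itertools import groupby

BLANK = "_"

def collapse(raw):
    """Merge runs of repeated characters, then drop the blank symbol."""
    merged = "".join(ch for ch, _ in groupby(raw))
    return merged.replace(BLANK, "")

for raw in ("HHHEE_LL_LLLOOO", "HHHUU_LL_LLLOOO", "AAAUU_LL_LLLOOO"):
    print(raw, "->", collapse(raw))
# HHHEE_LL_LLLOOO -> HELLO
# HHHUU_LL_LLLOOO -> HULLO
# AAAUU_LL_LLLOOO -> AULLO
```

The blank between the two L runs is what keeps the double L in "HELLO" from being merged into one.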

The trick is to combine these pronunciation-based predictions with likelihood scores based on a large
database of written text (books, news articles, etc.). You throw out transcriptions that seem the least
likely to be real and keep the transcription that seems the most realistic.

So we’ll pick “Hello” as our final transcription instead of the others. Done!
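
One way to picture this final selection step is to multiply each candidate's pronunciation-based score by a likelihood estimated from written text and keep the best product. Every number below is invented purely for illustration (they are not measurements from any real model or corpus), and the add-one smoothing is an assumed detail.

```python
# Hypothetical scores, made up for this example only.
acoustic_score = {"HELLO": 0.40, "HULLO": 0.35, "AULLO": 0.25}
text_count = {"HELLO": 5000, "HULLO": 20, "AULLO": 0}

def pick_transcription(candidates):
    """Keep the candidate whose combined score is highest."""
    total = sum(text_count.values())

    def combined(word):
        # Add-one smoothing so unseen words get a small, non-zero likelihood.
        text_prob = (text_count[word] + 1) / (total + len(text_count))
        return acoustic_score[word] * text_prob

    return max(candidates, key=combined)

print(pick_transcription(["HELLO", "HULLO", "AULLO"]))  # HELLO
```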
Thank you
Introduction to Autoencoders
Autoencoders
An autoencoder is a neural network that is trained to attempt to
copy its input to its output. Internally, it has a hidden layer h that
describes a code used to represent the input. The network may be
viewed as consisting of two parts: an encoder function h = f (x)
and a decoder that produces a reconstruction r = g(h).

Traditionally, autoencoders were used for dimensionality
reduction or feature learning.

Encoder – This transforms the high-dimensional input into a
short, compact code.
Decoder – This transforms the short code back into a
high-dimensional reconstruction of the input.
Undercomplete Autoencoders
An autoencoder whose code dimension is less than the input
dimension is called undercomplete. Learning an undercomplete
representation forces the autoencoder to capture the most salient
features of the training data.

The learning process is described simply as minimizing a loss
function L(x, g(f(x))).

If the encoder and decoder are allowed too much capacity, the
autoencoder can learn to perform the copying task without
extracting useful information about the distribution of the data.
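
A minimal sketch of an undercomplete autoencoder, assuming a purely linear encoder and decoder, squared-error loss, plain gradient descent, and synthetic data; all of these are illustrative choices rather than anything specified in the slides. The input has 5 dimensions and the code has 2.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical data: 200 points in 5-D lying close to a 2-D subspace.
latent = rng.normal(size=(200, 2))
mixing = np.linalg.qr(rng.normal(size=(5, 2)))[0].T   # orthonormal rows
X = latent @ mixing + 0.01 * rng.normal(size=(200, 5))

code_dim = 2                                   # smaller than the input dim 5
W_enc = 0.1 * rng.normal(size=(5, code_dim))   # encoder: h = f(x) = x W_enc
W_dec = 0.1 * rng.normal(size=(code_dim, 5))   # decoder: r = g(h) = h W_dec

def loss(data):
    """L(x, g(f(x))): mean squared reconstruction error."""
    recon = (data @ W_enc) @ W_dec
    return np.mean(np.sum((data - recon) ** 2, axis=1))

initial = loss(X)
lr = 0.05
for _ in range(2000):
    H = X @ W_enc                    # codes
    R = H @ W_dec                    # reconstructions
    G = 2.0 * (R - X) / len(X)       # gradient of the loss w.r.t. R
    grad_dec = H.T @ G
    grad_enc = X.T @ (G @ W_dec.T)
    W_dec -= lr * grad_dec
    W_enc -= lr * grad_enc

final = loss(X)
print(initial, final)
```

Because the code has only two dimensions, the network can lower the loss only by finding the two directions that explain most of the data, which is the sense in which the bottleneck forces it to capture the salient features.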
Regularized Autoencoders
A similar problem occurs in the overcomplete case, in which the hidden code has dimension greater than the input. In these
cases, even a linear encoder and linear decoder can learn to copy the input to the output without
learning anything useful about the data distribution.

Ideally, one could train any architecture of autoencoder successfully, choosing the code dimension
and the capacity of the encoder and decoder based on the complexity of distribution to be modeled.
Regularized autoencoders provide the ability to do so. Rather than limiting the model capacity by
keeping the encoder and decoder shallow and the code size small, regularized autoencoders use a
loss function that encourages the model to have other properties besides the ability to copy its input
to its output.
Sparse Autoencoders
◦ A sparse autoencoder tries to ensure that each hidden neuron is inactive most of the time.
◦ A sparse autoencoder is simply an autoencoder whose training criterion involves a sparsity penalty Ω(h) on the
code layer h, in addition to the reconstruction error:
L(x, g(f(x))) + Ω(h)
Here L(x, g(f(x))) is the regular reconstruction loss, and Ω(h) is the loss that keeps activation values close to 0.

◦ Sparse autoencoders are typically used to learn features for another task such as classification.
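
The training criterion above can be sketched as a function that returns the reconstruction error plus the penalty. The slides do not say which Ω(h) is used; the L1 penalty on the code, the ReLU encoder, and the weight 0.1 are assumptions chosen as one common option.

```python
import numpy as np

def sparse_autoencoder_loss(x, W_enc, W_dec, sparsity_weight=0.1):
    """Return (L + Omega, L, Omega) with an assumed L1 penalty as Omega(h)."""
    h = np.maximum(0.0, x @ W_enc)        # code h = f(x), ReLU encoder
    r = h @ W_dec                         # reconstruction r = g(h)
    reconstruction = np.mean(np.sum((x - r) ** 2, axis=1))
    penalty = sparsity_weight * np.mean(np.sum(np.abs(h), axis=1))
    return reconstruction + penalty, reconstruction, penalty

rng = np.random.default_rng(0)
x = rng.normal(size=(16, 6))              # hypothetical batch of inputs
W_enc = rng.normal(size=(6, 10))          # overcomplete code: 10 > 6
W_dec = rng.normal(size=(10, 6))

total, recon, omega = sparse_autoencoder_loss(x, W_enc, W_dec)
print(total, recon, omega)
```

The penalty grows with the size of the activations, so minimizing the total pushes most code units toward zero even when the code is larger than the input.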
Denoising autoencoders
Rather than adding a penalty Ω to the cost function, we can obtain an autoencoder
that learns something useful by changing the reconstruction error term of the cost
function. Traditionally, autoencoders minimize some function L(x, g(f(x)))

where L is a loss function penalizing g(f (x)) for being dissimilar from x, such as
the L^2 norm of their difference. This encourages g ◦ f to learn to be merely an
identity function if they have the capacity to do so. A denoising autoencoder or
DAE instead minimizes L(x, g(f(x̃)))

where x̃ is a copy of x that has been corrupted by some form of noise. Denoising
autoencoders must therefore undo this corruption rather than simply copying their
input.
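
The difference between the two objectives can be seen with a small sketch. Additive Gaussian noise is assumed as the corruption and squared error as L; the slides only say "some form of noise". An identity encoder and decoder stand in for a network that has learned to copy its input.

```python
import numpy as np

rng = np.random.default_rng(0)

def corrupt(x, noise_std=0.3):
    """Return x-tilde: a copy of x with additive Gaussian noise (assumed)."""
    return x + noise_std * rng.normal(size=x.shape)

def plain_loss(x, f, g):
    """L(x, g(f(x))): the traditional autoencoder objective."""
    return np.mean(np.sum((x - g(f(x))) ** 2, axis=1))

def dae_loss(x, f, g):
    """L(x, g(f(x_tilde))): rebuild the clean x from the corrupted copy."""
    return np.mean(np.sum((x - g(f(corrupt(x)))) ** 2, axis=1))

def identity(v):
    # Hypothetical stand-in for a network that merely copies its input.
    return v

x = rng.normal(size=(1000, 8))
copy_loss = plain_loss(x, identity, identity)
noisy_loss = dae_loss(x, identity, identity)
print(copy_loss, noisy_loss)
```

Copying is a perfect solution for the plain objective but leaves the noise in place under the denoising objective, so the identity function is no longer rewarded.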
