
Introduction to Machine Learning

Ksenia Zakirova
Landon Barnickle

Companion slides to help demonstrate and explain the concept of machine learning to an audience with a basic understanding of technical principles.
What is Machine Learning?

Our goal is to enable machines to make decisions based on data without explicit programming.

Machine learning contains a large variety of algorithms aimed at solving various problems:
● Prediction
● Classification
● Natural Language Processing
● Game AI
● Image recognition
Types of Machine Learning

Three main types of ML:
● Supervised
● Unsupervised
● Reinforcement Learning
Supervised Learning

Given labeled data, can we predict future labels?

MNIST dataset: 60,000 28x28 images of handwritten digits 0-9 with corresponding labels.
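To make this concrete, here is a minimal supervised-learning sketch in Python (assuming scikit-learn is installed). It uses scikit-learn's small built-in 8x8 digits dataset as a stand-in for the full 28x28 MNIST images, and the classifier choice (logistic regression) is just for illustration, not what these slides used.

from sklearn.datasets import load_digits
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Labeled data: each 8x8 digit image comes with its true digit 0-9.
digits = load_digits()
X_train, X_test, y_train, y_test = train_test_split(
    digits.data, digits.target, test_size=0.2, random_state=0)

# Fit a classifier on the labeled training set...
model = LogisticRegression(max_iter=1000)
model.fit(X_train, y_train)

# ...and predict labels for images it has never seen.
print("held-out accuracy:", model.score(X_test, y_test))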
SL Example: Optical Character Recognition

Tesseract is a software/model that can extract text from images. For English, it was trained on 400,000 lines of text spanning 4,500 different fonts.

It allows you to fine-tune on additional data (if working with a weird font) or retrain on your own dataset.

Similar models are now showing up elsewhere:
● iPhone camera live text
● Google camera translate
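As a rough idea of how Tesseract is called from Python, here is a sketch using the pytesseract wrapper. It assumes Tesseract and pytesseract are installed; "scan.png" is a hypothetical image file, not one from these slides.

import pytesseract
from PIL import Image

# Run Tesseract's pretrained English model on an image and get the text back.
# "scan.png" is a placeholder; point it at any image containing printed text.
text = pytesseract.image_to_string(Image.open("scan.png"), lang="eng")
print(text)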
Under the Hood: Linear Regression Primer

We want to find a line that best explains our data and minimizes error.

In this simple example,

# umbrellas sold ≈ 0.5 * rainfall (mm) + 10

Here, 0.5 is the slope/weight and 10 is the y-intercept.

It’s not an exact model, as some points are over/under the line. However, this is the best linear regression model possible.
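A minimal sketch of fitting such a line in Python with NumPy; the rainfall/umbrella numbers below are made up purely for illustration.

import numpy as np

# Hypothetical observations: daily rainfall (mm) and umbrellas sold that day.
rainfall  = np.array([0, 5, 10, 20, 30, 40], dtype=float)
umbrellas = np.array([9, 13, 16, 19, 26, 29], dtype=float)

# Least-squares fit of a degree-1 polynomial: returns [slope, intercept].
slope, intercept = np.polyfit(rainfall, umbrellas, 1)
print(f"umbrellas ≈ {slope:.2f} * rainfall + {intercept:.2f}")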
Under the Hood: Linear Regression Primer

How do we calculate the error?

Average Error = (1/n) ∑ᵢ₌₁ⁿ (predictionᵢ – actualᵢ)²
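The same calculation in Python, as a quick sketch (the prediction and actual arrays are hypothetical values):

import numpy as np

predictions = np.array([12.0, 18.0, 25.0])   # what the model said
actuals     = np.array([10.0, 20.0, 24.0])   # what really happened

# Mean squared error: average of the squared differences.
average_error = np.mean((predictions - actuals) ** 2)
print(average_error)   # (4 + 4 + 1) / 3 = 3.0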
Under the Hood: Gradient Descent Primer

Let's say we have no idea what the best model is. How do we begin?

Pick a random slope/weight!

Calculate the error of the model using our starting weight.

Find the slope (derivative/gradient) of the error function. This tells us whether to increase or decrease the weight to reduce error.

Repeat until the slope is zero and the error is minimized!
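A bare-bones sketch of this loop in Python, using the umbrella example with the intercept fixed at 10 so only the slope/weight is learned; the data and learning rate are illustrative assumptions.

import numpy as np

rainfall  = np.array([0, 5, 10, 20, 30, 40], dtype=float)
umbrellas = np.array([9, 13, 16, 19, 26, 29], dtype=float)

w = np.random.rand()          # 1. pick a random slope/weight
intercept = 10.0
learning_rate = 0.001

for step in range(1000):
    predictions = w * rainfall + intercept
    error = np.mean((predictions - umbrellas) ** 2)                # 2. current error
    gradient = np.mean(2 * (predictions - umbrellas) * rainfall)   # 3. d(error)/dw
    w -= learning_rate * gradient                                  # 4. nudge w downhill
    if abs(gradient) < 1e-6:                                       # stop when (nearly) flat
        break

print(f"learned slope: {w:.3f}, final error: {error:.3f}")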
Under the Hood: Animation
How Do We Use This?

We can use this principle to fit a lot of different types of models.

Let's go look at neural networks.


SL Example: Neural Network

A neuron is a cell in your brain that uses electricity to transfer information.

A neuron accepts electrical input via its dendrites.

If the input from previous neurons is strong enough, the signal passes on to the next neurons.
SL Example: Neural Network

An artificial neuron accepts numeric input.

If the input from previous neurons is strong enough, it passes on to the next neurons.

One neuron by itself is practically a linear regression in multiple variables; it takes a weighted average of its inputs.
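A sketch of a single artificial neuron in Python, using the three inputs shown on the next slide; the weights and bias are made-up values, and the purely linear form (no activation function) is chosen to mirror the linear-regression comparison above.

import numpy as np

# Three numeric inputs: rainfall (mm), rained yesterday (1/0), avg rainfall this month (mm).
inputs  = np.array([12.0, 1.0, 20.0])
weights = np.array([0.4, 2.0, 0.1])   # how much each input matters (illustrative)
bias    = 10.0                         # like the y-intercept in linear regression

# The neuron's output is a weighted sum of its inputs plus the bias.
output = np.dot(weights, inputs) + bias
print("predicted umbrellas sold:", output)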
SL Example: Neural Network

[Diagram: three inputs feeding a single output neuron]
Inputs: Rainfall (mm); Did it rain yesterday? (Y/N); Average rainfall for current month (mm)
Output: How many umbrellas sold?
Types of Machine Learning

Three main types of ML:
● Supervised
● Unsupervised
● Reinforcement Learning
Unsupervised Learning

Assume we do not have labels. Can we group “similar” images together?

Let's go look at our handwritten image data again.
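As a rough sketch of grouping unlabeled digit images, here is k-means clustering with scikit-learn on its built-in 8x8 digits dataset (the true labels are loaded only to inspect the grouping afterwards, never used for training); this is an illustration, not the exact demo from these slides.

from sklearn.cluster import KMeans
from sklearn.datasets import load_digits
import numpy as np

digits = load_digits()

# Cluster the images into 10 groups using only the pixel values -- no labels involved.
kmeans = KMeans(n_clusters=10, n_init=10, random_state=0)
cluster_ids = kmeans.fit_predict(digits.data)

# Peek at one cluster: which true digits ended up grouped together?
print(np.bincount(digits.target[cluster_ids == 0], minlength=10))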


UL Example: word2vec

The method word2vec clusters words based on context and meaning:

● What words is this word often seen next to?
● What words are often used in the same place?

This method was invented in 2013 and has revolutionized natural language processing (NLP).
UL Example: word2vec Under the Hood

There are more than 100,000 words in the English language! Three million if we include proper nouns and various word forms (pretty/prettier/unpretty).

We generally use a 300-dimensional vector to store them all without overlap.

One version is trained on Google News data (100 billion words). Another is trained on Wikipedia.
UL Example: word2vec Under the Hood

We can do word math!

King – Man + Woman = Queen
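To try this word math yourself, here is a sketch assuming the gensim library and its downloadable "word2vec-google-news-300" pretrained vectors (a large one-time download); treat the snippet as illustrative rather than part of the original slides.

import gensim.downloader as api

# Load 300-dimensional word2vec vectors trained on Google News.
vectors = api.load("word2vec-google-news-300")

# King - Man + Woman ≈ ?
result = vectors.most_similar(positive=["king", "woman"], negative=["man"], topn=1)
print(result)   # the top match should be "queen", with a similarity score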


UL Example: Semantle

If you like Wordle-type games, check out Semantle!

https://semantle.com/

Guess the secret word: the closer your guess, the higher its word2vec score (and the lower the distance).
Types of Machine Learning

Three main types of ML:
● Supervised
● Unsupervised
● Reinforcement Learning
Reinforcement Learning

An agent interacts with its environment in such a way as to maximize reward (a toy sketch of this loop follows the example below).

For example:
● TD-Gammon, a computer program that plays backgammon
● Rewarded when winning!
● On par with top human backgammon players
● Taught them some new strategies
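A toy reward-maximization sketch in Python: tabular Q-learning on a made-up five-square corridor where the agent earns a reward for reaching the right end. This is not how TD-Gammon worked (it used TD learning with a neural network); it is only meant to show an agent improving its behavior from reward alone.

import numpy as np

# Toy "corridor" environment: states 0..4, start at 0, reward of +1 for reaching state 4.
# Actions: 0 = move left, 1 = move right.
n_states, n_actions = 5, 2
Q = np.zeros((n_states, n_actions))       # estimated value of each (state, action) pair
alpha, gamma, epsilon = 0.1, 0.9, 0.2     # learning rate, discount, exploration rate
rng = np.random.default_rng(0)

for episode in range(500):
    state = 0
    while state != n_states - 1:          # episode ends when the agent reaches the goal
        # Epsilon-greedy: mostly exploit the best known action, sometimes explore.
        action = rng.integers(n_actions) if rng.random() < epsilon else int(np.argmax(Q[state]))
        next_state = max(state - 1, 0) if action == 0 else min(state + 1, n_states - 1)
        reward = 1.0 if next_state == n_states - 1 else 0.0
        # Q-learning update: nudge the estimate toward reward + discounted future value.
        Q[state, action] += alpha * (reward + gamma * np.max(Q[next_state]) - Q[state, action])
        state = next_state

print(Q)   # "move right" should end up with the higher value in every state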
Where Are You Going to See Machine Learning?

As a user, you already see it everywhere:
● LinkedIn Learning recommendations
● Phone camera apps
● Etc.

As a developer, you may use it more often:
● Pretrained models in third-party software like Tesseract-OCR and AWS Rekognition
● GitHub Copilot?
● Etc.

As a data analyst/scientist:
● Create and use models as part of analyses
● On some level, it’s no different than running a linear regression
Thanks for Bearing With Me!
Questions?
