Framing A Machine Learning Problem
Facilitators:
Rahman, Brian, Eva, Andrew, George,
Mark, Peter, Confred
Today's Agenda
Defining an ML problem and proposing a solution:
Identifying good ML problems
Deciding on ML
Formulating a problem as an ML problem
Classification: predict the label of a previously unseen example, e.g., tell an image of a dog from that of a cat, or a bicycle from a motorbike
Regression: predict a numerical value, e.g., predict the price of houses
https://developers.google.com/machine-learning/problem-framing/cases#check-your-understanding
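To make the two problem types concrete, here is a minimal sketch in plain Python. The data, the feature choices (vehicle weight, floor area) and the helper names `predict_label` and `fit_line` are all invented for illustration; a nearest-neighbour rule and a least-squares line are just two simple stand-ins for "a classifier" and "a regressor".

```python
def predict_label(train, x):
    """Classification: return the label of the closest training example (1-nearest neighbour)."""
    nearest = min(train, key=lambda pair: abs(pair[0] - x))
    return nearest[1]


def fit_line(xs, ys):
    """Regression: least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Classification: the output is a label from a fixed set.
# Made-up examples of (vehicle weight in kg, label).
vehicles = [(9, "bicycle"), (12, "bicycle"), (15, "bicycle"),
            (150, "motorbike"), (180, "motorbike"), (220, "motorbike")]
label = predict_label(vehicles, 14)

# Regression: the output is a number.
# Made-up examples of floor area (square metres) and price (in thousands).
areas = [50, 80, 100, 120]
prices = [100, 160, 200, 240]
slope, intercept = fit_line(areas, prices)
predicted_price = slope * 90 + intercept

print(label, round(predicted_price, 1))
```

The only structural difference between the two cases is the type of the target: a category in the first, a number in the second.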
"Machine Learning changes the way
you think about a problem. The focus
shifts from a mathematical science to a
natural science, running experiments
and using statistics, not logic, to
analyse its results." - Peter Norvig -
Google Research Director
Scientific method
It is helpful to think of the ML process as an
experiment where we run test after test after test
to converge on a workable model
Like an experiment, the process can be exciting,
challenging, and ultimately worthwhile
1. Set the research goal: I want to predict how heavy traffic will be on a given day.
2. Make a hypothesis: I think the weather forecast is an informative signal for traffic prediction!
3. Collect the required data: Collect historical traffic data and weather data on each day.
4. Test your hypothesis: Train a model using this data to predict traffic.
5. Analyze the results you get: Is this model better than existing systems for traffic prediction?
6. Draw a conclusion: I should (not) use this model to make traffic predictions, because of X, Y, and Z.
7. Refine your hypothesis and repeat: Could time of year be a helpful signal for traffic prediction?
ML Bootcamp Sept 16 - Oct 7, 2023
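The traffic experiment above can be sketched in a few lines of plain Python. Everything here is an assumption made for illustration: the numbers are invented, forecast rainfall stands in for "the weather forecast", a straight-line fit stands in for "a model", and "always predict the historical average" stands in for the existing system. With only two held-out days, the comparison shows the procedure, not a trustworthy result.

```python
def fit_line(xs, ys):
    """Least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def mean_absolute_error(actual, predicted):
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


# Step 3: collect data. Invented (forecast rainfall in mm, vehicles per hour) pairs.
history = [(0, 1000), (2, 1060), (5, 1140), (8, 1250),
           (10, 1290), (1, 1020), (6, 1190), (3, 1080)]
train, test = history[:6], history[6:]  # keep the last days unseen

# Step 4: test the hypothesis by training a model on the weather signal.
slope, intercept = fit_line([r for r, _ in train], [t for _, t in train])

# Step 5: analyze the results against the baseline on the unseen days.
actual = [t for _, t in test]
model_predictions = [slope * r + intercept for r, _ in test]
average_traffic = sum(t for _, t in train) / len(train)
baseline_predictions = [average_traffic] * len(test)

model_mae = mean_absolute_error(actual, model_predictions)
baseline_mae = mean_absolute_error(actual, baseline_predictions)

# Step 6: draw a conclusion.
if model_mae < baseline_mae:
    print("The weather signal helps on this data; keep refining.")
else:
    print("No improvement over the baseline; revisit the hypothesis.")
```

Step 7 then amounts to adding or swapping a feature (for example, time of year) and running the same comparison again.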
Identifying good problems for ML
Characteristics of a good ML problem
Clear use case
* Start with the problem, not the solution. Make sure you aren't treating ML as a
hammer for your problems
Focus on problems that would be difficult to solve with traditional
programming, e.g.,
Smart Reply – automated email reply, saves user time
Google Photos – find a specific photo by keyword search without
manual tagging
* ML solves problems by examining patterns in data and adapting to them
Ask yourself the following questions:
What is the problem being faced?
Would it be a good problem for ML?
Know the problem before focusing on the data
* Be prepared to have your assumptions challenged
Once you have a clear understanding of the problem, list potential
solutions to test in order to find the best model
Understand that you'll have to try out a few solutions before you
land on a good working model
EDA helps you understand your data, but you can't yet
claim that patterns you find generalize until you check
those patterns against previously unseen data
Failure to check could lead you in the wrong direction or reinforce
stereotypes or bias
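A minimal sketch of that check, in plain Python: measure a pattern on the data you explored, then measure it again on data you held back. The data is invented, and the `generalizes` helper with its `min_strength` threshold of 0.5 is a hypothetical rule of thumb chosen for this example, not a standard test.

```python
import math


def pearson(pairs):
    """Pearson correlation coefficient for a list of (x, y) pairs."""
    n = len(pairs)
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    sxx = sum((x - mean_x) ** 2 for x, _ in pairs)
    syy = sum((y - mean_y) ** 2 for _, y in pairs)
    return sxy / math.sqrt(sxx * syy)


def generalizes(r_explored, r_unseen, min_strength=0.5):
    """Hypothetical rule: same direction and still reasonably strong on unseen data."""
    same_direction = r_explored * r_unseen > 0
    return same_direction and abs(r_unseen) >= min_strength


# Invented data: the pattern looks strong in the explored sample...
explored = [(1, 2), (2, 4), (3, 5), (4, 8)]
# ...but is absent from the held-out sample.
unseen = [(1, 5), (2, 3), (3, 6), (4, 4)]

r_explored = pearson(explored)
r_unseen = pearson(unseen)
print(f"explored r = {r_explored:.2f}, unseen r = {r_unseen:.2f}")
print("pattern generalizes:", generalizes(r_explored, r_unseen))
```

In this toy case the pattern found during exploration disappears on the held-out data, so it should not be trusted as a basis for a model.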
Data, data, more data
* ML requires a lot of relevant data
Data collected specifically for your task is most useful
In practice, secondary data is used in the majority of applications
How much is a lot? It depends on the ML problem,
but more data generally improves your model (e.g., its robustness) and
its predictive power. A good rule of thumb is to have at least
thousands of examples for basic linear models, and hundreds of
thousands for neural networks. If you have less data, consider a
non-ML solution first and/or transfer learning methods
Predictive Power
* Your features should contain predictive power
Ensure your data set contains relevant features that
correlate with the phenomenon being investigated
e.g., is bedroom count a good predictor for house prices?
Don't try out features arbitrarily without a hypothesis
Your goal is to build a model that generalizes well to
previously unseen samples and this is possible only
if you use the right features
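One quick, rough way to test a feature hypothesis is to see how strongly each candidate feature correlates with the target. The sketch below uses invented house data, and `house_number` is a deliberately irrelevant feature added for contrast. Pearson correlation only captures linear relationships, so a low score is a prompt to look closer, not proof that a feature is useless.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Invented data for five houses.
price_thousands = [150, 200, 260, 310, 400]
candidate_features = {
    "bedrooms": [1, 2, 3, 3, 5],          # hypothesis: more bedrooms, higher price
    "house_number": [12, 7, 45, 3, 22],   # no reason to expect a relationship
}

# Rank the candidates by the strength of their linear relationship with price.
ranking = sorted(
    ((name, pearson(values, price_thousands)) for name, values in candidate_features.items()),
    key=lambda item: abs(item[1]),
    reverse=True,
)

for name, r in ranking:
    print(f"{name}: r = {r:.2f}")
```

On this toy data bedroom count correlates strongly with price while the house number does not, which matches the hypothesis that motivated each feature.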
Predictions vs. Decisions
* Aim to make decisions, not just predictions
Your product should take action on the output of the ML model
ML is better at making decisions than at deriving insight from
data (for the latter, use statistical approaches)
Ensure predictions allow you to take a useful action, e.g.,
a model that predicts the likelihood of clicking certain videos
could allow a system to pre-fetch the videos most likely to
be clicked
Prediction: What video the learner wants to watch next
Decision: Show those videos in the recommendation bar
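The step from prediction to decision can be as small as a ranking and a cut-off. In this sketch the click probabilities are invented stand-ins for a model's output, and the function name `videos_to_prefetch`, the limit `k` and the `min_probability` threshold are hypothetical choices for illustration.

```python
def videos_to_prefetch(click_probabilities, k=2, min_probability=0.2):
    """Decision rule: pre-fetch up to k videos whose predicted click probability is high enough."""
    ranked = sorted(click_probabilities.items(), key=lambda item: item[1], reverse=True)
    return [video for video, probability in ranked[:k] if probability >= min_probability]


# Prediction: invented model output, one click probability per candidate video.
predicted = {
    "intro_to_ml": 0.62,
    "linear_regression": 0.25,
    "neural_nets": 0.10,
    "cooking_basics": 0.03,
}

# Decision: the action the product takes based on those predictions.
prefetch = videos_to_prefetch(predicted)
print(prefetch)
```

The model only supplies the probabilities; the product value comes from the rule that turns them into an action.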
Not all problems require or need to be solved
using ML