Moving Your Machine Learning Models to Production with TensorFlow Extended
Jonathan Mugan
Moving From Our Hut to the Production Floor
Your model is going to live for a long time. Not just for a demo.
You must know when to update it. The world changes.
You must ensure production data matches training data. Data reflects its origins.
You may need to track multiple model versions. E.g., for different states.
You need to batch the input to serving. One-at-a-time is slow.
Interchangeable Parts and the ML Revolution
Data Ingestion (ExampleGen)
TensorFlow Data Validation (StatisticsGen, SchemaGen, Example Validator)
TensorFlow Transform (Transform)
Estimator or Keras Model (Trainer)
TensorFlow Model Analysis (Evaluator, Model Validator)
Validation Outcomes (Pusher)
TensorFlow Serving
Image by https://www.flickr.com/photos/36224933@N07
https://creativecommons.org/licenses/by-sa/2.0/deed.en
• TensorFlow Extended (TFX)
• TFX is used internally by Google and was recently open sourced
• Represents your pipeline to production as a sequence of components
• Building any one model is more work, but for large endeavors, TFX helps keep you organized
Outline
• Introduction to TensorFlow Extended (TFX)
• TensorFlow Extended Pipeline Components
• Running the Pipeline
• TensorFlow and TensorFlow Tools
• Alternatives to TensorFlow Extended
• Other Useful Tools
• Conclusion
TensorFlow Extended
The pipeline diagram is organized by library (components in parentheses).
ML Metadata (MLMD)
• Individual components talk to the metadata store (MLMD).
• MLMD doesn’t store the data itself. It stores data about data.
https://www.tensorflow.org/tfx/guide
https://www.tensorflow.org/tfx/guide/mlmd
Each component has three pieces:
1. Driver (gets information from MLMD)
2. Executor (does what the component does)
3. Publisher (writes results to MLMD)
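The three pieces of a component can be pictured with a small sketch. This is not TFX code: `MetadataStore`, `Component`, and the URIs are hypothetical names made up for illustration, and the store is just an in-memory list standing in for MLMD.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class MetadataStore:
    """Hypothetical in-memory stand-in for MLMD: it keeps records about
    artifacts (names and URIs), never the data itself."""
    records: List[Dict[str, str]] = field(default_factory=list)

    def latest_uri(self, artifact_name: str) -> str:
        # Scan from the newest record backwards.
        for record in reversed(self.records):
            if record["name"] == artifact_name:
                return record["uri"]
        raise KeyError(artifact_name)

    def publish(self, artifact_name: str, uri: str) -> None:
        self.records.append({"name": artifact_name, "uri": uri})


@dataclass
class Component:
    input_name: str
    output_name: str
    executor: Callable[[str], str]  # maps an input URI to an output URI

    def run(self, store: MetadataStore) -> str:
        input_uri = store.latest_uri(self.input_name)   # 1. driver
        output_uri = self.executor(input_uri)           # 2. executor
        store.publish(self.output_name, output_uri)     # 3. publisher
        return output_uri


store = MetadataStore()
store.publish("examples", "/data/examples/1")  # made-up URI
statistics_gen = Component(
    input_name="examples",
    output_name="statistics",
    executor=lambda uri: uri.replace("examples", "statistics"),
)
print(statistics_gen.run(store))  # /data/statistics/1
```

Because every component only reads from and writes to the store, components can be swapped without the others knowing, which is the "interchangeable parts" idea.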
Data Ingestion: ExampleGen
• Pulls in your data and puts it into a binary format
• Also splits it into train and test sets
• Uses Protocol Buffers
• Writes tf.Example records into a TFRecord file
https://www.tensorflow.org/tutorials/load_data/tfrecord
https://www.tensorflow.org/tfx/guide/examplegen
TensorFlow Data Validation: StatisticsGen, SchemaGen, Example Validator
Looks at your data and generates a schema, which you then update manually.
It makes sure the data you pass in later during serving is still in the same format and hasn’t drifted.
It also has a great way to visualize data, Facets, which we will see later.
https://www.tensorflow.org/tfx/guide/tfdv
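The idea of inferring a schema from training data and checking later data against it can be sketched in plain Python. This is a simplified illustration, not the TFDV API: `infer_schema`, `validate`, and the feature names are assumptions made up for the example.

```python
from typing import Any, Dict, List


def infer_schema(rows: List[Dict[str, Any]]) -> Dict[str, Dict[str, Any]]:
    """Infer a type for every feature, plus the set of values seen for strings."""
    schema: Dict[str, Dict[str, Any]] = {}
    for row in rows:
        for name, value in row.items():
            entry = schema.setdefault(
                name, {"type": type(value).__name__, "domain": set()}
            )
            if isinstance(value, str):
                entry["domain"].add(value)
    return schema


def validate(row: Dict[str, Any], schema: Dict[str, Dict[str, Any]]) -> List[str]:
    """Return a list of anomalies found in one serving-time row."""
    anomalies = []
    for name, entry in schema.items():
        if name not in row:
            anomalies.append(f"{name}: missing")
        elif type(row[name]).__name__ != entry["type"]:
            anomalies.append(f"{name}: expected {entry['type']}")
        elif entry["domain"] and row[name] not in entry["domain"]:
            anomalies.append(f"{name}: unexpected value {row[name]!r}")
    return anomalies


# Hypothetical training rows.
training_rows = [
    {"payment_type": "cash", "trip_miles": 1.2},
    {"payment_type": "card", "trip_miles": 3.4},
]
schema = infer_schema(training_rows)
print(validate({"payment_type": "card", "trip_miles": 0.7}, schema))
print(validate({"payment_type": "crypto", "trip_miles": "far"}, schema))
```

The first row passes with no anomalies; the second is flagged for an unseen category and a wrong type. The manual update step mentioned above would correspond to editing `schema` by hand, for example adding a value to a domain.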
Example Schema (figure)
TensorFlow Transform: Transform
Converts your data
• E.g., one-hot encoding, categorical features with a vocabulary
• Part of the TensorFlow graph, for better or worse
• Good for transformations that require looking at all values
Example: tft.scale_to_z_score subtracts the mean and divides by the standard deviation
Features come in many types, and TensorFlow Transform converts them into a format that can be ingested by a machine learning model.
Nice to have this explicit.
https://www.tensorflow.org/tfx/guide/transform
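Why some transformations "require looking at all values" can be shown with a two-phase sketch: one full pass to compute statistics, then a per-value function that reuses them. This is plain Python rather than TensorFlow Transform; the helper names are hypothetical, and using the population standard deviation is an assumption of the sketch.

```python
import statistics
from typing import Callable, List, Sequence


def analyze_z_score(values: Sequence[float]) -> Callable[[float], float]:
    """Full pass over the data: compute mean and standard deviation once."""
    mean = statistics.mean(values)
    std = statistics.pstdev(values)  # population std, an assumption here
    return lambda x: (x - mean) / std


def analyze_vocabulary(values: Sequence[str]) -> Callable[[str], List[int]]:
    """Full pass over the data: build a vocabulary, then one-hot encode."""
    vocab = sorted(set(values))
    return lambda x: [1 if x == token else 0 for token in vocab]


training_miles = [1.0, 2.0, 3.0, 4.0, 5.0]
scale = analyze_z_score(training_miles)
one_hot = analyze_vocabulary(["cash", "card", "cash"])

print(scale(3.0))       # the mean maps to 0.0
print(one_hot("cash"))  # vocabulary is ["card", "cash"], so [0, 1]
```

The functions returned by the analyze step carry the training-time statistics with them, so exactly the same conversion can be applied at serving time.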
Estimator or Keras Model: Trainer
• Trains the model: the part we are all familiar with
• Except that it uses an Estimator
• Can use Keras via tf.keras.estimator.model_to_estimator()
https://www.tensorflow.org/tfx/guide/trainer
TensorFlow Model Analysis: Evaluator, Model Validator
Evaluator Component
• Evaluates the model.
• Uses TensorFlow Model Analysis (TFMA), which we will see shortly.
• https://www.tensorflow.org/tfx/guide/evaluator
Model Validator Component
• You set a baseline (such as the current serving model) and a metric (such as AUC).
• Marks in the metadata whether the model passes the baseline.
• https://www.tensorflow.org/tfx/guide/modelval
Validation Outcomes: Pusher
• Pushes the model to Serving if it is validated
• I.e., if your new model is better than the existing model, push it to the model server.
For a deeper understanding, see Ice-T’s 1988 hit song, “I’m Your Pusher”
https://www.tensorflow.org/tfx/guide/pusher
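The validate-then-push gate can be sketched as follows. This is an illustration of the logic only, not the Model Validator or Pusher components themselves: `ModelRecord`, the AUC numbers, and the dictionary standing in for the model server are all made up.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass
class ModelRecord:
    version: int
    auc: float
    blessed: bool = False


def validate_model(candidate: ModelRecord, baseline: ModelRecord) -> ModelRecord:
    """Mark the candidate as blessed only if it beats the baseline metric."""
    candidate.blessed = candidate.auc > baseline.auc
    return candidate


def push_if_blessed(candidate: ModelRecord, serving: Dict[int, ModelRecord]) -> bool:
    """Copy a blessed candidate to the (hypothetical) model server registry."""
    if candidate.blessed:
        serving[candidate.version] = candidate
    return candidate.blessed


serving_models = {1: ModelRecord(version=1, auc=0.81, blessed=True)}
baseline = serving_models[1]

worse = validate_model(ModelRecord(version=2, auc=0.79), baseline)
better = validate_model(ModelRecord(version=3, auc=0.84), baseline)
print(push_if_blessed(worse, serving_models))   # False
print(push_if_blessed(better, serving_models))  # True
print(sorted(serving_models))                   # [1, 3]
```

Keeping the validation result as a recorded flag, rather than pushing directly from the validator, means the decision stays visible in the metadata even when nothing is pushed.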
TensorFlow Serving
• Uses the model to perform inference
• Called via gRPC or RESTful APIs
• Easy to get running with Docker
• You can call a particular version of a model
• Takes care of batching
https://www.tensorflow.org/tfx/guide/serving
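As a rough illustration of calling a particular model version over REST, the sketch below only builds the request and does not send it. The URL shape follows the TensorFlow Serving REST predict endpoint; the host, port 8501, model name `taxi`, and feature name are assumptions for the example.

```python
import json
from typing import List, Optional, Tuple


def build_predict_request(
    host: str,
    model_name: str,
    instances: List[dict],
    version: Optional[int] = None,
) -> Tuple[str, str]:
    """Return the URL and JSON body for a TensorFlow Serving REST predict call."""
    path = f"/v1/models/{model_name}"
    if version is not None:
        # Pin a specific version; without this the server picks one to serve.
        path += f"/versions/{version}"
    url = f"http://{host}{path}:predict"
    body = json.dumps({"instances": instances})
    return url, body


url, body = build_predict_request(
    host="localhost:8501",   # assumed local Docker setup
    model_name="taxi",       # hypothetical model name
    instances=[{"trip_miles": 1.2}, {"trip_miles": 3.4}],
    version=2,
)
print(url)
print(body)
```

Sending several instances in one request, as here, avoids the slow one-at-a-time pattern on the client side; the batching that the server does is configured separately.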
Outline
• Introduction to TensorFlow Extended (TFX)
• TensorFlow Extended Pipeline Components
• Running the Pipeline
    • ML Metadata
    • Apache Airflow
• TensorFlow and TensorFlow Tools
• Alternatives to TensorFlow Extended
• Other Useful Tools
• Conclusion
Metadata Store
ML Metadata (MLMD)
https://www.tensorflow.org/tfx/guide/mlmd
Looking at the Artifact table using the DB Browser for SQLite
The Types of Artifacts: the type_id field from the Artifact table on the previous slide maps here
You can see the properties of the artifacts in the ArtifactProperty table
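The same lookups can be done from Python instead of a database browser. The sketch below builds a tiny in-memory SQLite database so that it is self-contained; the three tables are a heavily simplified assumption modelled on the table names mentioned above, and the rows are made up. It is not the full MLMD schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE Type (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE Artifact (id INTEGER PRIMARY KEY, type_id INTEGER, uri TEXT);
    CREATE TABLE ArtifactProperty (artifact_id INTEGER, name TEXT, string_value TEXT);

    -- Made-up rows for illustration.
    INSERT INTO Type VALUES (1, 'Examples'), (2, 'Schema');
    INSERT INTO Artifact VALUES (10, 1, '/pipeline/examples/1'), (11, 2, '/pipeline/schema/2');
    INSERT INTO ArtifactProperty VALUES (10, 'split', 'train'), (11, 'state', 'published');
    """
)

# Resolve each artifact's type_id to a type name and attach its properties.
rows = conn.execute(
    """
    SELECT Type.name, Artifact.uri, ArtifactProperty.name, ArtifactProperty.string_value
    FROM Artifact
    JOIN Type ON Type.id = Artifact.type_id
    JOIN ArtifactProperty ON ArtifactProperty.artifact_id = Artifact.id
    ORDER BY Artifact.id
    """
).fetchall()

for row in rows:
    print(row)
```

Against a real pipeline you would connect to the SQLite file the pipeline writes instead of `:memory:` and skip the `CREATE` and `INSERT` statements.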
Pipeline Management with Apache Airflow
Allows you to trigger and keep track of pipelines.
• You can also use Kubeflow https://github.com/tensorflow/tfx/blob/master/tfx/examples/chicago_taxi_pipeline/taxi_pipeline_kubeflow_gcp.py
• And Apache Beam https://github.com/tensorflow/tfx/blob/master/tfx/examples/chicago_taxi_pipeline/taxi_pipeline_beam.py
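Airflow models a pipeline as a directed acyclic graph of tasks. The sketch below shows only that idea with Python's standard library (`graphlib`, Python 3.9 or later); it does not use Airflow, and the dependency edges between components are an assumption made for illustration.

```python
from graphlib import TopologicalSorter

# Each component maps to the set of components it depends on (assumed edges).
pipeline = {
    "ExampleGen": set(),
    "StatisticsGen": {"ExampleGen"},
    "SchemaGen": {"StatisticsGen"},
    "ExampleValidator": {"StatisticsGen", "SchemaGen"},
    "Transform": {"ExampleGen", "SchemaGen"},
    "Trainer": {"Transform", "SchemaGen"},
    "Evaluator": {"ExampleGen", "Trainer"},
    "ModelValidator": {"ExampleGen", "Trainer"},
    "Pusher": {"Trainer", "ModelValidator"},
}

# Any order in which every component runs after all of its dependencies.
run_order = list(TopologicalSorter(pipeline).static_order())
print(run_order)
```

An orchestrator adds scheduling, retries, and a UI for tracking runs on top of this ordering.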
Outline
• Introduction to TensorFlow Extended (TFX)
• TensorFlow Extended Pipeline Components
• Running the Pipeline
• TensorFlow and TensorFlow Tools
    • TensorFlow 2.0
    • TensorFlow Data and Features
    • TensorFlow Estimators
    • TensorBoard
    • TensorFlow Data Validation (TFDV) [Facets]
    • TensorFlow Model Analysis (TFMA)
    • What-If Tool
• Alternatives to TensorFlow Extended
• Other Useful Tools
• Conclusion
TensorFlow 2.0
• You don’t have to define the graph separately
• More like PyTorch
• There are two ways you can do computation:
    • Eager: like PyTorch, just compute
    • tf.function: you decorate a function and call it
TensorFlow 1.x: session → output
TensorFlow 2.x: tf.function → output. You still get the performance of a Session. https://www.tensorflow.org/guide/function
TensorFlow 2.x: eager → output. Debug like a civilized person. https://www.tensorflow.org/guide/eager
Data
• tf.train.Example is a protobuf message made up of named tf.train.Feature values, where each value has a name and a type (tf.train.BytesList, tf.train.FloatList, tf.train.Int64List)
• TFRecord is a format for storing sequences of binary records; each record is a serialized tf.train.Example
• tf.data.Dataset can take in TFRecord files and create an iterator for batching
• tf.io.parse_example (tf.parse_example in TensorFlow 1.x) unpacks tf.Example into standard tensors.
https://www.tensorflow.org/tutorials/load_data/tf_records
Features
• tf.feature_column, where you further specify what each feature is, such as one-hot, vocabulary, or embedding.
https://www.tensorflow.org/guide/feature_columns
tf.train.Example specifies what the data is for storage, and tf.feature_column specifies what it is for the input to a model.
To build a model you need
• format of model input
    • tf.feature_column
• model architecture and hyperparameters
    • tf.estimator
    • (or Keras with tf.keras.estimator.model_to_estimator)
• function to deliver training data
    • tf.estimator.TrainSpec from tf.data
• function to deliver eval data
    • tf.estimator.EvalSpec from tf.data
• function to deliver serving data
    • tf.estimator.FinalExporter
TensorFlow Estimator
• Estimator is a wrapper for regular TensorFlow that automatically scales to multiple machines and automatically outputs results to TensorBoard
Shout-out to model explainability using Estimators with boosted trees
https://www.tensorflow.org/tutorials/estimator/boosted_trees_model_understanding
https://www.tensorflow.org/tutorials/estimator/boosted_trees
TensorBoard
Plotting of prescribers: red has more overdoses, green has fewer
3D plots are not that useful, but they look cool
You can even use TensorBoard from PyTorch
https://pytorch.org/docs/stable/tensorboard.html
TensorFlow Data Validation (TFDV)
• We need to understand our data as well as possible.
• TFDV provides tools that make that less difficult.
• Helps to identify bugs in the data by showing you pictures that don’t look right.
• https://www.tensorflow.org/tfx/data_validation/get_started
By sorting by non-uniformity, we can debug features.
In general, we can make sure the distributions are what we would expect.
TensorFlow Model Analysis (TFMA)
We can see how well our model does by each slice.
We see that this model does much better for females than males.
https://www.tensorflow.org/tfx/guide/tfma
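Evaluating by slice simply means computing the same metric separately for each value of a feature. A minimal plain-Python sketch follows; it is not the TFMA API, and the rows are made up to mirror the observation above.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

# (slice value, true label, predicted label); made-up rows for illustration.
predictions: List[Tuple[str, int, int]] = [
    ("female", 1, 1),
    ("female", 0, 0),
    ("female", 1, 1),
    ("male", 1, 0),
    ("male", 0, 0),
    ("male", 1, 0),
]


def accuracy_by_slice(rows: List[Tuple[str, int, int]]) -> Dict[str, float]:
    """Compute accuracy separately for every value of the slicing feature."""
    correct: Dict[str, int] = defaultdict(int)
    total: Dict[str, int] = defaultdict(int)
    for slice_value, label, predicted in rows:
        total[slice_value] += 1
        correct[slice_value] += int(label == predicted)
    return {key: correct[key] / total[key] for key in total}


print(accuracy_by_slice(predictions))
```

An overall accuracy of 4 out of 6 would hide the gap between the two slices, which is why the per-slice view is useful.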
What-If Tool
• The What-If Tool applies a model from TensorFlow Serving to any data you give it.
• https://pair-code.github.io/what-if-tool/index.html
• Change a record and see what the model does
• Find the most similar record with a different classification
• Can be used for fairness: adjust the model so it is equally likely to predict “yes” for each group
• https://www.coursera.org/lecture/machine-learning-business-professionals/activity-applying-fairness-concerns-with-the-what-if-tool-review-0mYda
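Finding the most similar record with a different classification can be sketched as a nearest-neighbour search restricted to records of another class. This is an illustration of the idea, not the What-If Tool's implementation; the records are made up, and Euclidean distance is an assumption of the sketch.

```python
import math
from typing import List, Optional, Sequence, Tuple

Record = Tuple[Sequence[float], str]  # (numeric features, predicted class)


def nearest_counterfactual(query: Record, records: List[Record]) -> Optional[Record]:
    """Return the closest record whose predicted class differs from the query's."""
    features, predicted = query
    candidates = [r for r in records if r[1] != predicted]
    if not candidates:
        return None
    return min(candidates, key=lambda r: math.dist(features, r[0]))


records: List[Record] = [
    ((25.0, 1.0), "no"),
    ((40.0, 3.0), "yes"),
    ((31.0, 2.0), "yes"),
    ((29.0, 1.0), "no"),
]
print(nearest_counterfactual(((30.0, 1.0), "no"), records))  # ((31.0, 2.0), 'yes')
```

Comparing the query with the returned record shows which small feature changes are associated with a different prediction.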
Looking at the data by race, age, and inference score
What-If Tool showing the probability of overdose for individual features.
Alternatives (kind of)
• MLflow https://mlflow.org/docs/latest/index.html
• Netflix Metaflow https://github.com/Netflix/metaflow
• Sacred https://github.com/IDSIA/sacred
• Dataiku DSS https://www.dataiku.com/product/
• Polyaxon https://polyaxon.com/
• Facebook Ax https://www.ax.dev/
They all do something a little different, with pieces straddling different sides of the data science/production divide.
Outline
• Introduction to TensorFlow Extended (TFX)
• TensorFlow Extended Pipeline Components
• Running the Pipeline
• TensorFlow and TensorFlow Tools
• Alternatives to TensorFlow Extended
• Other Useful Tools
    • Streamlit Dashboard
    • Python Typing, Dataclasses, and Enum
• Conclusion
Streamlit Dashboard
• Writes to the browser
• Works well for artifacts in the ML pipeline.
https://github.com/streamlit/streamlit/
https://streamlit.io/docs/getting_started.html
Typing, Dataclasses, and Enum
You can build interchangeable parts right in Python.
Not new of course, but they make Python a little less Wild West.
Output:
[Turtle(size=6, name='Anita'),
Turtle(size=2, name='Anita')]
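The code that produced this output is not part of the extracted text. The following is one possible sketch, not the author's code, that prints the same list; the `Size` enum and the `make_turtles` helper are hypothetical.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List


class Size(Enum):
    """Hypothetical named sizes, so callers cannot pass arbitrary numbers."""
    SMALL = 2
    LARGE = 6


@dataclass
class Turtle:
    size: int
    name: str


def make_turtles(name: str, sizes: List[Size]) -> List[Turtle]:
    """Build one Turtle per requested size."""
    return [Turtle(size=size.value, name=name) for size in sizes]


turtles = make_turtles("Anita", [Size.LARGE, Size.SMALL])
print(turtles)
```

The dataclass generates the constructor and the readable `repr` seen in the output, while the type hints and the enum document what each function expects.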
TFX Disadvantages
• Steep learning curve
• Changes constantly (but not while you are watching it)
• Somewhat inflexible: you can create your own components, but that has a steep learning curve too
• No hyperparameter search (yet, https://github.com/tensorflow/tfx/issues/182)
TFX Advantages
• Set up to scale
• Documents your process through artifacts
• Warm-starting: as new data comes in, you don’t have to start training over. Keeps models fresh
• Tools to see data and debug problems
• Don’t have to rerun what has already run
Where to Start
• Jupyter notebook tutorial https://www.tensorflow.org/tfx/tutorials/tfx/components
• Airflow tutorial https://www.tensorflow.org/tfx/tutorials/tfx/airflow_workshop
Happy Hour!
6500 River Place Blvd.
Bldg. 3, Suite 120
Austin, TX 78730
Jonathan Mugan, Ph.D.
Email: jmugan@deumbra.com
Appendix
• Original TFX paper
    • https://ai.google/research/pubs/pub46484
• Documentation
    • https://www.tensorflow.org/tfx
    • https://www.tensorflow.org/tfx/tutorials
    • https://www.tensorflow.org/tfx/guide
    • https://www.tensorflow.org/tfx/api_docs
Ad

Recommended

8th sem.pptx
8th sem.pptx
Snehalkarki1
 
Generative Adversarial Networks (GANs)
Generative Adversarial Networks (GANs)
Amol Patil
 
A Friendly Introduction to Machine Learning
A Friendly Introduction to Machine Learning
Haptik
 
What Is A Neural Network? | How Deep Neural Networks Work | Neural Network Tu...
What Is A Neural Network? | How Deep Neural Networks Work | Neural Network Tu...
Simplilearn
 
Machine Learning Overview
Machine Learning Overview
Mykhailo Koval
 
AI for Art Generation / Dave Savio (Artifactory.ai)
AI for Art Generation / Dave Savio (Artifactory.ai)
DevGAMM Conference
 
Introduction to machine learning
Introduction to machine learning
Ganesh Satpute
 
"Deep Learning for Manufacturing Inspection Applications," a Presentation fro...
"Deep Learning for Manufacturing Inspection Applications," a Presentation fro...
Edge AI and Vision Alliance
 
Variational Autoencoders For Image Generation
Variational Autoencoders For Image Generation
Jason Anderson
 
Analysis by semantic segmentation of Multispectral satellite imagery using de...
Analysis by semantic segmentation of Multispectral satellite imagery using de...
Yogesh S Awate
 
Character recognition project
Character recognition project
Monsif sakienah
 
Artificial Intelligence in Gaming.pptx
Artificial Intelligence in Gaming.pptx
Md. Rakib Trofder
 
Ethics and Responsible AI Deployment.pptx
Ethics and Responsible AI Deployment.pptx
Petar Radanliev
 
메타버스와 생성형 AI 조사(김병철) 20230323.pptx
메타버스와 생성형 AI 조사(김병철) 20230323.pptx
ssuserc0b359
 
DC02. Interpretation of predictions
DC02. Interpretation of predictions
Anton Kulesh
 
3D Vision Technology
3D Vision Technology
basuabhishek92
 
Machine Learning vs Deep Learning vs Artificial Intelligence | ML vs DL vs AI...
Machine Learning vs Deep Learning vs Artificial Intelligence | ML vs DL vs AI...
Simplilearn
 
PyTorch Python Tutorial | Deep Learning Using PyTorch | Image Classifier Usin...
PyTorch Python Tutorial | Deep Learning Using PyTorch | Image Classifier Usin...
Edureka!
 
PixelCNN, Wavenet, Normalizing Flows - Santiago Pascual - UPC Barcelona 2018
PixelCNN, Wavenet, Normalizing Flows - Santiago Pascual - UPC Barcelona 2018
Universitat Politècnica de Catalunya
 
What is Machine Learning | Introduction to Machine Learning | Machine Learnin...
What is Machine Learning | Introduction to Machine Learning | Machine Learnin...
Simplilearn
 
Scikit Learn Tutorial | Machine Learning with Python | Python for Data Scienc...
Scikit Learn Tutorial | Machine Learning with Python | Python for Data Scienc...
Edureka!
 
Fairness in Machine Learning and AI
Fairness in Machine Learning and AI
Seth Grimes
 
Modern Recommendation for Advanced Practitioners
Modern Recommendation for Advanced Practitioners
Flavian Vasile
 
Supervised Machine Learning With Types And Techniques
Supervised Machine Learning With Types And Techniques
SlideTeam
 
genetic algorithm based music recommender system
genetic algorithm based music recommender system
neha pevekar
 
Deep learning in medicine: An introduction and applications to next-generatio...
Deep learning in medicine: An introduction and applications to next-generatio...
Allen Day, PhD
 
CNN Attention Networks
CNN Attention Networks
Taeoh Kim
 
Future of AI
Future of AI
Pantech ProLabs India Pvt Ltd
 
TensorFlow Extension (TFX) and Apache Beam
TensorFlow Extension (TFX) and Apache Beam
markgrover
 
Hands-on Learning with KubeFlow + Keras/TensorFlow 2.0 + TF Extended (TFX) + ...
Hands-on Learning with KubeFlow + Keras/TensorFlow 2.0 + TF Extended (TFX) + ...
Chris Fregly
 

More Related Content

What's hot (20)

Variational Autoencoders For Image Generation
Variational Autoencoders For Image Generation
Jason Anderson
 
Analysis by semantic segmentation of Multispectral satellite imagery using de...
Analysis by semantic segmentation of Multispectral satellite imagery using de...
Yogesh S Awate
 
Character recognition project
Character recognition project
Monsif sakienah
 
Artificial Intelligence in Gaming.pptx
Artificial Intelligence in Gaming.pptx
Md. Rakib Trofder
 
Ethics and Responsible AI Deployment.pptx
Ethics and Responsible AI Deployment.pptx
Petar Radanliev
 
메타버스와 생성형 AI 조사(김병철) 20230323.pptx
메타버스와 생성형 AI 조사(김병철) 20230323.pptx
ssuserc0b359
 
DC02. Interpretation of predictions
DC02. Interpretation of predictions
Anton Kulesh
 
3D Vision Technology
3D Vision Technology
basuabhishek92
 
Machine Learning vs Deep Learning vs Artificial Intelligence | ML vs DL vs AI...
Machine Learning vs Deep Learning vs Artificial Intelligence | ML vs DL vs AI...
Simplilearn
 
PyTorch Python Tutorial | Deep Learning Using PyTorch | Image Classifier Usin...
PyTorch Python Tutorial | Deep Learning Using PyTorch | Image Classifier Usin...
Edureka!
 
PixelCNN, Wavenet, Normalizing Flows - Santiago Pascual - UPC Barcelona 2018
PixelCNN, Wavenet, Normalizing Flows - Santiago Pascual - UPC Barcelona 2018
Universitat Politècnica de Catalunya
 
What is Machine Learning | Introduction to Machine Learning | Machine Learnin...
What is Machine Learning | Introduction to Machine Learning | Machine Learnin...
Simplilearn
 
Scikit Learn Tutorial | Machine Learning with Python | Python for Data Scienc...
Scikit Learn Tutorial | Machine Learning with Python | Python for Data Scienc...
Edureka!
 
Fairness in Machine Learning and AI
Fairness in Machine Learning and AI
Seth Grimes
 
Modern Recommendation for Advanced Practitioners
Modern Recommendation for Advanced Practitioners
Flavian Vasile
 
Supervised Machine Learning With Types And Techniques
Supervised Machine Learning With Types And Techniques
SlideTeam
 
genetic algorithm based music recommender system
genetic algorithm based music recommender system
neha pevekar
 
Deep learning in medicine: An introduction and applications to next-generatio...
Deep learning in medicine: An introduction and applications to next-generatio...
Allen Day, PhD
 
CNN Attention Networks
CNN Attention Networks
Taeoh Kim
 
Future of AI
Future of AI
Pantech ProLabs India Pvt Ltd
 
Variational Autoencoders For Image Generation
Variational Autoencoders For Image Generation
Jason Anderson
 
Analysis by semantic segmentation of Multispectral satellite imagery using de...
Analysis by semantic segmentation of Multispectral satellite imagery using de...
Yogesh S Awate
 
Character recognition project
Character recognition project
Monsif sakienah
 
Artificial Intelligence in Gaming.pptx
Artificial Intelligence in Gaming.pptx
Md. Rakib Trofder
 
Ethics and Responsible AI Deployment.pptx
Ethics and Responsible AI Deployment.pptx
Petar Radanliev
 
메타버스와 생성형 AI 조사(김병철) 20230323.pptx
메타버스와 생성형 AI 조사(김병철) 20230323.pptx
ssuserc0b359
 
DC02. Interpretation of predictions
DC02. Interpretation of predictions
Anton Kulesh
 
Machine Learning vs Deep Learning vs Artificial Intelligence | ML vs DL vs AI...
Machine Learning vs Deep Learning vs Artificial Intelligence | ML vs DL vs AI...
Simplilearn
 
PyTorch Python Tutorial | Deep Learning Using PyTorch | Image Classifier Usin...
PyTorch Python Tutorial | Deep Learning Using PyTorch | Image Classifier Usin...
Edureka!
 
PixelCNN, Wavenet, Normalizing Flows - Santiago Pascual - UPC Barcelona 2018
PixelCNN, Wavenet, Normalizing Flows - Santiago Pascual - UPC Barcelona 2018
Universitat Politècnica de Catalunya
 
What is Machine Learning | Introduction to Machine Learning | Machine Learnin...
What is Machine Learning | Introduction to Machine Learning | Machine Learnin...
Simplilearn
 
Scikit Learn Tutorial | Machine Learning with Python | Python for Data Scienc...
Scikit Learn Tutorial | Machine Learning with Python | Python for Data Scienc...
Edureka!
 
Fairness in Machine Learning and AI
Fairness in Machine Learning and AI
Seth Grimes
 
Modern Recommendation for Advanced Practitioners
Modern Recommendation for Advanced Practitioners
Flavian Vasile
 
Supervised Machine Learning With Types And Techniques
Supervised Machine Learning With Types And Techniques
SlideTeam
 
genetic algorithm based music recommender system
genetic algorithm based music recommender system
neha pevekar
 
Deep learning in medicine: An introduction and applications to next-generatio...
Deep learning in medicine: An introduction and applications to next-generatio...
Allen Day, PhD
 
CNN Attention Networks
CNN Attention Networks
Taeoh Kim
 

Similar to Moving Your Machine Learning Models to Production with TensorFlow Extended (20)

TensorFlow Extension (TFX) and Apache Beam
TensorFlow Extension (TFX) and Apache Beam
markgrover
 
Hands-on Learning with KubeFlow + Keras/TensorFlow 2.0 + TF Extended (TFX) + ...
Hands-on Learning with KubeFlow + Keras/TensorFlow 2.0 + TF Extended (TFX) + ...
Chris Fregly
 
TensorFlow Extended: An End-to-End Machine Learning Platform for TensorFlow
TensorFlow Extended: An End-to-End Machine Learning Platform for TensorFlow
Databricks
 
A Tour of Tensorflow's APIs
A Tour of Tensorflow's APIs
Dean Wyatte
 
ML Platform Q1 Meetup: End to-end Feature Analysis, Validation and Transforma...
ML Platform Q1 Meetup: End to-end Feature Analysis, Validation and Transforma...
Fei Chen
 
running Tensorflow in Production
running Tensorflow in Production
Matthias Feys
 
Certification Study Group -Professional ML Engineer Session 2 (GCP-TensorFlow...
Certification Study Group -Professional ML Engineer Session 2 (GCP-TensorFlow...
gdgsurrey
 
What is TensorFlow? | Introduction to TensorFlow | TensorFlow Tutorial For Be...
What is TensorFlow? | Introduction to TensorFlow | TensorFlow Tutorial For Be...
Simplilearn
 
Meetup tensorframes
Meetup tensorframes
Paolo Platter
 
TensorFlow example for AI Ukraine2016
TensorFlow example for AI Ukraine2016
Andrii Babii
 
PAPIs LATAM 2019 - Training and deploying ML models with Kubeflow and TensorF...
PAPIs LATAM 2019 - Training and deploying ML models with Kubeflow and TensorF...
Gabriel Moreira
 
PAPIs LATAM 2019 - Training and deploying ML models with Kubeflow and TensorF...
PAPIs LATAM 2019 - Training and deploying ML models with Kubeflow and TensorF...
Gabriel Moreira
 
Overview of TensorFlow For Natural Language Processing
Overview of TensorFlow For Natural Language Processing
ananth
 
Tensor flow 2.0 what's new
Tensor flow 2.0 what's new
Poo Kuan Hoong
 
Metadata and Provenance for ML Pipelines with Hopsworks
Metadata and Provenance for ML Pipelines with Hopsworks
Jim Dowling
 
MLOps - Build pipelines with Tensor Flow Extended & Kubeflow
MLOps - Build pipelines with Tensor Flow Extended & Kubeflow
Jan Kirenz
 
Introducing MlFlow: An Open Source Platform for the Machine Learning Lifecycl...
Introducing MlFlow: An Open Source Platform for the Machine Learning Lifecycl...
DataWorks Summit
 
Advanced Spark and TensorFlow Meetup May 26, 2016
Advanced Spark and TensorFlow Meetup May 26, 2016
Chris Fregly
 

Moving Your Machine Learning Models to Production with TensorFlow Extended

  • 1. Moving Your Machine Learning Models to Production with TensorFlow Extended. Jonathan Mugan
  • 2. Moving From Our Hut to the Production Floor • Your model is going to live for a long time, not just for a demo. • You must know when to update it, because the world changes. • You must ensure production data matches training data, because data reflects its origins. • You may need to track multiple model versions, e.g., for different states. • You need to batch the input to serving, because one-at-a-time is slow.
  • 3. Interchangeable Parts and the ML Revolution • Data Ingestion (ExampleGen) • TensorFlow Data Validation (StatisticsGen, SchemaGen, Example Validator) • TensorFlow Transform (Transform) • Estimator or Keras Model (Trainer) • TensorFlow Model Analysis (Evaluator, Model Validator) • Validation Outcomes (Pusher) • TensorFlow Serving. Image by https://www.flickr.com/photos/36224933@N07 (license: https://creativecommons.org/licenses/by-sa/2.0/deed.en)
  • 4. Interchangeable Parts and the ML Revolution • TensorFlow Extended (TFX) • TFX is used internally by Google and was recently open sourced • Represents your pipeline to production as a sequence of components • Building any one model is more work, but for large endeavors, TFX helps to keep you organized
  • 5. Outline • Introduction to TensorFlow Extended (TFX) • TensorFlow Extended Pipeline Components • Running the Pipeline • TensorFlow and TensorFlow Tools • Alternatives to TensorFlow Extended • Other Useful Tools • Conclusion
  • 6. TensorFlow Extended • Pipeline organized by library (components in parentheses): Data Ingestion (ExampleGen), TensorFlow Data Validation (StatisticsGen, SchemaGen, Example Validator), TensorFlow Transform (Transform), Estimator or Keras Model (Trainer), TensorFlow Model Analysis (Evaluator, Model Validator), Validation Outcomes (Pusher), TensorFlow Serving • Individual components talk to the metadata store, ML Metadata (MLMD). • MLMD doesn’t store the data itself. It stores data about data. https://www.tensorflow.org/tfx/guide https://www.tensorflow.org/tfx/guide/mlmd • Each component has three pieces: 1. Driver (gets information from MLMD) 2. Executor (does what the component does) 3. Publisher (writes results to MLMD)
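The driver, executor, and publisher roles can be pictured with a toy metadata store. This is a minimal sketch in plain Python, not TFX code: `MetadataStore`, `run_component`, the artifact type names, and the paths are all hypothetical names chosen for illustration.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class MetadataStore:
    """Toy stand-in for MLMD: it records facts about artifacts, not the data itself."""
    records: List[Dict[str, str]] = field(default_factory=list)

    def latest_uri(self, artifact_type: str) -> str:
        # Return the location of the most recently published artifact of this type.
        for record in reversed(self.records):
            if record["type"] == artifact_type:
                return record["uri"]
        raise KeyError(artifact_type)

    def publish(self, artifact_type: str, uri: str) -> None:
        self.records.append({"type": artifact_type, "uri": uri})


def run_component(store: MetadataStore, input_type: str, output_type: str,
                  executor: Callable[[str], str]) -> str:
    input_uri = store.latest_uri(input_type)   # Driver: gets information from the store
    output_uri = executor(input_uri)           # Executor: does the component's actual work
    store.publish(output_type, output_uri)     # Publisher: writes the result to the store
    return output_uri


store = MetadataStore()
store.publish("Examples", "/tmp/pipeline/examples/1")  # hypothetical upstream output
stats_uri = run_component(
    store, "Examples", "ExampleStatistics",
    executor=lambda uri: uri.replace("examples", "statistics"),  # placeholder for real work
)
print(stats_uri)
```

Because every component only reads from and writes to the store, components can be swapped without knowing about each other, which is the "interchangeable parts" idea of the talk.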
  • 7. Data Ingestion: ExampleGen • Pulls in your data and puts it into a binary format • Also splits it into train and test • Uses Protocol Buffers • Writes tf.Example records into a TFRecord file https://www.tensorflow.org/tutorials/load_data/tfrecord https://www.tensorflow.org/tfx/guide/examplegen
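One common way to split records reproducibly is to hash a stable key into buckets. The sketch below illustrates that idea in plain Python; the 2:1 bucket ratio, the use of MD5, and the `assign_split` helper are assumptions made for this example, not a description of ExampleGen's internals.

```python
import hashlib
from typing import Dict, List


def assign_split(record_key: str, train_buckets: int = 2, test_buckets: int = 1) -> str:
    """Deterministically map a record key to 'train' or 'test' by hashing it into buckets."""
    total = train_buckets + test_buckets
    digest = hashlib.md5(record_key.encode("utf-8")).hexdigest()
    bucket = int(digest, 16) % total
    return "train" if bucket < train_buckets else "test"


# Hypothetical record keys; real data would use a stable identifier per row.
keys = [f"trip-{i}" for i in range(1000)]
splits: Dict[str, List[str]] = {"train": [], "test": []}
for key in keys:
    splits[assign_split(key)].append(key)

print(len(splits["train"]), len(splits["test"]))
```

Hashing rather than random sampling means the same record lands in the same split on every run, so a rerun of the pipeline does not leak test rows into training.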
  • 8. TensorFlow Data Validation: StatisticsGen, SchemaGen, Example Validator • Looks at your data and generates a schema, which you then update manually. • Makes sure the data you pass in later, during serving, is still in the same format and hasn’t drifted. • Also has a great way to visualize data, Facets, which we will see later. https://www.tensorflow.org/tfx/guide/tfdv
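The generate-a-schema-then-check-later-data workflow can be sketched without any library. The functions `infer_schema` and `find_anomalies` and the feature names below are hypothetical; the sketch covers only the format check, not statistical drift detection.

```python
from typing import Any, Dict, List

Schema = Dict[str, type]


def infer_schema(rows: List[Dict[str, Any]]) -> Schema:
    """Record the Python type first seen for each feature name."""
    schema: Schema = {}
    for row in rows:
        for name, value in row.items():
            schema.setdefault(name, type(value))
    return schema


def find_anomalies(schema: Schema, row: Dict[str, Any]) -> List[str]:
    """List every way a new row departs from the schema."""
    anomalies = []
    for name, expected in schema.items():
        if name not in row:
            anomalies.append(f"missing feature: {name}")
        elif not isinstance(row[name], expected):
            anomalies.append(
                f"{name}: expected {expected.__name__}, got {type(row[name]).__name__}")
    for name in row:
        if name not in schema:
            anomalies.append(f"unexpected feature: {name}")
    return anomalies


training_rows = [{"fare": 12.5, "company": "Acme Cabs"},
                 {"fare": 7.0, "company": "City Taxi"}]
schema = infer_schema(training_rows)

good = find_anomalies(schema, {"fare": 9.25, "company": "Acme Cabs"})
bad = find_anomalies(schema, {"fare": "9.25", "tip": 1.0})
print(good)
print(bad)
```

The manual-update step from the slide corresponds to editing `schema` by hand after inference, for example to loosen a type or mark a feature as optional.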
  • 9. Example Schema [figure]
  • 10. TensorFlow Transform: Transform • Converts your data, e.g., one-hot encoding, or categorical features with a vocabulary • Part of the TensorFlow graph, for better or worse • Good for transformations that require looking at all values. Example: tft.scale_to_z_score subtracts the mean and divides by the standard deviation • Features come in many types, and TensorFlow Transform converts them into a format that can be ingested by a machine learning model. It is nice to have this explicit. https://www.tensorflow.org/tfx/guide/transform
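The z-score example shows why some transformations need a full pass over the data: the mean and standard deviation must be computed from all values before any single value can be scaled. A minimal plain-Python sketch of that arithmetic follows; it uses the population standard deviation as an assumption, and `fit_z_score` and `apply_z_score` are hypothetical helper names, not TensorFlow Transform functions.

```python
import statistics
from typing import List, Tuple


def fit_z_score(values: List[float]) -> Tuple[float, float]:
    """Full pass over the training data to get the statistics."""
    return statistics.fmean(values), statistics.pstdev(values)


def apply_z_score(value: float, mean: float, std: float) -> float:
    """Subtract the mean and divide by the standard deviation."""
    return (value - mean) / std if std > 0 else 0.0


fares = [4.0, 8.0, 12.0, 16.0]
mean, std = fit_z_score(fares)                       # analysis step, done once
scaled = [apply_z_score(v, mean, std) for v in fares]

# At serving time the stored statistics are reused instead of being recomputed.
serving_value = apply_z_score(10.0, mean, std)
print(scaled, serving_value)
```

Reusing the training-time statistics at serving time is what keeps the two code paths consistent, which is the benefit of making the transformation an explicit pipeline step.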
  • 11. Estimator or Keras Model: Trainer • Trains the model: the part we are all familiar with • Except that it uses an Estimator • Can use Keras via tf.keras.estimator.model_to_estimator() https://www.tensorflow.org/tfx/guide/trainer
  • 12. TensorFlow Model Analysis: Evaluator, Model Validator • Evaluator component: evaluates the model. Uses TensorFlow Model Analysis (TFMA), which we will see shortly. https://www.tensorflow.org/tfx/guide/evaluator • Model Validator component: you set a baseline (such as the current serving model) and a metric (such as AUC). It marks in the metadata whether the model passes the baseline. https://www.tensorflow.org/tfx/guide/modelval
  • 13. Validation Outcomes: Pusher • Pushes the model to Serving if it is validated • I.e., if your new model is better than the existing model, push it to the model server. • For a deeper understanding, see Ice-T’s 1988 hit song, “I’m Your Pusher” https://www.tensorflow.org/tfx/guide/pusher
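The validate-then-push logic of the last two components reduces to a comparison against a baseline plus a note in the metadata. The sketch below is plain Python with hypothetical names (`Candidate`, `is_blessed`, `validate_and_push`) and made-up AUC values; it is not the TFX implementation.

```python
from dataclasses import dataclass
from typing import Dict, List, Union


@dataclass
class Candidate:
    version: int
    auc: float


def is_blessed(candidate: Candidate, baseline: Candidate) -> bool:
    """The candidate passes only if it beats the baseline on the chosen metric."""
    return candidate.auc > baseline.auc


def validate_and_push(candidate: Candidate, serving: Candidate,
                      metadata: List[Dict[str, Union[int, bool]]]) -> Candidate:
    blessed = is_blessed(candidate, serving)
    # The validation outcome is recorded whether or not the model is pushed.
    metadata.append({"version": candidate.version, "blessed": blessed})
    return candidate if blessed else serving


metadata: List[Dict[str, Union[int, bool]]] = []
serving = Candidate(version=1, auc=0.81)
serving = validate_and_push(Candidate(version=2, auc=0.79), serving, metadata)  # rejected
serving = validate_and_push(Candidate(version=3, auc=0.86), serving, metadata)  # pushed
print(serving, metadata)
```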
  • 14. TensorFlow Serving • Uses the model to perform inference • Called via gRPC or RESTful APIs • Easy to get running with Docker • You can call a particular version of a model • Takes care of batching https://www.tensorflow.org/tfx/guide/serving
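A REST call to a specific model version can be sketched by building the request without sending it. The URL shape (`/v1/models/<name>/versions/<n>:predict`) and the `instances` payload follow the TensorFlow Serving REST API; the host, the model name `taxi`, the feature name, and the `build_predict_request` helper are assumptions for illustration.

```python
import json
import urllib.request
from typing import Any, Dict, List, Optional


def build_predict_request(host: str, model_name: str,
                          instances: List[Dict[str, Any]],
                          version: Optional[int] = None) -> urllib.request.Request:
    """Build (but do not send) a predict request for one batch of instances."""
    path = f"/v1/models/{model_name}"
    if version is not None:
        path += f"/versions/{version}"      # pin a particular model version
    url = f"http://{host}{path}:predict"
    body = json.dumps({"instances": instances}).encode("utf-8")
    return urllib.request.Request(
        url, data=body, headers={"Content-Type": "application/json"}, method="POST")


# Two instances travel in one request, so the server can evaluate them as a batch.
request = build_predict_request(
    "localhost:8501", "taxi", [{"fare": 12.5}, {"fare": 7.0}], version=3)
print(request.full_url)
# With a model server running, urllib.request.urlopen(request) would send it.
```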
  • 16. Outline • Introduction to TensorFlow Extended (TFX) • TensorFlow Extended Pipeline Components • Running the Pipeline (ML Metadata; Apache Airflow) • TensorFlow and TensorFlow Tools • Alternatives to TensorFlow Extended • Other Useful Tools • Conclusion
  • 17. Metadata Store • ML Metadata (MLMD), shown alongside the pipeline from Data Ingestion (ExampleGen) through TensorFlow Serving (Model Server) https://www.tensorflow.org/tfx/guide/mlmd
  • 18. Looking at the Artifact table using the DB Browser for SQLite
  • 19. The Types of Artifacts • The type_id field from the previous slide maps here
  • 20. You can see the properties of the artifacts in the ArtifactProperty table
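The relationship between these three tables can be explored with a small in-memory SQLite database. The schema below is a heavily simplified assumption made for illustration: the table name `Type`, the column sets, the type names, and the property values are placeholders rather than a copy of the real metadata store.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Type (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE Artifact (id INTEGER PRIMARY KEY, type_id INTEGER, uri TEXT);
CREATE TABLE ArtifactProperty (artifact_id INTEGER, name TEXT, string_value TEXT);

-- Hypothetical rows standing in for what a pipeline run would record.
INSERT INTO Type VALUES (1, 'ExamplesPath'), (2, 'SchemaPath');
INSERT INTO Artifact VALUES (1, 1, '/tmp/pipeline/examples/1'),
                            (2, 2, '/tmp/pipeline/schema/1');
INSERT INTO ArtifactProperty VALUES (1, 'split', 'train'),
                                    (2, 'state', 'published');
""")

# Artifact.type_id points at Type.id; ArtifactProperty.artifact_id points at Artifact.id.
rows = conn.execute("""
    SELECT a.id, t.name, a.uri, p.name, p.string_value
    FROM Artifact a
    JOIN Type t ON t.id = a.type_id
    JOIN ArtifactProperty p ON p.artifact_id = a.id
    ORDER BY a.id
""").fetchall()

for row in rows:
    print(row)
```

Note that only URIs and properties are stored: the store describes where the data lives and what state it is in, not the data itself.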
  • 22. Pipeline Management with Apache Airflow • Allows you to trigger and keep track of pipelines.
  • 24. Pipeline Management with Apache Airflow • You can also use Kubeflow https://github.com/tensorflow/tfx/blob/master/tfx/examples/chicago_taxi_pipeline/taxi_pipeline_kubeflow_gcp.py • And Apache Beam https://github.com/tensorflow/tfx/blob/master/tfx/examples/chicago_taxi_pipeline/taxi_pipeline_beam.py
  • 25. Outline • Introduction to TensorFlow Extended (TFX) • TensorFlow Extended Pipeline Components • Running the Pipeline • TensorFlow and TensorFlow Tools (TensorFlow 2.0; TensorFlow Data and Features; TensorFlow Estimators; TensorBoard; TensorFlow Data Validation (TFDV) [Facets]; TensorFlow Model Analysis (TFMA); What-If Tool) • Alternatives to TensorFlow Extended • Other Useful Tools • Conclusion
  • 26. TensorFlow 2.0 • You don’t have to define the graph separately • More like PyTorch • There are two ways you can do computation: • Eager: like PyTorch, just compute • tf.function: you decorate a function and call it
  • 27. Three ways to get an output: a TensorFlow 1.x session, a TensorFlow 2.x tf.function, and TensorFlow 2.x eager execution • With tf.function you still get the performance of a Session https://www.tensorflow.org/guide/function • With eager execution you can debug like a civilized person https://www.tensorflow.org/guide/eager
  • 29. Data • tf.train.Example is a protobuf message made of named tf.train.Feature values, where each value has a name and a type (tf.train.BytesList, tf.train.FloatList, tf.train.Int64List) • TFRecord is a format for storing sequences of binary records; here each record is a tf.train.Example • tf.data.Dataset can take in TFRecord files and create an iterator for batching • tf.io.parse_example (tf.parse_example in TensorFlow 1.x) unpacks tf.Example into standard tensors. https://www.tensorflow.org/tutorials/load_data/tf_records • Features • tf.feature_column, where you further specify what each feature is, such as one-hot, vocabulary, or embedding. https://www.tensorflow.org/guide/feature_columns • tf.train.Example specifies what the data is for storage, and tf.feature_column specifies what it is as input to a model.
  • 31. To build a model you need • the format of the model input: tf.feature_column • the model architecture and hyperparameters: tf.estimator (or Keras with tf.keras.estimator.model_to_estimator) • a function to deliver training data: tf.estimator.TrainSpec from tf.data • a function to deliver eval data: tf.estimator.EvalSpec from tf.data • a function to deliver serving data: tf.estimator.FinalExporter
  • 32. TensorFlow Estimator • Estimator is a wrapper for regular TensorFlow that automatically scales to multiple machines and automatically outputs results to TensorBoard • Shout out to model explainability with the boosted trees estimator https://www.tensorflow.org/tutorials/estimator/boosted_trees_model_understanding https://www.tensorflow.org/tutorials/estimator/boosted_trees
  • 34. TensorBoard • Plotting of prescribers: red has more overdoses, green has fewer • 3D plots are not that useful, but they look cool • You can even use TensorBoard from PyTorch https://pytorch.org/docs/stable/tensorboard.html
  • 36. TensorFlow Data Validation (TFDV) • We need to understand our data as well as possible. • TFDV provides tools that make that less difficult. • Helps to identify bugs in the data by showing you pictures that don’t look right. • https://www.tensorflow.org/tfx/data_validation/get_started
  • 39. By sorting by non-uniformity, we can debug features.
  • 40. In general, we can make sure the distributions are what we would expect.
  • 42. TensorFlow Model Analysis (TFMA) • We can see how well our model does on each slice. • We see that this model does much better for females than males. https://www.tensorflow.org/tfx/guide/tfma
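Slicing means computing the same metric separately for each value of a feature. The sketch below does this for accuracy in plain Python; the `accuracy_by_slice` helper and the four example rows are invented for illustration and are unrelated to the data shown in the talk.

```python
from collections import defaultdict
from typing import Any, Dict, List


def accuracy_by_slice(rows: List[Dict[str, Any]], slice_key: str) -> Dict[Any, float]:
    """Group rows by the value of slice_key and compute accuracy within each group."""
    correct: Dict[Any, int] = defaultdict(int)
    total: Dict[Any, int] = defaultdict(int)
    for row in rows:
        key = row[slice_key]
        total[key] += 1
        correct[key] += int(row["label"] == row["prediction"])
    return {key: correct[key] / total[key] for key in total}


# Made-up evaluation rows: a label, the model's prediction, and a slicing feature.
rows = [
    {"sex": "female", "label": 1, "prediction": 1},
    {"sex": "female", "label": 0, "prediction": 0},
    {"sex": "male", "label": 1, "prediction": 0},
    {"sex": "male", "label": 0, "prediction": 0},
]
per_slice = accuracy_by_slice(rows, "sex")
print(per_slice)
```

An overall accuracy of 0.75 on these rows would hide the gap between the two groups, which is why per-slice metrics are worth looking at.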
  • 44. What-If Tool • The What-If Tool applies a model from TensorFlow Serving to any data you give it. • https://pair-code.github.io/what-if-tool/index.html • Change a record and see what the model does • Find the most similar record with a different classification • Can be used for fairness: adjust the model so it is equally likely to predict “yes” for each group • https://www.coursera.org/lecture/machine-learning-business-professionals/activity-applying-fairness-concerns-with-the-what-if-tool-review-0mYda
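Finding the most similar record with a different classification is a nearest-neighbor search restricted to records the model labeled differently. This plain-Python sketch assumes numeric features and an L1 distance; `nearest_counterfactual` and the sample values are hypothetical, and the tool's own similarity computation may differ.

```python
from typing import List, Optional, Sequence


def l1_distance(a: Sequence[float], b: Sequence[float]) -> float:
    return sum(abs(x - y) for x, y in zip(a, b))


def nearest_counterfactual(query_index: int,
                           features: List[List[float]],
                           predictions: List[str]) -> Optional[int]:
    """Index of the closest record whose prediction differs from the query's."""
    query = features[query_index]
    candidates = [i for i, p in enumerate(predictions)
                  if p != predictions[query_index]]
    if not candidates:
        return None
    return min(candidates, key=lambda i: l1_distance(query, features[i]))


# Made-up records: [age, number of prescriptions], with the model's predicted class.
features = [[30.0, 2.0], [31.0, 2.0], [55.0, 9.0], [33.0, 3.0]]
predictions = ["no", "no", "yes", "yes"]

match = nearest_counterfactual(0, features, predictions)
print(match, features[match])
```

Comparing the query with its counterfactual shows which small feature changes were enough to flip the prediction.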
  • 45. Looking at the data by race, age, and inference score
  • 46. What-If Tool showing the probability of overdose for individual features.
  • 48. Alternatives (kind of) • MLflow https://mlflow.org/docs/latest/index.html • Netflix Metaflow https://github.com/Netflix/metaflow • Sacred https://github.com/IDSIA/sacred • Dataiku DSS https://www.dataiku.com/product/ • Polyaxon https://polyaxon.com/ • Facebook Ax https://www.ax.dev/ • They all do something a little different, with pieces straddling different sides of the data science/production divide
  • 49. Outline • Introduction to TensorFlow Extended (TFX) • TensorFlow Extended Pipeline Components • Running the Pipeline • TensorFlow and TensorFlow Tools • Alternatives to TensorFlow Extended • Other Useful Tools (Streamlit Dashboard; Python Typing, Dataclasses, and Enum) • Conclusion
  • 50. Streamlit Dashboard • Writes to the browser • Works well for artifacts in the ML pipeline. https://github.com/streamlit/streamlit/ https://streamlit.io/docs/getting_started.html
  • 52. Typing, Dataclasses, and Enum • You can build interchangeable parts right in Python. • Not new of course, but they make Python a little less wild west. • Output: [Turtle(size=6, name='Anita'), Turtle(size=2, name='Anita')]
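The slide's code is not part of the transcript, so the following is only a guess at something in the same spirit: a small program that combines type hints, a dataclass, and an Enum and prints the output quoted above. `TurtleName` and `make_turtles` are hypothetical names; only the `Turtle` fields are taken from that output.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List


class TurtleName(Enum):
    """Restricts names to a known set instead of free-form strings."""
    ANITA = "Anita"


@dataclass
class Turtle:
    size: int
    name: str


def make_turtles(sizes: List[int], name: TurtleName) -> List[Turtle]:
    # The type hints document what goes in and what comes out.
    return [Turtle(size=size, name=name.value) for size in sizes]


turtles = make_turtles([6, 2], TurtleName.ANITA)
print(turtles)
# prints [Turtle(size=6, name='Anita'), Turtle(size=2, name='Anita')]
```

The dataclass generates `__init__`, `__repr__`, and `__eq__` from the field declarations, which is why the printed list is readable without any extra code.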
  • 54. TFX Disadvantages • Steep learning curve • Changes constantly (but not while you are watching it) • Somewhat inflexible: you can create your own components, but that also has a steep learning curve • No hyperparameter search (yet, https://github.com/tensorflow/tfx/issues/182)
  • 55. TFX Advantages • Set up to scale • Documents your process through artifacts • Warm-starting: as new data comes in, you don’t have to start training over, which keeps models fresh • Tools to see data and debug problems • Don’t have to rerun what has already run
  • 56. Where to Start • Jupyter notebook tutorial https://www.tensorflow.org/tfx/tutorials/tfx/components • Airflow tutorial https://www.tensorflow.org/tfx/tutorials/tfx/airflow_workshop
  • 57. Happy Hour! 6500 River Place Blvd., Bldg. 3, Suite 120, Austin, TX 78730. Jonathan Mugan, Ph.D.
  • 58. Appendix • Original TFX paper https://ai.google/research/pubs/pub46484 • Documentation • https://www.tensorflow.org/tfx • https://www.tensorflow.org/tfx/tutorials • https://www.tensorflow.org/tfx/guide • https://www.tensorflow.org/tfx/api_docs