
ML Kit in Action (Android)

Mobile Things S02E03


When machine learning meets augmented reality

Qian JIN | @bonbonking | [email protected]

Image Credit: https://becominghuman.ai/part-1-migrate-deep-learning-training-onto-mobile-devices-c28029ffeb30


ML Kit in Action
• The building blocks of ML Kit
• Vision APIs: text recognition, face detection, barcode scanning, image labeling, landmark recognition
• Custom models
• Custom TensorFlow Lite build
• General feedback
The building blocks of ML Kit

Mobile Vision API + Google Cloud Vision API = ML Kit Vision APIs

TensorFlow Lite / custom TF Lite build + Neural Network API = ML Kit Custom Models
Vision APIs
Vision:
You talking to me?
FirebaseVisionImage
• fromBitmap
• fromByteArray
• fromByteBuffer
• fromFilePath
• fromMediaImage
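
A minimal sketch (Kotlin, not from the original deck) of building a FirebaseVisionImage; bitmap, mediaImage and rotation are placeholders for whatever your camera pipeline provides:

val imageFromBitmap = FirebaseVisionImage.fromBitmap(bitmap)

// For camera frames, pass the media.Image plus its rotation so the detector sees the picture upright
// (rotation is one of FirebaseVisionImageMetadata.ROTATION_0/90/180/270)
val imageFromFrame = FirebaseVisionImage.fromMediaImage(mediaImage, rotation)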
Text Recognition

On-device vs Cloud
• Pricing: on-device free; cloud free for the first 1000 uses of this feature per month
• Ideal use cases: on-device real-time processing; cloud high-accuracy text recognition and document scanning
• Language support: on-device Latin characters; cloud a broad range of languages and special characters
INPUT FirebaseVisionImage

FirebaseVisionTextDetector

OUTPUT FirebaseVisionText
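
A sketch (Kotlin) of running the on-device detector, assuming a FirebaseVisionImage named image and a log TAG; the success listener receives the FirebaseVisionText that the loop below walks through:

val detector = FirebaseVision.getInstance().visionTextDetector
detector.detectInImage(image)
    .addOnSuccessListener { firebaseVisionText ->
        // iterate blocks / lines / elements as in the snippet below
    }
    .addOnFailureListener { e -> Log.e(TAG, "Text recognition failed", e) }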

for (FirebaseVisionText.Block block : firebaseVisionText.getBlocks()) {
    Rect boundingBox = block.getBoundingBox();
    Point[] cornerPoints = block.getCornerPoints();
    String text = block.getText();

    for (FirebaseVisionText.Line line : block.getLines()) {
        // ...
        for (FirebaseVisionText.Element element : line.getElements()) {
            // ...
        }
    }
}
Face Detection: Key Capabilities

• Recognise and locate facial features


• Recognise facial expressions
• Track faces across video frames
• Process video frames in real time
• Face orientation
• Face tracking
• Landmarks
• Classification
Face Orientation

• Euler X
• Euler Y
• Euler Z
Landmarks

• A landmark is a point of interest within a face. The left eye, right eye, and nose base are all examples of landmarks.
Classification

• Two classifications are supported: eye open (left & right eye) and smiling
• Inspiration: the Android Things photo booth
Face Detection Options

FirebaseVisionFaceDetectorOptions options =
        new FirebaseVisionFaceDetectorOptions.Builder()
                .setModeType(FirebaseVisionFaceDetectorOptions.ACCURATE_MODE)                 // favour accuracy over speed
                .setLandmarkType(FirebaseVisionFaceDetectorOptions.ALL_LANDMARKS)             // detect eyes, ears, nose, mouth, cheeks
                .setClassificationType(FirebaseVisionFaceDetectorOptions.ALL_CLASSIFICATIONS) // smiling & eyes-open probabilities
                .setMinFaceSize(0.15f)       // minimum face size, relative to the image width
                .setTrackingEnabled(true)    // assign tracking IDs across frames
                .build();
INPUT FirebaseVisionImage

FirebaseVisionFaceDetector

OUTPUT List<FirebaseVisionFace>
➡ boundingBox: Rect
➡ trackingId: Int
➡ headEulerAngleY: Float
➡ headEulerAngleZ: Float
➡ smilingProbability: Float
➡ leftEyeOpenProbability: Float
➡ rightEyeOpenProbability: Float
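
A sketch (Kotlin) of running the detector with the options built above, assuming a FirebaseVisionImage named image and a log TAG; field names follow the output list above:

val detector = FirebaseVision.getInstance().getVisionFaceDetector(options)
detector.detectInImage(image)
    .addOnSuccessListener { faces ->
        for (face in faces) {
            val bounds = face.boundingBox
            val rotY = face.headEulerAngleY
            // probabilities are negative when the classifier could not compute them
            val smiling = face.smilingProbability
        }
    }
    .addOnFailureListener { e -> Log.e(TAG, "Face detection failed", e) }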
Feedback

• Real-time application: pay attention to the image size


/fr.xebia.mlkitinactions E/pittpatt: online_face_detector.cc:236] inconsistent image dimensions
detector.cc:220] inconsistent image dimensions
/fr.xebia.mlkitinactions E/NativeFaceDetectorImpl: Native face detection failed
java.lang.RuntimeException: Error detecting faces.
    at com.google.android.gms.vision.face.NativeFaceDetectorImpl.detectFacesJni(Native Method)
Image Labeling

On-device vs Cloud
• Pricing: on-device free; cloud free for the first 1000 uses of this feature per month
• Label coverage: on-device 400+ labels covering the most commonly found concepts in photos; cloud 10,000+ labels in many categories (try the Cloud Vision API demo to see which labels are returned for an image you provide)
• Knowledge Graph entity ID support: available both on-device and in the cloud
INPUT FirebaseVisionImage

FirebaseVisionLabelDetector

OUTPUT List<FirebaseVisionLabel>
🎒 ➡ label: String
➡ confidence: Float
➡ entityId: String
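
A sketch (Kotlin) of on-device labeling, again assuming a FirebaseVisionImage named image and a log TAG:

val detector = FirebaseVision.getInstance().visionLabelDetector
detector.detectInImage(image)
    .addOnSuccessListener { labels ->
        for (label in labels) {
            // e.g. "Fruit (0.92)" plus the Knowledge Graph entity ID
            Log.d(TAG, "${label.label} (${label.confidence}) ${label.entityId}")
        }
    }
    .addOnFailureListener { e -> Log.e(TAG, "Image labeling failed", e) }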
Landmark Recognition

• Still in preview (you can use the Cloud Vision API instead)
• Recognizes well-known landmarks
• Returns Google Knowledge Graph entity IDs
• Low-volume use is free (first 1000 images)
Custom Model
Some night towards the end of 2016
Android SDK (Java) + Android NDK (C++):
Camera preview -> (1) Image (Bitmap) -> Classifier implementation (Java) -> (2) input_tensor -> TensorFlow JNI wrapper + trained model (C++) -> (3) top_results -> (4) Classifications + confidence -> Overlay display

Ref: https://jalammar.github.io/Supercharging-android-apps-using-tensorflow/
Magritte
Ceci n'est pas une pomme. ("This is not an apple.")
Android Makers Paris, April 2017
Model Size

All weights are stored as-is (32-bit floats) => ~80 MB
Weights Quantization

• Example: 6.372638493746383 => 6.4
• Quantizing the 32-bit float weights to 8 bits shrinks the model from ~80 MB to ~20 MB

Source: https://www.tensorflow.org/performance/quantization
Model Inception V3
Optimized & Quantized
Google I/O, May 2017
Google AI Blog, June 2017
MobileNet
Mobile-first computer vision models for TensorFlow

Image credit: https://github.com/tensorflow/models/blob/master/research/slim/nets/mobilenet_v1.md

~80 MB => ~20 MB => ~1-5 MB

Source: https://research.googleblog.com/2017/06/mobilenets-open-source-models-for.html
DevFest Nantes, October 2017
Model MobileNets_0.25_224
Google I/O, May 2018
Custom Model: Key Capabilities

• TensorFlow Lite model hosting
• On-device ML inference
• Automatic model fallback
• Automatic model updates (see the registration sketch below)
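
A sketch (Kotlin) of how hosting, fallback and updates fit together, following the 2018 Firebase ML Kit custom-model API; the model names and the bundled file my_model.tflite are placeholders:

val conditions = FirebaseModelDownloadConditions.Builder()
    .requireWifi()                              // only download / update the hosted model over Wi-Fi
    .build()

val cloudSource = FirebaseCloudModelSource.Builder("my_model")
    .enableModelUpdates(true)                   // pick up newly published versions automatically
    .setInitialDownloadConditions(conditions)
    .setUpdatesDownloadConditions(conditions)
    .build()

val localSource = FirebaseLocalModelSource.Builder("my_model_local")
    .setAssetFilePath("my_model.tflite")        // fallback bundled in the APK
    .build()

FirebaseModelManager.getInstance().apply {
    registerCloudModelSource(cloudSource)
    registerLocalModelSource(localSource)
}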
1. Train your TF model (model.pb)
2. Convert the model to TF Lite (model.lite) with TOCO, the TensorFlow Lite Optimizing Converter
3. Host your TF Lite model on Firebase
4. Use the TF Lite model for inference
How to train your dragon model?
Train your model
python tensorflow/examples/image_retraining/retrain.py \
--bottleneck_dir=tf_files/bottlenecks \
--how_many_training_steps=500 \
--model_dir=tf_files/models/ \
--summaries_dir=tf_files/training_summaries/ \
--output_graph=tf_files/retrained_graph.pb \
--output_labels=tf_files/retrained_labels.txt \
--architecture=mobilenet_0.50_224 \
--image_dir=tf_files/fruit_photos

Source: https://codelabs.developers.google.com/codelabs/tensorflow-for-poets/
TF Lite conversion for a retrained quantized model is currently unavailable.
The Firebase ML Kit quickstart sample only targets quantized models.
Convert to tflite format
bazel run --config=opt \
//tensorflow/contrib/lite/toco:toco -- \
--input_file=/tmp/magritte_retrained_graph.pb \
--output_file=/tmp/magritte_graph.tflite \
--inference_type=FLOAT \
--input_shape=1,224,224,3 \
--input_array=input \
--output_array=final_result \
--mean_value=128 \
--std_value=128 \
--default_ranges_min=0 \
--default_ranges_max=6

Source: https://codelabs.developers.google.com/codelabs/tensorflow-for-poets/
Do you need custom models?
Use custom models if

• Your specific needs CANNOT be met by the general-purpose APIs
• You need high matching precision
• You are an experienced ML developer (or you know Yoann Benoit)

Let me train your model!
INPUT FirebaseModelInputs

FirebaseModelInterpreter

OUTPUT FirebaseModelOutputs
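
A sketch (Kotlin) of creating the interpreter, reusing the placeholder model names registered in the earlier sketch; the interpreter prefers the hosted model and falls back to the local one:

val modelOptions = FirebaseModelOptions.Builder()
    .setCloudModelName("my_model")
    .setLocalModelName("my_model_local")
    .build()
val interpreter = FirebaseModelInterpreter.getInstance(modelOptions)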
// input & output options for a non-quantized (float) model
val inputDims = intArrayOf(DIM_BATCH_SIZE, DIM_IMG_SIZE_X, DIM_IMG_SIZE_Y, DIM_PIXEL_SIZE)
val outputDims = intArrayOf(1, labelList.size)
inputOutputOptions = FirebaseModelInputOutputOptions.Builder()
    .setInputFormat(0, FirebaseModelDataType.FLOAT32, inputDims)
    .setOutputFormat(0, FirebaseModelDataType.FLOAT32, outputDims)
    .build()
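
A sketch (Kotlin) of running inference with the interpreter and the inputOutputOptions built above; imgData is a placeholder ByteBuffer holding 1 x 224 x 224 x 3 float pixel values:

val inputs = FirebaseModelInputs.Builder()
    .add(imgData)
    .build()

interpreter?.run(inputs, inputOutputOptions)
    ?.addOnSuccessListener { outputs ->
        val probabilities = outputs.getOutput<Array<FloatArray>>(0)
        // probabilities[0][i] is the confidence for labelList[i]
    }
    ?.addOnFailureListener { e -> Log.e(TAG, "Inference failed", e) }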
🥝 INPUT FirebaseModelInputs: a ByteBuffer of image pixels

OUTPUT FirebaseModelOutputs: one confidence score per label (🍎 🍌 🥝 🍊 🍓 🍇 🍉 🍋 🍍)
Performance Benchmarks
Model MobileNets_1.0_224
• No callback or other feedback during model downloading
• Model downloading seems to be blocking => do not trigger it on the main thread
• Lack of documentation at this point (e.g. how do you stop the interpreter?)
• Slight performance loss compared to using TensorFlow Lite directly
• A/B test your machine learning model!
HowTo: Face Recognition Model

• Trained with Keras + FaceNet


• Converted to TensorFlow
• Then converted to TensorFlow Lite
• Then we got stuck…
Custom TensorFlow Lite build
Custom TF Lite build

• ML Kit uses a pre-built TensorFlow Lite library
• You can build your own AAR with Bazel
• For example, to add custom ops
Takeaway
ML Kit: State of the Art

• Lack of high-quality demos (e.g. the Firebase ML Kit quickstart: bugs, deprecated Camera API, deformed camera preview)
• Lack of high-level guidelines / best practices
• Performance issues on older devices
The best is yet to come

• Face contours: 100 data points


• Smart Reply: conversation model
• Online model compression
References

• Magritte talk at DroidCon London
• Medium article: Android meets Machine Learning
• GitHub repo for the demo
• Joe Birch: Exploring Firebase MLKit on Android: Introducing MLKit (Part One)
• Joe Birch: Exploring Firebase MLKit on Android: Face Detection (Part Two)
• Thanks Sandra ;)
Questions?
