High-resolution 3D Reconstruction
on a Mobile Processor
Michael Mangan
Senior Product Manager
Qualcomm Technologies, Inc.
May 3, 2016
30 years of driving the evolution of wireless
#1 in 3G/4G LTE modems
#1 in RF
Source: Qualcomm Incorporated data. Currently, Qualcomm semiconductors are products of Qualcomm Technologies, Inc. or its subsidiaries.
IHS, Jan. ’16 (RF); Strategy Analytics, Dec. ’15 (modem, AP)
Qualcomm® Snapdragon™ Chipsets drive new experiences
• Context-aware computing
• Machine learning
• Computing performance
• VR / AR beyond the small screen
• 360-degree camera
• 3D and low-light photography
• Gaming
• Security: biometric sensors, virtual SIM / multiple devices
• Ultra HD VoLTE / audio quality
• Superior converged connectivity: 4G+ and Wi-Fi
Qualcomm Snapdragon is a product of Qualcomm Technologies, Inc.
What is Active Depth Capture?
Depth provides the z-dimension of a scene; a photograph provides only x-y information.
There are two ways to capture depth information from a scene or object:
Passive Depth Capture (no IR transmitter):
• Stereo RGB cameras can passively generate a depth map of a scene.
• The baseline separation between the cameras causes parallax between the two received images.
• Parallax can be used to infer a disparity estimate, which in turn is used to generate a depth map.
Active Depth Capture (IR transmitter):
• An IR laser transmits, and various techniques are used to infer depth from the reflected laser:
» Time of Flight
» Active Stereo
» Structured Light
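The parallax-to-disparity step described above can be sketched with brute-force 1-D block matching on synthetic image rows. This is a minimal illustration, not the deck's actual stereo pipeline; patch size, search range, and the test data are all illustrative:

```python
import numpy as np

def disparity_1d(left_row, right_row, patch=3, max_disp=8):
    """Brute-force 1-D block matching: for each pixel in the left row,
    find the horizontal shift into the right row with the lowest
    sum-of-absolute-differences (SAD) cost."""
    n = len(left_row)
    disp = np.zeros(n, dtype=int)
    for x in range(patch, n - patch):
        ref = left_row[x - patch:x + patch + 1]
        best, best_cost = 0, np.inf
        for d in range(0, min(max_disp, x - patch) + 1):
            cand = right_row[x - d - patch:x - d + patch + 1]
            cost = np.abs(ref - cand).sum()
            if cost < best_cost:
                best, best_cost = d, cost
        disp[x] = best
    return disp

# Synthetic scene: the right image is the left image shifted by 4 pixels,
# as if the camera moved along the baseline.
left = np.zeros(32); left[12:16] = 1.0
right = np.zeros(32); right[8:12] = 1.0   # same feature, shifted by 4
d = disparity_1d(left, right)
```

Pixels on the feature recover a disparity of 4; nearer objects would shift more, farther objects less, which is what the depth map encodes.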
Depth from Structured Light—Technology Overview
Depth information is generated using a structured light sensor:
• A coded pattern is projected onto the scene using near-infrared (NIR) light.
• The NIR camera receives the reflected, distorted pattern.
• Codes in the received image are matched against known codes in the transmitted pattern.
• Depth at each code location is estimated from the disparity between the original and received code positions, yielding a dense depth map.
(Figure: the transmitter projects the coded pattern; the receiver captures the NIR image, from which the depth map is computed.)
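The disparity-to-depth conversion in the last bullet is plain triangulation: depth falls off inversely with the code's displacement. A one-line sketch, with an illustrative focal length and transmitter-receiver baseline (not the actual sensor's specifications):

```python
def depth_from_disparity(disparity_px, focal_px, baseline_m):
    """Triangulation: depth is inversely proportional to disparity.
    disparity_px: shift (pixels) between transmitted and received code position."""
    return focal_px * baseline_m / disparity_px

# Illustrative numbers: 600 px focal length, 8 cm baseline.
z_near = depth_from_disparity(96.0, 600.0, 0.08)  # large disparity -> close
z_far  = depth_from_disparity(24.0, 600.0, 0.08)  # small disparity -> far
```

With these numbers, a 96-pixel disparity triangulates to 0.5 m and a 24-pixel disparity to 2.0 m, so depth resolution is finest close to the sensor.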
Scanner Flow in Action
Scanner Block Diagram
Scan starts: color + depth capture (structured-light-based depth generation) feeds a live 3D renderer/viewer while the user moves the device around the subject.
Tracking/alignment pipeline: computer-vision-based initial pose estimation → inertial motion sensor fusion → bundle adjustment.
When the user stops, the scan finishes: 3D mesh generation, HD texture generation, and color correction produce the final model.
Use cases: 3D printing, social networking, gaming avatars, etc.
Scanner System Architecture
Apps (Java): 3D Scanner Application; RGBD Image Grabber (RGB grabber and NIR grabber via the Camera2 API); Depth JNI; 3D Scanner JNI.
Middleware (C++): Depth Engine (DSP/HVX); 3D Scanner Engine (CPU/GPU).
Drivers (C): SysFS laser driver; Camera HALs delivering raw RGB and raw NIR data.
Hardware: Active Sensing Module—laser, NIR camera, and RGB camera.
Note: Arrows in the original diagram indicate dependency, not dataflow.
3DR Workload Summary—Running on Snapdragon 820
3D reconstruction requires running several computationally demanding processes simultaneously:
1. Camera Pose Tracking
2. Sensor Fusion
3. Bundle Adjustment
4. Rendering
5. Mesh Generation
6. Texture Mapping
7. Structured Light Sensor Decoding
Thanks to the heterogeneous computational framework of the Snapdragon 820, we are able to do all of this at 15 FPS:
Kryo—CPU/Neon:
• Pose Tracking
• Bundle Adjustment
• Sensor Fusion
• Mesh Generation
Adreno—GPU:
• Rendering
• Texture Mapping
Hexagon—DSP/HVX:
• Depth from Structured Light
Spectra ISP:
• RGB sensor processing
• Depth sensor interface
Lessons Learned
The highest-quality 3DR requires great hardware and software: efficient CV algorithms, operating with accurate depth sensors and power-efficient processors, bring commercial-grade 3DR to mobile platforms.
Running 3DR on mobile requires tuning algorithms for power as well as performance. Power-efficient heterogeneous processors are mandatory for 3DR to run within mobile power and thermal envelopes.
The heterogeneous processing cores on Snapdragon 820 enable a high-quality 3DR experience on mobile platforms.
3DR Algorithmic Details
Computer Vision Based Pose Estimation (6-DOF)
Based on the Iterative Closest Point (ICP) concept: minimize the sum of squared pixel-intensity differences and the sum of squared depth errors to align images:
cost = Σ (pixel intensity error)² + λ · Σ (pixel depth error)²
• F. Steinbruecker et al., “Real-Time Visual Odometry from Dense RGB-D Images”, ICCV 2011
• C. Kerl et al., “Dense Continuous-Time Tracking and Mapping with Rolling Shutter RGB-D Cameras”, ICCV 2015
Computer Vision Based Pose Estimation (6-DOF)—Flow
Warp the current image toward the reference image using the current pose estimate, subtract to obtain an error image, and repeat, updating the pose to minimize the error.
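The warp–subtract–iterate loop can be sketched in one dimension: search over candidate shifts for the one minimizing the combined intensity-plus-depth cost. The synthetic frames, weight λ, and exhaustive integer search below are illustrative; the real tracker optimizes a continuous 6-DOF pose with gradient-based methods:

```python
import numpy as np

LAM = 0.5  # weight on the depth term: cost = sum(I_err^2) + LAM * sum(D_err^2)

def combined_cost(ref_img, ref_depth, cur_img, cur_depth, shift):
    """Warp the current frame by an integer shift, then sum squared
    intensity and depth residuals against the reference frame."""
    e_i = np.roll(cur_img, shift) - ref_img
    e_d = np.roll(cur_depth, shift) - ref_depth
    return (e_i ** 2).sum() + LAM * (e_d ** 2).sum()

# Synthetic frames: the current frame is the reference shifted by -3 pixels,
# as if the camera translated between frames.
x = np.arange(64)
ref_img = np.exp(-0.5 * ((x - 30) / 4.0) ** 2)   # one image feature
ref_dep = 1.0 + 0.01 * x                         # depth ramp
cur_img = np.roll(ref_img, -3)
cur_dep = np.roll(ref_dep, -3)

# "Repeat to minimize error": try each candidate pose, keep the best.
best = min(range(-8, 9),
           key=lambda s: combined_cost(ref_img, ref_dep, cur_img, cur_dep, s))
```

The minimizing shift of +3 exactly undoes the camera motion, driving the error image to zero.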
Computer Vision Based Pose Estimation (6-DOF)—Example
Motion Sensor Fusion
The vision pose will likely contain some errors; one cause is a lack of geometric and textural structure in the scene. This can be overcome by fusing the vision pose with the tablet's Inertial Measurement Unit (IMU).
Using the Extended Kalman Filter (EKF), poses are predicted from the IMU (gyro and accelerometer) in the predict step; the vision-based pose estimate is then fused in the update step to obtain the fused pose estimate.
• M. Li et al., “3-D Motion Estimation and Online Temporal Calibration for Camera-IMU Systems”, ICRA 2013
• S. Weiss et al., “Real-Time Metric State Estimation for Modular Vision-Inertial Systems”, ICRA 2011
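The predict/update structure can be illustrated with a linear Kalman filter on a toy 1-D position/velocity state. This is a sketch of the concept only, not the deck's actual 6-DOF EKF; the noise parameters and the noiseless "vision" measurement are assumptions for the demo:

```python
import numpy as np

def kf_predict(x, P, accel, dt, q=1e-3):
    """Predict step: propagate position/velocity with IMU acceleration."""
    F = np.array([[1.0, dt], [0.0, 1.0]])
    u = np.array([0.5 * dt * dt * accel, dt * accel])
    x = F @ x + u
    P = F @ P @ F.T + q * np.eye(2)
    return x, P

def kf_update(x, P, pos_meas, r=1e-2):
    """Update step: fuse the vision-based position estimate."""
    H = np.array([[1.0, 0.0]])
    y = pos_meas - (H @ x)[0]                 # innovation
    S = (H @ P @ H.T)[0, 0] + r               # innovation covariance
    K = (P @ H.T)[:, 0] / S                   # Kalman gain
    x = x + K * y
    P = (np.eye(2) - np.outer(K, H[0])) @ P
    return x, P

# Constant acceleration of 0.2 m/s^2; vision reports the true position each frame.
x, P = np.zeros(2), np.eye(2)
dt, a = 0.1, 0.2
true_pos, true_vel = 0.0, 0.0
for _ in range(50):
    true_pos += true_vel * dt + 0.5 * a * dt * dt
    true_vel += a * dt
    x, P = kf_predict(x, P, a, dt)            # IMU drives the prediction
    x, P = kf_update(x, P, true_pos)          # vision pose corrects it
```

After 50 frames the fused state tracks the true trajectory; in the real system the same loop runs on full 6-DOF poses, with the gyro handling rotation.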
Bundle Adjustment
Fused poses need to be refined in order to reduce visual errors, because the poses are computed locally, between consecutive frames.
We use bundle adjustment to find optimal global or semi-global poses:
• Construct links (red lines in the figure) between captured frames (blue nodes). A link is established if the reprojection overlap between two captured images is above a certain threshold.
• Jointly optimize the connected nodes.
• V. Indelman et al., “Incremental Light Bundle Adjustment for Robotics Navigation”, IROS 2013
• R. Newcombe et al., “KinectFusion: Real-Time Dense Surface Mapping and Tracking”, IEEE ISMAR 2011
• K. Konolige et al., “FrameSLAM: From Bundle Adjustment to Realtime Visual Mapping”, IEEE Transactions on Robotics 2008
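"Jointly optimize the connected nodes" can be shown at toy scale with scalar poses: consecutive-frame odometry constraints plus one loop-closure link, solved as a linear least-squares problem. The measurements below are synthetic, and real bundle adjustment optimizes 6-DOF poses (and often 3D points) nonlinearly:

```python
import numpy as np

# Toy pose graph: 4 scalar poses. Odometry gives drifting relative constraints
# between consecutive frames; one link (loop closure) connects frames 0 and 3.
# Each constraint reads x_j - x_i = z_ij.
constraints = [          # (i, j, measured x_j - x_i)
    (0, 1, 1.1),
    (1, 2, 1.1),
    (2, 3, 1.1),         # chaining these alone puts x_3 at 3.3 (drift)
    (0, 3, 3.0),         # loop-closure link redistributes the drift
]

n = 4
A = np.zeros((len(constraints) + 1, n))
b = np.zeros(len(constraints) + 1)
for row, (i, j, z) in enumerate(constraints):
    A[row, i], A[row, j], b[row] = -1.0, 1.0, z
A[-1, 0], b[-1] = 1.0, 0.0    # gauge constraint: anchor the first pose at 0

poses, *_ = np.linalg.lstsq(A, b, rcond=None)
```

The joint solve spreads the residual across all links: the final pose lands at 3.075 instead of the chain's drifted 3.3, pulled toward the loop closure's 3.0.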
Surface Reconstruction / Mesh Generation
Having computed the 3D points, we need to generate the 3D surface mesh that best describes the scene while reducing noise.
Many surface-reconstruction methods are available in the literature: Moving Least Squares (MLS), TSDF, and Poisson. Any can be used in theory; TSDF is the least computationally demanding, while MLS and Poisson are more demanding.
Surface reconstruction is then followed by marching cubes to extract the mesh.
• S. Fleishman et al., “Robust Moving Least-Squares Fitting with Sharp Features”, ACM SIGGRAPH 2005
• M. Kazhdan et al., “Poisson Surface Reconstruction”, Symposium on Geometry Processing 2006
• R. Newcombe et al., “KinectFusion: Real-Time Dense Surface Mapping and Tracking”, IEEE ISMAR 2011
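TSDF fusion can be sketched along a single camera ray: each depth measurement contributes a truncated signed distance, measurements are averaged with running weights, and the surface is recovered at the zero crossing (which marching cubes extracts on a 3-D grid). Voxel size and truncation distance are illustrative:

```python
import numpy as np

TRUNC = 0.3                              # truncation distance (m), illustrative
STEP = 0.05                              # voxel size (m), illustrative
voxels = np.arange(0.0, 2.0, STEP)       # voxel centres along one camera ray
tsdf = np.zeros_like(voxels)
weight = np.zeros_like(voxels)

def integrate(tsdf, weight, depth_meas):
    """Fuse one depth measurement: truncated signed distance, running average."""
    sdf = np.clip(depth_meas - voxels, -TRUNC, TRUNC)
    tsdf = (tsdf * weight + sdf) / (weight + 1.0)
    return tsdf, weight + 1.0

# Two noisy observations of a surface at 1.0 m along this ray.
tsdf, weight = integrate(tsdf, weight, 0.98)
tsdf, weight = integrate(tsdf, weight, 1.02)

# The surface sits where the fused TSDF changes sign; interpolate the crossing.
i = int(np.argmax(tsdf < 0))             # first voxel behind the surface
z = voxels[i - 1] + tsdf[i - 1] * STEP / (tsdf[i - 1] - tsdf[i])
```

Averaging the two noisy measurements places the recovered surface back at 1.0 m, which is why TSDF fusion also acts as the noise reduction mentioned above.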
Color Correction
Captured color images can suffer from color casts for many reasons, such as differing light sources. We need to correct this so that the overall color of the 3D model is in harmony.
Solution: estimate color casts and remove them.
• Gray points provide the best estimate of the color cast.
• Estimate gray pixels and shift the appropriate channel gains to bring them to neutral gray.
• Repeat until convergence.
• J. Huo et al., “Robust Automatic White Balance Algorithm Using Gray Color Points in Images”, IEEE Trans. Consumer Electronics, 2006
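The gray-point iteration can be sketched as below. The 30% grayness threshold and convergence tolerance are illustrative choices of ours, not the cited algorithm's exact parameters:

```python
import numpy as np

def gray_point_wb(img, iters=5, tol=0.02):
    """Iteratively find near-gray pixels and scale channel gains so those
    pixels come out neutral gray; repeat until the gains converge."""
    img = img.astype(float).copy()
    for _ in range(iters):
        lum = img.mean(axis=-1) + 1e-9
        # candidate gray pixels: every channel within 30% of the pixel's mean
        grayish = (np.abs(img - lum[..., None]).max(axis=-1) / lum) < 0.3
        if not grayish.any():
            break
        gains = lum[grayish].mean() / img[grayish].mean(axis=0)
        img *= gains                     # shift channel gains toward neutral
        if np.abs(gains - 1.0).max() < tol:
            break
    return img

# Synthetic gray card photographed under a warm (reddish) color cast.
cast = np.full((4, 4, 3), 0.5) * np.array([1.3, 1.0, 0.8])
out = gray_point_wb(cast)
```

After correction the R, G, and B channels of the gray card agree, i.e., the cast has been neutralized.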
Texture Mapping
The captured images need to be joined into one or more images called texture maps. Texture mapping can be thought of as “3D stitching of the images onto the 3D model”.
Obtaining the texture map generally consists of two steps:
• Determine where the pixels go on the 3D model (texture coordinates).
• Determine each pixel's color given the sequence of input images.
• P. Debevec et al., “Efficient View-Dependent Image-Based Rendering with Projective Texture-Mapping”, Eurographics Rendering Workshop 1998
• M. Waechter et al., “Let There Be Color! Large-Scale Texturing of 3D Reconstructions”, ECCV 2015
(Figure: input camera images → output texture map → colored 3D model using the texture map.)
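The first step, assigning texture coordinates, amounts to projecting each mesh vertex into an input camera image. A pinhole-projection sketch with illustrative intrinsics (not actual device values):

```python
def project_to_texture(vertex_cam, f, cx, cy, width, height):
    """Pinhole projection of a 3D vertex (camera coordinates, z forward)
    into an input image, normalized to [0, 1] texture coordinates."""
    x, y, z = vertex_cam
    u = (f * x / z + cx) / width
    v = (f * y / z + cy) / height
    return u, v

# Illustrative intrinsics: 500 px focal length, 640x480 image, centred principal point.
uv = project_to_texture((0.0, 0.0, 1.0), 500.0, 320.0, 240.0, 640.0, 480.0)
```

A vertex on the optical axis maps to the image centre, (0.5, 0.5); the second step then blends or selects the colors each contributing image sees at those coordinates.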
Some 3DR Examples
Some Results: Using our system, we can scan a small toy, a human face or body, or another object. All of this runs easily on the Snapdragon 820, thanks to its powerful heterogeneous computational framework.
Thank you
For more information, visit us at:
www.qualcomm.com & www.qualcomm.com/blog
Nothing in these materials is an offer to sell any of the components or devices referenced herein.
©2016 Qualcomm Technologies, Inc. and/or its affiliated companies. All Rights Reserved.
Qualcomm and Snapdragon are trademarks of Qualcomm Incorporated, registered in the United States and other countries. Why Wait is a trademark of Qualcomm
Incorporated. Other products and brand names may be trademarks or registered trademarks of their respective owners.
References in this presentation to “Qualcomm” may mean Qualcomm Incorporated, Qualcomm Technologies, Inc., and/or other subsidiaries or business units within
the Qualcomm corporate structure, as applicable.
Qualcomm Incorporated includes Qualcomm’s licensing business, QTL, and the vast majority of its patent
portfolio. Qualcomm Technologies, Inc., a wholly-owned subsidiary of Qualcomm Incorporated, operates, along with its subsidiaries, substantially all of Qualcomm’s
engineering, research and development functions, and substantially all of its product and services businesses, including its semiconductor business, QCT.