MANTLE
THE SMALL BATCH (AND OTHER)
SOLUTIONS IN MANTLE API
GUENNADI RIGUER – MANTLE CHIEF ARCHITECT
2 THE SMALL BATCH SOLUTIONS IN MANTLE API | AUGUST 8, 2014
Problems
Abstraction level
Small batch performance
Platform efficiency
Wrong Abstraction Level
Unpredictable big black box
Current situation: neither fast nor simple
Too high to be fast, too low to be simple
Small Batch Performance
10K
100K
Most games today
Really optimized games
Where you want to be (Mantle target)
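To make the scale on this slide concrete, the sketch below works out how much CPU time each batch can take. It assumes the 10K and 100K figures are batches per frame, a 60 FPS target, and that one CPU thread spends the whole frame submitting batches; none of these assumptions are stated on the slide, and the function name is made up for illustration.

```python
# Per-batch CPU time budget for a given number of batches per frame.
# Assumptions (not from the slides): a 60 FPS target, and the entire
# frame time of one CPU thread being available for batch submission.

def per_batch_budget_us(batches_per_frame: int, target_fps: float = 60.0) -> float:
    """Return the CPU time available per batch, in microseconds."""
    frame_time_us = 1_000_000.0 / target_fps
    return frame_time_us / batches_per_frame


for batches in (10_000, 100_000):
    budget = per_batch_budget_us(batches)
    print(f"{batches:>7} batches/frame -> {budget:.3f} us per batch")
```

Under these assumptions the budget drops from roughly 1.67 microseconds per batch at 10K to roughly 0.17 microseconds at 100K, which leaves very little room for per-call driver work.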
Previous Solutions
Geometry instancing
Geometry shaders
Texture atlases & arrays
Uber-shaders
Command recorders
…
Why Did They Fail?
Development and content creation limitations
Trading driver overhead for engine performance
Trading CPU performance for GPU overhead
What is Mantle?
Lower level API
Focus on performance
Empower developers to do what they want
Why Mantle?
Better performance
Predictable performance
Developer control
Key Features
Execution Model
[Diagram, shown as a four-step build: Application (multiple app threads), Queues, GPU engines (Graphics, Compute, DMA, ...)]
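The execution model diagram can be read as several application threads building work independently and handing it to per-engine queues. The following is a minimal toy model of that reading, not Mantle code: `CommandBuffer`, `engine_queues`, `app_thread` and `run_frame` are hypothetical names, and the "GPU" is just a loop that counts the commands it drains from each queue.

```python
# Toy model of the execution model: several application threads each record
# their own command buffer, then submit it to a per-engine queue.
# All names here are hypothetical illustrations, not Mantle entry points.
import queue
import threading
from dataclasses import dataclass, field


@dataclass
class CommandBuffer:
    engine: str                                  # "graphics", "compute" or "dma"
    commands: list = field(default_factory=list)

    def record(self, command: str) -> None:
        # No locking: a buffer is owned by exactly one thread while recording.
        self.commands.append(command)


# One submission queue per engine type named on the slide.
engine_queues = {name: queue.Queue() for name in ("graphics", "compute", "dma")}


def app_thread(thread_id: int, engine: str, count: int) -> None:
    cmd = CommandBuffer(engine)
    for i in range(count):
        cmd.record(f"t{thread_id}:cmd{i}")
    # Submission is the only point where threads touch shared state.
    engine_queues[engine].put(cmd)


def run_frame() -> dict:
    work = [(0, "graphics", 4), (1, "graphics", 4), (2, "compute", 2), (3, "dma", 1)]
    threads = [threading.Thread(target=app_thread, args=w) for w in work]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    # Stand-in for the GPU draining each queue: count commands per engine.
    executed = {}
    for name, q in engine_queues.items():
        total = 0
        while not q.empty():
            total += len(q.get().commands)
        executed[name] = total
    return executed


if __name__ == "__main__":
    print(run_frame())
```

The design point the sketch tries to capture is that recording needs no synchronization at all; threads only meet at the queue.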
Memory & Resources
Application controls memory
Application handles hazards
Generalized resources
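As a rough illustration of what "application controls memory" and "application handles hazards" can mean in practice, the sketch below has the application sub-allocate from one large pool and record its own state transitions between a write pass and a read pass. Everything here is an assumption for illustration: `MemoryPool`, `Resource`, `transition`, the state names and the 256-byte alignment are invented, not taken from the Mantle API.

```python
# Toy model of application-managed memory and explicit hazard handling.
# MemoryPool, Resource, transition and the state names are hypothetical.

class MemoryPool:
    """One large allocation that the application sub-allocates itself."""

    def __init__(self, size: int) -> None:
        self.size = size
        self.offset = 0

    def allocate(self, size: int, alignment: int = 256) -> int:
        # Round the current offset up to the requested alignment.
        start = (self.offset + alignment - 1) // alignment * alignment
        if start + size > self.size:
            raise MemoryError("pool exhausted")
        self.offset = start + size
        return start


class Resource:
    """A generalized resource: a memory range plus its current usage state."""

    def __init__(self, pool: MemoryPool, size: int) -> None:
        self.offset = pool.allocate(size)
        self.size = size
        self.state = "undefined"


def transition(commands: list, resource: Resource, new_state: str) -> None:
    """Record an explicit state change; the application decides when it is needed."""
    if resource.state != new_state:
        commands.append(("barrier", resource.offset, resource.state, new_state))
        resource.state = new_state


pool = MemoryPool(1 << 20)
shadow_map = Resource(pool, 100_000)
commands = []
transition(commands, shadow_map, "render_target")
commands.append(("draw", "shadow pass"))
transition(commands, shadow_map, "shader_read")   # write-then-read hazard
commands.append(("draw", "lighting pass"))
```

Because the application already knows that the lighting pass reads what the shadow pass wrote, it can state the hazard once instead of having a driver rediscover it on every draw.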
Pre-build & Pre-validate
Pipelines
Resource binding
Multi-use command buffers
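The idea behind pre-building and pre-validating can be sketched as moving expensive checks to object creation so that per-frame submission does none. The example below is a toy model under that assumption; `Pipeline`, `CommandBuffer`, `submit` and the required shader stages are hypothetical names chosen for illustration, not the Mantle API.

```python
# Toy model of "pre-build and pre-validate": costly checks run once, at
# object creation, so per-frame submission does no validation work.
# Pipeline, CommandBuffer and submit are hypothetical names.

class Pipeline:
    REQUIRED_STAGES = ("vertex", "pixel")

    def __init__(self, shaders: dict) -> None:
        # Validation happens here, once, rather than at draw time.
        missing = [s for s in self.REQUIRED_STAGES if s not in shaders]
        if missing:
            raise ValueError(f"missing shader stages: {missing}")
        self.shaders = dict(shaders)


class CommandBuffer:
    def __init__(self) -> None:
        self.commands = []
        self.finished = False

    def record(self, *command) -> None:
        if self.finished:
            raise RuntimeError("command buffer is already finished")
        self.commands.append(command)

    def end(self) -> None:
        self.finished = True


def submit(cmd: CommandBuffer) -> int:
    """Stand-in for queue submission: returns the number of commands replayed."""
    if not cmd.finished:
        raise RuntimeError("command buffer was not finished")
    return len(cmd.commands)


pipeline = Pipeline({"vertex": "vs_main", "pixel": "ps_main"})
cmd = CommandBuffer()
cmd.record("bind_pipeline", pipeline)
cmd.record("bind_resources", "material_42")
cmd.record("draw", 36)
cmd.end()

# Multi-use: the same finished buffer is submitted on several frames.
executed = [submit(cmd) for _frame in range(3)]
print(executed)
```

The buffer is built and checked once and then replayed three times, which is the reuse pattern the "multi-use command buffers" bullet points at.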
Platform Considerations
APUs & SoCs are here
No longer CPU vs. GPU
Race to low power
Power Efficiency - StarSwarm
[Charts: FPS and Power (W) for Mantle vs. DX11 at 10K, 50K and 100K batches; FPS/W for Mantle vs. DX11 at the same batch counts]
StarSwarm using RTS preset @ 1080p running on APU A10-7800 @ 3.5GHz, 4GB 2133MHz RAM, A88X-Pro M/B
Power Efficiency - Games
[Charts: FPS and Power (W) for Mantle vs. DX11, and FPS/W for Mantle vs. DX11, across BF4 @ 1280x720, BF4 @ 1920x1080, Thief - Low @ 720x480 and Thief - Normal @ 1280x720]
Battlefield 4 BrokenFlightDeck level @ 720p and @ 1080p, MEDIUM settings and SSAO enabled, running on APU A10-7800 @ 3.5GHz, 4GB 2133MHz RAM, A88X-Pro M/B
Thief built-in benchmark running on the same hardware with low settings at 720x480 and normal settings at 720p
Lessons Learned
Think Outside the Box
Don’t just make something faster
…avoid doing it completely
Design API and driver together
New Driver Design
Even a bit of sync kills many-core performance
“Thick” driver = cache pollution
“Make it the application’s problem” 
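One way to see why even a small amount of synchronization hurts on many cores is Amdahl's law, used here as an illustration that is not part of the slides. The sketch assumes that time spent holding a driver lock behaves like a serial fraction of the frame's CPU work; the function name and the example fractions are made up.

```python
# Amdahl's law as a rough model of how a serialized (synchronized) fraction
# of CPU work caps multi-core speedup. The fractions below are illustrative
# assumptions, not measurements from the presentation.

def amdahl_speedup(serial_fraction: float, cores: int) -> float:
    """Ideal speedup on `cores` cores when `serial_fraction` cannot run in parallel."""
    return 1.0 / (serial_fraction + (1.0 - serial_fraction) / cores)


for s in (0.0, 0.05, 0.25):
    print(f"serialized {s:.0%}: {amdahl_speedup(s, 8):.2f}x on 8 cores")
```

With these illustrative numbers, serializing just 5% of the work already drops an 8-core system from an ideal 8x to under 6x.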
Lots of Little Things…
The whole is greater than the sum of its parts
Future HW Considerations
HW small batch, anyone?
Command processing bottlenecks
More operations/batches in flight
Challenges
Programming is harder
Ecosystem must change
Applications must “do the right thing”
Summary
Mantle fixes abstraction level
Mantle improves platform efficiency
Mantle leads industry transformation
Questions?
DISCLAIMER & ATTRIBUTION
The information presented in this document is for informational purposes only and may contain technical inaccuracies, omissions and typographical errors.
The information contained herein is subject to change and may be rendered inaccurate for many reasons, including but not limited to product and roadmap
changes, component and motherboard version changes, new model and/or product releases, product differences between differing manufacturers, software
changes, BIOS flashes, firmware upgrades, or the like. AMD assumes no obligation to update or otherwise correct or revise this information. However, AMD
reserves the right to revise this information and to make changes from time to time to the content hereof without obligation of AMD to notify any person of
such revisions or changes.
AMD MAKES NO REPRESENTATIONS OR WARRANTIES WITH RESPECT TO THE CONTENTS HEREOF AND ASSUMES NO RESPONSIBILITY FOR ANY
INACCURACIES, ERRORS OR OMISSIONS THAT MAY APPEAR IN THIS INFORMATION.
AMD SPECIFICALLY DISCLAIMS ANY IMPLIED WARRANTIES OF MERCHANTABILITY OR FITNESS FOR ANY PARTICULAR PURPOSE. IN NO EVENT WILL AMD BE
LIABLE TO ANY PERSON FOR ANY DIRECT, INDIRECT, SPECIAL OR OTHER CONSEQUENTIAL DAMAGES ARISING FROM THE USE OF ANY INFORMATION
CONTAINED HEREIN, EVEN IF AMD IS EXPRESSLY ADVISED OF THE POSSIBILITY OF SUCH DAMAGES.
ATTRIBUTION
© 2014 Advanced Micro Devices, Inc. All rights reserved. AMD, the AMD Arrow logo and combinations thereof are trademarks of Advanced Micro Devices,
Inc. in the United States and/or other jurisdictions. DirectX is a registered trademark of Microsoft Corporation.


Editor's Notes

  • #18: The resolutions and settings were selected to create both CPU-limited and GPU-limited cases, for completeness of the results.
  • #19: The resolutions and settings were selected to create both CPU-limited and GPU-limited cases, for completeness of the results.