
Getting Started With The AMD Robotics Hardware Portfolio - Final v2

Uploaded by

John Doe

AMD Robotics Webinar Series

[Public]

April 30-June 4, 2024: AMD 3-Part Robotic Webinar Series


Agenda

1. AMD Robotics Observations
2. Use Cases & Challenges
3. Adaptive Computing for Robotics
4. How We Address Roboticists’ Needs
5. Right Engine for Right Task
6. Across Edge & Cloud: Digital Twin
7. What’s next?


AMD Robotics Observations

Summary of global customer engagements

• Enormous amounts of sensor data being generated
• Data needs to be processed fast and near the source

• Factory / logistics robot, exoskeleton shipment CAGR 22% between 2024-2028*
• Agriculture and surgical robots drive that number higher

• The lines between hardware and software are blurry
• Software ecosystem, tools, COTS growing

• Virtual representation of physical world to help address
  • What if?
  • What next?
  • What else?

AI is pervasive including the edge!

* Source: ABI Industrial Robots report

Every Robot Is DIFFERENT! Every Application Is COMPLEX!

• Surgical Robots
• Industrial Robots
• Collaborative Robots
• Exoskeleton & Humanoid Robots
• Aerial Robots (Drones)
• Autonomous Mobile Robots (AMRs/AGVs)
• Delivery Robots
• Hospitality & Service Robots

Legend: Vision & perception system, Robot control system, Motor control system

The Compute Challenge in Robotics

LATENCY AND DETERMINISM ARE CRITICAL
• Real-time response / connectivity ‘makes-or-breaks’ a deployable robot
• SW-based handling of data can lead to unpredictable behavior

PERFORMANCE AND POWER EFFICIENCY
• Robotic systems are often untethered, limited cooling, tight enclosures
• Compute efficiency is critical to ensure low power and longer autonomy

SAFETY AND SECURITY GO HAND-IN-HAND
• Cyber-attacks create safety and data privacy risks
• Industrial Robotic systems require IEC 61508 safety compliance

ALGORITHMS ARE DIVERSE AND EVOLVE RAPIDLY
• Motion planning, sensor fusion, predictive maintenance
• Parallel execution of algorithms is a requirement for speed up

Pictured: Motor Control; Vision Guided Operations; Low Latency and Deterministic Networking; Embedded Controllers (Motion Planning)
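
To make the latency and determinism point concrete, the sketch below computes period jitter from a list of control-loop start times. The function name `period_jitter`, the 1 kHz target period, and the sample timestamps are illustrative assumptions, not figures from the webinar.

```python
from statistics import mean


def period_jitter(timestamps_s, target_period_s):
    """Return (mean_period, worst_case_deviation) for a series of loop timestamps.

    timestamps_s: increasing times (seconds) at which each control-loop
    iteration started. Hypothetical input for illustration only.
    """
    if len(timestamps_s) < 2:
        raise ValueError("need at least two timestamps")
    # Period of each iteration = difference between consecutive start times.
    periods = [b - a for a, b in zip(timestamps_s, timestamps_s[1:])]
    # Worst-case jitter = largest absolute deviation from the target period.
    worst = max(abs(p - target_period_s) for p in periods)
    return mean(periods), worst


# Assumed example: a nominal 1 kHz loop where one iteration starts 0.4 ms late.
stamps = [0.000, 0.001, 0.002, 0.0034, 0.004]
avg, worst = period_jitter(stamps, target_period_s=0.001)
print(f"mean period = {avg * 1e3:.3f} ms, worst-case jitter = {worst * 1e3:.3f} ms")
```

A mean period that looks correct can hide a large worst-case deviation, and the worst case is the quantity a deterministic control path is meant to bound.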

Hardware Options to Solve Challenges

Scalar control flow (control-driven): CPU
Strengths:
① Full control
② Complex data and control structures easily implementable
Challenges:
❶ Challenging to program for low latency
❷ Challenging to program for determinism

Vector control flow (control-driven): DSP, GPU, AI Engines
Strengths:
① Full control
② Domain-specific parallelism (e.g. math, video and image processing)
Challenges:
❶ Processing real world signals challenging
❷ Difficult to program for low latency
❸ Difficult to program for determinism

Dataflow (data-driven): FPGA
Strengths:
① Very high parallelism, and localized memories
② Low latency and quick response time
③ Supports determinism at nanosecond level
Challenges:
❶ Programming FPGA requires parallel programming mindset
❷ Memories are limited by FPGA size
❸ Challenging to debug

Mixed (control- and data-driven): Adaptive SoCs
Strengths:
① Full control
② Complex data and control structures easily implemented
③ Very high parallelism
④ High throughput
⑤ Free from side-effects
⑥ Domain-specific parallelism
Challenges:
❶ Require upfront partitioning of the different computational workloads to program and architect

The value of Adaptive Computing in Robotics

1. Low Latency
FPGAs can be configured to execute specific algorithms directly in hardware, leading to low latency.

2. Deterministic Control
FPGAs can be used to implement closed-loop control systems in real-time. This is crucial in robotics, where precise control over actuators is necessary to achieve accurate and responsive movements.

3. Customization and Flexibility
FPGAs are reprogrammable devices, which means they can be customized to meet the specific requirements of a robotics application, especially leveraging programmable IO for sensor interfacing.

4. High Throughput
FPGAs can handle high-throughput data processing, making them suitable for applications that involve large amounts of sensor data or require rapid decision-making.

5. Signal Processing Capabilities
FPGAs are well-suited for applications that involve signal processing tasks. In robotics, where sensor data processing is a common requirement, FPGAs can efficiently handle tasks like image processing, sensor fusion, and filtering.

6. Energy Efficiency
FPGAs can be optimized for power efficiency by tailoring the hardware architecture to the specific computational needs of the robotics application.
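
The closed-loop control mentioned under Deterministic Control can be sketched in software as a discrete PID loop evaluated at a fixed sample period. This is only a minimal Python model of the control law, not an FPGA implementation; the gains, the 1 ms period, and the toy first-order plant are assumptions chosen for illustration.

```python
class PID:
    """Discrete PID controller evaluated at a fixed sample period dt (seconds)."""

    def __init__(self, kp, ki, kd, dt):
        self.kp, self.ki, self.kd, self.dt = kp, ki, kd, dt
        self.integral = 0.0
        self.prev_error = 0.0

    def update(self, setpoint, measurement):
        error = setpoint - measurement
        self.integral += error * self.dt                   # accumulate the I term
        derivative = (error - self.prev_error) / self.dt   # finite-difference D term
        self.prev_error = error
        return self.kp * error + self.ki * self.integral + self.kd * derivative


def simulate(steps=5000, dt=0.001, setpoint=1.0):
    """Drive a toy first-order plant (dx/dt = -x + u) toward the setpoint."""
    pid = PID(kp=2.0, ki=5.0, kd=0.0, dt=dt)  # illustrative gains, not tuned for hardware
    x = 0.0
    for _ in range(steps):
        u = pid.update(setpoint, x)
        x += (-x + u) * dt  # forward-Euler integration of the plant
    return x


final = simulate()
print(f"plant output after {5000 * 0.001:.0f} s: {final:.4f}")
```

The controller assumes every call to `update` arrives exactly `dt` apart. That assumption is why timing determinism matters: if the sample period varies, the integral and derivative terms are computed with the wrong time base.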
Adaptive Computing
Technologies for
Robotics

Broad Compute Portfolio

• Adaptive SoC: Versal AI Edge. Heterogeneous Acceleration
• Adaptive SoC: Zynq UltraScale+ MPSoC. Adaptive I/Os & Real-Time Processing
• Embedded x86: Ryzen R8000 Embedded Processor. Power Efficient Applications
• Client PC: Ryzen 8000 Series Processor. Edge AI inference for Windows PCs
• GPU: Radeon W7900 GPU. Gaming & AI

Broad Compute Portfolio

Made off the shelf for robotics

• Kria SOMs: Adaptive SoC. SOM, varying sizes
• Embedded x86: Ryzen Embedded Processor. NUC, 100mm x 100mm
• Embedded+: Ryzen Processors + Adaptive SoC. Mini ITX, 170mm x 170mm
• GPU: Radeon GPUs. PCIe Cards
• Alveo: Adaptive SoC. PCIe Accelerator Cards

Scalable Portfolio of Production Kria SOMs

Choose the Starter Kit
• KR260 ROBOTICS: For Robotics and Machine Vision Systems
• KV260 VISION AI: For Vision AI Cameras and Systems
• KD240 DRIVES: For Drives and Motor Control Systems

Select the right Production SOM
• KRIA K26 SOM: VCU and larger DPU; Transceivers; C & I Grade
• KRIA K24 SOM: Half the size of a credit card; Power efficient; ECC support; C & I Grade

Develop your Custom Carrier Card
• K24
• K26
• K26 or Embedded+

Introducing AMD Embedded+

• Embedded+ integrates AMD Ryzen Embedded processors with AMD Versal AI Edge adaptive SoCs on a single PCB
• AMD makes the path to sensor fusion, AI inferencing, industrial networking, control, and visualization simpler with the Embedded+ architecture and ODM partner products

Sapphire Technology VPR-4616-MB

AMD Commitment to Robotics

Education, Universities, Startups, Commercial, Entertainment
How We Address Roboticists’ Needs

• Silicon and Boards
• Tools & Software Stack
• Connectivity & Sensors
• Safety / Security
• Human Machine Interface (HMI)
• Computer Vision / AI

Design Path for Any Developer to Evaluate AMD Capabilities

Roboticist-focused design flows in addition to traditional flows

Python Developer: Design Effortlessly
• Platform runtime orchestration with Python
• Fully paved road with prebuilt hardware libraries

AI Developer: Customize AI Model
• Build custom AI inference application
• Configure AI processor to requirements

Control System Developer: Simulate Motor Control
• Leverage Vitis Model Composer
• Implement enhanced motor control functionality

Roboticist: Develop Robot Behavior via KRS
• Based on workspaces (vs. applications)
• Computational graph centric

Software Developer: Customize Adaptive Drives
• Accelerate entire pipeline from SW
• Customized HW acceleration using HLS

Hardware Developer: Develop Using Full Custom RTL
• Ultimate flexibility through RTL
• Customize connectivity with catalog IP

Kria Robotics Stack Makes ROS 2 ‘Transparent’ to Developers

• A familiar entry-point aligned with common robotics flows
• An integrated set of libraries and utilities to accelerate development
• Developed around ROS 2 (Humble Hawksbill) to enable a SW-defined, HW-accelerated platform

Stack layers, top to bottom (entry point: Roboticist):

ACCELERATORS (Hardware Accelerated Libraries / Apps): Navigation, Manipulation, Perception, Actuation, User-Defined
MIDDLEWARE: ROS 2 [1], XRT (AMD Runtime), User Defined; Vitis
OS AND HYPERVISOR: Yocto, PetaLinux, Ubuntu Linux [2]; Linux Networking Stack; XEN Hypervisor
CONNECTIVITY: Ethernet, EtherCAT, CAN, Time Sensitive Networking, User-Defined
HARDWARE: Kria K24 SOM, Kria K26 SOM
Safety and Security

1: ROS 2 Humble Hawksbill
2: Ubuntu 22.04
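
The "computational graph" idea behind ROS 2 can be illustrated with a toy publish/subscribe graph: nodes exchange messages on named topics and know nothing about each other directly. The sketch below is plain Python and deliberately does not use the ROS 2 or KRS APIs; the `Graph` class, the topic names, and the threshold values are all hypothetical.

```python
from collections import defaultdict


class Graph:
    """Toy in-process publish/subscribe graph (NOT the ROS 2 API)."""

    def __init__(self):
        self._subscribers = defaultdict(list)

    def subscribe(self, topic, callback):
        self._subscribers[topic].append(callback)

    def publish(self, topic, message):
        # Deliver the message to every callback registered on this topic.
        for callback in self._subscribers[topic]:
            callback(message)


graph = Graph()
commands = []

# "Perception" node: turns a range reading (metres) into an obstacle flag.
graph.subscribe("scan", lambda rng: graph.publish("obstacle", rng < 0.5))
# "Navigation" node: turns the obstacle flag into a velocity command.
graph.subscribe("obstacle", lambda blocked: graph.publish("cmd_vel", 0.0 if blocked else 0.3))
# "Actuation" node: records the command it would send to the motors.
graph.subscribe("cmd_vel", commands.append)

for reading in (2.0, 1.0, 0.4):
    graph.publish("scan", reading)

print(commands)
```

Because each stage only depends on its topics, one stage could in principle be replaced by an accelerated implementation without touching the others, which is the property the stack diagram relies on.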

Python (PYNQ) Based Flow for Entry-level Developers

• PYNQ is an open-source Python framework from AMD
• Extensive ecosystem includes libraries for adaptive computing platforms like Kria SOMs
• PYNQ is built for developers who want to maximize the capabilities of Kria SOMs but have limited Programmable Logic expertise
• Using the Python language and libraries, designers can leverage the programmable logic (PL) to build more capable and innovative target applications

Audience: Data Scientist, SW/AI Developers

A very powerful combination to build applications using AMD adaptive compute platforms

Kria SOMs are out-of-the-box ready with PYNQ platform support

Interfaces Supported

Connectivity Interfaces: Industrial Ethernet, TSN, 1/10/25+ GigE, Wi-Fi
Camera Interfaces: MIPI, SLVS-EC, GMSL, LVDS, Sub-LVDS, CameraLink, RaspberryPi Camera header
Peripheral Interfaces: I2C, I3C, SPI, PCIe, JTAG, UART, USB 3.2, DP 1.4, HDMI 2.1

And much more!

Design Tools for Hardware & Software Development

• Guided flow for design, verification and implementation with hundreds of drag-and-drop IP cores
• Unified software platform enabling development of embedded software & accelerated applications on heterogeneous platforms

Framework shown: PyTorch

System-Wide Safety and Security Across the Portfolio

Hardware
• Supports AES-256 encryption [1]
• UltraScale+ certified to IEC 61508:2010 SIL3 – Annex E
• SHA-256 authentication → Ensures trusted source
• Isolated Design Flow (IDF) → FPGA Fault containment [2]
• RSA-2048 Authentication

Software
• Secure boot, protects from attack at startup
• Arm® TrustZone [3] to isolate ‘main’ OS from secure OS
• Memory protection against malware injection
• Rich ecosystem of run-time Security IP

Diagram labels: Hardware Attacks, Software Attacks, Hardware Tamper-resistance, FPGA Memory Protection, Secure Boot, Functional Safety, Cryptographic Acceleration, Secure Debugging, Firmware Encryption, Code Integrity, Network Security

1: NIST-Approved (National Institute of Standards and Technology)
2: Physical separation of safety-critical regions of the design
3: A compromised OS cannot access ‘secured’ data in the secure OS
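
As a host-side illustration of the SHA-256 integrity idea listed above, the sketch below checks an image against a previously recorded digest using Python's standard `hashlib`. It is not the device's secure-boot implementation; the function name `image_matches_digest` and the stand-in image bytes are assumptions for illustration.

```python
import hashlib
import hmac


def image_matches_digest(image: bytes, expected_hex: str) -> bool:
    """Return True if the SHA-256 digest of `image` equals `expected_hex`."""
    actual_hex = hashlib.sha256(image).hexdigest()
    # compare_digest avoids leaking the mismatch position through timing.
    return hmac.compare_digest(actual_hex, expected_hex.lower())


# Hypothetical stand-in for a boot image; a real image would be read from storage.
firmware = b"example boot image contents"
trusted_digest = hashlib.sha256(firmware).hexdigest()  # recorded when the image was released

print(image_matches_digest(firmware, trusted_digest))            # untouched image
print(image_matches_digest(firmware + b"\x00", trusted_digest))  # image with one extra byte
```

A matching digest only shows the image is unchanged relative to the recorded value. Establishing who produced it requires a signature over that digest, which is the role of the RSA-2048 authentication listed on the slide.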
Right Engine for the
Right Task

Embedded+ for Roboticists

• Embedded+ integrates AMD Ryzen Embedded processors with AMD Versal AI Edge adaptive SoCs on a single PCB
• AMD makes the path to sensor fusion, AI inferencing, industrial networking, control, and visualization simpler with the Embedded+ architecture and ODM partner products

Sapphire Technology VPR-4616-MB

What is an AMD AI Engine?

>1 GHz VLIW / SIMD Vector Processors
• Versatile: supports ML and other advanced DSP workloads
• 10s to 100s of tiles per device for scalable compute (2D array)

Massive Memory Access Within AI Engine Array
• Reduces data movement across device and need for DDR
• Minimizes power and latency

High-Bandwidth Interface to Programmable Logic or NoC
• Flexible connectivity: Stream data from PL or from DDR via NoC
• Optimize data flow for efficient end-to-end application acceleration

SW Programmable for Any Developer
• C programmable, compile in minutes
• Multiple layers of abstraction from custom kernels to out-of-the-box ML framework/model support

[Diagram: 2D array of AI Engine-ML tiles, each paired with local memory, above a row of Memory Tiles and the AIE-ML Array Interface (PL & NoC Interface Tiles). Legend: Memory Interface, Stream Interface, Cascade Interface]

Deploy Inference Models Rapidly with Vitis™ AI Development Environment

Complete AI Deployment Environment
• Open-source with broad framework and model support

High-Performance IP
• Efficient implementation with no AI Engine coding

Advanced Optimizations
• Enhanced quantization algorithms optimize ease-of-use and accuracy
• Channel pruning increases performance

Enhanced Ease-of-Use
• End-to-end workflow examples demonstrate tool capabilities and features

Workflow: Train → Optimize & Quantize → Compile → Deploy
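
For readers new to the "Optimize & Quantize" step, the sketch below shows the basic idea of symmetric INT8 quantization in plain Python. It is a generic textbook scheme, not the quantization algorithm used by Vitis AI; the function names and the example weights are made up for illustration.

```python
def quantize_int8(values):
    """Symmetric per-tensor quantization of floats to signed 8-bit integers."""
    max_abs = max(abs(v) for v in values)
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    # Round to the nearest integer step and clamp to the symmetric INT8 range.
    q = [max(-127, min(127, round(v / scale))) for v in values]
    return q, scale


def dequantize(q, scale):
    """Map quantized integers back to approximate float values."""
    return [x * scale for x in q]


weights = [0.5, -1.27, 0.02, 1.0]  # made-up example weights
q, scale = quantize_int8(weights)
restored = dequantize(q, scale)
max_error = max(abs(a - b) for a, b in zip(weights, restored))
print(q, scale, max_error)
```

The round-trip error of each value is bounded by half a quantization step (`scale / 2`), which is the accuracy cost traded for smaller, faster integer arithmetic.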

Versal™ AI Edge Gen 2 & Prime Gen 2 Adaptive SoC Architecture Overview

New Features vs. First Generation Versal Adaptive SoCs

Up to 10X Scalar Compute
• 8x A78AE cores – Up to 200k DMIPS
• 10x R52 cores – Up to 23k DMIPS
Pre-Silicon Estimated Performance

Next-Generation AI Engines
• Up to 2X TOPs/Watt vs. AIE-ML
• Advanced Data Types (MX6, MX9)
Pre-Silicon Estimated Performance

Improved Safety & Security
• ASIL D on APU and RPU w/Lockstep
• Application Security Unit

Hardened Video Processing
• Image Signal Processors (ISPs)
• Video Codec Units (VCUs)
• Video Processing Pipeline (VPP)

High-Speed PS Connectivity
• PCIe® Gen5 w/ DMA, 10 Gb Ethernet
• USB 3.2, DP 1.4, UFS 3.1

Integrated GPU
• Arm® Mali-G78AE for rendering/HMI
• 2D/3D GUI rendering at 4k60

High-Performance Memory
• 6400-DDR5, 8533-LPDDR5x
• Selectable inline encryption/decryption

High-Performance I/O
• 1.2 LVDS up to 1.6 Gb/s
• 10 Gb/s MIPI C-PHY, 4.5 Gb/s D-PHY
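
The scalar-compute figures can be broken down per core with simple arithmetic. The sketch below uses only the "up to" DMIPS totals on the slide and the clock rates stated in endnote VER-027 (2.2 GHz for the Cortex-A78AE cores, 1.05 GHz for the Cortex-R52 cores); the per-core and per-MHz values are derived here and are not quoted by AMD.

```python
# Totals from the slide (pre-silicon "up to" estimates) and clocks from endnote VER-027.
CLUSTERS = {
    "A78AE": {"cores": 8, "total_dmips": 200_000, "ghz": 2.2},
    "R52": {"cores": 10, "total_dmips": 23_000, "ghz": 1.05},
}


def per_core_and_per_mhz(cores, total_dmips, ghz):
    """Split a cluster total into DMIPS per core and DMIPS per MHz per core."""
    per_core = total_dmips / cores
    per_mhz = per_core / (ghz * 1000.0)  # convert GHz to MHz
    return per_core, per_mhz


results = {name: per_core_and_per_mhz(**spec) for name, spec in CLUSTERS.items()}
combined = sum(spec["total_dmips"] for spec in CLUSTERS.values())

for name, (per_core, per_mhz) in results.items():
    print(f"{name}: {per_core:.0f} DMIPS/core, {per_mhz:.2f} DMIPS/MHz")
print(f"combined total: {combined} DMIPS")
```

The combined total is the quantity the endnote compares against first-generation devices for the "Up to 10X" claim.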

8000 Series SoC

Next Gen High-performance 4nm SoC featuring “Zen 4” x86 CPU with RDNA3 Radeon Graphics, and XDNA Neural Processing Unit for AI Inferencing, in FP7r2 BGA Package

• 8C/16T “Zen 4” Cores, 4nm
• RDNA 3, 6 Graphics Workgroups
• DDR5 5600MT/s, Dual-Channel, ECC
• 4 4K Displays
• 20L PCIe® Gen4
• 15-54 Watts TDP
• XDNA NPU 16 TOPs (39 TOPs for total SoC)

Architecture Scales Across Portfolios

Variety of Ryzen Embedded Processor and Versal AI Edge Adaptive SoC Customizable Options Available

• AMD Ryzen Embedded Processors with Integrated Radeon Graphics
• Scale from Edge Sensor to CPU Accelerator

Right Engine for Right Task

PC Needs Offloading in Industrial Systems → Embedded+ Architecture Features

• Low latency, deterministic networking, and sensor interfacing → Expansion connector, programmable I/O, FPGA fabric
• Real-time control → Arm® subsystem, FPGA fabric
• AI inferencing → AI Engines
• Video decode, rendering, and display → Video codec and Radeon graphics
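
The pairing on this slide is effectively a lookup table from workload to compute resource. A minimal sketch of that table in Python follows; the dictionary contents come from the slide, while the names `OFFLOAD_MAP`, `engines_for`, and `tasks_using` are hypothetical helpers introduced only for illustration.

```python
# Mapping taken from the slide: offloaded PC task -> Embedded+ feature(s).
OFFLOAD_MAP = {
    "low latency, deterministic networking, and sensor interfacing":
        ["expansion connector", "programmable I/O", "FPGA fabric"],
    "real-time control": ["Arm subsystem", "FPGA fabric"],
    "AI inferencing": ["AI Engines"],
    "video decode, rendering, and display": ["video codec", "Radeon graphics"],
}


def engines_for(task):
    """Look up which Embedded+ resources the slide pairs with a task."""
    try:
        return OFFLOAD_MAP[task]
    except KeyError:
        raise ValueError(f"no mapping on the slide for task: {task!r}") from None


def tasks_using(resource):
    """List every task on the slide that relies on the given resource."""
    return sorted(task for task, res in OFFLOAD_MAP.items() if resource in res)


print(engines_for("AI inferencing"))
print(tasks_using("FPGA fabric"))
```

Inverting the table with `tasks_using` shows which resources are shared: the FPGA fabric serves both the networking/sensor path and real-time control, so those two workloads have to be budgeted together.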
Across Edge & Cloud:
Digital Twin

Collaboration With AWS IoT Cloud

• AWS Greengrass
• Targets more capable embedded devices
• Brings together control action & local analytics
• Mobilize Applications from Cloud to Edge

and other popular cloud providers

[Diagram: two pairs of Low-to-High scales for Data transmission and Latency; one pair is labeled Edge]

Digital Twin on AMD Ryzen CPU and AMD Kria SOM

New distributed ROS 2 Control AMR Digital Twin Demo

AI robot named Pablo that runs on an AMD Kria SOM and its digital virtual twin rendered by an AMD Ryzen processor and the Unity® game engine. The joystick controls move Pablo’s head and eyes in both environments. Imagine the visualization possible as the digital twin environment is not limited in what it can render.

Unique or Compelling Value:
• KR260 Robotics Starter Kit offers pre-built interfaces for robotics, machine vision, industrial communication and control for simplified integration, faster time from out-of-box to deployment
• Ryzen enables key digital twin software ecosystem out of the box
• MakarenaLabs Pablo is a showpiece in ease-of-use with AMD Kria SOM, Pynq-Z2, their MuseBox (AI) and IP (Speech) libraries

Target Applications:
Manufacturing (Intelligent Factory), Agriculture, Logistics, Surgical, Industrial Robots and Cobots, Surgical Robots, AGV / AMR, Aerial Robots

AMD Technology:
AMD Ryzen processors, AMD PYNQ-Z2 board, AMD Zynq 7020, AMD Kria SOM, AMD Zynq UltraScale+ MPSoC, AMD Vivado HLS design tools, AMD Vitis software platform, AMD PYNQ project

Partner Technology:
MakarenaLabs MuseBox, IP Libraries

Resources To Get You Started

On-Demand Training
• Sign up for session 2 & 3 of the series
• On-Demand webinars on a range of topics

Kria Robotics Stack
• Getting Started with KR260
• Getting started with Kria Robotics Stack
• Getting started with Kria and Vitis AI
• Getting started with Kria and motor control
• Getting started with Kria App Store

Embedded+
• Understanding Embedded+ Architecture
• Ryzen Embedded 8000
• 1st Gen Versal AI Edge Series
• 2nd Gen Versal AI Edge Series

visit https://www.amd.com/en/products/system-on-modules/kria/k26/robotics.html

AMD strength for AI Inference deployment

• Industry-Leading AI Latency & Performance
• ONLY Adaptive Computing
• Traditional AMD Strengths
• Industrial Lifecycle, Quality, Reliability, Security, Temperature, Power and Form Factor
• Real-Time & Deterministic Control, Imaging/Vision Processing and Interfaces

Importance of Industrial IoT Single-Chip Embedded Platforms from AMD

ENDNOTES

• Based on AMD internal performance and power projections for the AIE-ML v2 compute tile architecture in the Versal AI Edge Series Gen 2 using the MX6 data type, compared
to performance specifications and AMD Power Design Manager power results for the AIE-ML compute tile architecture featured in the first-generation Versal AI Edge Series
using INT8 data type. Assumptions: 2 row, 8 column sub-arrays. Operating conditions: 1 GHz FMAX, 0.7V AIE operating voltage, 100°C junction temperature, typical process,
60% vector load, % activations = 0 < 10%. Actual performance will vary when final products are released in market. Performance projections as of March 2024. (VER-023)
• Based on AMD internal pre-silicon performance estimates and power projections for the AIE-ML v2 compute tile architecture featured in the Versal AI Edge Series Gen 2 for
the MX6 data type compared to the INT8 data type. Operating conditions: 1 GHz FMAX, 0.7V AIE operating voltage, 85°C junction temperature, typical process, 60% vector
load, % activations = 0 < 10%. Actual performance will vary when final products are released in market. (VER-025)
• Based on AMD internal pre-silicon performance estimates for the AIE-ML v2 compute tile architecture featured in the Versal AI Edge Series Gen 2 using the MX6 data type
compared to the INT8 data type. Operating conditions: 1 GHz FMAX, 0.7V AIE operating voltage. Actual performance will vary when final products are released in
market. (VER-026)
• Based on AMD internal pre-silicon performance estimates for combined total DMIPs of the Versal AI Edge Series Gen 2 and Versal Prime Series Gen 2 processing system
when configured with 8 Arm Cortex-A78AE applications cores @2.2 GHz and 10 Arm Cortex-R52 real-time cores @1.05 GHz, compared to the published combined total
DMIPs of the processing system in the first-generation Versal AI Edge series and Versal Prime series. Versal AI Edge Series Gen 2 and Prime Series Gen 2 operating
conditions: Highest available speed grade, 0.88V PS operating voltage, split-mode operation, maximum supported operating frequency. First-generation Versal AI Edge Series
and Prime Series operating conditions: Highest available speed grade, 0.88V PS operating voltage, maximum supported operating frequency. Actual DMIPs performance will
vary when final products are released in market. (VER-027)


GENERAL DISCLAIMER AND ATTRIBUTION


The information contained herein is for informational purposes only and is subject to change without notice. While every precaution has been
taken in the preparation of this presentation it may contain technical inaccuracies, omissions and typographical errors, and Advanced Micro
Devices, Inc. is under no obligation to update or otherwise correct this information. AMD makes no representations or warranties with
respect to the accuracy or completeness of the contents of this document, and assumes no liability of any kind, including the implied
warranties of noninfringement, merchantability or fitness for a particular purpose, with respect to the operation or use of AMD hardware,
software, or other products described herein. No license, including implied or arising by estoppel, to any intellectual property rights is
granted by this presentation. Terms and limitations applicable to the purchase or use of AMD products are as set forth in a signed agreement
between the parties or in AMD Standard Terms and Conditions of Sale. GD-18

©2024 Advanced Micro Devices, Inc. All rights reserved. AMD, the AMD Arrow logo, Kria, Radeon, Ryzen, Vitis, Versal, and combinations
thereof are trademarks of Advanced Micro Devices, Inc. Other product names used in this publication are for identification purposes only and
may be trademarks of their respective owners.
