Introducing the Versal Architecture

Presented By

Sumit Shah
Director of Silicon Product Marketing and Management
October 2, 2018

© Copyright 2018 Xilinx


The Technology Conundrum ... and the Need for a New Compute Paradigm

Processing Architectures Are Not Scaling
[Figure: "40 Years of Processor Performance". Performance vs. VAX-11/780 on a log scale (10 to 100,000), 1980 to 2015.]
˃ CISC: 2X / 3.5 Years
˃ RISC: 2X / 1.5 Years
˃ End of Dennard Scaling: 2X / 3.5 Years
˃ Amdahl's Law: 2X / 6 Years
˃ Beyond: ?

A Single Architecture Can't Do It Alone
[Figure: the Whole Application and the kinds of processing it combines]
˃ Safety Processing, or Latency-Critical Workloads
˃ Irregular data types, instruction sets, data operation
˃ Domain Specific Parallelism (e.g., Video, ML)
˃ Sensor Fusion, Pre-Processing, Data Aggregation
˃ Complex Algorithms, Full Linux "Services"

Source: John Hennessy and David Patterson, Computer Architecture: A Quantitative Approach, 6/e 2018
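The doubling periods quoted on the chart (2X every 1.5, 3.5, and 6 years) are easier to compare as annual growth rates. The conversion can be sketched as follows; the function name `annual_growth` is illustrative, and the only inputs are the three periods taken from the slide.

```python
def annual_growth(doubling_years: float) -> float:
    """Compound annual growth rate implied by doubling every `doubling_years` years."""
    return 2.0 ** (1.0 / doubling_years) - 1.0


# Doubling periods read off the slide's chart.
for years in (1.5, 3.5, 6.0):
    print(f"2X / {years} years is about {annual_growth(years):.1%} per year")
```

Doubling every 1.5 years works out to roughly 59% per year, every 3.5 years to roughly 22%, and every 6 years to roughly 12%.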

Need for a New Programming Paradigm

˃ Software Developer: Needs Agility and Abstraction (Familiar Platform)
˃ Hardware Developer: Needs Flexibility to Optimize for Performance/Power (Flexible Platform to Optimize for Performance/Power)
˃ Need a Scalable, Unified Platform

[Diagram labels: Ecosystem of Libraries; Modify, Design, Add Code]

New Device Category: Adaptive Compute Acceleration Platform
COMPUTE ACCELERATION
˃ Scalar Engines
˃ Adaptable Engines
˃ Intelligent Engines

ADAPTIVE
˃ Diverse Workloads in Milliseconds
˃ Future-Proof for New Algorithms

PLATFORM
˃ Development Tools
˃ HW/SW Libraries
˃ Run-time Stack
˃ SW Programmable Silicon Infrastructure

Enabling Data Scientists, SW Developers, HW Developers


Introducing the World’s First ACAP

˃ Heterogeneous Acceleration
˃ For Any Application
˃ For Any Developer

Breakthrough Performance for Cloud, Network, and Edge

Cloud Compute: Breakthrough AI Inference
˃ >8X GoogleNet V1 img/sec (<2ms), Versal Device vs. High-End GPU

Networking: Multi-terabit Throughput
˃ 4X Single-Chip Encrypted Traffic (Gb/s), Versal Device vs. UltraScale+ FPGA

5G Wireless: Compute for Massive MIMO
˃ 5X Int 16x16 DSP Compute (TeraMAC/sec), Versal Device vs. UltraScale+ RFSoC

Edge Compute: AI Inference at Low Power
˃ 15X ResNet50 img/sec (batch=1), Versal Device vs. UltraScale+ MPSoC
Versal Architecture Overview
Scalar Engines
• Platform Control
• Edge Compute

Adaptable Engines
• 2X compute density

Intelligent Engines
• AI Compute
• Diverse DSP workloads

Protocol Engines
• Integrated 600G cores
• 4X encrypted bandwidth

Network-on-Chip
• Guaranteed Bandwidth
• Enables SW Programmability

Programmable I/O
• Any sensor, any interface
• Extendable peripheral set

DDR Memory
• 2X bandwidth/pin
• Server-class density

PCIe & CCIX
• 2X PCIe & DMA bandwidth
• Cache-coherent interface to accelerators

Transceivers
• Broad range, 25G → 112G
• 58G in mainstream devices

Platform Management Controller
Bringing the Platform to Life & Keeping it Safe & Secure
Boot & Configuration
˃ Boots the platform in milliseconds (any engine first)
˃ 8X faster dynamic reconfiguration
˃ Advanced power & thermal management
[Diagram: boot reaches DONE in 10s of milliseconds]

Security, Safety & Reliability
˃ HW Root of Trust
˃ Cryptographic acceleration & confidentiality
˃ Enhanced diagnostics, system monitoring & anti-tamper
˃ Error mitigation, detection & management for safety

Integrated Platform Interfaces & High Speed Debug
˃ Integrated flash, system & debug interfaces
˃ High-speed non-invasive, chip-wide debug

[Diagram labels: Enclave; Boot; BOOT & CONFIG, SAFETY, SECURITY, DEBUG]

A Processor in Every Device
Diverse Use Models for Scalar Processing

Edge Compute & Autonomous Systems
˃ Complex processing for intelligent edge and endpoint

Control Plane Processing in the Network & Cloud
˃ Data path management & device management

Operation & Management in Communication Systems
˃ Board level control monitoring

The Arm Subsystem

Dual-Core ARM Cortex-A72 Application Processors
˃ Up to 1.7GHz for 2X single-threaded performance (1)
˃ Cost and power optimized (half the power)
˃ Code compatibility (ARMv8-A architecture)
˃ Enables SW developers to start from a familiar place
[Block diagram, Application Processing Unit: two ARM Cortex-A72 cores, each with NEON, Floating Point Unit, 48 KB I-Cache w/Parity, 32 KB D-Cache w/ECC, Memory Management Unit, and Embedded Trace Macrocell; shared GIC-520, SCU, CCI/SMMU, and 1MB L2 w/ECC]

Dual-Core ARM Cortex-R5 Real-Time Processors
˃ Up to 750MHz for 1.4X greater performance (1)
˃ Low latency and deterministic
˃ Flexible operation modes: Split-Mode and Lock-Step
˃ Highest levels of functional safety (ASIL and SIL)
[Block diagram, Real Time Processing Unit: two ARM Cortex-R5 cores (Split & Lockstep), each with Vector Floating Point Unit, Memory Protection Unit, 32 KB I-Cache w/ECC, and 32 KB D-Cache w/ECC; shared GIC, 256KB TCM w/ECC, and 256KB OCM w/ECC]

1: DMIPS vs. Zynq UltraScale+ MPSoCs


Adaptable Engines

Adaptable Hardware Engines
˃ Programmable logic for fine-grained parallel processing, data aggregation, and sensor fusion
˃ Programmable memory hierarchy to optimize compute efficiency
˃ High bandwidth, low latency data movement between engines and I/O

Greater Compute Density for Any Workload

Re-Architected Hardware Fabric
˃ 4X density per logic block for more compute
˃ Less external routing → greater performance
˃ Code and IP compatible with 16nm devices

Tune for Power & Performance
˃ Three operating voltages to choose from
˃ Balance power/performance for target app
˃ Equivalent to 3 speed grades in one device
[Figure: operating points VLOW, VMID, VHIGH; 30% Lower Power, 20% More Performance]

Adaptable to any Workload
˃ Bit-level precision (1 → 1,000) for any algorithm
˃ Improves ML efficiency (compression, pruning)
˃ Forward-compatible to lower precision neural networks, e.g., BNN
[Figure: ML Inference and Optimizations (e.g., pruning). For Any Workload: Genomic Sequencing, Video Transcoding, Speech Recognition]

Intelligent Engines

Intelligent Engines
for Diverse Compute

DSP Engines
High-precision floating point & low latency
Granular control for customized data paths

AI Engines
High throughput, low latency, and power efficient
Ideal for AI inference and advanced signal processing

DSP Engines
Versatility and Granular Control of Data Path

Enhanced Compute Architecture
˃ Greater than 1GHz of performance

Versatility for Wireless, ML, HPC, and more
˃ Integrated FP32, FP16 floating point, INT24 (HPC)
˃ Integrated complex 18x18 operation (wireless, cable access)
˃ Double the performance in INT8 operation (AI inference)

Code Portability for UltraScale+ 16nm designs
˃ Support for legacy IP and LogiCore libraries
˃ Compatibility with SysGen, Model Composer, HLS tools

Performance Improvement, UltraScale+ 16nm vs. Versal 7nm
˃ 2.2X Int8 Dot Product
˃ 3.3X Complex 18x18
˃ 3.6X 32-bit Single Precision Floating Point
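The "complex 18x18 operation" bullet refers to multiplying two complex numbers whose real and imaginary parts are 18-bit integers. As a rough illustration of the arithmetic only (plain Python, not a model of the DSP Engine datapath), a complex multiply decomposes into four real multiplies plus one subtraction and one addition. The function names and the signed two's-complement range check are assumptions made for this sketch.

```python
def fits_signed(value: int, bits: int = 18) -> bool:
    """True if `value` fits in a signed two's-complement integer of `bits` bits."""
    return -(1 << (bits - 1)) <= value < (1 << (bits - 1))


def complex_mul_18x18(a_re: int, a_im: int, b_re: int, b_im: int) -> tuple[int, int]:
    """Multiply (a_re + j*a_im) by (b_re + j*b_im) using four real multiplies."""
    for operand in (a_re, a_im, b_re, b_im):
        if not fits_signed(operand):
            raise ValueError(f"{operand} does not fit in 18 signed bits")
    real = a_re * b_re - a_im * b_im   # two multiplies, one subtract
    imag = a_re * b_im + a_im * b_re   # two multiplies, one add
    return real, imag


print(complex_mul_18x18(3, 4, 1, -2))  # (3 + 4j) * (1 - 2j)
```

Each output component is the sum or difference of two 36-bit products, so an accumulator needs at least 37 bits to hold it without overflow.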

Intelligent Engines
Massive AI Inference Throughput and Wireless Compute

1.3GHz VLIW / SIMD vector processors
˃ Versatile core for ML and other advanced DSP workloads

Massive array of interconnected cores
˃ Instantiate multiple tiles (10s to 100s) for scalable compute

Terabytes/sec of interface bandwidth to other engines
˃ Direct, massive throughput to adaptable HW engines
˃ Implement core application with AI for "Whole App Acceleration"

SW programmable for any developer
˃ C programmable, compile in minutes
˃ Library-based design for ML framework developers

[Diagram: 3x3 array of tiles, each pairing an AI Core with a Memory block]

NoC for Ease of Use, Guaranteed Bandwidth, and Power Efficiency

High bandwidth terabit network-on-chip


˃ Memory mapped access to all resources
˃ Built-in arbitration between engines and memory

High Bandwidth, Low Latency, Low power


˃ Guaranteed QoS
˃ 8X power efficiency vs. FPGA implementations

Eases Kernel Placement


˃ Easily swap kernels at NoC port boundaries
˃ Simplifies connectivity between kernels

Adaptable Memory Hierarchy
The Right Memory for the Right Job

[Figure: memory hierarchy spanning Scalar Engines (Arm Cortex-A72, Arm Cortex-R5), Adaptable Engines, and Intelligent Engines (AI Engines), serving WORKLOAD 1 through WORKLOAD N. Bandwidth axis ticks: 1 Tb/s, 10 Tb/s, 100 Tb/s, 1,000 Tb/s. Axis label: Increasing Bandwidth, Decreasing Density.]

˃ Local data memory in AI engines; Cache, TCM, and OCM in the Scalar Engines
˃ LUTRAM: Distributed low-latency memory
˃ Block RAM & UltraRAM: Embedded configurable SRAM
˃ (New) Accelerator RAM: 4 MB sharable across engines
˃ HBM: In-package DRAM
˃ DDR External Memory: DDR4-3200; LPDDR4-4266

[Diagram labels, platform I/O: DDR, HBM, PCIe & CCIX, Network Cores, SerDes, MIPI, LVDS, GPIO]
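For a sense of scale at the external-memory end of the hierarchy, peak interface bandwidth is the transfer rate multiplied by the data-bus width. A minimal sketch, assuming that "DDR4-3200" and "LPDDR4-4266" denote 3200 and 4266 mega-transfers per second and that ECC bits are not counted; the function name and the 64-bit and 256-bit example widths are chosen for illustration.

```python
def peak_bandwidth_gbps(mega_transfers_per_s: int, bus_width_bits: int) -> float:
    """Peak interface bandwidth in Gb/s: transfer rate (MT/s) times bus width (bits)."""
    return mega_transfers_per_s * bus_width_bits / 1000


examples = [
    ("DDR4-3200, 64-bit", 3200, 64),
    ("DDR4-3200, 256-bit", 3200, 256),
    ("LPDDR4-4266, 64-bit", 4266, 64),
]
for label, rate, width in examples:
    print(f"{label}: {peak_bandwidth_gbps(rate, width):.1f} Gb/s peak")
```

Under these assumptions a 256-bit DDR4-3200 interface peaks at about 0.8 Tb/s, the same order of magnitude as the lowest tick on the figure's bandwidth axis.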

Introducing the “Integrated Shell”
'Shell': Pre-Built Core Infrastructure & System Connectivity
˃ External host interface
˃ Memory subsystem
˃ Basic interfaces (e.g., JTAG, USB, GbE)

Key Architectural Elements of the Shell
˃ Platform Management Controller (PMC)
˃ Integrated host interfaces: PCIe & CCIX, DMA
˃ Scalable Memory Subsystem: DDR4 & LPDDR4
˃ Network-on-Chip for connectivity and arbitration

Greater Performance, Device Utilization, and Productivity
˃ More of the platform available for application's workload(s)
˃ Target application runs faster with less device congestion
˃ Turn-key, pre-engineered timing closure, no debug

[Diagram: Versal device with a configurable, integrated shell. Scalar Engines (Arm Dual-Core Cortex-A72, Arm Dual-Core Cortex-R5, PMC), Adaptable Engines (Adaptable Hardware), Intelligent Engines (AI Engines, DSP Engines), Network on Chip; PCIe & CCIX (w/DMA), DDR, HBM, Nx100G Cores, 112G / 58G / 32G transceivers, MIPI, LVDS, GPIO. External connections: Host System CPU, Connectivity, Memory.]

Transceivers: Robust and Scalable Connectivity

˃ 32G NRZ Transceivers: Optimized for latency and power
˃ 58G PAM4 Transceivers: Tuned for the latest copper cable, backplane & optical interfaces
˃ 112G PAM4 Transceivers: Industry-leading performance for copper cable, backplane, optical

[Diagram labels: COPPER CABLE, OPTICS, BACKPLANE]

Programmable I/O for Any Sensor, Interface, or Memory

˃ Different I/O types provide a wide range of speeds and voltages
˃ Configure the same I/O for either memory or sensor interfaces per application requirements

[Figure: I/O speed axis from DC through 400Mb/s, 1600Mb/s, and 3200Mb/s to 4266Mb/s]
˃ HDIO, MIO (Legacy Standards): 1.8V to 3.3V
˃ XPIO (High-Speed): 0.6V to 1.5V
˃ 3.2Gb/s MIPI D-PHY: 8 Mpix Sensors & Displays
˃ 4.2Gb/s LPDDR4 (0.6V): Highest Memory Bandwidth Per Pin
˃ 3.2Gb/s DDR4: Server Class Density Per Channel
Versal Core Series Enables “Smart Cities”
Video Surveillance with Machine Learning

Host Connectivity and Network Connectivity
˃ Integrated PCIe and Ethernet Security

Adaptable Engines Optimize Compute/Watt
˃ Video scaling, custom memory hierarchy, compression

DSP Engines for Video Transcode
˃ Scalable for legacy and emerging video formats

AI Engines for Real-Time Image Recognition
˃ License Plate/Facial Recognition

Network On Chip and Memory Subsystem
˃ Interconnects memory and compute Engines

[Diagram: Scalar Engines (Arm Dual-Core Cortex-A72, Arm Dual-Core Cortex-R5); Adaptable Engines (neural network RT compression, custom memory hierarchy, video scaling / compression); Intelligent Engines (AI Engines: machine learning; DSP Engines: video transcoding); Network-on-Chip; PCIe & CCIX (w/DMA), DDR, 32G, Multi-Rate Ethernet, Custom I/O, Network]
For Any Developer

˃ Frameworks: AI & Data Scientists
˃ New Unified Software Development Environment: Software Application Developers
˃ Embedded Run-Time: Embedded Developers
˃ Vivado Design Suite: Hardware Developers

AI Core Series
Prime Series



Part of the Xilinx Product & Technology Portfolio

DEVICE CATEGORY    FEATURED PRODUCTS
FPGA               Spartan, Artix, Kintex, Virtex
SoC                Zynq-7000, Zynq UltraScale+ MPSoC, Zynq UltraScale+ RFSoC
ACAP               Versal

Announcing the First Two Series of the Versal Portfolio

AI Core Series: Breakthrough AI Inference Throughput
˃ Portfolio's highest compute and low latency inference
˃ Optimized for cloud, networking, & autonomous applications
˃ For highest range of AI and workload acceleration

Prime Series: Broad Applicability Across Multiple Markets
˃ Mid-range series in the Versal portfolio
˃ Optimized for connectivity
˃ For in-line acceleration and diverse workloads

[Diagram, Versal portfolio: AI RF Series, AI Core Series, AI Edge Series, HBM Series, Premium Series, Prime Series]

Versal AI Core Series
Highest AI Inference Throughput: 50 – 150 INT8 TOPs
[Callout: First Available Device]

                                    VC1352             VC1502             VC1702             VC1802             VC1902
Intelligent Engines
  AI Engines                        128                217                310                300                400
  DSP Engines                       928                1,312              1,272              1,600              1,968
Adaptable Engines
  System Logic Cells (K)            540                797                1,021              1,586              1,968
  Accelerator RAM (Mb)              32                 0                  32                 0                  0
  Total SRAM Capacity (Mb)          92                 80                 174                120                164
Scalar Engines
  Application Processing Unit       Dual-core Arm® Cortex-A72, 48KB/32KB L1 Cache w/ parity & ECC; 1MB L2 Cache w/ ECC
  Real-time Processing Unit         Dual-core Arm Cortex-R5, 32KB/32KB L1 Cache, 256KB TCM w/ECC and 256KB OCM w/ECC
Foundational Platform
  NoC Master / NoC Slave Ports      10                 14                 18                 28                 28
  DDR Memory Controllers            2                  2                  2                  4                  4
  CCIX & PCIe® w/DMA (CPM)          –                  1 x Gen4x16, CCIX  –                  1 x Gen4x16, CCIX  1 x Gen4x16, CCIX
  PCI Express®                      1 x Gen4x8         4 x Gen4x8         1 x Gen4x8         4 x Gen4x8         4 x Gen4x8
  Multirate Ethernet MAC            1                  4                  3                  4                  4
  SD-FEC                            2                  0                  5                  0                  0
I/O
  Programmable I/O                  500                500                500                770                770
  Transceivers                      8                  44                 24                 44                 44

Callouts:
˃ Enabling Ethernet at 10G/25G/50G/100G
˃ 256Gb/s PCIe & CCIX Bandwidth to Host
˃ Scalable DDR 128b – 256b w/ECC
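The slide does not say how the 256Gb/s host-bandwidth figure is derived. One reading that matches the "Gen4x16" entries in the table is the raw line rate of a 16-lane PCIe Gen4 link: 16 GT/s per lane times 16 lanes, per direction. The sketch below works through that arithmetic and also shows the rate after 128b/130b line encoding; the function and constant names are illustrative.

```python
GEN4_GT_PER_LANE = 16.0           # PCIe Gen4 signalling rate per lane, in GT/s
ENCODING_EFFICIENCY = 128 / 130   # 128b/130b line encoding (PCIe Gen3 and later)


def pcie_gen4_gbps(lanes: int, after_encoding: bool = False) -> float:
    """Per-direction PCIe Gen4 link rate in Gb/s, before or after line encoding."""
    raw = GEN4_GT_PER_LANE * lanes
    return raw * ENCODING_EFFICIENCY if after_encoding else raw


print(f"Gen4 x16 raw:           {pcie_gen4_gbps(16):.1f} Gb/s")
print(f"Gen4 x16 after 128/130: {pcie_gen4_gbps(16, after_encoding=True):.1f} Gb/s")
print(f"Gen4 x8 raw:            {pcie_gen4_gbps(8):.1f} Gb/s")
```

Protocol overhead (packet headers, flow control) reduces usable throughput further and is not modelled here.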

Versal Prime Series
6X Scalable Logic Density
[Callout: First Available Device]

                                  VM1102      VM1302      VM1402      VM1502      VM1802      VM2502      VM2602      VM2702      VM2902
Intelligent Engines
  DSP Engines                     472         736         1,504       1,312       1,968       3,984       1,880       2,500       3,080
Adaptable Engines
  System Logic Cells (K)          352         572         1,002       797         1,968       2,030       1,263       1,805       2,154
  Total SRAM Capacity (Mb)        35          63          87          80          164         245         174         243         294
Scalar Engines
  Application Processing Unit     Dual-core Arm® Cortex-A72, 48KB/32KB L1 Cache w/ parity & ECC; 1MB L2 Cache w/ ECC
  Real-time Processing Unit       Dual-core Arm Cortex-R5, 32KB/32KB L1 Cache, 256KB TCM w/ECC and 256KB OCM w/ECC
Foundational Platform
  NoC Master / NoC Slave Ports    5           16          16          14          28          28          16          26          26
  DDR Bus Widths                  64          128         256         128         256         288         384         384         384
  DDR Memory Controllers          1           2           4           2           4           5           6           6           6
  CCIX & PCIe® w/DMA (CPM)        -           -           -           1 x Gen4x16, CCIX on each of VM1502, VM1802, VM2502, VM2602, VM2702, VM2902
  PCI Express®                    1 x Gen4x8  2 x Gen4x8  2 x Gen4x8  4 x Gen4x8  4 x Gen4x8  1 x Gen4x8  1 x Gen4x8  2 x Gen4x8  2 x Gen4x8
  Multirate Ethernet MAC          1           2           2           4           4           1           2           2           2
I/O
  Programmable I/O                316         554         770         500         770         824         802         824         824
  Transceiver                     12          24          24          44          44          44          52          76          92

Callouts:
˃ Integrated DDR Controllers & Protocol Engines
˃ I/O Count Optimized
˃ Transceiver Bandwidth Optimized

Versal Roadmap

[Figure: roadmap timeline, 2H 2019 through 2020 to 2021]
˃ Premium: 112G Serdes, 600G Cores
˃ HBM: Memory Integration
˃ AI RF: AI w/ Integrated RF
˃ AI Core: AI Inference Throughput
˃ AI Edge: Lowest power AI
˃ Prime: Broadest Application

Getting Started

Visit www.xilinx.com/versal
Check out the Media Kit
Watch ACAP Intro video
Subscribe to mailing list for the latest news

View documentation and resources


Data Sheet Overview
Product Tables
Versal Architecture and AI Engine White Papers

Key Take-Aways

Versal: The First ACAP


Heterogeneous Acceleration
For Any Application
For Any Developer

Announcing Two Families


Versal Prime Series for Broad Application
Versal AI Core Series for Highest AI Throughput

Availability
Early Access Program for SW and tools
Devices Available 2H 2019

Versal AI Core Series
Highest AI Inference Throughput: 50 – 150 INT8 TOPs
[Callout: First Available Device]

                                    VC1352             VC1502             VC1702             VC1802             VC1902
Intelligent Engines
  AI Engines                        128                217                310                300                400
  DSP Engines                       928                1,312              1,272              1,600              1,968
Adaptable Engines
  System Logic Cells (K)            540                797                1,021              1,586              1,968
  Accelerator RAM (Mb)              32                 0                  32                 0                  0
  Total SRAM Capacity (Mb)          92                 80                 174                120                164
Scalar Engines
  Application Processing Unit       Dual-core Arm® Cortex-A72, 48KB/32KB L1 Cache w/ parity & ECC; 1MB L2 Cache w/ ECC
  Real-time Processing Unit         Dual-core Arm Cortex-R5, 32KB/32KB L1 Cache, 256KB TCM w/ECC and 256KB OCM w/ECC
Foundational Platform
  NoC Master / NoC Slave Ports      10                 14                 18                 28                 28
  DDR Memory Controllers            2                  2                  2                  4                  4
  CCIX & PCIe® w/DMA (CPM)          –                  1 x Gen4x16, CCIX  –                  1 x Gen4x16, CCIX  1 x Gen4x16, CCIX
  PCI Express®                      1 x Gen4x8         4 x Gen4x8         1 x Gen4x8         4 x Gen4x8         4 x Gen4x8
  Multirate Ethernet MAC            1                  4                  3                  4                  4
  SD-FEC                            2                  0                  5                  0                  0
I/O
  Programmable I/O (4G, 3.3V)       378, 122           378, 122           378, 122           648, 122           648, 122
  Transceivers (32G)                8                  44                 24                 44                 44

Callouts:
˃ Enabling Ethernet at 10G/25G/50G/100G
˃ 256Gb/s PCIe & CCIX Bandwidth to Host
˃ Scalable DDR 128b – 256b w/ECC

Versal Prime Series
6X Scalable Logic Density
[Callout: First Available Device]

                                  VM1102      VM1302      VM1402      VM1502      VM1802      VM2502      VM2602      VM2702      VM2902
Intelligent Engines
  DSP Engines                     472         736         1,504       1,312       1,968       3,984       1,880       2,500       3,080
Adaptable Engines
  System Logic Cells (K)          352         572         1,002       797         1,968       2,030       1,263       1,805       2,154
  Total SRAM Capacity (Mb)        35          63          87          80          164         245         174         243         294
Scalar Engines
  Application Processing Unit     Dual-core Arm® Cortex-A72, 48KB/32KB L1 Cache w/ parity & ECC; 1MB L2 Cache w/ ECC
  Real-time Processing Unit       Dual-core Arm Cortex-R5, 32KB/32KB L1 Cache, 256KB TCM w/ECC and 256KB OCM w/ECC
Foundational Platform
  NoC Master / NoC Slave Ports    5           16          16          14          28          28          16          26          26
  DDR Bus Widths                  64          128         256         128         256         288         384         384         384
  DDR Memory Controllers          1           2           4           2           4           5           6           6           6
  CCIX & PCIe® w/DMA (CPM)        -           -           -           1 x Gen4x16, CCIX on each of VM1502, VM1802, VM2502, VM2602, VM2702, VM2902
  PCI Express®                    1 x Gen4x8  2 x Gen4x8  2 x Gen4x8  4 x Gen4x8  4 x Gen4x8  1 x Gen4x8  1 x Gen4x8  2 x Gen4x8  2 x Gen4x8
  Multirate Ethernet MAC          1           2           2           4           4           1           2           2           2
I/O
  Programmable I/O (4G, 3.3V)     216, 100    432, 122    648, 122    378, 122    648, 122    702, 122    702, 100    702, 122    702, 122
  Transceiver (32G, 58G)          12, 0       24, 0       24, 0       44, 0       44, 0       16, 28      20, 32      32, 44      40, 52

Callouts:
˃ Integrated DDR Controllers & Protocol Engines
˃ I/O Count Optimized
˃ Transceiver Bandwidth Optimized

Versal AI Core Series

                                      VC1352             VC1502             VC1702             VC1802             VC1902
Intelligent Engines
  AI Engines                          128                217                310                300                400
  AI Engine Data Memory Blocks (#)    1024               1736               2480               2400               3200
  AI Engine Data Memory (Mb)          32                 54.25              77.5               75                 100
  DSP Engines                         928                1,312              1,272              1,600              1,968
Adaptable Engines
  System Logic Cells (K)              540                797                1,021              1,586              1,968
  LUTs                                246,784            364,544            466,688            725,000            899,840
Memory
  Distributed RAM (Mb)                8                  11                 14                 22                 27
  Total Block RAM (Mb)                18                 19                 29                 28                 34
  UltraRAM (Mb)                       42                 60                 113                91                 130
  Accelerator RAM (Mb)                32                 0                  32                 0                  0
  Total SRAM Capacity (Mb)            92                 80                 174                120                164
Scalar Engines
  Application Processing Unit         Dual-core Arm® Cortex-A72, 48KB/32KB L1 Cache w/ parity & ECC; 1MB L2 Cache w/ ECC
  Real-time Processing Unit           Dual-core Arm Cortex-R5, 32KB/32KB L1 Cache, and 256KB TCM w/ECC
  Memory                              256KB On-Chip Memory w/ECC
  Connectivity                        Ethernet (x2); UART (x2); CAN-FD (x2); USB 2.0 (x1); SPI (x2); I2C (x2)
Foundational Platform
  NoC Master / NoC Slave Ports        10                 14                 18                 28                 28
  DDR Bus Width                       128                128                128                256                256
  DDR Memory Controllers              2                  2                  2                  4                  4
  CCIX & PCIe® w/DMA (CPM)            –                  1 x Gen4x16, CCIX  –                  1 x Gen4x16, CCIX  1 x Gen4x16, CCIX
  PCI Express®                        1 x Gen4x8         4 x Gen4x8         1 x Gen4x8         4 x Gen4x8         4 x Gen4x8
  Multirate Ethernet MAC              1                  4                  3                  4                  4
  SD-FEC                              2                  0                  5                  0                  0
  Platform Management Controller      Boot, Security, Safety, Monitoring, and High Speed Debug

Package Footprint   Package Dimensions   Ball Pitch   XPIO, HDIO, MIO, GTY (one entry per device offered in the package, left to right)
A1024               31x31                0.92         378, 22, 78, 8 | 378, 22, 78, 8
E1369               35x35                0.92         378, 44, 78, 8 | 378, 44, 78, 24
A1596               37.5x37.5            0.92         378, 44, 78, 32 | 378, 44, 78, 16 | 378, 44, 78, 32 | 378, 44, 78, 32
D1760               40x40                0.92         648, 44, 78, 24
A2197               45x45                0.92         378, 44, 78, 44 | 648, 44, 78, 44 | 648, 44, 78, 44
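The three AI Engine rows of this table are mutually consistent, which can be checked with a little arithmetic. The sketch below copies the row values from the table and derives the number of data memory blocks per AI Engine and the size of each block. Treating 1 Mb as 1024 Kb is an assumption made here, and the variable names are illustrative.

```python
# Rows copied from the Versal AI Core Series table (VC1352 through VC1902).
ai_engines = [128, 217, 310, 300, 400]
data_memory_blocks = [1024, 1736, 2480, 2400, 3200]
data_memory_mbits = [32, 54.25, 77.5, 75, 100]

# Data memory blocks per AI Engine, per device.
blocks_per_engine = [b / e for b, e in zip(data_memory_blocks, ai_engines)]

# Size of one block in Kb, assuming 1 Mb = 1024 Kb.
kbits_per_block = [m * 1024 / b for m, b in zip(data_memory_mbits, data_memory_blocks)]

# Data memory per AI Engine in KB (8 bits per byte).
kbytes_per_engine = blocks_per_engine[0] * kbits_per_block[0] / 8

print("blocks per engine:", blocks_per_engine)
print("Kb per block:     ", kbits_per_block)
print("KB per engine:    ", kbytes_per_engine)
```

Under that assumption every device in the table works out to 8 blocks of 32 Kb per AI Engine, or 32 KB of data memory per engine.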



Versal Prime Series

                                  VM1102      VM1302      VM1402      VM1502      VM1802      VM2502      VM2602      VM2702      VM2902
Intelligent Engines
  DSP Engines                     472         736         1,504       1,312       1,968       3,984       1,880       2,500       3,080
Adaptable Engines
  System Logic Cells (K)          352         572         1,002       797         1,968       2,030       1,263       1,805       2,154
  LUTs                            161,024     261,376     457,984     364,544     899,840     927,872     577,536     825,000     984,576
Memory
  Distributed RAM (Mb)            5           8           14          11          27          28          18          25          30
  Total Block RAM (Mb)            8           16          40          19          34          48          55          74          90
  Total UltraRAM (Mb)             27          47          47          60          130         197         119         169         204
  Total SRAM Capacity (Mb)        35          63          87          80          164         245         174         243         294
Scalar Engines
  Application Processing Unit     Dual-core Arm® Cortex-A72, 48KB/32KB L1 Cache w/ parity & ECC; 1MB L2 Cache w/ ECC
  Real-time Processing Unit       Dual-core Arm Cortex-R5, 32KB/32KB L1 Cache, and 256KB TCM w/ECC
  Memory                          256KB On-Chip Memory w/ECC
  Connectivity                    Ethernet (x2); USB 2.0 (x1); UART (x2); SPI (x2); I2C (x2); CAN-FD (x2)
Foundational Platform
  NoC Master / NoC Slave Ports    5           16          16          14          28          28          16          26          26
  DDR Bus Widths                  64          128         256         128         256         288         384         384         384
  DDR Memory Controllers          1           2           4           2           4           5           6           6           6
  CCIX & PCIe® w/DMA (CPM)        -           -           -           1 x Gen4x16, CCIX on each of VM1502, VM1802, VM2502, VM2602, VM2702, VM2902
  PCI Express®                    1 x Gen4x8  2 x Gen4x8  2 x Gen4x8  4 x Gen4x8  4 x Gen4x8  1 x Gen4x8  1 x Gen4x8  2 x Gen4x8  2 x Gen4x8
  Multirate Ethernet MAC          1           2           2           4           4           1           2           2           2

Package Footprint   Package Dimensions   XPIO, HDIO, MIO, GTY, GTM (one entry per device offered in the package, left to right)
B625                21x21                216, 22, 78, 4, 0
B1024               31x31                216, 22, 78, 12, 0 | 216, 44, 78, 16, 0 | 324, 44, 78, 16, 0
B1369               35x35                216, 44, 78, 24, 0 | 324, 44, 78, 24, 0 | 324, 44, 78, 24, 0
A1760               40x40                432, 22, 78, 24, 0 | 648, 22, 78, 24, 0 | 756, 22, 78, 20, 0
C1760               40x40                378, 44, 78, 44, 0 | 378, 44, 78, 44, 0 | 378, 44, 78, 20, 32 | 378, 44, 78, 24, 32 | 378, 44, 78, 24, 32
D1760               40x40                648, 44, 78, 24, 0
A2197               45x45                648, 44, 78, 44, 0 | 648, 44, 78, 16, 16
A2785               50x50                702, 44, 78, 16, 28 | 702, 44, 78, 20, 32 | 702, 44, 78, 32, 44 | 702, 44, 78, 40, 52
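The memory rows of this table can be cross-checked against each other. The table does not define how Total SRAM Capacity is computed; the sketch below tests one hypothesis, that it is the sum of the Total Block RAM and Total UltraRAM rows, using values copied from the table. Variable names are illustrative.

```python
# Rows copied from the Versal Prime Series table (VM1102 through VM2902), in Mb.
block_ram = [8, 16, 40, 19, 34, 48, 55, 74, 90]
ultra_ram = [27, 47, 47, 60, 130, 197, 119, 169, 204]
total_sram = [35, 63, 87, 80, 164, 245, 174, 243, 294]

# Difference between the quoted total and Block RAM + UltraRAM, per device.
differences = [t - (b + u) for t, b, u in zip(total_sram, block_ram, ultra_ram)]
print("total - (BRAM + URAM):", differences)
```

The sum matches the quoted total for eight of the nine devices and is 1 Mb short for the VM1502, which would be consistent with each row being rounded to whole megabits independently. Adding the Distributed RAM row as well overshoots the quoted totals, so it does not appear to be included.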
