
Accelerating AI with Storage Scale

Storage Scale User Group
May 13th, 2024, ISC, Hamburg, Germany

Ted Hoover
Product Manager, Storage for Data and AI
Disclaimer

IBM's statements regarding its plans, directions, and intent are subject to change or withdrawal without
notice at IBM's sole discretion. Information regarding potential future products is intended to outline our
general product direction and it should not be relied on in making a purchasing decision. The information
mentioned regarding potential future products is not a commitment, promise, or legal obligation to deliver
any material, code, or functionality. The development, release, and timing of any future features or
functionality described for our products remains at our sole discretion.
IBM reserves the right to change product specifications and offerings at any time without notice. This
publication could include technical inaccuracies or typographical errors. References herein to IBM products
and services do not imply that IBM intends to make them available in all countries.
To unlock the full potential of AI we must overcome the challenges of enterprise infrastructure

Infrastructure limitations & platforms to scale AI
AI is the fastest-growing workload driving spending on compute and storage infrastructure².

Growing resource demands & silos
82% of organizations cite siloed data as a key obstacle to more effective AI development¹.

Operational and physical resource efficiencies
Increasing operational overhead from new AI apps challenges IT budgets and energy efficiency.

Security and data resiliency
Data must be trusted, and securing sensitive information from cyberthreats, loss, or downtime is a high priority.

Data Sources:
1 IDC, Planning for Success with Generative AI
2 https://www.ibm.com/downloads/cas/VKGPNJ3B
What if your organization could accelerate AI workloads with a storage infrastructure designed to accelerate business growth?

Difficult for AI – isolated with silos: workloads split across separate stacks (Cloud 1, Cloud 2; Storage 1, Storage 2) in the client IT environment
• Siloed and slow-to-adopt innovation
• Sub-optimal use of resources
• Hard to align across business
• AI constrained

Optimized for AI – accelerated innovation: workloads share generative AI models and platform, an end-to-end application platform, and data acceleration and sharing with governance, with management and security spanning clouds and storage in the client IT environment
• Continuous and speedy innovation
• Integrated and automated operating model
• Accelerated value of investments
• Generative AI at scale
Customers need an end-to-end data strategy to bring accelerated results for the AI pipeline

Prepare – data preparation: a workflow of steps (e.g. deduplicate, remove hate & profanity, etc.)
• Duration: hours to days
• Resources: 10–2000+ low- to mid-end CPU cores

Build – distributed training & model validation: long-running jobs on massive infrastructure
• Duration: weeks to months
• Resources: 10–500+ high-end GPUs per job (infra: 8xA100, 8xH100, high-performance networking)

Build – model adaptation: model tuning with a custom data set for downstream tasks
• Duration: minutes to hours
• Resources: 1+ mid- to high-end GPU per job (infra: 8xA100, 8xH100)

Deploy – inference: may be sensitive to latency/throughput, always cost-sensitive
• Duration: sub-second API requests
• Resources: a single GPU, or a fraction of a GPU, per fine-tuning task or serving request (infra: L40S, L4)

AI acceleration and collaboration with efficiency and resilience
Storage Requirements for AI

AI tuning/inferencing: efficient GPU support, rapid deployment, simplified Day-2 operations, high density
AI training: maximum performance, linear scaling of performance and capacity

Across all AI workloads, the stack – AI workloads on an AI platform, optimized for AI through storage acceleration and storage abstraction – needs high bandwidth, low latency, HA/DR/backup, metadata catalog integration, and scalability.
Storage: Why It Matters

• AI means a lot of I/O (yes, customers will downplay it)
• Roughly a 2:1 read:write ratio
• Reads and re-reads matter most
• Writes become massive with large-parameter models (175B+)
• Scalable performance really matters
• High-performance parallel file storage (PFS) is a scratch space, not long-term storage
• Tiering to Object/NL-SAS/Tape is common practice

https://docs.nvidia.com/https:/docs.nvidia.com/dgx-superpod-reference-architecture-dgx-h100.pdf
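To make the read/re-read point concrete, here is a minimal sketch of the training-I/O pattern behind these ratios: every epoch re-reads the full dataset, while writes occur only at checkpoint time. The paths and the checkpoint payload are hypothetical stand-ins, not IBM sample code.

import os

DATASET = "/scale/train"          # hypothetical training-data volume
CKPTS = "/scale/checkpoints"      # hypothetical checkpoint volume

def run(epochs: int) -> None:
    files = [os.path.join(DATASET, f) for f in os.listdir(DATASET)]
    read_bytes = write_bytes = 0
    for epoch in range(epochs):
        for path in files:                      # the whole dataset is re-read
            with open(path, "rb") as f:         # every epoch: reads dominate
                read_bytes += len(f.read())
        state = b"\0" * (1 << 20)               # stand-in for model state;
        with open(f"{CKPTS}/epoch-{epoch}.ckpt", "wb") as f:
            write_bytes += f.write(state)       # real 175B+ checkpoints are huge
    print(f"read {read_bytes:,} B vs wrote {write_bytes:,} B")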
IBM tops NVIDIA GPU data delivery charts
IBM Storage Scale System 6000 is over 2x more performant than ESS 3500

Why IBM Storage for NVIDIA GPUs?
The world's fastest systems need the world's best storage, and IBM has the best storage for NVIDIA GPUs.

Highest-performance platform
• Fastest performance for reads, writes, and density
• Linear scalability for future growth

A robust enterprise platform
• Six 9's availability for all apps: AI, analytics, HPC, backup, archive, cloud
• Cyber-resilient: encryption, WORM, and immutability

Collapse layers & simplify data integration
• Eliminate extra copies and share data globally with all protocols
• Data cataloging and tiering for economics and data flexibility

https://blocksandfiles.com/2023/08/15/ibm-nvidia-gpu-data-delivery/
Why IBM Storage and NVIDIA are better together to accelerate AI innovation
IBM Storage Scale accelerates your infrastructure with a hybrid-cloud-by-design platform for AI

IBM Storage Scale and the Storage Scale System 6000 form a global data platform serving AI workloads on servers with NVIDIA GPUs, NVIDIA DGX BasePOD, NVIDIA DGX SuperPOD, and NVIDIA DGX Grace Hopper.

Accelerate discovery
• Multi-protocol parallel data access with up to 310 GB/s, 13M IOPS, and NVIDIA GPUDirect® support

Increase collaboration
• Data abstraction serves remote data, non-IBM storage (e.g. NetApp, Dell, Pure, VAST, S3 object, tape), and cloud data directly to NVIDIA systems

Support lower cost and green initiatives
• New QLC computational storage with transparent archive optimization

Safeguard data from the unknown
• Cyber-enhanced 99.9999% availability with a data catalog/namespace to enhance trust
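As an aside on GPUDirect: with GDS-capable storage, file data can move straight into GPU memory without a host bounce buffer. Below is a minimal sketch using RAPIDS kvikio (Python bindings for NVIDIA cuFile); the path is hypothetical and assumes a GDS-enabled Scale mount at /scale.

import cupy
import kvikio

# Allocate a GPU buffer, then let cuFile DMA file data directly into it.
buf = cupy.empty(1 << 20, dtype=cupy.uint8)       # 1 MiB buffer on the GPU
f = kvikio.CuFile("/scale/train/shard-000.bin", "r")
nbytes = f.read(buf)                              # storage -> GPU, no host copy
f.close()
print(f"read {nbytes} bytes directly into GPU memory")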
IBM Storage for Data and AI & NVIDIA GPU Solutions
A full spectrum of scalable AI solutions
Start small and scale predictably in response to business demand with the same IBM Storage software

AI Entrant: 1 x DGX A100/H100 or 1 x NVIDIA-Certified server
• 1 x ESS 3500, 12 NVMe half populated – up to 60 GB/s read
• or 1 x SSS 6000 w/ 12 NVMe – up to 80+ GB/s read

AI Medium: 4 x DGX A100/H100 or 4 x NVIDIA-Certified servers
• 1 x ESS 3500 – up to 125 GB/s read
• or 1 x SSS 6000 – up to 310 GB/s read, up to 155 GB/s write

AI Master: 8 x DGX A100/H100 or 8 x NVIDIA-Certified servers
• 2 x ESS 3500 – up to 250 GB/s read
• or 1 x SSS 6000 – up to 310 GB/s read, up to 155 GB/s write

AI Scaler (NVIDIA SuperPOD): 32 x DGX H100 or 32 x NVIDIA-Certified servers
• 2 x ESS 3500 – up to 250 GB/s read
• or 2 x SSS 6000 – up to 620 GB/s read, up to 310 GB/s write

IBM Storage:
• Simple building block – simple, scalable, seamless upgrade path
• Enterprise features – performance, scalability, data protection, and security
• Global Data Platform services – integrate with current storage, multi-site active-active, edge to cloud to core, single namespace across multiple installations
• IBM expertise and services
• Successful deployments across the globe – telco, automobile, banking and finance, healthcare, retail, academic/research, and public sector

A simple, scalable upgrade path
IBM Storage Scale: an integral part of the Vela architecture

• Built completely on IBM Cloud infrastructure
• Dedicated IBM Storage Scale cluster on IBM Cloud instances
• Cloud-Native Scale Access (CNSA) on the GPU compute cluster (200 nodes, 1,600 GPUs), which remote-mounts the fast file system across racks 1–63 via paired top-of-rack (TOR) switches and aggregation switches AGG 1–4
• Shared POSIX file system semantics
• One volume for training data – fits the complete training dataset
• One volume for checkpointing – can accumulate ~10 days of checkpoints
• Large cost-effective data repository using IBM Cloud Object Storage
• Two-tier architecture: Active File Management (AFM) transparently moves data between the cost-effective object storage tier and the fast file system tier

Raw performance improvements:
• 3x write bandwidth compared to COS-only (15 GB/s vs 5 GB/s)
• 40x read bandwidth over NFS (40 GB/s vs 1 GB/s)
• Long-term storage for training checkpoints

Training performance improvements:
• Storage Scale improved training step time variation by 5x
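A minimal sketch of how a dedicated checkpoint volume with a ~10-day window can be used: write each checkpoint atomically, then prune anything older than the retention period. Paths and sizes are hypothetical assumptions, not Vela's actual code.

import os, time

CKPT_DIR = "/scale/checkpoints"   # hypothetical checkpoint-volume mount point
RETENTION_SECS = 10 * 24 * 3600   # ~10 days of checkpoints, as sized above

def save_checkpoint(step: int, state: bytes) -> None:
    final = os.path.join(CKPT_DIR, f"step-{step:09d}.ckpt")
    tmp = final + ".tmp"
    with open(tmp, "wb") as f:    # write-then-rename, so a reader never sees
        f.write(state)            # a partially written checkpoint
        f.flush()
        os.fsync(f.fileno())
    os.rename(tmp, final)         # atomic within the POSIX file system
    prune()

def prune() -> None:
    now = time.time()
    for name in os.listdir(CKPT_DIR):
        path = os.path.join(CKPT_DIR, name)
        if name.endswith(".ckpt") and now - os.path.getmtime(path) > RETENTION_SECS:
            os.remove(path)       # fell outside the ~10-day window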
IBM Blue Vela
HGX "SuperPOD" storage fabric (IBM Cloud / IBM Research)

Accelerated GPU compute cluster: scalable units SU#1, SU#2, … SU#n of DGX H100 nodes, remote-mounting an IBM Storage Scale System cluster (ESS 3500 #1–#n, SSS 6000 #1–#n, and an EMS) over an InfiniBand NDR storage fabric

• A leading global AI & hybrid cloud company; an AI and data platform to deliver enterprise AI services
• AI supercomputer scalable up to 5,000 H100 HGX systems
• Training LLM models with 100B+ parameters
• 1st phase: 1 SU with 32 HGX nodes
• 2nd phase will have 20 scalable units; 384 HGX nodes
• Faster results – quality & speed of the training models
• ESS 3500 for the initial Phase 1 deployment; 32 x SSS 6000 for Phase 2
• NDR is the network fabric for both compute & storage
IBM Storage Scale on ARM
GA with IBM Storage Scale 5.2.0 on April 26, 2024

Storage Scale User Group
May 13th, 2024, ISC, Hamburg, Germany

Ingo Meents
IT Architect, Storage Scale Development
Why ARM? Increasing demand in AI & HPC

• Advanced RISC Machine
• Processor designs licensed from Arm Limited
• Simple 64-bit (and 32-bit) RISC architecture
• Efficiency: embedded and mobile devices
• Growing into HPC, AI, and ML
https://www.arm.com/markets/computing-infrastructure/high-performance-computing

Examples: the Fugaku supercomputer (TOP500 list), the European Processor Initiative, NVIDIA's Grace CPU and DPUs, and AWS Graviton 2 and 3.
https://www.top500.org/system/179807/
https://www.european-processor-initiative.eu/
https://www.nvidia.com/de-de/data-center/grace-cpu/
https://aws.amazon.com/de/ec2/graviton/


ARM Neoverse Family
A group of 64-bit ARM processor cores

Neoverse N-series – data center usage (scale-out performance)
• N1 (ARMv8.2-A): Ampere Altra (2-socket, 80 cores), AWS Graviton2 (64 cores), Huawei Kunpeng 920
• N2 (ARMv9.0-A): Alibaba Yitian 710

Neoverse E-series – edge computing (efficient throughput)
• E1 (ARMv8.2-A)
• E2 (ARMv9.0-A)

Neoverse V-series – high performance (max performance)
• V1 (ARMv8.4-A): AWS Graviton3 (64 cores), Centre for Development of Advanced Computing (C-DAC) AUM
• V2 (ARMv9.0-A): NVIDIA Grace (144 cores), NVIDIA BlueField-3, AWS Graviton4, Google Axion

Fujitsu A64FX for HPC (Armv8.2-A + SVE): Supercomputer Fugaku

N3 and V3 were presented in February 2024.

**This is a general list of where ARM can be found, how it can be categorized, and some examples. This is not a Scale support list.**
https://www.nextplatform.com/2023/09/13/other-than-nvidia-who-will-use-arms-neoverse-v2-core/
Hardware Examples

NVIDIA ARM HPC Developer Kit (ARM server with Ampere® Altra® Max) – our development & test platform
• Single-socket Ampere® Altra® Max or Altra® processor
• Up to 2 x NVIDIA® A100 PCIe Gen4 GPU cards
• Up to 2 x NVIDIA® BlueField-2 DPUs
• 8-channel RDIMM/LRDIMM DDR4, 16 x DIMMs

QuantaGrid S74G-2U (Grace Hopper)
• NVIDIA GH200 Grace Hopper Superchip
• NVIDIA Grace with 72 Arm® Neoverse V2 cores, 1 processor
• NVIDIA® NVLink®-C2C at 900 GB/s
• 3 PCIe 5.0 x16 FHFL dual-width slots

BlueField-3 DPU
• Up to 16 Armv8.2+ A78 Hercules cores (64-bit)
• 16 GB on-board DDR5

Fujitsu A64FX CPU and AWS Graviton processors in Amazon EC2 – positive feedback; basic tests successful
https://www.fujitsu.com/global/products/computing/servers/supercomputer/a64fx/
https://aws.amazon.com/de/ec2/graviton/
ARM @ NVIDIA

• Grace CPU Superchip
• Grace Hopper Superchip
• Grace Blackwell Superchip – just announced in March at GTC24

Grace = the ARM CPU where our clients' workloads run
Hopper or Blackwell = the GPU where we can place data with GDS

https://www.nvidia.com
ARM support with Storage Scale 5.2.0

Included:
• SE package / install toolkit / rpm-based install
• NSD client
• Scale base functionality (I/O, policies, remote mounts, snapshots, quotas, etc.)
• Manager roles: file system manager / token manager / cluster manager
• RDMA (InfiniBand or RoCE), including GDS
• Health monitoring
• Target OS: RHEL 9.3 and Ubuntu 22.04 (customers asking for RHEL 8 should open an RFE)
• File audit logging, watch folders
• Call home
• GUI (can display an ARM node, but cannot run on ARM)

Excluded, but planned for future releases:
• NSD servers
• GNR/ECE

Excluded:
• SNC
• Protocols
• BDA / HDFS
• CNSA
• TCT

Where to get the SE package:
• https://www.ibm.com/support/fixcentral
• Data Access and Data Management editions

Supported operating systems and packages (see the naming sketch below):
• RHEL 9.3: gpfs.base-5.2.0-0.aarch64.rpm
• Ubuntu 22.04: gpfs.base_5.2.0-0_arm64.deb
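The base-package naming differs by distro, as listed above. A small illustrative helper (an assumption for illustration, not an IBM tool) that maps OS and CPU architecture to the expected 5.2.0 package file name:

import platform

def scale_base_package(os_id: str, version: str = "5.2.0-0") -> str:
    """Return the expected gpfs.base package name on ARM (per the list above)."""
    arch = platform.machine()            # 'aarch64' on 64-bit ARM Linux
    if arch != "aarch64":
        raise ValueError(f"not an ARM node: {arch}")
    if os_id == "rhel":                  # RHEL 9.3 ships an .aarch64 rpm
        return f"gpfs.base-{version}.aarch64.rpm"
    if os_id == "ubuntu":                # Ubuntu 22.04 ships an _arm64 deb
        return f"gpfs.base_{version}_arm64.deb"
    raise ValueError(f"unsupported OS for Scale 5.2.0 on ARM: {os_id}")

print(scale_base_package("rhel"))        # gpfs.base-5.2.0-0.aarch64.rpm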

