
1 petaFLOPS+ in 10 racks

TB2-TL system announcement

Rev 1A
T-PLATFORMS
Company Facts
• Russia’s leading developer of turn-key supercomputing solutions
• 160 employees
• R&D teams in Russia, Taiwan, and Ukraine
• 6 installations in the global TOP500
  • MSU “Lomonosov” cluster: #13 on TOP500
• 50% of the computational power in the Top50 list of Russia & CIS
• ~200 supercomputer installations
• Joint R&D with the scientific community; dozens of HPC hardware patents
• Manufacturing facility & Technology Center in Hannover, Germany
• Announcing the highest-density compute solution with NVIDIA: the TB2-TL system
TB2-TL SYSTEM OVERVIEW
1 Petaflops+ in only 10 42U racks
• Chassis form factor: 7U enclosure with 16 blades
• Record-breaking compute density:
  • 16 nodes with 32 Tesla X2070 GPUs and 32 Intel Xeon L5600-series CPUs
  • 6 enclosures in a standard 19” rack
  • 17.5 TF per enclosure / ~105 TF per rack (DP peak)
• Record-breaking performance-per-watt: ~1450 MFLOPS/W (1.45 GFLOPS/W); see the check after this list
  • 12 kW power consumption per chassis under full load
• Dedicated global barrier network and global interrupt network
• Integrated QDR InfiniBand and Gigabit Ethernet switches
• Near-real-time management
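
A quick check of the headline figures: 6 enclosures × 17.5 TF ≈ 105 TF per rack, and 10 racks ≈ 1.05 PF, consistent with the “1 Petaflops+” claim; likewise 17.5 TF / 12 kW ≈ 1.46 GFLOPS/W, matching the ~1450 MFLOPS/W figure. (The per-GPU breakdown behind the 17.5 TF number is derived after the X2070 spec below.)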
TB2-TL System
Look & Feel

• T-Blade 2 enclosure – front view
• T-Blade 2 enclosure with 16 TL nodes, 2 IB switches, and management modules – rear view
• TL compute node with two Tesla X2070 GPUs


TL blade
Main Features

TL blade mainboard:
• Two Intel Xeon L5630 CPUs
• 12 or 24 GB DDR3-1333 RAM options
• Two Intel 5520 IOH chips + ICH10
• Two MXM connectors for Tesla X2070 GPUs
• Two dedicated QDR InfiniBand ports (one per X2070 module)
• Optional Global Barrier and Global Interrupt networks

NVIDIA Tesla™ X2070 GPU:
• GPU @ 1.15 GHz with 448 CUDA cores
• 6 GB GDDR5 with ECC @ 1.566 GHz on a 384-bit memory interface
• Power consumption: ≤225 W

NVIDIA Tesla X2070 (Fermi) module
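
Where the 17.5 TF per enclosure comes from (our derivation from the spec above; the GPU/CPU split is not stated in the deck): Fermi-class Tesla parts retire double-precision FLOPs at half the single-precision rate, so DP peak per X2070 ≈ 448 cores × 1.15 GHz × 2 FLOPs (FMA) ÷ 2 ≈ 515 GFLOPS; 32 GPUs per enclosure give ≈ 16.5 TF, and the 32 Xeon L5630 CPUs (2.13 GHz × 4 cores × 4 DP FLOPs/cycle ≈ 34 GFLOPS each) add ≈ 1.1 TF, for ≈ 17.6 TF total. Reading the quoted 1.566 GHz as the GDDR5 I/O clock (3.132 GT/s data rate) on the 384-bit bus gives ≈ 150 GB/s of memory bandwidth per module.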
Compute blades comparison

                                      XN (Intel Xeon-based blade)   TL (X2070-based blade)
Blades in 7U enclosure                16                            16
x86 CPUs per blade                    4 (2 x 2-socket nodes)        2 (1 x 2-socket node)
x86 CPU type                          Up to Intel Xeon X5670        Up to Intel Xeon L5630
GPU modules per blade                 0                             Up to 2
GPU modules per enclosure             0                             Up to 32
IB QDR ports per blade                2 (1 per node)                2 (1 per GPU module)
Memory per blade                      24 or 48 GB DDR3              12 or 24 GB DDR3
HDDs per blade                        0                             0 (TBD)
Peak performance per enclosure (DP)   ~4.5 TF                       ~17.5 TF (~4x)
Peak power per enclosure              ~11.2 kW                      ~12 kW
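
A quick check of the XN figure (assuming the Xeon X5670’s 2.93 GHz × 6 cores × 4 DP FLOPs/cycle ≈ 70 GFLOPS per CPU): 16 blades × 4 CPUs × 70 GFLOPS ≈ 4.5 TF per enclosure, consistent with the table and with the ~4x advantage of the GPU-equipped TL blades.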
Application availability (announced 21/09 at GTC),
parallelized with MPI for multi-GPU clusters (see the sketch after this list):
• MATLAB Parallel Computing Toolbox
• AMBER version 11
• ANSYS Mechanical r13
• 3ds Max iray (near-real-time ray tracing)
• PGI CUDA-X86 compiler for universal deployment of CUDA applications
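
To make the “parallelized with MPI for multi-GPU clusters” pattern concrete, here is a minimal sketch (our illustration, not vendor code) of the usual mapping on a TL node: one MPI rank per GPU, each rank binding to a node-local device before doing its share of the work. The round-robin rank-to-device mapping is an assumption about how ranks are placed.

/* One MPI rank per GPU: each TL node hosts two X2070 modules,
 * so two ranks per node, each bound to its own device. */
#include <mpi.h>
#include <cuda_runtime.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);

    int rank, nranks;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &nranks);

    /* Bind this rank to one of the node-local GPUs (assumed mapping:
     * ranks placed round-robin, so rank % ndev picks a distinct device). */
    int ndev = 0;
    cudaGetDeviceCount(&ndev);
    if (ndev > 0)
        cudaSetDevice(rank % ndev);

    /* ... each rank now runs its kernels on its own GPU and exchanges
     * halo/partial results over the per-GPU QDR InfiniBand port ... */
    double partial = (double)rank, sum = 0.0;
    MPI_Reduce(&partial, &sum, 1, MPI_DOUBLE, MPI_SUM, 0, MPI_COMM_WORLD);

    if (rank == 0)
        printf("reduced over %d ranks/GPUs: %g\n", nranks, sum);

    MPI_Finalize();
    return 0;
}

Launched with, e.g., two ranks per node, this gives each X2070 its own rank and, per the blade design above, its own dedicated IB port.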
The Clustrx Operating System
A scalable and reliable next-generation operating system for petaflop and exaflop computing
Clustrx Subsystems
1. Clustrx Watch
   • Real-time monitoring and control
   • Covers compute nodes, management nodes, and infrastructure
2. dConf
   • Cluster-wide, decentralized, distributed storage for configuration data
3. Resource manager
   • POSIX-compliant, modular, scalable, GRID-ready
4. Network boot & provisioning
   • Infrastructure to support any number of compute nodes
   • Boots the Clustrx kernel
   • Any other stock Linux (RH, SUSE, …)
   • Windows HPC
Heterogeneous architectures
Future development
• Architecture-independent system management
• Hybrid MPI
  • Supports accelerated nodes
• Virtualization of GPGPU hardware
  • Main direction of further development
TB2-TL
Summary
• Record-breaking compute density: 105 TF (DP) per 42U 19” rack
• Record-breaking performance-per-watt: ~1450 MFLOPS/W
• Full PCIe Gen2 bandwidth; dedicated IB port per GPU
• Mix & match CPU and CPU/GPU blades for best utilization
• Petascale-ready
• Proven blade infrastructure (used in the 420 TF Lomonosov cluster: #13 on TOP500)
• Clustrx OS to support heterogeneous computing
• Ready to order: November 2010
• Shipment to select customers: Q4 2010
• GA: Q1 2011
• Pricing and availability info: [email protected]
