© 2006 IBM Corporation
High-Performance Computing at IBM Research
September 2006
Dr. José G. Castaños
[email protected]
IBM T.J. Watson Research Center
IBM Research
The "Blue Gene" Project
• In December 1999, IBM Research announced Blue Gene
– Produce new advances in biomolecular simulations
– Investigate new hardware and software technologies for building high-performance computers
• Blue Gene follows a modular approach, in which the basic building block (or cell) can be replicated ad infinitum
• Low-power processors, which allow higher aggregate performance
– PowerPC 440 system-on-a-chip offers cost/performance advantages
– Lower complexity
– High density (2048 processors per rack, air-cooled)
– Integrated networks for large scale
• Familiar software environment, simplified for HPC
• Close attention to RAS ("reliability, availability, and serviceability") throughout the system
The Blue Gene/L Chip (ASIC)
[Figure: block diagram of the Blue Gene/L ASIC. Labels recoverable from the diagram:]
• Two PowerPC 440 CPUs (the second labeled "I/O proc"), each with a "Double FPU", 32k/32k L1, and an L2
• Multiported shared SRAM buffer
• Shared L3 directory for EDRAM (includes ECC); 4MB EDRAM, usable as L3 cache or memory
• DDR control with ECC; 144 bit wide DDR, 256/512MB
• Torus: 6 out and 6 in, each at 1.4 Gbit/s link
• Tree: 3 out and 3 in, each at 2.8 Gbit/s link
• Global interrupt: 4 global barriers or interrupts
• Gbit Ethernet; JTAG access
• Other labels: PLB (4:1), snoop; width labels 128, 256, 1024+144 ECC

Chip characteristics:
• IBM CU-11, 0.13 µm
• 11 x 11 mm die size
• 25 x 32 mm CBGA
• 474 pins, 328 signal
• 1.5/2.5 Volt
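The link counts and per-link rates on the chip diagram can be combined into a per-node figure, and six torus links per direction suggest six nearest neighbors on a three-dimensional torus. The following is a minimal sketch under those assumptions: the function names and the 8 x 8 x 8 torus size are hypothetical and chosen only for illustration; only the link counts and rates come from the slide.

```python
# Sketch: per-node link bandwidth from the figures on the chip slide,
# plus nearest-neighbor lookup on a 3D torus with wraparound.

TORUS_LINKS_OUT = 6          # from the slide: 6 out and 6 in
TORUS_GBIT_PER_LINK = 1.4    # from the slide
TREE_LINKS_OUT = 3           # from the slide: 3 out and 3 in
TREE_GBIT_PER_LINK = 2.8     # from the slide


def aggregate_gbit(links, gbit_per_link):
    """Total one-direction bandwidth over all links, in Gbit/s."""
    return links * gbit_per_link


def torus_neighbors(coord, dims):
    """Return the 6 nearest neighbors of coord on a 3D torus of size dims."""
    neighbors = []
    for axis in range(3):
        for step in (-1, 1):
            n = list(coord)
            n[axis] = (n[axis] + step) % dims[axis]  # wrap around the edge
            neighbors.append(tuple(n))
    return neighbors


torus_out = aggregate_gbit(TORUS_LINKS_OUT, TORUS_GBIT_PER_LINK)
tree_out = aggregate_gbit(TREE_LINKS_OUT, TREE_GBIT_PER_LINK)
print(f"torus out: {torus_out:.1f} Gbit/s, tree out: {tree_out:.1f} Gbit/s")

# ASSUMPTION: an 8 x 8 x 8 torus, used here only as an example size.
print(torus_neighbors((0, 0, 7), (8, 8, 8)))
```

With these numbers both networks come to about 8.4 Gbit/s outbound per node, since the tree has half as many links at twice the rate.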
Blue Gene/L Architecture
Blue Gene/L at Lawrence Livermore National Laboratory
Parameter                     BG/L
Number of racks               64
Number of nodes               65536
Processor frequency           700 MHz
Peak performance (rack)       5.7 TF
Peak performance (machine)    360 TF
Linpack performance           280 TF
Application performance       101 TF
Memory (rack)                 256 GB
Memory (machine)              16 TB
Power                         ~2 MW
Size                          250 sq. m
Bisection bandwidth           700 GB/s
Cables (number)               5,000
Cables (length)               25 km
Storage (disk)                200 TB
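The rack and machine figures in this table can be cross-checked against the numbers given elsewhere in the deck (2048 processors per rack, 700 MHz). The sketch below does that arithmetic. One value is an assumption not stated on the slides: 4 floating-point operations per cycle per processor, which is what a "Double FPU" issuing two fused multiply-adds per cycle would give. The variable and function names are illustrative.

```python
# Sketch: cross-check the peak-performance and memory rows of the table.

PROCESSORS_PER_RACK = 2048   # from the Blue Gene project slide
CLOCK_HZ = 700e6             # from the table: 700 MHz
RACKS = 64                   # from the table
NODES = 65536                # from the table
RACK_MEMORY_GB = 256         # from the table
LINPACK_TF = 280.0           # from the table
FLOPS_PER_CYCLE = 4          # ASSUMPTION: double FPU, 2 fused multiply-adds per cycle


def peak_tflops(processors, clock_hz, flops_per_cycle):
    """Theoretical peak in TFlops for a given processor count and clock."""
    return processors * clock_hz * flops_per_cycle / 1e12


rack_peak = peak_tflops(PROCESSORS_PER_RACK, CLOCK_HZ, FLOPS_PER_CYCLE)
machine_peak = rack_peak * RACKS
linpack_efficiency = LINPACK_TF / machine_peak

machine_memory_tb = RACK_MEMORY_GB * RACKS / 1024
nodes_per_rack = NODES // RACKS
memory_per_node_mb = RACK_MEMORY_GB * 1024 / nodes_per_rack

print(f"rack peak:    {rack_peak:.2f} TF")
print(f"machine peak: {machine_peak:.1f} TF")
print(f"Linpack / peak: {linpack_efficiency:.0%}")
print(f"machine memory: {machine_memory_tb:.0f} TB")
print(f"memory per node: {memory_per_node_mb:.0f} MB")
```

Under that assumption the rack peak comes to about 5.7 TF, matching the table, and the machine peak to about 367 TF, which the slide rounds to 360 TF. The memory rows are consistent as well: 64 racks of 256 GB give 16 TB, or 256 MB per node.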
Blue Gene in the Top500
Source: www.top500.org

 #  Vendor     Rmax (TFlops)  Installation
 1  IBM        280.6          DOE/NNSA/LLNL (64 racks BlueGene/L)
 2  IBM        91.2           BlueGene at Watson (20 racks BlueGene/L)
 3  IBM        75.8           ASC Purple LLNL (1280 nodes p5 575)
 4  SGI        51.9           NASA/Columbia (Itanium2)
 5  Bull       42.90          CEA/DAM Tera10 (Itanium2)
 6  Dell       38.27          Sandia - Thunderbird (EM64T/Infiniband)
 7  Sun        38.18          Tsubame Galaxy TiTech (Opteron/Infiniband)
 8  IBM        37.33          FZJ - Juelich (8 racks BlueGene/L)
 9  Cray       36.19          Sandia - Red Storm (XT3 Opteron)
10  NEC        35.86          Japan Earth Simulator (NEC)
11  IBM        27.91          MareNostrum Barcelona Supercomputer (JS20)
12  IBM        27.45          ASTRON Netherlands (6 racks BlueGene/L)
13  Cray       20.52          ORNL - Jaguar (XT3 Opteron)
14  Calif Dig  19.94          LLNL (Itanium2)
15  IBM        18.20          AIST - Japan (4 racks BlueGene/L)
16  IBM        18.20          EPFL - Switzerland (4 racks BlueGene/L)
17  IBM        18.20          KEK - Japan (4 racks BlueGene/L)
18  IBM        18.20          KEK - Japan (4 racks BlueGene/L)
19  IBM        18.20          IBM - On Demand Ctr (4 racks BlueGene/L)
20  Cray       16.97          ERDC MSRC (Cray XT3 Opteron)
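To put the list in numbers, the Rmax values of the twenty entries can be tallied to see how many are BlueGene/L systems and what share of the combined Rmax they hold. This is a small sketch using only the values listed above; the data layout and variable names are illustrative, and the BlueGene/L flag is taken from the installation notes in the list.

```python
# Sketch: share of the top-20 Rmax held by BlueGene/L systems.
# Each entry: (rank, vendor, rmax_tflops, is_bluegene_l)
TOP20 = [
    (1, "IBM", 280.6, True),
    (2, "IBM", 91.2, True),
    (3, "IBM", 75.8, False),
    (4, "SGI", 51.9, False),
    (5, "Bull", 42.90, False),
    (6, "Dell", 38.27, False),
    (7, "Sun", 38.18, False),
    (8, "IBM", 37.33, True),
    (9, "Cray", 36.19, False),
    (10, "NEC", 35.86, False),
    (11, "IBM", 27.91, False),
    (12, "IBM", 27.45, True),
    (13, "Cray", 20.52, False),
    (14, "Calif Dig", 19.94, False),
    (15, "IBM", 18.20, True),
    (16, "IBM", 18.20, True),
    (17, "IBM", 18.20, True),
    (18, "IBM", 18.20, True),
    (19, "IBM", 18.20, True),
    (20, "Cray", 16.97, False),
]

bg_count = sum(1 for _, _, _, is_bg in TOP20 if is_bg)
bg_rmax = sum(rmax for _, _, rmax, is_bg in TOP20 if is_bg)
total_rmax = sum(rmax for _, _, rmax, _ in TOP20)
bg_share = bg_rmax / total_rmax

print(f"BlueGene/L systems in the top 20: {bg_count}")
print(f"BlueGene/L Rmax: {bg_rmax:.2f} of {total_rmax:.2f} TFlops ({bg_share:.0%})")
```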
Motivation for the System Software
• Compute nodes are dedicated to running a single application, and almost nothing else
– Compute node kernel (CNK)
– Simplicity!
• I/O nodes run Linux and provide OS services: files, sockets, program launch, signals, debugging, and job termination
– Standard solution: Linux
• Service nodes run all management services (e.g., heartbeats, error checking)
– Transparent to user programs
Blue Gene/L: System Software Architecture
[Figure: system software architecture diagram. Labels recoverable from the diagram:]
• Pset 0 through Pset 1023; each pset shows one I/O node (I/O Node 0 ... I/O Node 1023) running Linux with ciod and an fs client, and compute nodes C-Node 0 through C-Node 63, each running CNK with an app
• Networks: torus, tree, Functional Gigabit Ethernet, Control Gigabit Ethernet
• Control path labels: IDo chip, JTAG, I2C
• Other labels: Service Node, System Console, CMCS, DB2, LoadLeveler, Front-end Nodes, File Servers
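The diagram labels psets 0 through 1023, each with one I/O node and compute nodes 0 through 63, which accounts for 1024 x 64 = 65536 compute nodes. The sketch below shows one way to locate a compute node within that layout. It assumes a simple contiguous numbering of compute nodes, which the slides do not specify; the function name `locate` and the global index are hypothetical.

```python
# Sketch: map a global compute-node index to (pset, local C-Node index).
# ASSUMPTION: compute nodes are numbered contiguously, pset by pset.

PSETS = 1024                   # from the diagram: Pset 0 .. Pset 1023
COMPUTE_NODES_PER_PSET = 64    # from the diagram: C-Node 0 .. C-Node 63
TOTAL_COMPUTE_NODES = PSETS * COMPUTE_NODES_PER_PSET


def locate(global_index):
    """Return (pset, c_node) for a global compute-node index.

    The pset number is also the number of the I/O node that serves
    the compute node, since the diagram shows one I/O node per pset.
    """
    if not 0 <= global_index < TOTAL_COMPUTE_NODES:
        raise ValueError(f"index out of range: {global_index}")
    return divmod(global_index, COMPUTE_NODES_PER_PSET)


print(TOTAL_COMPUTE_NODES)
print(locate(0))
print(locate(64))
print(locate(TOTAL_COMPUTE_NODES - 1))
```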
Classical MD – ddcMD
2005 Gordon Bell Prize Winner!!
Scalable, general-purpose code for performing classical molecular dynamics (MD) simulations using highly accurate MGPT potentials
MGPT semi-empirical potentials, based on a rigorous expansion of many-body terms in the total energy, are needed to quantitatively investigate the dynamic behavior of d-shell and f-shell metals.
524-million-atom simulations on 64K nodes achieved 101.5 TF/s sustained. Superb strong and weak scaling for the full machine ("very impressive machine" says PI)
Visualization of important scientific findings already achieved on BG/L: Molten Ta at 5000K demonstrates solidification during isothermal compression to 250 GPa
2,048,000 Tantalum atoms
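The headline figures can be broken down per node to give a sense of scale. The sketch below assumes "64K nodes" means 65536 nodes and uses the 360 TF machine peak quoted on the Lawrence Livermore slide; the variable names are illustrative.

```python
# Sketch: per-node breakdown of the 524-million-atom ddcMD run.

ATOMS = 524_000_000     # from the slide
NODES = 64 * 1024       # ASSUMPTION: "64K nodes" read as 65536
SUSTAINED_TF = 101.5    # from the slide
PEAK_TF = 360.0         # machine peak from the Lawrence Livermore slide

atoms_per_node = ATOMS / NODES
gflops_per_node = SUSTAINED_TF * 1000 / NODES
fraction_of_peak = SUSTAINED_TF / PEAK_TF

print(f"atoms per node:   {atoms_per_node:.0f}")
print(f"GFlops per node:  {gflops_per_node:.2f}")
print(f"fraction of peak: {fraction_of_peak:.0%}")
```

That works out to roughly 8,000 atoms per node and a bit over a quarter of the quoted peak.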
Rapid Resolidification of Tantalum (ddcMD)
• Nucleation of solid is initiated at multiple independent sites throughout each sample cell
• Growth of solid grains initiates independently, but soon leads to grain boundaries that span the simulation cell: the size of the cell is now influencing continued growth
• A recently performed 2,048,000-atom simulation indicates formation of many more grains