Technology challenges in exascale supercomputing

Abdulrahman Azab
Track 2
Exascale computing is a measure of computer performance and refers to systems able to perform at least one exaFLOP (10^18 floating-point operations per second). In June 2022, Frontier became the world's first public exascale computer and the world's fastest supercomputer.

Exascale/pre-exascale computing should not be thought of as just a huge number of floating-point units, letting HPC centers compete over who has the largest supercomputer by size. If an exascale-sized supercomputer cannot run an exascale application, does that actually make it an exascale supercomputer, or can it simply be described as a set of smaller supercomputers located in one room?

There are several technical challenges on the way to actually reaching exascale computing. One is how to develop exascale applications, i.e. applications with billion-way parallelism: one billion floating-point units each performing one billion calculations per second. Do such applications actually exist, and if not, what are the prospects for existing large-scale applications to reach this scale in the future?
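The billion-way parallelism figure above can be sanity-checked with simple arithmetic; this sketch just multiplies the two quantities to confirm they yield one exaFLOP (10^18 operations per second):

```python
# Back-of-the-envelope check of the billion-way parallelism figure:
# one billion floating-point units, each sustaining one billion
# operations per second, together deliver one exaFLOP.

units = 10**9          # concurrent floating-point units
ops_per_unit = 10**9   # operations per second per unit (1 GFLOP/s)

total = units * ops_per_unit
print(total == 10**18)  # True: exactly one exaFLOP
```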

A second challenge is handling the power consumption of exascale computing. A theoretical analysis by the Exascale Study Group showed that, with traditional technologies, a 1-exaFLOP system could consume more than 600 megawatts.

A third challenge is the "memory wall". If exascale/pre-exascale systems are supposed to be the fastest, how can we keep the time and energy required to move data from memory into the compute units, and from the compute units out to storage, from exceeding the time and energy required to perform a floating-point operation on that data?
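A minimal sketch of why data movement dominates, using assumed order-of-magnitude energy costs (the picojoule figures below are illustrative placeholders, not numbers from the talk; real values depend on the technology generation):

```python
# Illustrative memory-wall arithmetic. Energy costs are ASSUMED
# order-of-magnitude values for a modern node, not measured figures.

flop_pj = 20        # assumed energy of one double-precision FLOP (pJ)
dram_byte_pj = 100  # assumed energy to fetch one byte from DRAM (pJ)

# A double-precision operation on two 8-byte operands, both fetched
# from DRAM, pays for 16 bytes of data movement:
movement_pj = 2 * 8 * dram_byte_pj

# Under these assumptions, moving the operands costs far more energy
# than the arithmetic itself.
print(movement_pj / flop_pj)  # 80.0
```

This ratio is why exascale designs rely on caches, high-bandwidth memory, and algorithms with high arithmetic intensity to amortize each byte moved over many operations.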

The presentation will address the above, in addition to other general and a few system-specific challenges, and how they are currently handled in Frontier and the EuroHPC petascale systems.
