
US Gets Bragging Rights for World’s First Exascale System

2 Jun 2022 9:17am

The first exascale supercomputer has landed on earth.

A new supercomputer called Frontier at the U.S. Department of Energy’s Oak Ridge National Laboratory (ORNL) registered a sustained performance of 1.1 exaflops on the LINPACK benchmark.

That result put Frontier at the top of the Top500 list of the world’s fastest supercomputers, which was released on Tuesday. The system displaced Fugaku at the RIKEN Center for Computational Science in Kobe, Japan, which had held the top spot for two straight years.

Race to the Top

An exaflop is a quintillion floating-point operations per second, a 1 followed by 18 zeros, and the race between the U.S., China, Japan and Europe to reach the milestone first has raged for more than a decade. One exaflop is 1,000 petaflops; the petaflop barrier was first broken by IBM’s Roadrunner supercomputer at Los Alamos National Laboratory in 2008.

The Frontier supercomputer is based on the HPE Cray EX235a architecture and has AMD’s EPYC 64C CPUs and Instinct MI250X graphics processors. The system has a total core count of 8,730,112, which includes both the CPUs and GPUs. It has 700 petabytes of storage spread out over cabinets.

The Fugaku system in Japan slipped into the second spot, delivering 442 petaflops. The system uses the A64FX processor, which is based on the 64-bit ARM architecture.

In the third spot was the new LUMI system in Finland, which belongs to EuroHPC, a private-public supercomputing initiative led by the EU. The system delivered 152 petaflops and uses AMD’s Epyc chips.

AMD was a big winner in the Top500 list, with two of the top three systems using the company’s Epyc chips. AMD competes in the x86 server market with Intel, which expects to put its chips in an exascale supercomputer called Aurora that will come online soon. Intel CEO Pat Gelsinger said the company’s Xeon server chip and its GPU accelerator, called Ponte Vecchio, will push supercomputing beyond the 2-exaflop mark.

While the U.S. may have public bragging rights with one exascale computer, it is rumored that China secretly deployed exascale systems last year. China has two systems in the top 10, the Sunway TaihuLight in the sixth spot and Tianhe-2A in the ninth spot, neither of which is an exascale system. China is developing its own chips and hiding its technological progress in light of trade wars with the U.S., which has also banned exports to China of the latest supercomputing chips made by companies like Intel.

AI and Quantum Computing

The Top500 list could be at a crossroads as specialized applications are offloaded to alternative systems such as AI accelerators and, soon, quantum computers. AI systems apply a new form of computing based on probabilities and associations, and the performance of such systems is measured differently from that of conventional, logic-based computing. MLCommons’ MLPerf has emerged as a leading benchmark for AI applications on various accelerators, including GPUs, ASICs and other chips.

The Leibniz Supercomputing Centre (LRZ) near Munich is trying out new systems that include quantum computers and a specialized machine-learning system jointly made by HPE and Cerebras, which has made a specialized AI chip that has 850,000 cores and is the size of a wafer.

“If you do LINPACK-like applications, it’s a good measure. But if you don’t do LINPACK-like applications, then it’s not very useful,” said Dr. Dieter Kranzlmüller, director of the LRZ.

LRZ is now moving away from evaluating the raw performance of supercomputers and is instead looking at workloads such as AI for quicker scientific computing results.

“What I would want to do is really to make sure that the infrastructure fits what the application users need, which is exactly how we do our procurements and which is what we want to explore,” Kranzlmüller said.

Nvidia is already pitching an alternative measure of supercomputing performance for AI workloads. At the ongoing International Supercomputing Conference, Nvidia and Los Alamos National Laboratory announced a new system called Venado, which will exceed 10 exaflops of AI performance and come online in 2023 or 2024.

Feature and inset image credit: ORNL, U.S. Dept. of Energy