After more than a decade of planning and innovation, the tech world is standing on the cusp of the era of exascale computing. And if Intel has its way, the era won’t be very long.
At the recent developer-focused Intel Innovation 2021 event (the chipmaker’s revival of its popular Intel Developer Forum, which was last held in 2016), CEO Pat Gelsinger said at the tail end of his keynote that the upcoming Aurora supercomputer, which is based on Intel technology, will exceed 2 exaflops of peak compute performance, twice what was expected. The system is scheduled to launch next year.
However, during a question-and-answer session with analysts and journalists after the keynote, Gelsinger said that Intel engineers already are deep into working toward the next stage of compute, aiming for the zettascale level. The CEO noted that Intel’s goal is to be the first to reach zettascale computing “by a wide margin.” It expects to get there by 2027.
Said Gelsinger, “We are laying out, as part of the zetta initiative, what we have to do in the processor, in the fabric, in the interconnect and in the memory architecture, what we have to do for the accelerators and the software architecture to do it.”
The Race to Exascale
Gelsinger’s promise of 2 exaflops of performance from Aurora in 2022 and the goal of hitting the zettascale level in five years come as the rest of the world also pushes its way into the next stage of computing.
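To put those two figures side by side, here is a rough back-of-the-envelope sketch. It assumes that “zettascale” means exactly 1 zettaflop (10^21 floating-point operations per second) and that the five-year window runs from Aurora’s 2022 launch to 2027; neither assumption comes from Intel, and the variable names are purely illustrative.

```python
# Illustrative arithmetic only, using the figures quoted in the article.
EXA = 10**18    # 1 exaflop   = 10^18 floating-point operations per second
ZETTA = 10**21  # 1 zettaflop = 10^21 floating-point operations per second

aurora_peak_flops = 2 * EXA   # Aurora's stated peak performance
zettascale_flops = 1 * ZETTA  # assumption: the zettascale threshold is 1 zettaflop
years = 5                     # assumption: 2022 to 2027

# Total speedup needed to go from Aurora's peak to 1 zettaflop.
speedup = zettascale_flops / aurora_peak_flops

# Constant year-over-year growth factor that would compound to that speedup.
annual_factor = speedup ** (1 / years)

print(f"Required speedup: {speedup:.0f}x")
print(f"Implied yearly growth: {annual_factor:.2f}x per year")
```

Under those assumptions, the jump from Aurora to zettascale is roughly 500-fold, which works out to peak performance more than tripling every year for five years.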
In October, The Next Platform, citing unnamed sources, reported that China already has reached the exascale computing level in two systems, including the Sunway Oceanlite supercomputer, the successor to the current Sunway TaihuLight system, which is ranked fourth on the Top500 list of the world’s fastest supercomputers. Sunway Oceanlite reportedly hit a peak performance of 1.3 exaflops.
The other supercomputer is housed at the National University of Defense Technology, in China.
Both systems use homegrown processor and accelerator technologies, part of China’s push to lessen its reliance on CPUs and other data center technologies from outside the country.
In addition, SiPearl, a chipmaker that is designing a low-power and high-performance processor for supercomputers in the European Union, announced last week that it is working with Intel to offer a joint solution for the region’s exascale computing efforts.
The two companies will bring together SiPearl’s Rhea CPU with Intel’s Ponte Vecchio GPU to create a high-performance node that can be leveraged for exascale systems.
The push toward exascale computing has for years fueled an ongoing global competition — particularly between the United States and China — to be the first with a supercomputer that can deliver exaflops of performance. The belief is that whatever country becomes the leader in exascale computing will have an advantage in everything from military and scientific research, to business innovation, to the economy.
“It’s great for bragging rights and would likely benefit numerous government-funded research labs and projects,” Charles King, principal analyst with Pund-IT, told The New Stack.
The United States is planning to launch three exascale systems in the coming years. Frontier, which is being installed at the Oak Ridge National Laboratory, is being built by Hewlett Packard Enterprise and will include more than 9,000 Cray EX nodes, each powered by an AMD Epyc processor and four AMD Radeon Instinct MI200 GPUs. It will deliver 1.5 exaflops of performance.
Aurora was announced in 2015 and was scheduled to be launched three years later as a 180-petaflop system. It initially was to be powered by Intel’s many-core Xeon Phi processors. However, Intel canceled its Xeon Phi chips in favor of GPUs and, in the delay, Aurora was redesigned to become an exascale system.
It was further delayed as Intel struggled to move to 7-nanometer chips and had to arrange for the Ponte Vecchio GPUs to be built both internally and via chip foundry Taiwan Semiconductor Manufacturing Co. (TSMC).
Aurora will be built at the Argonne National Lab.
In 2023, the United States is scheduled to launch El Capitan, an exascale supercomputer commissioned by the Department of Energy, at the Lawrence Livermore National Lab. It will feature AMD CPUs and GPUs and also will exceed 2 exaflops of peak performance.
Ponte Vecchio Performance
According to Intel officials, the company was able to push the expected peak performance of Aurora to 2 exaflops due to the better-than-expected performance of its compute GPUs and the capabilities of its upcoming “Sapphire Rapids” Xeon server chips.
“The tech industry has always had a thing for three orders of magnitude performance barriers — kilo-, mega-, giga-, tera-, peta- and now, exaflops,” Pund-IT’s King said. “So delivering a system that delivers 2x the performance of what until recently was massively complex and difficult to attain would be quite a feather in Intel’s cap and also set the goal for what serious competitors need to pursue and achieve.”
The benefits of exascale computing will cascade down from high-performance computing (HPC) environments to enterprises, which are struggling with the massive amounts of data being generated and the need to run such modern workloads as artificial intelligence (AI), machine learning and data analytics.
The result will be that businesses “eventually will have access to powerful tools that were once relegated to deep-pocketed research labs,” the analyst said. “As a result, companies are affordably making use of digitized, highly complex processes and workflows, including oil and gas exploration, immersive product design, supply chain modeling and risk management.”
Developers Will Benefit from Exascale
In addition, as more enterprises get access to such HPC-level computing, more developers also will get access to the huge amount of computing power that exascale systems will bring, King said.
Intel’s effort to accelerate the move to zettabyte-scale computing within the next few years will be remarkable if it succeeds, he said.
“In 2008, EMC published the first of its ‘Exploding Digital Universe’ studies, which projected that the total amount of data produced worldwide per year would exceed 1 zettabyte in 2009 or 2010,” he said.
“A little over a decade later, one of the smartest CEOs in IT is talking about being able to practically store and manage data volumes of that size” by 2027, King said.
He added, “Since companies are continuing to massively grow the amounts of information they create and collect, planning for a zettabyte future is a good bet and Intel is well-positioned to pursue and achieve Gelsinger’s vision.”
A Reinvigorated Intel
Intel’s ambitious plans for an Aurora that will deliver 2 exaflops of performance, and for reaching zettascale computing a few years later, come after some difficult years for the company, during which manufacturing struggles led to missed deadlines and delayed product launches.
Since becoming CEO in February, Gelsinger has recommitted to Intel’s manufacturing capabilities, investing tens of billions of dollars in new foundries and expanding production in both the United States and Europe.
“Following years when many in the industry claimed Intel had lost its competitive mojo and was in danger of being overtaken and overcome by serious competitors and upstart technologies, Gelsinger’s presentation [at Intel Innovation] qualifies as a substantial turnaround in terms of the company’s messaging and ambitions,” King said.
Dell Technologies, Hewlett Packard Enterprise and VMware are sponsors of The New Stack.