AWS Pushes Forward Its Custom Chip Efforts with Graviton3
Amazon Web Services last month expanded its silicon efforts, highlighted by the introduction of the giant cloud provider’s third-generation Arm-based Graviton processor that will power new cloud instances aimed at compute-intensive workloads like high-performance computing (HPC), scientific modeling, analytics and CPU-based machine learning inferencing.
At AWS’ re:Invent conference, the company unveiled the Graviton3 processors, which currently are in preview, and the EC2 C7g instances that will run on them. At the same time, AWS CEO Adam Selipsky announced new Trn1 instances running on the company’s year-old Trainium chips, aimed at machine learning training workloads, and boasted about the price-performance of the Inf1 instances launched in 2019, which leverage the Inferentia chips for machine learning inferencing tasks.
The company even announced storage-optimized EC2 instances — Im4gn/Is4gen/I4i — based on its Nitro solid-state drives (SSDs) for improved storage performance for I/O-intensive workloads in the AWS cloud.
AWS’ Focus on Silicon
The introduction of the latest processors and EC2 instances is the latest demonstration of AWS’ years-long effort to build its own processors to run in its cloud instances as well as in its Outposts infrastructure, which is designed to deliver AWS services and connectivity to on-premises data centers at a time when enterprise adoption of hybrid cloud models is growing rapidly.
All this comes five years after AWS bought Israeli startup Annapurna Labs in 2016, making it the foundation of its chip-making efforts.
“AWS has invested … years in its own silicon starting with Nitro, expanding to general-purpose Graviton, Inferentia for inference and now Trainium for training,” Patrick Moorhead, principal analyst with Moor Insights and Strategy, told The New Stack. “AWS can pick and choose every feature it wants and every feature it doesn’t need to take advantage of its own software. It can also optimize its I/O for its specific networking and storage. At scale, this should allow it to provide compute at lower costs and in certain circumstances, higher performance.”
Intel, AMD and Nvidia serve a broader market across multitudes of environments and some customers don’t use every feature, Moorhead said. AWS is using home-grown compute to differentiate its instances.
Price-Performance Is Key
During his keynote, Selipsky stressed the price-performance benefits enterprises will see running such workloads as AI, machine learning and analytics on instances leveraging the AWS chips rather than x86 CPUs from Intel and AMD or GPUs from those vendors and Nvidia.
“With both Trainium and Inferentia, customers can have the best price-performance for machine learning, from scaling training workloads to accelerating deep learning workloads in production with high-performance inference, making the full power of machine learning available for all customers,” the CEO said. “It’s been a goal of ours for a long time and lowering the cost of training and inference are major steps in this journey.”
AWS didn’t give away many details regarding Graviton3. Selipsky said instances running the new silicon will be 25% faster than Graviton2-powered instances on general-purpose compute workloads and will do even better on some specialized workloads: twice the floating-point performance for scientific workloads, twice the performance on cryptographic jobs and three times the performance running machine learning applications.
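For developers curious whether a given workload has actually landed on one of these Arm-based instances, the machine type the operating system reports is a quick tell. The sketch below is illustrative, not an AWS API; it simply checks the string Python's standard `platform` module returns, which is `aarch64` on Graviton-backed Linux instances:

```python
import platform
from typing import Optional

def is_arm64(machine: Optional[str] = None) -> bool:
    """Return True when the reported machine type is 64-bit Arm,
    as Graviton-backed instances report on Linux."""
    m = (machine or platform.machine()).lower()
    return m in ("aarch64", "arm64")

print(is_arm64("aarch64"))  # the string Graviton instances report on Linux
print(is_arm64("x86_64"))   # the string Intel- and AMD-based instances report
```

Passing no argument checks the machine the code is currently running on.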
Power Efficiency a Factor
Graviton3 will use as much as 60% less energy for the same performance, helped in part by the use of DDR5 memory, which consumes less power than DDR4 while delivering 50% more bandwidth. The processor will run up to 64 cores, have 50 billion transistors and come with a clock speed of 2.6GHz.
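That 60% figure implies a substantial performance-per-watt gain. As a back-of-the-envelope check (the arithmetic below is ours, not AWS’):

```python
# "Same performance at 60% less energy" means Graviton3 uses 40% of the
# comparison chip's energy for a given job.
energy_ratio = 1 - 0.60

# Performance per watt is inversely proportional to energy per job,
# so the implied gain is 1 / 0.4 = 2.5x.
perf_per_watt_gain = round(1 / energy_ratio, 2)
print(perf_per_watt_gain)  # → 2.5
```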
The Inferentia- and Trainium-based instances also are about reducing the costs of running particular workloads. The Inf1 instances deliver 70% lower cost-per-inference than comparable GPU-based EC2 instances, Selipsky said. Meanwhile, the Trainium-powered Trn1 instances, aimed at such jobs as natural language processing and image recognition, will provide up to 800 Gb/s of EFA networking throughput, twice the bandwidth available in GPU-based instances.
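The 70% claim is a straightforward cost-per-inference ratio. The sketch below uses hypothetical hourly rates and throughput, chosen only to illustrate the arithmetic; the 70% figure itself is the one AWS quotes:

```python
def cost_per_inference(hourly_rate: float, inferences_per_hour: float) -> float:
    """Hourly instance cost divided by throughput gives cost per inference."""
    return hourly_rate / inferences_per_hour

# Hypothetical numbers: a GPU instance at $3.00/hour and an Inf1 instance
# at $0.90/hour with identical throughput reproduce the quoted saving.
gpu = cost_per_inference(3.00, 1_000_000)
inf1 = cost_per_inference(0.90, 1_000_000)
saving = 1 - inf1 / gpu
print(f"{saving:.0%}")  # → 70%
```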
Enterprises also will be able to deploy the Trn1 instances in EC2 UltraClusters, which can scale to tens of thousands of Trainium chips and reach petabit scale. Those UltraClusters will be 2.5 times larger than previous EC2 UltraClusters.
“Inferentia and Trainium are all about saving money doing production-level inference and hard-core training,” Moorhead said. “AWS has been consistent with its position on saving money and therefore, before I even see the Trainium results, I have high confidence that on certain workloads, you will see significant savings.”
Trend Toward Custom Chips
Graviton, Inferentia and Trainium are part of a broader trend in the industry toward specialized processors. In a blog post this week, Chris Bergey, senior vice president and general manager of Arm’s infrastructure line of business, wrote that his company, which designs chips and licenses those designs to other firms, is helping to drive that trend with its power-efficient designs.
“Data center workloads and internet traffic are nearly doubling every two years, so performance-per-watt benefits are crucial to keep computing from increasing its carbon footprint,” Bergey wrote, adding that Arm’s growth in the cloud “is giving developers a choice to continue to innovate sustainably by providing consistent performance and scalability on a per-core basis, enabling a combination of scalable performance and efficiency to deliver industry-leading TCO.”
AWS isn’t the only hyperscaler designing its own chips in the search for more performance and efficiency. Microsoft last year reportedly decided to build Arm-based chips for use in Azure servers, while Google has developed such custom chips as its Tensor Processing Units and the security-focused OpenTitan. Facebook also is building its own data center chips.
Challenges of Building Your Own Processors
Rob Enderle, principal analyst with The Enderle Group, told The New Stack that he was unsure how this will play out.
“When companies get to a certain scale, they tend to believe that their internal economies of scale will allow them to effectively compete with focused providers as peers,” Enderle said. “This latest trend is largely the result of Intel missing a number of critical milestones … forcing most in the cloud industry to consider this path.”
However, under CEO Pat Gelsinger, Intel’s execution is improving. At the same time, AMD continues to impress with its Epyc CPUs and GPUs, which suggests the need for custom chips may be decreasing, he said.
“It also may be easier to work through firms like AMD and Intel during times of supply shortages than it will be going it alone because those firms should not only have better supply redundancies but will also be better able to deflect blame from internal decision-makers if shortages are outside even their control,” Enderle said. “Cost does remain a potential advantage to going it alone but only if you ignore the value of the various firms’ intellectual property protections and decades of experience, which typically provide offsetting reliability, consistency and performance advantages.”
In addition, over time the costs add up and the internal efforts can become unprofitable and unsustainable. Part of this is because it is difficult to find and retain the talent needed, a particular challenge at a time of huge shortages of skilled employees, the analyst said.
“While the past doesn’t always predict the future and companies at AWS’ scale can do things successfully that even the largest enterprises cannot, the computing industry has — with the clear exception of Apple — largely moved away from vertical integration repeatedly since IBM’s near-collapse in the 1990s,” Enderle said, adding that “the fundamental advantages of specialized companies remain valid as long as they execute.”