Seeking a New Paradigm in AI Benchmarking
AI has the potential to help solve some of the world’s most pressing challenges and to create a better future for all. Building the models that tackle those challenges requires cutting-edge infrastructure, and networking is a significant component of that infrastructure. An important part of this work is optimizing the large-scale networks that power AI/ML computations, moving the technology forward for everyone. As scale grows, performance, deployment and operational complexity all make these network solutions harder to manage.
Benchmarking, for example, not only improves current AI systems but also aids in planning future networks. A new benchmarking system we’ve developed at Meta can play an important role here, and we believe it’s another opportunity to enlist the global community of AI technologists in advancing AI efficiency analysis and benchmarking tools.
As part of benchmarking, execution traces serve additional important functions, including visualization and performance optimization. At Meta, our new Chakra execution trace is a graph-based representation of AI/ML workloads that aims to unify diverse execution trace schemas. In addition to capturing core operations such as communication, memory and compute, it captures metadata, dependencies and timing.
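To make the idea concrete, a graph of typed operation nodes with dependency edges might look like the sketch below. This is purely illustrative: the node fields, names and the `critical_path_us` helper are our own simplifications, not the actual Chakra schema defined by the MLCommons working group.

```python
from dataclasses import dataclass, field
from enum import Enum

class OpType(Enum):
    COMPUTE = "compute"
    COMMUNICATION = "communication"
    MEMORY = "memory"

@dataclass
class TraceNode:
    node_id: int
    name: str
    op_type: OpType
    duration_us: float                        # timing metadata
    deps: list = field(default_factory=list)  # IDs of nodes that must finish first
    metadata: dict = field(default_factory=dict)

# A tiny hypothetical workload: load weights, run a matmul, all-reduce gradients.
trace = [
    TraceNode(0, "load_weights", OpType.MEMORY, 120.0),
    TraceNode(1, "matmul_fwd", OpType.COMPUTE, 850.0, deps=[0]),
    TraceNode(2, "allreduce_grad", OpType.COMMUNICATION, 400.0, deps=[1],
              metadata={"bytes": 4_194_304}),
]

def critical_path_us(nodes):
    """Longest dependency-chain duration: one simple analysis a trace enables."""
    by_id = {n.node_id: n for n in nodes}
    memo = {}
    def finish(nid):
        if nid not in memo:
            n = by_id[nid]
            memo[nid] = n.duration_us + max((finish(d) for d in n.deps), default=0.0)
        return memo[nid]
    return max(finish(n.node_id) for n in nodes)
```

Because dependencies and timing are explicit in the graph, analyses like critical-path length fall out directly, without rerunning the workload.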
We believe that encouraging industrywide adoption can enhance AI efficiency analysis tools and enable holistic performance benchmarking. As part of a collaboration with the open engineering consortium MLCommons, Meta has open sourced a toolkit of simulators, emulators and replay tools that enables the collection, analysis, generation and adoption of Chakra execution traces.
Expanding Beyond the Limits of Traditional Benchmarking
For the most part, benchmarking AI systems has meant running full machine learning workloads. Established benchmarking approaches such as MLCommons’ MLPerf can provide useful insights into the performance and behavior of both AI workloads and systems. MLPerf has emerged as a leading benchmark for AI applications on various accelerators, including GPUs (graphics processing units), ASICs (application-specific integrated circuits) and other chips.
Yet that type of full-workload benchmarking has several inherent challenges, among them high compute cost, difficulty forecasting future system performance and an inability to adapt to evolving workloads.
Chakra execution traces build on our insights into the constraints of traditional benchmarking. One key goal of our work with MLCommons is to advance the benchmarking that is so essential to AI development.
The Chakra working group, for example, is curating a “Chakra trace benchmark suite,” collecting execution traces from contributing organizations. In addition, the working group is helping to address a constraint whereby traces collected on one system may not simulate accurately on a system with a different network topology, GPUs or bandwidth. The aim is to gather traces at multiple stages, including pre- and post-optimization, for use on any target system.
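To see one reason traces do not transfer directly between systems, consider how a communication operation’s measured time depends on the source network’s bandwidth. The sketch below naively rescales that time for a different target bandwidth; the function name and the split into “wire time” and “overhead” are our own simplifications, and real simulators must additionally model topology, congestion and compute/communication overlap.

```python
def rescale_comm_time(duration_us, bytes_moved, source_bw_gbps, target_bw_gbps):
    """Naively re-estimate a communication op's time on a target network.

    Splits the measured duration into time spent moving bytes at the
    source bandwidth plus fixed overhead, then recomputes the byte-moving
    portion at the target bandwidth. Illustrative only.
    """
    # 1 Gbps = 1e3 bits per microsecond, so wire time in us is bits / (Gbps * 1e3).
    wire_us = bytes_moved * 8 / (source_bw_gbps * 1e3)
    overhead_us = max(duration_us - wire_us, 0.0)
    return overhead_us + bytes_moved * 8 / (target_bw_gbps * 1e3)

# A 4 MiB all-reduce measured at 400 us on a 100 Gbps fabric, re-estimated
# for a 400 Gbps fabric: the wire-time portion shrinks fourfold.
estimate_us = rescale_comm_time(400.0, 4_194_304, 100, 400)
```

Even this single-op example shows why a trace captured on one fabric can mispredict another; across thousands of interdependent operations, the gap compounds, which is what motivates collecting traces at multiple stages for replay on any target system.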
Meta, MLCommons and the Path to Future Innovation
The Chakra working group is just one example of our work with MLCommons. We’re also part of a new multidisciplinary group working on AI safety benchmarks.
For the AI ecosystem to flourish, industry consensus is essential. The Chakra working group under MLCommons will focus on a range of projects that can help forge an agile, reproducible benchmarking and co-design system for AI. Whether it’s developing tools for capturing and converting execution traces from diverse frameworks or defining comprehensive benchmarks based on Chakra execution traces and MLCommons/MLPerf guidelines, we invite interested individuals and companies to join us.