In yet another step toward running its operations entirely on carbon-free energy sources by 2030, Google is now deploying machine learning technology that will help automatically shift workloads between data centers, depending on the availability of renewable energy resources, which can vary by type, location or the time of day.
The move is part of Google’s plan to transition to what it calls Carbon-Intelligent Compute Management, a system that will use artificial intelligence to automatically maximize clean electricity use across its data centers, thereby minimizing both its carbon footprint and its operational costs. The system works by delaying workloads that are not time-sensitive, such as encoding and analyzing videos that are uploaded to YouTube, or processing images that are uploaded to Google Photos and Drive.
The company says that these “temporally flexible” tasks will still be completed within 24 hours. Critical production tasks and user-facing services that need to run around the clock, such as Search, Maps, YouTube and cloud customers’ workloads running in allocated Virtual Machines (VMs), will not be affected by the new system.
‘Carbon-Aware Computing for Datacenters’
“Workloads are comprised of compute jobs,” explained the team of Google engineers in their recent paper on the new platform. “The system needs to consider compute jobs’ arrival patterns, resource usage, dependencies and placement consequences, which generally have high uncertainty and are hard to predict (i.e., we do not know in advance what jobs will run over the course of the next day). Fortunately, in spite of high uncertainties at the job level, Google’s flexible resource usage and daily consumption at a cluster level and beyond have demonstrated to be quite predictable within a day-ahead forecasting horizon. The aggregate outcome of job scheduling ultimately affects global costs, carbon footprint, and future resource utilization.”
Data centers account for 1% of worldwide electricity use, a proportion that has not only doubled during the last decade, but is continuing to grow. Google’s goal is to use machine learning to automatically reduce the overall amount of carbon emitted by the company’s massive fleet of computers. To do so, the system takes into account the carbon intensity of the power sources used and their predicted availability, so that time-flexible workloads can be shifted to hours when more “green” energy is available. Carbon intensity varies depending on the energy source used; for instance, coal is more carbon-intensive than wind power, because generating one unit of electricity from coal produces more grams of carbon dioxide.
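To make the carbon intensity arithmetic concrete, here is a minimal sketch in Python. The hourly intensity values, the job size and the function names are illustrative assumptions, not figures or code from Google’s paper; the sketch only shows that the same job emits less when it runs in an hour with a cleaner supply mix.

```python
# Hypothetical hourly forecast of grid carbon intensity, in grams of CO2 per kWh.
# These numbers are made up for illustration; they are not from Google's paper.
HOURLY_INTENSITY = [520, 510, 480, 300, 180, 150, 210, 450]


def emissions_g(energy_kwh: float, intensity_g_per_kwh: float) -> float:
    """Emissions in grams of CO2: energy consumed times carbon intensity."""
    return energy_kwh * intensity_g_per_kwh


def greenest_hour(intensities: list) -> int:
    """Index of the hour with the lowest forecast carbon intensity."""
    return min(range(len(intensities)), key=lambda h: intensities[h])


job_kwh = 100.0  # hypothetical energy use of one time-flexible batch job

run_now = emissions_g(job_kwh, HOURLY_INTENSITY[0])
best = greenest_hour(HOURLY_INTENSITY)
run_later = emissions_g(job_kwh, HOURLY_INTENSITY[best])

print(f"run now: {run_now:.0f} g CO2")
print(f"run at hour {best}: {run_later:.0f} g CO2")
print(f"avoided by waiting: {run_now - run_later:.0f} g CO2")
```

Under these assumed numbers, deferring the job from hour 0 to hour 5 avoids 37,000 grams of carbon dioxide without changing how much energy the job uses.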
Google’s new system uses prediction models to routinely gather the next day’s carbon intensity forecasts and to estimate the next day’s energy demands based on previous usage. The system uses what Google calls “risk-aware optimization” tools to set Virtual Capacity Curves (VCCs), or hourly resource usage limits for each data center’s cluster operating system, which is responsible for task allocation. These limits are established by analyzing predictions of flexible and inflexible demands across Google’s network of data centers, the uncertainty of those demands, and hourly carbon intensity forecasts. The optimization balances those variables against enterprise and environmental targets, infrastructure and workload performance, and the usage limits set by local energy providers.
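The idea of an hourly usage limit can be illustrated with a deliberately simplified sketch. This is not Google’s method: the paper describes a risk-aware optimization that accounts for forecast uncertainty, whereas the hypothetical function below uses a plain greedy rule with point forecasts, filling the lowest-carbon hours with flexible work first. All names and numbers are assumptions made for illustration.

```python
def virtual_capacity_curve(inflexible, flexible_total, capacity, intensity):
    """Greedy sketch of hourly usage limits (not Google's actual optimizer).

    inflexible:     forecast inflexible usage for each hour
    flexible_total: total flexible usage that must fit within the day
    capacity:       physical capacity of the cluster, in the same units
    intensity:      forecast carbon intensity for each hour
    Returns one limit per hour: inflexible usage plus the flexible share
    assigned to that hour.
    """
    if len(inflexible) != len(intensity):
        raise ValueError("forecasts must cover the same hours")
    headroom = [capacity - used for used in inflexible]
    if min(headroom) < 0:
        raise ValueError("inflexible demand exceeds capacity")
    if flexible_total > sum(headroom):
        raise ValueError("flexible demand cannot be completed within the day")

    limits = list(inflexible)
    remaining = flexible_total
    # Visit hours from cleanest to dirtiest and fill each one up to capacity.
    for hour in sorted(range(len(intensity)), key=lambda h: intensity[h]):
        share = min(headroom[hour], remaining)
        limits[hour] += share
        remaining -= share
        if remaining <= 0:
            break
    return limits


# Hypothetical four-hour day, in arbitrary resource units.
inflexible = [60, 50, 55, 70]
intensity = [500, 200, 150, 450]
vcc = virtual_capacity_curve(inflexible, flexible_total=80,
                             capacity=100, intensity=intensity)
print(vcc)
```

With these assumed inputs, all 80 units of flexible work land in the two cleanest hours, and the limits for the two most carbon-intensive hours stay at the inflexible level, so no flexible work is scheduled there. A real system would also have to hedge against forecast errors, which this sketch ignores.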
“Rather than using stylized models of demand uncertainty and its translation to power consumption, Google’s ‘Carbon-Intelligent Computing’ system uses aggregate cluster-specific resource demand forecasts and power models trained separately for each cluster,” emphasized the team. “Thus, the new framework captures diversity in workloads and hardware configurations at Google scale.”
The number of variables the new system has to juggle is enormous, but according to the team, the system is working, with room for future improvement, such as potential integration with micro-grids and grid-level dynamic energy management programs.
“Data from operation shows that VCCs effectively limit hourly capacity when the grid’s energy supply mix is carbon-intensive and delay the execution of temporally flexible workloads to ‘greener’ times,” added the researchers. “Using actual measurements from Google datacenter clusters, we demonstrate a power consumption drop of 1-2% at times with the highest carbon intensity.”
That reduction may seem insignificant, but considering the scale of Google’s operations across the globe, some experts say it’s actually a big step forward, and one that could prompt other Big Tech players to take notice and join in.
“Google was the first of the big tech players to use advanced orchestration to improve the efficiency of their data centers using a system called Borg,” explained Anne Currie, a tech ethicist who is also vice-chair of the Trademark Group at the Green Software Foundation. “What’s revolutionary about this new scheme is they are purposefully reducing efficiency so machines don’t run when there’s no green electricity to power them. Where they lead the others will follow, because it’s very hard to have green power all the time. Sometimes the sun isn’t shining, or the wind isn’t blowing. The main significance of this is Google is sharing what they are doing as they do it — even before the results are all that impressive. In the past, Google hasn’t tended to be as public about what they’re up to. They don’t usually lead in that way. It’s a good sign for the industry if the big player starts to share thoughts and cooperate. This is too important for everyone to fail to pull together.”
Ultimately, the hope is that tech industry leaders will not only develop similar measures to dynamically maximize renewable energy usage when it is available, but also actively invest in and advocate for more renewable generation capacity around the world. Other concrete measures include better data center waste management and a more “circular” approach in general to infrastructure, operations, and culture.
“My dream is that the cloud providers start to build services that help the tech industry reduce carbon released from runtime electricity and embodied carbon from hardware to zero,” said Currie. “It will be hard and it has to be done by 2030. We need all the help we can get. We also need leadership, and engineers look up to Google, Amazon and Microsoft. I’m relying on them to be creative.”
Read more in Google’s paper.
Amazon Web Services is a sponsor of The New Stack.