How Google Cloud Enabled a Record Calculation of Pi
“The 100-trillionth decimal place of π (pi) is 0.”
Emma Haruka Iwao knows — because she led the first team that ever calculated every single digit. After running calculations for 157 days, 23 hours, 31 minutes and 7.651 seconds, their big moment arrived when it was time to check. “I was going to be the first and only person to ever see the number,” Iwao remembered in a blog post published this week.
The results? It was a new world record. “We’d calculated the most digits of π ever — 100 trillion to be exact.”
But what’s equally interesting is how they did it. Iwao and her team had put 128 virtual CPUs on the calculations, along with a whopping 864GB of RAM. “As a developer advocate at Google Cloud, part of my job is to create demos and run experiments that show the cool things developers can do with our platform,” Iwao explained in the blog post. “One of those things, you guessed it, is using a program to calculate digits of pi.”
Iwao had already set an earlier record for digits of pi calculated. Back in 2019, using Google Compute Engine (powered by Google Cloud), Iwao had fired up 25 virtual machines which churned non-stop for 121 days, ultimately calculating pi out to a record 31,415,926,535,897 places — the first time a record-setting calculation had been performed using the cloud.
But this time, to calculate pi all the way out to its 100,000,000,000,000th decimal place, Iwao worked with a team of three more developer advocates — a team which availed themselves of the very latest infrastructure.
For starters, Iwao’s team switched to Google’s new Virtual NIC (gVNIC) network driver, which tightly integrates with Google’s Andromeda virtual network stack, giving them higher throughput and lower latency. This meant that their main compute node, running Debian Linux 11, ended up with a jaw-dropping 100Gbps of egress bandwidth (critical for a project like this, which uses a network-based, shared storage architecture).
“Back in 2019 when we did our 31.4-trillion digit calculation, egress throughput was only 16 Gbps,” Iwao points out in a second more-technical blog post, “meaning that bandwidth has increased by 600% in just three years… This increase was a big factor that made this 100-trillion experiment possible, allowing us to move 82.0PB of data for the calculation, up from 19.1PB in 2019.”
Yes, that’s 82PB of data moved, a figure that dwarfs the 515TB used to ultimately store the number. “The algorithm has multiple variables,” Iwao explained in an interview with The New Stack on Tuesday. “So pi is just one number, that’s 100 trillion digits. But in order to calculate pi, you need to keep and store a lot of these different numbers, a few variables in the equation.”
Iwao also writes that the amount of data processed, 82,000TB, is equivalent to 2,598 years of high-definition movies. (Versus 19,100TB in 2019, or 606 years of high-definition movies.)
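For a rough sense of scale, the figures quoted above can be turned into an average data rate. The short Python sketch below is only back-of-the-envelope arithmetic, not anything the team published: it assumes decimal units (1PB = 10^15 bytes) and that the 82.0PB was spread evenly over the full run, and the names `average_gbps`, `RUNTIME` and so on are made up for illustration.

```python
from datetime import timedelta

# Figures quoted in the article.
RUNTIME = timedelta(days=157, hours=23, minutes=31, seconds=7.651)
DATA_MOVED_PB = 82.0   # data moved during the 100-trillion-digit run
LINK_GBPS = 100        # egress bandwidth of the compute node
OLD_LINK_GBPS = 16     # egress bandwidth available in 2019

# Assumption: decimal units, 1 PB = 10**15 bytes.
BYTES_PER_PB = 10**15


def average_gbps(petabytes, runtime):
    """Average throughput in gigabits per second over the whole run."""
    bits = petabytes * BYTES_PER_PB * 8
    return bits / runtime.total_seconds() / 1e9


avg = average_gbps(DATA_MOVED_PB, RUNTIME)
print(f"Average throughput: {avg:.1f} Gbps")
print(f"Share of the 100 Gbps link: {avg / LINK_GBPS:.0%}")
print(f"Exceeds the 2019 limit of {OLD_LINK_GBPS} Gbps: {avg > OLD_LINK_GBPS}")
```

Under those assumptions, the sustained average alone is well above what the 2019 network limit would have allowed, which is consistent with Iwao's point that the bandwidth increase made the experiment possible.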
But it turns out that calculating 100 trillion digits of pi isn’t just a storage problem. “Because the dataset doesn’t fit into main memory, the speed of the storage system was the bottleneck of the calculation,” Iwao wrote. “We needed a robust, durable storage system that could handle petabytes of data without any loss or corruption, while fully utilizing the 100 Gbps bandwidth.
“To store the final results, we attached two 50TB disks directly to the compute node.”
And before they even began, the team also used the Terraform tool (along with a home-grown program) to performance-test dozens of different possible infrastructure options — and different parameters for the pi-calculating program y-cruncher. And the fine-tuning paid off. “Overall, the final design for this calculation was about twice as fast as our first design,” Iwao’s blog post notes. “In other words, the calculation could’ve taken 300 days instead of 157 days!”
The winning configuration? “We designed a cluster of one computational node and 32 storage nodes, for a total of 64 iSCSI block storage targets.”
Iwao’s blog post calls the new record “a testament to how much faster Google Cloud infrastructure gets, year in, year out” (citing improvements not only in its computing power, but also in its storage and networking). Iwao writes that when people asked in 2019 what would happen next, she knew that the scientific community “just keeps counting,” applying whatever cutting-edge tools are available. Since there’s no end to pi, the number of calculated digits will keep growing as long as computing itself keeps evolving.
In our interview Tuesday, I’d asked Iwao: what’s the message to this milestone? “For non-tech people, who maybe just use computers, what I want to say is computers are getting faster,” Iwao told me. “And it’s exciting — that computer science and engineering are exciting fields to learn about… If you’re interested, it’s great to learn cloud computing or computer science in general…
“For the tech community the message is more specific. Since 2019, we added more than 62 trillion digits — so we tripled the number of digits. And we didn’t spend three times more time…. So the computers are actually more than twice as fast.
“Sometimes even when you work in the industry, it’s hard to keep track of all the new announcements and new tech coming every week. I think of pi as a proxy to the overall status of computers and architectures… Being able to say we achieved more than two times the speed. That’s significant.”
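The "more than twice as fast" claim can be sanity-checked using only numbers from this article. The sketch below compares digits per day for the two runs. It treats the 2019 run as exactly 121 days (the article gives no finer figure), and it uses a deliberately simple metric: the work needed to compute pi grows faster than the digit count, so a plain digits-per-day ratio, if anything, understates the improvement. The variable names are illustrative.

```python
from datetime import timedelta

# Figures quoted in the article.
DIGITS_2019 = 31_415_926_535_897
DIGITS_2022 = 100_000_000_000_000
RUNTIME_2019 = timedelta(days=121)  # assumption: exactly 121 days
RUNTIME_2022 = timedelta(days=157, hours=23, minutes=31, seconds=7.651)

digit_ratio = DIGITS_2022 / DIGITS_2019
time_ratio = RUNTIME_2022 / RUNTIME_2019  # timedelta / timedelta gives a float
speedup = digit_ratio / time_ratio        # ratio of digits computed per day

print(f"Digits:   {digit_ratio:.2f}x as many")
print(f"Time:     {time_ratio:.2f}x as long")
print(f"Digits per day: {speedup:.2f}x higher")
```

Roughly three times the digits in about 1.3 times the wall-clock time works out to a bit under 2.5 times the digits per day, in line with Iwao's "more than two times the speed."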
Here are some other highlights from our interview:
TNS: Does the speed keep improving, year after year? That is, does cloud computing have its own equivalent to Moore’s Law?
Specific to Google Cloud, I don’t think we give predictions or future road maps beyond what we have on the website. But for computers in general, my opinion is we still don’t see an end to progress.
We are still seeing improvements in semiconductors, CPUs, architectures, operating systems and everything on top of that. So I believe we’ll still continue to see improvements and speedups in the future…
Speaking specifically of CPUs, it sounds like you’re not one of those people who believes that Moore’s Law is finally dead.
I think we’ve been saying that for years already. And I think what’s cool is scientists in engineering are working to extend the limit, to go beyond the big limit we initially thought was possible. So because we know the challenge, and we understand what we need to do to make it even faster — I think we’ll continue to see more improvements…
This is a powerful moment for computers and computer science because you can rely on and reuse other people’s work and achievements. There are open source communities, and there is lots and lots of information and knowledge that you can take and incorporate into your own challenges.
Even within scientific, high-performance-computing communities, there are a number of open source software tools and solutions, so the improvements in cloud platforms are not just about hardware but about software as well. There are tools like compilers, profilers, development tools, libraries like MPI and OpenMP, and schedulers. There are a lot of technologies that you can use, and some of them are available for Google Cloud as well.
I think learning about these technologies is one thing; it’s an exciting and challenging field. And using them and creating a new solution and developing new software is I think another challenge and an interesting one.
So what happens now that we’ve calculated pi to the 100 trillionth digit? What can we do with it?
To be 100 percent honest, there are some experiments and research you can do with the numbers — for example, checking frequency of numbers and patterns. For any scientific calculations, you don’t need more than maybe 40 or 50 digits — that’s all you need.
I think this is one of the cases where how you do it is actually more important and interesting than what you get in the end. We did publish the entire 100-trillion digit result online, and anyone can download and access that through cloud storage or the API we offer.
I love to hear what people might want to use those numbers for…
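The kind of experiment Iwao mentions, checking how often each digit appears, is easy to try at a much smaller scale. The Python sketch below is a toy, not the record-setting method: the team used y-cruncher, while this example swaps in Machin's formula (pi = 16·arctan(1/5) − 4·arctan(1/239)) with plain integer arithmetic, which is only practical for thousands of digits. The function names and the 10-digit guard margin are choices made for this illustration.

```python
from collections import Counter


def arctan_inv(x, scale):
    """Return arctan(1/x) * scale, via the Taylor series in integer arithmetic."""
    term = scale // x          # scale / x**1
    total = term
    x_squared = x * x
    n = 3
    sign = -1
    while term:
        term //= x_squared     # scale / x**n for n = 3, 5, 7, ...
        total += sign * (term // n)
        n += 2
        sign = -sign
    return total


def pi_digits(n):
    """Return the first n decimal places of pi as a string (Machin's formula)."""
    guard = 10                 # extra digits to absorb truncation error
    scale = 10 ** (n + guard)
    pi_scaled = 16 * arctan_inv(5, scale) - 4 * arctan_inv(239, scale)
    # str(pi_scaled) starts with "3"; the next n characters are the decimals.
    return str(pi_scaled)[1:n + 1]


def digit_frequencies(digits):
    """Count how many times each decimal digit occurs in a digit string."""
    counts = Counter(digits)
    return {d: counts.get(d, 0) for d in "0123456789"}


if __name__ == "__main__":
    print("3." + pi_digits(50))
    for digit, count in digit_frequencies(pi_digits(1000)).items():
        print(digit, count)
```

The first call prints the 50 decimal places that, by Iwao's estimate, already exceed what any scientific calculation needs. The second tallies digit counts over the first 1,000 places, the same kind of frequency check that could be run against the published 100-trillion-digit result.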
So then what does the future hold? Will we see another record broken for the number of digits of pi calculated?
Improvements in computing, storage and networking are among the examples that we see every day in the cloud and other computing platforms. It might not be me; it might be me, no one knows. But I think the scientific community as a whole will continue getting more digits of pi in the future. And we will continue to see improvements to computers, algorithms, and what-not.
So I’m excited to see what’s coming next, and I hope we, as humanity, collectively continue to see more digits of pi in the future.