Why Microsoft Has to Save OpenAI
The chaotic, slapstick unravelling of generative AI darling OpenAI started with the tech industry equivalent of an early Thanksgiving family argument that turns unexpectedly bitter. It may or may not have been ended by a firm but friendly intervention from Microsoft, which looked rather like the adult in the room. But through all the twists and turns — and there may still be more of both — Microsoft’s intervention to keep the OpenAI technology (if not the company) stable is inevitable.
More Than Money
Microsoft’s recent $10 billion investment in OpenAI was hardly chump change (although it was paid for in part by wide-ranging layoffs that tarnished the impressive culture change CEO Satya Nadella has delivered at the company). But it’s already proved something of a funding Ouroboros: a substantial amount of the money Microsoft has invested in OpenAI over the years has apparently been spent on (Azure) cloud computing to run OpenAI’s large language models.
Never mind the far-distant plans to create an AGI that may never materialize. Microsoft — which wants you to think about it as “the AI company,” and specifically “the Copilot company,” rather than “the Windows company” — will effectively get the technology underpinnings of ChatGPT for about half what it paid for Nuance in 2021, or slightly less than the $7.5 billion it spent on GitHub in 2018 (adjusted for inflation). It wasn’t all spent on cloud, but Microsoft’s capex for Q1 2023 alone was $7.8 billion.
Despite having its own impressive roster of AI researchers, and its own extremely large foundation models, Microsoft cares an enormous amount about OpenAI’s ChatGPT LLMs because of the equally enormous investments in hardware and software it’s made to support them, and because of the dependency it’s taken on with OpenAI technology in almost all of its divisions and product lines.
Nadella’s opening keynote at the Ignite conference was peppered with references to OpenAI, including a preview of the GPT-4 Turbo models. Microsoft’s own products are equally seasoned with OpenAI technology, which is at the heart of the many Copilots.
Making Foundation Models Economical
LLMs and other foundation models take a lot of data, time and compute power to train. Microsoft’s solution is to treat them as platforms, building a handful of models once and reusing them over and over again, in increasingly customized and specialized ways.
Microsoft has been building the stack for creating Copilots for five years — changing everything from low-level infrastructure and data center design (with a new data center going live every three days in 2023) to its software development environment to make it more efficient.
Starting with GitHub Copilot, almost every Microsoft product line now has one or more Copilot features. It’s not just generative AI for consumers and office workers with Microsoft 365 Copilot, Windows Copilot, Teams, Dynamics and the renamed Bing Chat, or the GPT-powered tools in Power BI; there are Copilots for everything from security products (like Microsoft 365 Defender), to Azure infrastructure, to Microsoft Fabric and Azure Quantum Elements.
Microsoft customers are also building their own custom copilots on the same stack. Nadella name-checked half a dozen examples — from Airbnb and BT, to Nvidia and Chevron — but the new Copilot Studio is a low-code tool for building custom copilots using business data and Copilot plugins for common tools like Jira, SAP, ServiceNow and Trello that could make OpenAI essentially ubiquitous.
To make that happen, Microsoft has built an internal pipeline that takes new foundation models from OpenAI, experiments with them in smaller services like the Power Platform and Bing, and then uses what it’s learned from that to build them into more specialized AI services that developers can call. It has standardized on Semantic Kernel and Prompt flow for orchestrating AI services with conventional programming languages like Python and C# (and has built a friendly front end around that for developers in the new Azure AI Studio tool). These tools help developers build and understand LLM-powered apps without having to understand LLMs — but they rely on Microsoft’s expertise with the OpenAI models that underpin them.
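The core pattern those orchestration tools wrap is simple: a prompt template plus a pluggable model backend, so application code never talks to a specific LLM directly. Here’s a minimal, hypothetical Python sketch of that pattern — this is not the actual Semantic Kernel or Prompt flow API, and the `SemanticFunction` class and `stub_llm` backend are illustrative names only:

```python
# A hypothetical sketch of the orchestration pattern that tools like
# Semantic Kernel abstract: prompt template + swappable model backend.
from dataclasses import dataclass
from typing import Callable

@dataclass
class SemanticFunction:
    template: str                   # prompt with {placeholders}
    backend: Callable[[str], str]   # any LLM endpoint: Azure OpenAI, local model...

    def __call__(self, **kwargs: str) -> str:
        # Fill in the template, then hand the finished prompt to whichever
        # backend is plugged in. App code only sees this call signature.
        prompt = self.template.format(**kwargs)
        return self.backend(prompt)

# A stub stands in for a real model endpoint so the sketch runs offline.
def stub_llm(prompt: str) -> str:
    return f"[model output for: {prompt}]"

summarize = SemanticFunction(
    template="Summarize in one sentence: {text}",
    backend=stub_llm,
)

print(summarize(text="Copilots are built on shared foundation models."))
```

Because the backend is just a callable, swapping GPT-4 for a different deployment (or a test stub) means changing one constructor argument rather than every call site — which is exactly the kind of indirection that makes a handful of foundation models reusable across many products.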
Hardware Is a Real Commitment
Microsoft would have made significant investments in the Nvidia and AMD GPUs that OpenAI relies on, along with the high-bandwidth InfiniBand interconnects between nodes and the lower-latency hollow-core fiber (HCF) manufacturing it acquired Lumenisity for last year, whichever foundation models it was using.
Microsoft credits OpenAI with collaborating not just on the Nvidia-powered AI supercomputers that now routinely show up on the Top500 list but also on some of the refinements to the Maia 100. It doesn’t just sell those Azure supercomputers to OpenAI: they’re the public proof point for other customers who want similar infrastructure — or just the services that run on that infrastructure, which is now effectively almost every product and service Microsoft offers.
But previously, its main approach to AI acceleration was to use FPGAs, because they allow for so much flexibility: the same hardware that was initially used to speed up Azure networking became an accelerator for Bing search doing real-time AI inferencing, and then a service that developers could use to scale out their own deep neural networks on AKS. As new AI models and approaches were developed, Microsoft could reprogram FPGAs to create soft custom processors to accelerate them far faster than it could build a new hardware accelerator — which would quickly become obsolete.
With FPGAs, Microsoft didn’t have to pick the system architecture, data types or operators it thought AI would need for the next couple of years: it could keep changing its software accelerators whenever it needed — you can even reload the functionality of the FPGA circuit partway through a job.
But last week, Microsoft announced the first generation of its own custom silicon: the Azure Maia AI Accelerator, complete with a custom liquid cooling system and rack, specifically for “large language model training and inferencing” that will run OpenAI models for Bing, GitHub Copilot, ChatGPT and the Azure OpenAI Service. This is a major investment that will significantly reduce the cost (and water use) of both training and running OpenAI models — cost savings that only materialize if training and running OpenAI models continue to be a major workload.
Essentially, Microsoft just built a custom OpenAI hardware accelerator it won’t be deploying into data centers until next year, with future designs already planned. That’s hardly the perfect time for its close partner to have a meltdown.
Keeping the Wheels Turning
Although it’s likely made overtures over the years, Microsoft didn’t start out wanting to acquire OpenAI. Originally, it deliberately chose to work with a team outside the company so that it knew the AI training and inference platform it was building wasn’t only designed for its own needs.
But with OpenAI’s models staying so far ahead of the competition, Microsoft has bet on them more and more. Only a year after launch, ChatGPT claims 100 million users a week, and OpenAI has had to pause ChatGPT Plus signups because new subscribers were overwhelming capacity — and that’s not counting the OpenAI usage by Microsoft’s direct customers.
Whether you use ChatGPT from OpenAI or an OpenAI model built into a Microsoft product, it all runs on Azure. The lines between what Microsoft calls a ‘first party service’ (its own code) and a ‘third party service’ (from anyone else) have become rather blurred.
Theoretically, Microsoft could back out and pivot to a different foundation model, and almost all the foundation models from key players already run on Azure. But not only is changing horses in mid-stream messy and expensive, leaving you likely to lose a lot of ground, it’s also likely to damage you in the stock market and with customers. Far better to make sure that the OpenAI technology survives and thrives — whatever happens to OpenAI the company.
While the developer relations team at OpenAI has been reassuring customers that the lights are still on, the systems are still running and the engineering team is on call, OpenAI customers have reportedly been reaching out to rivals Anthropic and Google, possibly including Azure OpenAI customers that Microsoft won’t want to lose. LangChain, a startup building a framework for creating LLM-powered apps that has just announced significant integration with the Azure OpenAI Service, has been sharing advice with developers on how switching to a different LLM requires significant changes to your prompt engineering (and most examples today are for OpenAI models).
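Why isn’t switching a drop-in change? The same request has to be reshaped for each provider’s prompt conventions, and prompts tuned for one model often regress on another. A simplified, hypothetical Python sketch of just the formatting half of that problem (the function names and formats here are illustrative, and real provider APIs differ in more ways than this):

```python
# Hypothetical illustration: the same request, reshaped for two
# different prompt conventions. Real providers differ further in
# parameters, stop sequences and tuning behavior.

def to_chat_messages(system: str, user: str) -> list[dict]:
    # OpenAI-style chat completion: a list of role-tagged messages.
    return [
        {"role": "system", "content": system},
        {"role": "user", "content": user},
    ]

def to_flat_prompt(system: str, user: str) -> str:
    # Completion-style models expect one flat prompt string, so the
    # system instruction has to be folded into the text itself.
    return f"{system}\n\nHuman: {user}\n\nAssistant:"

system = "You are a terse assistant."
user = "Explain InfiniBand in one line."

print(to_chat_messages(system, user))
print(to_flat_prompt(system, user))
```

Abstraction layers like LangChain can paper over the formatting differences, but not over the fact that few-shot examples, system instructions and guardrails tuned against GPT-4 have to be re-evaluated — and often rewritten — against any replacement model.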
The OpenAI Dependency
If Microsoft’s internal customers — which means pretty much every division and product line — are having their own version of those same conversations, bringing as much OpenAI expertise as possible in-house will ease whatever transitions it needs to make if OpenAI itself does fragment or fade away.
Yes, Microsoft has what CFO Amy Hood described as “a broad perpetual license to all the OpenAI IP” up until AGI (if that ever happens) even if its partnership with OpenAI ends, but generative AI is moving so fast that just keeping today’s models running isn’t enough. Microsoft needs to count on getting future LLMs like GPT-5.
Despite the name, OpenAI has never been primarily an open source organization, with only a handful of releases and none of them the core LLMs. But it’s instructive that the most significant points in Microsoft’s slow embrace of open source weren’t just when it released core projects like PowerShell and VS Code as open source; they were when it started taking dependencies on open source projects like Docker and Kubernetes in Windows Server and Azure.
The dependency it’s taken on with OpenAI is even more significant, and one that’s ironically proved to have less stability and governance. One way or another, Microsoft is going to ensure that what it needs from OpenAI survives.