An AI Safety Institute Benefits Big Tech, but Little Else
In what is known euphemistically as “legacy planning” in British politics, Prime Minister Rishi Sunak hosted the first AI Safety Summit at the beginning of the week. Invited were a range of famous tech business people and politicians, all of whom imagined they had something to gain from their proximity to each other.
Holding the summit in Bletchley Park celebrates the early British code breaking skills of Alan Turing and his Bombe machine, which would help speed up the process of decrypting the Enigma generated Nazi codes. (Later, Turing was driven into suicide by Britain, but that probably isn’t the message we want to convey.)
While the summit is certainly as meaningless as it is harmless, successive meetings may produce some form of understanding about AI. Right now, the threat from what is now called “Frontier AI” is obscure and largely dramatic. More serious, perhaps, than “What if sharks had lasers”. But hiding within the declaration was something very concrete and what this post is concerned with: the formation of an AI Safety Institute.
Before I could look closely at how this works, U.S. Vice President Kamala Harris crashed the party and announced the formation of the United States AI Safety Institute (USAISI) inside NIST. Analysts confirmed that the U.S. did not “want to lose our commercial control to the U.K.” The institute would “carefully evaluate and test new types of AI so we can understand what new models are capable of.”
Either way, does this matter? What will the institute do, and how could it affect a nascent industry? I’ll pick out three of the areas of concern that the US AISI has stated they are considering.
1. Red Teaming
This is a relatively minor service, but even this example shows the issues within a new regulated space.
When a company asks an internal set of penetration testers (a red team) to attack their own system, they are doing so to get an early heads up of problems and to orientate their staff to deal with security issues earlier in the pipeline.
If what is being “attacked” is an LLM service, this will be a barrage of requests that will check that any resulting conversation does not guide a user on the best way to commit suicide, build a viable bomb, etc. This is currently the curse of public products like ChatGPT. But it also tests regression, making sure any updated services don’t drop standards.
If this is done externally and with public results, a small service will effectively compete with that of a large company product. If this is where the smaller company is innovating, that might be welcome — but otherwise, it just reduces confidence and delays release.
Into the dragnet may come the Frontier AI within autonomous vehicles. So any new company may have their AI being used in test cars around a circuit. On the face of it, that would be a good thing — although the problem with research in autonomous driving AI isn’t really achieving safety, it is the consistency of the underlying models. Reducing competition won’t help with this.
2. AI Fairness
A particularly pernicious problem with large selection sets is when they clearly don’t have a diverse root. The infamous “show me a picture of a doctor” producing a white middle-aged male is a stereotypical example concern.
For smaller startups, forcing their LLMs to pass fairness tests will steer them into using pre-curated sets that won’t fail in an area where they may not be innovating in. This classic regulation side effect could just move money to canny set curators, while doing little to advance AI research.
Yet again, larger firms have the time and space to take advantage of legitimate concerns and force legislators into setting ever lower bars for small firms to limbo under.
3. AI Explainability and Interpretability
One of the earliest problems with the current LLM models is that it is very hard to backtrace an AI decision to see how it was made. This is because reinforcement learning happens over millions of small patterns that don’t hold any specific meaning for human observers.
While it would be easier to trust generative AI if it could more easily be interrogated, this feels like an area that might need a lot of work — again, in an area where a small firm cannot afford to research in. This doesn’t feel like it should be used as an early barrier to entry.
So, Is This Just a Corporate Shakedown?
If this appears to be setting out a one-sided picture of a known side effect, let’s go back to the AI Safety Summit. Who was invited? Large firms have the resources to make themselves known by launching early public services or playing Go. The event also exposed divergences over the use of open source AI models between large companies and start-ups, as well as governments around the world. Industry leaders have been making it clear they regard open source models as “dangerous”.
Real and valid safety corrections tend to happen from the ground up civil action, that works through regulators to change the law. This route was how wearing seat belts became compulsory in Europe and most of the United States. While it could be argued that other solutions were ignored (perhaps airbags could have filled the same crash safety role) any new car entry has a relatively cheap requirement that they have to fulfill. And the motor industry was already mature when these changes were introduced. No one is disputing what a car is, or what happens when a driver or passenger is ejected through the windscreen at 30mph.
Within an environment of fear-driven on scant evidence, an AI Safety Institute seems like a dubious attempt to move early on possibly fake concerns that will nevertheless further entrench Big Tech interests. This isn’t to doubt that companies take all sorts of shortcuts to get to market; they definitely do. But the looming threat of early regulation may throw the AI baby out with the bath water.