
The AI Infrastructure Alliance Wants to Build a ‘Canonical Stack’

25 Mar 2021 8:30am, by

During the last several years, the influence of machine learning and artificial intelligence has grown enormously in our daily lives — from chatbots automating customer service, to algorithmically driven recommendation engines and a slew of virtual assistants. Yet, despite the powerful potential of ML and AI to transform the way that the average company does business, only about 8% of companies with 10 or fewer employees are using some form of ML/AI, compared to 25% of companies with more than 250 employees. This relatively uneven adoption of AI could be chalked up to a shortage of talent needed to meet the demand, but it’s also an infrastructural problem: currently, there’s a lack of common standards and cross-platform interoperability when it comes to developing machine learning tools and frameworks, which makes it difficult for smaller companies to incorporate AI into their daily operations.

In an effort to overcome this issue, 25 AI startups from around the world recently joined together to form the AI Infrastructure Alliance (AIIA), a non-profit organization that aims to create what it calls a “canonical stack” for AI by establishing robust engineering standards and consistent integration points within the AI infrastructure ecosystem.

“To us, ‘canonical’ means the definitive, the broadly accepted, the key components,” explained Daniel Jeffries, the Alliance’s director and chief technology evangelist for Pachyderm, a member of the Alliance. “Until you have a stack built around this idea, regular companies can’t build effective AI/ML practices.”

Much of the trouble stems from the fact that AI development is currently being dominated by Big Tech companies — mainly Facebook, Amazon Web Services, Apple, Microsoft, Netflix and Google — which makes it difficult for smaller enterprises to get a foot in the proverbial AI/ML door and innovate on their own terms.

“Right now, AI/ML is not democratized,” said Jeffries. “It’s still the province of the biggest, most cutting-edge companies. Big companies are building the infrastructure and designing the algorithms all at once. But for AI to trickle down to the real world, we need a strong, stable infrastructure foundation to build on — in other words, we want to build the road that allows every car to drive.”

To illustrate the point, Jeffries draws an analogy between what’s happening in AI now and the earlier emergence of the LAMP and MEAN stacks in web development, where a generally accepted set of software components was used together to create a complete solution, with each layer building on the one below to form a “stack.” Similarly, a canonical stack of standard AI components needs to coalesce before smaller enterprises can start to produce groundbreaking applications en masse.

“Once you have that standardized, broadly accepted platform, developers can move up the stack to solve way more interesting problems,” said Jeffries. “WhatsApp’s tiny team of just 35 engineers reached 400 million users because they didn’t have to invent messaging protocols, transit layer security, a GUI, scalable peer-to-peer protocols and everything else that goes along with building a cutting-edge traditional application. All those components existed, fully baked, so WhatsApp engineers could just put it all together to do something cool with it. That’s what we need in AI now: development is still way too complicated and hard for the average company, and that’s going to change when we have a canonical stack for AI/ML.”

What a Canonical AI Stack Looks Like

For the AIIA, the future of democratized AI would be built upon a seamless deployment framework, capable of rolling out tightly integrated components, with connections to various SaaS services. Much of the work, such as setting up and connecting a data versioning platform, a feature store, a pipelining system, a model deployment framework and a monitoring system, could be automated and built with contributions from Alliance members, said Jeffries.
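As a rough illustration only, the layering Jeffries describes can be sketched as a chain of interchangeable stages that share one interface. Everything in this sketch is hypothetical: the `Stage` and `Stack` names, the dictionary-based contract and the toy stage functions are assumptions made for the example, not an AIIA specification or any member's API.

```python
import hashlib
from dataclasses import dataclass, field
from statistics import mean
from typing import Any, Callable, Dict, List

Artifact = Dict[str, Any]


@dataclass
class Stage:
    """One layer of the stack: a name plus a function from artifact to artifact."""
    name: str
    run: Callable[[Artifact], Artifact]


@dataclass
class Stack:
    """Chains stages in order; every stage shares the same dict-in, dict-out contract."""
    stages: List[Stage] = field(default_factory=list)

    def execute(self, artifact: Artifact) -> Artifact:
        result = dict(artifact)
        lineage = []
        for stage in self.stages:
            result = stage.run(result)
            lineage.append(stage.name)  # record which layers touched the artifact
        result["lineage"] = lineage
        return result


# Hypothetical toy stages, one per layer named in the article.
def version_data(a: Artifact) -> Artifact:
    digest = hashlib.sha256(repr(a["rows"]).encode()).hexdigest()
    return {**a, "data_version": digest[:8]}


def store_features(a: Artifact) -> Artifact:
    return {**a, "features": {"mean": mean(a["rows"])}}


def train_pipeline(a: Artifact) -> Artifact:
    # The "model" is just a threshold at the feature mean.
    return {**a, "model": {"threshold": a["features"]["mean"]}}


def deploy_model(a: Artifact) -> Artifact:
    threshold = a["model"]["threshold"]
    return {**a, "predict": lambda x: x > threshold}


def monitor(a: Artifact) -> Artifact:
    return {**a, "monitored": True}


stack = Stack([
    Stage("data_versioning", version_data),
    Stage("feature_store", store_features),
    Stage("pipeline", train_pipeline),
    Stage("model_deployment", deploy_model),
    Stage("monitoring", monitor),
])

result = stack.execute({"rows": [1.0, 2.0, 3.0, 6.0]})
print(result["lineage"], result["data_version"], result["predict"](5.0))
```

Because each stage only depends on the shared contract, any one of them could in principle be swapped for a different vendor's implementation without touching the others, which is the kind of consistent integration point the Alliance says it is after.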

“We have companies and projects in every aspect of machine learning in the Alliance, from data labeling with YData and Superb AI, to model serving with Seldon, UbiOps, and Algorithmia, to data versioning and lineage with Pachyderm, to feature stores with Tecton, to experiment tracking with ClearML and UbiOps, to monitoring with Superwise, WhyLabs, Fiddler, or New Relic,” said Jeffries.

Jeffries adds that, to ensure easy deployment, the AIIA is also engaging with larger, general-purpose infrastructure companies like Canonical, which focus on building comprehensive deployment frameworks. According to Jeffries, cementing the connections between Alliance members is an ongoing process, with members now forming “micro-alliances” and building bridges between their respective engineering and integration teams.

“Members are putting out papers together with joint examples,” added Jeffries. “They’re improving their open source projects and enterprise projects so they work better together. They’re building connectors and integration points, and eventually, we’ll abstract those connection points out. We’re already talking with several advanced data science engineering teams that are working on amazing open source projects that form the glue between different platforms, and we’re looking to roll them under the Alliance.”

Of course, with so many moving parts to coordinate, fostering these emerging links hasn’t been without challenges, and the AIIA is looking to learn from the missteps of similar efforts so that it can avoid repeating them.

“We’ve got to make sure that everyone sees the bigger picture and works together — a rising tide lifts all boats,” said Jeffries. “We don’t want this to turn into a meaningless reference architecture. We don’t want everyone in the Alliance pushing and pulling so hard that it warps the stack all out of proportion or collapses to individual interests. The trick here is to focus on mutual benefits — every member of the Alliance must ask themselves how the Canonical Stack can help the Alliance as a whole. We also don’t want governance by pure committee. That destroyed OpenStack. Committees are good at logistics and taking votes, but they’re not good at creative work. They’re not good at true vision. So we need visionary architects working behind the scenes.”

But Jeffries says the Alliance has prepared for these challenges in advance and has the right tools to establish a canonical stack for AI/ML. Such a counterweight is key to loosening Big Tech’s stranglehold on the AI industry, which would reduce the threat of vendor lock-in with any one AI/ML provider and help mitigate worrisome ethical issues such as algorithmic bias, privacy violations and Big Tech’s disingenuous practice of ethics-washing.

“Tech giants are not incentivized to protect privacy, or create universal standards,” noted Jeffries. “In fact, they’re incentivized to do the opposite — exploit privacy and keep a walled garden up to protect their resources. Smaller, more agile companies can work together and compete because we’re a rebel alliance. We have no vested interest in, say, protecting ad-targeting data. We can build real standards, instead of standards in name only.

“The big companies can pay lip service to openness and privacy and universal interoperability, but the truth is written right on the tin for everyone to see. Dictatorships say they’re protecting security and peace but really they’re just protecting their own power — this is what the Alliance is working to unwind.”

Image: Héctor J. Rivas via Unsplash
