AI Engineering in 2023: The LLM Stack and New AI Dev Tools
Using large language models (LLMs) in application development has been one of the biggest trends in technology this year. It began with companies using OpenAI’s proprietary models via its API, but by the end of 2023 there were a plethora of different LLMs to choose from — including open source LLMs that developers can directly access, rather than rely on an API.
As well as the proliferation of LLMs, there has been an expansion of dev tools available for integrating LLMs into apps. We’ll discuss this and more as we look at five key trends in AI engineering this year.
1. Emergence of the AI Engineer
First and foremost, there’s now a new role for developers to consider for their careers: AI engineer.
AI engineer is the next step up from “prompt engineer,” according to its main proselytizer Shawn “@swyx” Wang. Earlier this year he created a nifty diagram showing where AI engineers fit into the wider AI and development ecosystems:
The role of AI engineer is still very new. As of the end of 2023, it means a developer who uses LLMs and associated tooling — such as the LangChain framework and vector databases.
In an interview I conducted with Shawn Wang in October, at the AI Engineer Summit in San Francisco that he co-hosted, he likened the AI engineer role to being a mobile specialist.
“So, think of AI as a platform, like mobile engineering, right? Like, you just specialize in the mobile stack. I don’t want to touch it, because mobile is gnarly. You go to all the mobile conferences, you know all the mobile tech, and you know the debates. But when I need anything mobile done, I come to you and you know how to get it done.”
He added that all developers should at least familiarize themselves with what AI engineering is — just as they would’ve at least learned the scope of mobile engineering when that became popular ten to fifteen years ago.
2. The Evolution of the LLM Stack
An associated trend in AI engineering this year has been the emergence of a tech stack for this new role. There are varying opinions on what the stack includes, but I like the following diagram from VC firm Andreessen Horowitz (a16z):
The orchestration layer is perhaps the most important for AI engineers, as that is where their applications connect to LLMs. This is where “prompt engineering” comes in: the practice of crafting and structuring the queries sent to an LLM so that it returns useful output for an application. Over 2023, tools like LangChain and LlamaIndex emerged to help developers with prompt engineering and other LLM integrations.
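To make the pattern concrete, here is a minimal, framework-free sketch of what an orchestration layer does: fill a reusable prompt template, send it to a model, and hand the result back to the application. The model call is a stub of my own invention (a real chain built with LangChain or LlamaIndex would call an actual LLM API); only the shape of the flow is the point here.

```python
# Conceptual sketch of the orchestration pattern that frameworks like
# LangChain formalize: prompt template -> model call -> application.
# The LLM call below is a stand-in, not a real API.

def fill_template(template: str, **variables: str) -> str:
    """Prompt engineering step: render a reusable prompt template."""
    return template.format(**variables)

def fake_llm(prompt: str) -> str:
    """Hypothetical stand-in for a real model call (OpenAI, Llama 2, etc.)."""
    return f"Answer to: {prompt}"

def run_chain(template: str, **variables: str) -> str:
    """Chain the steps: render the prompt, query the model, return output."""
    prompt = fill_template(template, **variables)
    return fake_llm(prompt)

result = run_chain(
    "Summarize the following release notes in one sentence: {notes}",
    notes="v2.0 adds vector search and streaming responses.",
)
print(result)
```

Frameworks like LangChain add a lot on top of this skeleton (memory, tool calls, retries, swappable model backends), but the template-then-call chain is the core idea.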
It’s worth noting the word “chain” in LangChain’s name, which indicates that it can interoperate with other tools — not just various LLMs, but other dev frameworks too. In May, for example, Cloudflare announced LangChain support for its Workers framework.
3. Open Source LLMs
Arguably the most impactful development this year in AI engineering was the rise of open source LLMs. Having alternative, non-proprietary LLMs to choose from became particularly important after OpenAI nearly imploded in November, due to an attempted boardroom coup.
Most AI engineers I’ve talked to say that OpenAI’s LLMs are still superior to all the other LLMs. However, open source models are fast catching up. Meta’s Llama 2, announced in July, currently tops Stanford’s HELM (Holistic Evaluation of Language Models) benchmarking leaderboard.
When Meta first announced Llama, back in February, it released the model weights to the research community under a non-commercial license. Other powerful LLMs, such as OpenAI’s GPT models, are typically only accessible through limited APIs.
“So you have to go through OpenAI and access the API, but you cannot really, let’s say, download the model or run it on your computer,” Sebastian Raschka from Lightning AI explained to me in May. “You cannot do anything custom, basically.”
In other words, Llama is much more adaptable for developers. As we head into 2024, this is potentially very disruptive to the current leaders in LLMs, like OpenAI and Google.
4. Vector Databases
By far the biggest influence on the data side of LLM development this year has been the use of vector databases.
Microsoft defines a vector database as “a type of database that stores data as high-dimensional vectors, which are mathematical representations of features or attributes.” The data is stored as a vector via a technique called “embedding.”
In a contributed post on The New Stack earlier this year, Mark Hinkle used the analogy of a warehouse to explain the use case for vector databases. “Imagine a vector database as a vast warehouse and the AI as the skilled warehouse manager,” Hinkle wrote. “In this warehouse, every item (data) is stored in a box (vector), organized neatly on shelves in a multidimensional space.” The AI can then retrieve or compare items based on their similarities. According to Hinkle, a vector database is “ideal for applications like recommendation systems, anomaly detection and natural language processing.”
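In code, the core operation Hinkle’s warehouse analogy describes is nearest-neighbor search over embedding vectors. The sketch below hand-rolls it with toy 3-dimensional vectors and made-up document labels; real embeddings produced by a model have hundreds or thousands of dimensions, and a real vector database like Pinecone or Chroma indexes them for fast approximate search rather than scanning every item.

```python
import math

def cosine_similarity(a, b):
    """Similarity of two embedding vectors: closer to 1.0 = more alike."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Toy "warehouse": each item is stored as a (label, embedding) pair.
store = [
    ("cat article", [0.9, 0.1, 0.0]),
    ("dog article", [0.8, 0.3, 0.0]),
    ("tax guide",   [0.0, 0.1, 0.9]),
]

def search(query_vec, k=1):
    """Return the k stored items most similar to the query embedding."""
    ranked = sorted(
        store,
        key=lambda item: cosine_similarity(query_vec, item[1]),
        reverse=True,
    )
    return [label for label, _ in ranked[:k]]

# A query embedded near the "animal" region of the space.
print(search([0.85, 0.2, 0.0], k=2))  # -> ['cat article', 'dog article']
```

This brute-force scan is fine for three items; the value a vector database adds is doing the same similarity lookup efficiently over millions of embeddings.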
Newer database solutions like Pinecone or open source projects like Chroma have carved out a profitable niche in vector databases this year. But there are now many options for vector databases in the market — including from existing database companies that are bolting this functionality on. For example, Redis offers vector database functionality in its Redis Enterprise product.
5. AI Agents
Perhaps the most controversial trend in AI engineering has been AI agent software, such as AutoGPT, which was released to the world at the end of March. AI agents are automated pieces of software that use LLMs for various tasks.
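The basic agent pattern is a loop: the LLM proposes an action, the software executes it, and the observation is fed back in until the model declares the goal met. The bare-bones sketch below shows only that loop shape, with both the LLM and the tool execution stubbed out by hypothetical stand-ins (AutoGPT itself is far more elaborate, with memory, real tool integrations, and self-critique):

```python
def fake_llm(goal, history):
    """Hypothetical stand-in for the LLM that plans the next action."""
    if not history:
        return "search: latest LLM benchmarks"
    return "finish: compiled a summary of benchmark results"

def execute(action):
    """Hypothetical stand-in for tool execution (web search, file I/O)."""
    return f"result of [{action}]"

def run_agent(goal, max_steps=5):
    """LLM-in-a-loop: plan, act, observe, repeat until 'finish'."""
    history = []
    for _ in range(max_steps):
        action = fake_llm(goal, history)
        if action.startswith("finish:"):
            return action.removeprefix("finish:").strip()
        history.append((action, execute(action)))
    return "step limit reached"  # guard against the loop going off the rails

print(run_agent("summarize LLM benchmarks"))
```

Note the `max_steps` cap: without some guard like it, an agent that never emits a finishing action would loop indefinitely, which is one concrete version of the reliability worry discussed below.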
At the AI Engineer Summit in October, I sensed overconfidence among some of the speakers about the abilities of these automated agents. Maybe it was even hubris, because the general idea of agents seems to be to take humans out of the equation. But if you have ever dealt with automated chatbots from the likes of your bank or phone company, chances are you wished that a human was on the other end of the chat.
Jacob Marks, a machine learning engineer at Voxel51 and one of the conference attendees, put it this way in a LinkedIn post: “AI Agents are far from reaching their full potential. In part, this is due to the difficulty in creating robust evaluations for said agents. AutoGPT is in major flux.”
Perhaps 2024 is when AutoGPT and other AI agent software will come into its own. But for now, as OpenAI co-founder Andrej Karpathy warned in April, the risk is that AI agents could “go off the rails.”
It’s been a hectic year of innovation in AI engineering. But despite glaring issues — both technical (see AI agents) and business (see OpenAI boardroom) — we can expect even more progress in generative AI next year. OpenAI is being actively challenged by the likes of Meta and Google, and all indications are that the underlying technology will continue to improve in leaps and bounds. Just this week, Google released Gemini, a model it claims outperforms ChatGPT in most tests. The battle for LLM supremacy will continue into 2024.
In addition, the LLM app ecosystem is likely to mature, as young companies like LangChain and Pinecone continue to expand. We may also see governments step in with regulation next year, so it won’t all be smooth sailing.
Regardless of what 2024 brings, 2023 has been a wild year for AI engineering — and it’ll probably be remembered as a pivotal one in the history of the internet.