What’s Next in Building Better Generative AI Applications?

After the Big Bang of generative AI — OpenAI’s release of its ChatGPT demo near the end of 2022 — a variety of industries have been scrambling to make sense of the fallout. Companies have been integrating the tech into the ways they do business, and training the most popular large language models (LLMs) to help them solve their biggest challenges.
Madhukar Kumar, chief marketing officer of SingleStore, is tinkering with LLMs too, he said in this episode of The New Stack Makers podcast.
“The one that I really like, and I’m still playing with it is a large language model called Gorilla, which is trained on APIs,” he said. “So it generates and gives you APIs based on what you’re trying to do.”
Kumar spoke to TNS Makers ahead of SingleStore Now: a Real-Time AI Conference, slated for Oct. 17 in San Francisco. The day’s agenda promises speakers like Harrison Chase, creator of LangChain, a framework for building LLM applications, and Stan Girard, founder of the B2B AI platform Quivr.
But it also offers plenty of demos aimed at guiding and inspiring lots of tinkering.
At the conference, Kumar said, “We will go through the entire lessons of how to build out a generative AI application from scratch — but for an enterprise specifically.”
Bringing LLMs up to Date
Kumar pointed out a limit built into the current crop of LLMs, whether they’re open source or commercial offerings: they are “frozen in time.”
“You cannot continuously train a large language model,” he noted. “GPT-3.5, from what I’ve read, is trained on data through September of 2021. And it cost over $100 million. And it took a very long time to train it.”
Therefore, “if you ask the large language model, for example, who won the gold medal for curling in the 2022 Olympics … [it] will tell you that ‘I’m not aware of it, I don’t know.’”
In the last few months, a method of bringing LLMs’ data up to the current day has emerged, he said: “what is called in-context learning, or real-time learning, also known as retrieval augmented generation.”
RAG, Kumar said, is central to how SingleStore and its vector databases are approaching the issue of bringing LLMs up to date.
“Let’s say you ask a large language model something it doesn’t know about, like your company data. What you can do is, in real time, before you talk to the large language model, you take the user query, then you go back to your corpus of data within the enterprise, and you search for it. And then you say, OK, here’s some data. And that is called context.”
Then, he said, “you hand it over to the large language model and say, ‘Now please answer this question.’ It’s like an open-book test — but give me only answers related to this. So that also kind of puts guardrails around hallucinations,” the factually false answers that generative AI tools have sometimes generated.
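The flow Kumar describes can be sketched in a few lines of Python. This is an illustrative toy, not SingleStore’s actual API: the corpus, the `retrieve` function (a naive word-overlap ranking standing in for a real vector search), and the prompt format are all assumptions made for the example.

```python
# Minimal sketch of the RAG flow: take the user query, search the
# enterprise corpus for relevant documents ("context"), then hand
# both to the LLM with instructions to answer only from that context.

# A tiny in-memory "enterprise corpus" standing in for a real database.
CORPUS = [
    "Acme's Q3 revenue was $12M, up 8% year over year.",
    "The Acme support portal moved to a new domain in June.",
    "Acme's on-call rotation is documented in the internal wiki.",
]

def retrieve(query: str, corpus: list[str], k: int = 2) -> list[str]:
    """Naive lexical retrieval: rank documents by words shared with the query."""
    q_words = set(query.lower().split())
    scored = sorted(
        corpus,
        key=lambda doc: len(q_words & set(doc.lower().split())),
        reverse=True,
    )
    return scored[:k]

def build_prompt(query: str, context: list[str]) -> str:
    """Combine the retrieved context and the question into one LLM prompt."""
    context_block = "\n".join(f"- {doc}" for doc in context)
    return (
        "Answer using ONLY the context below.\n"
        f"Context:\n{context_block}\n"
        f"Question: {query}"
    )

query = "What was Acme's Q3 revenue?"
prompt = build_prompt(query, retrieve(query, CORPUS))
print(prompt)
```

The final prompt is what would be sent to the LLM: the retrieved documents act as the “open book,” and the instruction to answer only from them is the guardrail against hallucination that Kumar mentions.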
In order to curate the data in real time and turn it into context for LLMs, he added, the data needs to be stored as vectors, which SingleStore allows users to do. “So you can do semantic search as well as lexical search,” he said. “You can join all kinds of data, and it’s all in milliseconds, which makes the AI real time.”
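To make the semantic-search idea concrete, here is a minimal sketch of similarity search over stored vectors. The hand-made three-dimensional vectors and document names are toy assumptions; a real system would use a learned embedding model and a vector database such as SingleStore rather than a Python dictionary.

```python
# Semantic search sketch: each document is stored with an embedding
# vector, and a query vector is matched to the closest document by
# cosine similarity rather than by keyword overlap.
import math

def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors: 1.0 means same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Documents stored alongside their (toy, hand-made) embedding vectors.
docs = {
    "refund policy": [0.9, 0.1, 0.0],
    "shipping times": [0.1, 0.9, 0.1],
    "api rate limits": [0.0, 0.2, 0.9],
}

# Toy embedding of a query like "how do I get my money back?" --
# note it shares no keywords with "refund policy", yet lands closest
# to it in vector space.
query_vec = [0.8, 0.2, 0.1]

best = max(docs, key=lambda name: cosine_similarity(query_vec, docs[name]))
print(best)  # "refund policy" scores highest
```

The point of the example is the one Kumar makes: because matching happens in vector space, a query can find a relevant document even when they share no words, which is what distinguishes semantic search from lexical search.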
Check out the full episode to learn more.
Register for the SingleStore Now: a Real-Time AI Conference on Oct. 17 using the code TNS-25 and get a ticket for $25 (regular price: $199).