From RAG to Riches: Dispelling AI Hallucinations
Generative AI (GenAI) and large language models (LLMs) are undeniably the hottest tech of 2023, and this momentum shows no sign of slowing in 2024 or beyond. Businesses will continue to invest billions in these technologies, and wealthier organizations will indulge in M&A sprees to make sure they stay at the forefront of innovation.
As a business tool, GenAI makes perfect sense: it can make employees more effective and efficient, increase understanding and skills, and open up new opportunities. The danger of increasing organizational reliance on AI is that organizations must be able to trust its ability to make the right decision. Without that trust, an organization could spend a large part of its AI investment double- and triple-checking every prompt and answer. Worse, AI is prone to hallucinations that can confuse organizations or set them on entirely the wrong path.
The Grand Illusion
LLMs are probabilistic engines: they analyze the input and the available data, then calculate which word (or sequence of words) is most likely to come next in a reply. This approach is double-edged. On one hand, it lets an LLM respond to virtually any query on any subject in natural, understandable and grammatically correct language.
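This next-word-probability idea can be sketched with a deliberately tiny stand-in for an LLM: a bigram model that counts which word follows which in a toy corpus, then picks the candidate with the highest estimated probability. The corpus and model here are illustrative assumptions, not how a production LLM is built, but the core move (score candidates, pick the likeliest) is the same.

```python
# Toy illustration (not a real LLM): a bigram model that, like an LLM,
# scores candidate next words by probability and picks the most likely one.
from collections import Counter, defaultdict

corpus = (
    "the cat sat on the mat . "
    "the dog sat on the rug . "
    "the cat chased the dog ."
).split()

# Count word -> next-word transitions.
transitions = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    transitions[prev][nxt] += 1

def next_word_probs(word):
    """Return each candidate next word with its estimated probability."""
    counts = transitions[word]
    total = sum(counts.values())
    return {w: c / total for w, c in counts.items()}

probs = next_word_probs("the")
best = max(probs, key=probs.get)
print(best, round(probs[best], 2))
```

A real LLM does the same scoring over tens of thousands of tokens with a neural network instead of a lookup table, which is why its answers are fluent whether or not the underlying data supports them.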
On the other hand, LLMs are ultimately playing the odds. If their training, and the datasets they learn from and draw on, can't match a query, their only option is to bluff. The answer will look accurate and be delivered with complete confidence, but it isn't based on reality or on any learned knowledge that could add context. For organizations that need to make business decisions and follow best practices based on factual evidence, this greatly reduces AI's trustworthiness, and with it, its effectiveness.
Behind the Curtain
AI hallucinations are caused by a variety of factors, but ultimately they come down to the fact that, while a person has a lifetime of knowledge and experience to draw from, AI models are only ever as intelligent as their datasets.
For instance, one of the most common challenges leading to AI hallucinations is data sparsity. If a dataset has missing or incomplete values, the AI has no choice but to fill in the gaps. A person has the context, judgment and critical-thinking abilities to deal with this, but an AI could easily come to inaccurate conclusions. For example, even if they've never seen any of his movies, most people would consider Tom Hanks a good, even great, actor. However, an AI with only a few performances in its dataset might come to the opposite conclusion.
Linked to missing data is incorrect data. Poor data quality, where information is mistakenly categorized or labeled, or where an AI learns from unreliable sources, can lead to AIs unwittingly spreading misinformation. This isn't simply a case of sharing a single inaccurate fact, such as claiming that the James Webb Space Telescope took pictures 17 years before it launched. An inability to cross-reference relevant data or recognize biases leads to increasingly inaccurate conclusions, such as using unrepresentative medical data to predict, detect and treat skin cancers.
Finally, there's the question of how the AI model is trained. If the training data doesn't contain enough samples for the model to generalize, if there is too much irrelevant, "noisy" data, if the model trains for too long on a single sample dataset, or if the model is so complex that it learns from irrelevant data as well as relevant data, the result is overfitting: the model performs perfectly on its training sample but shows extremely poor pattern recognition in the real world, leading to inaccuracies and errors.
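Overfitting is easy to demonstrate outside of LLMs. A minimal sketch, using NumPy and a polynomial rather than a neural network (an assumption made purely to keep the example small): a high-degree curve memorizes a handful of noisy training points almost perfectly, then misses fresh points drawn from the same underlying pattern.

```python
# Overfitting in miniature: the true pattern is a straight line, but a
# degree-7 polynomial has enough capacity to memorize the noise in
# 8 training points, so its training error is near zero while its error
# on unseen points from the same line is worse.
import numpy as np

rng = np.random.default_rng(0)
true_fn = lambda x: 2 * x + 1                         # the real pattern

x_train = np.linspace(0, 1, 8)
y_train = true_fn(x_train) + rng.normal(0, 0.3, 8)    # noisy samples
x_test = np.linspace(0.05, 0.95, 50)                  # fresh, clean points
y_test = true_fn(x_test)

overfit = np.polyfit(x_train, y_train, 7)   # degree 7: fits the noise too
simple = np.polyfit(x_train, y_train, 1)    # degree 1: generalizes

def mse(coeffs, x, y):
    """Mean squared error of a fitted polynomial on the points (x, y)."""
    return float(np.mean((np.polyval(coeffs, x) - y) ** 2))

print("train MSE, degree 7:", mse(overfit, x_train, y_train))  # near zero
print("test MSE, degree 7: ", mse(overfit, x_test, y_test))
print("test MSE, degree 1: ", mse(simple, x_test, y_test))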
Breaking the Spell
Eliminating AI hallucinations is key to ensuring that AI realizes its full potential. A crucial first step is tackling data sparsity, poor data quality and overfitting. As with any other business function or employee, enterprises can't expect AI to operate effectively if it isn't given the right information and training. Fine-tuning, or retraining, the model also helps it generate relevant, accurate content. The catch is that without continuous retraining, the model's knowledge becomes outdated, and all of this can mean significant costs and a delayed return on investment.
Prompt engineering is another way to avoid hallucinations and is quickly becoming part of the expected AI skillset. However, it comes with its own burdens: models must always be fed highly descriptive prompts, and staff need additional training to write them.
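To make "highly descriptive" concrete, here is an illustrative contrast between a vague prompt and an engineered one. Both prompts, the role, and the `{report_text}` placeholder are assumptions invented for this example, not a prescribed template.

```python
# A vague prompt invites the model to fill gaps however it likes.
vague = "Summarize our sales."

# An engineered prompt pins down role, scope, format and, crucially,
# what to do when information is missing, which is where hallucinations
# tend to creep in. {report_text} is a hypothetical placeholder to be
# filled with the actual source document at runtime.
descriptive = (
    "You are a retail analyst. Summarize Q4 2023 sales for the EMEA region "
    "in three bullet points, citing only figures from the report below. "
    "If a figure is missing, say so rather than estimating.\n"
    "Report: {report_text}"
)

print(descriptive)
```

The last instruction, permission to say "I don't know," is one of the cheapest hallucination defenses available, because it gives the model an alternative to bluffing.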
Thinking Outside the Dataset Box
Ideally, with the right help, an AI model should be able to improve its data and alleviate hallucinations. Retrieval-augmented generation (RAG) is one of the most promising techniques for doing this. By pulling data from external sources as needed, the RAG AI framework gives LLMs the vital context they need to improve responses and, crucially, avoid hallucinations.
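The RAG pattern can be sketched in a few lines: retrieve the documents most relevant to a query, then prepend them to the prompt so the model answers from supplied facts rather than guessing. Everything below is illustrative; the in-memory document list stands in for external sources, and the word-overlap scorer stands in for real vector search. Sending the prompt to an actual LLM API is left out.

```python
# A minimal retrieve-then-generate sketch of RAG (illustrative only).

# A tiny in-memory knowledge base standing in for external data sources.
documents = [
    "The James Webb Space Telescope launched on 25 December 2021.",
    "Tom Hanks won Academy Awards for Philadelphia and Forrest Gump.",
    "Retrieval-augmented generation grounds LLM answers in external data.",
]

def retrieve(query, docs, k=1):
    """Rank documents by word overlap with the query (a crude stand-in
    for embedding-based vector search) and return the top k."""
    q_words = set(query.lower().split())
    scored = sorted(
        docs,
        key=lambda d: len(q_words & set(d.lower().split())),
        reverse=True,
    )
    return scored[:k]

def build_prompt(query, context):
    """Assemble a grounded prompt: context first, then the question."""
    return (
        "Answer using only the context below.\n"
        "Context:\n" + "\n".join(context) + "\n"
        "Question: " + query
    )

question = "When did the James Webb Space Telescope launch?"
context = retrieve(question, documents)
prompt = build_prompt(question, context)
print(prompt)
```

Because the launch date arrives inside the prompt itself, the model no longer has to rely on whatever (possibly stale or missing) version of that fact sits in its training data.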
To help applications such as virtual assistants, chatbots and other content generators produce precise, relevant responses, organizations need RAG to reference multiple information sources and develop a deep understanding of context. As with any AI application, this is a matter of trust, and by pulling information from relevant, reliable and up-to-date sources, and giving users access to those sources, RAG helps dispel doubts about AI's reliability.
Just as pressingly, RAG needs access to real-time data so that all information is as current, complete and therefore accurate as possible. During the peak retail season, for instance, an application or chatbot designed to offer users the best, most personalized deals is worthless unless it can tailor its suggestions live to each user's profile and session context. It also needs real-time visibility into dynamic price changes to formulate the best offer. After all, nobody wants recommendations for something they already bought, suggestions that miss the right product at the right time, or to discover they overpaid for an item available at a lower price elsewhere.
Furthermore, RAG should be paired with an operational data store that enhances its effectiveness. To be queried effectively, data needs to be stored as high-dimensional mathematical vectors (embeddings), which let models search by numerical similarity instead of by specific terms or language. The AI can then find relevant information in the correct context without relying on exact term matches. With a database that efficiently stores and searches these vectors, and that can turn a model's queries into vectors too, AI models can keep their understanding up to date in real time: always learning, always adapting and greatly reducing the chances that outdated or incomplete information will lead to costly hallucinations.
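The idea of matching by numerical similarity rather than shared words can be shown in miniature. The 3-dimensional "embeddings" below are hand-crafted assumptions (a real system would produce hundreds of dimensions from an embedding model), but the mechanism, ranking documents by cosine similarity to the query vector, is the same one a vector database uses.

```python
# Toy vector search: a query matches the right document by direction in
# embedding space even though it shares no words with the document title.
import numpy as np

# Hand-crafted, assumed embeddings; dimensions roughly mean
# [retail, pricing, astronomy].
doc_vectors = {
    "holiday discount guide": np.array([0.9, 0.8, 0.0]),
    "telescope imaging basics": np.array([0.0, 0.1, 0.9]),
}

def cosine(a, b):
    """Cosine similarity: 1.0 means the vectors point the same way."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

def search(query_vector, docs):
    """Return the document whose embedding is most similar to the query's."""
    return max(docs, key=lambda name: cosine(query_vector, docs[name]))

# A query about "seasonal price cuts": no word overlap with either title,
# but its embedding points in the retail/pricing direction.
query = np.array([0.8, 0.9, 0.1])
print(search(query, doc_vectors))  # prints "holiday discount guide"
```

Keeping these vectors in an operational store that is updated as the business changes is what lets the surrounding RAG pipeline reflect live prices and fresh inventory instead of last quarter's snapshot.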