Retrieval Augmented Generation (RAG): the key to unlocking enterprise AI?

by Dominique Theodore, Founder/Director

Nearly two years after GenAI captured global attention with the release of ChatGPT in late 2022, most enterprises have now either invested in the technology or at least explored its possibilities. However, when it comes to the knowledge-intensive processes that form the core of many organizations, Generative AI has yet to fully deliver on its promise to transform the way we work and do business.

One of the glaring issues that has become apparent to early adopters of GenAI has been “hallucinations” – the concerning tendency for AI models to produce output that seems plausible but in fact has no basis in reality.

And while the latest models claim to have drastically reduced hallucinations, recent surveys indicate that trust remains a major roadblock to widespread AI adoption in industry. According to a survey conducted in the fourth quarter of 2023 by Gartner, 39% of participants still cite lack of trust in AI as one of the barriers to the implementation of AI within their organization.

Why do AI models hallucinate?

At the heart of all large language models such as ChatGPT, Claude or Bard lies the transformer, an AI architecture that can process large amounts of raw data and figure out how different parts of this data relate to each other. AI models use the patterns they learn from existing data to generate new data, and with some fine-tuning can be adapted to perform a wide variety of tasks such as answering questions, generating computer code and even translating languages.

What gives AI models their creativity also means that they will sometimes “try their best” when faced with a question which is outside their knowledge base, essentially “making up stuff” in the process. While this is hardly a dealbreaker for someone planning a weekend getaway, it is much more serious for a lawyer working on a multi-million-dollar merger or a financial analyst weighing their next investment decision.

Hallucinations are not the only hurdle when it comes to adopting LLMs in the enterprise. Another major drawback of LLMs is that they have limited access to proprietary data; without domain-specific knowledge, their use in knowledge-intensive enterprise use cases remains limited.

To overcome these limitations, it is possible to train an AI model on a smaller, specialized dataset, adjusting its parameters to the new data. However, this technique, known as fine-tuning, requires significant skills and resources, as well as access to sufficient labelled data to train the model.

Enter Retrieval Augmented Generation (or RAG), one of the most exciting developments in artificial intelligence, which promises to combine the capabilities of generative AI with real-time, relevant data retrieval.

RAG is an AI framework that aims to improve the quality of LLM responses by supplementing their internal representation of information with authoritative sources of knowledge.

In simple terms, the RAG pipeline involves three stages:

  1. Retrieval: This is where the RAG engine searches and ranks data which is relevant to the query.

  2. Augmentation: This takes the top-ranked results and adds them to the prompt that will be fed into the large language model.

  3. Generation: This involves combining the generative powers of the LLM with the retrieved external data to produce a response that is coherent, up-to-date and relevant to the query.
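The three stages above can be sketched in a few lines of Python. This is a toy illustration rather than a production pipeline: the corpus, the keyword-overlap scoring and the stubbed LLM call are all invented for the sake of the example — a real system would use vector embeddings for retrieval and call an actual model API for generation.

```python
# Toy sketch of the three RAG stages: retrieval, augmentation, generation.
# The corpus, scoring method and LLM stub are illustrative placeholders.

CORPUS = {
    "leave-policy": "Employees accrue 20 days of annual leave per year.",
    "remote-work": "Staff may work remotely up to two days per week.",
    "expenses": "Travel expenses require prior manager approval.",
}

def retrieve(query: str, top_k: int = 2) -> list[str]:
    """Stage 1: rank documents by naive keyword overlap with the query."""
    words = set(query.lower().split())
    scored = sorted(
        CORPUS,
        key=lambda doc_id: len(words & set(CORPUS[doc_id].lower().split())),
        reverse=True,
    )
    return scored[:top_k]

def augment(query: str, doc_ids: list[str]) -> str:
    """Stage 2: prepend the retrieved passages to the user's question."""
    context = "\n".join(CORPUS[d] for d in doc_ids)
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"

def generate(prompt: str) -> str:
    """Stage 3: call the LLM. Stubbed here; a real system would call an API."""
    return f"[LLM response to a {len(prompt)}-character prompt]"

query = "How many days of annual leave do employees get?"
answer = generate(augment(query, retrieve(query)))
```

The key point is that the LLM only sees the augmented prompt: its answer is steered toward the retrieved passages rather than whatever happens to be baked into its parameters.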

What can RAG bring to the enterprise?

Implementing RAG can bring about several benefits. By grounding the model's creative powers in enterprise-specific data, RAG reduces its reliance on information baked into its parameters. This reduces the chances that an LLM will leak sensitive data, or ‘hallucinate’ incorrect or misleading information.

RAG could also make LLMs more effective at answering questions that rely on the latest information, potentially in real time.

Ultimately, RAG will help to improve trust in artificially generated output through its ability to cross-reference the sources of information, which could pave the way for the use of AI in critical enterprise applications where accuracy and reliability are paramount.
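One way to make this cross-referencing concrete is to return the identifiers of the retrieved sources alongside the generated answer, so a reader can trace each claim back to a document. A minimal sketch, with all names and the stubbed LLM call invented for illustration:

```python
from dataclasses import dataclass

@dataclass
class RagAnswer:
    """A generated answer paired with the sources it was grounded in."""
    text: str
    sources: list[str]  # identifiers of the retrieved documents

def answer_with_citations(question: str, retrieved: dict[str, str]) -> RagAnswer:
    """Build a prompt from retrieved passages and keep their IDs as citations.

    `retrieved` maps a document ID to its text; the LLM call is stubbed out.
    """
    context = "\n".join(f"[{doc_id}] {text}" for doc_id, text in retrieved.items())
    prompt = f"Context:\n{context}\n\nQuestion: {question}"
    llm_output = f"(stubbed answer based on {len(retrieved)} sources)"
    return RagAnswer(text=llm_output, sources=list(retrieved))

result = answer_with_citations(
    "Can I carry over unused leave?",
    {"leave-policy-v3": "Unused leave may be carried over up to 5 days."},
)
# result.sources lets the user verify where the answer came from
```

Surfacing `sources` in the user interface is what turns a plausible-sounding answer into a verifiable one.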

Use case

Let us imagine a typical company’s HR team having to deal with employee questions about different aspects of the company’s HR policies. Answering such queries usually involves the team pulling up the relevant policy, determining how it applies to the situation, and then conversing with the employee to interpret the policy.

RAG makes it possible to create a chatbot that can answer questions based on actual HR policy documents. Such a RAG-enabled chatbot provides answers that are not only grounded in factual information but also reflect a correct understanding of the intent behind the questions.

The picture below shows an interaction with a simple HR chatbot we built using LlamaIndex, an open-source RAG framework, with ChatGPT as the underlying LLM. Answers are generated from policy documents processed by the RAG engine.

(Figure: an interaction with an HR chatbot.)

Although the above example involved unstructured (textual) data, RAG frameworks can also work with structured data such as tables, JSON, CSV files and SQL databases.
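For structured sources, retrieval can mean fetching the relevant rows rather than text passages. A hedged sketch using an in-memory SQLite table — the table name, schema and data are invented for illustration; in practice this would be an existing enterprise database:

```python
import sqlite3

# Illustrative HR table; a real deployment would query an existing database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE leave_balance (employee TEXT, days_left INTEGER)")
conn.executemany(
    "INSERT INTO leave_balance VALUES (?, ?)",
    [("alice", 12), ("bob", 4)],
)

def rows_as_context(employee: str) -> str:
    """Retrieve the rows relevant to the query and flatten them into prompt text."""
    rows = conn.execute(
        "SELECT employee, days_left FROM leave_balance WHERE employee = ?",
        (employee,),
    ).fetchall()
    return "\n".join(f"{name} has {days} leave days left" for name, days in rows)

# The flattened rows are appended to the LLM prompt, exactly as with text chunks.
context = rows_as_context("alice")
```

The flattened rows then play the same role as retrieved text chunks: they become context in the augmented prompt, keeping the LLM's answer grounded in live enterprise data.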

The risks and challenges

Despite the benefits of adopting RAG highlighted above, it should be mentioned that RAG is not a bulletproof solution to the underlying issues and in fact comes with its own set of challenges. Exposing sensitive, internal data to externally hosted LLMs is not without its risks, and enterprises should exercise due diligence to ensure their intellectual property is protected.

Nevertheless, RAG will be a powerful tool for enterprises looking to leverage GenAI while maintaining a focus on accuracy and reliability. By expanding the knowledge base of LLMs with real-time, relevant and domain-specific information, RAG can set the stage for applications of AI in critical knowledge-intensive businesses where dependability is non-negotiable.

If you have questions or are interested in implementing AI in your organization, do not hesitate to contact us.
