What Is Retrieval-Augmented Generation (RAG)?
Generative artificial intelligence (AI) excels at creating text responses based on large language models (LLMs), where the AI is trained on a massive number of data points. The good news is that the generated text is often easy to read and provides detailed responses that are broadly applicable to the questions asked of the software, often called prompts.

The bad news is that the information used to generate the response is limited to the information used to train the AI, often a generalized LLM. The LLM's data may be weeks, months, or years out of date, and in a corporate AI chatbot it may not include specific information about the organization's products or services. That can lead to incorrect responses that erode confidence in the technology among customers and employees.
Consider, for example, a sports league that wants fans and the media to be able to chat with a bot about its games and players. In addition to the large, fairly static LLM, the league owns or can access many other information sources, including databases, data warehouses, documents containing player bios, and news feeds that discuss each game in depth. RAG lets the generative AI ingest this information, so the chatbot can provide information that's more timely, more contextually appropriate, and more accurate.
Key Takeaways
RAG is a relatively new artificial intelligence technique that can improve the quality of generative AI by allowing large
language models (LLMs) to tap additional data resources without retraining.
RAG models build knowledge repositories based on the organization’s own data, and the repositories can be continually
updated to help the generative AI provide timely, contextual answers.
Chatbots and other conversational systems that use natural language processing can benefit greatly from RAG and
generative AI.
Implementing RAG requires technologies such as vector databases, which allow for the rapid encoding of new data and
fast searches against that data to feed into the LLM.
Consider all the information that an organization has—the structured databases, the unstructured PDFs and other documents,
the blogs, the news feeds, the chat transcripts from past customer service sessions. In RAG, this vast quantity of dynamic data is
translated into a common format and stored in a knowledge library that’s accessible to the generative AI system.
The data in that knowledge library is then processed into numerical representations using a special type of algorithm called an embedding language model and stored in a vector database, which can be quickly searched to retrieve the correct contextual information.
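As a concrete illustration of this step, here is a minimal sketch of embedding a few documents into an in-memory vector store. It assumes the open source sentence-transformers library and made-up sample text; a production system would use a dedicated vector database, but the mechanics are the same.

```python
# Minimal sketch: convert documents to vectors with an embedding model and
# keep them in an in-memory store (a NumPy array standing in for a real
# vector database). Library choice and sample data are assumptions.
import numpy as np
from sentence_transformers import SentenceTransformer

documents = [
    "Tonight's game is at Riverside Stadium; first pitch is at 7:05 p.m.",
    "Starting lineup: Alvarez, Chen, and Okafor lead off for the home team.",
    "Reporters are calling this matchup the tightest divisional race in years.",
]

model = SentenceTransformer("all-MiniLM-L6-v2")  # any embedding model works here
doc_vectors = model.encode(documents, normalize_embeddings=True)  # one row per document
```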
Now, say an end user sends the generative AI system a specific prompt, for example, “Where will tonight’s game be played, who
are the starting players, and what are reporters saying about the matchup?” The query is transformed into a vector and used to
query the vector database, which retrieves information relevant to that question’s context. That contextual information plus the
original prompt are then fed into the LLM, which generates a text response based on both its somewhat out-of-date generalized
knowledge and the extremely timely contextual information.
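Continuing the sketch above, the retrieval-and-augmentation step might look like this: the prompt is embedded, the closest documents are found by cosine similarity, and the results are prepended to the prompt before the LLM call (left here as a placeholder).

```python
# Continues the sketch above. Retrieval: embed the query, score every stored
# vector by cosine similarity (a plain dot product, since the vectors were
# normalized), and return the k best-matching documents.
def retrieve(query: str, k: int = 2) -> list[str]:
    q = model.encode([query], normalize_embeddings=True)[0]
    scores = doc_vectors @ q
    top = np.argsort(-scores)[:k]
    return [documents[i] for i in top]

prompt = "Where will tonight's game be played, and who are the starting players?"
context = "\n".join(retrieve(prompt))
augmented_prompt = f"Answer using this context:\n{context}\n\nQuestion: {prompt}"
# response = llm.generate(augmented_prompt)  # hypothetical LLM call
```

Normalizing the embeddings lets a dot product serve as cosine similarity, which is the common design choice in vector databases.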
Interestingly, while the process of training the generalized LLM is time-consuming and costly, updates to the RAG model are just the opposite. New data can be fed through the embedding language model and translated into vectors on a continuous, incremental basis. In fact, the answers from the entire generative AI system can be fed back into the RAG model, improving its performance and accuracy, because, in effect, it knows how it has already answered a similar question.
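Continuing the same sketch, an incremental update can be as simple as embedding the new material and appending it to the store; the LLM itself is never retrained.

```python
# Continues the sketch above. Incremental update: embed new documents (a
# fresh news item, or a vetted answer the system itself produced) and append
# them to the store. Only the vector index grows; the LLM is untouched.
def add_documents(new_docs: list[str]) -> None:
    global doc_vectors
    new_vectors = model.encode(new_docs, normalize_embeddings=True)
    doc_vectors = np.vstack([doc_vectors, new_vectors])
    documents.extend(new_docs)

add_documents(["Postgame recap: the home team won 4-2 behind Chen's pitching."])
```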
An additional benefit of RAG is that by using the vector database, the generative AI can provide the specific source of data cited
in its answer—something LLMs can’t do. Therefore, if there’s an inaccuracy in the generative AI’s output, the document that
contains that erroneous information can be quickly identified and corrected, and then the corrected information can be fed into
the vector database.
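One way to support such citations, again continuing the sketch, is to store a source identifier alongside each document; the file names below are hypothetical.

```python
# Continues the sketch above. Each document stored so far gets a hypothetical
# source identifier, so retrieved passages can be cited and a bad source can
# be located, corrected, or deleted.
sources = ["schedule.db", "lineups.json", "news-feed.rss", "recap-feed.rss"]

def retrieve_with_sources(query: str, k: int = 2) -> list[tuple[str, str]]:
    q = model.encode([query], normalize_embeddings=True)[0]
    top = np.argsort(-(doc_vectors @ q))[:k]
    return [(documents[i], sources[i]) for i in top]
```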
In short, RAG provides timeliness, context, and accuracy grounded in evidence to generative AI, going beyond what the LLM
itself can provide.
RAG isn’t the only technique used to improve the accuracy of LLM-based generative AI. Another technique is semantic search,
which helps the AI system narrow down the meaning of a query by seeking deep understanding of the specific words and
phrases in the prompt.
Traditional search is focused on keywords. For example, a basic query asking about the tree species native to France might
search the AI system’s database using “trees” and “France” as keywords and find data that contains both keywords—but the
system might not truly comprehend the meaning of trees in France and therefore may retrieve too much information, too little,
or even the wrong information. That keyword-based search might also miss information because the keyword search is too
literal: The trees native to Normandy might be missed, even though they’re in France, because that keyword was missing.
Semantic search goes beyond keyword search by determining the meaning of questions and source documents and using that
meaning to retrieve more accurate results. Semantic search is an integral part of RAG.
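The difference is easy to demonstrate. The sketch below (reusing the embedding model and imports from the earlier sketches) shows the keyword filter coming up empty on the trees-in-France example, while embedding similarity is likely to surface the Normandy passage.

```python
# Sketch: keyword search vs. semantic search on the trees-in-France example.
# The keyword filter misses the Normandy passage because the literal word
# "France" never appears in it; embedding similarity compares meanings instead.
passages = [
    "Oaks and beeches are among the tree species native to Normandy.",
    "France exports wine, cheese, and aircraft parts.",
]
query = "tree species native to France"

keyword_hits = [p for p in passages if "trees" in p and "France" in p]
print(keyword_hits)  # [] -- the two keywords never co-occur literally

passage_vectors = model.encode(passages, normalize_embeddings=True)
q = model.encode([query], normalize_embeddings=True)[0]
print(passages[int(np.argmax(passage_vectors @ q))])  # likely the Normandy passage
```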
The “ask a question, get an answer” paradigm makes chatbots a perfect use case for generative AI, for many reasons. Questions
often require specific context to generate an accurate answer, and given that chatbot users’ expectations about relevance and
accuracy are often high, it’s clear how RAG techniques apply. In fact, for many organizations, chatbots may indeed be the
starting point for RAG and generative AI use.
Questions often require specific context to deliver an accurate answer. Customer queries about a newly introduced product, for
example, aren’t useful if the data pertains to the previous model and may in fact be misleading. And a hiker who wants to know
if a park is open this Sunday expects timely, accurate information about that specific park on that specific date.
The RAG has access to information that may be fresher than the data used to train the LLM.
Data in the RAG’s knowledge repository can be continually updated without incurring significant costs.
The RAG’s knowledge repository can contain data that’s more contextual than the data in a generalized LLM.
The source of the information in the RAG’s vector database can be identified. And because the data sources are known,
incorrect information in the RAG can be corrected or deleted.
Implementing RAG brings its own challenges, including the following:
Managing increased costs; while generative AI with RAG is more expensive to implement than an LLM on its own, this route is
less costly than frequently retraining the LLM itself
Determining how to best model the structured and unstructured data within the knowledge library and vector database
Developing requirements for a process to incrementally feed data into the RAG system
Putting processes in place to handle reports of inaccuracies and to correct or delete those information sources in the RAG
system
Cohere, a leader in the field of generative AI and RAG, has written about a chatbot that can provide contextual information
about a vacation rental in the Canary Islands, including fact-based answers about beach accessibility, lifeguards on nearby
beaches, and the availability of volleyball courts within walking distance.
Oracle has described other use cases for RAG, such as analyzing financial reports, assisting with gas and oil discovery, reviewing
transcripts from call center customer exchanges, and searching medical databases for relevant research papers.
In the future, one possible direction for RAG technology is to help generative AI take appropriate action based on
contextual information and user prompts. For example, a RAG-augmented AI system might identify the highest-rated beach
vacation rental on the Canary Islands and then initiate booking a two-bedroom cabin within walking distance of the beach
during a volleyball tournament.
RAG might also be able to assist with more sophisticated lines of questioning. Today, generative AI might be able to tell an
employee about the company’s tuition reimbursement policy; RAG could add more contextual data to tell the employee which
nearby schools have courses that fit into that policy and perhaps recommend programs that are suited to the employee's job
and previous training—maybe even help apply for those programs and initiate a reimbursement request.
Oracle is integrating generative AI across its wide range of cloud applications, and generative AI capabilities are
available to developers who use OCI and across its database portfolio. What’s more, Oracle’s AI services offer predictable
performance and pricing using single-tenant AI clusters dedicated to your use.
The power and capabilities of LLMs and generative AI are widely known and understood—they’ve been the subject of
breathless news headlines for the past year. Retrieval-augmented generation builds on the benefits of LLMs by making them
more timely, more accurate, and more contextual. For business applications of generative AI, RAG is an important technology to
watch, study, and pilot.
Is retrieval-augmented generation the same as generative AI?
No. Retrieval-augmented generation is a technique that can provide more accurate results to queries than a generative large language model on its own, because RAG uses knowledge external to the data already contained in the LLM.
What kinds of data can RAG use?
RAG can incorporate data from many sources, such as relational databases, unstructured document repositories, internet data streams, media newsfeeds, audio transcripts, and transaction logs.
How does retrieval-augmented generation work?
Data from enterprise sources is gathered into a knowledge repository and then converted to vectors, which are stored in a vector database. When an end user makes a query, the vector database retrieves relevant contextual information. This contextual information, along with the query, is sent to the large language model, which uses the context to create a more timely, accurate, and contextual response.
Can RAG cite the sources of the information it retrieves?
Yes. The vector databases and knowledge repositories used by RAG hold specific information about the sources of their data. This means sources can be cited, and if there's an error in one of those sources, it can be quickly corrected or deleted so that subsequent queries won't return that incorrect information.