Orchestrating Multi-Agent Systems for Multi-Source Information Retrieval and Question Answering with Large Language Models
1 Introduction
The rapid evolution of Large Language Models (LLMs) has transformed the fields of in-
formation retrieval and question-answering (Q&A) systems, enabling significant advance-
ments in understanding and generating human-like text. These capabilities have unlocked
new possibilities for retrieving precise and contextually relevant information from diverse
sources. However, integrating data from heterogeneous sources - such as unstructured text
documents, structured databases, and real-time APIs - into a unified system remains a
complex challenge. Traditional systems often fall short in managing this complexity, strug-
gling to retrieve and correlate information across varying formats, which can compromise
the accuracy and relevance of responses. This challenge highlights the need for sophisti-
cated frameworks that can dynamically orchestrate and retrieve information from multiple
sources while leveraging the contextual understanding offered by LLMs.
Professionals across industries often face the daunting task of navigating vast amounts
of unstructured text while simultaneously accessing structured data. This process is not
only labor-intensive but also error-prone, as locating specific information and correlating
it across disparate sources can be difficult. For example, in Contract Management, re-
trieving details from both lengthy contract documents and structured database records
often requires extensive manual effort. Tasks such as identifying penalties, SLAs, or dead-
lines buried within hundreds of pages and linking them with structured metadata demand
significant time and attention to detail.
To address these issues, we propose a dynamic multi-agent framework that leverages ad-
vanced techniques in orchestration and retrieval to enhance the capabilities of multi-source
question-answering systems.
2 Background
When applied to large documents, RAG ensures that only the most pertinent sections are
retrieved and incorporated into the LLM's context, reducing information overload and
enhancing answer precision. The
selection of similarity metrics, such as cosine or Euclidean distance, significantly impacts
which chunks are chosen for retrieval [Gao et al., 2023b]. In RAG, the chunking strategy
is pivotal because it directly affects the quality of the retrieved content.
A well-crafted chunking approach ensures that the information is cohesive, semantically
complete, and preserves its intended meaning. Various chunking methods can be applied
depending on the nature and structure of the data. For example, one common technique
involves dividing text into chunks based on a set number of tokens, often including an
overlap parameter to maintain continuity between segments. This overlap is particularly
useful in lengthy documents where important details may span multiple chunks. Another
method, particularly effective for structured documents, involves chunking based on spe-
cific sections or headers, such as splitting contracts into clauses or legal sections. This
ensures that each chunk is a self-contained, semantically meaningful unit. The choice of
chunking technique plays a vital role in balancing the need to capture full context while
ensuring relevance in the retrieved information.
2.3 Text-to-SQL
Text-to-SQL translates natural language queries into SQL commands, bridging the gap
between plain-text inputs and relational databases [Seabra et al., 2024b]. This technique
empowers users to access precise, structured data without requiring knowledge of SQL
syntax [Liu et al., 2023]. By leveraging LLMs, Text-to-SQL systems interpret natural lan-
guage, map it to database schemas, and generate accurate queries.
As noted in [Pinheiro et al., 2023], Text-to-SQL systems excel in complex database
environments by linking entities to tables and generating SQL commands. This capability
is particularly valuable in domains like Contract Management, where queries often span
multiple tables with intricate relationships. The correlation between Text-to-SQL systems
and the semantics embedded in the relational schema plays a crucial role in determining
the accuracy of the generated SQL commands. Relational schemas inherently define the
structure and relationships between tables, columns, and data types, providing a semantic
framework that Text-to-SQL models rely on to map natural language queries to precise
SQL statements.
When the schema is well-designed with clear, intuitive naming conventions and mean-
ingful relationships, it enhances the model’s ability to interpret user intent and generate
accurate SQL commands. However, if the schema contains ambiguous or poorly named
entities, lacks sufficient normalization, or features complex relationships, the Text-to-SQL
system may struggle to correctly align the query’s semantics with the database structure.
This misalignment can lead to incomplete, incorrect, or overly broad SQL queries, reduc-
ing the accuracy of the retrieved data. Therefore, the interplay between the relational
schema’s semantics and the Text-to-SQL model’s understanding is critical for achieving
high-quality query translations. Improving schema clarity and incorporating semantic an-
notations can further enhance the system’s performance by providing additional context
for accurate SQL generation.
Text-to-SQL complements RAG by providing precise, structured data retrieval. While
RAG focuses on retrieving semantically similar text for generative synthesis, Text-to-
SQL delivers exact matches from structured databases [Seabra et al., 2024a]. This synergy
enhances the flexibility of multi-source systems, enabling them to address a diverse range
of queries effectively.
2.4 Prompt Engineering
Prompt Engineering is a powerful technique that customizes the behavior of Large Lan-
guage Models (LLMs) by embedding carefully crafted instructions into the input prompts.
These instructions serve to align the model’s outputs with the specific needs and expecta-
tions of the user, providing a high degree of control over the generated responses. By defin-
ing parameters such as tone, format, and the required level of detail, Prompt Engineering
enables developers to guide the model toward producing outputs that are not only accurate
but also contextually appropriate and tailored to the task at hand [OpenAI, 2023b].
Carefully crafted prompts significantly enhance the accuracy and relevance of re-
sponses [White et al., 2023]. In the context of Contract Management, prompts can be
tailored to explicitly specify tasks such as retrieving penalty clauses or summarizing con-
tractual obligations, effectively directing the LLM to focus on the most pertinent sections
of the text. For example, a prompt like “Identify and summarize penalties related to late
delivery in this contract” provides clear and concise guidance, ensuring the model produces
outputs aligned with user expectations. By embedding contextual details and precise in-
structions, well-designed prompts not only reduce ambiguity but also enhance the LLM’s
ability to deliver precise, task-specific information, making them invaluable in domains
requiring high accuracy and contextual awareness [Giray, 2023].
Prompt Engineering also mitigates ambiguity and reduces factual hallucinations,
which are common challenges when working with Large Language Models (LLMs). By
carefully designing prompts to restrict responses to specific, reliable data sources, this
technique ensures that the LLM’s outputs are both relevant and grounded in verifiable
information [Wang et al., 2023]. For instance, prompts can be tailored to direct the model
to retrieve data exclusively from trusted repositories or databases, explicitly instructing
it to disregard unsupported prior knowledge. This level of control helps prevent the gen-
eration of plausible-sounding but inaccurate responses, a phenomenon often referred to as
hallucination.
When integrated with advanced retrieval techniques like Retrieval-Augmented Gener-
ation (RAG) and Text-to-SQL, Prompt Engineering amplifies the capabilities of multi-
source systems. For example, in the context of RAG, prompts can instruct the model to
focus on the most relevant information retrieved from a vectorstore, ensuring that the
contextual input aligns closely with the query’s intent. Similarly, in Text-to-SQL systems,
prompts can provide explicit instructions on how to interpret user queries, map them
to database schemas, and prioritize certain fields or relationships for retrieval. Studies
such as [Jeong, 2023] and [Gao et al., 2023a] demonstrate that the integration of Prompt
Engineering with these techniques not only enhances the relevance and precision of re-
sponses but also streamlines the interaction between unstructured and structured data
sources. Moreover, prompts can introduce dynamic contextualization, allowing systems to
adapt instructions in real time based on the query’s requirements, user intent, or the type
of data being accessed. This synergy makes Prompt Engineering a cornerstone of mod-
ern multi-source question-answering frameworks, addressing limitations inherent in LLMs
while improving reliability and user trust in their outputs.
2.5 Agents
Agents serve as the backbone of dynamic workflows in Q&A systems, enabling intelligent
query routing and efficient resource utilization. By dynamically directing queries to the
most appropriate retrieval paths, agents ensure that each request is processed using the
method best suited to its nature and context. This adaptability is essential for multi-source
systems, where queries may require access to diverse data types, including unstructured
text, structured databases, or even real-time APIs. In our framework, specialized agents
such as Router Agents, RAG Agents, and SQL Agents work in tandem to manage these
complexities and provide seamless query handling [Lewis et al., 2020].
Router Agents act as the central decision-makers within the system. When a query
is received, the Router Agent analyzes its structure and intent to determine the optimal
processing strategy. This analysis may involve applying predefined rules, natural language
understanding techniques, or pattern recognition to identify key indicators that guide
query routing. For instance, a query asking for specific numerical data might be routed to
an SQL Agent, while one seeking a contextual explanation might be directed to a RAG
Agent.
RAG Agents specialize in handling queries that involve retrieving unstructured in-
formation. They utilize Retrieval-Augmented Generation (RAG) to fetch relevant text
fragments from vectorstores or document repositories, integrating this data into the con-
text provided to the language model for response generation. This allows the system to
deliver nuanced answers that incorporate insights from external text sources. SQL Agents,
on the other hand, are designed for interacting with structured databases. By leveraging
Text-to-SQL techniques, these agents translate natural language queries into SQL com-
mands, enabling precise retrieval of structured data. This capability is particularly useful
for fact-based queries requiring exact matches, such as financial metrics, inventory details,
or contract deadlines.
The orchestration of these specialized agents ensures that queries are routed and pro-
cessed efficiently, maximizing the relevance and accuracy of the responses. Moreover, this
architecture is inherently scalable and modular, allowing new agents to be integrated as
needed to support additional data types or advanced processing techniques. By coordinat-
ing these agents within a unified framework, the system achieves a high degree of flexibility,
adaptability, and performance, meeting the demands of complex, multi-source Q&A sce-
narios. This agent-based approach not only enhances query handling but also paves the
way for future innovations in intelligent workflow management and data integration.
Agent frameworks, such as Langchain [Langchain, 2024] and Crew AI [CrewAI, 2024], rep-
resent significant advancements in agent-based architectures, offering enhanced capabili-
ties for orchestrating multi-agent workflows in dynamic environments. Crew AI provides
tools for designing, managing, and monitoring specialized agents, ensuring efficient task
routing and execution. By integrating cutting-edge frameworks like Crew AI, agent-based
systems can achieve greater flexibility, scalability, and robustness, especially in complex
multi-source environments. These innovations further underscore the potential of agents
to dynamically adapt to evolving user requirements and domain-specific challenges.
Agents enable dynamic decision-making and modular scalability, improving the rel-
evance and accuracy of responses. By integrating structured and unstructured data re-
trieval, they provide a robust foundation for multi-source Q&A systems [Jin et al., 2024].
The agent-based architecture also allows for adding new capabilities, enhancing adaptabil-
ity across domains.
3 Our Methodology
In designing our multi-source question-answer methodology, we employ a combination
of advanced techniques to seamlessly access diverse data sources and deliver accurate,
contextually relevant responses tailored to the specific query and information source. By
integrating Retrieval-Augmented Generation (RAG), Text-to-SQL, Dynamic Prompt En-
gineering, and Agent-based orchestration, the system effectively manages the complexities
inherent in interacting with both structured and unstructured data sources. Each com-
ponent plays a critical role in addressing different aspects of the information retrieval
process, ensuring the system’s ability to dynamically adapt to the unique requirements of
each query.
RAG is employed to handle unstructured data, such as text documents or knowledge
repositories, by retrieving the most relevant segments of information and incorporating
them into the model’s context for generating precise and well-informed responses. Text-
to-SQL complements this by enabling the system to interpret natural language queries
and translate them into executable SQL commands, allowing precise access to structured
data stored in relational databases. Together, these techniques bridge the gap between
different data modalities, ensuring comprehensive coverage of query requirements.
Dynamic Prompt Engineering serves as the interface between the user’s intent and
the model’s capabilities, guiding the system to focus on relevant aspects of the data and
format responses in a way that aligns with the query’s context. By embedding explicit
instructions and contextual cues into the prompts, the system ensures relevance, accuracy,
and clarity in the generated outputs.
Finally, Agent-based orchestration underpins the entire methodology, acting as the sys-
tem’s decision-making and coordination layer. Specialized agents, such as Router Agents,
RAG Agents, and SQL Agents, dynamically analyze and route queries to the most suitable
processing path based on their nature and complexity. This agent-based architecture not
only streamlines the workflow but also allows the system to scale and evolve by integrating
additional agents for new data types or advanced functionalities.
By harmonizing these components into a unified framework, our methodology effec-
tively addresses the challenges of multi-source question answering, delivering robust per-
formance and adaptability across diverse domains and data ecosystems. This approach
ensures that the system can provide timely, accurate, and context-aware responses regard-
less of the complexity or heterogeneity of the underlying data sources.
RAG enables the retrieval of relevant information from large volumes of unstructured
text, while Text-to-SQL facilitates precise access to structured data within relational
databases. Dynamic Prompt Engineering customizes the query context, ensuring that re-
sponses are tailored to user intent, and Agent-based orchestration coordinates these tech-
niques, directing queries to the appropriate modules and managing workflows seamlessly.
In this section, we detail the approaches and challenges associated with implementing each
of these techniques, along with the strategies we used to optimize their integration.
Our methodology was implemented and validated in a real-world project named Con-
trato360 [Seabra et al., 2024a], a question-answer system tailored to meet the specific
demands of Contract Management. Contrato360 integrates a combination of advanced
techniques, including Retrieval-Augmented Generation (RAG), Text-to-SQL, Dynamic
Prompt Engineering, and Agent-based orchestration, to overcome the challenges of nav-
igating and extracting information from intricate contract documents and structured
databases. This system allows users to efficiently query critical contract-related data, such
as penalty clauses, deadlines, service level agreements, and other contractual obligations,
from diverse data sources. By leveraging these cutting-edge methods, Contrato360 ensures
precise, contextually relevant, and timely responses, addressing the complexity and crit-
icality of the contract management domain. This real-world deployment highlights the
effectiveness and practicality of our methodology in a field where accuracy, relevance, and
contextual comprehension are paramount for decision-making and operational efficiency.
According to [Seabra et al., 2024a], the first step when applying RAG involves (1) splitting
the textual content of the PDF documents into manageable chunks, which are then
(2) transformed into high-dimensional vectors (embeddings). An embedding captures the
semantic properties of the text and can have 1536 dimensions or more. These embeddings
are stored in a vectorstore (3), a database specialized in high-dimensional vectors that
allows efficient querying of vectors through their similarities, using a distance metric for
comparison (Manhattan, Euclidean, or cosine). Once the similarity metric is established,
the query is embedded in the same vector space (4); this allows a direct comparison
between the vectorized query and the vectors of the stored chunks, retrieving the most
similar chunks (5), which are then transparently integrated into the LLM context to
generate a prompt (6). The prompt is then composed of the question, the texts retrieved
from the vectorstore, the specific instructions and, optionally, the chat history, all sent to
the LLM, which generates the final response (7).
Chunking strategy One of the first decisions to be made when applying RAG is choosing
the best strategy to segment the document, that is, how to perform the chunking of the
PDF files. A common chunking strategy involves segmenting documents based on a specific
number of tokens, with an overlap between consecutive chunks. This is useful when dealing
with sequential texts, where it is important to maintain continuity of context between
the chunks.
There is a common type of document with well-defined sections; contracts are a prime
example. They have a standardized textual structure, organized into contractual sections.
Therefore, sections with the same numbering or in the same vicinity describe the same
contractual aspect, that is, they have similar semantics. For example, in the first section
of contract documents, we always find the object of the contract. In this scenario, we can
assume that the best chunking strategy is to separate the chunks by section of the docu-
ment. In this case, the overlap between the chunks occurs by section, since the questions
will be answered by information contained in the section itself or in previous or subsequent
sections. For the contract page in the example in Figure 2, we would have a chunk for
the section on the object of the contract, another chunk for the section on the term of
the contract, that is, a chunk for each clause of the contract and its surroundings. This
approach ensures that each snippet represents a semantic unit, making retrievals more
accurate and aligned with queries.
Using predefined sections as the boundaries for chunks enhances the relevance of re-
sponses within a single contract. However, this approach presents two main challenges: (1)
within a single document, when a term appears repeatedly, it can be difficult to identify
the specific chunk that answers a question; and (2) as the number of documents increases,
accurately selecting the correct document becomes more challenging for the
system. In the Contract Management domain, consider a scenario where the user asks,
“Who is the contract manager of contract number 123/2024?”. This query is intended to
retrieve the specific name of the contract manager for the given contract. However, the
term “contract manager” can appear in various clauses of the contract document, often
in sections that do not contain the name of the actual manager but refer to responsibili-
ties or general rules related to contract management. For instance, multiple clauses across
different sections of the contract might mention the term “contract manager” in contexts
like assigning responsibilities, explaining the duties of a manager, or defining roles in con-
tract supervision. Even though these clauses contain the term “contract manager,” they
do not answer the user's question, which is specifically asking for the name of the contract
manager for contract 123/2024.
Fig. 2. Chunking based on Contract's clauses

Due to the similarity between the query and these irrelevant sections, the Retrieval-
Augmented Generation (RAG) system may retrieve a chunk from one of these irrelevant
clauses that does not actually contain the required name. For example, instead of retrieving
the clause that explicitly names the contract manager, the system might retrieve a clause
that discusses the general duties of a contract manager. This happens because the chunk
embedding for a clause about the role or responsibilities of the manager may be semantically
similar to the query, even though it lacks the specific information requested. In this case,
the chunk retrieved is related to the term “contract manager” but does not include the
answer the user expects. As a result, the system could return an incorrect response, such as
a general description of the role of a contract manager, rather than identifying the actual
manager for contract 123/2024. This illustrates the challenge of relying solely on textual
similarity in chunk retrieval, as it can lead to the retrieval of information that is similar
to the query in wording but not relevant to the specific context of the user's question.
To mitigate this, additional filtering mechanisms, such as metadata checks or contract-
specific identifiers, are required to ensure that the system retrieves the most contextually
appropriate information from the correct contract section.
To overcome this issue, several strategies can be applied. One approach is to add
metadata to the chunks and, when accessing the vectorstore, use this metadata to filter
the information returned. This method improves the relevance of the retrieved texts by
narrowing the search to only those chunks that match specific metadata criteria. The
most relevant metadata attributes for the contracts are source, contract, and
clause. Here, source represents the name of the contract’s PDF file, contract refers to
the contract number, and clause indicates the section title. For instance, when querying,
“Who is the contract manager of contract 123/2024?”, the system first filters for chunks
that belong to contract number 123/2024 and clauses related to the contract manager.
Once these chunks are filtered, a similarity calculation is applied to identify the most
relevant text segments, which are then sent to the LLM to generate the final response.
Vectorstore The need to store and query high-dimensional vectors efficiently has led
to the development of specialized vector databases, also known as vectorstores. These
databases allow for the storage and retrieval of vector embeddings, making it possible
to perform similarity searches - a key operation in tasks such as Retrieval-Augmented
Generation (RAG) and semantic search. Unlike traditional databases that are optimized
for structured, tabular data, vector databases are designed to handle embeddings generated
by models like text-davinci-002, which represent semantic relationships in high-dimensional
space.
When choosing the right vector database for a project, several factors come into play,
including scalability, ease of use, latency, and integration with machine learning models.
In our work, we evaluated three popular vector databases: Pinecone, Weaviate, and Chro-
maDB. Pinecone is a cloud-native vector database that excels in providing a fully managed
service for high-performance similarity search. Weaviate is an open-source vector database
that provides a highly flexible, schema-based approach to storing and querying vectors
alongside structured metadata. ChromaDB is an open-source, lightweight vector database
that focuses on simplicity and tight integration with machine learning workflows, making
it ideal for embedding-based retrieval tasks in research and smaller projects. Our choice
was the last one, especially because ChromaDB is easy to set up and integrate into a project
without requiring extensive configuration or overhead. Given that our system is heavily
Python-based, ChromaDB’s Python-first design allowed us to quickly embed it into our
machine learning pipelines. This streamlined our development process, enabling rapid it-
eration and testing, which was especially important in the early stages of system design.
Also, by using ChromaDB, we can directly connect our text-davinci-002 embeddings with
the vectorstore, enabling efficient similarity searches and accurate retrieval of contextually
relevant information.
Agents are central to the functionality and adaptability of our multi-source question-
answer system, enabling it to handle diverse query types efficiently. By leveraging spe-
cialized agents, the system dynamically routes each query to the most suitable processing
pathway, ensuring that user questions are handled with precision and contextual relevance.
In our architecture, the Router Agent serves as the primary decision-maker, evaluating
each incoming query and directing it to the appropriate agent based on predefined criteria.
The Router Agent uses regular expressions to identify keywords, patterns, or struc-
tures within the query. If the query is specific to a clause within a contract, the Router
Agent recognizes this pattern and assigns the query to the RAG Agent. The RAG Agent
is optimized for handling unstructured text data, retrieving relevant text chunks from
the vectorstore. By focusing on textual similarity, the RAG Agent retrieves semantically
aligned information and generates responses that incorporate precise, contextually relevant
excerpts from the documents, addressing the specifics of the user's question.
Conversely, if the Router Agent detects that the question involves broader contract
information, such as dates, financial details, or other exact values, it directs the query to the
SQL Agent. The SQL Agent translates the natural language question into a structured SQL
query, which is then executed against the database to retrieve exact data. This approach
is particularly effective for queries requiring precise, structured responses, ensuring that
the system provides accurate and up-to-date information directly from the database.
This dynamic agent-based architecture enables our system to handle both unstructured
and structured data seamlessly. The Router Agent’s decision-making process allows the
system to optimize query processing based on the context and specific needs of each query.
By directing contract-specific questions to the RAG Agent and structured data queries
to the SQL Agent, the Router Agent ensures that user questions are handled efficiently,
providing relevant answers whether they require interpretive text or exact data values.
This modular design not only improves response accuracy but also enhances the system’s
flexibility in adapting to a wide range of contract-related queries.
Combined with Dynamic Prompt Engineering, this design ensures that responses are
contextually accurate, actionable, and user-friendly. This comprehensive approach not
only enhances the versatility of the question-answer system but also improves user
experience by delivering tailored outputs that align with complex and varied query
requirements.
4 Evaluation
The architecture depicted in the figure represents the implementation of our multi-source
question-answer methodology, combining structured and unstructured data from con-
tracts. The system is built using a modular approach, where each component plays a
critical role in the data retrieval and response generation process. At the core of the archi-
tecture is the User Interface, built with Streamlit, as shown in Figure 6, which allows users
to input their queries and view responses. Users can submit both broad and specific
contract-related queries, which are then processed by the backend system.
The Backend Agents act as the decision-making layer of the system, handling queries
based on their type and content. These agents include the Router Agent, which determines
whether to route the query to the RAG Agent (for unstructured text retrieval) or the
SQL Agent (for structured data queries using Text-to-SQL). The agents communicate
bidirectionally with the user interface, allowing for interactive feedback during the query
resolution process.
For the unstructured data flow, contract documents in PDF format undergo processing
in the PDF Documents Processing component. This involves extracting text and metadata
from the documents, which is then passed to the Chunking and Metadata Generation
module. This module divides the documents into manageable chunks, enriching them with
metadata for easier retrieval. These chunks are further processed through the Embeddings
Generation component, where each chunk is transformed into a high-dimensional vector
representation using an embedding model. These embeddings are stored in the Vectorstore
(implemented using ChromaDB) for efficient similarity search during retrieval.
On the structured data side, the Contracts Database (implemented using SQLite)
stores relevant contract data such as specific terms, clauses, dates, and financial informa-
tion. When a query requires precise data retrieval, such as asking for contract values or
deadlines, the SQL Agent retrieves the necessary information directly from this database.
By integrating both the vectorstore and structured database, the Backend Agents can
provide comprehensive answers to user queries, dynamically choosing the most appropriate
data source based on the type of question. This hybrid approach ensures that the system
can handle both semantically complex queries and direct database queries, offering flexible
and accurate responses.
The system was evaluated through experiments conducted by specialists from BNDES
(Social and Economic Development Bank of Brazil), who validated its performance using
a set of 75 contracts. These contracts, including both PDFs and associated metadata,
were processed to assess the system’s ability to retrieve relevant information from both
unstructured documents and structured data. To evaluate the system’s effectiveness in
answering various query types, a set of benchmark questions was developed, divided into
two categories: direct and indirect questions.
Direct questions refer to those that could be answered using information directly avail-
able in the contract PDFs and their metadata. Examples include questions about contract
subjects, suppliers, managers, and contract terms. The results demonstrated that for these
direct questions, the system consistently provided complete and relevant responses, meet-
ing the users’ expectations for accuracy and comprehensiveness.
The system efficiently retrieves relevant information from diverse data sources to provide precise and
contextually accurate responses. The use of backend agents, particularly the Router Agent,
allowed for a flexible and adaptive workflow where queries are dynamically routed to the
appropriate processing module—whether that be the RAG agent for text-based retrieval
or the SQL agent for direct database queries.
Figure 6 demonstrates the ability of Contrato360 to retrieve and summarize contract
information related to Oracle through a question-and-answer interface. Our implementation,
which includes the use of ChromaDB as the vectorstore for storing document embeddings
and SQLite for managing contract data, ensures that the system can handle complex legal
documents while maintaining real-time performance in answering user queries. The
combination of these technologies enables the system to provide a seamless experience
where both structured and unstructured data are processed cohesively, offering a unified
approach to contract management and information retrieval.
Fig. 8. Contract Summarization

Despite the success of our approach, there remain several areas for future development.
One significant avenue for improvement is the further refinement of the Router Agent.
Currently, it relies on predefined regular expressions to route queries, but integrating
machine learning models to dynamically adapt and learn from query patterns could
increase the precision and flexibility of the system. Additionally, expanding the system's
capability to handle a wider variety of legal documents and domains, beyond contract
management, would provide greater scalability and versatility.
Another important direction for future work involves improving the system’s interac-
tion with graph-based data. We have already implemented a Graph Agent to visualize data
using bar graphs, but incorporating more advanced data visualizations, such as time-series
analysis or multi-dimensional comparisons, would provide users with deeper insights into
the retrieved data. Moreover, enhancing the chunking strategy for document segmentation
and metadata generation could mitigate the issue of misalignment between query intent
and retrieved text, especially for more complex and ambiguous legal queries.
Finally, while our current system integrates effectively with contract documents and
databases, there is potential to expand its multi-source retrieval capabilities by incorpo-
rating external data sources such as APIs, web services, or even real-time data streams.
This would provide users with even more comprehensive and up-to-date information.
Authors
João Nepomuceno received his Bachelor’s Degree in Physics from Universidade Fed-
eral Fluminense, Brazil, and he is currently pursuing his Bachelor’s Degree in Computer
Science at Universidade Federal Fluminense, Brazil. His research interests include Data
Engineering, Artificial Intelligence, and Natural Language Processing.
Lucas Lago is currently pursuing his Bachelor’s Degree in Computer Science at Uni-
versidade do Estado do Rio de Janeiro, Brazil. His research interests include Artificial
Intelligence and Natural Language Processing.