Indexing and Retrieval (Updated) - Part 1
Instructors
Prashant Sahu
Manager - Data Science, Analytics Vidhya
Ravi Theja
Developer Advocate Engineer, LlamaIndex
Recap of RAG Framework
Indexing
What is an Index?
Structured dataset enabling fast and efficient information retrieval.
Allows quick location and retrieval of relevant information.
Improves the generation process by giving the model efficient access
to a large corpus.
Why do we need an Index in a RAG system?
Efficiency: Enables quick retrieval by organizing data for
efficient searches.
Scalability: Handles large-scale datasets, ensuring timely
retrieval operations.
Accuracy: Improves retrieval precision, enhancing response
quality.
Real-time Performance: Supports fast, real-time responses for
applications like chatbots and virtual assistants.
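To make the efficiency point concrete, here is a minimal sketch of one simple kind of index, an inverted index, written in plain Python. It is an illustration only, not the data structure used by any particular RAG library; the names build_inverted_index and lookup are hypothetical.

```python
from collections import defaultdict


def build_inverted_index(docs):
    """Map each lowercase token to the set of document ids containing it."""
    index = defaultdict(set)
    for doc_id, text in enumerate(docs):
        for token in text.lower().split():
            index[token].add(doc_id)
    return index


def lookup(index, query):
    """Return the sorted ids of documents that contain every query token."""
    tokens = query.lower().split()
    if not tokens:
        return []
    # Start from the first token's postings and intersect with the rest,
    # so only documents matching all tokens survive.
    result = set(index.get(tokens[0], set()))
    for token in tokens[1:]:
        result &= index.get(token, set())
    return sorted(result)


docs = [
    "an index enables fast retrieval",
    "chatbots need real-time responses",
    "fast retrieval improves response quality",
]
index = build_inverted_index(docs)
print(lookup(index, "fast retrieval"))  # prints [0, 2]
```

The index is built once; each query then touches only the entries for its own tokens instead of scanning every document.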
Index Types
Retrieval
Types of Retrievers
The retriever to use depends on the specific indexing technique.
Retrieval Modes for Different Indexes
Summary Index
Default: SummaryIndexRetriever
Embedding: SummaryIndexEmbeddingRetriever
llm: SummaryIndexLLMRetriever
Document Summary Index
llm: DocumentSummaryIndexLLMRetriever
Embedding: DocumentSummaryIndexEmbeddingRetriever
Tree Index
Select_leaf: TreeSelectLeafRetriever
Select_leaf_embedding: TreeSelectLeafEmbeddingRetriever
All_leaf: TreeAllLeafRetriever
Root: TreeRootRetriever
Keyword Table Index
Keyword: KGTableRetriever
Embedding: KGTableRetriever
Hybrid: KGTableRetriever
1. Summary Index
Retrieval Modes for Summary Index
SummaryIndexRetriever: The default mode. Returns all nodes in the index in
their stored order, without filtering or ranking.
SummaryIndexEmbeddingRetriever: Uses embeddings to run a semantic
similarity search and returns the top-k most similar nodes.
SummaryIndexLLMRetriever: Uses a language model to judge which nodes are
relevant to the query and returns the selected nodes.
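The difference between the three modes can be sketched in plain Python. This is a toy illustration, not LlamaIndex code: ToySummaryIndex, toy_embed, and the choose callable are hypothetical stand-ins, with bag-of-words counts in place of a real embedding model and a simple function in place of a real LLM call. It assumes the default mode returns every node in stored order.

```python
import math
from collections import Counter


def toy_embed(text):
    """Hypothetical stand-in for an embedding model: bag-of-words counts."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class ToySummaryIndex:
    """Stores nodes as a plain list, in insertion order."""

    def __init__(self, nodes):
        self.nodes = list(nodes)

    def retrieve_default(self, query):
        # Default mode: return every node, ignoring the query.
        return list(self.nodes)

    def retrieve_embedding(self, query, top_k=1):
        # Embedding mode: rank nodes by similarity to the query, keep top_k.
        q = toy_embed(query)
        ranked = sorted(
            self.nodes, key=lambda n: cosine(q, toy_embed(n)), reverse=True
        )
        return ranked[:top_k]

    def retrieve_llm(self, query, choose):
        # LLM mode: `choose` (a stand-in for an LLM call) returns the
        # positions of the nodes it considers relevant.
        return [self.nodes[i] for i in choose(query, self.nodes)]


nodes = [
    "paris is the capital of france",
    "the eiffel tower is in paris",
    "python is a programming language",
]
index = ToySummaryIndex(nodes)


def fake_llm(query, candidates):
    """Hypothetical LLM stub: picks nodes that mention 'paris'."""
    return [i for i, n in enumerate(candidates) if "paris" in n]


print(index.retrieve_default("anything"))
print(index.retrieve_embedding("programming language python", top_k=1))
print(index.retrieve_llm("tell me about paris", fake_llm))
```

The default mode hands every node to the next stage, while the embedding and LLM modes narrow the set before generation.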
1A. Summary Index Retriever
1B. Summary Index Embedding Retriever
1C. Summary Index LLM Retriever
2. Vector Store Index
Figure: document split into Chunk 1, Chunk 2, Chunk 3
Vector Store Index Retriever
Vector Store Index vs Summary Index
SummaryIndex stores all the nodes as a sequence (list) in storage,
unlike the Vector Store Index.
In a SummaryIndex, embeddings are created at query time rather than
during index construction.
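The timing difference can be illustrated with a small plain-Python sketch that counts embedding calls. Everything here is a hypothetical stand-in (CountingEmbedder, ToyVectorStoreIndex, ToySummaryIndex), with bag-of-words counts in place of a real embedding model; real libraries may cache embeddings, which this sketch deliberately does not.

```python
import math
from collections import Counter


class CountingEmbedder:
    """Hypothetical embedding model (bag-of-words) that counts its calls."""

    def __init__(self):
        self.calls = 0

    def embed(self, text):
        self.calls += 1
        return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class ToyVectorStoreIndex:
    """Embeds every chunk once, when the index is built."""

    def __init__(self, chunks, embedder):
        self.embedder = embedder
        self.entries = [(c, embedder.embed(c)) for c in chunks]

    def retrieve(self, query, top_k=1):
        q = self.embedder.embed(query)  # only the query is embedded here
        ranked = sorted(self.entries, key=lambda e: cosine(q, e[1]), reverse=True)
        return [chunk for chunk, _ in ranked[:top_k]]


class ToySummaryIndex:
    """Stores chunks as a plain list; embeds them at query time."""

    def __init__(self, chunks, embedder):
        self.embedder = embedder
        self.chunks = list(chunks)  # no embedding work at build time

    def retrieve(self, query, top_k=1):
        q = self.embedder.embed(query)
        ranked = sorted(
            self.chunks,
            key=lambda c: cosine(q, self.embedder.embed(c)),
            reverse=True,
        )
        return ranked[:top_k]


chunks = ["chunk about indexing", "chunk about retrieval", "chunk about llms"]

vec_embedder = CountingEmbedder()
vec_index = ToyVectorStoreIndex(chunks, vec_embedder)
vec_build_calls = vec_embedder.calls          # 3: one per chunk at build time
vec_result = vec_index.retrieve("retrieval")
vec_query_calls = vec_embedder.calls - vec_build_calls  # 1: the query only

sum_embedder = CountingEmbedder()
sum_index = ToySummaryIndex(chunks, sum_embedder)
sum_build_calls = sum_embedder.calls          # 0: nothing embedded yet
sum_result = sum_index.retrieve("retrieval")
sum_query_calls = sum_embedder.calls - sum_build_calls  # 4: query + 3 chunks

print(vec_build_calls, vec_query_calls, sum_build_calls, sum_query_calls)
```

Both toy indexes return the same chunk for the same query; what differs is when the embedding cost is paid.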
Thank You