A Practical Guide to Vector Search and Storage with Vearch

Article link: https://blog.csdn.net/ppoojjj/article/details/149072689

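In AI and machine learning, as data volumes grow and performance requirements rise, storing vector data efficiently and retrieving it quickly has become essential. Vearch is a vector database designed for this purpose and is well suited to storing and searching data for large language model (LLM) applications. This article shows how to set up and use Vearch to work with vector data.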

Technical Background

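Vearch is a vector database for LLM data: it stores the vectors produced by a model's embedding step and retrieves them quickly. It works with a range of language models, including OpenAI, Llama, and ChatGLM, and it integrates with the LangChain library, which improves compatibility and ease of use. Vearch itself is developed in C and Go and provides a Python interface, which simplifies use for developers.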

Core Principles

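Text is first converted into vectors by an embedding model, and the vectors are stored in Vearch. At query time, the query is embedded the same way, and Vearch uses the index it has built to compute the similarity between the query vector and the stored vectors and return the closest matches quickly. This design gives good performance and accuracy, which makes Vearch suitable for applications that need real-time responses.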
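To make the similarity step concrete, here is a minimal sketch of a brute-force top-k search in plain Python. It is an illustration only, not how Vearch works internally: Vearch relies on a prebuilt index rather than scanning every vector, and the metric it uses depends on its configuration. Cosine similarity, the function names `cosine_similarity` and `top_k`, and the toy 3-dimensional vectors are all assumptions made for this example.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # convention chosen here for zero vectors
    return dot / (norm_a * norm_b)


def top_k(query_vec, stored, k=3):
    """Return the k (doc_id, score) pairs most similar to query_vec.

    `stored` maps a document id to its embedding vector.
    Every stored vector is scored, so the cost grows linearly with the store size.
    """
    scored = [
        (doc_id, cosine_similarity(query_vec, vec))
        for doc_id, vec in stored.items()
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Toy 3-dimensional "embeddings"; real models produce far more dimensions.
toy_store = {
    "doc_a": [1.0, 0.0, 0.0],
    "doc_b": [0.9, 0.1, 0.0],
    "doc_c": [0.0, 1.0, 0.0],
}
results = top_k([1.0, 0.0, 0.0], toy_store, k=2)
```

An index-based store avoids the full scan shown here, which is what keeps lookups fast as the collection grows.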

Code Walkthrough

The following is a complete code example showing how to set up Vearch in a local environment to process document data.

from langchain_community.document_loaders import TextLoader
from langchain_community.vectorstores.vearch import Vearch
from langchain_huggingface import HuggingFaceEmbeddings
from langchain_text_splitters import RecursiveCharacterTextSplitter
from transformers import AutoModel, AutoTokenizer

# Replace with the path to your local model
model_path = "/data/local_model/chatglm2-6b"

# Load the tokenizer and model in half precision on GPU 0 (requires a CUDA device)
tokenizer = AutoTokenizer.from_pretrained(model_path, trust_remote_code=True)
model = AutoModel.from_pretrained(model_path, trust_remote_code=True).half().cuda(0)

# Load the local text file
file_path = "/data/local_data/lingboweibu.txt"
loader = TextLoader(file_path, encoding="utf-8")
documents = loader.load()

# Split the text into chunks of up to 500 characters with a 100-character overlap
text_splitter = RecursiveCharacterTextSplitter(chunk_size=500, chunk_overlap=100)
texts = text_splitter.split_documents(documents)

# Load the embedding model from a local path
embedding_path = "/data/local_model/text2vec-large-chinese"
embeddings = HuggingFaceEmbeddings(model_name=embedding_path)

# Embed the chunks and add them to a Vearch vector store (flag=0 selects standalone, i.e. local, mode)
vearch_standalone = Vearch.from_documents(
    texts,
    embeddings,
    path_or_url="/data/vearch_db/localdb",
    table_name="localdb_table",
    flag=0,
)

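# Retrieve the 3 chunks most similar to the query. The query is kept in Chinese to match
# the source text; it asks: "Do you know Lingbo Weibu, and who is able to use it?"
query = "你知道凌波微步吗，你知道都有谁会凌波微步?"
vearch_standalone_res = vearch_standalone.similarity_search(query, 3)
for idx, tmp in enumerate(vearch_standalone_res):
    print(f"{'#'*20} Relevant document {idx+1} {'#'*20}\n\n{tmp.page_content}\n")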

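# Combine the query with the retrieved local knowledge into one prompt.
# The Chinese prompt asks the model to answer the user's question as accurately as possible from the background information.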
context = "".join([tmp.page_content for tmp in vearch_standalone_res])
new_query = f"基于以下信息，尽可能准确地来回答用户的问题。背景信息:\n {context} \n 回答用户这个问题:{query}\n\n"
response, history = model.chat(tokenizer, new_query, history=[])
print(f"********ChatGLM:{response}\n")
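The example above concatenates the retrieved chunks with no separator and no length limit. As one possible refinement, the sketch below joins the chunks with blank lines and stops adding them once a character budget is reached. The helper name `build_prompt`, the `max_chars` parameter, its default of 2000, and the English prompt wording are all hypothetical choices made for illustration; they are not part of Vearch, LangChain, or ChatGLM.

```python
def build_prompt(chunks, query, max_chars=2000):
    """Join retrieved chunks into a context block and wrap it with the question.

    Chunks are added in ranked order until max_chars would be exceeded.
    The best-ranked non-empty chunk is always kept (truncated if necessary).
    """
    kept = []
    used = 0
    for chunk in chunks:
        text = chunk.strip()
        if not text:
            continue  # skip empty or whitespace-only chunks
        if kept and used + len(text) > max_chars:
            break  # budget reached; drop lower-ranked chunks
        text = text[:max_chars]
        kept.append(text)
        used += len(text)
    context = "\n\n".join(kept)
    return (
        "Based on the information below, answer the user's question "
        "as accurately as possible.\n"
        f"Background:\n{context}\n\n"
        f"Question: {query}\n"
    )


prompt = build_prompt(["first chunk", "  ", "second chunk"], "Who knows this technique?")
```

With the variables from the example above, a call could look like `build_prompt([d.page_content for d in vearch_standalone_res], query)`, and the result would be passed to `model.chat` in place of `new_query`.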