Building Vector Databases With FastAPI and ChromaDB - by Om Kamath - May, 2024 - Level Up Coding

6/24/24, 6:08 PM Building Vector Databases with FastAPI and ChromaDB | by Om Kamath | May, 2024 | Level Up Coding

Building Vector Databases with FastAPI and ChromaDB
Beginner’s guide to ChromaDB vectorstores and FastAPI with Langchain

Om Kamath
Published in Level Up Coding · 10 min read · May 7, 2024

https://levelup.gitconnected.com/building-vector-databases-with-fastapi-and-chromadb-0a1cd96fab08

Returning to writing after a lengthy break, I’m finally carving out some time
to dive into it. With the number of innovations and new AI tech popping up, I
decided to take a step back and explore some of the fundamentals. After
getting to work with APIs at my job, and with Python being my language of
choice, I felt that it was a good time to explore some API development
frameworks for Python.

While studying for my final exams, as usual I got distracted by YouTube
recommendations and came across a video by Travis Media about FastAPI.
Without hesitation, I forgot about my studies and started exploring FastAPI
(yeah, a CS undergrad student in his senior year still hasn’t been taught how
to develop and deploy an app from scratch).

For all those who are starting out with backend development in Python,
there are mainly 3 frameworks available:

1. Flask

2. FastAPI

3. Django

Django is a full-stack web framework geared towards building a complete web
app from scratch, whereas Flask and FastAPI are micro web frameworks that are
ideal for developing smaller applications and web APIs.

Why did I choose FastAPI over Flask?


To be honest, I’m not entirely sure. I’ve come across information suggesting
that FastAPI offers a more API-development friendly environment, with
features that we’ll explore further in this article. Being someone who tends
to be indecisive, I felt it was better to simply choose one and dive in, rather
than endlessly deliberating over technologies and wasting time. So, without
further delay, let’s jump right into it.

What is FastAPI?

“FastAPI is a modern, fast (high-performance), web framework for building
APIs with Python based on standard Python type hints.” (From the FastAPI
documentation)

Some features of FastAPI are:

1. Fast

2. Intuitive

3. Easy to Code

4. Standards-based: Compatible with OpenAPI and JSON Schema.

5. Automatic Docs

6. Based on and compatible with Pydantic

“No brainfuck” — From the FastAPI documentation

Among the listed features, only two really grabbed my attention: its ease of
coding (perfect for my lazy tendencies) and its foundation built on top of
Pydantic.

Pydantic is a data-validation library that allows you to declare schemas using
classes and inheritance. Its main advantage lies in its built-in validation:
it checks incoming data against the declared types and makes sure the data
conforms to the schema.

A basic Pydantic example looks like:

from pydantic import BaseModel

# extending the BaseModel
class User(BaseModel):
    user: str
    age: int

# age is passed as a string on purpose
user = User(user="Om", age="21")
print(user)

Output:

user='Om' age=21

Even though we provided the model with age in string format, Pydantic
automatically typecasts it to an integer. This is one of the advantages of using
Pydantic over the built-in Python classes.
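To see the other side of this behaviour, here is a small sketch (not from the original code) of what happens when a value cannot be converted. It reuses the User model from above; the input "twenty-one" is just an illustrative value, and the exact wording of the error depends on the installed Pydantic version.

```python
from pydantic import BaseModel, ValidationError


class User(BaseModel):
    user: str
    age: int


# A numeric string is coerced to an int.
ok = User(user="Om", age="21")
print(type(ok.age).__name__)

# A non-numeric string cannot be coerced, so validation fails.
try:
    User(user="Om", age="twenty-one")
    failed = False
except ValidationError as exc:
    failed = True
    print(f"Validation failed with {len(exc.errors())} error(s)")
```

So the conversion only happens when it is unambiguous; anything else is rejected instead of silently slipping through.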

Setting up FastAPI
Setting up FastAPI is pretty simple and requires just a pip installation using
pip install fastapi

Once you have got FastAPI installed, you can test it out using this sample
code:

from fastapi import FastAPI

app = FastAPI()

@app.get("/")
async def root():
    return {"message": "Whatchamacallit"}

app = FastAPI() : This line creates an instance of the FastAPI class and
assigns it to the variable app . This instance represents your FastAPI
application.

@app.get("/") : This is Python’s decorator syntax, used here to define a
route for handling HTTP GET requests to the root URL ("/") of your API.
Decorators are a way to modify or extend the behaviour of functions or
methods. The @app.get decorator indicates that the following function
( root() ) will handle GET requests to the specified route. To simplify, you
can think of it as a way of mapping a specific URL endpoint to a Python
function that will handle requests made to that endpoint.

async def root(): : This line defines a function named root using the
async keyword, indicating that it is an asynchronous function. This
function will handle requests to the root URL ("/") of your API.
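The idea of a decorator mapping a URL to a function can be sketched in plain Python. The MiniApp class below is a hypothetical toy written for this illustration; it is not how FastAPI is implemented, and it ignores async handling entirely.

```python
class MiniApp:
    """Toy stand-in for a web app; hypothetical, not part of FastAPI."""

    def __init__(self):
        self.routes = {}  # maps (method, path) -> handler function

    def get(self, path):
        def register(func):
            self.routes[("GET", path)] = func
            return func  # the function itself is left unchanged
        return register

    def handle(self, method, path):
        handler = self.routes.get((method, path))
        if handler is None:
            return 404, {"detail": "Not Found"}
        return 200, handler()


app = MiniApp()


@app.get("/")
def root():
    return {"message": "Whatchamacallit"}


print(app.handle("GET", "/"))
print(app.handle("GET", "/missing"))
```

The decorator only records the function in a lookup table; the lookup at request time is what turns a URL into a function call.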

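To run the server (FastAPI uses uvicorn), save the file as main.py , since
the command fastapi dev main.py refers to it. This command should start a
localhost server that you can use for testing. If the fastapi command is not
found, note that newer FastAPI releases ship it with the standard extra,
installed using pip install "fastapi[standard]" .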

[Figure: Terminal]

[Figure: HTTP Client]

To test the code, we will use an HTTP client like Postman or HTTPie. You can
choose whichever you prefer. HTTPie’s minimalist approach appeals to me
more, but Postman is packed with more API-testing features. I won’t be diving
into either of them in this article.

Building a real-world API using FastAPI


Well, I don’t want to bore you with the same old tutorial of going over all the
features sequentially. Instead, we will be building something interesting yet
simple using Langchain, ChromaDB, and FastAPI. We will build an API that
creates and deletes a vector database and fetches relevant chunks from a
PDF document using semantic search.

To explain in short:

1. Langchain: An open-source framework that helps developers build
applications using large language models (LLMs). It ships many of the
commonly needed LLM tools as built-in functions for convenient development.

2. ChromaDB: An open-source vector database that stores the chunks together
with their embeddings.

[Figure: Semantic Search with Chunking]

Vector Databases
If you’re wondering about the purpose of vector databases, they’re incredibly
powerful and play a major role in many AI startups that have emerged in the
last two years. Vector databases are utilized to store embeddings, which are
vector representations of textual data that capture the meaning of the
content. This enables various operations such as fetching data without the
need for formal querying or keyword matches. Data from vector databases is
retrieved through similarity search, which mainly employs either of these
two techniques:

1. K-Nearest Neighbours: This involves calculating the distance between the
query vector and every stored vector using measures like Euclidean distance,
Manhattan distance, or cosine similarity, and keeping the k closest ones.

2. Approximate Nearest Neighbour Search: Instead of computing distances to
every vector in the database, we retrieve a “good guess” of the nearest
neighbours, trading a little accuracy for speed.
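The first technique is easy to sketch with the standard library alone. The snippet below is a minimal brute-force illustration: the three-dimensional vectors are made up for the example (real embedding models produce hundreds of dimensions), and the function names are my own.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors (1.0 means same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def k_nearest(query, vectors, k):
    """Return the k (label, score) pairs most similar to the query."""
    scored = [(label, cosine_similarity(query, vec)) for label, vec in vectors.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Made-up toy "embeddings" for three pieces of text.
vectors = {
    "cats": [0.9, 0.1, 0.0],
    "dogs": [0.8, 0.2, 0.1],
    "stocks": [0.0, 0.1, 0.9],
}

nearest = k_nearest([1.0, 0.0, 0.0], vectors, k=2)
print([label for label, _ in nearest])
```

Every query compares against every stored vector, which is exactly the cost that approximate nearest neighbour indexes try to avoid on large collections.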

To delve deeper into similarity search, I recommend reading this well-written
article by Rajat Tripathi.

The flow
1. Chunking the PDF Document using Langchain.

2. Generating word embeddings for the chunks using an open-source embedding
model.

3. Uploading word embeddings to the vector database.

4. Fetching the nearest neighbouring chunks to the user query using similarity
search.

5. Deleting the database.

6. Creating endpoints for the functions in FastAPI.

Chunking the PDF Document using Langchain


Create a new file named functions.py which will contain all the endpoint
methods we will be calling in the main app.

To chunk the PDF document, we will load the document using PyPDF.
Langchain offers multiple document loader methods, with PyPDF being one
of them.

from langchain_community.document_loaders import PyPDFLoader

loader = PyPDFLoader("files/samples.pdf")
pages = loader.load()

After loading the document, the next step is to chunk it. The primary
purpose of chunking documents is to break them into contextually relevant
segments that can be later fed into an LLM for Retrieval Augmented
Generation (RAG). Langchain provides a variety of text splitters to choose
from for chunking documents.

Types of text-splitters:

1. Recursive: Recursively splits text. Splitting text recursively serves the
purpose of trying to keep related pieces of text next to each other. This is
the recommended way to start splitting text.

2. HTML: Splits text based on HTML-specific characters.

3. Markdown: Splits text based on Markdown-specific characters.

4. Code: Splits text based on characters specific to coding languages. 15
different languages are available to choose from.

5. Token: Splits text on tokens. There exist a few different ways to measure
tokens.

6. Character: Splits text based on a user defined character. One of the
simpler methods.

For this application, to keep it simple, we’ll proceed with the
RecursiveCharacterTextSplitter() . It’s a simple and effective default for
basic PDF or text documents. As we become familiar with the chunking process,
we can experiment with different text splitters, but that’s a topic for a
separate article. The RecursiveCharacterTextSplitter() tries a list of
separators in order (by default paragraph breaks, then newlines, then spaces)
until the pieces fit, and the chunk size can be set as a parameter.

from langchain_text_splitters import RecursiveCharacterTextSplitter

text_splitter = RecursiveCharacterTextSplitter(
    chunk_size=1000,
    chunk_overlap=10,  # the source listing is cut off after "10"; the original value may be larger
    is_separator_regex=False,
)
chunks = text_splitter.split_documents(pages)

To understand the RecursiveCharacterTextSplitter() in detail, this is a really
good article that covers all the aspects.
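For intuition, the recursive idea can be sketched in a few lines of plain Python. This is a simplified illustration written for this article, not Langchain's implementation: it has no chunk overlap, drops the separators it could not keep, and the function name recursive_split is my own.

```python
def recursive_split(text, chunk_size, separators=("\n\n", "\n", " ", "")):
    """Split text into pieces of at most chunk_size characters."""
    if len(text) <= chunk_size:
        return [text] if text else []
    if not separators or separators[0] == "":
        # Last resort: cut at fixed positions.
        return [text[i:i + chunk_size] for i in range(0, len(text), chunk_size)]

    sep, rest = separators[0], separators[1:]
    chunks, current = [], ""
    for part in text.split(sep):
        candidate = part if not current else current + sep + part
        if len(candidate) <= chunk_size:
            current = candidate  # keep neighbouring pieces together
            continue
        if current:
            chunks.append(current)
        if len(part) <= chunk_size:
            current = part
        else:
            # This piece is still too big: retry with a finer separator.
            chunks.extend(recursive_split(part, chunk_size, rest))
            current = ""
    if current:
        chunks.append(current)
    return chunks


sample = "alpha beta gamma\n\ndelta epsilon"
pieces = recursive_split(sample, chunk_size=12)
print(pieces)
```

Coarse separators are tried first, so paragraphs stay whole whenever they fit, and only oversized pieces get broken down further.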

Generating word embeddings for the chunks using an open-source embedding model
Embedding models are trained on a large corpus of text, usually with
self-supervised or contrastive objectives rather than hand-labelled data,
which helps the model assign similar vectors to words or chunks with similar
meanings/context. This is the beauty of embeddings, as they enable us to
facilitate semantic search.

There are multiple embedding models available from OpenAI, Cohere, Google,
etc. Since we’re not building a production-grade application, open-source
models should suffice for our needs. Langchain provides a built-in class
called SentenceTransformerEmbeddings() , which allows us to use the
all-MiniLM-L6-v2 embedding model. This is a free open-source embedding model
from the Sentence-Transformers project (sbert.net).

from langchain_community.embeddings.sentence_transformer import (
    SentenceTransformerEmbeddings,
)

embedding_function = SentenceTransformerEmbeddings(model_name="all-MiniLM-L6-v2")

Uploading word embeddings to the vector database


Utilizing the Langchain ChromaDB library makes adding embeddings to the
vector database a breeze. Only one line of code is needed to accomplish this
task.

from langchain_chroma import Chroma

ids = [str(i) for i in range(1, len(chunks) + 1)]

# Store the chunks (one ID per chunk) and persist them to disk
Chroma.from_documents(chunks, embedding_function, persist_directory="chroma_db", ids=ids)

We will be assigning IDs to the chunks to avoid duplication while adding the
chunks to the database. persist_directory saves the vector database to the
working directory which can be later loaded up while querying.
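Why IDs help can be illustrated with a plain dictionary standing in for the vector store. This is only an analogy, under the assumption that the store overwrites an entry when an existing ID is written again; add_chunks and the two dictionaries are hypothetical names for this sketch, not Chroma APIs.

```python
def add_chunks(store, chunks, ids=None):
    """Write chunks into a dict-based stand-in store keyed by ID."""
    if ids is None:
        # Without explicit IDs, every call invents fresh keys.
        start = len(store)
        ids = [f"auto-{start + i}" for i in range(len(chunks))]
    for chunk_id, chunk in zip(ids, chunks):
        store[chunk_id] = chunk  # same ID -> overwrite instead of append
    return store


chunks = ["first chunk", "second chunk"]
ids = [str(i) for i in range(1, len(chunks) + 1)]

with_ids = {}
add_chunks(with_ids, chunks, ids)
add_chunks(with_ids, chunks, ids)  # running the import a second time

without_ids = {}
add_chunks(without_ids, chunks)
add_chunks(without_ids, chunks)

print(len(with_ids), len(without_ids))
```

With stable IDs, calling the creation step twice leaves the same number of entries; with generated IDs, each run adds another copy.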

Fetching the nearest neighbouring chunks to the user query using similarity search
The Langchain ChromaDB library includes built-in similarity search
functionality. At this point, just about everything is built into Langchain.
This allows us to focus on building the application. The similarity_search()
function embeds the query and retrieves its nearest neighbours from the
collection. As far as I can tell from the Chroma documentation, a collection
uses squared L2 (Euclidean) distance by default, and cosine distance has to be
selected when the collection is created. The k parameter controls the number
of neighbours to fetch.

# `query` is the request body received by the /neighbours/ endpoint
embedding_function = SentenceTransformerEmbeddings(model_name="all-MiniLM-L6-v2")
db = Chroma(persist_directory="chroma_db", embedding_function=embedding_function)
results = db.similarity_search(query.query, k=query.neighbours)

There’s also another function called similarity_search_with_score() that
retrieves the neighbours along with their scores. With Chroma these scores are
distances, so a lower score means a closer match. This can be useful if you’re
curious or if you want to perform additional reranking operations on the
chunks.

Deleting the database


I wasn’t able to find a specific function for deleting the entire database, so
for this tutorial, I decided to implement it by deleting the persisted directory
of the vector database using shutil . While not perfect, and although we
could make use of collections in the vector database, I wanted to keep this
tutorial simple, especially since it was my first time working with FastAPI
too.

import os
import shutil

if "chroma_db" in os.listdir():
    shutil.rmtree("chroma_db")
    print("Deleted database and its contents.")
else:
    raise FileNotFoundError("Database not found.")
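The same check-then-delete logic can be tried safely against a throwaway directory. The sketch below is an illustration rather than the code used in this tutorial: it uses pathlib instead of os.listdir(), and the helper name delete_dir and the temporary paths are assumptions made for the example.

```python
import shutil
import tempfile
from pathlib import Path


def delete_dir(path):
    """Remove a directory tree, raising if it does not exist."""
    target = Path(path)
    if not target.is_dir():
        raise FileNotFoundError(f"Database not found: {target}")
    shutil.rmtree(target)


# Build a fake persisted database inside a temporary folder.
workdir = Path(tempfile.mkdtemp())
fake_db = workdir / "chroma_db"
fake_db.mkdir()
(fake_db / "index.bin").write_bytes(b"\x00")

delete_dir(fake_db)
deleted = not fake_db.exists()

try:
    delete_dir(fake_db)  # second call: nothing left to delete
    second_call_failed = False
except FileNotFoundError:
    second_call_failed = True

shutil.rmtree(workdir)  # clean up the temporary folder
print(deleted, second_call_failed)
```

Checking the full path instead of the current working directory also means the function behaves the same no matter where the server was started from.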

Create endpoints for the functions in FastAPI


Before diving into the exciting part of creating the API, let’s refactor and
structure the code to create callable functions for the endpoints.

functions.py :

from langchain_community.document_loaders import PyPDFLoader
from langchain_text_splitters import RecursiveCharacterTextSplitter
from langchain_community.embeddings.sentence_transformer import (
    SentenceTransformerEmbeddings,
)
from langchain_chroma import Chroma
import warnings
import shutil
import os

warnings.filterwarnings('ignore')

# Creating the database
def create_db():
    # Load the PDF, one document per page
    loader = PyPDFLoader("files/samples.pdf")
    pages = loader.load()

    # Split the pages into chunks
    text_splitter = RecursiveCharacterTextSplitter(
        chunk_size=1000,
        chunk_overlap=10,  # the source listing is cut off after "10"; the original value may be larger
        is_separator_regex=False,
    )
    chunks = text_splitter.split_documents(pages)
    print(len(chunks))

    ids = [str(i) for i in range(1, len(chunks) + 1)]

    # Create the open-source embedding function
    embedding_function = SentenceTransformerEmbeddings(model_name="all-MiniLM-L6-v2")

    # Create the Chroma database with IDs (one ID per chunk)
    Chroma.from_documents(chunks, embedding_function, persist_directory="chroma_db", ids=ids)

# Deleting the database
def delete_persisted_db():
    if "chroma_db" in os.listdir():
        shutil.rmtree("chroma_db")
        print("Deleted database and its contents.")
    else:
        raise FileNotFoundError("Database not found.")

For mapping these functions to the endpoints, we will have to import the
functions from functions.py . After importing we can edit the main.py file we
created at the start of this article to include the new endpoints.

main.py :

from fastapi import FastAPI, HTTPException
from models import Query
from langchain_chroma import Chroma
from langchain_community.embeddings.sentence_transformer import (
    SentenceTransformerEmbeddings,
)
from functions import create_db, delete_persisted_db

app = FastAPI()

@app.get("/")
async def root():
    return {"message": "Whatchamacallit"}

# Create database
@app.get("/create/")
async def create_database():
    create_db()
    return {"message": "Database created."}

# Delete database
@app.delete("/delete/")
async def delete_database():
    try:
        delete_persisted_db()
        return {"message": "Database deleted."}
    except FileNotFoundError as e:
        # Translate the missing directory into an HTTP 404 response
        raise HTTPException(status_code=404, detail=str(e))

# Fetch Chunks
@app.post("/neighbours/")
async def fetch_item(query: Query):
    embedding_function = SentenceTransformerEmbeddings(model_name="all-MiniLM-L6-v2")
    db = Chroma(persist_directory="chroma_db", embedding_function=embedding_function)
    results = db.similarity_search(query.query, k=query.neighbours)
    return {"message": "Nearest neighbours found.", "results": results}

Another useful feature of FastAPI is its HTTPException class, which can be
imported directly from the FastAPI library and raised with whatever HTTP
status code and detail message suit your needs.

We will also need to define the Pydantic model for the /neighbours endpoint
query body.

models.py :

# Create a basic model for the FastAPI request body

from pydantic import BaseModel

class Query(BaseModel):
    query: str
    neighbours: int = 3
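To get a feel for what this model does with an incoming JSON body, it can be exercised on its own, without starting the server. This is a sketch under the assumption that the body arrives as a JSON string; the sample question is made up. Inside the app, FastAPI runs this validation for you and responds with a 422 error when the body does not match.

```python
import json

from pydantic import BaseModel, ValidationError


class Query(BaseModel):
    query: str
    neighbours: int = 3


# Body that omits "neighbours": the default of 3 is used.
default_query = Query(**json.loads('{"query": "What is a vector database?"}'))
print(default_query.neighbours)

# Body that sets "neighbours" explicitly.
custom_query = Query(**json.loads('{"query": "What is a vector database?", "neighbours": 5}'))
print(custom_query.neighbours)

# Body without the required "query" field is rejected.
try:
    Query(**json.loads('{"neighbours": 2}'))
    rejected = False
except ValidationError:
    rejected = True
print(rejected)
```

Giving neighbours a default keeps the request body short for the common case while still letting the caller ask for more results.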

Outputs

[Figure: For wrong method (POST instead of GET)]

[Figure: Database Creation Endpoint]

[Figure: Fetching the nearest neighbours]

[Figure: Automatic Documentation (redoc)]

Conclusion
That concludes this simple starter tutorial, which should provide both you
and me with an understanding of the fundamental steps involved in building
an app that utilizes an LLM at the backend, coupled with basic API
development. FastAPI offers many more powerful features for
authentication and even frontend development using FastUI, which I plan to
explore in my future blogs. My only gripe while working with Langchain is
the documentation, which can be quite complicated due to the abundance of
functionalities available. It’s both a blessing and a curse. If you’re just
starting with Langchain, it can feel quite intimidating, but remember to take
it one step at a time.

Exploring FastAPI and documenting the process has been quite enjoyable. I
hope this article has been both informative and entertaining for you as a
reader, just as it has been for me. If you have any suggestions or if you notice
any mistakes, please feel free to share your feedback in the comments
section.
