Langchain4J using Redis
I've developed a proof-of-concept that leverages Langchain4J for semantic search in combination with Redis as the vector database. Below is a concise guide outlining how to set this up, complete with Java code examples and steps to configure a cloud-based Redis instance.
Redis setup
I've employed a free-tier cloud Redis instance (v6.2.6) that comes with the "Search and Query" modules pre-installed. Connection details such as the URL, port, and password are readily available, and the default user for the instance is named "default."
In under a minute, you'll have a fully operational Redis instance, ready for you to start tinkering with.
👉🏼 https://app.redislabs.com
Langchain4J setup
I'm using version 0.23.0 of Langchain4J, which you'll need to include in your Maven POM file or Gradle build script to get started.
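For reference, here's what the dependencies could look like in a Maven POM (a sketch; the langchain4j-redis module carries the Redis embedding store, and the version should match whatever you're on):

```xml
<!-- Core Langchain4J API -->
<dependency>
    <groupId>dev.langchain4j</groupId>
    <artifactId>langchain4j</artifactId>
    <version>0.23.0</version>
</dependency>

<!-- Redis embedding store integration -->
<dependency>
    <groupId>dev.langchain4j</groupId>
    <artifactId>langchain4j-redis</artifactId>
    <version>0.23.0</version>
</dependency>
```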
The Devoxx talks
For individual Devoxx and VoxxedDays CFP instances, I'm using an in-memory embedding store to search the data. However, there's also an aggregated instance where all the talks are consolidated. For this unified instance, I wanted to use a persistent vector database, so I opted to experiment with Redis.
This Devoxx Belgium talk by Alexander Chatzizacharias discusses other vector database options.
The Redis Embedding Store
Langchain4J comes with a built-in RedisEmbeddingStore that you can easily configure to suit your needs. Here's a sample configuration guide to get you started. By following these steps, you'll link Langchain4J with your Redis instance, ensuring that your embeddings are persistently stored.
The host, user, password, and port settings are straightforward, serving to establish the connection to the Redis instance. The embedding dimension is set to 384, which specifies the number of features in each embedding vector and must match the output size of the embedding model you use.
I did encounter some initial hiccups with the metadata fields not mapping correctly. After some digging, I discovered that they need to be supplied as a list in the builder. I'm using this feature to keep track of extra metadata fields such as talk_id and talk_title, so the semantic search results can display these details alongside the similarity score or distance.
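A minimal configuration sketch, assuming the 0.23.x builder API (the metadataFieldsName method name is my recollection of that era's API, so verify it against your version; all connection values are placeholders):

```java
import dev.langchain4j.store.embedding.redis.RedisEmbeddingStore;

import java.util.List;

// Connects Langchain4J to the cloud Redis instance; values come from the Redis Cloud console.
RedisEmbeddingStore embeddingStore = RedisEmbeddingStore.builder()
        .host("redis-12345.c1.eu-west-1.ec2.cloud.redislabs.com") // placeholder host
        .port(12345)
        .user("default")
        .password("<your-password>")
        .dimension(384)                                        // must match the embedding model's output
        .metadataFieldsName(List.of("talk_id", "talk_title"))  // metadata fields supplied as a list
        .build();
```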
Embedding The Content
To embed your content, you'll need to choose an embedding model. You're not limited to Langchain4J's built-in (mini) models; you can also opt for hosted alternatives like OpenAI's embedding models, or even custom-trained ones.
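Purely as an assumption on my part that matches the 384-dimension store configuration above, here's how the bundled all-MiniLM-L6-v2 model (a separate dev.langchain4j:langchain4j-embeddings-all-minilm-l6-v2 artifact) can be instantiated to run in-process:

```java
import dev.langchain4j.model.embedding.AllMiniLmL6V2EmbeddingModel;
import dev.langchain4j.model.embedding.EmbeddingModel;

// Runs locally in the JVM, no API key required; produces 384-dimensional vectors.
EmbeddingModel embeddingModel = new AllMiniLmL6V2EmbeddingModel();
```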
To get your content, you'll first need to fetch the talks, which in my case was done via a REST API call.
Here's a skeleton Java method to convert a talk object into a TextSegment, which can then be fed into the language model for embedding:
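The Talk type and its getters are placeholders for whatever your REST call returns:

```java
import dev.langchain4j.data.document.Metadata;
import dev.langchain4j.data.segment.TextSegment;

// Talk is a placeholder for the object deserialized from the CFP REST API.
TextSegment toTextSegment(Talk talk) {
    Metadata metadata = new Metadata();
    metadata.add("talk_id", String.valueOf(talk.getId()));
    metadata.add("talk_title", talk.getTitle());

    // The title plus description become the text that actually gets embedded.
    return TextSegment.from(talk.getTitle() + "\n" + talk.getDescription(), metadata);
}
```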
Once you have a way to convert talks into TextSegments, you're all set to work some vector embedding magic. You can stream through your list of talks, transform each one into a TextSegment, and then feed each segment to your chosen embedding model.
Here's a simple Java code snippet illustrating how you might do this:
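(A sketch: it reuses the hypothetical toTextSegment(...) helper from above and assumes the 0.23.x API, where embed(...) returns a Response wrapper.)

```java
import dev.langchain4j.data.embedding.Embedding;
import dev.langchain4j.data.segment.TextSegment;

// talks is the list previously fetched via the REST API.
talks.stream()
        .map(talk -> toTextSegment(talk))
        .forEach(segment -> {
            Embedding embedding = embeddingModel.embed(segment).content(); // Response<Embedding>
            embeddingStore.add(embedding, segment); // persists vector, text and metadata in Redis
        });
```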
Awesome, you've successfully populated your Redis instance with vector embeddings along with their associated metadata. You're now all set for doing some semantic search.
A truncated extract from the Redis index would include the vectors and the metadata fields (talk_id and talk_title), making it both machine-readable for vector operations and human-readable for understanding the context.
The actual Semantic Search
With Redis now populated, we're ready to execute some semantic search queries.
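A sketch of such a query, assuming the same model and store as above and the 0.23.x findRelevant(...) API:

```java
import dev.langchain4j.data.embedding.Embedding;
import dev.langchain4j.data.segment.TextSegment;
import dev.langchain4j.store.embedding.EmbeddingMatch;

import java.util.List;

// Embed the query with the same model used for the talks, then ask Redis for the nearest vectors.
Embedding queryEmbedding = embeddingModel.embed("emotions").content();
List<EmbeddingMatch<TextSegment>> matches = embeddingStore.findRelevant(queryEmbedding, 5);

for (EmbeddingMatch<TextSegment> match : matches) {
    System.out.println(match.score() + " -> " + match.embedded().metadata().get("talk_title"));
}
```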
For example, when I search on "emotions", I get the following result:
The Converter maps the found metadata fields to "id" and "title" and includes the score.
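A minimal sketch of what that mapping could look like; SearchResult is a hypothetical record, not the project's actual converter:

```java
import dev.langchain4j.data.segment.TextSegment;
import dev.langchain4j.store.embedding.EmbeddingMatch;

// Hypothetical result type: exposes the Redis metadata fields plus the similarity score.
record SearchResult(String id, String title, double score) {

    static SearchResult from(EmbeddingMatch<TextSegment> match) {
        return new SearchResult(
                match.embedded().metadata().get("talk_id"),
                match.embedded().metadata().get("talk_title"),
                match.score());
    }
}
```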
This is just too easy 😇
-Stephan
BTW, you might also want to check out this very interesting Langchain4J overview talk by Lize Raes from Devoxx Belgium 2023.
You can of course also run Redis locally using Docker:
- Execute "docker pull redis/redis-stack:latest"
- Execute "docker run -d -p 6379:6379 -p 8001:8001 redis/redis-stack:latest"
- Wait until Redis is ready to serve (may take a few minutes)