This work is subject to copyright. All rights are solely and exclusively
licensed by the Publisher, whether the whole or part of the material is
concerned, specifically the rights of translation, reprinting, reuse of
illustrations, recitation, broadcasting, reproduction on microfilms or in any
other physical way, and transmission or information storage and retrieval,
electronic adaptation, computer software, or by similar or dissimilar
methodology now known or hereafter developed.
The publisher, the authors and the editors are safe to assume that the advice
and information in this book are believed to be true and accurate at the date
of publication. Neither the publisher nor the authors or the editors give a
warranty, expressed or implied, with respect to the material contained
herein or for any errors or omissions that may have been made. The
publisher remains neutral with regard to jurisdictional claims in published
maps and institutional affiliations.
Shivakumar Gopalakrishnan
has over 25 years of experience in software
development, DevOps, SRE, and platform
engineering. He has worked in various
industries, from healthcare enterprises to
consumer-facing web-scale companies. He
founded a startup, was a key architect
within a Fortune 1000 company, and is
currently a Principal Architect at BD. He is
a coauthor of Hands-on Kubernetes on
Azure and the author of Kubernetes for Job
Seekers and Modern Python Programming
using ChatGPT.
Keerthi Bharath
is an accomplished thought leader in the AI
industry. He has experience running
numerous startups and leading AI projects
in MNCs. He is also an investor and
mentor to companies. He is an alumnus of
Syracuse University, New York, and
College of Engineering, Guindy, Chennai.
© The Author(s), under exclusive license to APress Media, LLC, part of Springer Nature 2024
R. Jay, Generative AI Apps with LangChain and Python
https://doi.org/10.1007/979-8-8688-0882-1_1
Welcome to the world of LangChain and LLMs, where you will learn how
to build generative AI applications using one of the most popular generative
AI application development frameworks, namely, LangChain. You will
learn how to tap into the vast knowledge of these highly capable large
language models, or LLMs, as we often call them. Together, we are going to
explore how powerful LLMs like GPT-4, PaLM, and Gemini can be
accessed with LangChain to develop some amazing, intelligent, and real-
world applications that feel almost human-like.
The power of LangChain lies in how easily it makes large language models (LLMs) accessible to us for building real-world applications. Whether you are a veteran coder or just starting out, you are going to find LangChain refreshingly easy to use. It is that ease of coding that got me hooked on LangChain. I hope you will be drawn to it as well once you start discovering how easy it is through the practical examples throughout the book. The beauty of it is that you don't even need to be a machine learning guru or data science expert to leverage its capabilities.
By the end of this chapter, I am confident you will master the essentials
of LangChain and start developing your own LLM-driven generative AI
applications.
I hope you find this text to be both comprehensive and hands-on. My
goal is for you to not only understand LangChain and LLM theory but also
to apply this knowledge practically to bring your generative AI projects to
life.
Understanding LangChain
LangChain is a powerful framework that will help you develop artificial
intelligence applications based on LLMs easily. Let us take a closer look.
The official definition of LangChain on the LangChain.com website is
like this:
LangChain is a framework for developing applications powered by
language models. It enables applications that:
Are context-aware: connect a language model to sources of context
(prompt instructions, few shot examples, content to ground its response
in, etc.)
Reason: rely on a language model to reason (about how to answer based
on provided context, what actions to take, etc.)
LangChain is essentially a digital toolbox that you can use to build
amazing, intelligent applications that can talk, understand, and even think
like a human to some extent.
Here are some benefits:
You can tap into the vast knowledge of advanced language models like
GPT-4, PaLM, Gemini, or even open source models such as LLaMA.
This opens up a world of possibilities for the types of applications you
can develop.
You can integrate these LLMs with your own specific, private data. This
means you can tailor the LLM’s output more closely to the unique needs
and contexts of your business or project.
And here is the good news. You are not limited to any specific LLM, and
you can mix and match different models as needed. This allows for a
level of customization in generative AI application development that can
truly drive innovation.
LangChain provides the tools you need irrespective of whether you are
looking to develop chatbots that enhance customer service, systems that
generate creative content, or solutions that automate repetitive tasks. It is
very inspiring to see the many ways we can apply this technology to solve
real-world problems and drive innovation across various industries.
Explanation
Below is an explanation for the code:
First, we install the required modules: OpenAI and LangChain. Please ensure that you install the exact versions used in this code; otherwise, the code may not work.
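As a sketch, the install step looks like the following in a notebook cell; the exact pinned versions used in the book's listing are not reproduced here, so substitute them as needed.

!pip install openai langchain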
Example
Consider you are developing an educational platform. While you could
construct this platform directly using LLM APIs, you would need deep
expertise in AI-driven educational best practices, model training, and
user interaction design.
LangChain Advantage The advantage with LangChain is that instead
of spending all your time and energy building the application from the
ground up, you can simply leverage the already existing templates,
contributed modules, and data connections. You can also choose to
develop it as an open source project, thus leveraging the collective
expertise and contributions of a world-class, global community of
developers and educators.
You can also tap into the best practices from the LangChain
ecosystem and collaborate with others to speed up the development of
your education platform. More importantly, you can leverage tools
specifically optimized for educational content, thus significantly
enhancing its features and personalization capabilities.
No Cost Barrier
LangChain is completely free to use for everyone, whether you are an individual developer, a startup, or a large enterprise. You don't need to worry about expensive licenses or restrictive terms, which lowers the entry barrier for many innovators and creators.
In summary, LangChain sets a new standard in the world of LLM
application development. It caters to the growing interest in generative AI
technologies and delivers substantial benefits such as simplicity, increased
developer productivity and speed through reusable code, and rapid
innovation through open source collaboration.
Case Study Let us consider a scenario where you are building a virtual
assistant for the financial analytics domain. The assistant needs to
understand user queries, access real-time market data, and provide
insights.
Legal Analysis Tool A legal analysis tool can consult case law and
legal precedents to provide well-informed legal opinions and
recommendations. Such a tool could revolutionize the legal research
process by making it more efficient and comprehensive.
LangChain Advantage
Whether you need to generate code, create documentation, refactor your
codebase, or debug, LangChain will help you. It is a one-stop shop for all
your coding needs.
But more importantly, you don’t need to be a coding genius to use
LangChain. The LLMs will also help you to adapt the code according to
your unique coding style and project requirements. You can customize and
fine-tune it to fit seamlessly into your development workflow.
Models: At the heart of any LLM application are the models. You will
learn how to connect with powerful language models like GPT-4 in your
applications to create LLM apps. Sure, you could do this with the default
LLM APIs, but like we discussed earlier, LangChain standardizes the
process, making it easy to switch between different LLMs without
rewriting your code.
Prompt Templates: I will teach you the art of creating dynamic prompts
that make your language models understand and respond to queries
effectively. They guide the language models by specifying the task at
hand along with the context. By crafting effective prompts, you can get
more accurate and relevant responses from the models, thus allowing you
to tailor the output to your unique needs.
Data Connections: The data connections component allows you to feed your LLMs the right information by connecting them to various data sources, like documents, PDFs, or even vector databases. We will explore techniques like indexing and embedding to make your data retrieval straightforward and efficient for language models.
Indexes: Indexes are all about organization. They transform large
datasets into neatly arranged data libraries that your application can
query effortlessly. This setup not only speeds up information retrieval but
also enhances the overall performance of your LLM applications.
Memory Concepts: Understanding memory concepts is important when
building applications that require ongoing interactions. This component
helps maintain historical context across conversations, which allows the
language models to remember previous exchanges. This continuity is key
to providing a coherent and seamless user experience over time.
Chains: Chains are where things get even more interesting. You can use
chains to link sequences of operations or models to execute complex,
multistep tasks. This functionality is crucial for handling sophisticated
processes within your applications and to make informed decisions based
on a series of interactions.
Agents: Finally, we reach Agents. These are the advanced units within LangChain that bring together all the previous components. Agents are capable of executing tasks, making decisions, and interacting with external systems autonomously. They use APIs, databases, and custom scripts to reason through tasks and execute actions based on sophisticated logic. A short sketch showing how a few of these components fit together follows below.
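To make these building blocks concrete, here is a minimal sketch, assuming the langchain_openai and langchain_core packages are installed, that combines a prompt template, a model, and a chain; the model name and prompt text are illustrative only.

from langchain_openai import ChatOpenAI
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.output_parsers import StrOutputParser

# A prompt template specifies the task and marks where the input goes
prompt = ChatPromptTemplate.from_template(
    "Summarize the following customer review in one sentence: {review}"
)

# The model component wraps the underlying LLM (the model choice is illustrative)
llm = ChatOpenAI(model="gpt-4o-mini")

# A chain links the prompt, the model, and an output parser into one pipeline
chain = prompt | llm | StrOutputParser()

print(chain.invoke({"review": "The product arrived quickly and works great."}))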
Define Requirements: Once you have a clear idea of the goals for your
generative AI application, the next step is to spell out the specific
requirements of your application such as the type of LLMs needed, data
sources, user interactions, and specific AI functionalities like sentiment
analysis or entity recognition.
Choose LLM and LangChain Integration: This phase is similar to the
technology stack decisions you would make in traditional SDLC, except
that instead of choosing technology stack components such as databases,
programming languages, and frameworks, you will be focused on
selecting appropriate large language model(s) such as GPT-4 or PaLM
based on your application’s needs. You should also decide how
LangChain will interact with these models to ensure scalability and easy
model maintenance.
Design Application Architecture: Then we move on to the Design
Application Architecture phase (see Figure 1-4), where you transition
from abstract concepts to a detailed, structured design of your
application. This is when you will start planning how LangChain
components such as models, data connections, and agents will be
combined to meet your application’s objectives. Your architecture should
address how to integrate LangChain components, focusing on how the
data flows through LLMs, how models are orchestrated, and how AI
responses are integrated back into the application logic.
How well you design the app will determine the scalability,
maintainability, and performance of the application when done. This is quite
similar to traditional SDLC except that in traditional SDLC, you would be
focusing more on various aspects like client-server interaction, database
design, and service-oriented architecture.
Set Up Development Environment: This is where the rubber meets the
road, as you will start to prepare your development environment,
including necessary software, tools, and access to LLM APIs.
Implement LangChain Components: Now, the exciting part – coding!
You will start integrating LangChain’s components like models, data
connections, and agents to build the functionality you have designed.
This book is all about this step. I will be providing you with practical
examples and guidance. This phase differs from traditional software
programming in the sense that you are not coding for software
functionalities but for LLM interactions.
Incorporate Data Sources: During this phase, you will integrate with
external data, which will provide your application with the much-needed
context and relevance, as depicted in the Incorporate Data Sources step
of this workflow diagram (Figure 1-4). This stage emphasizes the need
for applications to pull in data from various sources, be it databases,
APIs, or live feeds. This phase differs from traditional SDLC in the sense
that you will also need to focus on how data will be used to train and
fine-tune the LLMs.
Train/Test with LLMs: If necessary, perform any training or fine-tuning of the LLMs using LangChain. Make sure you test the application thoroughly to ensure it meets your specified requirements. This differs from traditional testing in the sense that you will not only check for bugs but also ensure that the LLM is generating correct and contextually appropriate responses.
Iterate and Optimize: Use the feedback from your testing to iterate on
your application’s design and functionality. Focus on optimizing LLM
performance, making LLM configuration changes, tweaking LangChain
setups, enhancing usability, and even retraining models with new data.
Prepare for Deployment: It is time now to finalize your application for
deployment. This may involve securing permissions, final testing, and
preparing deployment scripts.
Deploy Application: The Deploy Application phase, highlighted in the
workflow diagram (Figure 1-4), marks the important stage where you
deploy the LLM application for real-world use. This is when you prepare
the application for launch. You need to ensure the app is robust, secure,
and ready to handle user interactions. You need to take care of things like
selecting the right hosting environment, scalability to accommodate
growth, and implementing monitoring tools for ongoing performance
evaluation.
Monitor and Maintain: After deployment, ensure that you are
continuously monitoring the application’s performance and user
interactions. Be proactive in making necessary updates and
improvements to enhance functionality and user satisfaction. You may be
required to make ongoing adjustments to the model given the dynamic
nature of the data and the potential drift in LLM behavior.
Key Takeaways
Now that we have come to the end of this chapter, let us discuss some key takeaways.
Understanding LangChain: You learned that LangChain is a powerful
framework that makes working with LLMs (like GPT-4, PaLM, and
Gemini) much easier. Its modular, standardized design promotes
scalability and adaptability throughout development.
The Power of LLMs: You also learned that large language models are
sophisticated AI systems that can understand and generate human-like
text. They bring an unprecedented level of sophistication and context
awareness and have revolutionized how we interact with technology.
LLMs can create content, summarize information, and even assist in
coding.
LangChain’s Building Blocks: You also learned the core components of
LangChain, such as models, prompt templates, data connections, memory
concepts, chains, and agents. These elements work together to build
intelligent LLM applications, which is really the scope of this book.
The chapter sets the stage for us to explore LangChain and LLMs more deeply, and in the next chapter, we will explore the advantages of using the LangChain framework vs. traditional LLM API development approaches.
Review Questions
These questions are meant to help you assess and reinforce your
understanding of key concepts.
1. What is the primary purpose of LangChain?
A. To replace existing LLMs with more advanced models
B. To facilitate the development of applications that use LLMs
Answers
1. B. The primary purpose of LangChain is to facilitate the development
of applications that use LLMs.
Looking Ahead
In the next chapter, you will learn about the challenges when calling the
LLM API directly and how LangChain addresses those challenges to
enhance your generative AI development experience and productivity. We
will be
Looking into the hurdles of direct LLM API interaction
Exploring practical comparisons, case studies, and exercises
demonstrating LangChain’s benefits
Understanding LangChain’s architecture for seamless integration,
scalability, and innovation
Further Reading
Below is a list of resources to help you solidify your understanding of the
topics covered in this chapter and to further explore the world of LLMs and
LangChain:
“Better Language Models and Their Implications” (OpenAI Blog):
While focused on GPT-2, this article provides valuable insights into how
LLMs understand and generate text, serving as a foundation for prompt
engineering. https://openai.com/research/better-language-models
“Retrieval-Augmented Generation for Knowledge-Intensive NLP Tasks” (arXiv paper by Lewis et al.): Explores the concept of Retrieval-Augmented Generation (RAG) with detailed examples and applications in knowledge-intensive tasks.
https://arxiv.org/abs/2005.11401
OpenAI API Documentation: The official documentation for OpenAI’s
API, covering everything from authentication to request parameters.
Essential reading for anyone building applications with LLMs.
https://platform.openai.com/docs/introduction
Google Gemini LLM Documentation: This contains some notebooks,
tutorials, and other examples to help you get started with Gemini models.
https://cloud.google.com/vertex-ai/generative-ai/docs/learn-resources
© The Author(s), under exclusive license to APress Media, LLC, part of Springer Nature 2024
R. Jay, Generative AI Apps with LangChain and Python
https://doi.org/10.1007/979-8-8688-0882-1_2
Security Concerns
When dealing with sensitive user data, security should be your top priority.
Make sure to implement robust measures like authentication, input
validation, and encryption.
Overcoming Latency
When making LLM API calls, there’s always some latency involved. To
mitigate this, you can implement techniques like caching and asynchronous
processing.
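As one common approach, here is a minimal sketch of response caching with LangChain, assuming the langchain and langchain_openai packages (import paths for the cache vary slightly across LangChain versions, and the prompt is illustrative):

from langchain.globals import set_llm_cache
from langchain.cache import InMemoryCache
from langchain_openai import ChatOpenAI

# Cache responses in memory so a repeated, identical prompt skips the API round trip
set_llm_cache(InMemoryCache())

llm = ChatOpenAI()
llm.invoke("What is LangChain?")  # first call goes to the API
llm.invoke("What is LangChain?")  # repeated call is answered from the cache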
Remember, I am sharing these challenges not to discourage you, but to
empower you to build robust AI-driven applications. With the right
knowledge and guidance, you will be able to navigate these challenges like
a pro.
Reduce Costs: You reduced the number of necessary LLM API calls
by using efficient caching and smarter prompt design techniques, thus
lowering costs.
You will have to experiment with different models and adjust their
parameters to get the best results for your app. This process can involve
tweaking things like the model’s architecture, training data, and
hyperparameters.
Development Complexity
When using LLM APIs directly, you need to become deeply familiar with
each model’s intricacies and limitations, which is a steep learning curve.
LangChain Solution: Like we discussed earlier, LangChain abstracts the
complexities involved in making direct API calls by offering you a
unified interface to interact with multiple LLMs seamlessly. This
abstraction layer means you spend less time wrestling with LLM API
specifics and more on crafting the right prompts and integrating valuable
data sources without worrying about the underlying model.
Challenges with Direct LLM API Usage Below are some issues with
calling an LLM API directly:
LangChain Solution
Trade-offs
However, it is not without trade-offs. Let us look at some of them:
You will face challenges when integrating multiple LLM providers.
Keeping your code up to date with LLM API changes or deprecations can be very maintenance heavy.
You will miss out on the boilerplate templates for common tasks like prompt engineering or response parsing that are available with LangChain.
Using LangChain
Let us discuss when to prefer LangChain:
Rapid Development and Prototyping: If you want to quickly prototype
applications without dealing with the intricacies of each LLM API,
LangChain provides a set of high-level tools and templates that speed up
development.
Flexibility and Scalability: If your project is expected to scale or evolve
over time and you will need to switch between LLM providers or
experiment with different models, then LangChain’s abstraction layer
offers significant benefits. It will be easier to manage modifications and
updates to your LLM without needing major code overhauls to integrate
with the LLM.
Complex Applications Requiring Advanced Features: You can use
LangChain’s advanced functionalities like prompt chaining, memory
concepts, and data connections that can be very critical when developing
sophisticated, enterprise-grade generative AI applications. These features
are not available when making direct API calls.
Multi-LLM Integration: LangChain is particularly useful when
integrating with multiple LLM providers. You do not have to deal with
the complexities involved when interacting with different LLM APIs
because you will be working with LangChain’s interface directly.
Cost Optimization: For projects that do require multiple LLMs,
LangChain allows easy switching between providers. This can be
particularly useful for cost management because you could use low-cost
or open source LLMs for development and testing and commercial
providers for production.
Active Community and Ecosystem: LangChain has a vibrant
community and extensive ecosystem of tools, plug-ins, and resources.
You can use this support network to significantly accelerate development
and problem-solving.
Trade-offs
Let us explore some of the trade-offs when using LangChain:
You may face potential limitations when trying to access all the features
offered by specific LLM APIs.
Debugging can be more difficult because of the additional layers of abstraction that LangChain introduces.
You will have to depend on LangChain updating their code to support
new LLM features or models.
Ultimately, you will have to factor in your project’s scale, complexity,
and specific requirements to choose between making direct LLM API calls
and LangChain.
Create/Open an Account
First, you will need to create an account with OpenAI’s API. This will give
you access to their powerful language models, which we will be using
throughout this book. To get started, head over to openai.com/api and click
the “Sign Up” button. If you already have an account, you can simply log
in.
Now, the sign-up process might require you to verify your account with
a mobile phone number, depending on your location. Just follow the on-
screen instructions, and you will be good to go.
Billing Information
Once you have signed up or logged in, you will need to set up your billing
information. As of this writing, OpenAI offers free credit for the first few
months, so you can get some hands-on practice without spending a dime.
API Key
Note After you reach the free credit limit, you’ll need to provide a
credit card for any additional charges.
To get your API key, navigate to the “View API Keys” section under your
account settings.
Click “Create New Secret Key,” copy the key that’s generated, and store it
somewhere safe.
Note Treat this key like a precious item because if you lose it, you will
have to create a new one, and you won’t be able to access the old one
ever again.
Note You can also use Poetry for managing dependencies. Poetry
provides better dependency management, project isolation, and
reproducibility compared to using pip directly.
import os
import openai
This code will send a request to the OpenAI API, asking to explain what
a machine learning model is. The model parameter specifies which
language model we want to use (in this case, davinci-002), and the
max_tokens parameter limits the length of the response.
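The request being described, assuming the pre-1.0 openai Python package used in this chapter's listings, might look like the following sketch:

import os
import openai

openai.api_key = os.getenv("OPENAI_API_KEY")

# Ask the model to explain what a machine learning model is
response = openai.Completion.create(
    model="davinci-002",
    prompt="Explain what a machine learning model is.",
    max_tokens=100,
)
print(response["choices"][0]["text"])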
Note Visit OpenAI's model overview page to ensure that the model you are choosing is current and not deprecated. Secondly, you may also run into issues if you are not using the latest version of the OpenAI Python library.
On Windows:
set OPENAI_API_KEY=your_api_key_here
import os
import openai
# Get the API key from the environment variable
api_key = os.getenv('OPENAI_API_KEY')
# Set up the OpenAI API client
openai.api_key = api_key
.....
In this code, you use the os.getenv() function to retrieve the API key
from the environment variable named OPENAI_API_KEY. You then set the
openai.api_key with the retrieved value before making API requests.
import json
import openai
In this code, you use the json module to read the configuration from the config.json file. You then access the API key using the config["OPENAI_API_KEY"] syntax.
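A minimal sketch of that approach, assuming a config.json file next to the script containing an OPENAI_API_KEY entry, is:

import json
import openai

# Read the configuration file and pull out the API key
with open("config.json") as f:
    config = json.load(f)

openai.api_key = config["OPENAI_API_KEY"]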
There are also other methods to protect the API key, such as
Using secret management tools like AWS Secrets Manager, HashiCorp
Vault, or Azure Key Vault to store the key
Storing the key as an environment variable on the server
Setting up a proxy server between the application and the OpenAI API that holds the API key, so that your application calls the proxy rather than the OpenAI API directly
Congratulations!
You have successfully set up your development environment and connected
it to the OpenAI API. You are now ready to explore the exciting world of
LangChain and start building powerful applications with large language
models.
Pat yourself on the back because you have taken the first and a very
important step toward becoming a LangChain wizard!
Enter your input (in this case, I entered "Tell the story of the earth") and press Enter. The get_chat_completion function is called with the user prompt to generate the chat completion.
The generated result is printed as shown in Figure 2-2.
Outcome
By completing this exercise, you will have hands-on experience with
Obtaining and using an API key for authentication with an LLM API
Installing and using a Python SDK to simplify API interactions
Making requests to an LLM (GPT-4) and processing its responses
import os
import openai
from langchain.llms import OpenAI
from langchain.prompts import PromptTemplate
from langchain.chains import LLMChain

# Set the OpenAI API key as an environment variable and confirm it is picked up
os.environ["OPENAI_API_KEY"] = "your_api_key_here"
openai.api_key = os.getenv("OPENAI_API_KEY")
# Prompt template that wraps the user's input
prompt_template = PromptTemplate(
    input_variables=["user_input"],
    template="You are a helpful chatbot. User: {user_input} Response:")

# Create the language model and use LLMChain for easy interaction
llm = OpenAI()
chain = LLMChain(llm=llm, prompt=prompt_template)

# Ask the user for a request and generate a response
user_prompt = input("Enter your request: ")
response = chain.run(user_input=user_prompt)
print(response)
A couple of things to note from the above example. First, the advantage of this approach is that it uses LangChain to abstract the API interaction, making it easier to switch between LLMs or adjust the prompt engineering approach without extensive modifications to your code.
Next, let’s discuss the three modules we imported:
OpenAI from langchain.llms: This is the class for interacting with the
OpenAI language model.
PromptTemplate from langchain.prompts: This class is used to define
a template for the prompt that will be used to generate responses.
LLMChain from langchain.chains: This class represents a chain that
combines a language model (LLM) with a prompt template to generate
responses.
Here’s what is happening in the code:
We set the OpenAI API key as an environment variable using
os.environ[“OPENAI_API_KEY”]. The value of the API key is assigned
to the variable.
The API key is confirmed to be set correctly by assigning it to
openai.api_key using os.getenv(“OPENAI_API_KEY”).
We create a PromptTemplate object with the following parameters:
input_variables=[“user_input”]: This specifies that the template
expects a variable named “user_input”.
template=“You are a helpful chatbot. User: {user_input} Response:”:
This defines the template string for the prompt. It includes the
“user_input” variable within curly braces {} to indicate where the
user’s input will be inserted.
We create an instance of the OpenAI class and assign it to the variable
llm. This represents the OpenAI language model that will be used for
generating responses.
We create an instance of the LLMChain class with the following
parameters:
llm=llm: This specifies the language model to be used in the chain,
which is the OpenAI instance created in the previous step.
prompt=prompt_template: This specifies the prompt template to be
used in the chain, which is the PromptTemplate object created earlier.
The user is prompted to enter their request using user_prompt; in my
case, I asked, “Tell the story of earth.”
The run method of the LLMChain instance is called with the argument
variable user_input. This will generate a response from the language
model based on the provided prompt and user input.
The generated response is printed using print(response).
In summary, you just learned how to call an OpenAI language model
using LangChain, create a prompt template, and generate a response based
on the user’s input. The response is then printed to the console.
The code demonstrates how to use LangChain to interact with the
OpenAI API, define a prompt template, and generate responses using a
language model in a structured and modular way.
chain = LLMChain(
llm=llm, prompt=prompt_template,
llm_kwargs={
"temperature": 0.7,
"max_tokens": 100,
"stop": ["\n"]
}
)
Outcome
You can observe that not only does LangChain simplify the process of
switching between different LLMs, but it also opens up possibilities for
refining prompts to achieve higher-quality AI-generated content.
The LangChain code is also cleaner and simpler and allows plugging in
multiple LLMs. Moreover, there is no tight coupling with the LLM’s API.
The benefits become even more apparent as we move on to more complex
use cases, such as customizing responses using prompt engineering,
integrating external data sources, and so on.
Key Takeaways
Let us discuss what we have learned so far:
Streamlined Development: You learned that LangChain simplifies
generative AI application development by abstracting the complexities of
direct LLM API calls. You can focus more on application logic rather
than the nuances of LLM APIs.
Enhanced Flexibility and Scalability: You can easily integrate and
switch between different LLMs without extensive code refactoring. This
ensures applications can evolve alongside emerging AI technologies with
minimal technical debt.
Optimized Prompt Engineering: You can use LangChain tools and
templates for prompt engineering which can reduce the time taken to
experiment and the skills required to elicit desired responses from LLMs.
Cost and Efficiency: By leveraging LangChain’s approach to modular
development and intelligent API request management, you can optimize
costs and improve application performance, especially for complex and
high-volume AI tasks.
Global Accessibility: You can build and deploy AI-powered applications
with global reach and consistent performance by leveraging the cloud-
based LLM APIs.
Security and Data Processing: You can maintain robust security
standards and simplify the handling and processing of complex datasets
while adhering to industry-standard security requirements.
Transition to Innovation: Moving to LangChain from direct LLM API
usage opens up new possibilities for innovation in generative AI
application development.
In summary, you can use LangChain to streamline the development
process, enhance the flexibility and scalability of applications, and open up
innovation possibilities while addressing the inherent challenges of LLM
APIs.
Further Reading
Below is a list of resources to help you solidify your understanding of the
topics covered in this chapter and to further explore the world of LLM
APIs:
OpenAI API Guide: OpenAI offers extensive documentation on
utilizing their API, including best practices for prompt engineering and
model selection.
https://platform.openai.com/docs/introduction
Google Cloud AI Services: Google Cloud provides a wide range of AI
services. Their documentation is a treasure trove of information for
developers looking to leverage Google’s AI capabilities.
https://cloud.google.com/ai-platform/docs
AWS AI Services: AWS offers a comprehensive guide to their AI
services, ideal for developers who want to integrate AWS’s machine
learning tools into their applications.
https://docs.aws.amazon.com/machine-learning/
“Design Patterns for Large Language Models”: This paper explores
various strategies and design patterns for effectively utilizing LLMs in
application development.
https://arxiv.org/abs/2004.13214
© The Author(s), under exclusive license to APress Media, LLC, part of Springer Nature 2024
R. Jay, Generative AI Apps with LangChain and Python
https://doi.org/10.1007/979-8-8688-0882-1_3
In this chapter, you will learn how to build powerful applications using
large language models (LLMs) through LangChain and OpenAI. We will
focus on building your practical skills and adopting straightforward
strategies to improve your application development skills. I will walk you
through the setup, implementation, and deployment of LLM-driven
applications, such as Q&A and chatbots.
I have designed this chapter to help you with the skills to develop,
deploy, and enhance LLM applications end to end so that you can go on to
develop more impactful and powerful applications.
Development
We already learned in Chapter 1 that LangChain provides the following
components:
Model I/O
Model input/output (I/O) is all about how you can format and manage the
data going into and coming out of the language models.
Prompts
Prompts are specific formats that you can use to feed data into language
models to guide how they generate responses. You can think of them as
tools that help you create questions to steer the conversation in the
direction you need. This approach lets you control the flow and ensures
the responses are relevant to your specific requirements.
Chat Models
Chat models are specialized interfaces to handle chat-like interactions.
You can use these models to feed chat messages as inputs and get back
responses in a conversational format. You can use them to build chatbots
and similar interactive applications.
LLMs (Large Language Models)
LLMs are sophisticated AI models primarily designed for understanding
and generating human-like text. They excel at tasks like text generation,
question answering, and information extraction. While their core
functionality revolves around text, LLMs can be used for a wide range of
applications, from creative writing to data analysis and problem-solving.
Retrieval
The retrieval component is a very significant advancement that allows
you to bridge your application-specific data with the language model,
which is perfect for applications like Retrieval-Augmented Generation
(RAG). You can enable the model to pull relevant information from a
vast dataset to develop its relevant and meaningful responses.
Document Loaders
Document loaders are tools that import data from various sources and
format it into “Documents” for your application to process later. You can
load text files, extract data from databases, or pull content from the Web
using this feature. I will be discussing this in Chapter 7.
Text Splitters
You can use text splitters to take large blocks of text and chop them into
smaller, more manageable pieces. This not only makes processing easier
but also improves the performance of your models by focusing on the
most relevant sections of text.
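As a minimal sketch, assuming the langchain-text-splitters package (long_document_text is a placeholder variable):

from langchain_text_splitters import RecursiveCharacterTextSplitter

long_document_text = "..."  # placeholder for a long document loaded elsewhere

# Split the text into overlapping chunks of roughly 500 characters
splitter = RecursiveCharacterTextSplitter(chunk_size=500, chunk_overlap=50)
chunks = splitter.split_text(long_document_text)
print(f"Created {len(chunks)} chunks")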
Embedding Models
Embedding models are a fascinating concept that allows you to convert
chunks of text into vector representations. These vectors capture the
essence of the text in a numerical format, which you can use to perform
natural language searches through large datasets.
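A minimal sketch, assuming the langchain_openai package and an OpenAI API key in the environment:

from langchain_openai import OpenAIEmbeddings

# Convert a piece of text into its numerical vector representation
embeddings = OpenAIEmbeddings()
vector = embeddings.embed_query("LangChain makes LLM apps easier to build.")
print(len(vector))  # the number of dimensions in the embedding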
Data Storage and Retrieval
LangChain supports various options for storing and retrieving data
efficiently:
1. Vectorstores: These are specialized databases that manage vector representations for natural language search through unstructured data. They are particularly useful for building sophisticated search engines and recommendation systems that provide relevant results quickly.
The choice among these storage options depends on your specific use case, data structure, and query requirements. LangChain's flexibility allows you to integrate the most appropriate storage solution for your project.
Retrievers
Retrievers are a bit more general than Vectorstores. They are interfaces
that you can use to fetch documents based on an unstructured query. You
can think of them as your personal data retriever that goes and fetches
information based on your current needs.
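A minimal sketch, assuming the langchain_community, langchain_openai, and faiss-cpu packages (the sample texts and query are illustrative):

from langchain_community.vectorstores import FAISS
from langchain_openai import OpenAIEmbeddings

# Build a small vector store and expose it as a retriever
vectorstore = FAISS.from_texts(
    ["LangChain supports agents.", "Vector stores enable semantic search."],
    OpenAIEmbeddings(),
)
retriever = vectorstore.as_retriever()

# Fetch the documents most relevant to an unstructured query
docs = retriever.invoke("How do I search my data semantically?")
print(docs[0].page_content)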
Composition
Composition components are like the architects of LangChain. You can
use them to combine various systems and LangChain primitives to build
complex functionalities. For instance, you may be linking a text splitter
with an embedding model or creating a full-fledged AI assistant. Using
this composition module, you can tailor your application precisely to
your needs and optimize its efficiency and effectiveness.
Tools
You can use the interfaces provided by tools to allow a language model to
interact with external systems. Using tools, you can expand the
capabilities of your applications, such as connecting to a database or a
third-party API.
Agents
Agents are the decision-makers in LangChain. You can use them to
analyze high-level directives and decide which tools to use to achieve the
desired outcome in an automated fashion. They ensure your application is
using the right resources at the right time on its own. Please refer to
Chapters 8–10 for a detailed treatment of LangChain Agents.
Chains
Chains are fundamental to LangChain’s modular design. They are
compositions of various components that work together as building
blocks. You can mix and match these to customize how your application
behaves. Below is an example, ending with a call to print(result), just to give you an idea of how chains can be used.
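As a minimal sketch, assuming the langchain_openai and langchain_core packages (the prompt text and model choice are illustrative), such a chain might look like this:

from langchain_openai import ChatOpenAI
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.output_parsers import StrOutputParser

# Build a simple chain: prompt -> chat model -> string output parser
prompt = ChatPromptTemplate.from_messages([
    ("system", "You are a concise technical writer."),
    ("user", "{input}")
])
chain = prompt | ChatOpenAI() | StrOutputParser()

result = chain.invoke({"input": "Explain what a LangChain chain is in one sentence."})
print(result)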
Production
During the production phase, your focus is on making sure your application
runs smoothly and efficiently. You can use LangSmith, a platform within
the LangChain ecosystem, to inspect, monitor, and evaluate your
application’s performance.
Here is a little insight into how you might use LangSmith to check your
application:
import os
!pip install langchain==0.2.7 langchain_openai==0.1.16
from langchain_openai import ChatOpenAI
from langchain.chains import LLMChain
from langchain.prompts import PromptTemplate
from langchain.smith import RunEvalConfig, run_on_dataset
# Set your API keys
os.environ["OPENAI_API_KEY"] = "your_openai_api_key"
os.environ["LANGCHAIN_API_KEY"] = "your_langsmith_api_key"
# Print results
print(results)
In this example, we use LangSmith, which is LangChain’s evaluation
and monitoring platform. It allows you to run your chain on a dataset and
evaluate its performance using various metrics. The RunEvalConfig lets
you specify which evaluators to use, and run_on_dataset executes the
evaluation.
Note To use LangSmith, you need to sign up for an account and obtain
an API key. Visit https://smith.langchain.com/ to get
started. Remember to keep your API keys confidential.
This command will start a server on port 8100, making your LangChain application accessible to clients.
This command uses LangServe, a tool designed to easily deploy
LangChain applications. It automatically sets up the necessary API
endpoints for your chains and agents, making them accessible over HTTP.
Before running this command, ensure you have LangServe installed in
your project:
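A typical install command (check the LangServe documentation for the current package extras) is:

!pip install "langserve[all]"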
LangChain Ecosystem
LangChain started as a modest Python package but has grown into a robust
framework thanks to input and collaboration from the active developer
community. As it evolved, the LangChain team realized the need to
streamline the architecture for better usability and scalability. When
developing applications, you are going to need to understand these
components to use them effectively and avoid potential confusion. Here is
how they organized the pieces.
LangChain-Core: LangChain-Core is the foundation of the framework.
It provides core abstractions that have become standard building blocks
for LangChain components.
You can use the LangChain Expression Language to compose these
components smoothly.
Now at version 0.1, LangChain-Core ensures that any significant updates
come with a minor version bump to keep things stable for you.
Example Usage
Below is an example of how these building blocks are used:
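As a minimal sketch, assuming the langchain_core, langchain_openai, and langchain_community packages are installed, you might load a document with a document loader and pass it through a small chain built from core abstractions; the file name, model, and prompt are illustrative.

from langchain_community.document_loaders import TextLoader
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.output_parsers import StrOutputParser
from langchain_openai import ChatOpenAI

# Load a local text file into Document objects (file name is illustrative)
docs = TextLoader("your_data.txt").load()

# Compose core abstractions with the LangChain Expression Language (the | operator)
prompt = ChatPromptTemplate.from_template("Summarize this text: {text}")
chain = prompt | ChatOpenAI() | StrOutputParser()

print(chain.invoke({"text": docs[0].page_content}))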
Always refer to the latest LangChain documentation for the most up-to-
date information on available loaders and their usage:
https://python.langchain.com/docs/integrations/document_loaders/.
High-Level Components
LangChain: The LangChain package includes high-level, use case–specific
chains, agents, and retrieval algorithms that form the backbone of your
generative AI application’s architecture. Aiming for a stable 0.1 release
soon, this package will bring you sophisticated functionalities to build
complex AI-driven solutions.
Below is an illustrative example:
2. Then you create a vector store (FAISS) to efficiently store and retrieve
document embeddings.
4. You create a RetrievalQA chain that combines the retriever and the
language model.
5. Finally, you use the chain to answer a question based on the loaded
documents.
This setup allows you to ask questions about the content of your
documents, and the system will retrieve relevant information to generate an
answer.
Remember to replace “your_data.txt” with the path to your actual data
file, and ensure you have the necessary API keys set up for OpenAI.
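Pulling those steps together, a minimal sketch of the setup (assuming the langchain, langchain_community, langchain_openai, and faiss-cpu packages) might look like this:

from langchain_community.document_loaders import TextLoader
from langchain_community.vectorstores import FAISS
from langchain_openai import OpenAIEmbeddings, ChatOpenAI
from langchain.chains import RetrievalQA

# 1. Load the documents
docs = TextLoader("your_data.txt").load()

# 2. Create a FAISS vector store to store and retrieve document embeddings
vectorstore = FAISS.from_documents(docs, OpenAIEmbeddings())

# 3. Expose the vector store as a retriever
retriever = vectorstore.as_retriever()

# 4. Create a RetrievalQA chain that combines the retriever and the language model
qa_chain = RetrievalQA.from_chain_type(llm=ChatOpenAI(), retriever=retriever)

# 5. Use the chain to answer a question based on the loaded documents
print(qa_chain.invoke({"query": "What is this document about?"}))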
Note For the full list of LLMs supported by LangChain, check the
LangChain documentation:
https://python.langchain.com/v0.2/docs/integrations/llms/.
prompt = ChatPromptTemplate.from_messages([
    ("system", "You are a world recognized wellness expert especially in cardio activities."),
    ("user", "{input}")
])
llm = OpenAI(api_key="openai_api_key")
Step 7: Output
The final output will be a string containing the LLM’s response about the
benefits of walking a mile a day, framed as if it is coming from a wellness
expert specializing in cardio activities.
output_parser = StrOutputParser()
prompt = ChatPromptTemplate.from_messages([
    ("system", "You are a world recognized wellness expert especially in cardio activities."),
    ("user", "{input}")
])
Note When running this code, if you happen to run into an error, check with Gemini or ChatGPT by pasting the error you are facing, and they should point you to the correct syntax based on the different versions of OpenAI and LangChain you may be using.
if user_input.lower() == 'quit':
    break
# Generate a response
response = generate_response(user_input)
2. Prompt Handling
Q&A App: Uses ChatPromptTemplate to structure the prompt
Chatbot: Directly uses HumanMessage for each input
3. Chain vs. Direct Invocation
Q&A App: Creates a chain (prompt | llm | output_parser)
Chatbot: Directly invokes the model for each input
4. Interaction
Q&A App: Single invocation
Chatbot: Continuous loop for multiple interactions
6. Output Parsing
Q&A App: Uses StrOutputParser
Chatbot: Directly accesses response content
7. Use Case
Q&A App: Set up for a specific wellness expert scenario
Chatbot: Generic chatbot without a specific role
8. User Input
Q&A App: Hardcoded input
Chatbot: Takes input from the user in real time
Development Playground
I have added a few suggestions below for you to practice experimenting
with LangChain and LLMs in a controlled, interactive environment.
LangChain Playground
LangChain has a playground where you can freely experiment with
LangChain’s capabilities without any setup required on your end. As shown
in Figure 3-1, it is a web-based interface where you can write, execute, and
test prompts directly in your browser.
Colab Notebooks
Link: https://colab.research.google.com
Google Colab offers a free Jupyter notebook environment that requires
no setup and runs entirely in the cloud. You can use it to write and execute
Python code, and it integrates with GitHub and other external datasets.
Use it to create or search for existing notebooks that demonstrate
LangChain and LLM interactions.
Kaggle Notebooks
Link: https://www.kaggle.com/code
Kaggle provides a cloud-based Jupyter notebook environment similar to
Google Colab. It is integrated with Kaggle’s competitions and datasets, but
you can use it for any data science or machine learning project.
Use Kaggle to explore notebooks that feature LLM experiments. Its vast
dataset repository can also be a valuable resource for feeding real-world
data into your LangChain experiments.
Maximize Your Learning Through Experimenting
I recommend you continue your learning through continuous practice and
experimentation. Here are some suggestions.
Experiment Freely
Don’t be afraid to try out different models, prompts, and configurations.
Remember that learning what doesn’t work is just as important as finding
what does.
Review Questions
Here are some questions to help you synthesize the knowledge gained from
the chapter and apply it to broader contexts and future projects.
1. What is the primary advantage of using LangChain's model I/O for LLM integration?
A. It simplifies the computational requirements.
B. It allows for direct access to the Internet.
C. It simplifies the integration of LLMs into applications.
Answers
1. C. It simplifies the integration of LLMs into applications.
2. B. Q&A and conversational apps.
3. B. Google Colaboratory.
4. C. Using try and except blocks.
5. B. Engaging with interactive platforms and resources.
6. B. langchain.llms
Additional Review
Describe how LangChain’s model I/O benefits the integration of LLMs
into your applications. What specific features make it advantageous over
direct API calls?
Explain the process of setting up a development environment for working
with LangChain and OpenAI. What are the key components you need to
configure?
Discuss the types of applications you learned to build in this chapter.
How do Q&A apps differ from conversational apps in terms of their
development and functionality?
What are some effective error handling strategies you learned in this
chapter? How do these strategies improve the robustness of LLM
applications?
Reflect on the importance of continuous learning in the field of LLM
application development. How can engaging with interactive platforms
and resources enhance your skills and keep you updated with the latest
developments?
Key Takeaways
The journey through this chapter is just the beginning. As you move
forward, remember that the skills and concepts you have acquired here will
serve as the foundation for more complex and diverse applications.
Familiarity with Framework: You learned how LangChain’s model I/O
simplifies the integration of LLMs like OpenAI into your applications,
enhancing both functionality and user experience.
Diversity of LLM: You explored the variety of LLMs available within
LangChain, including specialized models for general queries and
conversational interactions.
Development Setup: You set up a functional development environment,
prepared all necessary tools, and gained confidence in configuring
LangChain and OpenAI APIs.
Building Applications: You built and deployed two key types of applications – a Q&A app and a conversational app – using Python, demonstrating practical application of your skills.
Mastering Troubleshooting: You learned effective error handling and
troubleshooting strategies, ensuring your applications run smoothly and
reliably.
Continuous Learning: I hope by now you have also engaged with
interactive learning resources to continuously improve your skills and
adapt to new developments in LLM technology and application building.
LangChain’s modular and standardized approach offers a flexible and
scalable framework that will continue to help you in your AI development
efforts. In the chapters to come, I will show you more exciting possibilities
that LangChain brings to the table.
Glossary
API Key: A unique identifier used to authenticate a user, developer, or
calling program to an API. It is essential for tracking API usage and
ensuring secure access to the service.
Chat Model: A type of LLM designed for generating responses in a
conversational format, simulating a dialogue between humans and AI.
Completion: In the context of LLMs, a completion refers to the text
generated by the model in response to a prompt or input provided by the
user.
Hugging Face Spaces: An online platform for hosting and sharing
machine learning projects, including those involving LLMs. It allows
developers to create and share interactive ML demos.
Model IO (Input/Output): Refers to the processes of sending inputs to
and receiving outputs from a machine learning model. In the context of
LangChain, it relates to how prompts are sent to LLMs and how their
responses are handled.
Prompt Templates: Predefined structures or formats for creating
prompts to send to LLMs, facilitating consistent and effective
communication with the models.
Streamlit: An open source Python library for creating and sharing
beautiful, custom web apps for machine learning and data science
projects.
This glossary serves as a quick reference to navigate the concepts and
techniques involved in developing applications with LLMs and LangChain.
Further Reading
The resources below can help deepen your understanding of the topics covered in this chapter:
“Design Patterns for Large Language Models” – Arxiv Paper:
Explores various strategies and patterns for effectively utilizing LLMs in
application development. https://arxiv.org/abs/2004.13214
OpenAI API Documentation: The official documentation for the OpenAI API provides detailed information on using different models, including GPT-3 and Codex.
https://platform.openai.com/docs
LangChain GitHub Repository: The source for LangChain code, documentation, and examples; invaluable for developers looking to integrate LLMs into their projects.
https://github.com/LangChain/langchain
“Prompt Engineering for GPT-3” on OpenAI Blog: Discusses
strategies for designing effective prompts to elicit desired responses from
GPT-3, a crucial skill for developing applications with LLMs.
https://www.openai.com/blog/prompt-engineering/
Hugging Face Forums: A community forum for discussions on
machine learning, with a strong focus on NLP and transformer models.
Great for staying updated on the latest research and practical applications.
https://discuss.huggingface.co/
r/MachineLearning on Reddit: A subreddit dedicated to machine
learning, where practitioners and researchers share news, discuss projects,
and ask for advice on a wide range of topics related to AI and NLP.
These resources will help you expand your knowledge and skills in
LLMs and application development, from theoretical foundations to
practical implementation strategies.
https://medium.com/@swethag04/build-a-q-a-app-for-a-webpage-431b7b8220e6
https://colab.research.google.com/drive/19a_v8tBN4fwELxTn04zN1NHQWb0BtMBl?usp=sharing
© The Author(s), under exclusive license to APress Media, LLC, part of Springer Nature 2024
R. Jay, Generative AI Apps with LangChain and Python
https://doi.org/10.1007/979-8-8688-0882-1_4
OpenAI’s Models
Let us discuss the OpenAI models first.
The above example shows how you can speed up your coding workflow
and get instant suggestions for completing your code using Codex. It can
assist you with tasks like
Completing function implementations based on comments or docstrings
Generating boilerplate code for common programming patterns
Providing you with suggestions for optimizing or refactoring your code
# Example usage
prompt = "A mobile phone displaying a customer support chatbot app interface"
image_urls = generate_image(prompt)
if image_urls:
    for url in image_urls:
        print(url)
else:
    print("Failed to generate image.")
In this example, you provide a textual description of the image you want to
generate. You can define the generate_image function to take the following
parameters:
prompt: The text description of the image you want to generate.
num_images: The number of images to generate (default is 1).
size: The size of the generated image (default is “1024x1024”).
Inside the generate_image function, you will make a request to the OpenAI
API using openai.Image.create(), passing the prompt, num_images, and size
parameters.
If the request is successful, you extract the image URLs from the response
data and return them as a list.
If an exception occurs during the API request, you print an error message
and return None.
If the image_urls list is not empty, you can iterate over each URL and print
it. Otherwise, you can print a failure message.
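Pieced together from that description, a minimal sketch of the helper, assuming the pre-1.0 openai package whose Image.create endpoint is used here, might look like this:

import openai

def generate_image(prompt, num_images=1, size="1024x1024"):
    try:
        # Request image generation from the OpenAI API
        response = openai.Image.create(prompt=prompt, n=num_images, size=size)
        # Extract the image URLs from the response data
        return [item["url"] for item in response["data"]]
    except Exception as e:
        print(f"Error generating image: {e}")
        return None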
Figure 4-1 shows an image that was generated.
Figure 4-1 Example of a DALL-E Generated Image
Note that you may have to create a billing account to use the API. Proceed
with the steps when presented with a pop-up like shown in Figure 4-7.
Figure 4-7 Enable Billing
export GOOGLE_APPLICATION_CREDENTIALS="path/to/your/service-account-file.json"
import os
os.environ["GOOGLE_APPLICATION_CREDENTIALS"] = "path/to/your/service-account-file.json"
Full Example Python Script
Here is the complete Python script to analyze text using the Google Cloud
Natural Language API.
First, let us install the necessary libraries:
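A typical install command (verify the package name against Google's current client library documentation) is:

!pip install google-cloud-language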
# Set up authentication
os.environ["GOOGLE_APPLICATION_CREDENTIALS"] = "path/to/your/service-account-file.json"
def analyze_text(text):
    # Initialize the LanguageServiceClient
    client = language_v1.LanguageServiceClient()
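    # The remainder of this function is a sketch (assuming the google-cloud-language
    # client library): build a Document, analyze its sentiment, and print the results
    document = language_v1.Document(
        content=text, type_=language_v1.Document.Type.PLAIN_TEXT
    )
    sentiment = client.analyze_sentiment(
        request={"document": document}
    ).document_sentiment
    print(f"Sentiment: {sentiment.score}, {sentiment.magnitude}")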
Sentiment
Sentiment: 0.800000011920929, 0.800000011920929
This line shows the sentiment analysis results, which consist of two
values: sentiment score and sentiment magnitude.
Sentiment Score
The first value 0.800000011920929 represents the sentiment score.
The sentiment score ranges from –1.0 (very negative) to 1.0 (very
positive).
A score of 0.800000011920929 indicates that the text is strongly positive.
Sentiment Magnitude
The second value 0.800000011920929 represents the sentiment magnitude.
The sentiment magnitude indicates the overall strength of emotion
(positive or negative) expressed in the text.
It is a non-negative number that increases with the amount of emotional
content. In this case, 0.800000011920929 suggests a moderate amount of
positive sentiment.
Interpretation
The analyzed text, “Google Cloud AI services are powerful and easy to use,”
is identified as having a strongly positive sentiment, as indicated by a high
sentiment score close to 1.0.
The sentiment magnitude is also relatively high, reflecting a moderate
level of emotional intensity in the positive sentiment.
Conclusion
The Google Cloud Natural Language API has correctly identified the input
text as positive. It also gives you a sentiment score and magnitude that
quantitatively represent this positive sentiment. This is useful in applications
where you want to understand the sentiment of user feedback, reviews, or any
text content that is crucial for insights and decision-making.
By following these steps, you now have a Google Cloud project with the
Natural Language API enabled, a service account with the necessary
permissions, and a JSON key file for authentication. You have learned a lot,
and now you know how to use the Google Cloud Natural Language API in
your Python applications securely. And it opens up more avenues for you to
leverage many other models from Google for vision, image, audio, and video.
PaLM 2
You then move on to PaLM 2 for Text, which stands out for its ability to
follow complex, multistep instructions and excels in zero-shot and few-shot
learning scenarios. You discover that you can get impressive results with
minimal or no training data, saving you time and resources. You can generate
long, coherent responses that stay on track and maintain a clear and logical
flow.
Your research further leads to PaLM 2 for Chat, which turns out to be your
secret weapon to create engaging conversational AI. It is tailored specifically
for chat-based interactions, and it makes conversations feel more natural and
engaging. You decide to include it in your list of candidates to build customer
support bots, interactive games, or conversational interfaces for IoT devices.
You decide to test it out because it has the potential to amaze your users with
lifelike and contextually relevant responses.
Speech Models
You discover Google’s Chirp Speech, which is a universal speech model that
can transcribe over 100 languages using just one model. You realize that you
can use this to develop voice-enabled applications that work globally without
needing separate models for different languages.
Getting Started with Google AI Models Note that to get started with
these models, Google provides detailed model cards and quickstarts that
offer guidance and code samples. I recommend you explore the
documentation, experiment with the provided examples, and adapt them to
your specific use cases. Check this link:
https://cloud.google.com/vertex-ai/generative-
ai/docs/model-garden/explore-models.
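For example, a minimal sketch of calling PaLM 2 for Text through LangChain's Vertex AI integration might look like the following. It assumes the langchain-google-vertexai package, a Google Cloud project with Vertex AI enabled, and the text-bison model; adjust the names to your versions:
from langchain_google_vertexai import VertexAI

# text-bison is the PaLM 2 for Text model on Vertex AI
llm = VertexAI(model_name="text-bison", temperature=0.2, max_output_tokens=256)

response = llm.invoke("List three ideas for a voice-enabled travel assistant.")
print(response)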
Enter a name for your key and click “Create Key.” Store the key in a safe
place for future use because you will not be able to view it again. Of course,
do not share it with anyone.
Once you have your key, you can easily integrate Claude into your Python
projects using the Anthropic SDK. Note that your code will not run unless your
account has credits; you can claim the $5 starter credit by activating it as
shown below.
Here is a simple example of how to generate text using Claude 3 Opus:
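Below is a minimal sketch using the Anthropic Python SDK; the model identifier and prompt are illustrative:
import os
import anthropic

# Create the client, reading the API key from an environment variable
client = anthropic.Anthropic(api_key=os.environ.get("ANTHROPIC_API_KEY"))

# Send a prompt to Claude 3 Opus and print the response
message = client.messages.create(
    model="claude-3-opus-20240229",
    max_tokens=300,
    messages=[
        {"role": "user", "content": "Write a short, upbeat product description for a smart water bottle."}
    ],
)
print(message.content[0].text)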
2. You retrieve the Anthropic API key from the environment variable
ANTHROPIC_API_KEY. Make sure to set this environment variable
with your actual API key before running the code.
As you can see, with just a few lines of code, you can use the power of
Claude 3 models to generate amazing AI-powered content.
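Turning to Cohere's models: here is a minimal sketch, using the cohere Python SDK directly, of the call that the next explanation walks through. The prompt and parameter values are illustrative:
import cohere

co = cohere.Client("YOUR_COHERE_API_KEY")

# Ask the Command model to generate text, limiting the length and
# controlling creativity with the temperature setting
response = co.generate(
    model="command",
    prompt="Write a one-sentence pitch for an AI-powered recipe app.",
    max_tokens=100,
    temperature=0.7,
)
print(response.generations[0].text)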
In this example, you import the Cohere library and create a Client instance
with your API key.
You then provide a prompt to the Command model and specify the
maximum number of tokens to generate and the temperature (which controls
the creativity of the generated text).
The generated text is stored in the generations attribute of the response,
which you can access and print.
As you can see, with just a few lines of code, you could unleash the power
of Cohere’s generative models and start creating amazing AI-powered
applications.
Here is a simple example of how you can use Cohere’s Command model
to generate text:
!pip install langchain==0.2.0
!pip install langchain_community==0.2.0
!pip install langchain-cohere
!pip install cohere==5.5.0

import os
from langchain_core.prompts import ChatPromptTemplate
from langchain_cohere import ChatCohere

# Set up your Cohere API key
os.environ["COHERE_API_KEY"] = "YOUR_COHERE_API_KEY"
COHERE_API_KEY = os.environ["COHERE_API_KEY"]

# System prompt that sets the model's role (adjust to suit your use case)
prompt_template = "You are a helpful assistant that shares interesting facts."

if __name__ == "__main__":
    # Initialize the Cohere model
    cohere = ChatCohere(temperature=0,
        api_key=COHERE_API_KEY, model_name="command-r")

    system = prompt_template
    human = "{text}"

    # Create a prompt template
    prompt = ChatPromptTemplate.from_messages([("system", system), ("human", human)])

    # Compose the prompt and the model into a chain
    chain = prompt | cohere

    response = chain.invoke(
        {
            "text": "Give me a fun fact about Cohere",
        }
    )
    print("Cohere Generated Pitch:")
    print(response)
Meta AI Models
Next, your model discovery leads you to the set of Meta AI models. You first
start with LLaMA, or Large Language Model Meta AI, which is a
foundational model that comes in various sizes (7B to 65B parameters) and
excels at tasks like language understanding, text generation, and question
answering.
You also discover OPT, the Open Pretrained Transformer, which is
another series of open source models that excels in language modeling and
generation. And then there is NLLB, or No Language Left Behind, a
multilingual model that can translate between over 200 languages.
But you realize that Meta AI’s offerings don’t stop with language models.
They have also developed RoBERTa, an optimized version of BERT that
delivers improved performance on a range of natural language understanding
tasks from text classification to sentiment analysis. DPR, or Dense Passage
Retrieval, is a retrieval-based model that efficiently tackles document
retrieval and question answering. And M2M-100 is a true polyglot capable of
translating between any pair of 100 languages.
You realize that the potential use cases for these models are vast and
exciting, and you have documented some clear use cases such as automated
content creation with LLaMA, global ecommerce platforms powered by
M2M-100, and personalized learning experiences enhanced by RoBERTa.
You then move on to the world of audio and efficient text classification
with WaVE and FastText. WaVE is the short form for Waveform-to-Vector
and converts audio waveforms into fixed-dimensional vector representations.
It opens up opportunities for audio classification, retrieval, and similarity
analysis. FastText, on the other hand, is a lightweight library which will be
helpful for speedy text classification and representation learning.
You document use cases such as voice-activated systems, music
recommendation engines, and real-time sentiment analysis on social media for
future reference.
from transformers import LlamaTokenizer, LlamaForCausalLM

model_name = "MetaAI/llama-7b"
access_token = "your_access_token"

tokenizer = LlamaTokenizer.from_pretrained(model_name, use_auth_token=access_token)
model = LlamaForCausalLM.from_pretrained(model_name, use_auth_token=access_token)
4. Click the “New token” button to create a new access token. Give it a
name and select the desired permissions. For this example, a READ
permission should suffice.
5. Copy the generated access token and use it in your code as shown above.
huggingface-cli login
This command will prompt you to enter your Hugging Face username and
password. Once logged in, your access token will be saved, and you can use it
in your code without explicitly providing it. Figure 4-11 shows what happens
when you use the huggingface-cli command-line tool.
Figure 4-11 Hugging Face CLI Command Tool
Remember to keep your access token secure and avoid sharing it publicly,
as it grants access to your Hugging Face account and repositories.
By providing the access token or logging in using the huggingface-cli, you
should be able to load the private or gated model repository successfully.
Code Explanation
Here is the explanation for the code.
First, you must import the necessary libraries and modules, such as
PyTorch, LangChain, and Hugging Face’s Transformers. Next, you need to
authenticate with Hugging Face using an access token. Like we discussed
earlier, this authentication step is required if you want to access certain models
such as LLaMA from the Hugging Face platform. Note that you will also have
to sign an agreement to use these models.
Moving on, you load the pretrained model and tokenizer. In this case, you
are using the “Meta-Llama-3-8B” model. The tokenizer helps in breaking
down the text into tokens that the model can understand. You configure the
model to use float16 data type for memory efficiency and automatically utilize
available GPUs.
Now, let us talk about the custom stopping criteria. This optional step
allows you to define when the model should stop generating text. Here, you
create a StopOnTokens class that checks if the generated text reaches the end-
of-sequence token. This helps in preventing the model from generating an
indefinite amount of text.
With the model and stopping criteria ready, you create a Hugging Face
pipeline for text generation. You specify parameters like the maximum
number of new tokens to generate, the stopping criteria, sampling method, and
temperature for controlling the randomness of the output.
To integrate the pipeline seamlessly with LangChain, you wrap it using the
HuggingFacePipeline class. This allows you to use the pipeline as a language
model (LLM) within the LangChain framework.
Finally, you are ready to interact with the model! You provide a prompt
asking about the potential benefits and risks of artificial intelligence. The
LLM processes the prompt and generates a response, which is then printed.
import torch
from langchain import HuggingFacePipeline
from transformers import (
AutoModelForCausalLM,
AutoTokenizer,
pipeline,
StoppingCriteria,
StoppingCriteriaList,
logging
)
from huggingface_hub import login
stopping_criteria =
StoppingCriteriaList([StopOnTokens()])
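Below is a minimal, self-contained sketch that fills in the pieces described above, restating the imports for completeness. It assumes access to the gated meta-llama/Meta-Llama-3-8B repository and recent versions of transformers and LangChain; the StopOnTokens class, parameter values, and prompt are illustrative:
import torch
from transformers import (
    AutoModelForCausalLM,
    AutoTokenizer,
    pipeline,
    StoppingCriteria,
    StoppingCriteriaList,
)
from langchain_community.llms import HuggingFacePipeline  # older versions: from langchain import HuggingFacePipeline
from huggingface_hub import login

login(token="your_access_token")  # authenticate for gated models

model_name = "meta-llama/Meta-Llama-3-8B"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype=torch.float16,  # float16 for memory efficiency
    device_map="auto",          # place layers on available GPUs automatically
)

class StopOnTokens(StoppingCriteria):
    # Stop generation once the end-of-sequence token is produced
    def __call__(self, input_ids, scores, **kwargs):
        return bool(input_ids[0][-1] == tokenizer.eos_token_id)

stopping_criteria = StoppingCriteriaList([StopOnTokens()])

text_pipeline = pipeline(
    "text-generation",
    model=model,
    tokenizer=tokenizer,
    max_new_tokens=256,
    stopping_criteria=stopping_criteria,
    do_sample=True,
    temperature=0.7,
)

# Wrap the pipeline so it can be used as an LLM within LangChain
llm = HuggingFacePipeline(pipeline=text_pipeline)

prompt = "What are the potential benefits and risks of artificial intelligence?"
print(llm.invoke(prompt))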
4. If the model size is not a strict requirement, you can explore using smaller
pretrained models that offer a good balance between performance and
execution time.
PyTorch
To complete your analysis of Meta AI models, you review PyTorch as an open
source deep learning framework. While not a model itself, PyTorch provides
the foundation for building and training various AI models and neural
networks for a wide range of AI applications.
Review Questions
Let us refresh our knowledge gained from this chapter through these review
questions.
1. Which of the following is a key feature of GPT-4?
A. Image generation
B. Code completion
C. Translating languages
B. Codex
C. DALL-E 2
D. PaLM 2
D. Speech-to-text conversion
B. Embed-english
C. Codex
D. DALL-E 2
B. Speech recognition
C. Image processing
9. Which model from Google is known for its ability to transcribe over 100
languages?
A. PaLM 2
B. Codey
C. Chirp Speech
Answers
1. C. Text generation and understanding
6. B. Embed-english
9. C. Chirp Speech
Key Learnings
This chapter provided a comprehensive exploration of large language models
(LLMs) from leading entities like OpenAI, Google, Anthropic, Cohere, and
Meta. Each platform offers unique models tailored to specific tasks, from text
generation and coding assistance to image creation and language translation.
Key takeaways include
1. Diversity of LLM Applications: LLMs are versatile and can enhance or
automate various tasks across different domains and industries.
Glossary
In addition to these models discussed so far, you can explore other open
source models.
In this chapter, I will be showing you powerful tools and techniques to use
prompt engineering within the LangChain framework. You will learn how
to design and optimize prompts that effectively communicate with LLMs to
achieve precise and reliable outputs tailored to your specific application
needs.
And you would get a fascinating fact about tech stock rally, like “The
tech stock rally in 2021 was driven by a combination of low interest rates,
economic recovery, strong earnings growth, increased digitization, investor
enthusiasm, government stimulus, and technological advancements.”
But what if you want to switch things up and get facts about the retail
sector growth in 2021? You could use an f-string literal and define the
topic variable outside, like this:
topic = "retail sector growth in 2021"

prompt = f"""
Provide a concise summary of the major factors influencing {topic}.
Include at least three key points in your response.
Format your answer as a bulleted list.
"""

result = llm(prompt)  # assumes an LLM instance named llm was created earlier
print(result)
While this works, it doesn’t scale well when you start building more
complex applications or chains. That is where prompt templates come to the
rescue!
When crafting prompts, you must consider both system and user
prompts:
System Prompt: You should start by defining the AI’s role and overall
expected behavior using a system prompt. This sets the context for all
subsequent interactions. For example:
System: You are an expert educator skilled at
explaining complex topics to high school
students.
User Prompt: You should then follow it with a user prompt that
specifies the actual task. Keep it simple and straightforward. For
instance:
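User: Explain how black holes form, using an everyday analogy a high school student would understand.
(This user prompt is only one possible example; any task-specific instruction works here.)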
This approach helps to provide a clear structure for the LLM to follow.
Embracing Iteration
Don’t expect perfection on the first try. Instead, be prepared to iterate, test,
and refine your prompts multiple times.
Components of a Prompt
The prompts module consists of three main components:
Prompt templates
Example selectors
Output parsers
Let us take a closer look at each of these components and see how they
can help you create awesome prompts.
Prompt Templates
You can leverage two types of prompt templates when working with
language models:
1. Regular Prompt Templates: These are used for straightforward text
generation use cases, such as answering questions, completing
sentences, or any other text generation task.
3. Context and Questions: You can help the LLM understand the setting
and focus of the conversation by providing relevant context and
specific questions. It helps the LLM model to provide responses that
are very specific to your task.
template = """
You are a seasoned software engineer.
Explain the following algorithm: {algorithm} in
{language}. Describe its purpose, time complexity,
and a common use case.
"""
In this example, you replace {algorithm} with “machine learning” and
{language} with “French.” The resulting prompt will look like this:
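You are a seasoned software engineer.
Explain the following algorithm: machine learning in French. Describe its purpose, time complexity, and a common use case.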
Finally, you invoke the LLM with the prompt string as an argument and
print the response:
In the above code, you are creating an instance of the LLMChain class,
which represents a simple chain in a language model. The LLMChain
constructor takes two arguments, namely, the llm and the prompt, to which
you have assigned the prompt template that you created previously.
Note Remember to import the necessary libraries for this to work. For
example, you would need to import the following:
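Here is a sketch of the imports and wiring, assuming the OpenAI integration; adjust the import paths to match your LangChain version:
from langchain.llms import OpenAI
from langchain.prompts import PromptTemplate
from langchain.chains import LLMChain

llm = OpenAI(temperature=0.7)

prompt_template = PromptTemplate(
    input_variables=["algorithm", "language"],
    template=template,
)

# Combine the LLM and the prompt template into a simple chain
chain = LLMChain(llm=llm, prompt=prompt_template)

response = chain.run(algorithm="machine learning", language="French")
print(response)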
from abc import ABC, abstractmethod
from typing import Any, Dict, List

class BaseExampleSelector(ABC):
    """Interface for selecting examples to include in prompts."""

    @abstractmethod
    def select_examples(self, input_variables: Dict[str, str]) -> List[dict]:
        """Select which examples to use based on the inputs."""

    @abstractmethod
    def add_example(self, example: Dict[str, str]) -> Any:
        """Add new example to store."""
This is an abstract base class that defines the interface for our example
selector.
You need to implement the abstract method select_examples in a class
inheriting from BaseExampleSelector. It takes a dictionary of input
variables and returns a list of selected examples.
You can add new examples to the example selector’s store by
implementing the add_example method.
Note that you need to import ABC and abstractmethod from the abc
module. You should also import necessary types from the typing module for
type hinting.
Creating a List of Examples
Below is a list of example inputs and outputs for translating English words
to Italian. Each example is represented as a dictionary with “input” and
“output” keys:
examples = [
    {"input": "hi", "output": "ciao"},
    {"input": "bye", "output": "arrivederci"},
    {"input": "soccer", "output": "calcio"},
]
class CustomExampleSelector(BaseExampleSelector):
    def __init__(self, examples):
        self.examples = examples

    def add_example(self, example):
        self.examples.append(example)

    def select_examples(self, input_variables):
        # Pick the example whose input word is closest in length to the new word
        new_word_length = len(input_variables["input"])
        best_match = None
        smallest_diff = float("inf")
        for example in self.examples:
            diff = abs(len(example["input"]) - new_word_length)
            if diff < smallest_diff:
                smallest_diff, best_match = diff, example
        return [best_match]
Using the Custom Example Selector
Here, you create an instance of the CustomExampleSelector with the list of
examples:
example_selector = CustomExampleSelector(examples)
Then, you call the select_examples method with the input word “okay”
to get the closest matching example:
example_selector.select_examples({"input":
"okay"})
The output is
[{'input': 'bye', 'output': 'arrivederci'}]
The word "bye" is selected because its length is the closest match to the
length of "okay."
example_prompt = PromptTemplate.from_template("Input: {input} -> Output: {output}")

prompt = FewShotPromptTemplate(
    example_selector=example_selector,
    example_prompt=example_prompt,
    suffix="Input: {input} -> Output:",
    prefix="Translate the following words from English to Italian:",
    input_variables=["input"],
)
Finally, you use the format method of the prompt template to generate a
prompt with the input word “word”:
print(prompt.format(input="word"))
Now, you have seen how to create a custom example selector that
selects examples based on their similarity in length to the input word. The
example selector is then used in a prompt template to generate prompts for
translating English words to Italian. You should be able to use the same
approach to create custom example selectors for similar use cases.
examples = [
{
"question": "What is the largest planet in
our solar system?",
"answer": "Jupiter is the largest planet
in our solar system."
},
{
"question": "Who painted the Mona Lisa?",
"answer": "The Mona Lisa was painted by
Leonardo da Vinci."
},
{
"question": "What is the currency of
Japan?",
"answer": "The currency of Japan is the
Japanese yen."
}
]
Output:
prompt = FewShotPromptTemplate(
examples=examples,
example_prompt=example_prompt,
suffix="Question: {input}",
input_variables=["input"]
)
Output:
The above line selects the most relevant examples based on the question
“Who sculpted the Statue of David?” using the
LengthBasedExampleSelector. The selected examples are then used to
create a new FewShotPromptTemplate that incorporates the example
selector below.
Let us discuss how the LengthBasedExampleSelector works.
When initializing the LengthBasedExampleSelector, you provide the list
of examples (examples), the example prompt template
(example_prompt), and the maximum length (max_length) for the
selected examples.
The select_examples method is called with an input dictionary containing
the question “Who sculpted the Statue of David?” However, in the case
of the LengthBasedExampleSelector, the input question is not used for
selecting examples based on relevance.
The LengthBasedExampleSelector iterates through the list of examples
and selects examples based on their length. It tries to include as many
examples as possible while ensuring that the total length of the selected
examples does not exceed the specified max_length. Its goal is to
maximize the number of examples to include within the specified length
limit.
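For reference, the selector described above is typically initialized like this; the max_length value is illustrative, and the import path may vary by LangChain version:
from langchain.prompts.example_selector import LengthBasedExampleSelector

example_selector = LengthBasedExampleSelector(
    examples=examples,
    example_prompt=example_prompt,
    max_length=50,  # maximum combined length of the selected examples
)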
prompt = FewShotPromptTemplate(
example_selector=example_selector,
example_prompt=example_prompt,
suffix="Question: {input}",
input_variables=["input"]
)
print("\nGenerated Answer:")
print(llm(prompt.format(input="What is the capital
of Australia?")))
print("\nGenerated Answer:")
print(llm(prompt_with_selector.format(input="Who
sculpted the Statue of David?")))
Below is the answer I got:
Generated Answer:
The capital of Australia is Canberra.
Generated Answer:
The Statue of David was sculpted by Michelangelo.
This is the answer generated by the language model for the question
“Who sculpted the Statue of David?” based on the prompt template
with the example selector. The model uses the selected examples and
its preexisting knowledge to generate the correct answer.
The main difference between the two prompt templates is that the
one without the example selector includes all the examples, while the
one with the example selector (LengthBasedExampleSelector) selects
examples based on their length.
In both cases, the language model uses the provided examples and
its preexisting knowledge to generate appropriate answers to the input
questions. Thus, the examples serve as a guide or context for the model
to understand the type of information being asked and the expected
format of the answer.
Conclusion
You have now created a few-shot prompt template that selects relevant
examples and presents them in a structured format to the language model.
You have enabled the model to learn from the examples and generate
appropriate answers to new questions.
Output Parsers
Finally, we have output parsers. Sometimes, you want your LLM to
generate output in a specific format, like JSON or a question-answer
pattern. You can use output parsers to extract the relevant information from
the model’s response and structure it according to your needs.
Output parsers are helpful when building an application that needs to
extract specific data types from the output of an LLM. Maybe you want a
Python datetime object or a nicely formatted JSON object. You can use the
output parsers to effortlessly convert the string outputs from LLMs into the
exact data types you need or even your own custom class instances using
pydantic.
Output parsers typically perform two primary functions:
1. Get Format Instructions: This method provides guidelines to the
LLM on how the information should be structured and presented.
2. Parse: Once the LLM generates a response, this method takes that text
and structures it according to the provided instructions.
In cases where the LLM’s response doesn’t perfectly match the desired
format, output parsers offer an additional method:
1. Parse with Prompt: This method serves as a second attempt to
structure the data correctly. It takes into account the LLM’s response
and the original prompt and provides more context to refine the output.
Types of Output Parsers
LangChain has a diverse range of output parsers. Whether you need JSON
objects or you are working with CSV files, LangChain likely has a parser
for you. And a big plus is that many of these parsers support streaming,
meaning they can handle continuous data feeds:
OpenAITools Parser: This is handy when dealing with the latest
OpenAI functions. It structures the output based on the given arguments,
tools, and tool choice.
OpenAIFunctions Parser: This one is useful when using legacy OpenAI
function calling arguments like functions and function_call. It is reliable
and streams the output as JSON objects.
JSON Parser: It is one of the most reliable parsers out there, and it
returns a JSON object, which can be defined by a Pydantic model if you
like your data dressed in a particular schema.
XML Parser: Use this parser when your output needs to be in XML
format.
CSV Parser: This parser turns outputs into a neat list, perfect for
spreadsheets.
OutputFixing Parser: Sometimes, the first draft isn’t perfect. This
parser wraps another parser and steps in if there is an error, asking an
LLM to fix the output.
RetryWithError Parser: This is similar to OutputFixing but more
comprehensive, as it also considers the original inputs and instructions
and asks the LLM for a redo if there is an error.
Pydantic Parser: Pydantic helps you define the model output structure
and ensure the output fits right into the structure you want.
YAML Parser: For outputs that need to be in YAML format, particularly
useful when your data schema is defined in YAML.
And that is just a glimpse! There are parsers for DataFrames, Enums,
datetimes, and even basic structured dictionaries for simpler tasks. The key
takeaway is that LangChain’s output parsers are all about giving you control
over how you receive and use the data generated by LLMs.
Example Use Case: Suppose you are building an application to
determine the dates of scientific discoveries, and you want these dates to be
datetime objects in Python. You might face two main challenges:
1. The LLM’s output is always a string, regardless of how you interact
with it or what instructions you provide.
2. The date format can vary. The LLM might respond with “January 1st,
2020,” “Jan 1st, 2020,” or “2020-01-01.”
This is where parsers can help. They use format instructions to ensure
the output is in the correct format (e.g., “2020-01-01” for datetime) and
provide a parsing method to convert the string into the desired Python
object type.
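As a sketch of how a parser solves this, here is LangChain's DatetimeOutputParser in action; the event and model settings are illustrative:
from langchain.output_parsers import DatetimeOutputParser
from langchain.prompts import PromptTemplate
from langchain.llms import OpenAI

output_parser = DatetimeOutputParser()

prompt = PromptTemplate(
    template="When was {event} discovered?\n{format_instructions}",
    input_variables=["event"],
    partial_variables={"format_instructions": output_parser.get_format_instructions()},
)

chain = prompt | OpenAI(temperature=0) | output_parser
result = chain.invoke({"event": "penicillin"})
print(result)  # a Python datetime object rather than a raw string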
from pydantic import BaseModel, Field

class Movie(BaseModel):
    title: str = Field(description="The title of the movie")
    director: str = Field(description="The director of the movie")
    year: int = Field(description="The release year of the movie")
OutputFixingParser
LangChain has a handy tool called the OutputFixingParser. This parser
takes the original output, sends it back to the model, and asks it to fix any
formatting issues. It is like having a helpful tool that double-checks the
model’s work and makes sure it is just right.
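Here is a sketch of how you might wrap the Pydantic parser from above; misformatted_output stands in for a malformed model response:
from langchain.output_parsers import OutputFixingParser
from langchain_openai import ChatOpenAI

# Wrap the original parser; on a parsing error, an LLM is asked to repair the output
fixing_parser = OutputFixingParser.from_llm(parser=parser, llm=ChatOpenAI(temperature=0))

# Deliberately malformed (single quotes instead of valid JSON)
misformatted_output = "{'title': 'Inception', 'director': 'Christopher Nolan', 'year': 2010}"
movie = fixing_parser.parse(misformatted_output)
print(movie)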
I want to point out that as models continue to improve, they are getting
better at following instructions. Plus, there is a bit of randomness involved
in working with LLMs. So, if you are coding along with this, you might
find that the model generates the correct output right off the bat, and
everything works smoothly. That is great news! But don’t worry if you can’t
reproduce the error – you can still learn from this.
In the next section, you’ll explore the other type of prompt called “chat
prompt template,” which is perfect for building chatbots and conversational
agents.
ChatPrompt Templates
Chat prompt templates are a useful variant of the prompt templates that can
make your life easy when working with conversational tasks using large
language models (LLMs).
As you learned before, the Chat Completions API uses a list of
messages, including system messages, human messages (prompts), and AI
messages. LangChain provides some useful classes that make working with
these messages easy.
openai_api_key = os.environ.get("OPENAI_API_KEY")
if openai_api_key is None:
    openai_api_key = "your_api_key_here"  # Replace with your actual API key

chat = ChatOpenAI(temperature=0, openai_api_key=openai_api_key)

template = """
You are an enthusiastic assistant that rewrites the user's text to sound more exciting.
User: {text}
Assistant: """

prompt = ChatPromptTemplate.from_messages([
    SystemMessagePromptTemplate.from_template(
        "You are an enthusiastic assistant that rewrites the user's text to sound more exciting."
    ),
    HumanMessagePromptTemplate.from_template("{text}"),
])

formatted_prompt = prompt.format_prompt(text=user_input)
Printing the Formatted Prompt
You print a separator line and the heading “Formatted Prompt:”. You
convert the formatted prompt to a list of messages using the to_messages()
method and print it:
print("\nFormatted Prompt:")
print(formatted_prompt.to_messages())
response = chat(formatted_prompt.to_messages())
print("\nAssistant's Response:")
print(response.content)
Advanced Engineering
Here are some advanced techniques you can use:
Sentiment Analysis: You can integrate tools to assess the emotional tone
of inquiries and tailor the LLM to respond accordingly, for example, by
detecting frustration in delayed order messages.
Intent Recognition: You can use NLP techniques to differentiate
customer intents behind similar questions and make responses more
contextually appropriate.
Example-Driven Customization: Use examples such as the ones below
to train your LLM for personalized responses.
Frustrated Customer: “I have been waiting for over a week, where is
my order?”
Anxious Customer: “I’m worried about my order status. Can you update
me as soon as possible?”
Impact
If you follow the above best practices, results are sure to follow:
Enhanced Interaction: The well-crafted prompts enabled quicker, more
informative, and empathetic responses, leading to improved customer
satisfaction.
Efficiency in Resolution: The precise design reduced the need for
multiple interactions to solve the customer needs.
This approach showcases the critical role of meticulous prompt design
and ongoing refinement in leveraging LLMs to improve customer service,
demonstrating both operational efficiency and enhanced customer
experience.
Key Takeaways
Let us review the key takeaways from this chapter.
You have learned how to create detailed prompts that guide LLMs to
generate responses that are not only relevant but also contextually aligned
with your specific needs. This will help you to tap into the full potential of
these LLMs and ensure your LLM interactions are both meaningful and
productive.
You also explored how to refine prompts, conduct iterative testing, and
use parameters such as temperature and max tokens. You also learned how
to fine-tune the model’s outputs to make them more precise and customized
to specific tasks. You can now continuously improve your prompts and
adapt to new requirements as they arise.
I hope the knowledge and techniques that you learned in this chapter
will help you to tweak your prompting strategies as LLM technologies
continue to evolve.
Review Questions
Let us test your understanding of this chapter’s content.
1) What is the primary purpose of prompt engineering in the context of
using LLMs?
A. To increase the computational speed of language models
B. To guide the model to generate specific and relevant outputs
Answers
1. B. The primary purpose of prompt engineering is to guide the model to
generate specific and relevant outputs.
Further Reading
You Look Like a Thing and I Love You by Janelle Shane: This
lighthearted yet informative book shows how LLMs work in weird and
wonderful ways, including how prompts can go hilariously wrong.
OpenAI Blog (https://openai.com/blog/teaching-with-
ai): A fantastic resource for articles on the latest advancements in AI
technology, including detailed discussions on prompt engineering and LLM
capabilities.
Prompt engineering and you: How to prepare for the work of the
future (https://cloud.google.com/blog/transform/how-
to-be-a-better-prompt-engineer): Stay updated with Google’s
latest research and insights in AI, which often cover topics related to the
development and ethical use of LLMs.
Designing Data-Intensive Applications by Martin Kleppmann: While
not specifically about LLMs, this book is crucial for understanding how to
manage the data that feeds into AI systems, which is vital for prompt
engineering.
LangChain Documentation
(https://python.langchain.com/docs/modules/model_io
/prompts/): For hands-on guidance and to explore more about prompt
templates and other features, the LangChain documentation is the go-to
resource.
© The Author(s), under exclusive license to APress Media, LLC, part of Springer Nature 2024
R. Jay, Generative AI Apps with LangChain and Python
https://doi.org/10.1007/979-8-8688-0882-1_6
2. Steps (or Nodes): These are the individual tasks that the chain
executes. Each step might involve processing data, making a decision,
or interacting with an external system. You will use the steps to execute
tasks in a predefined order, and each step’s output can serve as input for
the next. In LangChain, you will leverage an LLM, a prompt, and
possibly some tools to implement these steps. You will use the LLM to
process the input, the prompt to guide the LLM’s response, and the
tools to interact with external resources.
4. End Point: This is where the chain will conclude its execution and
return the final output. The end point represents the completion of the
chain’s task, whether it is generating a response, updating a database, or
triggering another action.
Types of Chains
Now that you know the basic components, let us explore the two main types
of chains you will see in LangChain: LCEL (LangChain Expression
Language) chains and legacy chains. Think of LCEL chains as more
modern and flexible, while legacy chains are more straightforward and
suitable for less complex tasks.
LCEL Chains
LCEL stands for "LangChain Expression Language" and is the modern way
of creating chains. LCEL allows you to define chains with granular control
over each step using a domain-specific language designed specifically for
this purpose:
Flexibility: You can define complex logic and integrate various
operations seamlessly.
Scalability: Due to their modular nature, you can easily scale and modify
them, making them ideal for growing applications.
Use Case: Suppose you need to develop a system that fetches user data,
analyzes it, and then dynamically generates a personalized report. You
can use LCEL Chains to build this workflow with precision and handle
each aspect of the task effectively.
Legacy Chains
Legacy Chains are the original method of building chains in LangChain.
While they are less flexible than LCEL Chains, you will find them easier to
use because they have been pre-built to handle specific tasks.
Simplicity: These chains are straightforward to implement with less
setup and configuration required as compared to LCEL Chains.
Direct Application: You could choose them for applications where the
workflow is stable with less frequent changes or customization.
Use Case: If your application needs to perform a standard task, like
sending a formatted email response based on user queries, a legacy chain
might be the perfect fit.
# LCEL Chain
from langchain.llms import OpenAI
from langchain.prompts import PromptTemplate
from langchain.chains import load_chain
# Legacy Chain
from langchain.chains import LLMChain
from langchain.prompts import PromptTemplate
In case your code doesn’t work due to version issues, you may also try
As you can see, the legacy chain is constructed using the LLMChain
class, while the LCEL chain is loaded using the load_chain function. The
LCEL chain offers a more streamlined and flexible approach. The chain
consists of an OpenAI LLM and a prompt template. LCEL chains offer
advanced features like streaming, asynchronous execution, and automatic
observability, making them the go-to for most modern applications.
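As a concrete sketch of the two styles side by side (the import paths vary across LangChain versions, and the prompt is illustrative):
from langchain.llms import OpenAI
from langchain.prompts import PromptTemplate
from langchain.chains import LLMChain

llm = OpenAI(temperature=0.9)
prompt = PromptTemplate.from_template(
    "What is a good name for a company that makes {product}?"
)

# Legacy chain: wrap the LLM and prompt in an LLMChain
legacy_chain = LLMChain(llm=llm, prompt=prompt)
print(legacy_chain.run(product="smartphone"))

# LCEL chain: compose the same components with the pipe operator
lcel_chain = prompt | llm
print(lcel_chain.invoke({"product": "smartphone"}))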
So, which one should you choose? Let’s discuss it next.
LCEL Chains
Here are some use cases where LCEL is better suited:
1. Use LCEL when you have to define complex logic and need greater
control over the flow of data between steps.
2. Use it if you are working with large datasets or need to process data in
real time, because LCEL supports streaming, asynchronous execution,
and parallelization.
3. When you need to integrate with external APIs, databases, or services,
use LCEL chains to include custom tools and plug-ins.
Legacy Chains
Use legacy chains for the following situations:
Use legacy chains when you have a straightforward use case that doesn’t
require complex logic or custom integrations. They are ideal for quick
prototyping and experimentation.
If you are working with a well-defined and stable dataset.
When you are just starting out with LangChain and want to get a feel for
how chains work, legacy chains are a great place to begin.
Updated Explanation: To construct a chain in LangChain, you often use the
load_chain function, which initializes the chain by connecting necessary
components like language models, prompts, and tools. However, the specific
chain template, such as "llm_chain" in this example, needs to be either
created beforehand or loaded from a custom setup that you've defined in
your project. LangChain doesn't come with predefined chains like
"llm_chain" out of the box.
llm = OpenAI(temperature=0.9)
prompt = PromptTemplate(
input_variables=["product"],
template="What is a good name for a company
that makes {product}?",
)
api_tool =
APITool(api_url="https://example.com/api")
Streaming Execution
When working with large datasets, instead of waiting for the entire dataset
to be loaded into memory, you can process data as it arrives. You can handle
live data streams as well as process data in chunks.
To enable streaming execution, you can use the streaming=True
parameter when calling the chain. Here is how you do it:
chain.run({"product": "smartphone"},
streaming=True)
Async Execution
With the help of asynchronous execution, you can run multiple tasks
concurrently and make the most of your system’s resources. This way, you
can execute multiple independent tasks in parallel.
To use async execution, you can leverage the arun method provided by
LCEL chains. Here is an example:
import asyncio
async def generate_names(product):
return await chain.arun({"product": product})
product_names =
asyncio.run(generate_names("smartphone"))
Batch Execution
You can use batch execution to process multiple inputs in a single call to the
chain. This can significantly improve performance by reducing the
overhead of multiple individual calls.
To perform batch execution, you will have to pass a list of inputs to the
chain’s apply method. Here is an example:
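Here is a sketch of batch execution with an LLMChain; the product names are illustrative:
inputs = [
    {"product": "smartphone"},
    {"product": "electric car"},
    {"product": "coffee maker"},
]

# One call processes every input instead of invoking the chain repeatedly
results = chain.apply(inputs)
for result in results:
    print(result)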
Once you have observability enabled, you can easily monitor the
execution of your chain and gain valuable insights for debugging and
optimization.
print(structured_query)
In this example, we define a list of allowed operations: “search,”
“filter,” and “sort.” We then load the query_constructor_runnable chain,
passing in the allowed operations. This chain takes a natural language query
and converts it into the specified allowed operations.
You provide a natural language query: “Find all products priced below
$100 and sort them by price.” The query_constructor chain processes this
query and returns a structured query based on the allowed operations.
And here is the output you will get:
{
"operations": [
{
"operation": "search",
"query": "products"
},
{
"operation": "filter",
"condition": "price < 100"
},
{
"operation": "sort",
"key": "price"
}
]
}
llm = OpenAI(temperature=0.9)
chain = LLMChain(llm=llm, param1=value1,
param2=value2)
In this example, you simply call the run method on the chain instance,
passing in the input_data. The chain processes the input and returns the
result, which you then print.
These are just a few examples from the extensive list of legacy chains
available. Each chain has its unique strengths and use cases, allowing you
to tackle a wide range of tasks.
In this example, you initialize the retriever and the LLM, create the
ConversationalRetrievalChain, and then start a conversation loop. The chain
takes care of using the chat history to refine the queries and provide
contextual responses.
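Here is a sketch of that pattern, assuming a retriever has already been built from a vector store:
from langchain.chains import ConversationalRetrievalChain
from langchain.llms import OpenAI

llm = OpenAI(temperature=0)
qa_chain = ConversationalRetrievalChain.from_llm(llm=llm, retriever=retriever)

chat_history = []
while True:
    query = input("You: ")
    if query.lower() in ("exit", "quit"):
        break
    # Pass the question along with the accumulated chat history
    result = qa_chain({"question": query, "chat_history": chat_history})
    print("Bot:", result["answer"])
    chat_history.append((query, result["answer"]))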
llm = OpenAI(temperature=0.9)
conversation = ConversationChain(llm=llm)
loader = TextLoader("path/to/documents.txt")
index =
VectorstoreIndexCreator().from_loaders([loader])
embedding = OpenAIEmbeddings(
openai_api_key=MY_OPENAI_API_KEY)
llm = OpenAI(temperature=0.9)
chain = RetrievalQA.from_chain_type(llm=llm,
chain_type="stuff",
retriever=index.vectorstore.as_retriever())
loader = TextLoader("path/to/documents.txt")
documents = loader.load()
llm = OpenAI(temperature=0.9)
prompt = PromptTemplate(template="Summarize this
text: {text}", input_variables=["text"])
chain = MapReduceChain.from_params(
map_prompt=prompt,
combine_prompt=prompt,
llm=llm,
chunk_size=1000,
reduce_chunk_overlap=0,
)
result = chain.run(documents)
print(result)
Outcomes
Post implementation, the company saw a 30% reduction in handling time
per inquiry and improved customer satisfaction due to quicker, more
accurate responses. The Router Chain’s adaptability was enhanced by a
learning mechanism that improved its accuracy over time.
Conclusion
This case study showcases the Router Chain’s effectiveness in
streamlining customer service operations, demonstrating its potential to
significantly enhance operational efficiency and customer satisfaction in
retail settings.
conditional_chain =
ConditionalChain.from_conditions(
conditions=sentiment_conditions,
default_chain=positive_chain,
input_key="query",
output_key="response",
)
llm = OpenAI(temperature=0.9)
prompt = PromptTemplate(template="Summarize this
text: {text}", input_variables=["text"])
map_reduce_chain = MapReduceChain.from_params(
llm=llm,
map_prompt=prompt,
combine_prompt=prompt,
reduce_llm=OpenAI(temperature=0),
)
result = map_reduce_chain.run(large_dataset)
print(result)
In this example, you create a MapReduceChain and pass in a large
dataset and then apply a summarization prompt to each chunk using the
mapping function and then combine the results using the reducing function.
You use the reduce_llm parameter to specify the language model to use for
the final reduction step.
You can also use the StuffDocumentsChain in combination with a
vector store, which allows you to efficiently retrieve relevant documents
based on a query and process them in chunks. Here is an example:
embeddings = OpenAIEmbeddings()
vectorstore = FAISS.from_documents(large_dataset,
embeddings)
chain =
StuffDocumentsChain.from_llm(OpenAI(temperature=0),
document_variable_name="doc")
In this example, you create a vector store using the FAISS library and
the OpenAI embeddings. You then use the StuffDocumentsChain to retrieve
relevant documents based on the query and process them in chunks. The
document_variable_name parameter contains the variable name you use for
the input documents in the prompt template.
In this example, you use a try-except block to wrap the chain.run() call.
If an exception occurs during the execution of the chain, you catch it and
print an error message. You can add your own fallback behavior or error
handling logic based on your specific requirements.
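Here is a sketch of the pattern; input_data stands in for whatever your chain expects:
try:
    result = chain.run(input_data)
    print(result)
except Exception as e:
    # Catch failures from the chain and fall back gracefully
    print(f"An error occurred while running the chain: {e}")
    result = None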
You can also use the verbose parameter to enable verbose output, which
helps you to debug and understand what is happening under the hood. Here
is an example:
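For instance, when constructing an LLMChain you can pass the flag directly (a sketch):
chain = LLMChain(llm=llm, prompt=prompt, verbose=True)
result = chain.run(input_data)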
When you set verbose=True, you will get a detailed output of each step
in the chain execution, including the input and output of each component.
This will make it easy to identify issues and optimize your chains.
Key Takeaways
In this chapter, you have learned everything from the fundamentals of
LangChain chains to building sophisticated generative AI applications, such
as intelligent chatbots and automated analysis systems. Let us take a
moment to reflect on the key concepts we have covered so far.
Glossary
Here is a list of terms to help you grasp their meaning better:
Chain: A series of interconnected steps each designed to execute a
specific function, ultimately achieving a desired outcome within a
LangChain application.
Trigger: The initial action that starts a chain, which could be a user
request, a scheduled task, or an event within the application.
Steps (or Nodes): Individual tasks within a chain that process data, make
decisions, or interact with external systems. These are executed in a
predefined order where each step’s output can serve as input for the next.
Decision Points: Points within a chain where choices are made based on
the data processed so far. These guide the flow of the chain based on
predefined conditions.
End Point: The final stage of a chain where it concludes its execution
and delivers the final output.
Tools: Components that allow chains to interact with the outside world,
such as fetching data from APIs or querying databases.
Memory Component: Some chains include a memory feature to retain
context across multiple interactions, enhancing the continuity and
relevance of tasks.
Review Questions
These questions should help clarify the integral roles and functions of
various components within a LangChain Chain and their collaborative
contribution to executing sophisticated tasks.
1. How do decision points within a Chain influence its execution flow?
A. By triggering the start of a Chain
Answers
Here are the answers to the questions above:
1. How do decision points within a Chain influence its execution flow?
Answer: B. By altering the sequence of steps based on dynamic
conditions evaluated during the Chain’s execution
In this chapter, you will learn about Retrieval-Augmented Generation (RAG), an innovative approach
that combines retrieval-based methods with large language models (LLMs) to improve the accuracy and
relevance of generated responses. By the end of this chapter, you will understand the fundamental
concepts of RAG, how it integrates with LLMs, and how to implement RAG using LangChain,
Pinecone, and OpenAI. You will explore various strategies to enhance the RAG process for specific use
cases. You will gain practical skills for building advanced question-answering and information
retrieval systems using RAG.
Importance of RAG
Let us talk about why RAG (Retrieval-Augmented Generation) is so important. You see, many LLM
applications require user-specific data that was not part of the model’s training set. This presents several
challenges:
1. Recent Information: When you want to ask about events that occurred after the model’s knowledge
cutoff date, the model is unable to give you a meaningful response. It may even lead to
hallucinations.
2. Private Documents: When you need answers based on your own private documents, traditional
LLMs fall short.
3. Enterprise Data Security: For your business, you need to keep your sensitive information secure.
This is where RAG comes in because it allows you to retrieve external data and pass it to the LLM
during the content generation step. It addresses these challenges in several ways:
1. Up-to-Date Information: RAG can ensure your responses are current by pulling in the latest data.
2. Personalized Knowledge: It can access your private documents to provide tailored answers.
3. Data Privacy and Security: Crucially, RAG helps keep highly prized enterprise data mostly out of
the hands of LLM vendors such as OpenAI. Here is how:
Selective Data Sharing: Only the minimal amount of data required to answer a specific question
is sent to the LLM. The rest remains securely within your company’s control.
Data Residency: The bulk of your company’s data stays on your own servers or chosen cloud
infrastructure and never leaves your company’s security perimeter.
Contextual Retrieval: You can design your RAG systems to retrieve and send only nonsensitive
portions of documents and thus protect your sensitive information.
Compliance: You can ensure compliance with data protection regulations by limiting exposure of
sensitive data to third-party services.
Thus, you can strike a balance between leveraging advanced LLM capabilities and maintaining strict
data security protocols. This is why RAG is an essential technology for enterprises looking to adopt AI
solutions responsibly.
As you can see from Figure 7-1, it all starts with the Source, which is where you gather your precious
data from. This source could be either a vast collection of documents, web pages, or any other textual
data that you want to use. The initial ingestion is performed once for each source.
It is important to note, however, that data ingestion and processing as a whole is not a one-time
effort but rather an ongoing one that requires careful management:
Initial Setup: The initial ingestion of your data source is the first step in setting up your RAG system.
Regular Updates: As your source data changes or grows, note that you will need to update your
processed data accordingly.
Load Phase: This stage involves extracting data from identified sources and bringing it into your
processing environment. Key steps include the following:
Extraction: Retrieving raw data from original sources
Initial Ingestion: Storing data in your system’s working environment
Validation: Performing basic checks for completeness and integrity
Logging: Recording metadata about the loaded data
The goal is to make raw data available for further processing.
Transform Phase: Then we move on to the Transform phase, where you preprocess and prepare the
data for the next steps. You will be involved in tasks like cleaning the text, removing irrelevant
information, or even splitting the data into smaller chunks.
Embed Phase: Next is the real game changer in RAG, which is the Embed phase. This is where you
take the transformed data and convert it into a numerical representation called embeddings using text
embedding models like OpenAI’s text-embedding-ada-002. Embeddings are like a secret code that
captures the semantic meaning of the text. By embedding your data, you enable the model to
understand the relationships and similarities between different pieces of information.
Store Phase: After the Embed phase, you store the embeddings in a special data structure called a
vector database or an index. You will be using a vector database like Pinecone, Chroma, Milvus, or
Qdrant. This allows you to quickly retrieve relevant information when you need it.
Retrieve Phase: And finally, during the Retrieve phase, you will provide a query or a prompt to the
RAG model to generate the text. Using LangChain, you will generate an embedding for that question
using the same embedding model. You then use the question’s embedding to find the most similar
chunk embeddings from your vector database. This is done by ranking the vectors based on their
cosine similarity or Euclidean distance to the question’s embedding. The top-ranking vectors represent
the chunks most relevant to the question. The model then uses the query to retrieve the most relevant
information from the stored embeddings.
Once the relevant information is retrieved, the model uses it to generate new text that incorporates
the retrieved knowledge as well as its preexisting knowledge.
By following this pipeline, you can build a powerful question-answering system that leverages the
best of both worlds – the language model’s inherent knowledge and the ability to retrieve relevant
information from external sources.
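To make the pipeline concrete, here is a compact sketch using a local FAISS index; the file path and query are illustrative, and it assumes langchain-community, langchain-openai, and faiss-cpu are installed:
from langchain_community.document_loaders import TextLoader
from langchain_text_splitters import CharacterTextSplitter
from langchain_openai import OpenAIEmbeddings
from langchain_community.vectorstores import FAISS

docs = TextLoader("./sample_data/company_faq.txt").load()                                 # Load
chunks = CharacterTextSplitter(chunk_size=1000, chunk_overlap=200).split_documents(docs)  # Transform
vectorstore = FAISS.from_documents(chunks, OpenAIEmbeddings())                            # Embed + Store
relevant = vectorstore.similarity_search("What is the refund policy?", k=3)               # Retrieve
print(relevant[0].page_content)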
Try It Yourself
There are a number of RAG approaches and architectures like the RAG Token model and the RAG
Sequence model. Each has its own strengths and use cases, and I encourage you to explore them all.
Remember, the key to a successful RAG is having high-quality, relevant data and fine-tuning the
retrieval and generation components, so they work seamlessly together. It is an iterative process. Try to
experiment and iterate until you find the perfect RAG technique for your specific task.
LangChain Components
To deliver the above RAG process, LangChain offers a number of components such as
Document loaders
Text splitters
Text embedding models
Vector stores
Retrievers
Indexes
These tools will come in handy to access the most relevant and up-to-date information when building
applications such as a simple FAQ bot or a complex research assistant.
Let us discuss each one of them now.
Document Loaders
At the core of any retrieval system are document loaders. Document loaders allow you to load
documents from various sources, such as HTML, PDF, or even code files. You can choose from over 100
different document loaders from LangChain to integrate with sources such as a simple .txt file, HTML
from a public website, transcript of a YouTube video, PDF files from private S3 buckets, or code snippets
from various repositories. You can even integrate with popular providers like AirByte and Unstructured.
These documents could be stored in private S3 buckets or even on public websites.
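Those three lines look like this:
from langchain_community.document_loaders import TextLoader
loader = TextLoader("/media/file1.txt")
documents = loader.load()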
Amazing, isn’t it? With just three lines of code, you have successfully loaded the content of file1.txt
into a list of Document objects. It is that simple!
Now, let us break it down:
1. You import the TextLoader class from the langchain_community.document_loaders module for
handling plain text files.
2. You create an instance of TextLoader by passing the file path ("/media/file1.txt") as a parameter. This
tells the loader where to find your precious data.
3. Finally, you call the load() method on your loader instance, which reads the file and returns a list of
Document objects. Each Document contains the text content and metadata associated with the file.
Try It Yourself
Don’t hesitate to experiment and try out different ones. Each loader has its own unique features
and capabilities, so find the ones that best suit your needs.
Next, import the necessary modules such as the PyPDFLoader class from the
langchain_community.document_loaders module. The langchain_community package is a community-
driven extension of the main langchain library and provides additional functionality and utilities:
Next, you should call the load_and_split() method of the PyPDFLoader instance to load the PDF
document and automatically split it into individual pages or sections. The resulting pages or sections are
stored in the pages variable as a list of Document objects:
pages = loader.load_and_split()
The pages variable now contains a list of Document objects, where each object represents a page or
section of the PDF document. You can access the text content and metadata of each page using the
attributes of the Document object, such as page_content and metadata:
print(pages[0].page_content)
This will display the text content of the first page or section of the PDF document.
You can iterate over the pages list to access and process each page or section of the PDF document as
needed.
But what if you want to access the metadata of a Document? No problem! Each Document object has
a metadata attribute that stores a dictionary of metadata information. For instance, to view the metadata
of the 11th page:
print(data[10].metadata)
If you are curious about how many pages are in the loaded PDF, you can easily find out using the
len() function:
And if you want to know the number of characters on a specific page, you can do something like this:
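For example, assuming the pages list created above:
print(len(pages))                   # total number of pages/sections loaded
print(len(pages[0].page_content))   # number of characters on the first page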
Once you have the package installed, you can import the CSVLoader class and start loading your
CSV files. Here is an example:
from langchain_community.document_loaders.csv_loader import CSVLoader

loader = CSVLoader(file_path='./sample_data/california_housing_test.csv')
data = loader.load()
print(data)
Here, you create an instance of the CSVLoader class and pass the file path of our CSV file to the
file_path parameter. Then, you call the load() method to load the CSV data. The loaded data is stored in
the data variable, which contains a list of Document objects, where each object represents a row from the
CSV file.
When you print the data variable, you will see the loaded documents, including their content and
metadata. The metadata includes information like the source file path and the row number.
The CSVLoader allows you to customize how the CSV file is parsed. You can pass additional
arguments to the csv_args parameter to control the delimiter, quote character, and field names. For
example:
loader = CSVLoader(
    file_path='./example_data/mlb_teams_2012.csv',
    csv_args={'delimiter': ',', 'quotechar': '"',
              'fieldnames': ['MLB Team', 'Payroll in millions', 'Wins']},
)
data = loader.load()
print(data)
In this case, we specify the delimiter as a comma, the quote character as a double quote, and provide
custom field names for the columns. This way, you can ensure that the CSV file is parsed correctly and
the data is loaded as expected.
Below are the results you will get:
Another cool feature of the CSVLoader is the ability to specify a source column. By default, the file
path is used as the source for all documents. However, if you want to use a specific column from the
CSV file as the source, you can use the source_column parameter:
loader = CSVLoader(file_path='./example_data/mlb_teams_2012.csv', source_column="Team")
data = loader.load()
print(data)
In this example, you set the source_column to “Team,” which means that the value from the “Team”
column will be used as the source for each document. This can be particularly useful when working with
chains that answer questions using sources.
import json
from pathlib import Path
file_path = './sample_data/products.json'
data = json.loads(Path(file_path).read_text())
Check the resources section in the downloads section for the product.json file.
Here, you are using the json module to load the JSON data from a file. You do that by specifying the
file path and use Path(file_path).read_text() to read the contents of the file as a string. Then, you pass that
string to json.loads() to parse it into a Python dictionary.
You will notice that JSONLoader is a handy tool that allows you to extract specific data from a JSON
file using a jq schema. In this example, you specify the file path and provide a jq schema
(.messages[].content) to extract the values under the content field within the messages key of the JSON
data. The text_content=False parameter indicates that you don’t want to load the entire JSON file as text
content.
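Here is a sketch of that loader call; it assumes the jq package is installed, which JSONLoader relies on:
from langchain_community.document_loaders import JSONLoader

loader = JSONLoader(
    file_path=file_path,
    jq_schema=".messages[].content",
    text_content=False,
)
data = loader.load()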
Here is the output:
If you are working with a JSON Lines file, where each line represents a valid JSON object, you can
set json_lines=True in the JSONLoader constructor. This tells the loader to treat each line as a separate
JSON object. You can then specify the jq_schema to extract the desired data from each JSON object.
Here is the output:
Text Splitters
We already discussed that when dealing with large documents, we may need to transform the documents
to better suit the application. For example, when you have a lengthy document that exceeds your model’s
context window, you will need to split it into smaller and semantically meaningful chunks. That is where
text splitters come in by keeping related pieces of text together, which ensures that the model can
understand the context and provide accurate results.
You can choose from a variety of text splitters based on your specific needs. Each splitter divides text
in its own way and some will even add metadata to provide extra information about the chunks. Below
are some of the popular options:
1. RecursiveCharacterTextSplitter: You can use this splitter to recursively chop up your text based
on a list of characters you define. It will help to keep related bits of text close to each other.
2. HtmlTextSplitter: This splitter will be your go-to when you are working with HTML documents. It
splits text based on HTML-specific characters and also adds metadata about the origin of each
chunk.
3. MarkdownTextSplitter: This is similar to the HTML splitter but is designed for Markdown
documents. It splits text based on Markdown-specific characters and includes metadata about the
source of the chunk.
4. TokenTextSplitter: You will use this when splitting text based on the number of tokens. You can
decide how you want to chunk your text by using the different ways available to measure tokens.
Once you have the langchain-text-splitters package installed, you can easily create a text splitter instance and start chunking
your text. Here is an example using the CharacterTextSplitter:
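A minimal sketch of the first step, assuming the sample file lives at the path described below:
with open("./sample_data/The Art of Money Getting.txt") as f:
    art_of_money_getting = f.read()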
In the code above, you are opening a text file named “The Art of Money Getting.txt” located in the
“./sample_data” directory. You use the with statement to ensure that the file is properly closed after you
are done reading from it.
You use the read() method to read the entire contents of the file and store it in the
art_of_money_getting variable. This variable now holds the complete text of the document.
from langchain_text_splitters import CharacterTextSplitter
Here, you will import the CharacterTextSplitter class from the langchain_text_splitters package. You
use this class to split the text into smaller chunks based on a specified character or set of characters.
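Here is a sketch of the splitter configuration that the following walkthrough describes:
text_splitter = CharacterTextSplitter(
    separator="\n\n",
    chunk_size=1000,
    chunk_overlap=200,
    length_function=len,
    is_separator_regex=False,
)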
In this block of code, you are creating an instance of the CharacterTextSplitter class and configuring
it with the following parameters:
separator="\n\n": You specify the separator as two newline characters (\n\n). This means that the text
will be split whenever two consecutive newline characters are encountered.
chunk_size=1000: You set the maximum size of each chunk to 1000 characters. If a chunk exceeds this
size, it will be further split.
chunk_overlap=200: You specify an overlap of 200 characters between consecutive chunks. This
overlap helps you to maintain context between the chunks.
length_function=len: You use the built-in len function to calculate the length of each chunk in
characters.
is_separator_regex=False: You indicate that the separator is not a regular expression but a simple
string.
documents = text_splitter.create_documents([art_of_money_getting])
Here, you use the create_documents method of the text_splitter instance to split the
art_of_money_getting text into smaller chunks. The method takes a list of texts as input (in this case, you
provide a single text) and returns a list of Document objects. Each Document object represents a chunk
of text.
print(documents[4])
Finally, you print the fifth Document object from the documents list (index 4). This will display the
content and metadata of the fifth chunk of text.
The output will look something like this:
page_content='...' metadata={}
The page_content attribute contains the actual text content of the chunk, and the metadata attribute is
an empty dictionary since we didn’t provide any metadata in this example.
Overall, this code demonstrates how to use the CharacterTextSplitter from the
langchain_text_splitters package to split a long document into smaller chunks based on a specified
separator (in this case, two newline characters). The resulting chunks are stored as Document objects in
the documents list, which you can access and manipulate as needed.
In order to test how your text splitter is working, you can use the Chunkviz tool created by Greg
Kamradt that allows you to visualize how your text is being split. Using it, you can fine-tune your
splitting parameters.
Note that text splitting is just one example of the transformations you can apply to your documents
before feeding the text to an LLM. Feel free to choose from a wide range of document transformer
integrations with third-party tools that LangChain provides.
Try It Yourself
You must strike the right balance between chunk size and semantic coherence for successful text
splitting. I encourage you to experiment with different splitters and parameters to find the one that
works best for your specific use case.
Recursive Splitting
Next is recursive splitting. The RecursiveCharacterTextSplitter also splits the text based on a specified
separator character or a set of characters. However, it uses a recursive approach to split the text into
chunks. First, it starts by splitting the entire text based on the separator and creates initial chunks. If any
of the resulting chunks exceed the specified chunk_size, the splitter recursively applies the splitting
process to those chunks. And this recursive splitting continues until all the chunks are within the desired
chunk_size limit. The chunk_overlap parameter is used to maintain context between the recursively split
chunks.
This recursive approach ensures that the chunks are split more evenly, especially when you are
dealing with long paragraphs or sections that exceed the chunk_size.
The difference between the CharacterTextSplitter and RecursiveCharacterTextSplitter is subtle, and
you will find it particularly useful when you want to split the text into smaller, more manageable chunks
while also preserving the logical structure of the content.
Here is an example to illustrate the difference:
CharacterTextSplitter:
This is a sample text.
---
It consists of multiple sentences.
---
Some sentences are longer than others.
---
We will split this text into chunks.
---
RecursiveCharacterTextSplitter:
This is a sample text.
---
It consists of multiple sentences.
---
Some sentences are longer than
---
others. We will split this text
---
into chunks.
---
As you can see, the CharacterTextSplitter splits the text based on the specified separator (“. “ in this
case) and creates chunks accordingly. On the other hand, the RecursiveCharacterTextSplitter recursively
splits the chunks for those chunks that exceed the chunk_size limit, resulting in more evenly distributed
chunks.
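If you want to reproduce this kind of comparison yourself, here is a rough sketch. The sample text is made up, and the chunk_size of 30 is an assumption chosen simply to force the recursive splitter to break long sentences at word boundaries:
from langchain_text_splitters import CharacterTextSplitter, RecursiveCharacterTextSplitter

sample = ("This is a sample text. It consists of multiple sentences. "
          "Some sentences are longer than others. We will split this text into chunks.")

char_splitter = CharacterTextSplitter(separator=". ", chunk_size=30, chunk_overlap=0)
recursive_splitter = RecursiveCharacterTextSplitter(separators=[". ", " "], chunk_size=30, chunk_overlap=0)

for chunk in char_splitter.split_text(sample):
    print(chunk, "\n---")

for chunk in recursive_splitter.split_text(sample):
    print(chunk, "\n---")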
Ultimately, you will have to choose the appropriate text splitter based on your specific requirements.
If you have long paragraphs or sections that need to be split into smaller chunks while preserving the
logical structure, the RecursiveCharacterTextSplitter might be a better choice. Otherwise, the
CharacterTextSplitter is a simpler, more straightforward option for most text splitting tasks.
CodeTextSplitter
You can also use the CodeTextSplitter when working with source code files or code snippets. It helps you
to split code into meaningful chunks while factoring in the programming language’s structure and syntax
as well. Here are some examples of where you can use it:
1. Code Analysis and Understanding: When you are dealing with a large amount of code, the
CodeTextSplitter can help you break it down into manageable pieces so that you can analyze and
understand the code better.
2. Code Search and Retrieval: If you need to quickly find specific code snippets or functions, you can
use the CodeTextSplitter to create an index of code chunks. You can then find the relevant code in no
time based on keywords or criteria.
3. Code Summarization and Documentation: You can generate documentation for your code easily
with the CodeTextSplitter by splitting your code into logical units and then creating targeted
documentation. It will help you and others to understand the purpose and usage of different parts of
the codebase.
4. Code Comparison and Diff: You can easily compare different versions of your code by using
CodeTextSplitter to split the code into comparable chunks and identify the changes between
versions.
5. Code Formatting and Style Checking: You can use the CodeTextSplitter as a preprocessing step
for formatting and style checking tools. It helps you apply formatting rules and style guidelines to
specific parts of your code and ensure readability across your codebase.
Here is an example of how you can use the CodeTextSplitter to split a Python code snippet:
python_code = '''
def main():
    name = input("Enter your name: ")
    greet(name)

if __name__ == "__main__":
    main()
'''
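LangChain exposes language-aware splitting through RecursiveCharacterTextSplitter.from_language. Here is a minimal sketch, assuming the snippet above is stored in the python_code string; the small chunk_size is an assumption chosen to mirror the chunks shown below:
from langchain_text_splitters import Language, RecursiveCharacterTextSplitter

python_splitter = RecursiveCharacterTextSplitter.from_language(
    language=Language.PYTHON, chunk_size=60, chunk_overlap=0
)
for doc in python_splitter.create_documents([python_code]):
    print(doc.page_content)
    print("---")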
---
def main():
    name = input("Enter your name: ")
    greet(name)
---
if __name__ == "__main__":
    main()
---
As you can see, the CodeTextSplitter splits the Python code snippet into logical units based on
functions and code blocks. You can then easily analyze, understand, and work with your code in smaller
segments.
Splitting by Token
In the real world, sometimes you may be required to split text into chunks while factoring in the token
limits for the LLMs in use. Let us say you are required to process your company’s knowledge base
articles and feed them into your company chatbot’s training pipeline. However, you want to ensure that
each chunk of text fits within the token limit of the language model to avoid any issues during training.
First, you must install the necessary text_splitters and tiktoken packages:
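In a notebook environment, that typically looks like this:
!pip install langchain-text-splitters tiktoken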
Let us say you have a knowledge base article stored in a file called returns_policy.txt. You can split it
into chunks while keeping an eye on the token limit as shown below:
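Here is a minimal sketch; the directory, the encoding name, and the chunk sizes are assumptions you should adjust to your own setup:
from langchain_text_splitters import CharacterTextSplitter

with open("./sample_data/returns_policy.txt") as f:
    returns_policy = f.read()

# Count chunk sizes in tokens (via tiktoken) rather than characters
text_splitter = CharacterTextSplitter.from_tiktoken_encoder(
    encoding_name="cl100k_base",
    chunk_size=512,
    chunk_overlap=50,
)
chunks = text_splitter.split_text(returns_policy)
print(f"Number of chunks: {len(chunks)}")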
Try It Yourself
The choice of language model and the chunk_size and chunk_overlap parameters can be adjusted
based on your specific requirements. Go ahead and experiment with different values to find the
optimal balance between chunk size and context preservation.
Vector Stores
In this section, let us discuss vector stores in greater detail, what they are and why they are important.
Vector stores come into play when you have a bunch of data that you want to search through quickly and
efficiently. They store your data in a special format called vectors, which makes it easy to find similar
items.
Now, let us break down the process of how to use vector stores, and then we can make it real with an
example:
1. Load Source Data: Your first step is to gather all your data and load it into the vector store so that
you can search through it later. You can load data from various sources, such as text files, databases,
or even web pages.
2. Query Vector Store: Once your data is loaded, you can use a query to ask the vector store to find
the most similar items to your query. For example, you can ask the vector store to find you the items
that are most like your query, such as a search term or a question. The vector store will then, behind
the scenes, convert your query into a vector representation. It then compares this query vector with
all the vectors in the store and retrieves the most similar ones.
3. Retrieve “Most Similar”: The vector store returns the items that are most similar to your query. It is
like getting a list of the best matches. You can specify how many items you want to retrieve, and the
vector store will give you the top results. These retrieved items are the ones that have the closest
vector representations to your query vector. If you really think about it, it is almost like finding a
needle in a haystack, but the vector store makes it effortless.
Keyword Search vs. Text Embeddings It is worth noting that sometimes plain keyword search can
be more effective for straightforward queries. Recent recommendations suggest combining keyword
search with embeddings to leverage the strengths of both approaches. This hybrid method can provide
the precision of keyword search along with the depth of semantic understanding from embeddings.
Here is how you can implement this combined approach.
Keyword Search Start by using keyword search to filter the documents that contain the relevant
terms. This step can be helpful when handling large datasets and quickly identifying potential
matches.
Text Embeddings Apply text embeddings to the filtered documents to capture their semantic
meaning. Use these embeddings to rank the filtered results based on their relevance to the query.
This combination ensures that you benefit from the speed and precision of keyword search while
also leveraging the semantic depth provided by embeddings.
You can think of text embeddings as magical representations of text in the form of vectors. They capture
the semantic meaning of words and sentences, which helps you to compare them in a meaningful way. It
is like assigning coordinates to each piece of text in a high-dimensional space, where similar texts are
closer together.
You will be using the Embeddings class in LangChain when working with text embedding models. It
provides a standard interface for interacting with various embedding model providers, such as OpenAI,
Cohere, and Hugging Face. You can choose from over 25 different embedding providers and methods,
ranging from open source to proprietary APIs. No matter which provider you choose, the Embeddings
class will help you. The best part is that you can use this interface to easily switch between models, so it
is easy to find the one that suits your needs.
You set the OpenAI API key using the environment variable OPENAI_API_KEY. Make sure to
replace “your_api_key_here” with your actual OpenAI API key:
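A minimal sketch of that step:
import os

os.environ["OPENAI_API_KEY"] = "your_api_key_here"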
You initialize an instance of the OpenAIEmbeddings class called embeddings_model, which will be
our embedding model for the rest of the code:
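Continuing the walkthrough, a sketch of this step:
from langchain_openai import OpenAIEmbeddings

embeddings_model = OpenAIEmbeddings()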
You define a list of customer reviews called reviews. Each review is a string containing feedback
about a product or service:
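The review strings below are placeholders; embed_documents then turns the whole list into embeddings:
reviews = [
    "The delivery was fast and the product quality exceeded my expectations.",
    "Customer support was slow to respond, but they resolved my issue eventually.",
    "Great value for the price. I would definitely buy from this store again.",
]

embeddings = embeddings_model.embed_documents(reviews)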
You print the number of embeddings (which should be equal to the number of reviews) and the
length of each embedding:
# Print the length of the embeddings and the length of each embedding
print(f"Number of embeddings: {len(embeddings)}")
print(f"Length of each embedding: {len(embeddings[0])}")
You define a query text called query_text that asks about the positive aspects mentioned in the
customer reviews:
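For example (the exact wording is up to you):
query_text = "What positive aspects do customers mention in their reviews?"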
You use the embed_query method of the embeddings_model to embed the query text. The resulting
embedding is stored in the embedded_query variable:
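A sketch of this step:
embedded_query = embeddings_model.embed_query(query_text)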
You print the length of the embedded query and the first five elements of the embedded query:
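For example:
print(f"Length of embedded query: {len(embedded_query)}")
print(f"First five elements: {embedded_query[:5]}")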
You can embed the customer reviews and use the query text to perform various analyses and tasks,
such as the following:
Identifying common themes or topics mentioned in the reviews
Clustering similar reviews together based on their embeddings
Searching for reviews that are most relevant to a specific query or topic
Sentiment analysis to determine the overall sentiment expressed in the reviews
Comparing the embeddings of different reviews to find similarities or differences
In subsequent sections, we will see how to use these queries to store in a vector store and then get the
results for these queries.
3. batch_size (optional): If you want to embed multiple documents in batches, you can specify the
number of documents to embed between store updates. It is like telling your caching machine how
many items to process at once.
4. namespace (optional): To avoid collisions with other caches, you can provide a namespace for your
document cache. It is a unique name tag to avoid mixing caches with each other.
Now, here is an important tip: make sure to set the namespace parameter to avoid any collisions if you
are using different embedding models. You don't want your caches to get confused and start mixing up
embeddings from different models!
You import the LocalFileStore class from the langchain.storage module for storing embeddings
locally:
You import the FAISS class from the langchain.vectorstores module to implement a vector store
using the FAISS library:
Below, you are importing the OpenAIEmbeddings class from the langchain_openai module for
generating embeddings using OpenAI’s API:
Then import the CharacterTextSplitter class from the langchain.text_splitter module, which is used
for splitting text into chunks:
Lastly, you must import the CacheBackedEmbeddings class from the langchain.embeddings module
which is used for caching embeddings:
In order to generate embeddings, you must create an instance of the OpenAIEmbeddings class:
Then you must create an instance of the LocalFileStore class and specify the directory where the
cached embeddings will be stored:
# Create a local file store for caching embeddings
store = LocalFileStore("./cache/")
This is the most important step, where you create an instance of the CacheBackedEmbeddings class
and wrap the underlying OpenAI embeddings that you previously created. The embeddings are stored in the
local file store for caching. You use the namespace parameter and point it to the model name of the
underlying embeddings:
Once loaded, you split the document into chunks using the text splitter:
Here, you are creating the vector store using the FAISS vector store and pass in the document chunks
and the cached embedder:
You then perform a similarity search using the query and retrieve the top three most relevant chunks:
Finally, you will iterate over the search results and print each result, including its content and a
separator line:
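Putting the whole caching walkthrough together, here is a minimal end-to-end sketch. The module paths follow the descriptions above, the document path reuses the earlier sample file, and the TextLoader, query string, and chunk sizes are assumptions you can swap for your own:
from langchain.storage import LocalFileStore
from langchain.vectorstores import FAISS
from langchain_openai import OpenAIEmbeddings
from langchain.text_splitter import CharacterTextSplitter
from langchain.embeddings import CacheBackedEmbeddings
from langchain.document_loaders import TextLoader

# Underlying embedding model and local cache for its results
underlying_embeddings = OpenAIEmbeddings()
store = LocalFileStore("./cache/")

cached_embedder = CacheBackedEmbeddings.from_bytes_store(
    underlying_embeddings, store, namespace=underlying_embeddings.model
)

# Load and split the document
raw_documents = TextLoader("./sample_data/The Art of Money Getting.txt").load()
text_splitter = CharacterTextSplitter(chunk_size=1000, chunk_overlap=200)
documents = text_splitter.split_documents(raw_documents)

# Build the vector store with the cached embedder
db = FAISS.from_documents(documents, cached_embedder)

# Retrieve the top three most relevant chunks
results = db.similarity_search("What advice did the author give about money", k=3)
for doc in results:
    print(doc.page_content)
    print("-" * 80)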
Try It Yourself!
Remember, caching embeddings is a powerful technique that can save you time and computational
resources. So go ahead, experiment with different embedders, storage mechanisms, and namespaces.
Have fun exploring the world of cached embeddings and see how they can streamline your
development process.
Then, you must create a vector store asynchronously by calling the afrom_documents method:
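Here is a sketch of that call, assuming documents are the chunks created earlier and embeddings is your embedding model (for example, OpenAIEmbeddings()); the collection name is an arbitrary choice, and the Docker command in the comment is the standard way to run Qdrant locally:
# Start a local Qdrant server first (requires Docker):
#   docker run -p 6333:6333 qdrant/qdrant
from langchain.vectorstores import Qdrant

db = await Qdrant.afrom_documents(
    documents,
    embeddings,
    url="http://localhost:6333",
    collection_name="money_getting",
)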
You need to have docker installed for this command to run. This command will pull the Qdrant
Docker image and start a Qdrant server on your local machine, exposing it on port 6333. You can also
use a hosted Qdrant server provided by Qdrant cloud or other cloud providers.
With your vector store created, you can perform similarity searches asynchronously. Let us say you
have a query and want to find the most similar documents. Here is how you can do it:
query = "What advice did the author give about money" docs = await
db.asimilarity_search(query) print(docs[0].page_content)
The asimilarity_search method takes your query and returns a list of the most similar documents. You
can access the content of the first document using docs[0].page_content.
You can also perform similarity searches using vector embeddings directly. Check this out:
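A minimal sketch of that approach, reusing the query and the embedding model from above:
embedding_vector = embeddings.embed_query(query)
docs = await db.asimilarity_search_by_vector(embedding_vector)
print(docs[0].page_content)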
In this case, you first embed your query using the embed_query method of your embedding model.
Then, you pass the resulting embedding_vector to the asimilarity_search_by_vector method to find the
most similar documents.
Retrievers
We have already discussed how to retrieve answers to a query from a vector store. Let us discuss
retrievers in greater detail.
You can use a retriever to pass in an unstructured query and get back a list of relevant documents as
output. It helps you to quickly fetch the information you need based on your question or query.
You can use a number of retrievers from LangChain based on their unique characteristics and use
cases, so understanding them will help you choose the best one for your specific needs:
1. Vectorstore Retriever: If you are just getting started and looking for something quick and easy, the
Vectorstore Retriever is your go-to choice. It is the simplest method: you create embeddings for each
piece of text, which makes it easy to find similar ones.
2. ParentDocument Retriever: You can use ParentDocument Retriever when you have documents
with lots of smaller, distinct pieces of information that are indexed separately but can be retrieved
together. It indexes multiple chunks for each document and finds the most similar chunks based on
their embeddings. What is really helpful is that instead of returning individual chunks, it retrieves the
entire parent document for a more comprehensive result.
3. Multi-vector Retriever: Sometimes, you might want to extract additional information from
documents that you think is more relevant to index than the text itself. The Multi-vector Retriever
allows you to create multiple vectors for each document, giving you the flexibility to capture
different aspects of the content. For example, you could create vectors just based on summaries or
hypothetical questions related to the document.
4. Self-Query Retriever: Have you ever encountered situations where users ask questions that are
better answered by fetching documents based on metadata rather than text similarity? The Self-
Query Retriever uses an LLM (language model) to transform user input into two things: a string to
look up semantically and a metadata filter to apply. This is incredibly useful when questions are
more about the metadata of documents rather than their content.
5. Contextual Compression Retriever: Sometimes, retrieved documents can contain a lot of irrelevant
information that distracts the LLM. The Contextual Compression Retriever comes to your rescue by
adding a post-processing step to extract only the most relevant information from the retrieved
documents. It can be done using embeddings or an LLM to ensure that your LLM stays focused on
what matters most.
LangChain offers several other advanced retrieval types, such as Time-Weighted Vectorstore
Retriever, Multi-query Retriever, Ensemble Retriever, and Long-Context Reorder Retriever. Each of
these has its own unique strengths and use cases, allowing you to tailor your retrieval process to your
specific needs.
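Here is a minimal sketch of the simplest case, assuming vectorstore is the vector store built earlier (for example, the FAISS store); as_retriever is the standard way to turn a vector store into a retriever:
retriever = vectorstore.as_retriever()

query = "What advice did the author give about money?"
relevant_docs = retriever.get_relevant_documents(query)

for doc in relevant_docs:
    print(doc.page_content)
    print("-" * 80)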
In this example, you create an instance of the VectorstoreRetriever by passing in your vectorstore.
Then, you define a query and use the get_relevant_documents method to retrieve the most relevant
documents based on your query. Finally, you iterate over the retrieved documents and print their content.
Try It Yourself
Remember, choosing the right retriever depends on your specific use case and the nature of your data.
Go ahead and experiment and explore different options to find the one that works best for you.
Indexing
In this section, let us discuss the power of indexing to manage our documents. The LangChain indexing
API allows you to load and keep your documents in sync with a vector store and avoids duplicating
content, rewriting unchanged documents, and recomputing embeddings unnecessarily. It is all about
saving you time and money while improving your vector search results.
Under the hood, the indexing API uses a RecordManager, which keeps track of document writes into
the vector store by computing hashes for each document and storing some key information such as the
following:
The document hash (a unique fingerprint of the page content and metadata)
The write time (when the document was added or updated)
The source ID (a way to identify the original source of the document)
You can also use the deletion modes offered by the indexing API to control how existing documents
are handled when new ones are indexed. You can choose from
1. None: This mode gives you the freedom to manually manage your content as the API doesn’t do any
automatic cleanup.
2. Incremental: This mode continuously cleans up previous versions of content if the source document
or derived documents have changed.
3. Full: This mode does a thorough cleanup at the end of the indexing process by removing any
documents that are no longer present in the current indexing batch.
These deletion modes help you to ensure your vector store stays lean.
Key Takeaways
In this chapter, you explored Retrieval-Augmented Generation (RAG), a powerful technique for
enhancing large language models (LLMs) by incorporating additional context through retrieval.
Here is what we learned.
Understanding RAG: We defined RAG and understood its importance for enhancing LLMs,
especially in cases where up-to-date or domain-specific knowledge is crucial.
We saw how RAG can reduce hallucinations, fact-check, provide domain-specific knowledge, and
enhance LLM responses.
RAG Architecture: We examined the two main components of RAG: (1) indexing and (2) retrieval and
generation.
The indexing component involves loading, splitting, and storing data, while the retrieval and
generation component involves retrieving relevant data and generating an answer.
Implementing RAG: We explored practical implementations of RAG using LangChain, Pinecone,
and OpenAI.
We learned how to set up a knowledge base, implement retrieval, and develop a question-answering
application using RAG.
Review Questions
1. What is the primary purpose of Retrieval-Augmented Generation (RAG)?
A. To improve the speed of LLM responses
B. To combine retrieval-based methods with large language models to enhance the accuracy and
relevance of generated responses
4. Which LangChain component is responsible for splitting text into smaller chunks?
A. Document loader
B. Text splitter
C. Vector store
D. Retriever
B. Milvus
C. Qdrant
D. Pinecone
D. A language model
9. Which of the following is NOT a deletion mode in the LangChain indexing API?
A. None
B. Incremental
C. Full
D. Partial
10. What is the benefit of using text embeddings in an information retrieval system?
A. They increase the storage capacity of databases.
B. They capture the semantic meaning of text for comparison and retrieval.
Answers
1. B. To combine retrieval-based methods with large language models to enhance the accuracy and
relevance of generated responses.
4. B. Text splitter
5. B. To cache embeddings and avoid recalculating them
6. C. Qdrant
7. D. A language model
9. D. Partial
10. B. They capture the semantic meaning of text for comparison and retrieval.
Glossary
This glossary provides definitions for key technical terms related to Retrieval-Augmented Generation
(RAG) and information retrieval systems as discussed in this chapter.
CacheBackedEmbeddings: A wrapper around an embedder that caches the embeddings in a key-
value store, using the hashed text as the key to store and retrieve embeddings efficiently
CharacterTextSplitter: A class used to split text into smaller chunks based on a specified character or
set of characters, useful for maintaining context and ensuring manageable chunk sizes
Document Loader: A tool used to load documents from various sources, such as HTML, PDF, or code
files, into a format that can be processed by an application
Document: A container that holds a piece of text along with its associated metadata, used for
processing and analysis in information retrieval systems
Embedding: A numerical representation of text that captures its semantic meaning, allowing for
comparison and retrieval based on similarity
FAISS (Facebook AI Similarity Search): An open source library for efficient similarity search and
clustering of dense vectors, often used for implementing vector stores
Indexing: The process of organizing data to improve the speed and efficiency of information retrieval,
typically by creating a structure that allows for quick searches and updates
JSONLoader: A tool used to load JSON data and extract specific information based on a jq schema,
useful for processing and analyzing JSON files
Multi-vector Retriever: A retriever that creates multiple vectors for each document, capturing
different aspects of the content to improve the accuracy of retrieval
ParentDocument Retriever: A retriever that indexes multiple chunks for each document and finds the
most similar chunks based on their embeddings, returning the entire parent document for
comprehensive results
Qdrant: A vector store that supports async operations, used for efficient similarity search and retrieval
in information retrieval systems
RAG (Retrieval-Augmented Generation): A technique that integrates retrieval-based methods with
large language models to improve the accuracy and relevance of generated responses by incorporating
additional context through retrieval
RecursiveCharacterTextSplitter: A text splitter that uses a recursive approach to divide text into
smaller chunks based on specified separators, ensuring even distribution and maintaining logical
structure
Retriever: A tool used to fetch relevant documents based on an unstructured query, facilitating quick
access to information in an information retrieval system
Self-Query Retriever: A retriever that uses a language model to transform user input into a query
string and a metadata filter, useful for retrieving documents based on metadata rather than text
similarity
Text Embedding: A numerical representation of text in vector form, used to capture semantic meaning
and facilitate comparison and retrieval in information retrieval systems
Text Splitter: A tool used to divide text into smaller chunks, ensuring that each chunk is within a
manageable size and maintaining semantic coherence
Tokenization: The process of breaking down text into smaller units (tokens) such as words or phrases
for processing by machine learning models
Vector Store: A database that stores data in vector format, allowing for efficient similarity search and
retrieval based on vector comparisons
Vectorstore Retriever: A retriever that uses a vector store to find similar documents based on their
embeddings, suitable for quick and easy retrieval tasks
References
These references provide foundational knowledge for implementing Retrieval-Augmented Generation
(RAG) systems within the context of LangChain, Pinecone, and OpenAI technologies.
1. Lewis, P., et al. (2020). “Retrieval-Augmented Generation for Knowledge-Intensive NLP Tasks.”
arXiv:2005.11401.
This research paper introduces the RAG model and explains its architecture and performance in
knowledge-intensive tasks.
5. OpenAI Help Center. "Retrieval-Augmented Generation (RAG) and Semantic Search for GPTs.”
https://help.openai.com/en/articles/8868588-retrieval-augmented-
generation-rag-and-semantic-search-for-gpts
This is an OpenAI resource on applying RAG with OpenAI models.
© The Author(s), under exclusive license to APress Media, LLC, part of Springer Nature 2024
R. Jay, Generative AI Apps with LangChain and Python
https://doi.org/10.1007/979-8-8688-0882-1_8
In this chapter, we will explore intelligent autonomous agents and learn how to create them using the
LangChain framework. We will start by understanding the fundamental concepts of agents, their key
features, and their thought processes. Then, we will build an end-to-end agent application and cover
everything from setting up the environment to implementing memory capabilities.
Throughout the chapter, you will gain hands-on experience working with LangChain Agent and
learn how to use its powerful tools and libraries to create agents that can perceive, reason, and take
actions autonomously. I will provide step-by-step guidance, code snippets, and practical examples to
ensure you have a solid grasp of the concepts and can apply them in real-world scenarios.
Introduction
Let us understand what agents are and their thought process.
2. Domain-Specific Agents: These agents are specialized for particular fields or industries, such as
finance, healthcare, or legal domains. They have in-depth knowledge and capabilities tailored to
their specific area of expertise.
3. Simulation Agents: These agents are designed to operate in simulated environments, often used
for training, testing, or scenario planning. They can interact with virtual worlds and respond to
simulated stimuli.
4. Task-Specific Agents: Designed to excel at particular tasks like text summarization, question
answering, or code generation.
5. Multi-agent Systems: These involve multiple agents working together, often simulating complex
systems or handling intricate, multistep processes.
2. Dynamic Tool Selection: Agents can dynamically select the appropriate tools for a given query by
understanding its requirements and choosing the best tools to generate a comprehensive response.
3. Reactivity: Agents respond to changes in their environment, such as user input or new data.
5. Context Awareness: LangChain Agents can grasp the intent behind the question, identify
relevant information, and provide responses that are tailored to the specific needs of the user.
6. Integration: Agents can interact with users, other agents, external systems, tools, APIs, and data
sources to provide very rich functionality.
7. Memory: LangChain Agents can maintain context and retain information throughout a
conversation or a series of interactions to provide coherent and contextually relevant answers.
8. Customization and Extensibility: You can define your own tools, create custom Agent classes,
and extend the existing functionality to suit your specific requirements.
Perceiving
The agent receives a new support ticket via an API.
It extracts relevant details from the ticket, such as the customer’s name, issue description, and
previous interaction history.
Reasoning
The agent updates its profile with the new ticket information.
It reviews the customer’s past interactions to understand context and previous issues.
The agent plans a response by considering potential solutions and the most effective way to
address the customer’s issue.
It decides to provide troubleshooting steps or escalate the ticket based on the complexity of the
issue.
Acting
The agent sends a response to the customer with the planned troubleshooting steps.
It logs the interaction and updates the customer’s profile with the new details.
If the issue is resolved, the agent closes the ticket. If not, it provides further assistance or
escalates the issue to a human agent.
Agents aren’t limited to chatbots though. They can also be used in other scenarios such as
Task Automation: You can use Agents to handle repetitive tasks, like data entry or file
organization, which saves you time and effort.
Recommendation Systems: Agents can analyze user preferences and provide personalized
recommendations and improve user experience.
Intelligent Search: You can use Agents to retrieve the most relevant information from a vast
knowledge base using their ability to understand natural language queries.
2) Action: The agent uses the SerpAPI tool to search for the weather in New York City.
3) Result: The agent gets the results.
Reflection: The agent evaluates the results, determining if the goal has been met or if further
action is required. If needed, it repeats steps 1–3 until the goal is achieved.
More Thinking: The agent starts to think about how to send the message to the friend.
Agent: The agent determines that it needs a messaging tool and then goes on to send the message
as a text.
2. Contextual Decision-Making: Agents can make intelligent decisions based on the context of the
input data and the goals of the application. They can determine the most appropriate content
structure, tone, and length based on the target audience and the purpose of the generated content.
3. Dynamic Content Generation: With the help of Agents, you can create dynamic and adaptive
content generation systems. Agents can interact with multiple components, such as language
models, knowledge bases, and external APIs, to generate coherent and informative content that
meets the specified requirements.
4. Iterative Refinement: Agents can continuously evaluate and refine the generated content based
on predefined criteria or user feedback. They can identify areas for improvement, make
necessary adjustments, and regenerate content until it reaches the desired quality level.
In this example, you are initializing the Agent with the necessary tools and a prompt template.
The user provides a prompt, such as “What is the revenue increase due to the benefits of AI,” and the
Agent uses the prompt template to generate an engaging article on the given topic.
The Agent is smart enough to leverage the appropriate loaded tool, such as SerpAPI for retrieving
relevant information and LLM-Math for performing any required calculations, to generate
comprehensive and accurate content.
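By contrast, here is a rough sketch of a simple single-step Chain; the prompt wording and the product name are placeholders:
from langchain.llms import OpenAI
from langchain.prompts import PromptTemplate
from langchain.chains import LLMChain

llm = OpenAI(temperature=0.7)
prompt = PromptTemplate(
    input_variables=["product"],
    template="List three key benefits of {product}.",
)
chain = LLMChain(llm=llm, prompt=prompt)
print(chain.run("a smart water bottle"))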
In this example, the Chain consists of a single step: generating the benefits of a given product
using a language model (OpenAI) and a prompt template. The Chain takes the product name as input,
applies the prompt template, and produces the generated response as output.
So, what are the key differences between agents and chains?
Autonomy: Agents are autonomous, while chains are not. Agents can make decisions and take
actions without direct human intervention, whereas chains simply execute a predefined sequence of steps.
Goal-Oriented: Agents are goal-oriented, meaning they are designed to achieve specific goals or
tasks. Chains, on the other hand, are more flexible and can be used for a wide range of
applications. Note that there are also exploratory agents that are designed to explore and learn from
their environment, often without a predefined goal.
Contextual Understanding: Agents can understand the context in which they are operating,
allowing them to generate more meaningful and relevant responses. Chains, while capable of
processing context, are not designed to understand it in the same way as agents.
In the world of agents, the language model takes center stage as the decision-maker, determining
the sequence of actions to take based on the given context.
Think of it this way: in chains, the sequence of actions is predefined and hardcoded. It is like
following a recipe step by step, with no room for deviation. But with agents, the language model acts
as a reasoning engine, dynamically choosing the actions to take and the order in which to take them.
It is able to improvise and adapt the steps based on the tools and the desired outcome.
import os
from dotenv import load_dotenv
import openai
from langchain.agents import load_tools, initialize_agent
from langchain.llms import OpenAI
from langchain.chat_models import ChatOpenAI
Code Explanation
Let us now walk through the code step by step.
Installing dependencies
!pip install serpapi: First, you must install the serpapi library, which is used to interact with the
SerpAPI search engine.
!pip install google-search-results: Then you must install the google-search-results library, which
provides an alternative way to interact with the Google Search API.
Importing necessary modules
os: Import the os module to interact with the operating system.
dotenv: This module is used to load environment variables from a .env file.
openai: This module is the OpenAI API client library.
load_tools and initialize_agent from langchain.agents: Use the load_tools and initialize_agent
functions from langchain.agents to load tools and initialize an agent.
OpenAI from langchain.llms: This class represents the OpenAI language model.
Loading environment variables
load_dotenv(): Use the load_dotenv function to load the environment variables from the .env
file.
OPENAI_API_KEY = os.getenv(“OPENAI_API_KEY”): This line retrieves the OpenAI API
key from the environment variable.
os.environ[“OPENAI_API_KEY”] = “Your OpenAI API Key”: This line sets the OpenAI API
key in the environment variables.
os.environ[“SERPAPI_API_KEY”] = “Your SERPAPI key”: This line sets the SerpAPI API key
in the environment variables.
Initializing the OpenAI client
openai.api_key = OPENAI_API_KEY: This line sets the OpenAI API key for the OpenAI client.
openai.api_key = os.getenv(“OPENAI_API_KEY”): This line confirms that the OpenAI API
key is set correctly.
SERPAPI_API_KEY = os.getenv(“SERPAPI_API_KEY”): This line retrieves the SerpAPI API
key from the environment variable.
Initializing the language model (LLM)
llm = OpenAI(openai_api_key=OPENAI_API_KEY, temperature=0): Initialize the OpenAI
language model with the provided API key and set the temperature to zero for deterministic
responses.
Loading the necessary tools
tools = load_tools([“serpapi”, “llm-math”], llm=llm): Using this line, you are loading the
SerpAPI and LLM-Math tools using the load_tools function and passing the initialized language
model.
Initializing the Agent
agent = initialize_agent(tools, llm, agent=”zero-shot-react-description”, verbose=True): You
must then initialize the agent using the loaded tools, language model, and the “zero-shot-react-
description” agent type. The verbose=True argument enables verbose output.
Running the Agent with a query
query = “A software company is planning to develop a new mobile app. They estimate that the
initial development cost will be $200,000, and the app will generate a monthly revenue of
$15,000. The company wants to know how many months it will take to break even on their
investment, assuming a monthly maintenance cost of $5,000. Can you help calculate the
breakeven point?”: This line defines the query to be asked to the agent.
response = agent.run(query): You then run the agent with the provided query and store the
response.
print(response): Finally, you print the agent’s response.
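Putting the walkthrough together, the full script looks roughly like this; replace the placeholder keys with your own values:
import os
from dotenv import load_dotenv
import openai
from langchain.agents import load_tools, initialize_agent
from langchain.llms import OpenAI

# Load environment variables and set the API keys
load_dotenv()
os.environ["OPENAI_API_KEY"] = "Your OpenAI API Key"
os.environ["SERPAPI_API_KEY"] = "Your SERPAPI key"

OPENAI_API_KEY = os.getenv("OPENAI_API_KEY")
SERPAPI_API_KEY = os.getenv("SERPAPI_API_KEY")
openai.api_key = OPENAI_API_KEY

# Initialize the language model with deterministic output
llm = OpenAI(openai_api_key=OPENAI_API_KEY, temperature=0)

# Load the SerpAPI and LLM-Math tools
tools = load_tools(["serpapi", "llm-math"], llm=llm)

# Initialize the agent
agent = initialize_agent(tools, llm, agent="zero-shot-react-description", verbose=True)

query = ("A software company is planning to develop a new mobile app. They estimate that the "
         "initial development cost will be $200,000, and the app will generate a monthly revenue of "
         "$15,000. The company wants to know how many months it will take to break even on their "
         "investment, assuming a monthly maintenance cost of $5,000. Can you help calculate the "
         "breakeven point?")

response = agent.run(query)
print(response)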
2. The search results provide the basic formula for breakeven analysis, which involves dividing the
fixed costs by the difference between revenue per unit and variable costs per unit.
3. The agent identifies the information from the query to proceed with the calculation of the
breakeven period:
The initial development cost ($200,000) as the fixed cost
The monthly revenue ($15,000) as the revenue per unit
The monthly maintenance cost ($5,000) as the variable cost per unit
4. Once it got the formula, the agent is smart enough to use a calculator to apply the breakeven
analysis formula, dividing the initial development cost by the difference between the monthly
revenue and monthly maintenance cost: $200,000 / ($15,000 – $5,000).
5. The calculator returns the result of 20, indicating that it will take 20 months to break even.
6. The agent provides the final answer, stating that given the initial development cost, monthly
revenue, and monthly maintenance cost, it will take the software company 20 months to break
even on their investment in the new mobile app.
This example shows how the agent can handle a business-related query that requires both
information retrieval (searching for the breakeven analysis formula) and mathematical calculations
(applying the formula to the given values).
The agent’s thought process and actions are transparently visible in the verbose output. It shows
the step-by-step approach taken by the agent to solve the problem.
So, to recap:
Chains are best suited for fixed sequences of operations where the steps are predefined.
Agents are ideal for more complex and open-ended tasks that require dynamic decision-making
and adaptability.
Key Takeaways
In this chapter, we have explored the incredible potential of agents and how they can transform the
way we build intelligent applications. We started by understanding what agents are, their key
features, and how they differ from traditional chains. You learned that agents are autonomous entities
that can perceive, reason, and act to achieve specific goals, making them perfect for tackling complex
and dynamic tasks.
We even walked through the thought process of agents, discovering how they select tools, make
decisions, and take actions based on the queries they receive. We also discussed some examples of
how agents can be used in various domains, such as content generation, task automation, and
intelligent search.
More importantly, we got our hands dirty and built our first end-to-end working agent
application. Together, we set up the environment, loaded language models, defined tools, and created
prompts to guide the agent’s behavior.
Review Questions
Let us test your understanding of this chapter’s content.
1. What is a key feature of LangChain agents?
A. They can only execute predefined sequences of tasks.
2. Which of the following best describes the thought process of a LangChain agent?
A. Receiving inputs, generating random outputs
B. It ensures that the agent always uses the same tool for all tasks.
C. It allows the agent to choose the best tools based on the query.
6. What is the purpose of setting up environment variables when developing a LangChain agent?
A. To hardcode sensitive information into the script
7. How does the LangChain framework simplify the development of intelligent agents?
A. By providing a single model for all tasks
C. Chains require manual intervention for each step, while agents operate autonomously.
D. Agents can make dynamic decisions, while chains follow predefined sequences.
10. Which tool in LangChain helps manage conversation history for agents?
A. ChatOpenAI
B. ConversationBufferMemory
C. LLMChain
D. PromptTemplate
Answers
Below are the answers to the questions above:
1. C
2. B
3. C
4. B
5. D
6. C
7. B
8. D
9. B
10. B
Further Reading
These references will help deepen your understanding of LangChain agents and provide practical
insights into building and optimizing your AI applications:
1. Understanding Agents in LangChain
LangChain Documentation on Agents: This section provides an in-depth explanation of
what agents are, how they function, and the various use cases where they can be applied.
https://python.langchain.com/v0.1/docs/modules/agents/
© The Author(s), under exclusive license to APress Media, LLC, part of Springer Nature 2024
R. Jay, Generative AI Apps with LangChain and Python
https://doi.org/10.1007/979-8-8688-0882-1_9
In this chapter, we will be designing and implementing agents using LangChain. We will explore how to define an
agent’s objectives, manage its inputs and outputs, and use various tools and toolkits to supercharge its capabilities.
We will start by discussing how to define an agent’s objectives. This is crucial because it sets the foundation
for everything that follows. I will walk you through the process of articulating the problem your agent is meant to
solve and outlining the specific tasks it needs to perform to achieve its goals. We will also identify the tools and
resources your agent will require to get the job done.
Next, we will explore some core agent concepts like AgentAction, AgentFinish, and intermediate steps.
These components are the building blocks that allow agents to reason and act dynamically.
One of the exciting aspects of working with LangChain is the variety of agent types available. We will explore
Zero-Shot-React agents, structured chat agents, ReAct agents, and more. Each agent type has its own unique
characteristics and use cases, and I will provide you with practical, working code examples to illustrate how they
work.
As we progress through the chapter, you will get hands-on experience implementing both basic and advanced
agents. We will start with simple examples and gradually build up to more complex scenarios.
We will also explore how to integrate built-in tools and create custom tools to extend their capabilities. You
will learn about toolkits to group related tools together for specific tasks. This will help you organize your agent’s
functionality and make it more efficient.
Finally, we will discuss adding memory to your agents, which enables your agents to remember previous
interactions and maintain context over multiple conversations. I will show you how to implement memory
capabilities and use chat history to enhance your agents’ context awareness and coherence.
Learning Objectives
By the end of this chapter, you will be able to
1. Define your agent’s objectives clearly.
2. Understand core agent concepts like AgentAction, AgentFinish, and intermediate steps.
5. Use built-in tools and create custom tools to extend your agent’s capabilities.
Let us say you want to create an Agent business app that assists users in finding the best restaurants in a given
city. Your Agent’s objective could be defined as follows:
Objective: The Restaurant Recommendation Agent will help users discover top-
rated restaurants in a specified city based on their cuisine preferences and
budget constraints. The Agent will interact with users through a
conversational interface, understand their requirements, and provide
personalized restaurant recommendations along with relevant information such
as ratings, reviews, and contact details.
With this objective in mind, you can now outline the tasks your Agent needs to perform:
1. Understand user input and extract relevant information (city, cuisine, budget).
3. Apply filtering and ranking algorithms to select the best restaurants based on user preferences.
4. Generate a natural language response presenting the recommended restaurants and their details.
5. Handle follow-up questions and provide additional information as requested by the user.
Now that you have a clear objective and a set of tasks, you can start gathering the necessary tools and
resources. In this case, you might need the following:
A language model or conversational AI framework for natural language understanding and generation
A restaurant database or API to access restaurant information
Libraries for data manipulation, filtering, and ranking
Integration with external services for fetching reviews, ratings, and contact details
Concepts
Alright, let us discuss the core concepts behind agents and explore the key components that make agents tick:
1. AgentAction
AgentAction is a dataclass that represents the action an agent should take. It is like a blueprint for the
agent’s next move.
It has two important properties:
tool: This is the name of the tool that the agent should invoke. Think of it as the specific skill or
capability the agent will use.
tool_input: This is the input that should be provided to the tool. It is like giving the agent the
necessary information to perform the action effectively.
2. AgentFinish
AgentFinish represents the final result from an agent when it has completed its task and is ready to
return the output to the user.
It contains a return_values key-value mapping, which holds the final agent output. This is where the
agent’s response or solution is stored.
Typically, the return_values mapping includes an output key, which contains a string representing
the agent’s final response.
3. Intermediate Steps
Intermediate steps are like the agent’s memory of previous actions and their corresponding outputs within
the current agent run.
They are crucial for passing information to future iterations, allowing the agent to know what work it has
already done and build upon it.
Intermediate steps are represented as a list of tuples, where each tuple contains an AgentAction and its
corresponding output.
The type of the intermediate steps is List[Tuple[AgentAction, Any]]. The Any type is used for
the output to provide maximum flexibility, as the output can be of various types depending on the tool used.
In most cases, the output is a string, but it can be other types as well.
Here is a code snippet to illustrate the usage of these components. Note that this is for illustrative purposes
only.
# Create an AgentAction
action = AgentAction(
    tool="search",
    tool_input="What is the annual revenue of Amazon in 2023?",
)
In this example, you create an AgentAction specifying the tool to use (e.g., “search”) and the input to
provide to the tool. You then perform the action and store the output. The intermediate step, consisting of the
action and its output, is added to the intermediate_steps list. Finally, you create an AgentFinish with
the final output, which in this case is a string stating the annual revenue of Amazon.
You can see that agents can leverage the power of these LLMs and tools to dynamically reason and determine
the best course of action to solve a given problem. They can adapt and make decisions based on the available tools,
the input provided, and the intermediate results obtained along the way.
Agent
Alright, let us talk about the heart of the agent, namely, the chain responsible for deciding the next step to take. It
is usually powered by a language model, a prompt, and an output parser.
You should remember that different agents have their own unique styles when it comes to reasoning, encoding
inputs, and parsing outputs. LangChain provides a variety of built-in agents that you can choose from. Each has
their own strengths and characteristics. You can find a full list of these agents in the agent types documentation that
I have included in the “Further Reading” section of this chapter.
If you need more control or have specific requirements, you can easily build your own custom agents. Building
custom agents allows you to define your own prompting style, input encoding, and output parsing logic. We will
discuss this as well later in this chapter.
Let us dive into the inputs and outputs of an agent.
Agent Inputs
When it comes to the inputs of an agent, it is all about key-value pairs. The only required key is
intermediate_steps, which corresponds to the Intermediate Steps we discussed earlier. These steps
are crucial because they provide the agent with the context of what has been done so far.
But here is where the PromptTemplate comes in. It takes care of transforming these key-value pairs into a
format that can be easily understood by the language model.
Agent Outputs
Next is Agent output. The output of an agent can be either the next action(s) to take or the final response to send
back to the user. In technical terms, these outputs are represented by AgentActions or AgentFinish. You
can think of them as the agent’s decisions or the final verdict.
The output can be one of three types:
AgentAction: A single action the agent wants to take next
List[AgentAction]: A list of actions the agent wants to take next
AgentFinish: The final response the agent wants to send back to the user
It is like the agent is saying what the next step or answer is. The output parser is responsible for taking the raw
output from the language model and transforming it into one of those three types. In other words, it interprets the
agent’s thoughts and turns them into concrete actions or responses.
Here is a code snippet to illustrate the usage of agent inputs and outputs:
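This is an illustrative sketch rather than a full agent run; the query and the tool name are placeholders, and it is only meant to show the shapes of the inputs and outputs involved:
from langchain_core.agents import AgentAction, AgentFinish

# Agent inputs are key-value pairs; intermediate_steps is the only required key
inputs = {
    "input": "What is the capital of France?",
    "intermediate_steps": [],  # nothing has been done yet
}

# The agent's output is either the next action to take...
next_step = AgentAction(
    tool="search",
    tool_input="capital of France",
    log="I should look this up.",
)

# ...or the final response to send back to the user
final_step = AgentFinish(
    return_values={"output": "The capital of France is Paris."},
    log="I now know the final answer.",
)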
AgentExecutor
AgentExecutor is the core engine behind the scenes which provides the runtime for the agent to run smoothly and
efficiently. It is responsible for calling the agent, executing the actions it chooses, passing the action outputs back
to the agent, and repeating this process until the agent reaches a conclusion. It is like a loop of communication
between the agent and the executor, with the executor facilitating the flow of information and actions.
Here is a simplified pseudocode representation of how the AgentExecutor works, again for illustrative purposes
only:
next_action = agent.get_action(...)
while next_action != AgentFinish:
observation = run(next_action)
next_action = agent.get_action(..., next_action, observation)
return next_action
It may seem straightforward, but the AgentExecutor handles several complexities behind the scenes to make
your life easier. Let us review some of the scenarios:
1. When the agent selects a nonexistent tool, the executor gracefully handles the situation and keeps the agent on
track.
2. If a tool encounters an error during execution, the executor catches the exception and manages it appropriately
to ensure the agent can continue its work.
3. In cases where the agent produces output that cannot be parsed into a valid tool invocation, the executor
handles the situation and guides the agent back to a valid path.
4. The executor provides comprehensive logging and observability at all levels, such as agent decisions and tool
calls. It can output this information to stdout and/or send it to LangSmith for further analysis and visualization.
Tools
Tools are interfaces that an agent, chain, or LLM (large language model) can use to interact with the world. They
combine a few essential elements:
1. Name of the Tool: A concise, descriptive label that tells you what the tool does.
3. JSON Schema: A structured definition of the inputs required by the tool. Think of it as a blueprint for how to
use the tool correctly.
4. The Function to Call: The actual code that executes the tool’s action.
5. Flag: A flag that determines if the tool’s output should be immediately visible or processed further.
The name, description, and JSON schema help the LLM understand how to specify the desired action, while
the function to call is the equivalent of actually taking that action.
A Tool abstraction in LangChain consists of two key components:
1. The Input Schema for the Tool: This is like a blueprint that tells the language model (LLM) what parameters
are needed to call the tool. It is crucial to provide sensibly named and well-described parameters, so the LLM
knows exactly what inputs to provide when invoking the tool.
2. The Function to Run: This is the actual Python function that gets executed when the tool is invoked. It is the
code that performs the desired action based on the provided inputs.
One important thing you should keep in mind is that the simpler the input to a tool, the easier it is for an LLM
to use it. I recommend using tools that have a single string input, because agents work well with them.
LangChain has documentation on which agent types can handle more complex inputs. Please see the “Further
Reading” section for the link to the documentation.
Now, let us use the WikipediaQueryRun tool, which is a handy wrapper around Wikipedia:
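First, create the tool; the wrapper settings shown here (one result, truncated content) are just reasonable defaults, and the wikipedia package must be installed:
from langchain_community.tools import WikipediaQueryRun
from langchain_community.utilities import WikipediaAPIWrapper

api_wrapper = WikipediaAPIWrapper(top_k_results=1, doc_content_chars_max=100)
tool = WikipediaQueryRun(api_wrapper=api_wrapper)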
# Call the tool with a single string input (since it expects only one input)
print(tool.run("langchain"))
# Output: 'Page: LangChain\nSummary: LangChain is a framework designed to simplify the creation of applications '
But what if we want to customize the tool’s name, description, or JSON schema? Let us create a custom
schema for the Wikipedia tool:
from langchain_core.pydantic_v1 import BaseModel, Field

class WikiInputs(BaseModel):
    """Inputs to the wikipedia tool."""
    query: str = Field(
        description="query to look up in Wikipedia, should be 3 or less words"
    )

# Now, let's create a new instance of the tool with our custom settings
tool = WikipediaQueryRun(
    name="wiki-tool",
    description="look up things in wikipedia",
    args_schema=WikiInputs,
    api_wrapper=api_wrapper,
    return_direct=True,
)
You have just learned how to work with built-in tools and customize them to your liking. LangChain offers a
wealth of resources to help you on your journey:
Built-In Tools: Check out the official documentation for a comprehensive list of all built-in tools.
Custom Tools: While built-in tools are handy, you will likely need to define your own tools for your specific
use cases. LangChain provides a guide on how to create custom tools.
Toolkits: Toolkits are collections of tools that work well together. The documentation offers an in-depth
description and a list of all built-in toolkits.
Tools As OpenAI Functions: Tools in LangChain are similar to OpenAI Functions, and you can easily convert
them to that format. Check out the official notebook for instructions on how to do that.
Toolkits
Toolkits are carefully curated collections of tools that are designed to work seamlessly together for specific tasks.
Sometimes, accomplishing a task requires a set of related tools working together. That is where toolkits come in
handy. They come with handy loading methods, which makes it easier to get started with the tools you need. For
example, the GitHub toolkit includes tools for searching through GitHub issues, reading files, commenting on
issues, and more. LangChain provides a comprehensive list of ready-made toolkits, and you can locate them in the
Integrations section of the documentation.
First, you need to initialize the toolkit you want to use. Let us say you are working with the
ExampleToolkit (it is just a placeholder for now):
toolkit = ExampleToolkit(...)
Every toolkit exposes a get_tools method, which returns a list of the tools contained within that toolkit.
Here is how you can access them:
tools = toolkit.get_tools()
Now, with these tools at your disposal, you can create an agent that can harness their collective power.
LangChain provides agent constructor functions, such as create_react_agent, that allow you to do just that. Simply pass in your LLM (large language model), the list of tools, and a prompt (if needed), and you have got yourself an agent ready to tackle any task, as sketched below:
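Here is a minimal sketch, assuming a ReAct-style agent, a prompt pulled from the LangChain hub, and an OpenAI model:

from langchain import hub
from langchain.agents import AgentExecutor, create_react_agent
from langchain_openai import OpenAI

llm = OpenAI()
prompt = hub.pull("hwchase17/react")

tools = toolkit.get_tools()
agent = create_react_agent(llm, tools, prompt)
agent_executor = AgentExecutor(agent=agent, tools=tools, verbose=True)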
With this agent at your command, you can effortlessly orchestrate the tools to perform complex tasks,
streamline workflows, and achieve results that would have been otherwise challenging or time-consuming.
The Integration section of LangChain’s documentation provides a comprehensive list of ready-made toolkits.
You have toolkits ranging from web scraping, working with databases, social media platforms, and data processing
to natural language processing and more. By combining the right tools, you can create powerful workflows and
automate complex tasks with ease.
Considerations
When working with tools, there are two important design considerations to keep in mind:
1. Giving the Agent Access to the Right Tools: It is essential to equip your agent with the necessary tools to
accomplish its objectives. Without the right set of tools, your agent will be limited in its capabilities and may
struggle to complete the tasks at hand.
2. Describing the Tools in a Way That Is Most Helpful to the Agent: The way you describe the tools plays a
crucial role in how effectively the agent can use them. You should remember to provide clear descriptions that
explain the purpose of each tool, so the agent can make informed decisions on when and how to use them.
What Is LangGraph?
LangGraph allows you to structure information in a graph format, where nodes represent pieces of data or tasks,
and edges represent the relationships between them. This structure makes it easier for agents to handle complex
workflows, understand context, and perform multistep tasks efficiently.
Imagine you have an agent that needs to perform a series of tasks based on user input. Without a structured
way to manage these tasks, your code can get messy and hard to maintain. LangGraph helps you by organizing
tasks and data in a way that is both logical and scalable.
Setting Up LangGraph
First, let us set up your environment. Make sure you have LangChain installed:
1. Install LangChain: If you haven’t already, install LangChain using pip.
2. Import Necessary Modules: You will import the classes you need from LangChain.
llm = OpenAI(api_key="your_openai_api_key")
2. Create the Graph: Link the nodes together to form the workflow.
3. Define the Agent: Create an agent that uses this graph to interact with users.
class TravelAgent(Agent):
    def __init__(self, llm, graph):
        super().__init__(llm=llm)
        self.graph = graph

    def run(self, user_input):
        # Ask for the destination and respond
        current_node = self.graph.get_node("GetDestination")
        response = current_node.action(user_input)
        print(response)

        # Suggest activities for that destination
        current_node = self.graph.get_node("SuggestActivities")
        response = current_node.action()
        print(response)

agent = TravelAgent(llm, graph)
agent.run("Hawaii")
In this illustrative example, the agent starts by greeting the user, then asks for a destination, and finally
suggests some activities based on the destination. As you can see, the graph structure makes it easy to manage
these steps and ensures that the agent follows a logical sequence.
Agent Types
LangChain offers a variety of agent types, each with its own unique characteristics and capabilities that you can
choose from, depending on your specific needs and the models you are working with.
2. OpenAI Tools
Intended Model Type: Chat
Supports Chat History: ✅
Supports Multi-input Tools: ✅
Supports Parallel Function Calling: ✅
Required Model Params: tools
When to Use: [Legacy] If you are using a recent OpenAI model (1106 onward). Generic Tool Calling
agent recommended instead
3. OpenAI Functions
Intended Model Type: Chat
Supports Chat History: ✅
Supports Multi-input Tools: ✅
Supports Parallel Function Calling: ❌
Required Model Params: functions
When to Use: [Legacy] If you are using an OpenAI model or an open source model that has been fine-tuned
for function calling and exposes the same functions parameters as OpenAI. Generic Tool Calling agent
recommended instead
4. XML
Intended Model Type: LLM
Supports Chat History: ✅
Supports Multi-input Tools: ❌
Supports Parallel Function Calling: ❌
Required Model Params: None
When to Use: If you are using Anthropic models, or other models good at XML
5. Structured Chat
Intended Model Type: Chat
Supports Chat History: ✅
Supports Multi-input Tools: ✅
Supports Parallel Function Calling: ❌
Required Model Params: None
When to Use: If you need to support tools with multiple inputs
6. JSON Chat
Intended Model Type: Chat
Supports Chat History: ✅
Supports Multi-input Tools: ❌
Supports Parallel Function Calling: ❌
Required Model Params: None
When to Use: If you are using a model good at JSON
7. ReAct
Intended Model Type: LLM
Supports Chat History: ❌
Supports Multi-input Tools: ❌
Supports Parallel Function Calling: ❌
Required Model Params: None
When to Use: If you are using a simple model
2. Conversation Agent: As the name suggests, you will use this Agent for conversational AI applications such as
a chatbot or a virtual assistant. It can engage in back-and-forth dialogue, maintain context, and provide
relevant responses based on the conversation history.
3. Structured Tool Agent: Sometimes, you have a specific set of tools or actions that you want your Agent to
use in a structured manner. In such situations, you can employ the Structured Tool Agent to define a
predetermined sequence of actions and tools which will guide the Agent’s behavior.
4. MRKL Agent: The MRKL (Modular Reasoning, Knowledge and Language) Agent is a powerful addition to the LangChain family. It routes each query to the most relevant tools or expert modules, combining the LLM's reasoning with external knowledge sources. You can use it to prioritize and choose the best course of action.
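A minimal sketch of such an example, assuming the legacy load_tools and initialize_agent APIs and a SerpAPI key set in the SERPAPI_API_KEY environment variable:

from langchain.agents import initialize_agent, load_tools
from langchain_openai import OpenAI

llm = OpenAI(temperature=0)

# load_tools wires up built-in tools by name
tools = load_tools(["serpapi", "llm-math"], llm=llm)

agent = initialize_agent(
    tools,
    llm,
    agent="zero-shot-react-description",
    verbose=True,
)

agent.run("Who is the CEO of Apple, and what is 2 raised to the power of 10?")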
In this example, you load the required tools (serpapi and llm-math) and initialize the Zero-Shot-React
Agent with those tools and a language model (OpenAI). You then provide a query to the Agent, and it generates a
response by leveraging the appropriate tools based on the query’s requirements.
Setup
To get started with tool calling, you will need a model that supports it. LangChain offers a wide range of options,
including Anthropic, Google Gemini, Mistral, and OpenAI. You can check out the supported models in the
LangChain documentation.
For this demo, you will be using Tavily, but feel free to swap in any other built-in tool or even add your own custom tools. To use Tavily, you will need to sign up for an API key and set it in the TAVILY_API_KEY environment variable.
First, let us set the required API keys as environment variables:
import getpass
import os
os.environ["OPENAI_API_KEY"] = getpass.getpass()
os.environ["TAVILY_API_KEY"] = getpass.getpass("Enter your Tavily API key:
")
Now, you will import the ChatOpenAI class from langchain_openai and create an instance of the
language model:
from langchain_openai import ChatOpenAI

llm = ChatOpenAI(model="gpt-4")
Initializing Tools
Let us create a tool that can search the Web using Tavily:
from langchain_community.tools.tavily_search import TavilySearchResults
from langchain_core.prompts import ChatPromptTemplate

tools = [TavilySearchResults(max_results=1)]

prompt = ChatPromptTemplate.from_messages(
    [
        (
            "system",
            "You are a helpful assistant. Make sure to use the tavily_search_results_json tool for information.",
        ),
        ("placeholder", "{chat_history}"),
        ("human", "{input}"),
        ("placeholder", "{agent_scratchpad}"),
    ]
)
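A minimal sketch of the agent and executor construction, assuming the generic create_tool_calling_agent constructor:

from langchain.agents import AgentExecutor, create_tool_calling_agent

agent = create_tool_calling_agent(llm, tools, prompt)
agent_executor = AgentExecutor(agent=agent, tools=tools, verbose=True)

# A query that actually exercises the search tool
agent_executor.invoke({"input": "what is LangChain?"})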
Watch as your agent springs into action, using the Tavily search tool to find information and provide a concise summary. You can also pass in chat history, in which case the agent answers follow-up questions from context instead of calling a tool:
agent_executor.invoke(
{
"input": "what's my name? Don't use tools to look this up unless you
NEED to",
"chat_history": [
HumanMessage(content="hi! my name is Rabi Jay"),
AIMessage(content="Hello Rabi Jay! How can I assist you
today?"),
],
}
)
In this case, the agent remembers the previous conversation and responds accordingly:
Based on what you told me, your name is Rabi Jay. I don't need to use any
tools to look that up since you directly provided your name.
You have just witnessed the power of tool calling agents in action. They can intelligently choose and use tools,
provide structured outputs, and even engage in conversational interactions using chat history.
OpenAI Tools
Let us talk about an exciting feature from OpenAI called “tools,” which help your agent to detect when it should
call one or more functions and respond with the appropriate inputs. Newer OpenAI models have been fine-tuned
for this capability, making your agent smarter and more efficient.
In an API call, you can describe functions to your agent, and it will intelligently choose to output a JSON
object containing the arguments needed to call those functions.
The goal of OpenAI tools is to ensure that your agent reliably returns valid and useful function calls, going
beyond what a generic text completion or chat API can do. It makes your agent more precise and effective.
OpenAI has two related concepts: “functions” and “tools.” Functions allow your agent to invoke a single
function, while tools enable it to invoke one or more functions when appropriate. In the OpenAI Chat API,
functions are now considered a legacy option and are deprecated in favor of tools.
So, if you are creating agents using OpenAI models, you should be using the OpenAI Tools agent instead of
the OpenAI Functions agent.
Using tools has a significant advantage because it allows the model to request that more than one function be
called when appropriate. This can help reduce the time it takes for your agent to achieve its goal, making it more
efficient and effective.
Now, let us look into the code and see how you can create an OpenAI Tools agent in action.
First, make sure you have the necessary libraries installed:
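Assuming the packages used in this example, the install command would be along these lines:

pip install -U langchain langchain-openai langchain-community langchainhub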
Initializing Tools
For this example, you will give our agent the ability to search the Web using Tavily:
from langchain import hub
from langchain_community.tools.tavily_search import TavilySearchResults

tools = [TavilySearchResults(max_results=1)]

prompt = hub.pull("hwchase17/openai-tools-agent")
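A minimal sketch of the agent and executor construction, assuming the create_openai_tools_agent constructor and a recent ChatOpenAI model:

from langchain.agents import AgentExecutor, create_openai_tools_agent
from langchain_openai import ChatOpenAI

llm = ChatOpenAI(model="gpt-3.5-turbo-1106", temperature=0)
agent = create_openai_tools_agent(llm, tools, prompt)
agent_executor = AgentExecutor(agent=agent, tools=tools, verbose=True)

agent_executor.invoke({"input": "what is LangChain?"})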
Watch as your agent springs into action, using the Tavily search tool to find information about LangChain and provide a concise summary. As before, you can also pass in chat history so the agent can answer follow-up questions without calling a tool:
agent_executor.invoke(
{
"input": "what's my name? Don't use tools to look this up unless you
NEED to",
"chat_history": [
HumanMessage(content="hi! my name is Rabi Jay"),
AIMessage(content="Hello Rabi Jay! How can I assist you today?"),
],
}
)
In this case, the agent remembers the previous conversation and responds accordingly:
You have just witnessed the power of OpenAI Tools agents in action. They can intelligently choose and use
functions, provide structured outputs, and even engage in conversational interactions using chat history.
Initializing Tools
For this example, you will be testing the agent using the Tavily Search tool to search for information online. This
line creates a list of tools that the agent will have access to. You set the max_results parameter to 1 to indicate that
the tool will return a maximum of one search result:
tools = [TavilySearchResults(max_results=1)]
prompt = hub.pull("hwchase17/structured-chat-agent")
prompt.messages[0].prompt.template = """
You are a business analyst assistant tasked with helping entrepreneurs and
business owners make informed decisions.
Use the provided search tool to find relevant information and answer the
user's questions as best as you can.
If the question cannot be answered using the search results, provide
guidance on where the user can find more information.
Here are some examples of the types of business-related questions you may be
asked:
- What are some effective marketing strategies for a small business?
- How can I improve my company's cash flow?
- What are the key steps in creating a business plan?
- How do I conduct market research for a new product idea?
- What are some common challenges faced by startups, and how can they be overcome?
"""
Finally, you will construct the agent by calling the create_structured_chat_agent function, passing
in the LLM, tools, and prompt:
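A minimal sketch of that call, reusing the ChatOpenAI model created earlier in the chapter:

from langchain.agents import create_structured_chat_agent

agent = create_structured_chat_agent(llm, tools, prompt)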
def process_llm_output(output):
    if "Invalid or incomplete response" in output:
        raise ValueError("The language model generated an invalid or incomplete response.")
    return output
agent_executor = AgentExecutor(
agent=agent,
tools=tools,
verbose=True,
handle_parsing_errors=True,
max_iterations=3,
early_stopping_method="force",
)
Finally, you use a try-except block to execute the agent and handle any potential errors. You invoke the
agent using agent_executor.invoke({“input”:question}) and pass the question as input. The agent generates a
result, which you process using the process_llm_output() function. If the output is valid, you print the
processed result. If a ValueError is raised due to an invalid or incomplete response, you catch the exception
and print an error message.
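A sketch of that invocation, with a question inferred from the sample output shown below:

question = "How can I reduce operational costs in my manufacturing business?"
try:
    result = agent_executor.invoke({"input": question})
    print(process_llm_output(result["output"]))
except ValueError as e:
    print(f"Error: {e}")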
Watch as the agent uses Tavily Search to research the question and returns a structured set of cost-reduction strategies. You can ignore any errors for now, as they are out of scope for this exercise. You have now learned how to set up and use a structured chat agent with LangChain to answer business-related questions.
> Entering new AgentExecutor chain...
{
"response": "Reducing operational costs in a manufacturing business can be
achieved through various strategies:",
"strategies": [
{
"title": "Lean Manufacturing",
"description": "Implement lean manufacturing principles to eliminate
waste, improve efficiency, and reduce costs. This involves streamlining
processes, optimizing inventory levels, and minimizing downtime."
},
{
"title": "Energy Efficiency",
"description": "Invest in energy-efficient equipment and processes to
lower utility expenses. Conduct an energy audit to identify areas for
improvement and consider renewable energy sources."
},
{
"title": "Supplier Negotiation",
"description": "Negotiate with suppliers for better pricing,
discounts, or favorable payment terms. Consolidate purchases with key
suppliers to leverage volume discounts."
},
{
"title": "Inventory Management",
"description": "Optimize inventory levels to reduce carrying costs and
minimize the risk of excess or obsolete inventory. Implement just-in-time
inventory practices where feasible."
},
{
"title": "Outsourcing Non-Core Activities",
"description": "Consider outsourcing non-core functions such as
janitorial services, maintenance, or certain manufacturing processes to
specialized third-party providers to reduce overhead costs."
},
{
"title": "Process Automation",
"description": "Invest in automation technologies to streamline
production processes, improve accuracy, and reduce labor costs. This may
involve robotics, automated assembly lines, or software systems."
}
]
}
ReAct Agent
Next is the ReAct agent, a powerful tool that allows you to implement the ReAct logic in your AI applications. It
enables the agent to reason and act based on the information it gathers.
To get started, make sure you have the necessary libraries installed. In this case, you will be using LangChain,
Tavily Search, and OpenAI. Here is how you can import them:
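Assuming the standard package layout, the imports would be:

from langchain import hub
from langchain.agents import AgentExecutor, create_react_agent
from langchain_community.tools.tavily_search import TavilySearchResults
from langchain_openai import OpenAI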
Initializing Tools
First, let us load some tools for our ReAct agent to use. In this example, you will be using Tavily Search to allow
your agent to search for information online:
tools = [TavilySearchResults(max_results=1)]
prompt = hub.pull("hwchase17/react")
Next, you will choose the language model (LLM) to use. In this case, let us go with OpenAI:
llm = OpenAI()
Finally, construct the ReAct agent by calling the create_react_agent function, passing in the LLM,
tools, and prompt:
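A minimal sketch, continuing from the imports above and including the executor that runs the agent:

agent = create_react_agent(llm, tools, prompt)
agent_executor = AgentExecutor(agent=agent, tools=tools, verbose=True)

agent_executor.invoke({"input": "what is LangChain?"})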
Watch as the agent goes through a series of thoughts and actions to gather information about LangChain:
I should read the summary and look at the different features and
integrations of LangChain.
Action: tavily_search_results_json
Action Input: "LangChain features and integrations"
prompt = hub.pull("hwchase17/react-chat")
To use chat history, you can pass in a string representing previous conversation turns. Here is an example:
agent_executor.invoke(
    {
        "input": "what's my name? Only use a tool if needed, otherwise respond with Final Answer",
        # Notice that chat_history is a string, since this prompt is aimed at LLMs, not chat models
        "chat_history": "Human: Hi! My name is Rabi\nAI: Hello Rabi! Nice to meet you",
    }
)
In this case, the agent will reason about whether it needs to use a tool or not based on the given chat history:
Note You may also get different messages depending on what model you use. Here is one such example:
As an AI, I don’t have access to personal data about individuals unless it has been shared with me in the course
of our conversation. I am designed to respect user privacy and confidentiality. Therefore, I don’t know the
user’s name.
Final Answer: I’m sorry, but I don’t have access to that information.
{‘input’: “what’s my name? Only use a tool if needed, otherwise respond with Final Answer”,
‘chat_history’: ‘Human: Hi! My name is Rabi\nAI: Hello Rabi! Nice to meet you’,
Self-Ask Agents
Let us look at self-ask agents with search capabilities to find answers to your burning questions.
To get started, make sure you have the necessary tools in your toolkit. In this case, you will be using
LangChain, Fireworks LLM, and Tavily Answer. Go ahead and import them:
from langchain import hub
from langchain.agents import AgentExecutor,
create_self_ask_with_search_agent
from langchain_community.llms import Fireworks
from langchain_community.tools.tavily_search import TavilyAnswer
Initializing Tools
Now, initialize the tools your self-ask agent will use. For this agent, you will be using Tavily Answer, which
provides you with direct answers to your questions.
One important thing to note is that this agent can only use one tool, and it must be named “Intermediate
Answer.” So, let us set it up:
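A minimal sketch of that setup, renaming the Tavily Answer tool to the required name:

tools = [TavilyAnswer(name="Intermediate Answer", description="Answer Search")]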
prompt = hub.pull("hwchase17/self-ask-with-search")
Next, choose the LLM that will power your agent’s thinking process. In this example, you will go with
Fireworks LLM:
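A sketch of the model and agent construction (Fireworks reads its API key from the FIREWORKS_API_KEY environment variable):

llm = Fireworks()

agent = create_self_ask_with_search_agent(llm, tools, prompt)
agent_executor = AgentExecutor(agent=agent, tools=tools, verbose=True)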
Now, ask your agent a question and watch it work its magic:
agent_executor.invoke(
{"input": "What is the headquarters location of the company with the
largest market capitalization in the tech industry?"}
)
The agent will start by asking itself a follow-up question to gather more information:
Yes.
Follow up: Which company has the largest market capitalization in the tech
industry?
Using the Tavily Answer tool, the agent will find the answer to its own question:
As of June 18, 2024, the company with the largest market capitalization is
NVIDIA.
Armed with this information, the agent will then provide the final answer:
The self-ask with search agent has successfully found the answer to your question by breaking it down into
smaller steps and using the available tools.
from langchain.agents import Tool, initialize_agent
from langchain_community.utilities import GoogleSearchAPIWrapper
from langchain_openai import OpenAI

llm = OpenAI(temperature=0)
search = GoogleSearchAPIWrapper()
You are using the OpenAI language model and the Google Search API to enable the agent to search for
information online.
tools = [
Tool(
name="Search",
func=search.run,
description="Useful for searching the internet for information."
)
]
In this example, you define a single tool called “Search” that allows the agent to perform Internet searches
using the Google Search API.
agent = initialize_agent(
tools,
llm,
agent="zero-shot-react-description",
verbose=True
)
You initialize the agent with the defined tools, language model, and the “zero-shot-react-description” agent
type, which enables the agent to autonomously decide which tools to use based on the user’s request.
6. Provide a task to the agent:
task = "I need to plan a trip to Paris. What are the top tourist attractions I should visit, and what is the best time of year to go?"
result = agent.run(task)
print(result)
You give the agent a task related to planning a trip to Paris. The agent will autonomously break down the task,
search for relevant information, and provide a comprehensive response.
When you run this code, the agent will autonomously process the task and provide a response similar to the
following:
To plan your trip to Paris, here are the top tourist attractions you should
visit and the best time of year to go:
- Spring (March to May): Mild weather, beautiful blooms, and fewer crowds
compared to summer. Ideal for outdoor activities and sightseeing.
- Summer (June to August): Warm to hot weather, long days, and peak tourist
season. Perfect for outdoor events and festivals, but expect larger crowds
and higher prices.
- Fall (September to November): Pleasant weather, fewer tourists, and
beautiful autumn foliage. Great for sightseeing, cultural events, and food
festivals.
- Winter (December to February): Cold weather, shorter days, and festive
holiday decorations. Suitable for indoor activities, museums, and Christmas
markets. Expect lower prices and fewer crowds.
Ultimately, the shoulder seasons of spring and fall offer a good balance of
pleasant weather, manageable crowds, and reasonable prices. However, Paris
is a year-round destination with unique charms in every season.
export LANGCHAIN_TRACING_V2="true"
export LANGCHAIN_API_KEY="<your-api-key>"
Make sure to replace <your-api-key> with your actual LangSmith API key.
Defining Tools
We will equip your agent with two powerful tools – Tavily for online search and a retriever for querying a local
index.
Tool 1: Tavily
Tavily is a built-in tool in LangChain that allows your agent to search the Web effortlessly and gives access to a
vast knowledge base. To use Tavily, you will need an API key. They offer a free tier, but if you don’t have one or
don’t want to create one, feel free to skip this step.
Once you have your Tavily API key, export it as an environment variable:
export TAVILY_API_KEY="..."
from langchain_community.tools.tavily_search import TavilySearchResults

search = TavilySearchResults()
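The query itself would be something like:

search.invoke("what is the weather in SF")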
This will return a list of search results related to the weather in San Francisco.
Tool 2: Retriever
In addition to online search, create a retriever that allows your agent to look up information from a local index.
This is particularly useful when you have specific data that you want your agent to access quickly.
To create the retriever, follow these steps:
1. Load the data using WebBaseLoader:
from langchain_community.document_loaders import WebBaseLoader
from langchain_text_splitters import RecursiveCharacterTextSplitter

loader = WebBaseLoader("https://docs.smith.langchain.com/overview")
docs = loader.load()
documents = RecursiveCharacterTextSplitter(
    chunk_size=1000, chunk_overlap=200
).split_documents(docs)
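The embedding and vector store steps come next; a minimal sketch, assuming FAISS (which requires the faiss-cpu package) and OpenAI embeddings:

from langchain_community.vectorstores import FAISS
from langchain_openai import OpenAIEmbeddings

vector = FAISS.from_documents(documents, OpenAIEmbeddings())
retriever = vector.as_retriever()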
Now you have a retriever that can search for information within the indexed documents. You can invoke the
retriever with a query:
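For example (the exact query is an assumption):

retriever.invoke("how to upload a dataset")[0]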
This will return the most relevant document chunk based on the query.
from langchain.tools.retriever import create_retriever_tool

retriever_tool = create_retriever_tool(
    retriever,
    "langsmith_search",
    "Search for information about LangSmith. For any questions about LangSmith, you must use this tool!",
)
This creates a retriever tool with a specific name and description, making it more intuitive for your agent to
understand when and how to use it.
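With both tools defined, gather them into a single list for the agent:

tools = [search, retriever_tool]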
This allows your agent to perform online searches and look up information from a local index.
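Based on the description that follows, the model would be created like this:

from langchain_openai import ChatOpenAI

llm = ChatOpenAI(model="gpt-3.5-turbo-0125", temperature=0)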
Here, you are using the gpt-3.5-turbo-0125 model with a temperature of 0. Feel free to adjust these
parameters based on your specific requirements.
prompt = hub.pull("hwchase17/openai-functions-agent")
prompt.messages
This will retrieve the prompt template and display its messages, which include system messages, placeholders
for chat history and agent scratchpad, and human input.
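A plausible sketch of the agent and executor construction, using the generic tool calling constructor:

from langchain.agents import AgentExecutor, create_tool_calling_agent

agent = create_tool_calling_agent(llm, tools, prompt)
agent_executor = AgentExecutor(agent=agent, tools=tools, verbose=True)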
This function takes in the LLM, the list of tools, and the prompt and returns an initialized agent ready to tackle
our tasks.
We pass in the initialized agent, the list of tools, and set verbose=True to enable detailed output during
execution.
Let us put the agent to the test and see what it can do! You will run a few queries and observe how the agent
handles them. Keep in mind that for now, these queries are stateless, meaning the agent won’t remember previous
interactions.
First, start with a simple greeting:
agent_executor.invoke({"input": "hi!"})
When you run this code, the agent will process the input and generate a response. Here is what I got:
Now, let us ask the agent a more specific question related to LangSmith and testing:
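The queries were presumably along these lines, one about LangSmith and one about the weather (the paragraph below describes the second):

agent_executor.invoke({"input": "how can LangSmith help with testing?"})
agent_executor.invoke({"input": "what is the weather in San Francisco?"})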
For this query, the agent invokes the tavily_search_results_json tool to search for weather
information in San Francisco. It retrieves the relevant data from the search results and presents a concise summary
of the current weather conditions, including temperature, wind speed, and humidity.
These examples demonstrate how our agent can handle different types of queries and use the appropriate tools
to generate informative responses.
In this quick start, we covered the basics of creating a simple agent and progressively enhanced it with memory
capabilities. We learned how to pass in chat history and structure messages using AIMessage and
HumanMessage.
Try It Yourself
You can dive deeper into different types of agents, experiment with various prompts, and integrate additional
tools to expand your agent’s capabilities.
As you can see, the initialize_agent function takes care of selecting the appropriate Agent type based
on the provided tools and language model, making the initialization process more intuitive and straightforward.
Key Takeaways
In this chapter, we explored the process of designing and implementing various types of agents using LangChain.
You learned how to define clear objectives for your agents, understand their core concepts, and leverage a range of
tools and toolkits to enhance their capabilities. We delved into different agent types, including Zero-Shot-React
agents, structured chat agents, and ReAct agents, providing practical code examples for each. Additionally, we
covered the importance of adding memory to your agents, allowing them to engage in more natural, context-aware
conversations.
By now, you should have a solid foundation for building, customizing, and deploying intelligent agents tailored
to your specific needs. These agents can autonomously handle complex tasks, making your AI applications more
efficient and effective.
Review Questions
Let us test your understanding of this chapter’s content.
1. What is the first step in designing an agent?
A. Implementing the agent’s memory
B. AgentAction
C. AgentLoader
D. AgentManager
C. ReAct Agent
D. Conversation Agent
B. start_agent
C. initialize_agent
D. create_agent
B. It allows the agent to remember previous interactions and provide context-aware responses.
B. SerpAPI
C. OpenAITool
D. WikiQuery
8. Which agent type is optimized for handling tools with multiple inputs?
A. Zero-Shot-React Agent
B. ReAct Agent
D. Self-Ask Agent
Answers
1. B
2. B
3. D
4. C
5. B
6. B
7. C
8. C
9. B
10. B
Further Reading
These references will provide you with in-depth knowledge and practical examples to help you understand and
implement various types of LangChain agents effectively:
1. Tool Calling Agents
Tool Calling Agents: Information on how to set up and use tool calling agents to handle various tasks dynami
https://python.langchain.com/v0.1/docs/modules/agents/agent_types/tool_ca
4. ReAct Agents
ReAct Agent Guide: Detailed instructions and examples for setting up and using ReAct agents to implement r
action capabilities.
OceanofPDF.com
© The Author(s), under exclusive license to APress Media, LLC, part of Springer Nature 2024
R. Jay, Generative AI Apps with LangChain and Python
https://doi.org/10.1007/979-8-8688-0882-1_10
In this chapter, we will explore how to create custom agents using LangChain. By the end of this
chapter, you will have a solid understanding of how to load language models, define tools, create
prompts, and bind everything together to build a functional agent. We will also cover practical use cases
like customer support automation, personalized recommendations, and real-time data analysis and
decision-making.
Defining Tools
Next up, you need to equip your agent with tools. Let us start with a simple Python function that
calculates the length of a given word:
from langchain.agents import tool

@tool
def get_word_length(word: str) -> int:
    """Returns the length of a word."""
    return len(word)

get_word_length.invoke("abc")  # Output: 3
Pay close attention to the docstring here. It serves as a crucial guide for your agent to understand
how to use the tool effectively.
Now, create a list to hold all the tools your agent will have at its disposal:
tools = [get_word_length]
from langchain_core.prompts import ChatPromptTemplate, MessagesPlaceholder

prompt = ChatPromptTemplate.from_messages(
    [
        (
            "system",
            "You are a very powerful assistant, but don't know current events",
        ),
        ("user", "{input}"),
        MessagesPlaceholder(variable_name="agent_scratchpad"),
    ]
)
You have provided your agent with a brief system message, a placeholder for the user's input, and a
placeholder for the agent's scratchpad (a space to store intermediate steps and tool outputs).
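The language model itself is loaded in an earlier step; presumably something like:

from langchain_openai import ChatOpenAI

llm = ChatOpenAI(model="gpt-3.5-turbo", temperature=0)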
llm_with_tools = llm.bind_tools(tools)
from langchain.agents.format_scratchpad.openai_tools import format_to_openai_tool_messages
from langchain.agents.output_parsers.openai_tools import OpenAIToolsAgentOutputParser

agent = (
    {
        "input": lambda x: x["input"],
        "agent_scratchpad": lambda x: format_to_openai_tool_messages(
            x["intermediate_steps"]
        ),
    }
    | prompt
    | llm_with_tools
    | OpenAIToolsAgentOutputParser()
)
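A minimal sketch of the executor and the call that exercises the tool (the example question is an assumption):

from langchain.agents import AgentExecutor

agent_executor = AgentExecutor(agent=agent, tools=tools, verbose=True)
agent_executor.invoke({"input": "How many letters are in the word 'educa'?"})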
Awesome! Your agent successfully used the get_word_length tool to answer your question.
Adding Memory
But what if you want your agent to remember previous interactions and engage in a more natural
conversation?
First, you need to add a placeholder for chat history in your prompt:
MEMORY_KEY = "chat_history"
prompt = ChatPromptTemplate.from_messages(
    [
        (
            "system",
            "You are a very powerful assistant, but bad at calculating lengths of words.",
        ),
        MessagesPlaceholder(variable_name=MEMORY_KEY),
        ("user", "{input}"),
        MessagesPlaceholder(variable_name="agent_scratchpad"),
    ]
)
chat_history = []
Finally, update your agent and AgentExecutor to include the chat history:
agent = (
{
"input": lambda x: x["input"],
"agent_scratchpad": lambda x: format_to_openai_tool_messages(
x["intermediate_steps"]
),
"chat_history": lambda x: x["chat_history"],
}
| prompt
| llm_with_tools
| OpenAIToolsAgentOutputParser()
)
agent_executor = AgentExecutor(agent=agent, tools=tools,
verbose=True)
Now, when interacting with your agent, track the inputs and outputs as chat history:
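A sketch of that interaction loop (the example questions are assumptions):

from langchain_core.messages import AIMessage, HumanMessage

input1 = "how many letters are in the word 'educa'?"
result = agent_executor.invoke({"input": input1, "chat_history": chat_history})
chat_history.extend(
    [
        HumanMessage(content=input1),
        AIMessage(content=result["output"]),
    ]
)
agent_executor.invoke({"input": "is that a real word?", "chat_history": chat_history})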
Your agent can now engage in a back-and-forth conversation, remembering previous interactions
and providing contextual responses.
You have just created your very own custom agent using LangChain.
from langchain.agents import Tool, initialize_agent
from langchain.memory import ConversationBufferMemory
from langchain_openai import OpenAI

def search_knowledge_base(query):
    # Implement logic to search the knowledge base based on the query
    # Return the most relevant answer or information
    pass

tools = [
    Tool(
        name="Knowledge Base Search",
        func=search_knowledge_base,
        description="Useful for searching the knowledge base for answers to customer inquiries."
    )
]

memory = ConversationBufferMemory(memory_key="chat_history")
agent = initialize_agent(
tools,
OpenAI(temperature=0),
agent="conversational-react-description",
verbose=True,
memory=memory
)
6. Integrate the agent into your customer support channels:
Implement a user interface, such as a chatbot or a webform, where customers can interact with
the agent.
Use the agent to handle incoming customer inquiries:
def handle_inquiry(inquiry):
response = agent.run(inquiry)
return response
# Example usage
customer_inquiry = "How can I reset my account password?"
response = handle_inquiry(customer_inquiry)
print(response)
import os
from dotenv import load_dotenv
# Example usage
customer_inquiry = "How can I reset my account password?"
response = handle_inquiry(customer_inquiry)
print(response)
In this example, you define a simple search_knowledge_base function that returns predefined
responses based on the customer's inquiry. The function can be enhanced to search an actual knowledge
base or database for more dynamic responses.
The agent is initialized with the knowledge base search tool and a memory object to store the
conversation history. The handle_inquiry function takes a customer inquiry as input, passes it to the
agent, and returns the generated response.
When you run this code with the example customer inquiry, it will output the predefined response
for resetting the account password such as shown below:
Entering new AgentExecutor chain...
Personalized Recommendations
In this section, we will explore how you can leverage agents to build a personalized recommendation
system that thrills your users. Whether you are working on an ecommerce platform, a content streaming
service, or any application where personalization is key, this use case can help.
Let us dive into the step-by-step process of creating a personalized recommendation system using
agents.
1. Collect user data:
Gather relevant information about your users, such as their preferences, behavior, and
interaction history within your application.
This data can include user profiles, browsing history, purchase records, ratings, and reviews.
Ensure that you comply with data privacy regulations and obtain necessary user consent.
Here is a sample data for illustrative purposes:
user_preferences_data = {
"1234": {
"favorite_genres": ["Action", "Sci-Fi"],
"favorite_actors": ["Tom Cruise", "Brad Pitt"],
"favorite_directors": ["Christopher Nolan"]
}
# Add more user data as needed
}
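The user_preference_tool used below simply looks up this data; a minimal sketch, consistent with its description later in the section:

def user_preference_tool(user_id):
    # Look up the stored preferences for the given user ID
    return user_preferences_data.get(str(user_id).strip(), {})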
def recommendation_generator_tool(user_preferences):
    # Generate personalized recommendations based on user preferences
    recommendations = []
    if user_preferences:
        favorite_genres = user_preferences.get("favorite_genres", [])
        favorite_actors = user_preferences.get("favorite_actors", [])
        favorite_directors = user_preferences.get("favorite_directors", [])
        if favorite_genres:
            recommendations.append(f"Based on your favorite genres ({', '.join(favorite_genres)}), we recommend:")
            recommendations.append("1. Inception (Action, Sci-Fi)")
            recommendations.append("2. The Matrix (Action, Sci-Fi)")
            recommendations.append("3. Guardians of the Galaxy (Action, Comedy, Sci-Fi)")
        if favorite_actors:
            recommendations.append(f"Considering your favorite actors ({', '.join(favorite_actors)}), you might enjoy:")
            recommendations.append("1. Mission: Impossible - Fallout (starring Tom Cruise)")
            recommendations.append("2. Once Upon a Time in Hollywood (starring Brad Pitt)")
            recommendations.append("3. The Hunger Games (starring Jennifer Lawrence)")
        if favorite_directors:
            recommendations.append(f"Given your appreciation for {' and '.join(favorite_directors)}, we suggest:")
            recommendations.append("1. The Dark Knight Trilogy (directed by Christopher Nolan)")
            recommendations.append("2. Pulp Fiction (directed by Quentin Tarantino)")
            recommendations.append("3. Interstellar (directed by Christopher Nolan)")
    if not recommendations:
        recommendations.append("Oops! We couldn't find personalized recommendations based on your preferences.")
        recommendations.append("Please provide more information about your favorite genres, actors, or directors.")
    return "\n".join(recommendations)
tools = [
    Tool(
        name="User Preference Tool",
        func=user_preference_tool,
        description="Retrieves user preferences based on the user ID."
    ),
    Tool(
        name="Recommendation Generator Tool",
        func=recommendation_generator_tool,
        description="Generates personalized recommendations based on user preferences."
    )
]
recommendation_prompt = PromptTemplate(
input_variables=["user_id"],
template="""
Given the user ID {user_id}, retrieve their preferences and
generate personalized recommendations.
Provide a list of top recommendations along with a brief
explanation for each recommendation.
"""
)
7. Initialize the agent:
recommendation_agent = initialize_agent(
tools,
OpenAI(temperature=0.7),
agent="zero-shot-react-description",
verbose=True
)
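A sketch of the generate_recommendations helper used below, consistent with its description at the end of this section:

def generate_recommendations(user_id):
    # Let the agent retrieve the preferences and produce the recommendations
    return recommendation_agent.run(recommendation_prompt.format(user_id=user_id))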
# Example usage
user_id = "1234"
recommendations = generate_recommendations(user_id)
print(recommendations)
Here is a complete code example that demonstrates a basic personalized recommendation system:
def recommendation_generator_tool(user_preferences):
    # Generate personalized recommendations based on user preferences
    recommendations = []
    if user_preferences:
        favorite_genres = user_preferences.get("favorite_genres", [])
        favorite_actors = user_preferences.get("favorite_actors", [])
        favorite_directors = user_preferences.get("favorite_directors", [])
        if favorite_genres:
            recommendations.append(f"Based on your favorite genres ({', '.join(favorite_genres)}), we recommend:")
            recommendations.append("1. Inception (Action, Sci-Fi)")
            recommendations.append("2. The Matrix (Action, Sci-Fi)")
            recommendations.append("3. Guardians of the Galaxy (Action, Comedy, Sci-Fi)")
        if favorite_actors:
            recommendations.append(f"Considering your favorite actors ({', '.join(favorite_actors)}), you might enjoy:")
            recommendations.append("1. Mission: Impossible - Fallout (starring Tom Cruise)")
            recommendations.append("2. Once Upon a Time in Hollywood (starring Brad Pitt)")
            recommendations.append("3. The Hunger Games (starring Jennifer Lawrence)")
        if favorite_directors:
            recommendations.append(f"Given your appreciation for {' and '.join(favorite_directors)}, we suggest:")
            recommendations.append("1. The Dark Knight Trilogy (directed by Christopher Nolan)")
            recommendations.append("2. Pulp Fiction (directed by Quentin Tarantino)")
            recommendations.append("3. Interstellar (directed by Christopher Nolan)")
    if not recommendations:
        recommendations.append("Oops! We couldn't find personalized recommendations based on your preferences.")
        recommendations.append("Please provide more information about your favorite genres, actors, or directors.")
    return "\n".join(recommendations)
# Example usage
user_id = "1234"
recommendations = generate_recommendations(user_id)
print(recommendations)
In this example, we have a dummy user preference dataset (user_preferences_data) that
maps user IDs to their favorite genres, actors, and directors. The user_preference_tool retrieves
the preferences for a given user ID, while the recommendation_generator_tool generates
personalized recommendations based on those preferences.
The agent is initialized with the recommendation tools and the recommendation prompt. The
generate_recommendations function takes a user ID, retrieves their preferences, and generates
personalized recommendations using the agent.
When you run this code with the example user ID, it will output a set of personalized movie
recommendations based on the user's favorite genres, actors, and directors as shown below:
Remember, this is a simplified example to illustrate the concept. In a real-world scenario, you would
integrate the agent with your actual user data, recommendation algorithms, and domain-specific
knowledge to generate more accurate and diverse recommendations.
def data_analysis_tool(data):
    # Perform data analysis tasks on the provided data
    # Return insights, patterns, or anomalies
    # Here's a dummy example that checks for high temperature and humidity
    try:
        temperature = int(data.split("Temperature: ")[1].split("°C")[0])
        humidity = int(data.split("Humidity: ")[1].split("%")[0])
        if temperature > 25 and humidity > 50:
            return "High temperature and humidity detected. Adjustments may be necessary."
        else:
            return "Temperature and humidity within normal range. No action required."
    except (IndexError, ValueError):
        return "Error: Invalid data format. Unable to analyze."
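The data_retrieval_tool and the tools list are sketched below, consistent with the description at the end of this section (the tool simulates a sensor feed with random values):

import random

from langchain.agents import Tool

def data_retrieval_tool(_=None):
    # Simulate a real-time sensor feed with random temperature and humidity values
    temperature = random.randint(15, 35)
    humidity = random.randint(30, 80)
    return f"Temperature: {temperature}°C, Humidity: {humidity}%"

tools = [
    Tool(
        name="Data Retrieval Tool",
        func=data_retrieval_tool,
        description="Retrieves the latest temperature and humidity readings."
    ),
    Tool(
        name="Data Analysis Tool",
        func=data_analysis_tool,
        description="Analyzes sensor data and reports insights or anomalies."
    ),
]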
from langchain.prompts import PromptTemplate

decision_prompt = PromptTemplate(
    input_variables=["data_insights"],
    template="""
    Based on the data insights: {data_insights},
    make a decision on the appropriate action to take.
    Provide a clear and concise decision along with a brief justification.
    """
)

decision_agent = initialize_agent(
    tools,
    OpenAI(temperature=0.7),
    agent="zero-shot-react-description",
    verbose=True
)
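A sketch of the process_data_and_make_decision helper used below, following its later description (retrieve the latest data, analyze it, and let the agent decide):

def process_data_and_make_decision():
    data = data_retrieval_tool()
    insights = data_analysis_tool(data)
    return decision_agent.run(decision_prompt.format(data_insights=insights))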
# Example usage
while True:
decision = process_data_and_make_decision()
print("Decision:", decision)
# Perform actions based on the decision
input("Press Enter to retrieve the next batch of data...")
Here is a complete code example that demonstrates a basic real-time data analysis and decision-
making system:
def data_analysis_tool(data):
    # Perform data analysis tasks on the provided data
    # Return insights, patterns, or anomalies
    # Here's a dummy example that checks for high temperature and humidity
    try:
        temperature = int(data.split("Temperature: ")[1].split("°C")[0])
        humidity = int(data.split("Humidity: ")[1].split("%")[0])
        if temperature > 25 and humidity > 50:
            return "High temperature and humidity detected. Adjustments may be necessary."
        else:
            return "Temperature and humidity within normal range. No action required."
    except (IndexError, ValueError):
        return "Error: Invalid data format. Unable to analyze."
# Example usage
while True:
decision = process_data_and_make_decision()
print("Decision:", decision)
# Perform actions based on the decision
input("Press Enter to retrieve the next batch of data...")
In this example, you have dummy functions for data retrieval (data_retrieval_tool) and data
analysis (data_analysis_tool). The data_retrieval_tool simulates retrieving real-time
data by generating random temperature and humidity values. The data_analysis_tool performs a
simple analysis by checking if the temperature and humidity exceed certain thresholds.
The agent is initialized with the data retrieval and analysis tools, along with the decision-making
prompt. The process_data_and_make_decision function retrieves the latest data, analyzes it,
and uses the agent to make a decision based on the insights.
The example usage demonstrates a continuous loop where the system retrieves data, makes
decisions, and prompts the user to retrieve the next batch of data.
This is a simplified example to illustrate the concept. In a real-world scenario, you would integrate
the agent with your actual real-time data sources, use more sophisticated data analysis techniques, and
implement the necessary actions based on the decisions.
Key Takeaways
In this chapter, we successfully created a custom agent using LangChain. We explored loading language
models, defining tools, and creating prompts. Additionally, we covered practical applications for agents,
such as customer support automation, personalized recommendations, and real-time data analysis. By
adding memory, we enhanced our agents to engage in more natural and coherent conversations.
Review Questions
Let us test your understanding of this chapter’s content.
1. What is the first step in creating a custom agent?
A. Defining tools
C. Creating prompts
B. ConversationBufferMemory
C. ChatPromptTemplate
D. AgentExecutor
D. Decision-making tool
D. Simplified codebase
Answers
1. B
2. B
3. B
4. B
5. B
Further Reading
These references will provide you with in-depth knowledge and practical examples to help you
understand and implement various LangChain agent use cases effectively, from customer support
automation to real-time data analysis and personalized recommendations:
1. Creating Custom Agents
This is a cookbook that shows you how to build a custom agent using LlamaIndex.
https://docs.llamaindex.ai/en/latest/examples/agent/custom_agent/
2. Defining and Using Tools in Agents
LangChain Tools Integration: Explore how to define and integrate various tools into your
agents, enhancing their functionality and effectiveness.
https://python.langchain.com/v0.2/docs/integrations/platforms/
OceanofPDF.com
© The Author(s), under exclusive license to APress Media, LLC, part of Springer Nature 2024
R. Jay, Generative AI Apps with LangChain and Python
https://doi.org/10.1007/979-8-8688-0882-1_11
In this chapter, we are going to use two powerful tools, namely, LangChain and
Streamlit to develop a ChatGPT-like LangChain-based UI application. In
particular, we will be transitioning from Jupyter Notebooks to a web application
to create something that is not just functional but also user-friendly with sleek
web interfaces and production-ready. Here is a sneak peek of what you will be
doing:
1. Setting up our development environment
If you already have Streamlit installed, you can skip this step.
Installing Python
If you haven’t installed Python in your desktop environment, you may get an
error as shown below:
The error message suggests that the Python command is not recognized in your
Command Prompt. This usually means that either Python is not installed or the
Python installation directory is not added to the system’s PATH environment
variable.
To resolve this issue, you can try the following steps:
1. Check if Python is installed.
Open a new Command Prompt window.
Type python --version and press Enter.
If Python is installed correctly, it will display the version number. If not,
you need to install Python first.
Install Python from the Microsoft Store as shown below:
2. Install pip.
If you have Python installed but pip is missing, you can download the
get-pip.py script from the official pip website:
https://bootstrap.pypa.io/get-pip.py.
Save the get-pip.py file to a location on your computer, such as
C:\Users\abc\get-pip.py.
Open a Command Prompt window and navigate to the directory where
you saved the get-pip.py file using the cd command.
Run the following command to install pip:
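Assuming you are in the directory containing get-pip.py, the command is:

python get-pip.py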
3. Add Python and pip to the system’s PATH as shown in the figure below.
Open the Start menu and search for “Environment Variables.”
Click “Edit the system environment variables.”
In the System Properties window, click the “Environment Variables”
button.
Under “System variables,” scroll down and find the “Path” variable, then
click “Edit.”
Click “New” and add the path to your Python installation directory,
typically C:\Python39 or similar.
Click “New” again and add the path to the Scripts directory within your
Python installation, typically C:\Python39\Scripts.
Click “OK” to save the changes.
4. Verify the installation.
Open a new Command Prompt window.
Type pip --version and press Enter.
If pip is installed and accessible, it will display the version number.
After completing these steps, you should be able to run the pip install
streamlit command successfully in your Command Prompt.
If you still encounter issues, make sure you have the necessary permissions to
install packages and that your Internet connection is stable. Additionally, you can
try running the Command Prompt as an administrator by right-clicking the
Command Prompt icon and selecting “Run as administrator.”
After making these changes and ensuring the dependencies are installed, you
should be able to run the code on your desktop.
If you still encounter issues, make sure you have the latest versions of the
required dependencies installed. You can update them using the following
commands:
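The exact package list depends on your app, but it would be along these lines:

pip install --upgrade streamlit langchain openai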
import os
import streamlit as st
from langchain.chains import LLMChain
from langchain.prompts import ChatPromptTemplate,
HumanMessagePromptTemplate
from langchain.chat_models import ChatOpenAI
os.environ["OPENAI_API_KEY"] =
"your_openai_api_key_here"
This sets the OpenAI API key as an environment variable, which will be
used by the ChatOpenAI model.
3. Streamlit UI setup
Here, you are creating the title of the app and an input box for the user's question. You then loop through and display all previous questions and answers stored in the chat history. Finally, you check whether the user has entered a query and clicked the Submit button, as sketched below.
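A minimal sketch of what this UI code likely looks like (names such as user_query and chat_history are assumptions based on the surrounding description):

st.title("LangChain Q&A App")

# Keep the running chat history in Streamlit's session state
if "chat_history" not in st.session_state:
    st.session_state["chat_history"] = []

user_query = st.text_input("Enter your question:")
submit = st.button("Submit")

# Display all previous questions and answers
for qa in st.session_state["chat_history"]:
    st.write(f"Q: {qa['question']}")
    st.write(f"A: {qa['answer']}")

if user_query and submit:
    chat_prompt = ChatPromptTemplate.from_messages(
        [HumanMessagePromptTemplate.from_template("{question}")]
    )
    chain = LLMChain(llm=ChatOpenAI(model="gpt-3.5-turbo", temperature=0), prompt=chat_prompt)
    response = chain.run(question=user_query)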
st.session_state['chat_history'].append({"question":
user_query, "answer": response})
st.write("Answer:")
st.write(response)
You add the new Q&A pair to the chat history and display the response.
9. Fallback message
else:
st.write("Please enter a question and click
Submit.")
# Previous code...
Make sure that the pc.create_index() function call and its arguments
are indented correctly under the if statement. The indentation should match the
level of the if statement.
Review the indentation of the entire code block and ensure that it is
consistent throughout. Each line within the same code block should have the
same level of indentation.
After fixing the indentation, save the changes and run the Python script again
using Streamlit.
Run Your Streamlit Application
To run your Streamlit application, follow these steps:
Open a terminal or command prompt.
Navigate to the directory where your LangChainUI.py file is located using
the cd command. For example:
cd /path/to/your/app/directory
Once you are in the correct directory, run the following command:
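Based on the file name used in this chapter, the command is:

streamlit run LangChainUI.py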
This command will start the Streamlit server and run your LangChainUI.py
file.
Streamlit will provide you with a URL (usually http://localhost:8501) that you
can open in your web browser to view and interact with your Streamlit
application.
I asked the first question – “Who is the CEO of Apple Inc.?” – and the answer is
straightforward (as of 2024, Tim Cook is the CEO of Apple Inc.).
The second question directly refers to the answer of the first question without
naming the person explicitly. It also requires the app to know that (a) the
“person” referred to is Tim Cook, (b) Tim Cook’s predecessor was Steve Jobs,
and (c) the product in question is the iPhone.
If the app’s memory is working correctly, it should be able to
1. Provide the correct answer to the first question (Tim Cook).
2. Understand that “this person” in the second question refers to Tim Cook.
3. Recall that Tim Cook's predecessor was Steve Jobs.
4. Identify that the iPhone was the revolutionary product introduced by Steve Jobs.
A correct response to the second question would demonstrate that the app has
maintained context across the two questions and can link information from both
to provide a coherent answer.
c. Run the installer and follow the installation wizard. Use the default
settings unless you have specific preferences as shown below:
2. Add Git to PATH: After installation, Git should automatically be added to
your system’s PATH. However, if it is not, you can add it manually.
a. Right-click “This PC” or “My Computer” and select “Properties.”
d. Under “System variables,” find and select the “Path” variable, then click
“Edit.”
e. Click “New” and add the path to your Git installation. It is typically
C:\Program Files\Git\cmd
f. Click “OK” to close all dialog boxes.
git --version
This should display the installed version of Git as shown in the figure below
if everything is set up correctly.
5. Try Git Init: Now you should be able to run git init in your project
directory without any errors.
2. Set Your Name: Run this command, replacing “Your Name” with your
actual name.
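The standard commands are shown below (the email is a placeholder; use the one tied to your GitHub account):

git config --global user.email "[email protected]"
git config --global user.name "Your Name"

# Verify your settings
git config --global --list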
This will display all your Git configurations, including the email and name
you just set.
4. Try Committing Again: After setting your email and name, try your commit
command again.
git commit -m "Initial commit for Streamlit Q&A app"
Important Notes
The --global flag sets this configuration for all Git repositories on your
computer. If you want to use different settings for different projects, you can
omit --global and run these commands from within a specific repository.
Make sure to use an email address that is associated with your GitHub account
if you plan to push your commits to GitHub.
If you are concerned about privacy, GitHub provides options for keeping your
email address private. You can use the GitHub-provided no-reply email
address in your Git configuration.
After completing these steps, you should be able to commit your changes
without any identity-related errors. Remember, this is a one-time setup unless
you want to change your Git identity later.
b. Permanent
Search for “Environment Variables” in the Start menu.
Click “Edit the system environment variables.”
Click “Environment Variables.”
Under “User variables,” click “New.”
Variable name: OPENAI_API_KEY
Variable value: your_api_key_here
Click “OK” to save.
2. For macOS/Linux
a. Temporary (for current session only)
Open Terminal.
Type: export OPENAI_API_KEY=your_api_key_here
b. Permanent
Open your shell configuration file (e.g., ~/.bash_profile, ~/.zshrc).
Add the line: export
OPENAI_API_KEY=your_api_key_here
Save the file and restart your terminal or run source
~/.bash_profile (or relevant file).
OPENAI_API_KEY=your_api_key_here
load_dotenv()
openai_api_key = os.getenv("OPENAI_API_KEY")
6. Best practices
Never commit your .env file to version control.
Add .env to your .gitignore file.
Provide a .env.example file with placeholder values for other developers.
.env
*.pyc
__pycache__/
import os
from dotenv import load_dotenv
load_dotenv()
openai_api_key = os.getenv("OPENAI_API_KEY")
8. Clean Git history. If the API key is still in your Git history, you may need to
clean it:
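One common approach, assuming the third-party git-filter-repo tool is installed:

# Remove the .env file from every commit in the history
git filter-repo --path .env --invert-paths

# Force-push the rewritten history
git push origin --force --all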
Remember, after cleaning the Git history, anyone who has cloned your
repository will need to re-clone it or perform a forced pull.
These steps should remove the API key from your repository and prevent it
from being pushed to GitHub. You should always be cautious with sensitive
information and double-check your commits before pushing.
b. Go to Settings ➤ Emails.
c. Look for a paragraph that says something like “Keep my email address
private. We will use [email protected] when
performing web-based Git operations and sending email on your
behalf.”
2. Set this email in your Git configuration. Open your command prompt and
run
This will open your default text editor. Just save and close the editor
without making any changes.
2. Uncheck the box that says “Keep my email address private” as shown below.
3. Make sure your public commit email is set to an email you are comfortable
being public.
After making either of these changes, try pushing your changes again. This
should resolve the email privacy restriction issue.
Remember, if you choose to use the no-reply email, you will need to use this
email for all your Git commits to avoid this issue in the future. If you are
working on a shared machine or multiple projects, you might want to set this on
a per-repository basis instead of globally.
Run this command in each repository where you want to use the no-reply
email.
Deploying the App in GitHub
Below are the steps to deploy your app in GitHub:
1. First, create a new repository on GitHub if you haven’t already. Let us say
you named it “streamlit-qa-app.”
3. Check your remote URL. Run this command to verify the remote URL:
git remote -v
It should show
origin https://github.com/your_user_name/streamlit-
qa-app.git (fetch)
origin https://github.com/your_user_name/streamlit-
qa-app.git (push)
4. If the URL is incorrect, update it with
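The command would be:

git remote set-url origin https://github.com/your_user_name/streamlit-qa-app.git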
5. Ensure you are logged in if you see this error message: “please complete
authentication in your browser.”
9. Use HTTPS instead of SSH. If you are using an SSH URL, try the HTTPS
URL instead:
3. Select your GitHub repository, branch (usually “main”), and the main
Python file (e.g., “LangChainUI.py”).
4. Click “Deploy.”
Congratulations, if everything goes well, you must have successfully
deployed the app to Streamlit Cloud for others to access.
3. AWS Elastic Beanstalk: If you are looking for something more robust, this
is a solid choice. It is a bit more complex, but it scales well. You will need to
Set up an AWS account
Install the AWS CLI and EB CLI
Initialize your EB environment with eb init
Create an environment with eb create
Deploy with eb deploy
4. Google Cloud Run: This is great if you are comfortable with containers.
Here is the gist:
Create a Dockerfile for your app.
Build your container image.
Push it to Google Container Registry.
Deploy to Cloud Run using the Google Cloud Console or gcloud CLI.
6. Streamlit Cloud: You have already deployed using Streamlit Cloud earlier in this chapter.
Remember, each of these options has its pros and cons. Heroku and Streamlit
Cloud are great for getting started quickly. AWS and Google Cloud offer more
control and scalability but have a steeper learning curve. DigitalOcean sits
somewhere in the middle.
My advice? Start with something simple like Heroku or Streamlit Cloud. As
you get more comfortable and your needs grow, you can explore the more
advanced options.
And don’t stress if it doesn’t work perfectly the first time. Deployment can be
tricky, and even experienced developers sometimes need a few tries to get it
right. Before you know it, you will be deploying apps like a pro!
Key Takeaways
Great job! You have just wrapped up an incredible journey from prototype to
production.
You have transformed that initial Jupyter code into a fully functional web app
using Streamlit. In particular, you have harnessed the power of OpenAI’s GPT-3.5-turbo model to create a web app that people can interact with, asking questions and getting meaningful responses. That is huge!
Let us further break down what you have achieved:
1. You have built an interactive Q&A app where users can have a conversation
with AI!
2. You have mastered chat history management, which allows your app to maintain context over multiple exchanges (see the short sketch after this list).
3. You have learned to deploy your application so that it is ready for the world
to use.
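As a quick reminder of how that context is kept, here is a minimal sketch of the pattern using Streamlit's st.session_state (the variable names are illustrative, not the exact ones from the chapter):
import streamlit as st

# st.session_state persists values across reruns within a single user session,
# so the conversation survives each new prompt the user submits.
if "messages" not in st.session_state:
    st.session_state.messages = []

user_input = st.text_input("Ask a question:")
if user_input:
    st.session_state.messages.append({"role": "user", "content": user_input})
    # ... call the LLM with the accumulated messages and append its reply ...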
But here is the real beauty: you haven’t just learned to use tools like LangChain and Streamlit; you have gained skills that are directly applicable to real-world AI development. The next time someone says, “We need an AI-powered web app,” you can confidently say, “I can do it!”
So, what is next on your AI development journey? Whatever it is, I know you
are ready for it. Keep coding, learning, and, most importantly, keep pushing the
boundaries of what is possible with gen AI.
Review Questions
Test your understanding of this chapter.
1. What is Streamlit primarily used for?
A. Data storage
B. Building interactive web applications
C. Machine learning model training
D. Network security
9. What should you do if you need to maintain chat history across different
user sessions in Streamlit?
A. Use session cookies
B. Implement a database backend
C. Store the data in global variables
D. Use a local text file
10. Which of the following is a key step in integrating OpenAI’s GPT-3.5-turbo
model using LangChain?
A. Training the model from scratch
B. Setting the model temperature parameter
C. Uploading data to the model
D. Visualizing the model’s architecture
Answers
1. B. Building interactive web applications
2. C. LangChain
4. A. Session state
7. D. streamlit run
Further Reading
By exploring these resources, you can deepen your understanding and enhance
your ability to build, deploy, and manage AI-powered applications using
Streamlit and LangChain.
GitHub Integration
GitHub for Beginners: Learn how to use GitHub for version control and
collaboration.
https://docs.github.com/en/get-started/start-your-journey/hello-world
Using Environment Variables: Best practices for managing API keys and
other sensitive information using environment variables.
https://12factor.net/config
Index
A
Access token
add_example method
Advanced chain techniques
errors and exceptions
large datasets with chains
optimize your chain performance
test and debugging chains
AgentAction
Agent design and implementation
AgentExecutor
concepts
defining Agent’s objective
defining Agent’s tasks
design considerations
gathering tools and resources
inputs
LangGraph
See LangGraph
outputs
toolkits
tools
AgentExecutor
AgentFinish
Agentic applications
Agents
inputs
outputs
Agents as task managers
code generation
creative writing
generative AI enhancing capabilities
Agent’s importance
building applications
autonomous coders
creative collaborators
research assistants
generative AI application
content understanding
contextual decision-making
dynamic content generation
iterative refinement
Agent’s key parts
LLM
prompt
tools
Agent’s Thought Process
Agent types selection criteria
chat history support
multi-input tools
parallel function calling
AIMessage
AI tutoring system
ALBERT
Anthropic’s Claude AI models
Claude 3 model family
Create Key
Get API Keys
SDK
sign up
API
See Application programming interface (API)
API tool calling agents
agent creation
agent running
chat history
description
goal
initializing tools
setup
tools selection
Application programming interface (API)
LLMs
See LLM APIs
app.py file
AR
See Augmented reality (AR)
Argument variable
asimilarity_search_by_vector method
Augmented reality (AR)
Automated content scheduler
AutoModelForCausalLM.from_pretrained()
Autonomous decision-making capabilities
AWS Elastic Beanstalk
B
BART
BaseExampleSelector class
BaseLLM
BaseTool class
BLEU score
Building Q&A
vs. chatbot
API key handling
chain vs. direct invocation
interaction
model used
output parsing
prompt handling
single-turn interaction
use case
user input
conversational app
create a prompt template
create the chain
full end-to-end working code
import libraries
invoke the chain
LLM initialize
output
Output Parser
C
CacheBackedEmbeddings class
Caching embeddings
Chains
vs. agents
autonomy
contextual understanding
goal-oriented
overview
components
higher-level components
internal components
LCELs
LCEL vs. legacy chains
advanced features
load_chain function
legacy chain example
legacy chains
CharacterTextSplitter class
Chatbot development
Chatbots
Chat completion
ChatGPT
ChatGPT-like LangChain-based UI application
Chat models
ChatOpenAI
Chat prompt templates
create chat model instance
create prompt template
define
format prompt with user input
generate chat completion
import required libraries
print assistant’s response
print formatted prompt
prompt templates
set up the OpenAI API key
user_input variable
Chirp Speech models
chunk_overlap parameter
Claude 3 Haiku
Claude 3 model family
Claude 3 Haiku
Claude 3 Opus
Claude 3 Sonnet
easy to use
following directions
legacy models
multilingual capabilities
transitioning
vision and image processing
Claude 3 Opus
Claude 3 Sonnet
Code Snippet
CodeTextSplitter
Codex
Codey Suite
Cohere’s AI model
Cohere’s command model
Colab notebooks
Command-light model
Command-R model
Complex workflow apps, chain composition strategies
data summarization app, sequential chains
sentiment analysis app, conditional chains
SequentialChain use case example
automated fraud detection, finance
content generation pipeline app
customer support chatbot app
task allocation app, router chains
Conditional chain
Content generation, agents
loaded tool
prompt
ReAct format
SerpAPI
Content generation platform
Context-aware applications
Context-rich applications
Contextual Compression Retriever
ConvBERT
Conversation Agent
Conversational app
ChatOpenAI
ChatOpenAI Object creation
generate_response
importing necessary modules
interaction loop
setting the API key
conversational-react-description agent
ConversationalRetrievalChain
ConversationBufferMemory object
create_agent_method function
create_documents method
create_react_agent function
create_tool_calling_agent function
CSV files
CSVLoader class
CSV Parser
CTRL
Custom agent creation
adding memory
binding tools
creating agent
creating prompt
defining tools
loading language model
testing agent
Customer support automation
agent initialization
agent integration
agent setup
code example
defining tools and memory
handle_inquiry function
identifying common customer inquiries
implementation using agents
knowledge base creation
monitoring and improving
search_knowledge_base function
self-service support system
Customer support systems
Custom Example Selector
D
DALL-E 2
Data aware
Data connections
Data storage and retrieval
graph databases
traditional databases
vectorstores
Data Summarization App, sequential chains
Decision Points
Dense Passage Retrieval (DPR)
Development environment
LangChain
OpenAI’s LLMs
Python
Development environment setup
installing Python
installing required dependencies
installing streamlit library
Development, LangChain
agents
callbacks
chains
chat models
composition
data storage and retrieval
document loaders
embedding models
LLMs
memory
model I/O
prompts
retrieval component
retrievers
text splitters
tools
Development playground
Colab notebooks
Hugging Face Spaces
Kaggle notebooks
LangChain
OpenAI API
Development productivity
DigitalOcean App Platform
Direct LLM API vs. LangChain
benefits
content generation platform
development complexity
flexibility
generic responses
integration and scalability
scalability
simplicity
streamlining data integration
text generation task
trade-offs
Document loaders
in action
CSV files
JSON files
load documents, sources
PDFs
Document objects
Domain-specific applications
DPR
See Dense Passage Retrieval (DPR)
E
Ecosystem, LangChain
high-level components
and integration
LangServe
LangSmith
LangTemplates
LangChain-community
LangChain-Core
ELECTRA
Embedding models
embed_documents method
embed_query method
End Point
End-to-end fully working Agent
code explanation
importing necessary modules
initializing agent
initializing LLM
initializing openAI client
installing dependencies
loading environment variables
loading tools
running agent with query
code generation
interpreting outputs
outputs
SerpApi API key
.env file
Error handling and troubleshooting
API and reliability
API connectivity issues
authentication errors
diagnosing and resolving
API connectivity issues
authentication errors
invalid request errors
logging and monitoring
model-specific limitations
rate limiting
implementing
invalid request errors
model-specific limitations
rate limiting
example_prompt template
Example selectors
create list of example
Custom Example Selector
custom implementation
define BaseExampleSelector Class
factors
few-shot learning
Length example selector
LLM
MMR example selector
Ngram example selector
prompt
similarity example selector
ExampleToolkit
F
FAISS (Facebook AI Similarity Search)
FastText
Few-shot learning
Few-shot prompt template
question and answer
create FewShotPromptTemplate
integrate example selector into prompt template
prepare example set
review output
select examples with example selector
test prompt templates
works
FewShotPromptTemplate
Flagship Command model
Float16 data type
Float32 data type
format_prompt() method
Formatted Prompt
from_bytes_store method
from_messages() method
from_tiktoken_encoder() method
G
Gemini
Gemini 1.0 Pro Vision
Generative AI Apps
LangChain
See LangChain
Generative models
get-pip.py file
get-pip.py script
get_relevant_documents method
get_tools method
GitHub
git remote add command
--global flag
Google Cloud natural language API
API Key creation
billing account
create and download the JSON key file
enabling
environment variables
Google Cloud Console
Granting Access
Python script
sentiment analysis
sentiment magnitude
Service Accounts
Google Cloud Run
Google Colab
Google’s AI model
Chirp Speech
Codey Suite
language and chat models
Gemini 1.0 Pro
Gemini 1.0 Pro Vision
Google Cloud natural language API
PaLM 2
multimodal and security
text and image data
GPT
GPT-3.5-turbo
GPT-4
GPT-Neo
GPU capabilities
H
Heroku
HtmlTextSplitter
Hugging Face
code to use LLaMA model
explanation for the code
pass your access token
running time
Hugging Face Spaces
HumanMessage
HumanMessagePromptTemplate.from_template()
I
Image generation wizard
Imagen
Indexing API
deletion modes
full mode
incremental
none
key information
RecordManager
initialize_agent function
__init__ method
InMemoryCache
Intelligent agent tasks performance
setup LangSmith
tools
agent initialization
creating AgentExecutor
creting Retriever tool
LLM selection
prompt selection
Retriever
Tavily
Intermediate steps
Internet connection
J
jq schema
JSON files
JSONLoader
json.loads()
JSON Parser
K
Kaggle notebooks
Keyword search
L
LangChain
accuracy and reliability
active community and ecosystem
adapt to user needs and preferences
advanced functionalities
advantage
agentic applications
agents
applications
automate complex workflows
benefits
building blocks
chains
chatbots
community
components
deployment
development
production
RAG process
content generation platform
context-aware applications
cost optimization
customer support chatbot
data aware
data connections
definition
developers
development efficiency
vs. Direct LLM API
See Direct LLM API vs. LangChain
ecosystem
enhanced flexibility
flexibility and scalability
framework
generative app
indexes
integration with multiple LLMs
libraries
and LLMs
See Large language models (LLMs)
memory concepts
models
modify your script
modular and scalable architecture
multi-LLM integration
no cost barrier
and OpenAI
open source and community collaboration
prompt engineering
prompt templates
RAG
rapid development and prototyping
real-world example
retrieval types
trade-offs
LangChain Agents
API tool calling
Conversation Agent
description
JSON chat
key features
MRKL Agent
OpenAI functions
OpenAI tools
ReAct agent
scenarios
intelligent search
recommendation systems
task automation
self-ask agents
structured chat agent
Structured Tool Agent
types
domain-specific agents
general-purpose agents
multi-agent systems
simulation agents
task-specific agents
workflow
XML
Zero-Shot-React Agent
LangChain Application deployment
in GitHub
installing Git
OpenAI Key, environment variable
other cloud deployment options
preventing email privacy-related error
providing access, GitHub
resolving sensitive information issues
setting up your identity
in Streamlit Cloud
LangChain chains
example
future, generative AI
generative AI applications
langchain.chat_models
LangChain-community
langchain_community package
LangChain-Core
langchain.debug module
LangChain Documentation on Agents
LangChain Execution Language (LCEL) chains
construct
customize
execution modes
async execution
batch execution
observability, LCEL chains
streaming execution
flexibility
vs. legacy chains
load_query_constructor_runnable chain
natural language query
query_constructor chain
scalability
types
use cases
LangChain indexing API
LangChain library
LangChain Memory Modules
langchain.output_parsers
langchain.prompts
langchain.schema
LangChain’s GitHub repository
langchain-text-splitters package
LangChain tools
built-in tools
custom tools
OpenAI functions
toolkits
LangChain Tools Integration Guide
LangChain Tutorial on Building Agents
LangChainUI.py file
LangChain Use Cases
LangChain v0.1 vs. v0.2 Agents
enhanced agent types
enhanced error handling and debugging
improved tool integration
simplified agent initialization
LangGraph
creation
description
installation
organizing tasks and data
LangServe
LangSmith
langsmith_search tool
LangTemplates
Language models
Large Language Model Meta AI (LLaMA)
using Hugging Face
Large language models (LLMs)
Anthropic’s Claude models
APIs
See LLM APIs
app development workflow
choose LLM and LangChain integration
client-server interaction
conceptualization
database design
define requirements
deploy application
design application architecture
implement LangChain components
incorporate data sources
iterate and optimize
monitor and maintain
preparing deployment scripts
service-oriented architecture
set up development environment
train/test with LLMs
applications
capabilities
chain
chat models
Cohere’s AI model
and developers
diversity applications
error handling and troubleshooting
Gemini
general LLM model
generating code
Google’s AI
GPT-4
on Internet
Meta AI
OpenAI
PaLM
prompts
summarizing large texts
super-reading robots
translating languages
writing stories
“Lazy load” method
Learning through experimenting
document your findings
experiment freely
share and collaborate
Legacy chains
construct
conversational apps with ConversationChain
direct application
Document Chatbot App
ConversationalRetrievalChain
Document Processing App, MapReduceChain
execution
vs. LCEL chains
Q&A Apps, RetrievalQA
simplicity
Text Generation Apps, LLMChain
types
use case
Legal analysis tool
LengthBasedExampleSelector
Length example selector
LLM APIs
business benefits
continuous improvement in language capabilities
cost-effective access to massive language models
domain-specific knowledge integration
multilingual and cross-cultural capabilities
natural language processing
rapid prototyping of AI-powered features
scalable applications
calling an OpenAI API directly
challenges
debugging and troubleshooting
latency
managing deprecation
rate limits and cost management
security concerns
choose the right use case
development environment
direct vs. LangChain
See Direct LLM API vs. LangChain
model performance
pre-built models and functionalities
prepare data
select the right model
technical benefits
advanced prompt engineering capabilities
efficient handling of context and memory
flexible integration of language models
seamless integration of multimodal inputs
simplified complex NLP tasks
LLMChain
from langchain.chains
run method
“llm_chain” template
load_and_split() method
load_chain function
load() method
load_query_constructor_runnable chain
LocalFileStore class
Longformer
M
MapReduceChain
MarkdownTextSplitter
Market analysis tool
Maximal Marginal Relevance (MMR) Example Selector
MegatronLM
Memory component
Memory concepts
Meta AI models
DPR
FastText
LLaMA
M2M-100
NLLB
OPT
PyTorch
RoBERTa
WaVE
Meta-Llama-3-8B model
M2M-100
Model fine-tuning
Model input/output (I/O)
MRKL Agent
Multimodal and security models
Multi-vector Retriever
N
Natural language processing
Natural language query
Next-generation language models
Ngram example selector
No Language Left Behind (NLLB)
O
Observability, LCEL chains
OpenAI
API key
chat model
ChatOpenAI model
GPT-3.5-turbo model
from langchain.llms
OpenAIEmbeddings class
OpenAI function calling
OpenAIFunctions Parser
OpenAI models
Codex
DALL-E 2
GPT
OpenAI’s LLMs
API key
billing information
create/open an account
OpenAI tools agent
advantage
agent creation
agent running
chat history
description
goal
initializing tools
necessary libraries
OpenAITools Parser
Open Pretrained Transformer (OPT)
os.environ.get()
OutputFixingParser
Output parsers
functions
LLM
OutputFixingParser
parse with Prompt
PydanticOutputParser, movie data
choose the LLM and its settings
create prompt template
define movie data model
generate movie information
import libraries
parse the LLM’s response
set up the OpenAI API key
types
P, Q
page_content attribute
PaLM
PaLM 2
ParentDocument Retriever
pc.create_index() function
PDFs
Personalized recommendation system
agent setup
code example
data preprocess and analysis
defining recommendation tools
ecommerce platform
generate_recommendations function
generating personalized recommendations
initializing the agent
integrating recommendations into application
recommendation prompt setup
tools setup
user data collection
user preference dataset
user_preference_tool
pip install streamlit command
Practical use cases
customer support automation
personalized recommendation system
real-time data analysis and decision-making
Pretrained model
Prompt engineering
access to knowledge
career opportunities
cost efficiency
formats
LLM models
multiple models
scalability
steps
vs. model fine-tuning
Prompts
chaining
components
output parsers
See Output parsers
prompt templates
streamlining customer service, case study
advanced engineering
customization
impact
initial design
Prompt templates
advantages
chat
context and questions
create multiline string
example selectors
create list of example
Custom Example Selector
custom implementation
define BaseExampleSelector Class
factors
few-shot learning
Length example selector
LLM
MMR example selector
Ngram example selector
prompt
similarity example selector
types
few-shot examples
instructions
from langchain.prompts
LLM
regular
pydantic
PydanticOutputParser
Pydantic Parser
PyPDFLoader class
Python command
Python development environment
configure API key
Google Colaboratory
install OpenAI library
securing the API keys
configuration files
environment variables
test the setup
PyTorch
R
RAG
See Retrieval-augmented generation (RAG)
ReAct agent
chat history
creating agent
definition
initializing tools
necessary libraries
running agent
ReAct (Reasoning and Acting) format
read() method
Real-time customer service chatbot
Real-time data analysis and decision-making
data analysis tools
data preprocess and transform
data retrieval
identifying data sources
initializing agent
integrating decisions
process_data_and_make_decision function
process real-time data and make decisions
real-world scenario
setup agent
setup data ingestion
setup the decision-making prompt
Real-world AI development
recommendation_generator_tool
RecursiveCharacterTextSplitter
Recursive splitting
Regular prompt templates
Representative models
Research assistant
response = agent.run(query)
Retrieval-augmented generation (RAG)
architecture
embed phase
high-quality
implementations
importance
LangChain components
Document loaders
See Document loaders
indexing API
See Indexing API
retrievers
See Retrievers
text embedding models
See Text embedding models
text splitters
See Text splitters
vector stores
load phase
retrieve phase
Source
store phase
transform phase
use case example
visual representation
Retrieval-based question-answering system
Retrieval component
RetrievalQA chain
Retrievers
code
Contextual Compression Retriever
LangChain
Multi-vector Retriever
ParentDocument Retriever
Self-Query Retriever
unstructured query
Vectorstore Retriever
RetryWithError Parser
returns_policy.txt
RoBERTa
S
Sec-PaLM2
select_examples method
Self-ask agents
creating agent
initializing tools
necessary tools
running agent
search capabilities
Self-Query Retriever
Sentiment analysis
Sentiment analysis app, conditional chains
Sentiment magnitude
SequentialChain class
SerpAPI
Similarity example selector
Simplified Agent initialization in v0.2
Simulation agents
Software Development Kit (SDK)
Speech recognition (Whisper)
split_text() method
SQLiteCache
Steps (or Nodes)
Streamlit Cloud
Streamlit LangChain UI app
building steps
components
indentation error in code
running
interaction
steps
stopping Streamlit server
viewing, web browser
testing
streamlit run LangChainUI.py command
Structured chat agent
chat history
creating agent
defining helper function
definition
initializing tools
LangChain community tools
running agent
Structured Tool Agent
StuffDocumentsChain
Super-reading robots
SystemMessagePromptTemplate
System prompt
T
Task Allocation App, router chains
challenges
example
implementation
outcomes
streamlining customer service operations
Task-specific agents
tavily_search_results_json tool
Tavily search tool
Text and image processing models
Text completion model
text_content=False parameter
Text embedding models
cache embeddings
customer reviews
embed_documents method
embed_query method
find the documents
information retrieval system
install required packages
vs. keyword search
LangChain
OpenAI API key
OpenAIEmbeddings class
query text
reviews
vectors
vector stores asynchronously
TextLoader class
Text splitters
code example
CodeTextSplitter
HtmlTextSplitter
langchain-text-splitters package
MarkdownTextSplitter
RecursiveCharacterTextSplitter
Recursive splitting
token
TokenTextSplitter
text_splitters and tiktoken packages
Text-to-speech (TTS)
The Restaurant Recommendation Agent
Tokenizer
TokenTextSplitter
to_messages() method
Toolkits
Tools
abstraction, key components
custom schema
definition
essential elements
properties
WikipediaQueryRun
Transformer-XL
Trigger
try-except block
TTS
See Text-to-speech (TTS)
TwitterTweetLoader
U
user_input variable
User prompt
V
ValueError exception
Vectorstore Retriever
Vector stores
load source data
query vector store
retrieve “most similar”
verbose parameter
Visual Question Answering (VQA)
W
WaVE
Web application
WebBaseLoader
WikipediaQueryRun tool
X
XLNet
XML Parser
Y
YAML Parser
Z
Zero-Shot-React agents
advantage
definition
disadvantage
required tools
zero-shot-react-description agent