Functionality
the co-founder and CEO of Nexusflow, as well as an assistant professor of EECS and Statistics at UC Berkeley,
and Venkat Srinivasan, who is a founding engineer at Nexusflow.
Large language models (LLMs) are trained on text, which is a form of unstructured data, but our
computing infrastructure is largely built on structured data with well-defined interfaces like
APIs that expect data in a certain format. Function-calling bridges this gap. Here's how LLMs and
function-calling work: In the prompt sent to a function-calling capable LLM, descriptions of functions
available to the LLM are included. These descriptions include text describing what a function
does, so the LLM knows when to use that function, as well as additional information needed to call that
function, such as the function's name and a description of its arguments.
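For example, a prompt might describe a hypothetical weather-lookup function like this (a sketch; the exact format varies from model to model, and the function name here is illustrative):

    def get_current_temperature(city: str) -> float:
        """
        Returns the current temperature, in degrees Celsius, for the given city.

        Args:
        city (str): the name of the city, for example "New York".
        """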
When the LLM receives a query that it determines is best served by calling a function, it will generate the
needed parameters from the query and return a string that could be used to call that function.
Notice that the LLM doesn't call the function directly; it just returns a string that could be used to
call that function. These functions are often referred to as tools and can be used to extend the
capabilities of a chatbot or to build agents, such as a research agent whose tools might include web
search or Wikipedia lookups. Function-calling has proven useful in many applications beyond chat.
For example, DeepLearning.AI uses simple AI agents that we build internally to analyze learner
feedback to keep improving our courses. Of course, our core team does read your feedback, and we
appreciate you taking the time to provide it.
To gather statistics, we provide an LLM with a prompt that includes a description of a function called
"record learner feedback." They can record the sentiment and rating as well as report any technical
problems described. The LLM that we used for this is actually a special-purpose LLM called
NexusRavenV2-13B, fine-tuned for function-calling by Nexusflow. NexusRavenV2-13B is an open-
source model you can download from HuggingFace. You can also use a hosted version available on
our site. This is the model you'll be using in this course.
Many applications do not require the full capability of a general-purpose foundation model.
NexusRavenV2-13B has only 13 billion parameters but can outperform GPT-4 in some
function-calling benchmarks. This and other fine-tuned smaller models are small enough to be
locally hosted, eliminating the latency and cost barriers that might prevent you from adding a natural
language interface to your applications.
There are many instances where you may want to convert natural language inputs into structured
outputs, like a function call. Your business may have a library of functions that perform dashboard
operations, and you would like to add a natural language interface. Or you might have applications
that need to process transcripts, notes, or proceedings and store them in a database. Every business
has its own unique applications.
In this course, Venkat will be showing you how you can add a function-calling capability to your
applications. We'll start by taking a deeper dive into what function-calling is and how you can use it.
You will form prompts with function definitions, as Andrew described, and then you'll use the LLM
response to call those functions. Once you have mastered that, we'll kick things up a notch by
defining and calling multiple functions, including nested functions, where arguments to one function
are themselves functions.
Many services on the web have APIs defined using an OpenAPI description. You'll also learn to
convert these specifications into functions callable by your LLM. You will finish the course with a
practical application that takes customer service transcripts and builds SQL calls to store selected
data to a database.
Many people have worked to create this course. From Nexusflow, I'd like to thank Jian Zhang and
Banghua Zhu. From DeepLearning.AI, Geoff Ladwig and Esmaeil Gargari have also contributed to this
course. There are lots of exciting things to do with function-calling in this course. Let's go on to the next video to
get started, and I hope this course will help you make your applications highly functional.
In the introduction, you learned a bit about what function calling is. In this lesson, we will describe it
in more detail, and you will get some hands-on experience. Let's dive in.
So, what is function calling? As described in the introduction, function calling is the capability of a
large language model (LLM) to take in a natural language query, along with a description of the
function, and output a string that can be used to call that function. Consider this example: you would
like to know the temperature in New York, and you have a function that can provide you that value.
Without function calling, your LLM can't help you out. With function calling, the LLM can take your
query and a description of the function, recognize that it can use functions as defined in the prompt
to answer the query, and generate a string that can be used to invoke that function. In this case, it
would generate a call to the function with the city argument set to New York. You can then execute
that function and return the result to the LLM, which can now properly answer the question.
Note here: even though they're called function-calling LLMs, they only generate a string; they don't
actually make the call. You have to do that. It's worthwhile to draw a distinction between general-
purpose LLMs and special-purpose LLMs. General-purpose LLMs respond to all types of queries,
which can also include function-calling queries. On the other hand, special-purpose LLMs are fine-
tuned to focus on a single task or a small set of tasks. An example of this is NexusRavenV2-13B, which
has been fine-tuned to provide function-calling services and will always try to return a function call given a user
query. Special-purpose LLMs can be smaller and offer better latency than general-purpose LLMs.
Because they're fine-tuned for this task, they can often outperform general-purpose models for the
tasks that they're trained for, such as function calling.
Now, I've used a couple of different terms here, including function calling, tools, and so on. What is
really the difference? Well, function calling is the name given to the LLM capability of forming the
string containing the function call. Tools are the actual functions that are being called. With that
being said, let's make it more concrete and build some tools. Let's start by building local Python
tools. We'll write a tool that relies on matplotlib. It needs two inputs: x and y, and plots the
coordinates specified by x and y. An example of this tool is if a user says they want to plot y = 10x for
a certain set of values for x. You can use this tool to answer the user query. The way you would do
that is to write a function call that looks roughly like this: you would pull out the details for x = 1, 2, 3,
and because of the transformation function requested, you will map those values to 10, 20, 30 on
the y-axis.
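As a sketch, the plotting tool might look something like this (the exact name and signature used in the course notebook may differ):

    import matplotlib.pyplot as plt

    def plot_some_points(x: list, y: list):
        """
        Plots the points specified by the lists of x and y coordinates.

        Args:
        x (list): the x-coordinates of the points to plot.
        y (list): the y-coordinates of the points to plot.
        """
        plt.plot(x, y, marker="o")
        plt.show()

For the query above, the call string you would want is roughly plot_some_points(x=[1, 2, 3], y=[10, 20, 30]).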
While you can do this, we want the LLM to do this for us. So, the way you would do this is to provide
the tool description and the user query to the LLM. Let's start with the tool description. You start by
providing the prototype of the function you wrote earlier. Raven, the LLM you'll be using, expects
Python-formatted functions, so you'll use that format. You'll also add a description of what the
tool does. This will tell the LLM what this function or tool is meant to do and will improve its
reasoning about whether it should use this tool to answer the user query. You will also provide the
user query to the LLM. Recall the user query was to plot y = 10x, which we can simply add here, and
then we're done. So, this is essentially our tool description, right? You have provided a tool and a
user query. You will call the function-calling LLM named Raven. You can do this through the
query_raven helper function. The result is a string. The function name came from the function prototype. The
arguments came from the user query, with the LLM actually doing some math to generate ten times
the inputs. Cool. Now you can execute the string like this. Great. Exactly what we expected. Try this
on your own. Here's the user query and a call you can modify. Try changing the query to produce
different results.
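Putting that together, and going by the prompt template published on the NexusRaven-V2 model card, the flow looks roughly like this. Here query_raven stands in for whatever helper you use to call the hosted model, so treat the details as a sketch:

    raven_prompt = '''
    Function:
    def plot_some_points(x: list, y: list):
        """
        Plots the points specified by the lists of x and y coordinates.

        Args:
        x (list): the x-coordinates of the points to plot.
        y (list): the y-coordinates of the points to plot.
        """

    User Query: Plot y = 10x for x = 1, 2 and 3.<human_end>
    '''

    # query_raven is assumed to return just the generated call string,
    # e.g. 'plot_some_points(x=[1, 2, 3], y=[10, 20, 30])'
    call = query_raven(raven_prompt)
    exec(call)  # you, not the LLM, execute the call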
So, let's dive deeper into what you just did. You told the LLM via the prompt how to format the
function you've defined earlier by naming the functions in a pythonic format, and you've also
provided in the form of pythonic arguments the information you want the LLM to extract from the
user input. You have also provided via the description the information necessary to understand when
a tool is relevant for a user query. It's important to note that the LLM has been trained to recognize
function calls. This format is very specific to such LLMs. The LLM will respond to your user query with
a string that you can use to call the function.
Now let's take a look at more complicated examples. Similar to earlier, you will use a function that
relies on matplotlib, but you will make it do more. You write a function that draws a clown face, and
you will parameterize it using three arguments that control the clown's face, eyes, and nose color.
The exact implementation of this function isn't too critical, so allow me to quickly implement it, and
we can move forward. Similar to earlier, what you will do is to provide the prompt that will contain
the tool descriptions and the user query to your LLM. Let's say that you want the LLM to draw a pink
clown face with a red nose. Format this and create the prompt. Finally, take a look at the prompt you
have. As you can see, you have the function description identified with the function tag. You can see
that the LLM can identify from the function description and the user query, first that it should invoke
this function, and second, it can fill in the arguments face color, eye color, and nose color from the
information in the user query. Now call the function-calling LLM and look at
the code string that it has generated. It has extracted the necessary information from the user
prompt, which is the face color being pink and the nose color being red. Now you will run it. In this
case, it's a pink clown face with a red nose, which is exactly what we expected. Now let's try writing
your own clown. Here is a framework where you can enter your own query or change the prompt
and create your own clown. Give it a try.
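If you would like a self-contained stand-in for the clown-face tool, a minimal matplotlib version might look like this (the course's actual implementation is more elaborate):

    import matplotlib.pyplot as plt
    from matplotlib.patches import Circle

    def draw_clown_face(face_color: str = "yellow",
                        eye_color: str = "black",
                        nose_color: str = "red"):
        """
        Draws a simple clown face.

        Args:
        face_color (str): color of the clown's face.
        eye_color (str): color of the clown's eyes.
        nose_color (str): color of the clown's nose.
        """
        fig, ax = plt.subplots()
        ax.add_patch(Circle((0.5, 0.5), 0.4, color=face_color))    # face
        ax.add_patch(Circle((0.35, 0.6), 0.05, color=eye_color))   # left eye
        ax.add_patch(Circle((0.65, 0.6), 0.05, color=eye_color))   # right eye
        ax.add_patch(Circle((0.5, 0.45), 0.06, color=nose_color))  # nose
        ax.set_aspect("equal")
        ax.axis("off")
        plt.show()

    draw_clown_face(face_color="pink", nose_color="red")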
So, Raven is not the only LLM that's capable of issuing function calls. Let's try using OpenAI's
function-calling on the same example. You'll import the necessary libraries and use the
GPT-3.5-turbo model. For this example, you build a client, and you'll also build a helper function that
wraps everything and lets you query the OpenAI API. With the OpenAI API, you provide a description of
the tool and its arguments in JSON format. While the description and the arguments are identical
in content to the pythonic format you had earlier, the way they are presented is a bit different.
Please pause the video and take a moment to compare the
two. What you notice in the response is that the attributes are in a format that's not directly usable
for us. Let's extract the function and the arguments using the following approach. At the conclusion
of this, you get a Python call that you can now directly execute. With this, you can see how you can
utilize OpenAI's function-calling API to also draw the clown requested by the user query from earlier.
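As a sketch of that OpenAI flow, using the openai v1 Python client (the tool schema below is an illustrative JSON rendering of the same clown-face tool, not the course's exact one):

    import json
    from openai import OpenAI

    client = OpenAI()  # assumes OPENAI_API_KEY is set in your environment

    tools = [{
        "type": "function",
        "function": {
            "name": "draw_clown_face",
            "description": "Draws a clown face with configurable colors.",
            "parameters": {
                "type": "object",
                "properties": {
                    "face_color": {"type": "string", "description": "Color of the face."},
                    "eye_color": {"type": "string", "description": "Color of the eyes."},
                    "nose_color": {"type": "string", "description": "Color of the nose."},
                },
            },
        },
    }]

    response = client.chat.completions.create(
        model="gpt-3.5-turbo",
        messages=[{"role": "user", "content": "Draw a pink clown face with a red nose."}],
        tools=tools,
    )

    # The response holds structured attributes rather than an executable string,
    # so rebuild a Python call from the name and the JSON-encoded arguments.
    tool_call = response.choices[0].message.tool_calls[0]
    args = json.loads(tool_call.function.arguments)
    call = f"{tool_call.function.name}(**{args})"
    exec(call)  # assumes draw_clown_face from earlier is defined in scope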
What you did was actually bridge the gap between the unstructured world of textual training data
that the LLM used in training, with the highly structured world of code. In the next lesson, you will
focus on variations of function calling, including parallel, multiple, and nested calling. A lot of exciting
things are coming up. So, let's go on to the next lesson.
In the last lesson, you used a large language model (LLM) to invoke a single function. In this lesson,
we're going to cover all the variations of this. There are many permutations and combinations of
functions and calls that an LLM can issue, including single calls, parallel calls, no calls, multiple
functions, and nested functions.
Parallel calls are when the LLM issues multiple function call strings in the same turn, either to the
same function or to a set of functions. Before diving into parallel calls, let's do some housekeeping. In
the last lesson, you created a function and embedded its description within a prompt. You can be
more efficient by using the function definition itself to create the prompt automatically. For example,
you can define a generic function called "afunction" and use Python's inspect module to get the
function's signature and arguments. This can be put together in a utility, such as build_raven_prompt,
to create prompts for a list of functions.
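One plausible implementation of such a utility, reusing the prompt template from earlier (the helper name follows the lesson, but the details here are a sketch):

    import inspect

    def build_raven_prompt(function_list, user_query):
        """
        Builds a Raven prompt from a list of Python functions and a user query,
        using each function's signature and docstring as its tool description.
        """
        prompt = ""
        for func in function_list:
            signature = inspect.signature(func)
            docstring = inspect.getdoc(func)
            prompt += f'Function:\ndef {func.__name__}{signature}:\n"""{docstring}"""\n\n'
        prompt += f"User Query: {user_query}<human_end>"
        return prompt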
In an example, you use a user query asking for two clowns: one with a red face and a blue nose, and
another with a blue face and a green nose. The first clown should have a sad smile, and the second a
happy smile. The new clown face function is more complex, allowing for more parameters like eye
size, mouth size, and more. You can build the Raven prompt by passing in the clown face function
and your user query. The LLM can then issue two function calls: one for a clown with a red face and
blue nose, and another for a clown with a blue face and green nose, matching the descriptions in the
user query.
Multiple functions allow the LLM to choose the right function or functions from a list. You can also
include a "no relevant query" function if none of the provided functions are relevant. For instance, if
you have a new function called "draw tie" and provide both the "draw clown face" and "draw tie"
functions to the LLM, it will use only the relevant one based on the user query. You can also combine
multiple functions and parallel function calling.
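A fallback tool of this kind can be as simple as the following sketch (the name no_relevant_function is an assumption; the lesson may spell it differently):

    def no_relevant_function(user_query: str):
        """
        Call this when no other provided function is relevant to the user query.

        Args:
        user_query (str): the user's query, repeated verbatim.
        """
        print("No relevant tool is available for:", user_query)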
Nested functions involve defining separate functions where the input to one function depends
on the output of another. The LLM can utilize one function first, call it with arguments from the user
prompt, and feed the output into another function. This can sometimes avoid multiple calls to an
LLM. For example, if you split a clown function into parts like head, eyes, nose, and mouth, and have
a function to combine them, the LLM can call each part with necessary arguments and combine the
outputs.
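To make the nesting concrete, here is a hypothetical split of the clown tool into parts plus a combiner; the function names and return values are purely illustrative:

    def clown_head(face_color: str) -> dict:
        """Returns a drawable head part with the given face color."""
        return {"part": "head", "color": face_color}

    def clown_nose(nose_color: str) -> dict:
        """Returns a drawable nose part with the given nose color."""
        return {"part": "nose", "color": nose_color}

    def draw_clown(head: dict, nose: dict):
        """Combines the clown parts and draws the complete face."""
        print("Drawing clown from parts:", head, nose)

    # For "draw a clown with a pink face and a red nose", the LLM can emit
    # one nested call rather than several separate turns:
    draw_clown(head=clown_head(face_color="pink"),
               nose=clown_nose(nose_color="red"))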
In conclusion, this lesson covered various function call variations, including parallel, multiple, and
nested functions. In the next lesson, you will use external functions, such as those defined by
OpenAPI specifications and API endpoints, to add web services to your function-calling LLM.
The last lessons have focused on invoking local functions, but there is a worldwide web of services
that you may want to use. This lesson will describe how you can use them. Let's use the web. Many
RESTful services are available online, often described using API standards such as the OpenAPI
Specification. It's very helpful to integrate function-calling LLMs with such services.
In a simple example, you can write Python code to interface with the Joke API using RESTful
methods. By submitting a GET request to the endpoint of the Joke API, you receive a JSON response
containing many keys, with the most relevant being the setup and delivery of the joke. This
demonstrates interfacing with an external resource using simple Python. However, it is not possible
to call this directly within an LLM. Instead, you need to write a tool to wrap around this endpoint.
To do this, you write a Python function with a single argument for the category. You'll provide a
string, making the URL more dynamic, with the category passed as an argument in your Python tool.
After that, you write code to submit a GET request to your dynamic URL and print out the setup and
delivery from the response JSON. At this point, you're ready to try it with your LLM.
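A sketch of such a wrapper, using the public JokeAPI endpoint (the function name and the printed fields follow the description above):

    import requests

    def get_joke(category: str = "Any"):
        """
        Fetches a two-part joke from JokeAPI for the given category.

        Args:
        category (str): joke category, e.g. "Christmas", "Programming", or "Any".
        """
        url = f"https://v2.jokeapi.dev/joke/{category}"
        response = requests.get(url, params={"type": "twopart"}).json()
        print(response["setup"])
        print(response["delivery"])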
Here's a user query and the Raven prompt. You provide the function definition and the user query
from earlier. When you call the LLM, you might get a joke like, "What goes 'oh, oh, oh'? Santa
walking backwards." This shows how you can get a joke that's on theme for December. The core idea
is adding an adapter around your external APIs that converts the Python arguments generated
by the LLM into the external API's argument format.
Many external assets use different kinds of API specifications, with OpenAPI being one of them.
Writing tools allows you to unify them. As practice, you can write a tool that uses the
OpenAPI Specification. For example, using the Open Meteo Weather API, you first download the
OpenAPI Specification in a YAML format. This YAML file describes the Open Meteo APIs, providing
descriptions of how the API behaves, including high-level descriptions and a list of paths for sending
requests to get different responses.
For example, sending a GET request to a URL with the /v1/forecast path provides a seven-day
weather forecast for certain coordinates. There are parameters you can add to your GET request,
described in the YAML, with certain types like arrays and strings with different enumerations allowed.
This allows you to change how the API provides responses for the seven-day forecast by changing
the parameters specified in the GET request.
Since this is in YAML format, you need to convert it into JSON to use the tool. First, load the YAML,
convert some data types like integers and floats by hand, and save the content back into JSON. You
can then use the OpenAPI Python generator utility to convert the JSON into Python code capable of
querying the endpoint. From here, import the Python code you wrote in the previous step. You can
provide a user query that depends on the API, such as asking for the current weather and wind
speed in New York.
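The YAML-to-JSON conversion step itself is short; here is a minimal sketch (the filenames are assumptions, and it omits the by-hand data-type fixes mentioned above):

    import json
    import yaml

    # Load the downloaded OpenAPI spec and re-save it as JSON for the
    # generator utility. The lesson also fixes up a few integer/float
    # fields by hand, which this sketch skips.
    with open("open_meteo_spec.yaml") as f:
        spec = yaml.safe_load(f)

    with open("open_meteo_spec.json", "w") as f:
        json.dump(spec, f, indent=2)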
Using the inspect approach discussed in the previous lesson, construct the Raven prompt. Notice
that in the prompt, you have the function definition, automatically constructed from the generator
step output, the string you wrote earlier, and the user query you provided. You can now send this to
Raven and run the call, which will give you the output JSON from the API, providing the
temperature and wind speed in New York.
Try it yourself with the city you're currently in. In the next lesson, you'll use Raven to do structured
extraction, such as extracting insights and detailed structured data from unstructured texts. See you
there.
So far, you have been using function calling to call functions, but it can also be used to extract
structured data from natural language input. Let's dive into structured extraction. In previous
lessons, we used function calling as a way of interfacing with tools, both internal and external. In this
lesson, we'll extend the ability of function calling to perform structured extraction.
Structured extraction is when we need to extract details and insights from unstructured texts. For
example, if we have a text like "Mary had a little lamb whose fleece was white as snow," and we want
to extract the person mentioned and the object owned by the person, we can use function calling to
extract that information. You would provide a function with the person's name and the owned object
as arguments that the LLM needs to extract from the unstructured text. This is similar to how
function calling LLMs fill in arguments from user queries when interfacing with Python and API tools.
In this example, the LLM will extract Mary as the person's name and lamb as the owned object. Let's
make this more concrete through an example. Suppose you have a passage with various names and
addresses, and you want to extract the names and map them to the addresses. You can use function
calling to do this by providing a Raven prompt with function annotation, indicating that you want
Raven to extract the names as a list of strings and the addresses as another list of strings.
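A sketch of what that extraction "function" might look like; note that its body barely matters, since the LLM only ever sees the signature and docstring:

    def extract_names_and_addresses(names: list, addresses: list) -> dict:
        """
        Extracts the people mentioned in the passage and their addresses.

        Args:
        names (list): the person names found in the passage, as strings.
        addresses (list): the addresses found in the passage, as strings,
            in the same order as the corresponding names.
        """
        return dict(zip(names, addresses))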
For more complex structured extraction, you can use data classes to communicate with Raven about
how you want to associate the information. A data class is a Python class, created with the
@dataclass decorator, whose fields are declared with type annotations. The LLM, trained on Python,
will understand this, giving you a convenient way to define parameters. In cases where one name is
associated with multiple addresses, such as John Tech being linked to several of them, the previous
approach might not work well. Here, you can define a data class called Record, with a name attribute and a list of strings for
addresses.
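A sketch of that data class and an extraction function built around it (the field names are assumptions based on the description above):

    from dataclasses import dataclass
    from typing import List

    @dataclass
    class Record:
        name: str              # the person's name
        addresses: List[str]   # every address associated with that person

    def extract_records(records: List[Record]) -> List[Record]:
        """
        Extracts one Record per person, grouping all addresses
        mentioned for that person into a single list.
        """
        return records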
Another powerful implementation of function calling is in generating valid JSONs. Sometimes, it's
challenging for smaller LLMs to generate valid JSONs. Function calling can help by defining tools with
the same level of hierarchy needed in the JSON. You can define functions that construct dictionaries
for location, country, and continent information, allowing you to convert nested calls into the desired
JSON hierarchy.
By using this approach, you can force your LLM to generate the nested hierarchy needed for your
JSON. For example, you can request city information for London, which is in the United Kingdom,
and in Europe. By passing in the tools, you can generate the JSON in the exact format you want,
making it more predictable to get valid JSONs from your LLMs.
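A sketch of that idea: three tiny tools, one per level of the hierarchy, whose nested call yields the final JSON (the exact schema is illustrative):

    def continent_info(name: str) -> dict:
        """Builds the JSON fragment for a continent."""
        return {"continent": name}

    def country_info(name: str, continent: dict) -> dict:
        """Builds the JSON fragment for a country, nested in its continent."""
        return {"country": name, "continent": continent}

    def city_info(name: str, country: dict) -> dict:
        """Builds the JSON fragment for a city, nested in its country."""
        return {"city": name, "country": country}

    # For "give me city information for London, in the United Kingdom, in Europe",
    # the LLM can generate a nested call like this one:
    result = city_info(name="London",
                       country=country_info(name="United Kingdom",
                                            continent=continent_info(name="Europe")))
    print(result)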
All right. This is your last lesson, where you will apply your new function-calling skills to a more significant
project. Let's take a look. In previous lessons, we looked at using function calling as a way of
interfacing with external tools such as APIs and internal Python tools. We also learned that using
function calling is a way of doing structured extraction, where we extracted insights about data that
was in an unstructured format. In this lesson, we will be combining all of them together into a single
lesson and building a dialog processing system as a core project. Specifically, we will be taking
transcripts of customer interactions, which are dialog exchanges between a customer service
representative and a customer, and extracting insights such as the agent name, product ID, and
customer information from the dialog data. We will then be storing this into a database using an LLM
and then extracting insights from the database via aggregated queries that require the LLM to
generate SQL code. Let's get started. First, we'll take a look at the type of data we're dealing with.
What you'll notice is the data is a list of exchanges between an agent and a customer. The customer
provides various insights such as their phone number, their email, as well as their overall sentiment
embedded in the verbiage that they use. The agent also provides details such as their name. We
would want to ideally extract all of this into a structured format and store it in a database. This will
allow us to do aggregated insights, such as analyzing how many customers specific agents have
pleased, or how many customers specific agents have not been able to address their requirements
for. Let's first start by building the tools required to build this system. We'll first define the specific
insights that we want to label as important insights we want to extract. We will be using the data
class approach because this is a far more complicated extraction task. This approach was discussed in
an earlier lesson, and we will be using that here. We will specifically be extracting the agent name,
the customer email, the customer order, the customer phone number, as well as the customer
sentiment. We will also call "exec" on this data class, just so that the Python interpreter is able to
understand our new format. We will also build the database. The database will essentially just
contain a few different columns, such as the agent name, the customer email, and the various
attributes that we want to extract. We will store it in a database called extracted.db and name the
table "customer_information". This tool just initializes the database.
Next, we also need tools to populate the database. You will define a tool called update_knowledge
that takes in a list of records, records being the data class that we added to our interpreter earlier,
and iterates over every record in your results list, inserting each into your database. You'll give it a
try using some dummy data: here you define a single Record object with some dummy data, such as
the agent name, a dummy customer order number, and a dummy sentiment. Great, you were able to
insert the records successfully.
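A sketch of update_knowledge, assuming the Record data class carries the five attributes listed earlier:

    import sqlite3

    def update_knowledge(results_list: list):
        """
        Inserts every extracted Record into the customer_information table.

        Args:
        results_list (list): Record objects extracted from a dialog.
        """
        conn = sqlite3.connect("extracted.db")
        for r in results_list:
            conn.execute(
                "INSERT INTO customer_information VALUES (?, ?, ?, ?, ?)",
                (r.agent_name, r.customer_email, r.customer_order,
                 r.customer_phone, r.customer_sentiment),
            )
        conn.commit()
        conn.close()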
Now, you will also need to pull information out of the database. Here you define a tool called
execute_sql that takes in a raw SQL string that you will run against your earlier database stored in
extracted.db, specifically against the customer_information table that you created earlier. Let's give
this a try. You'll define a SQL string that pulls the agent name from the table we've been discussing,
where the customer is happy. Running the SQL gives you the information that you need, such as
Agent Smith. This matches what we expect.
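A sketch of execute_sql, plus the happy-customer query just described:

    import sqlite3

    def execute_sql(sql: str):
        """
        Runs a raw SQL string against extracted.db and returns the rows.

        Args:
        sql (str): the SQL to execute against customer_information.
        """
        conn = sqlite3.connect("extracted.db")
        rows = conn.execute(sql).fetchall()
        conn.close()
        return rows

    print(execute_sql(
        "SELECT agent_name FROM customer_information "
        "WHERE customer_sentiment = 'happy'"
    ))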
Great. Now let's start building the pipeline. We'll first delete the sample database we had earlier
and reinitialize it. Next, we will be
downloading the customer service chatbot dataset from HuggingFace. This dataset contains a list of
traces of dialog between an agent and a customer. We will be using this dataset for this course
project. Let's download the dataset. Let's print just one sample of dialog from this dataset. Here we
will take the sixth element and print it out. What we notice is that the format of the data has a lot of
similarities to the sample data that we saw earlier. It's worthwhile to note that the agent's name here
is Alex. The customer's details include an order number of 12345, and
the customer seems frustrated. Let's pass this to Raven. You will use the inspect approach that we
discussed earlier. You would simply pull the function signature, clean up the function signature to
remove some of the extraneous details that inspect might have added, pull the function's docstring,
and build the Raven prompt. The Raven prompt simply contains the data class that you defined in
the first cell, the extraction function, and the entire text from earlier. Let's pass this to Raven. Great.
Raven was able to generate the function call, noting the name being Alex, the order number
being 12345, and the customer sentiment being frustrated, which is exactly what
we expected. We can run this and insert it into our database. Let's quickly run another example. This
time the 10th sample in the dataset. And let's also insert this into our database. It's important to
note here that the agent name in the sample is John. The order number is BB789012. And the
customer sentiment is happy. Suppose we want to pull some insights out of this. We can actually do
so using SQL, such as selecting the count of rows where the agent name is John and the
customer sentiment is happy. This is essentially
querying the database to see how many customers John has made happy. Great. We were able to
run the query and get back a result, which is one. This matches, because it corresponds to the
sample we inserted in the previous cell. But this is a bit manual. Can we make this more automatic?
Yes, we can pass the execute_sql tool that you defined earlier to Raven to get the output. You will
simply write a prompt asking how many customers John has made happy. Use the inspect approach
described earlier to pull out the function signature, pull out the function's docstring, and provide the
SQL schema to your Raven in a Raven prompt. Passing this to Raven, you get back SQL code that you
can now execute and get the exact result that you had earlier. Great! You're now ready to run this
over the entire dataset. Let's give it a try. Let's first reinitialize the database. Let's iterate over the first
ten samples in the dataset, using the same approach detailed earlier to build the Raven prompt and
call Raven. You have now populated the database using ten different samples. You can now try
running some aggregated queries here using the Inspect approach, as well as passing the same
schema as earlier. You're simply asking how many happy customers there are over those ten
samples. Let's feed this to Raven. Raven was able to successfully query and answer your question,
which is seven. Next, you can ask to get the names and phone numbers of the customers who are
frustrated, as well as their order numbers. Great. And you were able to see Raven generate the SQL
code, where it gets the agent name, the customer phone number, and the customer order number
from the table where the customer sentiment is frustrated. Here it gives you back the customer
items, including the phone numbers if they exist, as well as the order numbers and the associated
agent name for customers who are frustrated. Please give this a try by adding in an additional
requirement, such as asking for the customer's name in addition to the agent name. Hint: you need
to modify the initialization of the database, the insertion into the database, and the data class and
SQL representations. Please give it a try. In this course, we've touched on so many different things
and function calling is very versatile. We hope you had a great time exploring the different
applications of function calling.