大模型系列：OpenAI使用技巧_在合成的函数调用数据上进行微调

本文链接：https://blog.csdn.net/wjjc1017/article/details/135355959

文章目录

这个笔记本介绍了如何进行微调以提高函数调用的准确性和可靠性。
您可以在这里找到有关函数调用的更多信息，
以及有关微调的更多信息 here。

为了提供背景，以上的函数调用笔记本中：

functions是聊天完成API中的可选参数，可用于提供函数规范。其目的是使模型能够生成符合提供规范的函数参数。请注意，API实际上不会执行任何函数调用。开发人员需要使用模型输出来执行函数调用。

函数调用是一个非常强大的工具，当它按预期工作时。然而，我们发现随着函数数量的增加和任务复杂性的增加，函数调用变得不太准确（例如：更多的幻觉调用和错误调用）。

在进行函数调用的微调之前，最好从以下几个方面开始：

改进函数定义。使其更清晰，更与其他函数区分开来。
尝试使用提示工程：通常更详细的提示可以帮助模型调用正确的函数。

如果上述步骤无法将函数调用改进到令人满意的水平，那么可以尝试进行函数调用的微调。

概述

本笔记本包含三个部分：

评估基线函数调用性能： 在给定的函数上评估开箱即用的 gpt-3.5-turbo 模型（假设由于延迟和成本原因，我们不能将 gpt-4 用作无人机驾驶员）
生成合成数据： 使用 gpt-4 创建“黄金”提示和函数调用集，用作训练数据
微调： 运行微调作业，并评估微调模型

注意：本笔记本提供了一个示例，说明如何仅凭函数列表创建合成训练数据以进行函数调用微调。虽然实际生产测试评估更可取，但这种方法可以产生强大的结果，并可与实际训练数据一起使用。

获取基准函数调用性能

# !pip install tenacity
# !pip insta openai
# !pip install typing

# 导入所需的库和模块
import openai  # 导入openai库
import numpy as np  # 导入numpy库
import json  # 导入json库
import os  # 导入os库
from openai import OpenAI  # 导入OpenAI类
import itertools  # 导入itertools库
from tenacity import retry, wait_random_exponential, stop_after_attempt  # 导入tenacity库中的retry、wait_random_exponential和stop_after_attempt函数
from typing import Any, Dict, List, Generator  # 导入typing库中的Any、Dict、List和Generator类
import ast  # 导入ast库

# 创建OpenAI客户端实例
client = OpenAI()

实用工具

让我们定义实用函数来调用聊天完成 API，一个用于获取完成，另一个用于获取函数调用。

# 定义一个函数，函数名为get_chat_completion，接收三个参数：messages、model、max_tokens、temperature、stop和functions
def get_chat_completion(
    messages: list[dict[str, str]],
    model: str = "gpt-4",
    max_tokens=500,
    temperature=1.0,
    stop=None,
    functions=None,
) -> str:
    # 定义一个字典params，包含model、messages、max_tokens、temperature和stop这些键值对
    params = {
   
   
        'model': model,
        'messages': messages,
        'max_tokens': max_tokens,
        'temperature': temperature,
        'stop': stop,
    }
    # 如果有functions这个参数，将其加入params字典中
    if functions:
        params['tools'] = functions

    # 调用client.chat.completions.create方法，传入params字典作为参数，返回一个completion对象
    completion = client.chat.completions.create(**params)
    # 返回completion对象中的第一个choices元素的message属性
    return completion.choices[0].message

基准测试

让我们建立一个智能无人机副驾驶员。我们希望能够给副驾驶员下达指令，并让它调用该指令的函数，或者如果该指令不可行，则拒绝该请求。
我们可以首先为副驾驶员定义一个系统提示。

# 定义一个字符串常量，表示无人机系统的提示信息
DRONE_SYSTEM_PROMPT = """You are an intelligent AI that controls a drone. Given a command or request from the user,
call one of your functions to complete the request. If the request cannot be completed by your available functions, call the reject_request function.
If the request is ambiguous or unclear, reject the request."""

现在让我们为副驾驶可以执行的所有操作定义函数。

# 代码注释

# 定义一个函数列表，包含多个函数
function_list = [
    {
   
   
        "type": "function",
        "function": {
   
   
            "name": "takeoff_drone",
            "description": "Initiate the drone's takeoff sequence.",  # 启动无人机的起飞序列
            "parameters": {
   
   
                "type": "object",
                "properties": {
   
   
                    "altitude": {
   
   
                        "type": "integer",
                        "description": "Specifies the altitude in meters to which the drone should ascend.",  # 指定无人机应该上升到的高度（以米为单位）
                    }
                },
                "required": ["altitude"],  # 参数中必须包含altitude字段
            },
        },
    },
    {
   
   
        "type": "function",
        "function": {
   
   
            "name": "land_drone",
            "description": "Land the drone at its current location or a specified landing point.",  # 将无人机降落在当前位置或指定的降落点
            "parameters": {
   
   
                "type": "object",
                "properties": {
   
   
                    "location": {
   
   
                        "type": "string",
                        "enum": ["current", "home_base", "custom"],
                        "description": "Specifies the landing location for the drone.",  # 指定无人机的降落位置
                    },
                    "coordinates": {
   
   
                        "type": "object",
                        "description": "GPS coordinates for custom landing location. Required if location is 'custom'.",  # 自定义降落位置的GPS坐标。如果location为'custom'，则必填。
                    },
                },
                "required": ["location"],  # 参数中必须包含location字段
            },
        },
    },
    {
   
   
        "type": "function",
        "function": {
   
   
            "name": "control_drone_movement",
            "description": "Direct the drone's movement in a specific direction.",  # 指导无人机朝特定方向移动
            "parameters": {
   
   
                "type": "object",
                "properties": {
   
   
                    "direction": {
   
   
                        "type": "string",
                        "enum": ["forward", "backward", "left", "right", "up", "down"],
                        "description": "Direction in which the drone should move.",  # 无人机应该移动的方向
                    },
                    "distance": {
   
   
                        "type": "integer",
                        "description": "Distance in meters the drone should travel in the specified direction.",  # 无人机应该在指定方向上行驶的距离（以米为单位）
                    },
                },
                "required": ["direction", "distance"],  # 参数中必须包含direction和distance字段
            },
        },
    },
    {
   
   
        "type": "function",
        "function": {
   
   
            "name": "set_drone_speed",
            "description": "Adjust the speed of the drone.",  # 调整无人机的速度
            "parameters": {
   
   
                "type": "object",
                "properties": {
   
   
                    "speed": {
   
   
                        "type": "integer",
                        "description": "Specifies the speed in km/h.",  # 指定速度（以千米/小时为单位）
                    }
                },
                "required": ["speed"],  # 参数中必须包含speed字段
            },
        },
    },
    {
   
   
        "type": "function",
        "function": {
   
   
            "name": "control_camera",
            "description": "Control the drone's camera to capture images or videos.",  # 控制无人机的相机以拍摄图像或视频
            "parameters": {
   
   
                "type": "object",
                "properties": {
   
   
                    "mode": {
   
   
                        "type": "string",
                        "enum": ["photo", "video", "panorama"],
                        "description": "Camera mode to capture content.",  # 拍摄内容的相机模式
                    },
                    "duration": {
   
   
                        "type": "integer",
                        "description"