tio-boot 案例 - 整合 function call

function call 简介

Function Call 的典型案例 —— 天气查询插件。

问「我明天应该穿什么?」此时 GPT 完全理解了我的问题,而且知道穿衣建议是需要了解相关天气情况的。因此它问了我在哪个城市,并且在获得该信息后,调用了天气查询插件。并基于天气查询返回的结果,告诉了我穿衣建议。

但在 GPT 与 Function Call 的加持下,开发者可以轻松实现用户模糊输入的意图识别,并转换为结构化的系统指令,与现有系统做集成。

天气插件的实现原理:Function Call 调用揭秘

来拆解一下上述的交互流程,仔细看看这个插件的交互是如何实现的。这样的一次交互,包含了 5 个对话:

  • 用户提出了一个问题:「我明天应该穿什么?」
  • AI 提问「请告诉你所在的城市」;
  • 用户回答完城市;
  • 输出调用天气插件所需的参数;
  • 通过程序调用获取最近一周的天气数据,获取数据发送到大模型
  • AI 输出穿衣建议;

为了更方便理解本质,我会将上述 5 条消息,转换为伪代码的语法,来方便大家理解:

#1

- role: user
  content: 我明天应该穿什么?

#2

- role: assistant
  content: 请告诉你所在的城市

#3

- role: user
  content: 杭州

#4

- role: assistant
  content: "{"name": "realtimeWeather","arguments": "{\n \"city\": \"杭州\"\n}"}

#5

- role: function
  name: realtimeWeather
  content: [{
  "city": "杭州市",
  "adcode": "330100",
  "province": "浙江",
  "reporttime": "2023-08-10 00:02:43",
  "casts": [
  {
  "date": "2023-08-10",
  "week": "4",
  "dayweather": "雷阵雨",
  "nightweather": "多云",
  "daytemp": "35",
  "nighttemp": "25",
  "daywind": "北",
  "nightwind": "北",
  "daypower": "≤3",
  "nightpower": "≤3",
  "daytemp_float": "35.0",
  "nighttemp_float": "25.0"
  },
  ]
  }]

#6

- role: assistant
  content: 根据杭州的天气预报...

上述伪代码中, #1 ~ #3 和 1~3 步没有任何区别。而 #4 ~ #5 在用户侧的感知就是第 4 步。#6 对应的则是第 5 步。接下来详细讲讲这 6 步到底发生了什么。

交互消息格式

理解意图

首先是 #1 ,用户提问「我明天应该穿什么?」,这背后,系统发送给 AI 的请求的信息是下面这样的:

{
  "messages": [
    {
      "role": "user",
      "content": "我明天应该穿什么?"
    }
  ],
  "functions": [
    {
      "name": "realtimeWeather",
      "description": "获取当前天气情况",
      "parameters": {
        "type": "object"
        "properties": {
          "city": {
            "description": "城市名称",
            "type": "string"
          }
        },
        "required": [
          "city"
        ],
      }
    }
  ]
}

除了 messages 以外,我们还向 GPT 传递了一个 function 的列表 s,新版的接口支持使用了 useTool 参数,这个后面在将,在这个列表中,我们使用 JSON Schema 描述了 GPT 可以调用的方法 ,即 realtimeWeather 。 我们在这个方法的描述中介绍了这个方法的作用。同时这个方法支持传入 city 这一个参数,参数名为城市,且 city 这个参数是必填的。

收集必要信息

讲完了 #1,接下来看 #2,AI 的消息是:「请告诉你所在的城市」。注意到了没,其实从这一步开始,GPT 已经识别了用户的意图,并试图尝试去调用 realtimeWeather 的外部方法。但是由于这个方法中的入参 city 是个必填项,而此时它并不知道,因此需要从用户侧了解到该信息。 再来看 #3,用户回答了 「杭州」,此时我们再来看下发送给 GPT 的消息:

{
  "messages": [
    {
      "role": "user",
      "content": "我明天应该穿什么?"
    },
    {
      "role": "assistant",
      "content": "请告诉你所在的城市?"
    },
    {
      "role": "user",
      "content": "杭州"
    },
  ],
  "functions": [
    {
      "name": "realtimeWeather",
      "description": "获取当前天气情况",
      "parameters": {
        "type": "object"
        "properties": {
          "city": {
            "description": "城市名称",
            "type": "string"
          }
        },
        "required": [
          "city"
        ],
      }
    }
  ]
}

基于 #1~#2 的分析和 #3 的消息,我们现在已经知道: GPT 准确识别了用户的意图,并想要尝试调用 realtimeWeather 的方法; 截止 #3,realtimeWeather 所需要的参数(city),在会话中已经齐全。

返回意图识别结果

那么接下来就看其中最重要的 #4 。#4 的返回消息如下:

{
  "role": "assistant",
  "content": "{"name": "realtimeWeather","arguments": "{\n  \"city\": \"杭州\"\n}"}
}

可以看到 GPT 在 content 中返回了一串 JSON 内容,将其格式化:

{
  "name": "realtimeWeather",
  "arguments": {
    "city": "杭州"
  }
}

通过上述 JSON 可以发现, GPT 准确返回了它想要调用的方法名称 realtimeWeather 与相应的参数 "city": "杭州"。是不是有点 Amazing ?这就是 Function Call 的特性。 为了让机器理解人类的意图,过去我们想方设法去「约束用户行为」或者「猜测用户意图』。但时代已经开始变了,通过 Function Call, 我们只需要在发送给 GPT 请求时加一个 functions 的参数,告知 AI 可以调用的外部方法有什么,然后 AI 就能够自动分析问题的上下文,并通过多轮对话来收集必要的调用参数,最后拼合返回调用方法的 JSON。 这对于开发者来说已经友好到极致了,研发成本大大降低,但效果又大大提升。

外部 API 调用

OK,接下来再看 #5。这一步其实也是之前非常困惑我的地方。我曾以为是 OpenAI 会在背后帮我们执行一个什么插件的服务调用。但通过自行实现一遍后才发现其实并不是这样。 #5 做的事情,本质上就是一次常规的 API 调用。因为当我们获取到调用方法的指令之后,如何运行这个指令,已经和 GPT 的接口无关了。你可以自行决定这个 API 应该如何调用。 比如在上述天气预报的查询结果,就是调用了一下高德的天气预报接口,返回的结果如下:

[
  {
    "city": "杭州市",
    "adcode": "330100",
    "province": "浙江",
    "reporttime": "2023-08-10 00:02:43",
    "casts":
    [
      {
        "date": "2023-08-10",
        "week": "4",
        "dayweather": "雷阵雨",
        "nightweather": "多云",
        "daytemp": "35",
        "nighttemp": "25",
        "daywind": "北",
        "nightwind": "北",
        "daypower": "≤3",
        "nightpower": "≤3",
        "daytemp_float": "35.0",
        "nighttemp_float": "25.0"
      },
      {
        "date": "2023-08-11",
        "week": "5",
        "dayweather": "晴",
        "nightweather": "晴",
        "daytemp": "35",
        "nighttemp": "25",
        "daywind": "北",
        "nightwind": "北",
        "daypower": "≤3",
        "nightpower": "≤3",
        "daytemp_float": "35.0",
        "nighttemp_float": "25.0"
      },
      {
        "date": "2023-08-12",
        "week": "6",
        "dayweather": "多云",
        "nightweather": "多云",
        "daytemp": "35",
        "nighttemp": "26",
        "daywind": "北",
        "nightwind": "北",
        "daypower": "≤3",
        "nightpower": "≤3",
        "daytemp_float": "35.0",
        "nighttemp_float": "26.0"
      },
      {
        "date": "2023-08-13",
        "week": "7",
        "dayweather": "雷阵雨",
        "nightweather": "多云",
        "daytemp": "35",
        "nighttemp": "25",
        "daywind": "西南",
        "nightwind": "西南",
        "daypower": "≤3",
        "nightpower": "≤3",
        "daytemp_float": "35.0",
        "nighttemp_float": "25.0"
      }
    ]
  }
]

那返回的结果应该如何和 AI 的会话集成在一起?OpenAI 在 GPT 系列的消息类型中,专门为外部接口的请求结果,定义了 function 这样一种类型,与 user 、system、assistant 区分开来。(我个人猜测 GPT 应该会针对这类消息做优化,用于提取其中的有效信息) 我们在会话应用中,就需要按照 GPT 的规范,构造出一个 function 的消息:

- role: function
  name: realtimeWeather
  content: "[{ "city": "杭州市", ... }]"

其中 name 字段是必填的,为这个方法的调用名称。content 字段直接放入接口返回内容(需要转成字符串)。

解读 API 返回的结果

构造完成 function 消息后,我们需要将 #1~#5 的消息再次发送给 GPT。此时的请求消息如下:

{
    "messages": [
      {
        "role": "user",
        "content": "我明天应该穿什么?"
      },
      {
        "role": "assistant",
        "content": "请告诉你所在的城市?"
      },
      {
        "role": "user",
        "content": "杭州"
      },
      {
        "role": "assistant",
        "content": "{"name": "realtimeWeather","arguments": "{\n  \"city\": \"杭州\"\n}"}
      },
       {
        "role": "function",
        "name": "realtimeWeather",
        "content": "..."
      }
    ],
}

这一轮消息已经包含了用户的原始问题(#1)、必要入参的收集(#2、#3)、调用 API 的 JSON 指令(#4)与 外部 API 调用的结果(#5)。另外之前需要传入的 functions 字段,已经不需要再传入了。因为 functions 只是为了让 AI 来决策使用什么外部方法。当外部方法已经完成调用后,就不再需要传入。那接下来就是见证魔法的时刻:

{
  "role": "assistant",
  "content": "当前杭州的天气是晴天,温度在 36 度左右。明天预计会有雷阵雨,温度在 26-35 度之间。因此,建议你今天穿短袖和短裤,明天则需要带一把雨伞,并穿一些防水的衣物,同时也不要忘记防晒。"
}

GPT 在上述 5 条消息的输入下,告诉我了我明天穿衣的建议,并贴心地提醒我记得带一把伞。这一轮会话也就此结束。

function call 结合 租房 Api 实现租房查询

租房 API

假设租房 Api 需要 5 个参数

  • 地理位置
  • 半径
  • 排序方式
  • 最低价格
  • 最高价格

编写 function call

我编写了一个简单的 function call,内容如下

{
  "messages": [
    {
      "role": "system",
      "content": "#(init_prompt)"
    }
  ],
  "tools": [
    {
      "type": "function",
      "function": {
        "name": "find",
        "description": "Call the facebook maketspace rental api",
        "parameters": {
          "type": "object",
          "properties": {
            "location": {
              "description": "physical location",
              "type": "string"
            },
            "radius": {
              "description": "Radius range",
              "type": "string"
            },
            "sortord": {
              "description": "sortord,suport Recommended|Price: Low to High|Price: High to Low|Distance: Near to Far|Distance: Far to Near|Date: New to Old",
              "type": "string"
            },
            "lowestPrice": {
              "description": "lowest price",
              "type": "string"
            },
            "highestPrice": {
              "description": "highest price",
              "type": "string"
            }
          },
          "required": ["location", "radius", "lowestPrice", "highestPrice"]
        }
      }
    }
  ]
}

编写推理代码

推理代码如下,也非常简单,没有什么需要讲解的

import java.io.IOException;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

import com.alibaba.fastjson2.JSONArray;
import com.alibaba.fastjson2.JSONObject;
import com.jfinal.template.Engine;
import com.jfinal.template.Template;
import com.litongjava.openai.chat.ChatMessage;
import com.litongjava.openai.client.OpenAiClient;
import com.litongjava.openai.constants.OpenAiModels;
import com.litongjava.tio.utils.json.FastJson2Utils;

import lombok.extern.slf4j.Slf4j;
import okhttp3.Response;

@Slf4j
public class RentHousingService {

  public String rent(String question, List<ChatMessage> historyMessages) {

    // 1.渲染模版
    Engine engine = Engine.use();
    Template template = engine.getTemplate("fn_call_housing.txt");

    Map<String, Object> values = new HashMap<>();
    values.put("init_prompt", "You're a rental assistant");
    String requestBodyString = template.renderToString(values);

    // set model
    JSONObject jsonObject = FastJson2Utils.parseObject(requestBodyString);
    jsonObject.put("model", OpenAiModels.gpt_4o_mini);
    JSONArray messages = jsonObject.getJSONArray("messages");
    // add history
    if (historyMessages != null) {
      messages.addAll(historyMessages);
    }
    // add user question
    messages.add(new ChatMessage("user", question));
    String string = jsonObject.toString();
    log.info("reqeustBody:{}", string);

    // 2.大模型推理
    try (Response response = OpenAiClient.chatCompletions(string);) {

      String bodyString = response.body().string();
      if (response.isSuccessful()) {
        return bodyString;
      } else {
        throw new RuntimeException(bodyString);
      }
    } catch (IOException e) {
      throw new RuntimeException(e);
    }

  }
}

测试推理方法-单轮对话

import java.util.ArrayList;
import java.util.List;

import org.junit.Test;

import com.litongjava.jfinal.aop.Aop;
import com.litongjava.open.chat.config.EnjoyEngineConfig;
import com.litongjava.openai.chat.ChatMessage;
import com.litongjava.tio.utils.environment.EnvUtils;

public class RentHousingServiceTest {

  @Test
  public void test() {
    String question = "Help me find a house in 5 km near sjsu, the price is between 1000-1500usd, the price is sorted from low to high";
    EnvUtils.load();
    new EnjoyEngineConfig().config();

    String rent = Aop.get(RentHousingService.class).rent(question, null);
    System.out.println(rent);
  }
}

模型的返回数据如下

{
  "id": "chatcmpl-9nLSoEDiIYOwgMYxobPBg9XGwEV2O",
  "object": "chat.completion",
  "created": 1721547606,
  "model": "gpt-4o-mini-2024-07-18",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": null,
        "tool_calls": [
          {
            "id": "call_xfVRkzhDNNfNSqZ3L2OD8iFM",
            "type": "function",
            "function": {
              "name": "find",
              "arguments": "{\"location\":\"sjsu\",\"radius\":\"5km\",\"sortord\":\"Price: Low to High\",\"lowestPrice\":\"1000\",\"highestPrice\":\"1500\"}"
            }
          }
        ]
      },
      "logprobs": null,
      "finish_reason": "tool_calls"
    }
  ],
  "usage": {
    "prompt_tokens": 153,
    "completion_tokens": 41,
    "total_tokens": 194
  },
  "system_fingerprint": "fp_611b667b19"
}

可以看到 arguments 已经包含了所需的参数,接下来的要做的就是解析参数,调用外部 API

测试推理方法-多轮对话

测试代码

  @Test
  public void testMultiAsk04() {
    String firstQuestion = "Help me find a house";
    String firstAnswer = "I'd be happy to help you find a house! Could you please provide me with the following details?\\n\\n1. Location (city or neighborhood)\\n2. Radius (how far you're willing to search)\\n3. Price range (minimum and maximum)\\n4. Any specific features or requirements you're looking for (e.g., number of bedrooms, bathrooms, pet-friendly, etc.)?";
    String secondQuestion = "sjsu";
    String secondAnswer = "Could you please provide more context or specify what you're looking for related to San Jose State University (SJSU)? Are you interested in rental listings near the university, information about the campus, or something else?";
    String thirdQuestion = "5km,1000-1500 usd";
    String thirdAnswer = "Could you please provide me with the physical location (city or address) you want me to search for rental listings within a 5 km radius, and any specific sorting preferences you have (e.g., price, distance, etc.)?";

    EnvUtils.load();
    new EnjoyEngineConfig().config();

    List<ChatMessage> history = new ArrayList<>();
    history.add(new ChatMessage("user", firstQuestion));
    history.add(new ChatMessage("assistant", firstAnswer));
    history.add(new ChatMessage("user", secondQuestion));
    history.add(new ChatMessage("assistant", secondAnswer));
    history.add(new ChatMessage("user", thirdQuestion));
    history.add(new ChatMessage("assistant", thirdAnswer));

    String question = "San Jose,price is sorted from low to high";

    String rent = Aop.get(RentHousingService.class).rent(question, history);
    System.out.println(rent);
  }

request

{
  "messages": [
    { "role": "system", "content": "You're a rental assistant" },
    { "content": "Help me find a house", "role": "user" },
    {
      "content": "I'd be happy to help you find a house! Could you please provide me with the following details?\\n\\n1. Location (city or neighborhood)\\n2. Radius (how far you're willing to search)\\n3. Price range (minimum and maximum)\\n4. Any specific features or requirements you're looking for (e.g., number of bedrooms, bathrooms, pet-friendly, etc.)?",
      "role": "assistant"
    },
    { "content": "sjsu", "role": "user" },
    {
      "content": "Could you please provide more context or specify what you're looking for related to San Jose State University (SJSU)? Are you interested in rental listings near the university, information about the campus, or something else?",
      "role": "assistant"
    },
    { "content": "5km,1000-1500 usd", "role": "user" },
    {
      "content": "Could you please provide me with the physical location (city or address) you want me to search for rental listings within a 5 km radius, and any specific sorting preferences you have (e.g., price, distance, etc.)?",
      "role": "assistant"
    },
    { "content": "San Jose,price is sorted from low to high", "role": "user" }
  ],
  "tools": [
    {
      "type": "function",
      "function": {
        "name": "find",
        "description": "Call the facebook maketspace rental api",
        "parameters": {
          "type": "object",
          "properties": {
            "location": { "description": "physical location", "type": "string" },
            "radius": { "description": "Radius range", "type": "string" },
            "sortord": {
              "description": "sortord,suport Recommended|Price: Low to High|Price: High to Low|Distance: Near to Far|Distance: Far to Near|Date: New to Old",
              "type": "string"
            },
            "lowestPrice": { "description": "lowest price", "type": "string" },
            "highestPrice": { "description": "highest price", "type": "string" }
          },
          "required": ["location", "radius", "lowestPrice", "highestPrice"]
        }
      }
    }
  ],
  "model": "gpt-4o-mini"
}

response

{
  "id": "chatcmpl-9nLlJGRzyquD0CRpiZeZcoVQrAWir",
  "object": "chat.completion",
  "created": 1721548753,
  "model": "gpt-4o-mini-2024-07-18",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": null,
        "tool_calls": [
          {
            "id": "call_QKzPD9xz6hRFIYHt6BaUZhSJ",
            "type": "function",
            "function": {
              "name": "find",
              "arguments": "{\"location\":\"San Jose\",\"radius\":\"5km\",\"sortord\":\"Price: Low to High\",\"lowestPrice\":\"1000\",\"highestPrice\":\"1500\"}"
            }
          }
        ]
      },
      "logprobs": null,
      "finish_reason": "tool_calls"
    }
  ],
  "usage": {
    "prompt_tokens": 338,
    "completion_tokens": 40,
    "total_tokens": 378
  },
  "system_fingerprint": "fp_8b761cb050"
}