问题重写

什么是问题重写

问题重写：提升对话式 AI 的上下文理解能力和增加检索结果的准确性

在构建对话式 AI 系统时，一个常见的挑战是如何准确理解用户的后续问题。这些问题通常依赖于之前的对话上下文，如果直接将其用于信息检索或进一步处理，可能会导致不准确或不相关的回答。为了解决这个问题，我们可以使用一种称为"问题重写"的技术。

什么是问题重写？问题重写是一个将用户的后续问题转化为独立、自包含问题的过程。这个过程考虑了之前的对话历史，确保重写后的问题包含足够的上下文信息，使得 AI 系统能够准确理解和回答。

为什么需要问题重写？

提高准确性：通过将上下文信息融入问题中，AI 系统能更准确地理解用户的意图。增强连贯性：重写后的问题能够与之前的对话自然衔接，提供更流畅的对话体验。改善信息检索：独立的问题更适合用于搜索引擎或知识库查询，提高信息检索的相关性。减少歧义：明确的问题表述可以减少歧义，降低 AI 误解的可能性。

实现问题重写的代码

让我们通过一个具体的实例来了解如何实现问题重写：

测试代码

import java.util.ArrayList;
import java.util.List;

import org.junit.Test;

import com.litongjava.jfinal.aop.Aop;
import com.litongjava.open.chat.config.EnjoyEngineConfig;
import com.litongjava.openai.chat.ChatMessage;
import com.litongjava.tio.utils.environment.EnvUtils;

public class RewriteQuestionServiceTest {

  @Test
  public void test() {
    EnvUtils.load();
    new EnjoyEngineConfig().config();
    List<ChatMessage> messages = new ArrayList<>();
    messages.add(new ChatMessage("user", "which professor is good for CS46a"));
    messages.add(new ChatMessage("assistant", "Tong Li is a good professor for CS46A"));
    String prompt = Aop.get(RewriteQuestionService.class).rewrite("tell me more about him", messages);
    System.out.println(prompt);
  }
}

这段代码创建了一个测试用例，模拟了一个简单的对话场景，并调用 RewriteQuestionService 来重写后续问题。

EnjoyEngineConfig 配置

import com.jfinal.template.Engine;
import com.litongjava.tio.utils.environment.EnvUtils;

public class EnjoyEngineConfig {

  private final String RESOURCE_BASE_PATH = "/prompt/";

  public void config() {
    Engine engine = Engine.use();
    engine.setBaseTemplatePath(RESOURCE_BASE_PATH);
    engine.setToClassPathSourceFactory();
    if (EnvUtils.isDev()) {
      // 支持模板热加载，绝大多数生产环境下也建议配置成 true，除非是极端高性能的场景
      engine.setDevMode(true);
    }

    // 配置极速模式，性能提升 13%
    Engine.setFastMode(true);
    // jfinal 4.9.02 新增配置：支持中文表达式、中文变量名、中文方法名、中文模板函数名
    Engine.setChineseExpression(true);

  }

}

这个类配置了 Enjoy 模板引擎，用于渲染问题重写的提示模板。

RewriteQuestionService 实现

import java.util.HashMap;
import java.util.List;
import java.util.Map;

import com.jfinal.template.Engine;
import com.jfinal.template.Template;
import com.litongjava.openai.chat.ChatMessage;
import com.litongjava.openai.chat.ChatResponseVo;
import com.litongjava.openai.client.OpenAiClient;

import lombok.extern.slf4j.Slf4j;

@Slf4j
public class RewriteQuestionService {

  public String rewrite(String question, List<ChatMessage> messages) {

    // 1.渲染模版
    Engine engine = Engine.use();
    Template template = engine.getTemplate("rewrite_question_prompt.txt");

    Map<String, Object> values = new HashMap<>();
    values.put("messages", messages);
    values.put("query", question);
    String prompt = template.renderToString(values);

    log.info("prompt:{}", prompt);

    // 2.大模型推理
    ChatResponseVo chatCompletions = OpenAiClient.chatCompletions("system", prompt);
    String content = chatCompletions.getChoices().get(0).getMessage().getContent();
    // 3.返回推理结果
    return content;
  }
}

这个服务类负责执行问题重写的核心逻辑：

使用 Enjoy 模板引擎渲染提示模板。
调用 OpenAI API 进行问题重写。
返回重写后的问题。

提示模板 rewrite_question_prompt.txt

You will be given a conversation below and a follow up question. You need to rephrase the follow-up question if needed so it is a standalone question that can be used by the LLM to search the web for information.
If it is a writing task or a simple hi, hello rather than a question, you need to return `not_needed` as the response.

Example:
1. Follow up question: What is the capital of France?
Rephrased: Capital of france

2. Follow up question: What is the population of New York City?
Rephrased: Population of New York City

3. Follow up question: What is Docker?
Rephrased: What is Docker

Conversation:
#for(message : messages)
#(message.role): #(message.content)
#end

Follow up question: #(query)
Rephrased question:

这个模板为 AI 提供了重写问题的指导和上下文信息。

渲染后的问题模版

You will be given a conversation below and a follow up question. You need to rephrase the follow-up question if needed so it is a standalone question that can be used by the LLM to search the web for information.
If it is a writing task or a simple hi, hello rather than a question, you need to return `not_needed` as the response.

Example:
1. Follow up question: What is the capital of France?
Rephrased: Capital of france

2. Follow up question: What is the population of New York City?
Rephrased: Population of New York City

3. Follow up question: What is Docker?
Rephrased: What is Docker

Conversation:
user: which professor is good for CS46a
assistant: Tong Li is a good professor for CS46A

Follow up question: tell me more about him
Rephrased question:

推理的输出结果

Tell me more about Tong Li professor for CS46A

结果分析在给定的例子中：

原始对话：用户询问 CS46A 课程的好教授，系统回答 Tong Li 是一个好选择。后续问题："tell me more about him" 重写后的问题："Tell me more about Tong Li professor for CS46A"

可以看到，重写后的问题成功地将上下文中的关键信息（Tong Li、教授、CS46A）整合进了问题中，使其成为一个独立的、可搜索的查询。

总结

整个流程可以概括为：

1.准备对话历史和后续问题
2.使用模板引擎生成包含上下文的提示
3.将提示发送给大语言模型（如 OpenAI 的 GPT）
4.大语言模型根据提示重写问题
5.返回重写后的问题，这个问题现在包含了必要的上下文信息