
Memory management

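A key feature of chatbots is their ability to use the content of previous conversation turns as context. This state management can take several forms, including:

  • Simply stuffing previous messages into a chat model prompt.
  • The above, but trimming old messages to reduce the amount of distracting information the model has to deal with.
  • More complex modifications, such as synthesizing summaries for long-running conversations.

We'll go into more detail on a few techniques below.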

Setup

You'll need to install a few packages and have your OpenAI API key set as an environment variable named OPENAI_API_KEY.

%pip install --upgrade --quiet langchain langchain-openai

# Set env var OPENAI_API_KEY or load from a .env file:
import dotenv

dotenv.load_dotenv()
True

Let's also set up a chat model that we'll use for the examples below.

from langchain_openai import ChatOpenAI

chat = ChatOpenAI(model="gpt-3.5-turbo-1106")

Message passing

The simplest form of memory is passing chat history messages into a chain. Here's an example:

from langchain_core.messages import AIMessage, HumanMessage
from langchain_core.prompts import ChatPromptTemplate, MessagesPlaceholder

prompt = ChatPromptTemplate.from_messages(
    [
        (
            "system",
            "You are a helpful assistant. Answer all questions to the best of your ability.",
        ),
        MessagesPlaceholder(variable_name="messages"),
    ]
)

chain = prompt | chat

chain.invoke(
    {
        "messages": [
            HumanMessage(
                content="Translate this sentence from English to French: I love programming."
            ),
            AIMessage(content="J'adore la programmation."),
            HumanMessage(content="What did you just say?"),
        ],
    }
)
AIMessage(content='I said "J\'adore la programmation," which means "I love programming" in French.')

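We can see that by passing the previous conversation into a chain, the chain can use it as context to answer questions. This is the basic concept underpinning chatbot memory. The rest of this guide demonstrates convenient techniques for passing or reformatting messages.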

Chat history

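It's perfectly fine to store and pass messages directly as an array, but we can also use LangChain's built-in message history class to store and load messages. Instances of this class are responsible for storing and loading chat messages from persistent storage. LangChain integrates with many providers (the documentation has a list of integrations), but for this demo we will use an ephemeral demo class.

Here's an example of the API: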

from langchain.memory import ChatMessageHistory

demo_ephemeral_chat_history = ChatMessageHistory()

demo_ephemeral_chat_history.add_user_message(
    "Translate this sentence from English to French: I love programming."
)

demo_ephemeral_chat_history.add_ai_message("J'adore la programmation.")

demo_ephemeral_chat_history.messages
[HumanMessage(content='Translate this sentence from English to French: I love programming.'),
AIMessage(content="J'adore la programmation.")]

We can use it directly to store conversation turns for our chain:

demo_ephemeral_chat_history = ChatMessageHistory()

input1 = "Translate this sentence from English to French: I love programming."

demo_ephemeral_chat_history.add_user_message(input1)

response = chain.invoke(
    {
        "messages": demo_ephemeral_chat_history.messages,
    }
)

demo_ephemeral_chat_history.add_ai_message(response)

input2 = "What did I just ask you?"

demo_ephemeral_chat_history.add_user_message(input2)

chain.invoke(
    {
        "messages": demo_ephemeral_chat_history.messages,
    }
)
AIMessage(content='You asked me to translate the sentence "I love programming" from English to French.')
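
The add, invoke, add sequence above repeats on every turn, so it can be wrapped in a small helper. The following is a minimal plain-Python sketch of that idea, not LangChain code: `SimpleHistory`, `run_turn`, and `echo_model` are hypothetical stand-ins for `ChatMessageHistory`, a helper of your own, and the chain.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Tuple

Message = Tuple[str, str]  # (role, content)


@dataclass
class SimpleHistory:
    """Hypothetical stand-in for ChatMessageHistory."""

    messages: List[Message] = field(default_factory=list)

    def add_user_message(self, content: str) -> None:
        self.messages.append(("human", content))

    def add_ai_message(self, content: str) -> None:
        self.messages.append(("ai", content))


def run_turn(
    history: SimpleHistory,
    model: Callable[[List[Message]], str],
    user_input: str,
) -> str:
    """Record the user input, call the model on the full history, record the reply."""
    history.add_user_message(user_input)
    reply = model(list(history.messages))
    history.add_ai_message(reply)
    return reply


def echo_model(messages: List[Message]) -> str:
    """Hypothetical model stub: reports how many messages it was shown."""
    return f"I can see {len(messages)} message(s)."


history = SimpleHistory()
first = run_turn(history, echo_model, "Hello")
second = run_turn(history, echo_model, "What did I just say?")
```

Because the model is called with the whole history each time, the second call sees the first exchange as well as the new question.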

Automatic history management

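The previous examples pass messages to the chain explicitly. This is a completely acceptable approach, but it does require external management of new messages. LangChain also includes a wrapper for LCEL chains that can handle this process automatically, called RunnableWithMessageHistory.

To show how it works, let's slightly modify the above prompt to take a final input variable that populates a HumanMessage template placed after the chat history. This means that we will expect a chat_history parameter that contains all messages before the current message, rather than all messages.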

prompt = ChatPromptTemplate.from_messages(
    [
        (
            "system",
            "You are a helpful assistant. Answer all questions to the best of your ability.",
        ),
        MessagesPlaceholder(variable_name="chat_history"),
        ("human", "{input}"),
    ]
)

chain = prompt | chat

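We'll pass the latest input to the conversation here and let the RunnableWithMessageHistory class wrap our chain and do the work of appending that input variable to the chat history.

Next, let's declare our wrapped chain: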

from langchain_core.runnables.history import RunnableWithMessageHistory

demo_ephemeral_chat_history_for_chain = ChatMessageHistory()

chain_with_message_history = RunnableWithMessageHistory(
    chain,
    lambda session_id: demo_ephemeral_chat_history_for_chain,
    input_messages_key="input",
    history_messages_key="chat_history",
)

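This class takes a few parameters in addition to the chain that we want to wrap:

  • A factory function that returns a message history for a given session id. This allows your chain to handle multiple users at once by loading different messages for different conversations.
  • An input_messages_key that specifies which part of the input should be tracked and stored in the chat history. In this example, we want to track the string passed in as input.
  • A history_messages_key that specifies the name under which the previous messages are injected into the prompt. Our prompt has a MessagesPlaceholder named chat_history, so we specify this property to match.
  • (For chains with multiple outputs) an output_messages_key that specifies which output to store as history. This is the inverse of input_messages_key.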
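
Conceptually, a wrapper with these parameters does three things per call: it looks up the history for the session, injects it into the chain input under the history key, and stores the new input and output. The plain-Python sketch below illustrates that flow only; it is not LangChain's implementation, and it assumes histories are plain lists and the chain is a function from a dict to a string. The names `with_message_history` and `fake_chain` are hypothetical.

```python
from typing import Callable, List, Tuple

Message = Tuple[str, str]  # (role, content)


def with_message_history(
    chain: Callable[[dict], str],
    get_history: Callable[[str], List[Message]],
    input_messages_key: str,
    history_messages_key: str,
) -> Callable[[dict, str], str]:
    """Return a function that manages history around each call to `chain`."""

    def invoke(chain_input: dict, session_id: str) -> str:
        history = get_history(session_id)
        # Inject a copy of the prior messages under the history key.
        full_input = {**chain_input, history_messages_key: list(history)}
        output = chain(full_input)
        # Store the tracked input and the output for the next call.
        history.append(("human", chain_input[input_messages_key]))
        history.append(("ai", output))
        return output

    return invoke


def fake_chain(chain_input: dict) -> str:
    """Hypothetical chain stub: reports how much history it received."""
    return f"history={len(chain_input['chat_history'])}, input={chain_input['input']}"


shared_history: List[Message] = []
wrapped = with_message_history(
    fake_chain,
    lambda session_id: shared_history,
    input_messages_key="input",
    history_messages_key="chat_history",
)
first = wrapped({"input": "hi"}, "unused")
second = wrapped({"input": "again"}, "unused")
```

The caller only ever supplies the newest input; the second call receives the two messages stored by the first.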

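We can invoke this new chain as normal, with an additional configurable field that specifies the particular session_id to pass to the factory function. This is unused for the demo, but in real-world chains you'll want to return a chat history corresponding to the passed session.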

chain_with_message_history.invoke(
    {"input": "Translate this sentence from English to French: I love programming."},
    {"configurable": {"session_id": "unused"}},
)
AIMessage(content='The translation of "I love programming" in French is "J\'adore la programmation."')
chain_with_message_history.invoke(
    {"input": "What did I just ask you?"}, {"configurable": {"session_id": "unused"}}
)
AIMessage(content='You just asked me to translate the sentence "I love programming" from English to French.')
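
The demo ignores the session id, so every caller shares one history. A factory that returns a separate history per session might look like the sketch below. It uses plain lists and a hypothetical `session_store` dict so that it runs without any dependencies; with LangChain, the stored values would be message history instances such as ChatMessageHistory, and `get_session_history` would be passed as the factory argument in place of the lambda.

```python
from collections import defaultdict
from typing import Dict, List

# Maps session_id -> that session's message list (hypothetical in-memory store).
# With LangChain, the values would be message history objects instead of lists.
session_store: Dict[str, List[str]] = defaultdict(list)


def get_session_history(session_id: str) -> List[str]:
    """Return the history for `session_id`, creating an empty one on first use."""
    return session_store[session_id]


get_session_history("alice").append("Hi, I'm Alice.")
get_session_history("bob").append("Hi, I'm Bob.")
```

An in-memory dict like this is lost when the process exits, so it only suits demos; persistent storage would need one of the provider integrations.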

Modifying chat history

Modifying stored chat messages can help your chatbot handle a variety of situations. Here are some examples:

Trimming messages

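LLMs and chat models have limited context windows, and even if you're not directly hitting limits, you may want to limit the amount of distraction the model has to deal with. One solution is to load and store only the most recent n messages. Let's use an example history with some preloaded messages: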

demo_ephemeral_chat_history = ChatMessageHistory()

demo_ephemeral_chat_history.add_user_message("Hey there! I'm Nemo.")
demo_ephemeral_chat_history.add_ai_message("Hello!")
demo_ephemeral_chat_history.add_user_message("How are you today?")
demo_ephemeral_chat_history.add_ai_message("Fine thanks!")

demo_ephemeral_chat_history.messages
[HumanMessage(content="Hey there! I'm Nemo."),
AIMessage(content='Hello!'),
HumanMessage(content='How are you today?'),
AIMessage(content='Fine thanks!')]

Let's use this message history with the RunnableWithMessageHistory chain we declared above:

prompt = ChatPromptTemplate.from_messages(
    [
        (
            "system",
            "You are a helpful assistant. Answer all questions to the best of your ability.",
        ),
        MessagesPlaceholder(variable_name="chat_history"),
        ("human", "{input}"),
    ]
)

chain = prompt | chat

chain_with_message_history = RunnableWithMessageHistory(
    chain,
    lambda session_id: demo_ephemeral_chat_history,
    input_messages_key="input",
    history_messages_key="chat_history",
)

chain_with_message_history.invoke(
    {"input": "What's my name?"},
    {"configurable": {"session_id": "unused"}},
)
AIMessage(content='Your name is Nemo.')

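We can see the chain remembers the preloaded name.

But let's say we have a very small context window, and we want to trim the number of messages passed to the chain to only the 2 most recent ones. We can use the clear method to remove messages and then re-add them to the history. We don't have to, but let's put this method at the front of our chain to ensure it's always called: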

from langchain_core.runnables import RunnablePassthrough


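def trim_messages(chain_input):
    # Nothing to trim if two or fewer messages are stored.
    stored_messages = demo_ephemeral_chat_history.messages
    if len(stored_messages) <= 2:
        return False

    # Clear the history, then re-add only the two most recent messages.
    demo_ephemeral_chat_history.clear()

    for message in stored_messages[-2:]:
        demo_ephemeral_chat_history.add_message(message)

    return True


# Run the trimming step before every call to the history-aware chain.
chain_with_trimming = (
    RunnablePassthrough.assign(messages_trimmed=trim_messages)
    | chain_with_message_history
)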

Let's invoke this new chain and check the messages afterwards:

chain_with_trimming.invoke(
    {"input": "Where does P. Sherman live?"},
    {"configurable": {"session_id": "unused"}},
)
AIMessage(content="P. Sherman's address is 42 Wallaby Way, Sydney.")
demo_ephemeral_chat_history.messages
[HumanMessage(content="What's my name?"),
AIMessage(content='Your name is Nemo.'),
HumanMessage(content='Where does P. Sherman live?'),
AIMessage(content="P. Sherman's address is 42 Wallaby Way, Sydney.")]

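We can see that the history has dropped its oldest messages, keeping only the two most recent stored ones, while still adding the latest exchange at the end. The next time the chain is called, trim_messages will be called again, and only the two most recent messages will be passed to the model. In this case, this means that the model will forget the name we gave it the next time we invoke it: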

chain_with_trimming.invoke(
    {"input": "What is my name?"},
    {"configurable": {"session_id": "unused"}},
)
AIMessage(content="I'm sorry, I don't have access to your personal information.")
demo_ephemeral_chat_history.messages
[HumanMessage(content='Where does P. Sherman live?'),
AIMessage(content="P. Sherman's address is 42 Wallaby Way, Sydney."),
HumanMessage(content='What is my name?'),
AIMessage(content="I'm sorry, I don't have access to your personal information.")]
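
The trimming step above is hard-coded to two messages and edits the shared history in place. A parameterized, non-mutating variant can be sketched in plain Python as follows; `keep_last` is a hypothetical helper, not a LangChain function, and it works on any sequence of messages.

```python
from typing import List, Sequence, TypeVar

T = TypeVar("T")


def keep_last(messages: Sequence[T], n: int) -> List[T]:
    """Return a new list with at most the `n` most recent messages."""
    if n <= 0:
        # Guard needed because messages[-0:] would return everything.
        return []
    return list(messages[-n:])


example = ["m1", "m2", "m3", "m4"]
trimmed = keep_last(example, 2)
```

Returning a new list leaves the stored history untouched, which may be preferable when you only want to limit what the model sees rather than what is kept.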

Summary memory

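We can use this same pattern in other ways too. For example, we could use an additional LLM call to generate a summary of the conversation before calling our chain. Let's recreate our chat history and chatbot chain: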

demo_ephemeral_chat_history = ChatMessageHistory()

demo_ephemeral_chat_history.add_user_message("Hey there! I'm Nemo.")
demo_ephemeral_chat_history.add_ai_message("Hello!")
demo_ephemeral_chat_history.add_user_message("How are you today?")
demo_ephemeral_chat_history.add_ai_message("Fine thanks!")

demo_ephemeral_chat_history.messages
[HumanMessage(content="Hey there! I'm Nemo."),
AIMessage(content='Hello!'),
HumanMessage(content='How are you today?'),
AIMessage(content='Fine thanks!')]

We'll slightly modify the prompt to make the LLM aware that it will receive a condensed summary instead of a chat history:

prompt = ChatPromptTemplate.from_messages(
    [
        (
            "system",
            "You are a helpful assistant. Answer all questions to the best of your ability. The provided chat history includes facts about the user you are speaking with.",
        ),
        MessagesPlaceholder(variable_name="chat_history"),
        ("user", "{input}"),
    ]
)

chain = prompt | chat

chain_with_message_history = RunnableWithMessageHistory(
    chain,
    lambda session_id: demo_ephemeral_chat_history,
    input_messages_key="input",
    history_messages_key="chat_history",
)

Now, let's create a function that distills previous interactions into a summary. We can add this one to the front of the chain too:

def summarize_messages(chain_input):
    # Nothing to summarize if the history is empty.
    stored_messages = demo_ephemeral_chat_history.messages
    if len(stored_messages) == 0:
        return False
    summarization_prompt = ChatPromptTemplate.from_messages(
        [
            MessagesPlaceholder(variable_name="chat_history"),
            (
                "user",
                "Distill the above chat messages into a single summary message. Include as many specific details as you can.",
            ),
        ]
    )
    summarization_chain = summarization_prompt | chat

    summary_message = summarization_chain.invoke({"chat_history": stored_messages})

    # Replace the stored messages with the single summary message.
    demo_ephemeral_chat_history.clear()

    demo_ephemeral_chat_history.add_message(summary_message)

    return True


# Run the summarization step before every call to the history-aware chain.
chain_with_summarization = (
    RunnablePassthrough.assign(messages_summarized=summarize_messages)
    | chain_with_message_history
)

Let's see if it remembers the name we told it:

chain_with_summarization.invoke(
    {"input": "What did I say my name was?"},
    {"configurable": {"session_id": "unused"}},
)
AIMessage(content='You introduced yourself as Nemo. How can I assist you today, Nemo?')
demo_ephemeral_chat_history.messages
[AIMessage(content='The conversation is between Nemo and an AI. Nemo introduces himself and the AI responds with a greeting. Nemo then asks the AI how it is doing, and the AI responds that it is fine.'),
HumanMessage(content='What did I say my name was?'),
AIMessage(content='You introduced yourself as Nemo. How can I assist you today, Nemo?')]

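Note that invoking the chain again will generate another summary, built from the initial summary plus the new messages, and so on. You could also design a hybrid approach where a certain number of messages are retained in the chat history while the others are summarized.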
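
One possible shape for such a hybrid is sketched below in plain Python: the most recent messages are kept verbatim and everything older is replaced by a single summary message. This is an illustration under stated assumptions, not a LangChain API. Messages are modeled as (role, content) tuples, `compact_history` is a hypothetical helper, and `join_summary` is a stub standing in for the LLM summarization call.

```python
from typing import Callable, List, Tuple

Message = Tuple[str, str]  # (role, content)


def compact_history(
    messages: List[Message],
    summarize: Callable[[List[Message]], str],
    keep_recent: int = 2,
) -> List[Message]:
    """Summarize all but the last `keep_recent` messages; keep those verbatim."""
    if keep_recent < 0:
        raise ValueError("keep_recent must be non-negative")
    if len(messages) <= keep_recent:
        return list(messages)
    split = len(messages) - keep_recent
    older, recent = messages[:split], messages[split:]
    # The summary is stored as a single AI message ahead of the recent turns.
    return [("ai", summarize(older))] + list(recent)


def join_summary(messages: List[Message]) -> str:
    """Hypothetical stub standing in for an LLM summarization call."""
    return "Summary of earlier messages: " + " | ".join(
        content for _, content in messages
    )


history = [
    ("human", "Hey there! I'm Nemo."),
    ("ai", "Hello!"),
    ("human", "How are you today?"),
    ("ai", "Fine thanks!"),
]
compacted = compact_history(history, join_summary, keep_recent=2)
```

The value of `keep_recent` trades summarization cost against how much exact wording the model still sees; the right setting depends on the application.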

