Routing
Sometimes we have multiple indexes for different domains, and for different questions we want to query different subsets of these indexes. For example, suppose we had one vector store index for all of the LangChain Python docs and another for all of the LangChain JS docs. Given a question about LangChain usage, we'd want to infer which language the question is referring to and query the appropriate docs. Query routing is the process of classifying which index, or subset of indexes, a query should be performed against.
Setup
Install dependencies
%pip install -qU langchain-core langchain-openai
Set environment variables
We'll use OpenAI in this example.
import getpass
import os
os.environ["OPENAI_API_KEY"] = getpass.getpass()
# Optional, uncomment to trace runs with LangSmith. Sign up here: https://smith.langchain.com.
# os.environ["LANGCHAIN_TRACING_V2"] = "true"
# os.environ["LANGCHAIN_API_KEY"] = getpass.getpass()
Routing with function calling models
With function-calling models it's easy to use a model for classification, which is what routing comes down to.
from typing import Literal
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.pydantic_v1 import BaseModel, Field
from langchain_openai import ChatOpenAI
class RouteQuery(BaseModel):
    """Route a user query to the most relevant datasource."""

    datasource: Literal["python_docs", "js_docs", "golang_docs"] = Field(
        ...,
        description="Given a user question choose which datasource would be most relevant for answering their question",
    )
llm = ChatOpenAI(model="gpt-3.5-turbo-0125", temperature=0)
structured_llm = llm.with_structured_output(RouteQuery)
system = """You are an expert at routing a user question to the appropriate data source.
Based on the programming language the question is referring to, route it to the relevant data source."""
prompt = ChatPromptTemplate.from_messages(
    [
        ("system", system),
        ("human", "{question}"),
    ]
)
router = prompt | structured_llm
question = """Why doesn't the following code work:
from langchain_core.prompts import ChatPromptTemplate
prompt = ChatPromptTemplate.from_messages(["human", "speak in {language}"])
prompt.invoke("french")
"""
router.invoke({"question": question})
RouteQuery(datasource='python_docs')
question = """Why doesn't the following code work:
import { ChatPromptTemplate } from "@langchain/core/prompts";

const chatPrompt = ChatPromptTemplate.fromMessages([
  ["human", "speak in {language}"],
]);

const formattedChatPrompt = await chatPrompt.invoke({
  input_language: "french"
});
"""
router.invoke({"question": question})
RouteQuery(datasource='js_docs')
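Once the router returns a `RouteQuery`, we still need to dispatch to the matching index or chain. A minimal sketch of that dispatch step, with no API calls (the `choose_route` function, the stand-in `RouteQuery` dataclass, and the chain-name strings are illustrative assumptions, not part of the LangChain API):

```python
from dataclasses import dataclass


@dataclass
class RouteQuery:
    """Stand-in for the structured output the router produces above."""

    datasource: str


def choose_route(result: RouteQuery) -> str:
    """Map the routed datasource to the name of a downstream chain."""
    if "python_docs" in result.datasource.lower():
        return "chain for python_docs"
    elif "js_docs" in result.datasource.lower():
        return "chain for js_docs"
    return "chain for golang_docs"


print(choose_route(RouteQuery(datasource="python_docs")))  # → chain for python_docs
```

In a real pipeline you could wrap a function like this with `RunnableLambda` and compose it after the router, e.g. `router | RunnableLambda(choose_route)`, so the routing decision flows straight into retrieval.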
Routing to multiple indexes
If we wanted to query multiple indexes we can do that, too, by updating our schema to accept a list of datasources:
from typing import List
class RouteQuery(BaseModel):
    """Route a user query to the most relevant datasource."""

    datasources: List[Literal["python_docs", "js_docs", "golang_docs"]] = Field(
        ...,
        description="Given a user question choose which datasources would be most relevant for answering their question",
    )
llm = ChatOpenAI(model="gpt-3.5-turbo-0125", temperature=0)
structured_llm = llm.with_structured_output(RouteQuery)
router = prompt | structured_llm
router.invoke(
    {
        "question": "is there feature parity between the Python and JS implementations of OpenAI chat models"
    }
)
RouteQuery(datasources=['python_docs', 'js_docs'])
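When the router selects several datasources, a typical next step is to fan the question out to each routed index and merge the results. A hedged sketch with in-memory stand-ins (the `FAKE_INDEXES` dict and `query_indexes` helper are hypothetical; in practice each entry would be a real vector store retriever):

```python
from typing import Dict, List

# Stand-in "indexes": in a real app these would be vector store retrievers.
FAKE_INDEXES: Dict[str, List[str]] = {
    "python_docs": ["ChatOpenAI (Python) reference", "LCEL how-to (Python)"],
    "js_docs": ["ChatOpenAI (JS) reference", "LCEL how-to (JS)"],
    "golang_docs": ["(community) Go client notes"],
}


def query_indexes(datasources: List[str], question: str) -> List[str]:
    """Retrieve from each routed index and concatenate the documents."""
    docs: List[str] = []
    for source in datasources:
        docs.extend(FAKE_INDEXES.get(source, []))
    return docs


# For the question above the router chose both the Python and the JS docs:
print(query_indexes(["python_docs", "js_docs"], "feature parity?"))
```

The merged document list can then be passed to a downstream answer-generation chain, exactly as with a single retriever.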