Quickstart

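LangChain has a number of components designed to help build question-answering applications, and RAG applications more generally. To familiarize ourselves with these, we'll build a simple Q&A application over a text data source. Along the way we'll go over a typical Q&A architecture, discuss the relevant LangChain components, and highlight additional resources for more advanced Q&A techniques. We'll also see how LangSmith can help us trace and understand our application. LangSmith will become increasingly helpful as our application grows in complexity.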

Architecture

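We'll create a typical RAG application as outlined in the Q&A introduction, which has two main components:

Indexing: a pipeline for ingesting data from a source and indexing it. This usually happens offline.

Retrieval and generation: the actual RAG chain, which takes the user query at run time, retrieves the relevant data from the index, and then passes it to the model.

The full sequence from raw data to answer looks like this: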

Indexing

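Load: First we need to load our data. We'll use DocumentLoaders for this.
Split: Text splitters break large Documents into smaller chunks. This is useful both for indexing data and for passing it to a model, since large chunks are harder to search over and won't fit in a model's finite context window.
Store: We need somewhere to store and index our splits so that they can later be searched over. This is often done using a VectorStore and an Embeddings model.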

Retrieval and generation

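Retrieve: Given a user input, relevant splits are retrieved from storage using a Retriever.
Generate: A ChatModel / LLM produces an answer using a prompt that includes the question and the retrieved data.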
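
To make the five steps above concrete, here is a minimal, dependency-free sketch of the same flow. It is not LangChain code: the function names (`load`, `split`, `build_index`, `retrieve`, `build_prompt`) are hypothetical, the "index" is a plain list of word sets, and simple word overlap stands in for an embedding-based similarity search. The final model call is omitted, so the sketch stops at the prompt that would be sent.

```python
import re

def load(raw_text: str) -> str:
    """Load: here the 'source' is simply an in-memory string."""
    return raw_text.strip()

def split(text: str) -> list[str]:
    """Split: one chunk per blank-line-separated paragraph."""
    return [p.strip() for p in re.split(r"\n\s*\n", text) if p.strip()]

def words(text: str) -> set[str]:
    """Lowercase alphanumeric tokens of a text, as a set."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))

def build_index(chunks):
    """Store: keep each chunk next to its set of lowercase words."""
    return [(chunk, words(chunk)) for chunk in chunks]

def retrieve(index, question, k=1):
    """Retrieve: rank chunks by the number of words shared with the question."""
    q = words(question)
    ranked = sorted(index, key=lambda item: len(q & item[1]), reverse=True)
    return [chunk for chunk, _ in ranked[:k]]

def build_prompt(question, chunks):
    """Generate (first half): assemble the prompt a model would receive."""
    context = "\n\n".join(chunks)
    return f"Context:\n{context}\n\nQuestion: {question}\nAnswer:"

raw = """
Task decomposition breaks a complex task into smaller steps.

Memory lets an agent recall information over long horizons.
"""
index = build_index(split(load(raw)))
question = "What is task decomposition?"
top_chunks = retrieve(index, question, k=1)
prompt_text = build_prompt(question, top_chunks)
print(prompt_text)
```

The rest of this guide replaces each toy function with the corresponding LangChain component.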

Setup

Dependencies

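We'll use an OpenAI chat model and embeddings and a Chroma vector store in this walkthrough, but everything shown here works with any ChatModel or LLM, Embeddings, and VectorStore or Retriever.

We'll use the following packages: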

%pip install --upgrade --quiet  langchain langchain-community langchainhub langchain-openai chromadb bs4

We need to set the environment variable OPENAI_API_KEY for the embeddings model, either directly or by loading it from a .env file.

import getpass
import os

os.environ["OPENAI_API_KEY"] = getpass.getpass()

# import dotenv

# dotenv.load_dotenv()
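
When re-running a notebook it can be convenient to prompt for the key only if it is not already set. A minimal sketch, assuming a hypothetical helper named `ensure_env` (it is not part of LangChain); the `prompt_fn` parameter exists only so the helper can be exercised without an interactive prompt:

```python
import getpass
import os

def ensure_env(name, prompt_fn=getpass.getpass):
    """Return os.environ[name], prompting for it only when it is missing or empty."""
    if not os.environ.get(name):
        os.environ[name] = prompt_fn(f"{name}: ")
    return os.environ[name]

# Typical use in a notebook (prompts only on the first run):
# ensure_env("OPENAI_API_KEY")
```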

LangSmith

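Many of the applications you build with LangChain will contain multiple steps with multiple LLM calls. As these applications get more complex, it becomes crucial to be able to inspect what exactly is going on inside your chain or agent. The best way to do this is with LangSmith.

Note that LangSmith is not required, but it is helpful. If you do want to use LangSmith, sign up for it and then set the following environment variables to start logging traces: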

os.environ["LANGCHAIN_TRACING_V2"] = "true"
os.environ["LANGCHAIN_API_KEY"] = getpass.getpass()

Preview

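In this guide we'll build a QA app over the LLM Powered Autonomous Agents blog post by Lilian Weng, which lets us ask questions about the contents of the post.

We can create a simple indexing pipeline and RAG chain to do this in about 20 lines of code: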

import bs4
from langchain import hub
from langchain_community.document_loaders import WebBaseLoader
from langchain_community.vectorstores import Chroma
from langchain_core.output_parsers import StrOutputParser
from langchain_core.runnables import RunnablePassthrough
from langchain_openai import OpenAIEmbeddings
from langchain_text_splitters import RecursiveCharacterTextSplitter

Install dependencies

pip install -qU langchain-openai

Set environment variables

import getpass
import os

os.environ["OPENAI_API_KEY"] = getpass.getpass()

from langchain_openai import ChatOpenAI

llm = ChatOpenAI(model="gpt-3.5-turbo-0125")

Install dependencies

pip install -qU langchain-anthropic

Set environment variables

import getpass
import os

os.environ["ANTHROPIC_API_KEY"] = getpass.getpass()

from langchain_anthropic import ChatAnthropic

llm = ChatAnthropic(model="claude-3-sonnet-20240229")

Install dependencies

pip install -qU langchain-fireworks

Set environment variables

import getpass
import os

os.environ["FIREWORKS_API_KEY"] = getpass.getpass()

from langchain_fireworks import ChatFireworks

llm = ChatFireworks(model="accounts/fireworks/models/mixtral-8x7b-instruct")

Install dependencies

pip install -qU langchain-mistralai

Set environment variables

import getpass
import os

os.environ["MISTRAL_API_KEY"] = getpass.getpass()

from langchain_mistralai import ChatMistralAI

llm = ChatMistralAI(model="mistral-large-latest")

Install dependencies

pip install -qU langchain-google-genai

Set environment variables

import getpass
import os

os.environ["GOOGLE_API_KEY"] = getpass.getpass()

from langchain_google_genai import ChatGoogleGenerativeAI

llm = ChatGoogleGenerativeAI(model="gemini-pro")

Install dependencies

pip install -qU langchain-openai

Set environment variables

import getpass
import os

os.environ["TOGETHER_API_KEY"] = getpass.getpass()

from langchain_openai import ChatOpenAI

llm = ChatOpenAI(
    base_url="https://api.together.xyz/v1",
    api_key=os.environ["TOGETHER_API_KEY"],
    model="mistralai/Mixtral-8x7B-Instruct-v0.1",
)

# Load, chunk and index the contents of the blog.
loader = WebBaseLoader(
    web_paths=("https://lilianweng.github.io/posts/2023-06-23-agent/",),
    bs_kwargs=dict(
        parse_only=bs4.SoupStrainer(
            class_=("post-content", "post-title", "post-header")
        )
    ),
)
docs = loader.load()

text_splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=200)
splits = text_splitter.split_documents(docs)
vectorstore = Chroma.from_documents(documents=splits, embedding=OpenAIEmbeddings())

# Retrieve and generate using the relevant snippets of the blog.
retriever = vectorstore.as_retriever()
prompt = hub.pull("rlm/rag-prompt")

def format_docs(docs):
    return "\n\n".join(doc.page_content for doc in docs)


rag_chain = (
    {"context": retriever | format_docs, "question": RunnablePassthrough()}
    | prompt
    | llm
    | StrOutputParser()
)

rag_chain.invoke("What is Task Decomposition?")

'Task decomposition is a technique used to break down complex tasks into smaller and simpler steps. It can be done through prompting techniques like Chain of Thought or Tree of Thoughts, or by using task-specific instructions or human inputs. Task decomposition helps agents plan ahead and manage complicated tasks more effectively.'

# cleanup
vectorstore.delete_collection()

Check out the LangSmith trace.

Detailed walkthrough

Let's go through the above code step by step to really understand what's going on.

1. Indexing: Load

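We first need to load the blog post contents. We can use DocumentLoaders for this, which are objects that load data from a source and return a list of Documents. A Document is an object with some page_content (str) and metadata (dict).

In this case we'll use the `WebBaseLoader`, which uses `urllib` to load HTML from web URLs and `BeautifulSoup` to parse it to text. We can customize the HTML-to-text parsing by passing parameters to the `BeautifulSoup` parser via `bs_kwargs` (see the `BeautifulSoup` docs). In this case only HTML tags with class "post-content", "post-title", or "post-header" are relevant, so we'll remove all others.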

import bs4
from langchain_community.document_loaders import WebBaseLoader

# Only keep post title, headers, and content from the full HTML.
bs4_strainer = bs4.SoupStrainer(class_=("post-title", "post-header", "post-content"))
loader = WebBaseLoader(
    web_paths=("https://lilianweng.github.io/posts/2023-06-23-agent/",),
    bs_kwargs={"parse_only": bs4_strainer},
)
docs = loader.load()

len(docs[0].page_content)

print(docs[0].page_content[:500])

      LLM Powered Autonomous Agents

Date: June 23, 2023  |  Estimated Reading Time: 31 min  |  Author: Lilian Weng

Building agents with LLM (large language model) as its core controller is a cool concept. Several proof-of-concepts demos, such as AutoGPT, GPT-Engineer and BabyAGI, serve as inspiring examples. The potentiality of LLM extends beyond generating well-written copies, stories, essays and programs; it can be framed as a powerful general problem solver.
Agent System Overview#
In
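
To illustrate what "keep only tags with certain classes" means, here is a standard-library sketch built on `html.parser`. It is a stand-in for the idea, not how `BeautifulSoup` or `SoupStrainer` is implemented; the class name `ClassTextExtractor` and the sample HTML are made up for this example, and the depth tracking assumes well-formed HTML in which every non-void tag is closed.

```python
from html.parser import HTMLParser

# Elements that never have a closing tag, so they must not affect nesting depth.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img", "input",
             "link", "meta", "source", "track", "wbr"}

class ClassTextExtractor(HTMLParser):
    """Collect text found inside elements whose class attribute matches `keep`."""

    def __init__(self, keep):
        super().__init__()
        self.keep = set(keep)
        self.depth = 0      # > 0 while inside a kept element
        self.parts = []

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return
        classes = (dict(attrs).get("class") or "").split()
        # Enter a kept element, or go one level deeper inside one.
        if self.depth or self.keep.intersection(classes):
            self.depth += 1

    def handle_endtag(self, tag):
        if tag not in VOID_TAGS and self.depth:
            self.depth -= 1

    def handle_data(self, data):
        if self.depth and data.strip():
            self.parts.append(data.strip())

html = """
<nav class="menu">Home | Archive</nav>
<h1 class="post-title">LLM Powered Autonomous Agents</h1>
<div class="post-content"><p>Building agents<br>with LLMs.</p></div>
<footer>Powered by Hugo</footer>
"""
extractor = ClassTextExtractor(keep=("post-title", "post-header", "post-content"))
extractor.feed(html)
extracted = "\n".join(extractor.parts)
print(extracted)
```

The navigation and footer text are dropped, which is the same effect the `parse_only` strainer has on the real page.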

Go deeper

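DocumentLoader: Object that loads data from a source as a list of Documents.

Docs: Detailed documentation on how to use DocumentLoaders.
Integrations: 160+ integrations to choose from.
Interface: API reference for the base interface.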

2. Indexing: Split

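Our loaded document is over 42k characters long. This is too long to fit in the context window of many models. Even models that could fit the full post in their context window can struggle to find information in very long inputs.

To handle this we'll split the Document into chunks for embedding and vector storage. This should help us retrieve only the most relevant bits of the blog post at run time.

In this case we'll split our documents into chunks of 1000 characters with 200 characters of overlap between chunks. The overlap helps mitigate the possibility of separating a statement from important context related to it. We use the RecursiveCharacterTextSplitter, which recursively splits the document using common separators such as new lines until each chunk is the appropriate size. This is the recommended text splitter for generic text use cases.

We set `add_start_index=True` so that the character index at which each split Document starts within the initial Document is preserved as the metadata attribute "start_index".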

from langchain_text_splitters import RecursiveCharacterTextSplitter

text_splitter = RecursiveCharacterTextSplitter(
    chunk_size=1000, chunk_overlap=200, add_start_index=True
)
all_splits = text_splitter.split_documents(docs)

len(all_splits)

len(all_splits[0].page_content)

all_splits[10].metadata

{'source': 'https://lilianweng.github.io/posts/2023-06-23-agent/',
 'start_index': 7056}
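
The interaction between chunk size, overlap, and start index can be sketched in plain Python. This is a deliberately simplified illustration: it cuts fixed-size windows and does not try separators recursively the way RecursiveCharacterTextSplitter does, so its chunk boundaries will differ from the real splitter's. The function name `split_with_overlap` is hypothetical.

```python
def split_with_overlap(text, chunk_size=1000, chunk_overlap=200):
    """Cut text into fixed-size windows that overlap by chunk_overlap characters."""
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be >= 0 and smaller than chunk_size")
    step = chunk_size - chunk_overlap  # how far each new window advances
    chunks = []
    start = 0
    while start < len(text):
        chunks.append({
            "page_content": text[start:start + chunk_size],
            "metadata": {"start_index": start},
        })
        if start + chunk_size >= len(text):
            break  # this window already reaches the end of the text
        start += step
    return chunks

demo = split_with_overlap("abcdefghij" * 3, chunk_size=12, chunk_overlap=4)
for chunk in demo:
    print(chunk["metadata"]["start_index"], chunk["page_content"])
```

Each window starts `chunk_size - chunk_overlap` characters after the previous one, so the last `chunk_overlap` characters of one chunk reappear at the start of the next.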

Go deeper

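TextSplitter: Object that splits a list of Documents into smaller chunks. Subclass of DocumentTransformers.

Explore context-aware splitters, which keep the location ("context") of each split in the original Document:
Markdown files
Code (py or js)
Scientific papers
Interface: API reference for the base interface.

DocumentTransformer: Object that performs a transformation on a list of Documents.

Docs: Detailed documentation on how to use DocumentTransformers.
Integrations
Interface: API reference for the base interface.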

3. Indexing: Store

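Now we need to index our 66 text chunks so that we can search over them at run time. The most common way to do this is to embed the contents of each document split and insert these embeddings into a vector database (or vector store). When we want to search over our splits, we take a text search query, embed it, and perform some sort of "similarity" search to identify the stored splits whose embeddings are most similar to our query embedding. The simplest similarity measure is cosine similarity: we measure the cosine of the angle between each pair of embeddings (which are high-dimensional vectors).

We can embed and store all of our document splits using the Chroma vector store and the OpenAIEmbeddings model.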

from langchain_community.vectorstores import Chroma
from langchain_openai import OpenAIEmbeddings

vectorstore = Chroma.from_documents(documents=all_splits, embedding=OpenAIEmbeddings())
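
The cosine similarity measure described above can be written out directly. This is a plain-Python sketch for intuition only; the three-dimensional vectors are made up, whereas real embeddings have hundreds or thousands of dimensions, and vector stores use optimized implementations.

```python
import math

def cosine_similarity(a, b):
    """cos(theta) = (a . b) / (|a| * |b|) for two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # convention chosen here for zero vectors
    return dot / (norm_a * norm_b)

query = [1.0, 0.0, 1.0]
same_direction = [2.0, 0.0, 2.0]   # parallel to the query: similarity close to 1
orthogonal = [0.0, 3.0, 0.0]       # perpendicular to the query: similarity 0
print(cosine_similarity(query, same_direction))
print(cosine_similarity(query, orthogonal))
```

Because only the angle matters, scaling a vector does not change its similarity to the query.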

Go deeper

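Embeddings: Wrapper around a text embedding model, used for converting text to embeddings.

Docs: Detailed documentation on how to use embeddings.
Integrations: 30+ integrations to choose from.
Interface: API reference for the base interface.

VectorStore: Wrapper around a vector database, used for storing and querying embeddings.

Docs: Detailed documentation on how to use vector stores.
Integrations: 40+ integrations to choose from.
Interface: API reference for the base interface.

This completes the Indexing portion of the pipeline. At this point we have a queryable vector store containing the chunked contents of our blog post. Given a user question, we should ideally be able to return the snippets of the blog post that answer the question.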

4. Retrieval and Generation: Retrieve

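Now let's write the actual application logic. We want to create a simple application that takes a user question, searches for documents relevant to that question, passes the retrieved documents and the initial question to a model, and returns an answer.

First we need to define our logic for searching over documents. LangChain defines a Retriever interface, which wraps an index that can return relevant Documents given a string query.

The most common type of Retriever is the VectorStoreRetriever, which uses the similarity search capabilities of a vector store to perform retrieval. Any VectorStore can easily be turned into a Retriever with VectorStore.as_retriever().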

retriever = vectorstore.as_retriever(search_type="similarity", search_kwargs={"k": 6})

retrieved_docs = retriever.invoke("What are the approaches to Task Decomposition?")

len(retrieved_docs)

print(retrieved_docs[0].page_content)

Tree of Thoughts (Yao et al. 2023) extends CoT by exploring multiple reasoning possibilities at each step. It first decomposes the problem into multiple thought steps and generates multiple thoughts per step, creating a tree structure. The search process can be BFS (breadth-first search) or DFS (depth-first search) with each state evaluated by a classifier (via a prompt) or majority vote.
Task decomposition can be done (1) by LLM with simple prompting like "Steps for XYZ.\n1.", "What are the subgoals for achieving XYZ?", (2) by using task-specific instructions; e.g. "Write a story outline." for writing a novel, or (3) with human inputs.
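
What a similarity retriever with a `k` setting does can be sketched without any external services. In the toy below, `ToyRetriever` and `embed` are hypothetical names: word counts stand in for a real embedding model, and only the `invoke` method name mirrors the call used above. It is an illustration of top-k similarity search, not LangChain's VectorStoreRetriever.

```python
import heapq
import math
import re
from collections import Counter

def embed(text):
    """Toy stand-in for an embedding model: lowercase word counts."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))

def similarity(a, b):
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(count * b[word] for word, count in a.items())
    norm = (math.sqrt(sum(c * c for c in a.values()))
            * math.sqrt(sum(c * c for c in b.values())))
    return dot / norm if norm else 0.0

class ToyRetriever:
    def __init__(self, texts, k=2):
        self.k = k
        # The "index": every text is embedded once, up front.
        self.entries = [(text, embed(text)) for text in texts]

    def invoke(self, query):
        """Embed the query and return the k most similar stored texts."""
        q = embed(query)
        best = heapq.nlargest(self.k, self.entries,
                              key=lambda entry: similarity(q, entry[1]))
        return [text for text, _ in best]

texts = [
    "Task decomposition splits a big task into subgoals.",
    "Tree of Thoughts explores multiple reasoning paths.",
    "Vector stores index embeddings for similarity search.",
]
toy_retriever = ToyRetriever(texts, k=2)
results = toy_retriever.invoke("How does task decomposition work?")
print(results)
```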

Go deeper

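Vector stores are commonly used for retrieval, but there are other ways to do retrieval, too.

Retriever: An object that returns Documents given a text query.

Docs: Further documentation on the interface and built-in retrieval techniques. Some of these include:
MultiQueryRetriever generates variants of the input question to improve retrieval hit rate.
MultiVectorRetriever instead generates variants of the embeddings, also in order to improve retrieval hit rate.
Max marginal relevance selects for relevance and diversity among the retrieved documents to avoid passing in duplicate context.
Documents can be filtered during vector store retrieval using metadata filters, such as with a Self Query Retriever.
Integrations: Integrations with retrieval services.
Interface: API reference for the base interface.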
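
Max marginal relevance, mentioned in the list above, can be sketched as a greedy selection loop. This is a generic textbook-style formulation under stated assumptions, not LangChain's code: similarities are passed in as precomputed numbers, and the function name and the `lambda_mult` parameter name are choices made for this example.

```python
def max_marginal_relevance(query_sims, doc_sims, k, lambda_mult=0.5):
    """Pick k document indices, balancing relevance against redundancy.

    query_sims[i]  : similarity between the query and document i
    doc_sims[i][j] : similarity between documents i and j
    lambda_mult    : 1.0 means pure relevance, 0.0 means pure diversity
    """
    selected = []
    candidates = list(range(len(query_sims)))
    while candidates and len(selected) < k:
        def score(i):
            # Penalize a candidate by its closest match among already-chosen docs.
            redundancy = max((doc_sims[i][j] for j in selected), default=0.0)
            return lambda_mult * query_sims[i] - (1 - lambda_mult) * redundancy
        best = max(candidates, key=score)
        selected.append(best)
        candidates.remove(best)
    return selected

# Documents 0 and 1 are near-duplicates; document 2 is less relevant but different.
query_sims = [0.9, 0.85, 0.5]
doc_sims = [
    [1.0, 0.95, 0.1],
    [0.95, 1.0, 0.1],
    [0.1, 0.1, 1.0],
]
print(max_marginal_relevance(query_sims, doc_sims, k=2))
print(max_marginal_relevance(query_sims, doc_sims, k=2, lambda_mult=1.0))
```

With the default weighting the near-duplicate is skipped in favor of the more distinct document; with `lambda_mult=1.0` the selection reduces to plain top-k by relevance.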

5. Retrieval and Generation: Generate

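Let's put it all together into a chain that takes a question, retrieves relevant documents, constructs a prompt, passes that to a model, and parses the output.

We'll use the gpt-3.5-turbo OpenAI chat model, but any LangChain LLM or ChatModel could be substituted.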

Install dependencies

pip install -qU langchain-openai

Set environment variables

import getpass
import os

os.environ["OPENAI_API_KEY"] = getpass.getpass()

from langchain_openai import ChatOpenAI

llm = ChatOpenAI(model="gpt-3.5-turbo-0125")

Install dependencies

pip install -qU langchain-anthropic

Set environment variables

import getpass
import os

os.environ["ANTHROPIC_API_KEY"] = getpass.getpass()

from langchain_anthropic import ChatAnthropic

llm = ChatAnthropic(model="claude-3-sonnet-20240229", temperature=0.2, max_tokens=1024)

Install dependencies

pip install -qU langchain-fireworks

Set environment variables

import getpass
import os

os.environ["FIREWORKS_API_KEY"] = getpass.getpass()

from langchain_fireworks import ChatFireworks

llm = ChatFireworks(model="accounts/fireworks/models/mixtral-8x7b-instruct")

Install dependencies

pip install -qU langchain-mistralai

Set environment variables

import getpass
import os

os.environ["MISTRAL_API_KEY"] = getpass.getpass()

from langchain_mistralai import ChatMistralAI

llm = ChatMistralAI(model="mistral-large-latest")

Install dependencies

pip install -qU langchain-google-genai

Set environment variables

import getpass
import os

os.environ["GOOGLE_API_KEY"] = getpass.getpass()

from langchain_google_genai import ChatGoogleGenerativeAI

llm = ChatGoogleGenerativeAI(model="gemini-pro")

Install dependencies

pip install -qU langchain-openai

Set environment variables

import getpass
import os

os.environ["TOGETHER_API_KEY"] = getpass.getpass()

from langchain_openai import ChatOpenAI

llm = ChatOpenAI(
    base_url="https://api.together.xyz/v1",
    api_key=os.environ["TOGETHER_API_KEY"],
    model="mistralai/Mixtral-8x7B-Instruct-v0.1",
)

We'll use a prompt for RAG that is checked into the LangChain prompt hub.

from langchain import hub

prompt = hub.pull("rlm/rag-prompt")

example_messages = prompt.invoke(
    {"context": "filler context", "question": "filler question"}
).to_messages()
example_messages

[HumanMessage(content="You are an assistant for question-answering tasks. Use the following pieces of retrieved context to answer the question. If you don't know the answer, just say that you don't know. Use three sentences maximum and keep the answer concise.\nQuestion: filler question \nContext: filler context \nAnswer:")]

print(example_messages[0].content)

You are an assistant for question-answering tasks. Use the following pieces of retrieved context to answer the question. If you don't know the answer, just say that you don't know. Use three sentences maximum and keep the answer concise.
Question: filler question
Context: filler context
Answer:

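We'll use the LCEL Runnable protocol to define the chain, which lets us pipe together components and functions in a transparent way, automatically trace our chain in LangSmith, and get streaming, async, and batched calling out of the box.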

from langchain_core.output_parsers import StrOutputParser
from langchain_core.runnables import RunnablePassthrough

def format_docs(docs):
    return "\n\n".join(doc.page_content for doc in docs)

rag_chain = (
    {"context": retriever | format_docs, "question": RunnablePassthrough()}
    | prompt
    | llm
    | StrOutputParser()
)

for chunk in rag_chain.stream("What is Task Decomposition?"):
    print(chunk, end="", flush=True)

Task decomposition is a technique used to break down complex tasks into smaller and simpler steps. It involves transforming big tasks into multiple manageable tasks, allowing for easier interpretation and execution by autonomous agents or models. Task decomposition can be done through various methods, such as using prompting techniques, task-specific instructions, or human inputs.
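
To build intuition for how a dict and the `|` operator can compose into a chain, here is a toy re-creation of the pattern in plain Python. It is a sketch of the idea only and not LangChain's Runnable implementation: the `Step` class, `as_step`, and the fake retriever, prompt, and model are all invented for this example.

```python
class Step:
    """Wrap a one-argument function so steps can be chained with `|`."""

    def __init__(self, func):
        self.func = func

    def invoke(self, value):
        return self.func(value)

    def __or__(self, other):
        # self | other: run self first, then feed its output into other.
        nxt = as_step(other)
        return Step(lambda value: nxt.invoke(self.invoke(value)))

    def __ror__(self, other):
        # other | self, where `other` is a plain dict or function.
        return as_step(other) | self

def as_step(obj):
    """Coerce dicts and callables into Step objects."""
    if isinstance(obj, Step):
        return obj
    if isinstance(obj, dict):
        # A dict runs every branch on the same input and collects the results.
        branches = {key: as_step(value) for key, value in obj.items()}
        return Step(lambda value: {k: s.invoke(value) for k, s in branches.items()})
    if callable(obj):
        return Step(obj)
    raise TypeError(f"cannot turn {obj!r} into a Step")

def format_chunks(chunks):
    return "\n\n".join(chunks)

# Toy stand-ins for the retriever, passthrough, prompt, and model.
fake_retriever = Step(lambda question: ["chunk A", "chunk B"])
passthrough = Step(lambda value: value)
fake_prompt = Step(lambda d: f"Context: {d['context']}\nQuestion: {d['question']}")
fake_llm = Step(lambda text: text.upper())

chain = (
    {"context": fake_retriever | format_chunks, "question": passthrough}
    | fake_prompt
    | fake_llm
)
result = chain.invoke("what is rag?")
print(result)
```

The question is fed to both dict branches: one retrieves and formats context, the other passes the question through unchanged, and the resulting dict becomes the prompt's input.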

Check out the LangSmith trace.

Go deeper

Choosing a model

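ChatModel: An LLM-backed chat model. Takes in a sequence of messages and returns a message.

Docs
Integrations: 25+ integrations to choose from.
Interface: API reference for the base interface.

LLM: A text-in-text-out LLM. Takes in a string and returns a string.

Docs
Integrations: 75+ integrations to choose from.
Interface: API reference for the base interface.

See also the guide on RAG with locally running models.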

Customizing the prompt

As shown above, we can load prompts (e.g., this RAG prompt) from the prompt hub. The prompt can also be easily customized:

from langchain_core.prompts import PromptTemplate

template = """Use the following pieces of context to answer the question at the end.
If you don't know the answer, just say that you don't know, don't try to make up an answer.
Use three sentences maximum and keep the answer as concise as possible.
Always say "thanks for asking!" at the end of the answer.

{context}

Question: {question}

Helpful Answer:"""
custom_rag_prompt = PromptTemplate.from_template(template)

rag_chain = (
    {"context": retriever | format_docs, "question": RunnablePassthrough()}
    | custom_rag_prompt
    | llm
    | StrOutputParser()
)

rag_chain.invoke("What is Task Decomposition?")

'Task decomposition is a technique used to break down complex tasks into smaller and simpler steps. It involves transforming big tasks into multiple manageable tasks, allowing for a more systematic and organized approach to problem-solving. Thanks for asking!'
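
Conceptually, a prompt template with `{context}` and `{question}` placeholders behaves much like Python string formatting. The sketch below uses only `str.format` with a shortened, made-up template and a hypothetical `render` helper; it is an analogy for how the placeholders are filled, not a description of how `PromptTemplate` is implemented.

```python
template = """Use the following pieces of context to answer the question at the end.

{context}

Question: {question}

Helpful Answer:"""

def render(template, **variables):
    """Fill every {placeholder} in the template; a missing variable raises KeyError."""
    return template.format(**variables)

rendered = render(template, context="filler context", question="filler question")
print(rendered)
```

In the chain above, the dict produced by the first step plays the role of `variables`: its "context" and "question" keys must match the placeholder names in the template.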

Check out the LangSmith trace.

Next steps

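We've covered a lot of content in a short amount of time. There are plenty of features, integrations, and extensions to explore in each of the above sections. Along with the Go deeper sources mentioned above, good next steps include:

Return sources: Learn how to return source documents.
Streaming: Learn how to stream outputs and intermediate steps.
Add chat history: Learn how to add chat history to your app.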
