Cobus Greyling • 2024-01-30

Agentic RAG With LlamaIndex

Agentic RAG explores how agents can be incorporated into existing RAG pipelines to enhance conversational search and retrieval.

Introduction

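Given the architecture below, it is clear how Agentic RAG yields an implementation that is easy to scale. New documents can be added, and each new set of documents is managed by its own sub-agent.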

The diagram below shows the basic structure of LlamaIndex's Agentic RAG approach, in which a large set of documents is ingested, limited to 100 in this example.

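A large corpus of data is broken up into smaller documents. An agent is created for each document, and each document agent is able to search via embeddings and to summarize the response.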
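To make the per-document agent idea concrete, here is a minimal sketch in plain Python. It is not LlamaIndex code: `DocumentAgent`, `vector_search`, and `summarize` are hypothetical names, a bag-of-words cosine similarity stands in for real embeddings, and the "summary" simply joins the first chunks where a real agent would call an LLM.

```python
import math
import re
from collections import Counter
from dataclasses import dataclass


def embed(text: str) -> Counter:
    """Toy 'embedding': a bag-of-words count vector (stand-in for a real model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


@dataclass
class DocumentAgent:
    """Hypothetical per-document agent with a search tool and a summary tool."""
    name: str
    chunks: list

    def vector_search(self, query: str, top_k: int = 1) -> list:
        """Return the top_k chunks most similar to the query."""
        q = embed(query)
        ranked = sorted(self.chunks, key=lambda c: cosine(q, embed(c)), reverse=True)
        return ranked[:top_k]

    def summarize(self, max_chunks: int = 2) -> str:
        """Naive summary: join the first few chunks (an LLM would do this in practice)."""
        return " ".join(self.chunks[:max_chunks])


agent = DocumentAgent(
    name="connectors",
    chunks=[
        "Data connectors ingest data from APIs, PDFs, and SQL.",
        "Indexes structure data for retrieval.",
    ],
)
print(agent.vector_search("How do I ingest SQL data?"))
```

Each document gets its own small object like this, which is what lets new documents be added without touching the agents that already exist.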

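A top-level agent is created over the set of document agents. This meta-agent (top-level agent) performs tool retrieval and then uses chain-of-thought reasoning to answer the user's question.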
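The tool-retrieval step can be sketched as follows. This is an illustration under stated assumptions, not the notebook's implementation: `Tool`, `retrieve_tools`, and `top_agent` are hypothetical names, tools are ranked by simple word overlap instead of embeddings, and the chain-of-thought reasoning that an LLM would perform over the observations is left out.

```python
import re
from typing import Callable, NamedTuple


class Tool(NamedTuple):
    """Hypothetical wrapper around one document agent."""
    name: str
    description: str
    run: Callable[[str], str]


def tokens(text: str) -> set:
    """Lowercased word set used for the toy overlap score."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def retrieve_tools(query: str, tools: list, top_k: int = 2) -> list:
    """Rank tools by word overlap between the query and each tool description."""
    q = tokens(query)
    ranked = sorted(tools, key=lambda t: len(q & tokens(t.description)), reverse=True)
    return ranked[:top_k]


def top_agent(query: str, tools: list, top_k: int = 2) -> list:
    """Retrieve candidate tools, call each one, and collect (tool name, observation) pairs."""
    return [(t.name, t.run(query)) for t in retrieve_tools(query, tools, top_k)]


tools = [
    Tool("tool_connectors", "Answers questions about data connectors", lambda q: "connectors doc answer"),
    Tool("tool_indexes", "Answers questions about index structures", lambda q: "indexes doc answer"),
    Tool("tool_agents", "Answers questions about agents and tools", lambda q: "agents doc answer"),
]
print(top_agent("Tell me about LlamaIndex connectors", tools, top_k=1))
```

Retrieving a handful of tools first keeps the top-level agent's prompt small even when there are a hundred document agents to choose from.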

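The Rerank endpoint computes a relevance score between the query and each document, and returns the documents as a list sorted from most to least relevant.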
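The contract of a rerank step (score every document against the query, then sort) can be shown with a toy example. The scoring function below is an assumption made purely for illustration: it counts the fraction of query words found in each document, whereas a hosted rerank endpoint uses a trained model. `relevance` and `rerank` are hypothetical names.

```python
import re
from typing import Optional


def relevance(query: str, document: str) -> float:
    """Toy relevance score: fraction of query words that appear in the document."""
    q = set(re.findall(r"[a-z0-9]+", query.lower()))
    d = set(re.findall(r"[a-z0-9]+", document.lower()))
    return len(q & d) / len(q) if q else 0.0


def rerank(query: str, documents: list, top_n: Optional[int] = None) -> list:
    """Return (original index, score) pairs sorted from most to least relevant."""
    scored = [(i, relevance(query, doc)) for i, doc in enumerate(documents)]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_n]


docs = [
    "Indexes structure data for retrieval.",
    "Data connectors ingest data from many sources.",
    "Query engines answer questions.",
]
for index, score in rerank("data connectors", docs):
    print(f"{score:.2f}  {docs[index]}")
```

Returning the original index alongside the score lets the caller map results back to the documents it submitted.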

Notebook Example

You can find a Colab notebook here with a complete, working, and executed example of this implementation.

To run this notebook, you will need OpenAI and Cohere API keys...

import os
import openai
import cohere

os.environ["OPENAI_API_KEY"] = "Your API Key goes here"
os.environ["COHERE_API_KEY"] = "Your API Key goes here"

For each document agent, the system prompt is:

You are a specialized agent designed to answer queries about the 
`{file_base}.html` part of the LlamaIndex docs.

You must ALWAYS use at least one of the tools provided when answering a 
question; do NOT rely on prior knowledge.
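The `{file_base}` placeholder is filled in once per document. As a sketch, assuming `file_base` is the parent directory name joined to the file stem (an assumption inferred from names such as `latest_search` in the trace later in this article, not confirmed by the text), the prompt could be built like this. `build_system_prompt` is a hypothetical helper.

```python
from pathlib import Path

SYSTEM_PROMPT_TEMPLATE = (
    "You are a specialized agent designed to answer queries about the "
    "`{file_base}.html` part of the LlamaIndex docs.\n"
    "You must ALWAYS use at least one of the tools provided when answering a "
    "question; do NOT rely on prior knowledge."
)


def build_system_prompt(path: str) -> str:
    """Fill the template; file_base = '<parent dir>_<file stem>' (assumed naming scheme)."""
    file_path = Path(path)
    file_base = f"{file_path.parent.name}_{file_path.stem}"
    return SYSTEM_PROMPT_TEMPLATE.format(file_base=file_base)


print(build_system_prompt("/content/docs.llamaindex.ai/en/latest/search.html"))
```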

The system prompt for the meta-agent (top-level agent) is:

You are an agent designed to answer queries about the documentation.
Please always use the tools provided to answer a question.
Do not rely on prior knowledge

Working Example

I posed the following question to the agent:

Tell me about LlamaIndex connectors

Below is the detailed answer the agent generated:

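Thought: I need to use a tool to help me answer the question.

Action: tool_latest_search
Action Input: {'input': 'LlamaIndex connectors'}

Added user message to memory: LlamaIndex connectors

=== Calling Function ===
Calling function: vector_tool_latest_search with args: {
"input": "LlamaIndex connectors"
}
Got output: The LlamaIndex connectors can be found on the /content/docs.llamaindex.ai/en/latest/search.html page.
========================

Observation: The `latest_search.html` part of the LlamaIndex docs does not specifically mention LlamaIndex connectors. You may need to check other parts of the documentation for detailed information about LlamaIndex connectors.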

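Thought: I need to search for information about LlamaIndex connectors in a different tool.

Action: tool_latest_index
Action Input: {'input': 'LlamaIndex connectors'}

Added user message to memory: LlamaIndex connectors

=== Calling Function ===
Calling function: vector_tool_latest_index with args: {
"input": "LlamaIndex connectors"
}
Got output: LlamaIndex provides data connectors that allow you to ingest your existing data from various sources and formats. These connectors can work with APIs, PDFs, SQL, and more. By using LlamaIndex connectors, you can easily bring your data into the LlamaIndex ecosystem and make it accessible and retrievable through natural language.
========================

Observation: LlamaIndex offers data connectors that enable you to import your existing data from a variety of sources and formats. These connectors are compatible with APIs, PDFs, SQL, and more. Using LlamaIndex connectors, you can seamlessly integrate your data into the LlamaIndex ecosystem, making it readily accessible and retrievable through natural language.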

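Thought: I can answer without using any more tools.

Answer: LlamaIndex connectors are used to import existing data from various sources and formats into the LlamaIndex ecosystem. These connectors are compatible with APIs, PDFs, SQL, and more, allowing data to be integrated seamlessly for natural-language access and retrieval.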
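The trace above follows a Thought / Action / Observation pattern: the agent tries one tool, finds the observation unhelpful, and moves on to another. The control flow can be sketched with stubbed tools as below. This is a simplification under stated assumptions: `react_loop` is a hypothetical function, tools signal "nothing useful" by returning `None`, and tools are tried in a fixed order, whereas the real agent lets the LLM choose the next action and judge each observation.

```python
from typing import Callable, Dict, List, Optional, Tuple


def react_loop(
    question: str,
    tools: Dict[str, Callable[[str], Optional[str]]],
    max_steps: int = 5,
) -> Tuple[Optional[str], List[str]]:
    """Try tools one at a time, recording an Action/Observation/Thought trace."""
    trace: List[str] = []
    for step, (name, tool) in enumerate(tools.items()):
        if step >= max_steps:
            break
        trace.append(f"Action: {name}")
        observation = tool(question)
        trace.append(f"Observation: {observation}")
        if observation is not None:
            trace.append("Thought: I can answer without using any more tools.")
            return observation, trace
        trace.append("Thought: I need to search in another tool.")
    return None, trace


# Stubbed tools mirroring the trace: the first finds nothing, the second answers.
tools = {
    "tool_latest_search": lambda q: None,
    "tool_latest_index": lambda q: "LlamaIndex provides data connectors.",
}
answer, trace = react_loop("LlamaIndex connectors", tools)
print("\n".join(trace))
print("Answer:", answer)
```

The `max_steps` cap is a common safeguard in loops of this kind, so that a question no tool can answer does not keep the agent running indefinitely.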

Below is a snippet from the Colab notebook: