Sherlock Xu • 2024-01-03

Building an Intelligent Query-Response System with LlamaIndex and OpenLLM

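Over the past year, large language models (LLMs) like GPT-4 have not only transformed how we interact with machines but also redefined what is possible in natural language processing (NLP). A notable trend in this evolution is the growing popularity of open-source LLMs such as Llama 2, Falcon, OPT, and Yi. Some may prefer them over commercial products for reasons of accessibility, data security and privacy, customization potential, cost, and vendor dependency. Among the tools gaining traction in the LLM space, OpenLLM and LlamaIndex are two powerful platforms that, when combined, unlock new use cases for building AI-driven applications.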

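OpenLLM is an open-source platform for deploying and operating any open-source LLM in production. Its flexibility and ease of use make it an ideal choice for AI application developers seeking to harness the power of LLMs. You can easily fine-tune, serve, deploy, and monitor LLMs in a wide range of creative and practical applications.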

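LlamaIndex provides a comprehensive framework for managing and retrieving private and domain-specific data. It acts as a bridge between the broad knowledge of LLMs and the unique, contextual data needs of specific applications.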

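OpenLLM's support for a diverse range of open-source LLMs and LlamaIndex's ability to seamlessly integrate custom data sources give developers in both communities a great deal of room for customization. This combination allows them to create AI solutions that are both highly intelligent and properly tailored to specific data contexts, which is very important for query-response systems.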

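In this blog post, I will explain how you can leverage the combined strengths of OpenLLM and LlamaIndex to build an intelligent query-response system. This system can understand, process, and respond to queries by tapping into a custom corpus.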

Setting up the environment

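The first step is to create a virtual environment on your machine, which helps prevent conflicts with other Python projects you might be working on. Let's call it llamaindex-openllm and activate it.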

python -m venv llamaindex-openllm
source llamaindex-openllm/bin/activate

Install the required packages. This command installs OpenLLM with the optional vllm component (I will explain it later).

pip install "openllm[vllm]" llama-index llama-index-llms-openllm llama-index-embeddings-huggingface

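To handle requests, you need an LLM server. Here, I use the following command to start a Llama 2 7B local server at http://localhost:3000. Feel free to choose any model that fits your needs. If you already have a remote LLM server, you can skip this step.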

openllm start meta-llama/Llama-2-7b-chat-hf --backend vllm

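OpenLLM automatically selects the most suitable runtime implementation for the model. For models with vLLM support, OpenLLM uses vLLM by default. Otherwise, it falls back to PyTorch. vLLM is a high-throughput and memory-efficient inference and serving engine for LLMs. According to one report, using vLLM can deliver up to 23x LLM inference throughput while reducing P50 latency.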

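Note: To use the vLLM backend, you need a GPU with at least the Ampere architecture (or newer) and CUDA version 11.8. This demo uses a machine with an Ampere A100-80G GPU. If your machine has a compatible GPU, you can also choose vLLM. Otherwise, simply install the standard OpenLLM package (pip install openllm) in the previous command.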
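The backend decision in the note above can be sketched as a small helper. This is an illustration only, not part of OpenLLM: the function name pick_backend is hypothetical, it assumes that Ampere corresponds to CUDA compute capability 8.0 or higher, and the string "pt" for the PyTorch backend is an assumption you should check against the help output of your OpenLLM version.

```python
from typing import Optional, Tuple


def pick_backend(compute_capability: Optional[Tuple[int, int]]) -> str:
    """Return "vllm" for Ampere-or-newer GPUs, otherwise "pt" (PyTorch).

    compute_capability is a (major, minor) tuple, or None when no CUDA GPU
    is available. Assumption: Ampere GPUs report capability 8.0 or higher.
    """
    if compute_capability is None:
        return "pt"
    major, _minor = compute_capability
    return "vllm" if major >= 8 else "pt"


if __name__ == "__main__":
    # An A100 reports (8, 0); a V100 reports (7, 0); None means no GPU.
    for cc in [(8, 0), (7, 0), None]:
        print(cc, "->", pick_backend(cc))
```

On a machine with PyTorch installed, torch.cuda.get_device_capability() returns a tuple of this shape that you could pass in.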

v1: Creating a simple completion service

Before building a query-response system, let's get familiar with the integration of OpenLLM and LlamaIndex and use it to create a simple completion service.

The integration provides two APIs for interacting with LLMs:

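1. OpenLLM: This can be used to start a local LLM server directly, without the need to start a separate one using commands like openllm start. Here's how you can use it: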

from llama_index.llms.openllm import OpenLLM
llm = OpenLLM('meta-llama/Llama-2-7b-chat-hf')

2. OpenLLMAPI: This can be used to interact with a server hosted elsewhere, like the Llama 2 7B model I started earlier.

Let's try the complete endpoint and see whether the Llama 2 7B model can explain what OpenLLM is by completing the sentence "OpenLLM is an open source tool for".

from llama_index.llms.openllm import OpenLLMAPI

remote_llm = OpenLLMAPI(address="http://localhost:3000")

completion_response = remote_llm.complete("OpenLLM is an open source tool for", max_new_tokens=1024)
print(completion_response)

Run this script. Here is the output:

learning lifelong learning models. It is designed to be easy to use, even for those without extensive knowledge of machine learning. OpenLLM allows users to train, evaluate, and deploy lifelong learning models using a variety of datasets and algorithms.

OpenLLM provides a number of features that make it useful for learning lifelong learning models. Some of these features include:

1. Easy-to-use interface: OpenLLM provides an easy-to-use interface that makes it simple to train, evaluate, and deploy lifelong learning models.
2. Support for a variety of datasets: OpenLLM supports a variety of datasets, including images, text, and time-series data.
3. Support for a variety of algorithms: OpenLLM supports a variety of algorithms for lifelong learning, including neural networks, decision trees, and support vector machines.
4. Evaluation tools: OpenLLM provides a number of evaluation tools that allow users to assess the performance of their lifelong learning models.
5. Deployment tools: OpenLLM provides a number of deployment tools that allow users to deploy their lifelong learning models in a variety of environments.

OpenLLM is written in Python and is available under an open source license. It is designed to be used in a variety of settings, including research, education, and industry.

Some potential use cases for OpenLLM include:

1. Training lifelong learning models for image classification: OpenLLM could be used to train a lifelong learning model to classify images based on their content.
2. Training lifelong learning models for natural language processing: OpenLLM could be used to train a lifelong learning model to process and analyze natural language text.
3. Training lifelong learning models for time-series data: OpenLLM could be used to train a lifelong learning model to predict future values in a time-series dataset.
4. Deploying lifelong learning models in a production environment: OpenLLM could be used to deploy a lifelong learning model in a production environment, such as a recommendation system or a fraud detection system.

Overall, OpenLLM is a powerful tool for learning lifelong learning models. Its ease of use, flexibility, and support for a variety of datasets and algorithms make it a valuable resource for researchers and practitioners in a variety of fields.

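Obviously, the model did not explain OpenLLM correctly and hallucinated along the way 🤣. Nevertheless, the code works well: the server returned a response to the request. This is a good start as we proceed to build the system.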

v2: Enhancing with a query-response system

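The initial version revealed a key limitation: the model lacks specific knowledge about OpenLLM. One solution is to feed the model domain-specific information so that it can draw on it when responding to topic-specific queries. This is where LlamaIndex comes into play, enabling you to build a local knowledge base with pertinent information. Specifically, you create a directory (for example, data) and build an index over all the documents in that folder.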

Create a folder and import the OpenLLM GitHub README file into it:

mkdir data
cd data
wget https://raw.githubusercontent.com/bentoml/OpenLLM/main/README.md
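One thing to watch when fetching files from GitHub: a github.com URL containing /blob/ serves the HTML page that renders the file, whereas the raw.githubusercontent.com form serves the file itself, which is what you want to index. Here is a minimal sketch of a helper that converts one form to the other. The function name github_blob_to_raw is hypothetical and not part of any library.

```python
from urllib.parse import urlparse


def github_blob_to_raw(url: str) -> str:
    """Convert a github.com /blob/ page URL to its raw-content URL.

    URLs that do not match the expected pattern are returned unchanged.
    """
    parsed = urlparse(url)
    parts = parsed.path.strip("/").split("/")
    # Expected path layout: <owner>/<repo>/blob/<branch>/<path...>
    if parsed.netloc != "github.com" or len(parts) < 5 or parts[2] != "blob":
        return url
    owner, repo, _blob, *rest = parts
    return "https://raw.githubusercontent.com/" + "/".join([owner, repo, *rest])


if __name__ == "__main__":
    print(github_blob_to_raw(
        "https://github.com/bentoml/OpenLLM/blob/main/README.md"
    ))
```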

Go back to the parent directory and create a script called starter.py with the following content:

from llama_index.core import VectorStoreIndex, SimpleDirectoryReader, ServiceContext
from llama_index.llms.openllm import OpenLLMAPI
from llama_index.core.node_parser import SentenceSplitter

# Change the address to your OpenLLM server
llm = OpenLLMAPI(address="http://localhost:3000")

# Break down the document into manageable chunks (each of up to 1024 tokens, with a 20-token overlap)
text_splitter = SentenceSplitter(chunk_size=1024, chunk_overlap=20)

# Create a ServiceContext with the custom model and all the configurations
service_context = ServiceContext.from_defaults(
    llm=llm,
    embed_model="local",
    text_splitter=text_splitter,
    context_window=8192,
    num_output=4096,
)

# Load documents from the data directory
documents = SimpleDirectoryReader("data").load_data()

# Build an index over the documents using the customized LLM in the ServiceContext
index = VectorStoreIndex.from_documents(documents, service_context=service_context)

# Query your data using the built index
query_engine = index.as_query_engine()
response = query_engine.query("What is OpenLLM?")
print(response)

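To improve response quality, I recommend defining a SentenceSplitter, which gives you finer control over how the input is processed and leads to better output quality.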
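To build some intuition for what chunk_size and chunk_overlap control, here is a simplified sketch of overlapping chunking. It is not SentenceSplitter's actual algorithm: it works on a plain list of whitespace-separated words instead of tokenizer tokens, it ignores sentence boundaries, and the function name chunk_with_overlap is hypothetical.

```python
from typing import List


def chunk_with_overlap(
    tokens: List[str], chunk_size: int, chunk_overlap: int
) -> List[List[str]]:
    """Split tokens into windows of chunk_size that share chunk_overlap items."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be >= 0 and smaller than chunk_size")

    step = chunk_size - chunk_overlap  # how far each new window advances
    chunks: List[List[str]] = []
    for start in range(0, len(tokens), step):
        chunks.append(tokens[start:start + chunk_size])
        if start + chunk_size >= len(tokens):
            break  # this window already reached the end of the input
    return chunks


if __name__ == "__main__":
    words = "one two three four five six seven".split()
    for chunk in chunk_with_overlap(words, chunk_size=3, chunk_overlap=1):
        print(chunk)
```

Each chunk repeats the tail of the previous one, so a sentence that straddles a boundary is less likely to lose its context in both chunks.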

In addition, you can set streaming=True to stream your response:

query_engine = index.as_query_engine(streaming=True)
response = query_engine.query("What is OpenLLM?")
response.print_response_stream()

Your directory structure should now look like this:

├── starter.py
└── data
    └── README.md

Run starter.py to test the query-response system. The output should be consistent with the content of the OpenLLM README. Here is the response I received:

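OpenLLM is an open-source platform for deploying and managing large language models (LLMs) in a variety of environments, including on-premises, cloud, and edge devices. It provides a comprehensive suite of tools and features for fine-tuning, serving, deploying, and monitoring LLMs, simplifying the end-to-end deployment workflow for LLMs.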
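Why does the answer now track the README? Conceptually, the query engine retrieves the chunks most similar to the question and hands them to the LLM as context. The toy sketch below illustrates that idea under loud assumptions: it uses word counts in place of a neural embedding model, a plain list in place of VectorStoreIndex, and hypothetical function names (embed, cosine, retrieve, build_prompt) and sample chunks. It is not how LlamaIndex is implemented.

```python
import math
import re
from collections import Counter
from typing import List


def embed(text: str) -> Counter:
    """Toy 'embedding': counts of lowercase alphanumeric words."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def retrieve(query: str, chunks: List[str], top_k: int = 1) -> List[str]:
    """Return the top_k chunks ranked by similarity to the query."""
    q = embed(query)
    ranked = sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)
    return ranked[:top_k]


def build_prompt(query: str, context_chunks: List[str]) -> str:
    """Assemble the text that would be sent to the LLM."""
    context = "\n".join(context_chunks)
    return f"Context:\n{context}\n\nQuestion: {query}\nAnswer:"


if __name__ == "__main__":
    # Hypothetical sample chunks standing in for pieces of the indexed README.
    chunks = [
        "OpenLLM is an open platform for operating large language models in production.",
        "vLLM is a high-throughput inference engine.",
        "LlamaIndex manages private data.",
    ]
    question = "What is OpenLLM?"
    print(build_prompt(question, retrieve(question, chunks)))
```

Because the retrieved context sits in the prompt, the model can answer from the README instead of guessing, which is what was missing in v1.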

Conclusion

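The exploration in this article underscores the importance of customizing AI tools to fit specific needs. By using OpenLLM for flexible LLM deployment and LlamaIndex for data management, I demonstrated how to create an AI-powered system. It is capable of not only understanding and processing queries but also delivering responses based on a unique knowledge base. I hope this blog post has inspired you to explore more capabilities and use cases of OpenLLM and LlamaIndex. Happy coding! ⌨️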
