
LlamaIndex • 2024-05-21
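Secure code execution in LlamaIndex with Azure Container Apps dynamic sessions
One of the many impressive abilities of LLMs is generating executable code. This can be used to solve a wide range of complex problems that require calculation and fixed logic, which traditional computing excels at but LLMs can struggle to perform directly. When you build agents to carry out complex tasks, equipping them with code execution as an available tool can be a powerful strategy.
However, this strategy has a major drawback: executable code can be flawed or even dangerous to run, and detecting whether code is problematic before executing it is arguably an instance of the halting problem, so successful detection cannot be guaranteed.
The solution is sandboxing, which isolates potentially problematic code from the host environment. Now, thanks to dynamic sessions in Azure Container Apps, executing LLM-generated code in a sandbox directly from LlamaIndex is simple. It is implemented as a tool that any LlamaIndex agent can use.
In this blog post we'll show you exactly how to use the new Azure Code Interpreter tool and walk you through a few examples of how to make the most of it. You can see the full code in this notebook and read more in the tool documentation on LlamaHub and on learn.microsoft.com.
Set up Azure Container Apps dynamic sessions
First, install our Python packages, including the tool: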
pip install llama-index
pip install llama-index-llms-azure-openai
pip install llama-index-tools-azure-code-interpreter
In the notebook we use GPT 3.5 Turbo hosted on Azure as the LLM, but you can use any LLM capable of tool use:
from llama_index.llms.azure_openai import AzureOpenAI
llm = AzureOpenAI(
    model="gpt-35-turbo",
    deployment_name="gpt-35-deploy",
    api_key=api_key,
    azure_endpoint=azure_endpoint,
    api_version=api_version,
)
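The snippet above expects `api_key`, `azure_endpoint`, and `api_version` to already be defined. One way to supply them is from environment variables; the following is a minimal sketch, and the variable names in it are hypothetical rather than anything the library requires.

```python
import os

# Hypothetical environment variable names; adjust them to your own deployment.
REQUIRED_VARS = {
    "api_key": "AZURE_OPENAI_API_KEY",
    "azure_endpoint": "AZURE_OPENAI_ENDPOINT",
    "api_version": "AZURE_OPENAI_API_VERSION",
}


def load_azure_openai_settings(environ=os.environ):
    """Return the three connection settings, failing early if any is missing."""
    missing = [var for var in REQUIRED_VARS.values() if not environ.get(var)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return {name: environ[var] for name, var in REQUIRED_VARS.items()}
```

The returned dictionary uses the same keyword names as the constructor call above, so its entries can be passed straight through as `api_key`, `azure_endpoint`, and `api_version`.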
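Once you have your LLM set up, you need to create a session pool to host your executions. Doing this gives you a pool management endpoint URL that you can provide to LlamaIndex like this: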
# Import the AzureCodeInterpreterToolSpec from llama_index
from llama_index.tools.azure_code_interpreter import (
    AzureCodeInterpreterToolSpec,
)

# Create the AzureCodeInterpreterToolSpec with the pool_managment_endpoint set to your session management endpoint
# It is optional to set the local_save_path, but it is recommended to set it to a path where
# the tool can automatically save any intermediate data generated from Python code's output.
azure_code_interpreter_spec = AzureCodeInterpreterToolSpec(
    pool_managment_endpoint="your-pool-management-endpoint",
    local_save_path="local-file-path-to-save-intermediate-data",
)
This sets up a tool ready to be used in LlamaIndex. You're now ready to set up your agent:
# Import the ReActAgent
from llama_index.core.agent import ReActAgent
# Create the ReActAgent and inject the tools defined in the AzureCodeInterpreterToolSpec
agent = ReActAgent.from_tools(
    azure_code_interpreter_spec.to_tool_list(), llm=llm, verbose=True
)
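In this example we provide only a single tool, but you can give your ReAct agent any other tools you like. Now that you have an agent, you can ask it to perform tasks!
Dynamic sessions code interpreter in action
In our first example, we ask the agent the time in Seattle. This is usually a tricky task for LLMs, which don't know what time it is anywhere in the world!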
# Test the agent with simple answers that could leverage Python codes
print(agent.chat("Tell me the current time in Seattle."))
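The agent generates Python code to determine the time and convert it to the correct time zone. It passes this code to Azure Container Apps dynamic sessions, which executes the code and returns the answer: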
Thought: To provide the current time in Seattle, I need to calculate it based on the current UTC time and adjust for Seattle's time zone, which is Pacific Daylight Time (PDT) during daylight saving time and Pacific Standard Time (PST) outside of daylight saving time. PDT is UTC-7, and PST is UTC-8. I can use the code interpreter tool to get the current UTC time and adjust it accordingly.
Action: code_interpreter
Action Input: {'python_code': "from datetime import datetime, timedelta; import pytz; utc_now = datetime.now(pytz.utc); seattle_time = utc_now.astimezone(pytz.timezone('America/Los_Angeles')); seattle_time.strftime('%Y-%m-%d %H:%M:%S %Z%z')"}
Observation: {'$id': '1', 'status': 'Success', 'stdout': '', 'stderr': '', 'result': '2024-05-04 13:54:09 PDT-0700', 'executionTimeInMilliseconds': 120}
Thought: I can answer without using any more tools. I'll use the user's language to answer.
Answer: The current time in Seattle is 2024-05-04 13:54:09 PDT.
The current time in Seattle is 2024-05-04 13:54:09 PDT.
You can also use the tool to safely inspect and manipulate data, as in this example, where we ask it to open a CSV file and answer questions about it:
# Upload a sample temperature file of a day in Redmond Washington and ask a question about it
res = azure_code_interpreter_spec.upload_file(
    local_file_path="./TemperatureData.csv"
)
if len(res) != 0:
    print(
        agent.chat("Find the highest temperature in the file that I uploaded.")
    )
It doesn't just read the data from the CSV; it also performs calculations on it to determine the highest temperature:
Thought: I need to use the list_files tool to get the metadata for the uploaded file, and then use python to read the file and find the highest temperature.
Action: list_files
Action Input: {}
Observation: [RemoteFileMetadata(filename='TemperatureData.csv', size_in_bytes=514, file_full_path='/mnt/data/TemperatureData.csv')]
Thought: I have the metadata for the file. I need to use python to read the file and find the highest temperature.
Action: code_interpreter
Action Input: {'python_code': "import csv\n\nwith open('/mnt/data/TemperatureData.csv', 'r') as f:\n reader = csv.reader(f)\n next(reader)\n highest_temp = float('-inf')\n for row in reader:\n temp = float(row[1])\n if temp > highest_temp:\n highest_temp = temp\nprint(highest_temp)"}
Observation: {'$id': '1', 'status': 'Success', 'stdout': '12.4\n', 'stderr': '', 'result': '', 'executionTimeInMilliseconds': 26}
Thought: I have the highest temperature. I can answer the question.
Answer: The highest temperature in the file is 12.4 degrees.
The highest temperature in the file is 12.4 degrees.
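In the two traces so far, the answer arrives in different fields of the observation: the Seattle run ends with a bare expression and the value shows up in `result`, while the CSV run calls `print()` and the value shows up in `stdout`. If you handle these observations yourself, a small helper can hide that difference. This is a sketch based only on the dictionary shape visible in the traces; `extract_output` is a hypothetical name, not part of the tool.

```python
def extract_output(observation: dict) -> str:
    """Return the useful text from a result shaped like the observations above."""
    if observation.get("status") != "Success":
        raise RuntimeError(observation.get("stderr") or "execution failed")
    # Prefer 'result' when it is populated, otherwise fall back to 'stdout'.
    result = observation.get("result")
    if result not in (None, ""):
        return str(result)
    return (observation.get("stdout") or "").strip()


# Observations copied from the two traces above (trimmed to the relevant keys).
time_run = {"status": "Success", "stdout": "", "stderr": "",
            "result": "2024-05-04 13:54:09 PDT-0700"}
csv_run = {"status": "Success", "stdout": "12.4\n", "stderr": "", "result": ""}

print(extract_output(time_run))  # 2024-05-04 13:54:09 PDT-0700
print(extract_output(csv_run))   # 12.4
```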
The tool can also safely modify the files you upload, as in this example, where we have it sort the data in the CSV:
# Ask the LLM to perform modifications on the sample temperature file
print(
    agent.chat(
        "Rearrange the temperature data in a descending order and save it back to the original csv file."
    )
)
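In this example, you'll see that the LLM initially produces code with a syntax error, but the agent is smart enough to work out what went wrong and correct itself: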
Thought: To proceed, I need to first list the files in the session pool to identify the uploaded temperature data file.
Action: list_files
Action Input: {}
Observation: [RemoteFileMetadata(filename='TemperatureCurve.png', size_in_bytes=35512, file_full_path='/mnt/data/TemperatureCurve.png'), RemoteFileMetadata(filename='TemperatureData.csv', size_in_bytes=514, file_full_path='/mnt/data/TemperatureData.csv')]
Thought: The temperature data file is identified as 'TemperatureData.csv'. I need to read its contents, rearrange the temperature data in descending order, and then save the changes back to the original CSV file.
Action: code_interpreter
Action Input: {'python_code': "import pandas as pd\n\ndf = pd.read_csv('/mnt/data/TemperatureData.csv')\ndf_sorted = df.sort_values(by='Temperature', ascending=False)\ndf_sorted.to_csv('/mnt/data/TemperatureData.csv', index=False)\n'filename': 'TemperatureData.csv'"}
Observation: {'$id': '1', 'status': 'Failure', 'stdout': '', 'stderr': "illegal target for annotation (3407605186.py, line 6) Cell In[2363], line 6\n 'filename': 'TemperatureData.csv'\n ^\nSyntaxError: illegal target for annotation\n\n", 'result': None, 'executionTimeInMilliseconds': 9}
Thought: There was a syntax error in the Python code due to an incorrect line. I need to correct the code and execute it again to sort the temperature data and save it back to the CSV file.
Action: code_interpreter
Action Input: {'python_code': "import pandas as pd\n\ndf = pd.read_csv('/mnt/data/TemperatureData.csv')\ndf_sorted = df.sort_values(by='Temperature', ascending=False)\ndf_sorted.to_csv('/mnt/data/TemperatureData.csv', index=False)"}
Observation: {'$id': '1', 'status': 'Success', 'stdout': '', 'stderr': '', 'result': '', 'executionTimeInMilliseconds': 91}
Thought: I can answer without using any more tools. I'll use the user's language to answer.
Answer: The temperature data has been successfully rearranged in descending order and saved back to the original CSV file, 'TemperatureData.csv'.
The temperature data has been successfully rearranged in descending order and saved back to the original CSV file, 'TemperatureData.csv'.
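The failed attempt above cost a round trip to the session pool for an error that is visible without running anything. As an optional extra that the tool itself is not known to perform, generated code could be parsed locally first. The sketch below uses the standard library's `ast.parse`, which only parses the source and never executes or imports anything; `syntax_error_in` is a hypothetical helper name.

```python
import ast


def syntax_error_in(python_code: str):
    """Return a short description of the first syntax error, or None if the code parses."""
    try:
        ast.parse(python_code)
    except SyntaxError as exc:
        return f"line {exc.lineno}: {exc.msg}"
    return None


good = "import pandas as pd\n\ndf = pd.read_csv('/mnt/data/TemperatureData.csv')\n"
# Same stray line that broke the first attempt in the trace above.
bad = good + "'filename': 'TemperatureData.csv'"

print(syntax_error_in(good))  # None
print(syntax_error_in(bad))   # describes the error on the stray line
```

A check like this catches only syntax errors. Code that parses cleanly can still fail or misbehave at run time, which is why the sandbox remains necessary.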
Modifying files is of no use unless you can retrieve them afterwards, which is done like this:
# Download the modified file
azure_code_interpreter_spec.download_file_to_local(
    remote_file_path="TemperatureData.csv",
    local_file_path="/.../SortedTemperatureData.csv",
)
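Once the file is on your machine, it is worth confirming that the agent really did what it reported. The sketch below checks that a CSV's `Temperature` column (the column name used in the agent's generated code) is in descending order. The sample rows and the `Time` column are made up for illustration, since the real file's contents are not shown in this post.

```python
import csv
import io


def is_sorted_descending(csv_text: str, column: str = "Temperature") -> bool:
    """Check that the numeric values in `column` never increase from row to row."""
    rows = csv.DictReader(io.StringIO(csv_text))
    values = [float(row[column]) for row in rows]
    return all(a >= b for a, b in zip(values, values[1:]))


# Made-up sample data; only the 'Temperature' header comes from the trace above.
sample = "Time,Temperature\n14:00,12.4\n09:00,10.1\n03:00,7.8\n"
print(is_sorted_descending(sample))  # True
```

To check the downloaded file, read its text with `open(path, newline="")` and pass the contents to `is_sorted_descending`.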
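Endless possibilities
The range of tasks you can accomplish with sandboxed code execution is as broad as programming itself, and because execution is isolated from the host environment, you can confidently hand agents tasks you might previously have hesitated to give them. We think this is a fantastic addition to the capabilities of LLM agents, and we're excited to see what you build with it.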