
Laurie Voss • 2024-07-17
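Building a multi-agent concierge system
Why build this?
Interactive chatbots are by now a familiar solution for customer service, and agents are a frequent component of chatbot implementations. They provide memory, introspection, tool use and the other features a competent bot needs.
We have become interested in larger-scale chatbots: ones that can complete dozens of tasks, some of which depend on each other, using hundreds of tools. What would such an agent look like? It would have an enormous system prompt and a huge number of tools to choose from, which can confuse an agent.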
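Imagine a bank implementing a system that can:
- Look up the price of a specific stock
- Authenticate the user
- Check your account balance
  - This requires the user to be authenticated
- Transfer money between accounts
  - This requires the user to be authenticated
  - It also requires the user to check their account balance first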
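Each of these top-level tasks has sub-tasks, for example:
- Looking up a stock price may require looking up the stock symbol first
- Authenticating the user requires gathering a username and a password
- Checking an account balance requires knowing which of the user's accounts to check
Coming up with a single primary prompt for all of these tasks and sub-tasks would be very complex. So instead, we designed a multi-agent system in which one agent is responsible for each top-level task, plus a "concierge" agent that directs the user to the correct agent.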
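To make the dependencies between tasks concrete, here is a minimal sketch that writes them down as plain data. The names PREREQUISITES and next_step are hypothetical and are not taken from the system described in this post, which expresses these rules in natural language inside a prompt rather than in code.

```python
# Hypothetical illustration of the task dependencies listed above.
# Each task maps to the tasks that must be completed before it, in order.
PREREQUISITES = {
    "stock_lookup": [],
    "authenticate": [],
    "account_balance": ["authenticate"],
    "transfer_money": ["authenticate", "account_balance"],
}


def next_step(goal: str, completed: set) -> str:
    """Return the first unmet prerequisite of `goal`, or `goal` itself."""
    for task in PREREQUISITES[goal]:
        if task not in completed:
            return task
    return goal
```

Calling next_step("transfer_money", set()) returns "authenticate", which mirrors the routing decision a single giant prompt would otherwise have to get right for every task at once.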
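What we built
We built a system of agents to accomplish the tasks above. It's open source! There are four basic "task" agents:
- A stock lookup agent (which handles sub-tasks such as looking up symbols)
- An authentication agent (which asks for a username and password)
- An account balance agent (which handles sub-tasks such as selecting an account)
- A money transfer agent (which handles tasks such as asking which account to transfer to and how much)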
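There are also three "meta" agents:
- A concierge agent: this agent interacts with the user when they first arrive, tells them what kinds of tasks are available, and gives feedback when tasks are complete.
- An orchestration agent: this agent never provides output directly to the user. Instead, it looks at what the user is currently trying to accomplish and responds with the plain-text name of the agent that should handle the task. The code then routes to that agent.
- A continuation agent: sometimes several agents must be chained together to complete a task. For example, to check your account balance you first need to be authenticated. The authentication agent doesn't know whether you were simply authenticating yourself or whether it is part of a chain, and it doesn't need to. When the authentication agent finishes, the continuation agent checks the chat history to see what the original task was, and if there is more to do, it formulates a new request to the orchestration agent so you can be guided through the task without further user input.
A global state keeps track of the user and their current state, and is shared between all the agents.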
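As a rough sketch of what such a shared state might look like: the keys current_speaker and just_finished appear in the code later in this post, while get_initial_state, record_login and the remaining keys are assumptions made purely for illustration.

```python
from typing import Any, Dict


def get_initial_state() -> Dict[str, Any]:
    """Build the state dictionary shared by every agent (hypothetical layout)."""
    return {
        # Keys used by the code shown later in this post
        "current_speaker": None,  # name of the sub-agent that holds the floor
        "just_finished": False,   # set by a sub-agent when its task is done
        # Assumed keys, for illustration only
        "username": None,
        "session_token": None,
    }


def record_login(shared: Dict[str, Any], username: str) -> None:
    """Example of one agent mutating the state that all agents share."""
    shared["username"] = username
    shared["session_token"] = f"token-for-{username}"  # placeholder value


state = get_initial_state()
record_login(state, "seldo")
```

Because every agent receives the same dictionary object, a change made by one agent is immediately visible to all the others.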
The flow of the system looks roughly like this:

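The system in action
To see how this works in practice, here is sample output that includes helpful debugging statements. Output that would ordinarily be shown to the user is prefixed with two angle brackets (>>), and user input is prefixed with one (>).
At the start of the conversation nothing has happened yet, so you are routed to the concierge agent: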
No current speaker, asking orchestration agent to decide
Concierge agent selected
>> Hi there! How can I assist you today? Here are some things I can help you with:
>> Looking up a stock price
>> Authenticating you
>> Checking an account balance (requires authentication first)
>> Transferring money between accounts (requires authentication and checking an account balance first)
>> What would you like to do?
> Transfer money
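The "transfer money" task requires authentication. When deciding how to route you, the orchestration agent checks whether you are authenticated (it checks twice for some reason; this is a demo!):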
No current speaker, asking orchestration agent to decide
Orchestrator is checking if authenticated
Orchestrator is checking if authenticated
Auth agent selected
It correctly determines that you are not authenticated, so it routes you to the authentication agent:
>> To transfer money, I need to authenticate you first. Could you please provide your username and password?
> seldo
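This is a fun part: you have provided input, but not enough to complete the task (you didn't give a password). So when the flow returns to the orchestration agent, the global state shows that the "authenticate" agent is already running and hasn't finished. It therefore routes back to the authentication agent, which asks again, this time for the password: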
There's already a speaker: authenticate
Auth agent selected
Recording username
>> Thank you! Now, could you please provide your password?
> monkey
There's already a speaker: authenticate
Auth agent selected
Logging in seldo
Checking if authenticated
Authentication is complete
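The authentication agent has now called a done() function, which signals to the global state that it has completed its task. The flow therefore passes to the continuation agent, which looks at the chat history and sees that the user was trying to transfer money. It generates a prompt, as if spoken by the user, and sends it to the orchestration agent: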
>> You have been successfully authenticated. Another agent will assist you with transferring money.
Asking the continuation agent to decide what to do next
Continuation agent said "I would like to transfer money."
No current speaker, asking orchestration agent to decide
Orchestrator checking if account has a balance
Orchestrator checking if account has a balance
Account balance agent selected
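You are now authenticated but have not yet checked your balance, which the orchestration agent knows is required before a transfer. So it routes you to the account balance agent (after checking twice, for some reason):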
>> Before you can transfer money, you need to check your account balance. Let's start by looking up your account balance. Could you please provide the name of the account you're interested in?
> Checking
There's already a speaker: account_balance
Account balance agent selected
Looking up account ID for Checking
Looking up account balance for 1234567890
Account balance lookup is complete
>> Your Checking account has a balance of $1000. Another agent will assist you with transferring money.
Asking the continuation agent to decide what to do next
Continuation agent said "I would like to transfer money."
No current speaker, asking orchestration agent to decide
Transfer money agent selected
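The account balance agent asks which account you mean, uses a tool to get the ID for that account, and then marks itself as done. The continuation agent kicks in again, sees that you still haven't completed the original task of transferring money, and prompts the orchestration agent once more. Unfortunately, the orchestration agent gets slightly confused and loops twice before finally routing you to the transfer agent: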
Money transfer is complete
>> Another agent will assist you with transferring money.
Asking the continuation agent to decide what to do next
Continuation agent said "I would like to transfer money."
No current speaker, asking orchestration agent to decide
Transfer money agent selected
Money transfer is complete
>> Another agent will assist you with transferring money.
Asking the continuation agent to decide what to do next
Continuation agent said "I would like to transfer money."
No current speaker, asking orchestration agent to decide
Orchestrator checking if account has a balance
Transfer money agent selected
>> You have already checked your account balance. Please provide the following details to proceed with the money transfer:
>> 1. The account ID to which you want to transfer the money.
>> 2. The amount you want to transfer.
> To account ID 1234324
There's already a speaker: transfer_money
Transfer money agent selected
>> How much would you like to transfer to account ID 1234324?
> 500
There's already a speaker: transfer_money
Transfer money agent selected
Checking if balance is sufficient
Transferring 500 from 1234567890 account 1234324
Money transfer is complete
>> The transfer of $500 to account ID 1234324 has been successfully completed. If you need any further assistance, feel free to ask!
Asking the continuation agent to decide what to do next
Continuation agent said no_further_tasks
The task is complete! The continuation agent finds no further tasks and routes you back to the concierge.
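The code
Now let's look at some highlights of the code that makes all of this happen. At the core of the system is a central loop that runs forever. At its heart is a very simple block that asks the orchestration agent who should speak next and stores the answer in next_speaker, which the loop then uses to update the state object passed between all the agents. Note that if a sub-agent is already speaking, that agent continues to speak.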
current_history = root_memory.get()
# who should speak next?
if state["current_speaker"]:
    print(f"There's already a speaker: {state['current_speaker']}")
    next_speaker = state["current_speaker"]
else:
    print("No current speaker, asking orchestration agent to decide")
    orchestration_response = orchestration_agent_factory(state).chat(
        user_msg_str,
        chat_history=current_history
    )
    next_speaker = str(orchestration_response).strip()
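The orchestration agent has a very strict prompt, because its output is only ever read by other machines. The prompt includes a natural-language summary of the dependencies between agents: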
system_prompt = (f"""
    You are an orchestration agent.
    Your job is to decide which agent to run based on the current state of the user and what they've asked to do. Agents are identified by short strings.
    What you do is return the name of the agent to run next. You do not do anything else.

    The current state of the user is:
    {pprint.pformat(state, indent=4)}

    If a current_speaker is already selected in the state, simply output that value.

    If there is no current_speaker value, look at the chat history and the current state and you MUST return one of these strings identifying an agent to run:
    * "{Speaker.STOCK_LOOKUP.value}" - if the user wants to look up a stock price (does not require authentication)
    * "{Speaker.AUTHENTICATE.value}" - if the user needs to authenticate
    * "{Speaker.ACCOUNT_BALANCE.value}" - if the user wants to look up an account balance
        * If they want to look up an account balance, but they haven't authenticated yet, return "{Speaker.AUTHENTICATE.value}" instead
    * "{Speaker.TRANSFER_MONEY.value}" - if the user wants to transfer money between accounts (requires authentication and checking an account balance first)
        * If they want to transfer money, but is_authenticated returns false, return "{Speaker.AUTHENTICATE.value}" instead
        * If they want to transfer money, but has_balance returns false, return "{Speaker.ACCOUNT_BALANCE.value}" instead
    * "{Speaker.CONCIERGE.value}" - if the user wants to do something else, or hasn't said what they want to do, or you can't figure out what they want to do. Choose this by default.

    Output one of these strings and ONLY these strings, without quotes.
    NEVER respond with anything other than one of the above five strings. DO NOT be helpful or conversational.
""")
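A simple if-else block takes the orchestration agent's output and uses it to instantiate the next sub-agent to run. This is the point at which the state object is passed to each sub-agent: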
if next_speaker == Speaker.STOCK_LOOKUP:
    print("Stock lookup agent selected")
    current_speaker = stock_lookup_agent_factory(state)
    state["current_speaker"] = next_speaker
elif next_speaker == Speaker.AUTHENTICATE:
    print("Auth agent selected")
    current_speaker = auth_agent_factory(state)
    state["current_speaker"] = next_speaker
elif next_speaker == Speaker.ACCOUNT_BALANCE:
    print("Account balance agent selected")
    current_speaker = account_balance_agent_factory(state)
    state["current_speaker"] = next_speaker
elif next_speaker == Speaker.TRANSFER_MONEY:
    print("Transfer money agent selected")
    current_speaker = transfer_money_agent_factory(state)
    state["current_speaker"] = next_speaker
elif next_speaker == Speaker.CONCIERGE:
    print("Concierge agent selected")
    current_speaker = concierge_agent_factory(state)
else:
    print("Orchestration agent failed to return a valid speaker; ask it to try again")
    is_retry = True
    continue
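The comparisons above only work if the plain string returned by the orchestration agent compares equal to a Speaker member. One way to get that behavior, sketched here as an assumption about how Speaker could be defined, is an Enum that also subclasses str. The values authenticate, account_balance and transfer_money match the debug output shown earlier; the values for stock lookup and concierge, and the parse_speaker helper, are guesses made for illustration.

```python
from enum import Enum
from typing import Optional


class Speaker(str, Enum):
    """Agent identifiers; the str mixin lets members compare equal to raw strings."""
    STOCK_LOOKUP = "stock_lookup"      # assumed value
    AUTHENTICATE = "authenticate"
    ACCOUNT_BALANCE = "account_balance"
    TRANSFER_MONEY = "transfer_money"
    CONCIERGE = "concierge"            # assumed value


def parse_speaker(raw: str) -> Optional[Speaker]:
    """Map the orchestrator's text output to a Speaker, or None if it is invalid."""
    try:
        return Speaker(raw.strip())
    except ValueError:
        return None
```

Returning None for unrecognized output corresponds to the final else branch above, where the loop asks the orchestration agent to try again.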
The full chat history is then passed, as part of a regular chat message, to the newly instantiated agent:
response = current_speaker.chat(user_msg_str, chat_history=current_history)
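The agent reads its prompt and the user input and decides what to say. As we saw in the first code block, if a speaker has already been selected the loop keeps talking to the current sub-agent. This continues until the sub-agent has completed its task, at which point its prompt instructs it to call the done() function: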
def done() -> None:
    """When you complete your task, call this tool."""
    print("Money transfer is complete")
    state["current_speaker"] = None
    state["just_finished"] = True
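done() takes no arguments yet changes state, which suggests it is defined inside a function that already has state in scope. Here is a minimal sketch of that closure pattern, with a hypothetical make_done_tool helper standing in for whatever the real agent factories do. How the function is registered as a tool with the agent framework is left out.

```python
from typing import Any, Callable, Dict


def make_done_tool(state: Dict[str, Any], task_name: str) -> Callable[[], None]:
    """Return a zero-argument done() that closes over the shared state."""

    def done() -> None:
        """When you complete your task, call this tool."""
        print(f"{task_name} is complete")
        state["current_speaker"] = None  # release the floor
        state["just_finished"] = True    # flag that a task has just ended

    return done


# Stand-in state for the demonstration
state = {"current_speaker": "transfer_money", "just_finished": False}
done = make_done_tool(state, "Money transfer")
done()
```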
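Calling done() modifies the state, setting the current speaker to None and just_finished to True. This triggers the outer loop to run the continuation agent to see whether there is anything else to do: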
elif state["just_finished"]:
    print("Asking the continuation agent to decide what to do next")
    user_msg_str = str(continuation_agent_factory(state).chat("""
        Look at the chat history to date and figure out what the user was originally trying to do.
        They might have had to do some sub-tasks to complete that task, but what we want is the original thing they started out trying to do.
        Formulate a sentence as if written by the user that asks to continue that task.
        If it seems like the user really completed their task, output "no_further_task" only.
    """, chat_history=current_history))
    print(f"Continuation agent said {user_msg_str}")
    if user_msg_str == "no_further_task":
        user_msg_str = input(">> ").strip()
    state["just_finished"] = False
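The continuation agent's prompt instructs it to reply as if it were the user asking to perform a task or, if nothing more needs doing, to output no_further_task. If there is a new task, the continuation agent's output becomes the orchestrator's input, and the orchestrator selects a new speaker. If there is no further task, the loop pauses and waits for more user input.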
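Because the sentinel comes from a language model, an exact string comparison can miss small variations; the sample output earlier in this post shows no_further_tasks, with a trailing s. A more tolerant check might look like the following. The helper is_no_further_task is hypothetical, not part of the system's own code.

```python
def is_no_further_task(reply: str) -> bool:
    """Return True if the continuation agent's reply signals nothing is left to do."""
    # Remove surrounding whitespace, quotes and periods, then ignore case.
    cleaned = reply.strip().strip("\"'.").lower()
    # Accept both the singular and plural spelling of the sentinel.
    return cleaned in {"no_further_task", "no_further_tasks"}
```

Any reply that fails this check is treated as a new request and passed on to the orchestration agent.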
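That's the whole system! The sub-agents themselves can be arbitrarily complex, multi-turn systems, and the outer loop does not need to know how they work, only how they depend on one another.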
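What's next
We think there are some novel things here: coordinating multiple agents "speaking" simultaneously, creating implicit "chains" of agents through natural-language instructions, using a "continuation" agent to manage those chains, and using a global state in this way. We are excited to see what you do with the patterns laid out here. Don't forget to check out the open-source repo!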