ReAct Architecture Pattern

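ReAct (Reasoning + Acting) is one of the best-known and most widely used agent architecture patterns. It was proposed by Yao et al. in 2022. The core idea is to interleave **reasoning and acting**: the model works through a Thought-Action-Observation loop and uses each tool result to inform its next decision.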

1. Core Principles

1.1 Design Philosophy of ReAct

ReAct mirrors the way people think while they act:

┌─────────────────────────────────────────────────────────────┐
│                   ReAct Design Philosophy                   │
├─────────────────────────────────────────────────────────────┤
│                                                             │
│   Traditional LLM:                                          │
│   ┌─────────┐     ┌─────────┐     ┌─────────┐               │
│   │  Input  │ ──→ │   LLM   │ ──→ │ Output  │               │
│   └─────────┘     └─────────┘     └─────────┘               │
│   Single-pass generation; no external interaction           │
│                                                             │
│   ReAct pattern:                                            │
│   ┌─────────┐     ┌─────────┐     ┌─────────┐               │
│   │  Input  │ ──→ │ Thought │ ──→ │ Action  │               │
│   └─────────┘     └─────────┘     └────┬────┘               │
│                                        │                    │
│                                        ↓                    │
│                   ┌───────────┐   ┌─────────┐               │
│                   │Observation│←──│Run tool │               │
│                   └─────┬─────┘   └─────────┘               │
│                         │                                   │
│                         ↓                                   │
│                   ┌───────────┐                             │
│                   │   Next    │                             │
│                   │  Thought  │                             │
│                   └───────────┘                             │
│                   Repeat until the task is done             │
│                                                             │
└─────────────────────────────────────────────────────────────┘

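1.2 The T-A-O Loop in Detail

At the heart of ReAct is the Thought-Action-Observation (T-A-O) loop:

Stage	Meaning	Purpose	Example
Thought	Think	Analyze the current state and decide the next step	"I need to look up the weather in Beijing first"
Action	Act	Select and call a tool	`search("Beijing weather")`
Observation	Observe	Read the result returned by the tool	"Beijing: sunny, 15-25°C"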
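The three stages in the table can be sketched as a plain loop. This is a minimal illustration rather than the paper's implementation: `scripted_model` stands in for a real LLM, and `search` is a hypothetical tool that returns canned data.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical tool with canned data; a real agent would call a search API.
def search(query: str) -> str:
    return "Beijing: sunny, 15-25°C" if "Beijing" in query else "no result"

TOOLS: Dict[str, Callable[[str], str]] = {"search": search}

def scripted_model(observations: List[str]) -> Tuple[str, str, str]:
    """Stand-in for an LLM: returns (thought, action, action_input)."""
    if not observations:
        return ("I need to look up the weather in Beijing first",
                "search", "Beijing weather")
    return ("I have what I need", "finish", observations[-1])

def tao_loop(max_steps: int = 5) -> str:
    observations: List[str] = []
    for _ in range(max_steps):
        # Thought: the model decides what to do next
        thought, action, action_input = scripted_model(observations)
        if action == "finish":
            return action_input  # the loop ends with a final answer
        # Action: run the selected tool
        observation = TOOLS[action](action_input)
        # Observation: the result feeds the next Thought
        observations.append(observation)
    return "step limit reached"

print(tao_loop())
```

The step limit matters even in a toy loop: without it, a model that never chooses to finish would call tools indefinitely.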

┌─────────────────────────────────────────────────────────────┐
│                     T-A-O Loop Example                      │
├─────────────────────────────────────────────────────────────┤
│                                                             │
│   User question: What is the weather in Beijing tomorrow?   │
│                  Is it good for outdoor exercise?           │
│                                                             │
│   ┌─────────────────────────────────────────────────────┐   │
│   │ Round 1: full T-A-O cycle                           │   │
│   │                                                     │   │
│   │ Thought: The user wants tomorrow's weather in       │   │
│   │          Beijing and advice on exercise. I should   │   │
│   │          look up tomorrow's forecast first.         │   │
│   │                                                     │   │
│   │ Action: search("Beijing weather tomorrow")          │   │
│   │                                                     │   │
│   │ Observation: Sunny turning cloudy, 18-26°C,         │   │
│   │              wind force 3, good air quality.        │   │
│   └─────────────────────────────────────────────────────┘   │
│                              ↓                              │
│   ┌─────────────────────────────────────────────────────┐   │
│   │ Round 2: final Thought (no tool call)               │   │
│   │                                                     │   │
│   │ Thought: I have the forecast: sunny turning cloudy, │   │
│   │          18-26°C. The temperature is mild, so       │   │
│   │          outdoor exercise is fine. No further tool  │   │
│   │          call is needed; I can answer directly.     │   │
│   └─────────────────────────────────────────────────────┘   │
│                              ↓                              │
│   ┌─────────────────────────────────────────────────────┐   │
│   │ Final Answer                                        │   │
│   │                                                     │   │
│   │ Beijing will be sunny turning cloudy tomorrow,      │   │
│   │ 18-26°C, wind force 3. Conditions suit outdoor      │   │
│   │ exercise. Suggestions:                              │   │
│   │ • Morning run or walk: comfortable temperature      │   │
│   │ • Cycling: a light breeze keeps it pleasant         │   │
│   │ • Outdoor yoga: cloud cover limits sun exposure     │   │
│   │ Bring a light jacket; mornings and evenings are     │   │
│   │ noticeably cooler.                                  │   │
│   └─────────────────────────────────────────────────────┘   │
│                                                             │
└─────────────────────────────────────────────────────────────┘

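1.3 ReAct Compared with Other Patterns

Dimension	ReAct	Single-step execution	Pure planning
Reasoning style	Interleaved with actions	One pass	Planned up front
Environment interaction	At every step	None	After execution
Error correction	Immediate	Not possible	In batches
Explainability	High	Medium	Medium
Execution efficiency	Medium	High	Medium
Typical use	General-purpose tasks	Simple tasks	Complex workflows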

2. Workflow

2.1 End-to-End Workflow Diagram

┌─────────────────────────────────────────────────────────────────────┐
│                      ReAct End-to-End Workflow                      │
├─────────────────────────────────────────────────────────────────────┤
│                                                                     │
│   ┌─────────────┐                                                   │
│   │ User input  │                                                   │
│   │   (Query)   │                                                   │
│   └──────┬──────┘                                                   │
│          │                                                          │
│          ↓                                                          │
│   ┌─────────────────────────────────────────────────────────────┐   │
│   │                         ReAct loop                          │   │
│   │                                                             │   │
│   │   ┌──────────┐      ┌──────────┐      ┌───────────┐         │   │
│   │   │ Thought  │ ───→ │  Action  │ ───→ │Observation│         │   │
│   │   └────┬─────┘      └────┬─────┘      └─────┬─────┘         │   │
│   │        ↑                 ↓                  │               │   │
│   │        │     ┌───────────────────────┐      │               │   │
│   │        │     │ Tool runtime:         │      │               │   │
│   │        │     │ search, calculator,   │      │               │   │
│   │        │     │ code execution        │      │               │   │
│   │        │     └───────────────────────┘      │               │   │
│   │        └────────────────────────────────────┘               │   │
│   │                    next iteration                           │   │
│   │                                                             │   │
│   │                      ┌───────────────┐                      │   │
│   │                      │   Finished?   │                      │   │
│   │                      └───────┬───────┘                      │   │
│   │                              │                              │   │
│   │                ┌─────────────┴─────────────┐                │   │
│   │                ↓                           ↓                │   │
│   │          ┌───────────┐               ┌───────────┐          │   │
│   │          │    Yes    │               │    No     │          │   │
│   │          │ exit loop │               │ continue  │          │   │
│   │          └─────┬─────┘               └───────────┘          │   │
│   │                │                                            │   │
│   └────────────────┼────────────────────────────────────────────┘   │
│                    │                                                │
│                    ↓                                                │
│         ┌─────────────────────┐                                     │
│         │   Generate answer   │                                     │
│         │   (Final Answer)    │                                     │
│         └─────────────────────┘                                     │
│                                                                     │
└─────────────────────────────────────────────────────────────────────┘

2.2 State Transition Diagram

┌─────────────────────────────────────────────────────────────┐
│                   ReAct State Transitions                   │
├─────────────────────────────────────────────────────────────┤
│                                                             │
│    ┌─────────┐                                              │
│    │  START  │                                              │
│    └────┬────┘                                              │
│         ↓                                                   │
│    ┌──────────┐     ┌──────────┐     ┌───────────┐          │
│    │ THOUGHT  │ ──→ │  ACTION  │ ──→ │OBSERVATION│          │
│    │ generate │     │select and│     │parse tool │          │
│    │reasoning │     │call tool │     │  output   │          │
│    └──────────┘     └──────────┘     └─────┬─────┘          │
│                              ┌─────────────┘                │
│                              ↓                              │
│                     ┌─────────────────┐                     │
│                     │ Loop or finish? │                     │
│                     └────────┬────────┘                     │
│         ┌────────────────────┼────────────────────┐         │
│         ↓                    ↓                    ↓         │
│   ┌───────────┐        ┌───────────┐        ┌───────────┐   │
│   │ Continue  │        │  Return   │        │   Error   │   │
│   │ the loop  │        │  answer   │        │ handling  │   │
│   └───────────┘        └───────────┘        └───────────┘   │
│                                                             │
└─────────────────────────────────────────────────────────────┘

3. Code Implementation

3.1 A Basic ReAct Agent

"""
Basic ReAct agent.
Implements the Thought-Action-Observation loop by hand and uses LangChain
only for the LLM and tool interfaces.
"""

from typing import List, Optional
from dataclasses import dataclass
from enum import Enum
import re

# LangChain interfaces (import paths as of langchain-core 0.1 and later)
from langchain_core.language_models.llms import BaseLLM
from langchain_core.tools import BaseTool
 
 
class AgentState(Enum):
    """Agent states."""
    THINKING = "thinking"      # reasoning about the next step
    ACTING = "acting"          # calling a tool
    OBSERVING = "observing"    # reading the tool result
    FINISHED = "finished"      # final answer produced


@dataclass
class AgentStep:
    """Record of one Thought-Action-Observation step."""
    thought: str              # reasoning text
    action: str               # tool name
    action_input: str         # tool input
    observation: str          # tool output
 
 
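class ReActAgent:
    """
    ReAct agent.
    Core behavior: alternate Thought, Action, and Observation until a final
    answer is produced.
    """
    
    # ReAct prompt template. The scratchpad already contains complete
    # "Thought: ..." entries, so the template ends with a bare "Thought:"
    # for the model to continue, instead of prefixing the scratchpad with one.
    REACT_PROMPT = """
You are a helpful assistant that answers questions using the ReAct pattern.

You can use the following tools:
{tool_names}

Tool descriptions:
{tool_descriptions}

Use the following format:

Question: the user's question
Thought: reason about what to do next
Action: the action to take, one of [{tool_names}]
Action Input: the input to the action
Observation: the result of the action
... (Thought/Action/Action Input/Observation may repeat N times)
Thought: I now know the final answer
Final Answer: the final answer to the original question

Begin!

Question: {input}
{agent_scratchpad}
Thought:"""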
    
    def __init__(
        self,
        llm: BaseLLM,
        tools: List[BaseTool],
        max_iterations: int = 10,
        verbose: bool = True
    ):
        """
        Initialize the ReAct agent.
        
        Args:
            llm: language model instance
            tools: list of available tools
            max_iterations: maximum number of loop iterations
            verbose: whether to print detailed logs
        """
        self.llm = llm
        self.tools = {tool.name: tool for tool in tools}
        self.max_iterations = max_iterations
        self.verbose = verbose
        self.state = AgentState.THINKING
        self.history: List[AgentStep] = []
        
    def _build_prompt(self, input_text: str, scratchpad: str = "") -> str:
        """Build the prompt from the tool list, the question, and the scratchpad."""
        tool_names = ", ".join(self.tools.keys())
        tool_descriptions = "\n".join([
            f"- {name}: {tool.description}"
            for name, tool in self.tools.items()
        ])
        
        return self.REACT_PROMPT.format(
            tool_names=tool_names,
            tool_descriptions=tool_descriptions,
            input=input_text,
            agent_scratchpad=scratchpad
        )
    
    def _parse_action(self, text: str) -> tuple[Optional[str], Optional[str]]:
        """
        Parse the Action and Action Input from the LLM output.
        
        Returns:
            (action_name, action_input), or (None, None) if no Action is found
        """
        # Extract the Action line with a regular expression
        action_pattern = r"Action:\s*(.+)"
        action_match = re.search(action_pattern, text)
        
        if action_match:
            action = action_match.group(1).strip()
            
            # Extract the Action Input, up to the next Observation or the end
            input_pattern = r"Action Input:\s*(.+?)(?=Observation:|$)"
            input_match = re.search(input_pattern, text, re.DOTALL)
            
            action_input = input_match.group(1).strip() if input_match else ""
            return action, action_input
        
        return None, None
    
    def _parse_final_answer(self, text: str) -> Optional[str]:
        """Parse the final answer, if the output contains one."""
        pattern = r"Final Answer:\s*(.+)$"
        match = re.search(pattern, text, re.DOTALL)
        return match.group(1).strip() if match else None
    
    def _execute_tool(self, action: str, action_input: str) -> str:
        """Run a tool and return its result (or an error message) as text."""
        if action not in self.tools:
            return f"Error: unknown tool '{action}'. Available tools: {list(self.tools.keys())}"
        
        try:
            tool = self.tools[action]
            result = tool.run(action_input)
            return str(result)
        except Exception as e:
            return f"Tool execution error: {str(e)}"
    
    def _format_scratchpad(self) -> str:
        """Format the recorded steps as a scratchpad."""
        scratchpad = ""
        for step in self.history:
            scratchpad += f"\nThought: {step.thought}"
            scratchpad += f"\nAction: {step.action}"
            scratchpad += f"\nAction Input: {step.action_input}"
            scratchpad += f"\nObservation: {step.observation}"
        return scratchpad
    
    def run(self, input_text: str) -> str:
        """
        Run the ReAct loop.
        
        Args:
            input_text: the user's question
            
        Returns:
            The final answer, or a notice that the iteration limit was reached
        """
        self.history = []  # reset so repeated calls do not share steps
        scratchpad = ""
        
        for iteration in range(self.max_iterations):
            if self.verbose:
                print(f"\n{'='*50}")
                print(f"Iteration {iteration + 1}")
                print('='*50)
            
            # Build the prompt
            prompt = self._build_prompt(input_text, scratchpad)
            
            # Call the LLM; the stop sequence keeps it from inventing an Observation
            response = self.llm.invoke(prompt, stop=["\nObservation:"])
            
            if self.verbose:
                print(f"\n[LLM response]\n{response}")
            
            # Check for a final answer
            final_answer = self._parse_final_answer(response)
            if final_answer:
                if self.verbose:
                    print(f"\n[Final answer]\n{final_answer}")
                return final_answer
            
            # Parse the Action
            action, action_input = self._parse_action(response)
            
            if action is None:
                # No Action found: feed the format error back to the LLM
                scratchpad += f"\nThought: {response}\n"
                scratchpad += "Observation: Invalid format. Reply with Action and Action Input.\n"
                continue
            
            # Run the tool
            if self.verbose:
                print(f"\n[Tool call] Action: {action}, Input: {action_input}")
            
            observation = self._execute_tool(action, action_input)
            
            if self.verbose:
                print(f"\n[Observation] {observation}")
            
            # Record the step. The Thought is the text before "Action:"; the
            # response may or may not repeat the "Thought:" label itself.
            thought = response.split("Action:", 1)[0]
            thought = re.sub(r"^\s*Thought:\s*", "", thought).strip()
            
            step = AgentStep(
                thought=thought,
                action=action,
                action_input=action_input,
                observation=observation
            )
            self.history.append(step)
            
            # Rebuild the scratchpad from the recorded steps
            scratchpad = self._format_scratchpad()
        
        return "Reached the maximum number of iterations without finishing the task"
 
 
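# Usage example
if __name__ == "__main__":
    from langchain_core.tools import Tool
    from langchain_openai import OpenAI
    
    # Define the tools
    def search_tool(query: str) -> str:
        """Mock search tool."""
        # A real application would call a search API here
        mock_results = {
            "beijing": "Beijing today: sunny, 15-25°C",
            "shanghai": "Shanghai today: cloudy, 18-28°C",
        }
        # Match on the city name so the exact wording of the query does not matter
        for city, result in mock_results.items():
            if city in query.lower():
                return result
        return f"No search results for '{query}'"
    
    def calculator_tool(expression: str) -> str:
        """Calculator tool."""
        try:
            # eval is unsafe on untrusted input; acceptable only in a local demo
            result = eval(expression, {"__builtins__": {}}, {})
            return str(result)
        except Exception as e:
            return f"Calculation error: {str(e)}"
    
    # Build the tool list
    tools = [
        Tool(
            name="Search",
            func=search_tool,
            description="Search the web for information; input is a search query"
        ),
        Tool(
            name="Calculator",
            func=calculator_tool,
            description="Evaluate a math expression; input is the expression"
        )
    ]
    
    # Create the agent
    llm = OpenAI(temperature=0)
    agent = ReActAgent(llm=llm, tools=tools, verbose=True)
    
    # Run it
    result = agent.run("What is the weather in Beijing today?")
    print(f"\nFinal result: {result}")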
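To see what the parsing step has to cope with, the sketch below runs the same kind of regular expressions over hard-coded completions, so it needs no API key. The sample texts and the `parse_step` helper are illustrative assumptions, not part of LangChain.

```python
import re
from typing import Optional, Tuple

def parse_step(text: str) -> Tuple[Optional[str], Optional[str], Optional[str]]:
    """Return (action, action_input, final_answer) parsed from one completion."""
    final = re.search(r"Final Answer:\s*(.+)$", text, re.DOTALL)
    if final:
        return None, None, final.group(1).strip()
    # MULTILINE anchors each pattern to a single line of the completion
    action = re.search(r"^Action:\s*(.+)$", text, re.MULTILINE)
    if not action:
        return None, None, None  # malformed output: neither Action nor answer
    action_input = re.search(r"^Action Input:\s*(.*)$", text, re.MULTILINE)
    return (
        action.group(1).strip(),
        action_input.group(1).strip() if action_input else "",
        None,
    )

tool_call = "I should look this up.\nAction: Search\nAction Input: Beijing weather"
finished = "I now know the final answer\nFinal Answer: Sunny, 15-25°C"
malformed = "I am not sure what to do."

print(parse_step(tool_call))   # ('Search', 'Beijing weather', None)
print(parse_step(finished))    # (None, None, 'Sunny, 15-25°C')
print(parse_step(malformed))   # (None, None, None)
```

The third case is the one the agent loop has to handle explicitly: when nothing parses, the model needs to be told so, otherwise it will receive the same prompt again.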

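3.2 Using LangChain's Built-in Agent

"""
ReAct with LangChain's built-in agent.
Often preferable to a hand-written loop, because output parsing and error
handling are maintained by the library.
Import paths match LangChain 0.1 to 0.3; later releases may relocate them.
"""

from langchain.agents import AgentExecutor, create_react_agent
from langchain_openai import ChatOpenAI
from langchain_core.tools import Tool
from langchain import hub


def create_react_agent_executor(
    model_name: str = "gpt-4",
    temperature: float = 0
) -> AgentExecutor:
    """
    Create a ReAct agent executor.
    
    Args:
        model_name: model name
        temperature: sampling temperature
        
    Returns:
        A configured AgentExecutor
    """
    
    # Initialize the LLM
    llm = ChatOpenAI(
        model=model_name,
        temperature=temperature
    )
    
    # Define the tools (both are placeholders for real implementations)
    tools = [
        Tool(
            name="Search",
            func=lambda q: f"Search results: {q}",
            description="Search the web for up-to-date information"
        ),
        Tool(
            name="Calculator",
            # eval is unsafe on untrusted input; acceptable only in a local demo
            func=lambda e: str(eval(e, {"__builtins__": {}}, {})),
            description="Evaluate a math expression"
        )
    ]
    
    # Fetch the ReAct prompt template from the LangChain Hub
    prompt = hub.pull("hwchase17/react")
    
    # Create the agent
    agent = create_react_agent(
        llm=llm,
        tools=tools,
        prompt=prompt
    )
    
    # Create the executor
    agent_executor = AgentExecutor(
        agent=agent,
        tools=tools,
        verbose=True,
        max_iterations=10,
        handle_parsing_errors=True  # feed parsing errors back to the LLM
    )
    
    return agent_executor


# Usage example
if __name__ == "__main__":
    executor = create_react_agent_executor()
    
    result = executor.invoke({
        "input": "Compare today's weather in Beijing and Shanghai"
    })
    
    print(result["output"])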
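The calculator tools in these examples pass model-generated text to `eval`, which can execute arbitrary Python. One possible replacement, sketched here under the assumption that only basic arithmetic is needed, walks the parsed expression tree and rejects everything except numbers and a small whitelist of operators. `safe_calculate` is a name chosen for this example, not a LangChain API.

```python
import ast
import operator

# Whitelisted operators; anything else is rejected.
_BINARY = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
    ast.Mod: operator.mod,
}
_UNARY = {ast.USub: operator.neg, ast.UAdd: operator.pos}

def _eval_node(node: ast.AST):
    """Recursively evaluate a node, allowing only numbers and whitelisted operators."""
    if (isinstance(node, ast.Constant)
            and isinstance(node.value, (int, float))
            and not isinstance(node.value, bool)):
        return node.value
    if isinstance(node, ast.BinOp) and type(node.op) in _BINARY:
        return _BINARY[type(node.op)](_eval_node(node.left), _eval_node(node.right))
    if isinstance(node, ast.UnaryOp) and type(node.op) in _UNARY:
        return _UNARY[type(node.op)](_eval_node(node.operand))
    raise ValueError("unsupported expression")

def safe_calculate(expression: str) -> str:
    """Evaluate basic arithmetic; return an error string instead of raising."""
    try:
        tree = ast.parse(expression, mode="eval")
        return str(_eval_node(tree.body))
    except (ValueError, SyntaxError, ZeroDivisionError, OverflowError) as exc:
        return f"Calculation error: {exc}"

print(safe_calculate("100 - 3 * 15"))
print(safe_calculate("__import__('os').getcwd()"))
```

Returning the error as a string, rather than raising, fits the ReAct loop: the message becomes the Observation and the model can correct its input. Exponentiation is left out on purpose, since an expression such as `9**9**9` could stall the process.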

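3.3 Using LangGraph

"""
ReAct with LangGraph.
Gives finer control over state management and flow.
"""

from typing import Annotated, TypedDict
from langgraph.graph import StateGraph, END
from langgraph.graph.message import add_messages
from langgraph.prebuilt import ToolNode
from langchain_openai import ChatOpenAI
from langchain_core.tools import tool


# State definition
class AgentState(TypedDict):
    """Agent state."""
    # add_messages appends each node's output to the list; without a reducer,
    # every node would overwrite the message history
    messages: Annotated[list, add_messages]
    thoughts: list          # reasoning log (not used by this minimal graph)
    current_step: str       # current step label (not used by this minimal graph)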
 
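# Tool definitions
@tool
def search(query: str) -> str:
    """Search the web for information."""
    # Mock search results
    results = {
        "Python": "Python is a high-level programming language, first released by Guido van Rossum in 1991",
        "AI": "Artificial intelligence (AI) is a branch of computer science concerned with building intelligent machines"
    }
    for key, value in results.items():
        if key.lower() in query.lower():
            return value
    return f"No information found for '{query}'"


@tool
def calculate(expression: str) -> str:
    """Evaluate a math expression."""
    try:
        # eval is unsafe on untrusted input; acceptable only in a local demo
        return str(eval(expression, {"__builtins__": {}}, {}))
    except Exception as e:
        return f"Calculation error: {str(e)}"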
 
 
def create_react_graph():
    """Build the ReAct graph."""
    
    # Initialize the LLM
    llm = ChatOpenAI(model="gpt-4", temperature=0)
    
    # Bind the tools to the model
    tools = [search, calculate]
    llm_with_tools = llm.bind_tools(tools)
    
    # Node functions
    def agent_node(state: AgentState):
        """Agent node: produce reasoning or request a tool call."""
        response = llm_with_tools.invoke(state["messages"])
        return {"messages": [response]}
    
    def should_continue(state: AgentState):
        """Route to the tools node if the last message requests a tool call."""
        last_message = state["messages"][-1]
        if last_message.tool_calls:
            return "tools"
        return END
    
    # Create the graph
    workflow = StateGraph(AgentState)
    
    # Add the nodes
    workflow.add_node("agent", agent_node)
    workflow.add_node("tools", ToolNode(tools))
    
    # Set the entry point
    workflow.set_entry_point("agent")
    
    # Add the edges: agent -> tools or END, then tools -> agent
    workflow.add_conditional_edges(
        "agent",
        should_continue,
        {
            "tools": "tools",
            END: END
        }
    )
    workflow.add_edge("tools", "agent")
    
    return workflow.compile()
 
 
# Usage example
if __name__ == "__main__":
    graph = create_react_graph()
    
    result = graph.invoke({
        "messages": [
            {"role": "user", "content": "What is Python, and when was it created?"}
        ],
        "thoughts": [],
        "current_step": "start"
    })
    
    print(result["messages"][-1].content)

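4. Use Cases

4.1 Where ReAct Works Best

Scenario	Example	ReAct advantage
Knowledge Q&A	Real-time information lookup	Can search for up-to-date information
Multi-step calculation	Complex math problems	Step-by-step reasoning with intermediate checks
Web navigation	Automated testing	Observes page state in real time
Data analysis	Querying and analyzing data	Interactive exploration of the data
Code debugging	Locating and fixing bugs	Narrows down the problem step by step

4.2 Scenarios in Detail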

┌─────────────────────────────────────────────────────────────┐
│                  ReAct Use Cases in Detail                  │
├─────────────────────────────────────────────────────────────┤
│                                                             │
│  1. Knowledge Q&A                                           │
│   ┌─────────────────────────────────────────────────────┐   │
│   │ Question: "What is Tesla's latest stock price?"     │   │
│   │                                                     │   │
│   │ Thought: I need to look up Tesla's latest price     │   │
│   │ Action: search("Tesla stock price today")           │   │
│   │ Observation: TSLA is trading at $248.50             │   │
│   │ Thought: I have the price                           │   │
│   │ Final Answer: Tesla (TSLA) is trading at $248.50    │   │
│   └─────────────────────────────────────────────────────┘   │
│                                                             │
│  2. Multi-step calculation                                  │
│   ┌─────────────────────────────────────────────────────┐   │
│   │ Question: "Xiao Ming has 100 yuan and buys 3 books  │   │
│   │           at 15 yuan each. How much is left?"       │   │
│   │                                                     │   │
│   │ Thought: First compute the total cost               │   │
│   │ Action: calculate("3 * 15")                         │   │
│   │ Observation: 45                                     │   │
│   │ Thought: Now compute the remaining amount           │   │
│   │ Action: calculate("100 - 45")                       │   │
│   │ Observation: 55                                     │   │
│   │ Final Answer: Xiao Ming has 55 yuan left            │   │
│   └─────────────────────────────────────────────────────┘   │
│                                                             │
│  3. Data analysis                                           │
│   ┌─────────────────────────────────────────────────────┐   │
│   │ Question: "Analyze the sales data and find the      │   │
│   │           best-selling product"                     │   │
│   │                                                     │   │
│   │ Thought: First read the sales data file             │   │
│   │ Action: read_file("sales.csv")                      │   │
│   │ Observation: [file contents...]                     │   │
│   │ Thought: Group the rows by product and sum sales    │   │
│   │ Action: python_execute("groupby analysis")          │   │
│   │ Observation: Product A: 1000, B: 1500, C: 800       │   │
│   │ Final Answer: Product B sells best, at 1500 units   │   │
│   └─────────────────────────────────────────────────────┘   │
│                                                             │
└─────────────────────────────────────────────────────────────┘

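4.3 Where ReAct Is a Poor Fit

Scenario	Reason	Suggested alternative
Batch processing	Inefficient: every item triggers fresh reasoning	DAG workflow
Fixed procedures	No dynamic decisions are needed	State machine
Real-time responses	Multiple loop iterations add latency	Single-step execution
Large-scale systems	A single agent has limited capacity	Multi-agent collaboration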

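5. Limitations and Optimization

5.1 Main Limitations

Limitation	How it shows up	Impact
Loop overhead	Every step requires an LLM call	Higher response latency
Context explosion	The history grows with every step	Higher token consumption
Weak planning	The agent only thinks one step ahead	Inefficient on complex tasks
Error propagation	Early mistakes affect later steps	Higher task failure rate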
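The "context explosion" row can be made concrete with a little arithmetic. Assume, purely for illustration, a fixed prompt of `base` tokens and `per_step` tokens added by each Thought/Action/Observation round (both numbers below are made up). If the full history is resent on every call, the k-th call carries `base + k * per_step` tokens, so the total input grows quadratically with the number of steps; keeping only a window of recent rounds makes it grow linearly.

```python
def total_prompt_tokens(steps: int, base: int = 500, per_step: int = 200) -> int:
    """Total input tokens over all LLM calls when the full history is resent."""
    return sum(base + k * per_step for k in range(steps))

def total_with_window(steps: int, window: int,
                      base: int = 500, per_step: int = 200) -> int:
    """Same total when only the last `window` rounds stay in the prompt."""
    return sum(base + min(k, window) * per_step for k in range(steps))

for n in (5, 10, 20):
    print(n, total_prompt_tokens(n), total_with_window(n, window=3))
# 5 4500 4300
# 10 14000 9800
# 20 48000 20800
```

Doubling the number of steps from 10 to 20 more than triples the full-history total, while the windowed total roughly doubles.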

5.2 Optimization Strategies

┌─────────────────────────────────────────────────────────────┐
│                ReAct Optimization Strategies                │
├─────────────────────────────────────────────────────────────┤
│                                                             │
│  1. Context compression                                     │
│   ┌─────────────────────────────────────────────────────┐   │
│   │ Problem: long histories consume many tokens         │   │
│   │                                                     │   │
│   │ Remedies:                                           │   │
│   │ • Keep only the last N Thought-Observation rounds   │   │
│   │ • Compress older history with a summarizer model    │   │
│   │ • Extract key facts, then drop the raw text         │   │
│   │                                                     │   │
│   │ Example:                                            │   │
│   │ Raw: Thought: xxx... Observation: long text...      │   │
│   │ Compressed: [searched Beijing weather: sunny,       │   │
│   │             15-25°C]                                │   │
│   └─────────────────────────────────────────────────────┘   │
│                                                             │
│  2. Early termination                                       │
│   ┌─────────────────────────────────────────────────────┐   │
│   │ Problem: unnecessary iterations waste resources     │   │
│   │                                                     │   │
│   │ Remedies:                                           │   │
│   │ • Stop once a confidence threshold is reached       │   │
│   │ • Force a stop when an Action repeats               │   │
│   │ • Return early once the user's intent is met        │   │
│   └─────────────────────────────────────────────────────┘   │
│                                                             │
│  3. Tool-call optimization                                  │
│   ┌─────────────────────────────────────────────────────┐   │
│   │ Problem: tool calls fail too often                  │   │
│   │                                                     │   │
│   │ Remedies:                                           │   │
│   │ • Write more detailed tool descriptions             │   │
│   │ • Validate arguments and repair them automatically  │   │
│   │ • Cache tool results to avoid repeated calls        │   │
│   │ • Call independent tools in parallel                │   │
│   └─────────────────────────────────────────────────────┘   │
│                                                             │
│  4. Error recovery                                          │
│   ┌─────────────────────────────────────────────────────┐   │
│   │ Problem: an execution error fails the whole task    │   │
│   │                                                     │   │
│   │ Remedies:                                           │   │
│   │ • Feed the error back to the LLM as an Observation  │   │
│   │ • Suggest alternative tools                         │   │
│   │ • Retry automatically                               │   │
│   └─────────────────────────────────────────────────────┘   │
│                                                             │
└─────────────────────────────────────────────────────────────┘

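5.3 Optimized Agent: Code Example

class OptimizedReActAgent(ReActAgent):
    """
    Optimized ReAct agent.
    Adds context compression, early termination, and tool-call caching.
    """
    
    def __init__(
        self,
        llm,
        tools,
        max_iterations: int = 10,
        max_history: int = 5,       # number of recent steps kept verbatim
        enable_cache: bool = True,  # cache tool results
        confidence_threshold: float = 0.9  # reserved; this sketch has no confidence signal
    ):
        super().__init__(llm, tools, max_iterations)
        self.max_history = max_history
        self.enable_cache = enable_cache
        self.confidence_threshold = confidence_threshold
        self.tool_cache = {}  # (action, input) -> result
        
    def _format_steps(self, steps: List[AgentStep]) -> str:
        """Format steps in the same layout as the base scratchpad."""
        text = ""
        for step in steps:
            text += f"\nThought: {step.thought}"
            text += f"\nAction: {step.action}"
            text += f"\nAction Input: {step.action_input}"
            text += f"\nObservation: {step.observation}"
        return text
    
    def _compress_history(self) -> str:
        """Summarize older steps and keep the most recent ones verbatim."""
        if len(self.history) <= self.max_history:
            return self._format_steps(self.history)
        
        # Keep only the last N steps verbatim
        recent_history = self.history[-self.max_history:]
        
        # Summarize everything before them
        early_history = self.history[:-self.max_history]
        summary = self._summarize_history(early_history)
        
        return f"[History summary]: {summary}\n" + self._format_steps(recent_history)
    
    def _summarize_history(self, history: List[AgentStep]) -> str:
        """Summarize older steps with the LLM."""
        if not history:
            return ""
        
        # Build the summary prompt
        summary_prompt = "Summarize the key information from this execution history in one sentence:\n"
        for step in history:
            summary_prompt += f"- {step.action}: {step.observation[:50]}...\n"
        
        return self.llm.invoke(summary_prompt)
    
    def _execute_tool_with_cache(self, action: str, action_input: str) -> str:
        """Run a tool, reusing the cached result for identical calls."""
        cache_key = (action, action_input)
        
        if self.enable_cache and cache_key in self.tool_cache:
            return self.tool_cache[cache_key]
        
        result = self._execute_tool(action, action_input)
        
        if self.enable_cache:
            self.tool_cache[cache_key] = result
        
        return result
    
    def _should_early_stop(self, response: str) -> bool:
        """Stop when the agent repeats its previous tool call exactly."""
        if not self.history:
            return False
        last = self.history[-1]
        action, action_input = self._parse_action(response)
        # The same tool with the same input twice in a row means the agent is stuck
        return action == last.action and action_input == last.action_input
    
    def _generate_best_effort_answer(self, input_text: str) -> str:
        """Ask the LLM to answer from the observations gathered so far."""
        prompt = (
            "Answer the question using only the notes below.\n"
            f"Question: {input_text}\n"
            f"Notes:{self._compress_history()}\n"
            "Answer:"
        )
        return self.llm.invoke(prompt).strip()
    
    def run(self, input_text: str) -> str:
        """Optimized run loop."""
        self.history = []  # reset so repeated calls do not share steps
        scratchpad = ""
        
        for iteration in range(self.max_iterations):
            prompt = self._build_prompt(input_text, scratchpad)
            response = self.llm.invoke(prompt, stop=["\nObservation:"])
            
            # Check for a final answer
            final_answer = self._parse_final_answer(response)
            if final_answer:
                return final_answer
            
            # Check for early termination
            if self._should_early_stop(response):
                return self._generate_best_effort_answer(input_text)
            
            # Parse the Action
            action, action_input = self._parse_action(response)
            if action is None:
                # Tell the LLM what went wrong instead of resending the same prompt
                scratchpad += f"\nThought: {response}\n"
                scratchpad += "Observation: Invalid format. Reply with Action and Action Input.\n"
                continue
            
            # Run the tool through the cache
            observation = self._execute_tool_with_cache(action, action_input)
            
            # Record the step; the Thought is the text before "Action:"
            thought = response.split("Action:", 1)[0]
            thought = re.sub(r"^\s*Thought:\s*", "", thought).strip()
            self.history.append(AgentStep(
                thought=thought,
                action=action,
                action_input=action_input,
                observation=observation
            ))
            
            # Use the compressed history as the next scratchpad
            scratchpad = self._compress_history()
        
        return "Reached the maximum number of iterations"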

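6. Common Interview Questions

Q1: What is the ReAct pattern, and what is its core idea?

ReAct (Reasoning + Acting) is an agent architecture pattern that interleaves reasoning with acting.

Core idea:

Reasoning: the LLM writes out its thinking and analyzes the current state
Acting: it selects and runs a tool based on that thinking
Observation: it reads the result of the action and updates its state
Loop: the three steps repeat until the task is complete

Key advantages:

The reasoning process is explainable
The execution trace can be audited
Errors can be caught and corrected early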

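Q2: How does ReAct differ from Chain-of-Thought (CoT)?

Dimension	ReAct	CoT
Core mechanism	Alternates thinking and acting	A chain of thoughts only
External interaction	Supports tool calls	None
Execution	Multi-step loop	Single generation
Typical use	Tasks that need external information	Pure reasoning tasks
Explainability	High (the Thought is visible)	High

Example comparison:

Question: Is the weather in Beijing today suitable for outdoor exercise?

CoT approach:
Thinking: The weather in Beijing today... I have to judge from general knowledge...
Conclusion: Spring is usually fine for outdoor exercise (no real-time information)

ReAct approach:
Thought: I need today's actual weather in Beijing
Action: search("Beijing weather today")
Observation: Beijing today: sunny, 15-25°C, excellent air quality
Thought: Conditions are good for outdoor exercise
Final Answer: Yes. A morning run or a bike ride would be a good choice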

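Q3: How does a ReAct agent handle a failed tool call?

Handling strategy:

def handle_tool_failure(error, agent):
    """Turn a tool failure into an Observation for the LLM."""
    
    # 1. Feed the error message back as the Observation
    observation = f"Tool call failed: {error}"
    
    # 2. The LLM then decides the next step from that message:
    #    - retry with different arguments
    #    - switch to another tool
    #    - fall back to a degraded answer
    #    - report the failure
    
    return observation

What this looks like in practice:

Action: api_call("https://example.com")
Observation: network timeout error
Thought: The API call failed; try a backup data source
Action: search("related data")
Observation: search succeeded...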
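Transient failures such as the timeout in this trace can often be absorbed before the LLM ever sees them. The sketch below is one way to do that, assuming tools are plain callables that raise on failure; `run_tool_with_retry` and its parameters are names invented for this example. It retries with exponential backoff and, if every attempt fails, returns the error as text so it can be used as the Observation.

```python
import time
from typing import Callable, Optional

def run_tool_with_retry(
    tool: Callable[[str], str],
    tool_input: str,
    max_attempts: int = 3,
    base_delay: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Call `tool`; retry on exceptions, then report the failure as text."""
    last_error: Optional[Exception] = None
    for attempt in range(max_attempts):
        try:
            return str(tool(tool_input))
        except Exception as exc:  # broad on purpose: any failure becomes an Observation
            last_error = exc
            if attempt < max_attempts - 1:
                sleep(base_delay * 2 ** attempt)  # exponential backoff: 0.5s, 1s, ...
    return f"Tool call failed after {max_attempts} attempts: {last_error}"

# Hypothetical flaky tool: fails twice, then succeeds.
calls = {"n": 0}

def flaky_search(query: str) -> str:
    calls["n"] += 1
    if calls["n"] < 3:
        raise TimeoutError("network timeout")
    return f"result for {query}"

# The sleep function is injected so the demo runs without waiting.
print(run_tool_with_retry(flaky_search, "related data", sleep=lambda s: None))
```

Retrying only helps with transient errors. For a wrong argument or an unknown tool, the error text should go straight back to the model, since repeating the same call will fail the same way.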

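Q4: What are the main limitations of ReAct, and how can they be mitigated?

Main limitations:

Limitation	Description	Mitigation
Loop overhead	Every step requires an LLM call	Cut unnecessary iterations
Context explosion	The history grows too long	Compress or summarize the history
Weak planning	Step-by-step thinking is inefficient	Combine with Plan-and-Execute
Error propagation	Early mistakes affect later steps	Isolate errors and recover from them

Example code:

# Context compression
def compress_history(history, max_length=5):
    if len(history) <= max_length:
        return history
    return history[-max_length:]

# Early termination
def should_stop(history):
    # Two identical Actions in a row suggest the agent is stuck
    if len(history) >= 2 and history[-1].action == history[-2].action:
        return True
    return False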

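Q5: How do you evaluate a ReAct agent?

Metrics:

Metric	How it is computed	Notes
Task success rate	Successful tasks / total tasks	Primary metric
Average steps	Total steps / number of tasks	Efficiency
Tool accuracy	Correct tool calls / total tool calls	Decision quality
Response latency	Mean response time	User experience
Token consumption	Mean tokens per task	Cost

Evaluation method:

import time


def count_tokens(history) -> int:
    """Rough estimate: whitespace-separated words across the recorded steps.
    Swap in a real tokenizer for accurate numbers."""
    return sum(
        len(f"{s.thought} {s.action} {s.action_input} {s.observation}".split())
        for s in history
    )


def evaluate_react_agent(agent, test_cases):
    results = []
    for case in test_cases:
        agent.history = []  # measure each case on its own
        start_time = time.time()
        answer = agent.run(case["input"])
        end_time = time.time()
        
        results.append({
            # Exact match is strict; free-form answers usually need a looser check
            "success": answer == case["expected"],
            "steps": len(agent.history),
            "time": end_time - start_time,
            "tokens": count_tokens(agent.history)
        })
    
    n = len(results) or 1  # avoid dividing by zero on an empty test set
    return {
        "success_rate": sum(r["success"] for r in results) / n,
        "avg_steps": sum(r["steps"] for r in results) / n,
        "avg_time": sum(r["time"] for r in results) / n,
        "avg_tokens": sum(r["tokens"] for r in results) / n
    }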
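Comparing the agent's answer to the expected string with `==` will mark most correct free-form answers as failures. A looser check is sketched below, assuming each test case's expected value is a short key phrase; `normalize` and `answer_matches` are names invented for this example. It ignores case and punctuation and matches on whole words.

```python
import re

def normalize(text: str) -> str:
    """Lowercase, replace punctuation with spaces, and collapse whitespace."""
    text = re.sub(r"[^\w\s]", " ", text.lower())
    return " ".join(text.split())

def answer_matches(answer: str, expected: str) -> bool:
    """True if the expected phrase appears in the answer on word boundaries."""
    # Padding with spaces keeps "55 yuan" from matching inside "155 yuan"
    return f" {normalize(expected)} " in f" {normalize(answer)} "

print(answer_matches("Xiao Ming has 55 yuan left.", "55 yuan"))   # True
print(answer_matches("Xiao Ming has 155 yuan left.", "55 yuan"))  # False
print(answer_matches("He has 45 yuan.", "55 yuan"))               # False
```

Phrase matching is still a heuristic: it cannot tell "55 yuan" from "not 55 yuan". For open-ended answers, a human review or a separate grading model is the usual next step.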

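Q6: Which kinds of tasks suit ReAct, and which do not?

Well suited:

✅ Knowledge Q&A (needs real-time search)
✅ Multi-step calculation (intermediate results can be checked)
✅ Web navigation (needs to observe page state)
✅ Data analysis (interactive exploration)
✅ Code debugging (step-by-step diagnosis)

Poorly suited:

❌ Batch processing (inefficient)
❌ Fixed procedures (no dynamic decisions needed)
❌ Real-time responses (strict latency requirements)
❌ Large-scale systems (a single agent has limited capacity)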

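Q7: How can ReAct be combined with RAG?

One way to combine them:

class RAGReActAgent(ReActAgent):
    """Agent that combines RAG with ReAct."""
    
    def __init__(self, llm, tools, vector_store):
        super().__init__(llm, tools)
        self.vector_store = vector_store
    
    def _retrieve_context(self, query: str) -> str:
        """Retrieve relevant context from the vector store."""
        docs = self.vector_store.similarity_search(query, k=3)
        return "\n".join([doc.page_content for doc in docs])
    
    def _build_prompt(self, input_text: str, scratchpad: str = "") -> str:
        """Build the standard ReAct prompt with retrieved context prepended."""
        # Retrieve relevant context
        context = self._retrieve_context(input_text)
        
        # Reuse the base prompt so the tool list and the format instructions
        # are still sent to the LLM
        react_prompt = super()._build_prompt(input_text, scratchpad)
        
        return f"Relevant context:\n{context}\n{react_prompt}"

Benefits:

RAG provides knowledge grounding
ReAct provides the ability to act
Together they improve how well the agent applies what it knows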

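7. Summary

Concept	One-line summary	Interview keywords
ReAct pattern	An agent architecture that interleaves reasoning and acting	T-A-O loop
Thought	The reasoning text generated by the LLM	Reasoning, analysis
Action	Selecting and calling a tool	Tool calls, decisions
Observation	Reading the tool's result	State update, feedback
Use cases	Tasks that need interaction with external information	Knowledge Q&A, data analysis
Limitations	Loop overhead, context explosion	Needs optimization strategies

In one sentence: ReAct is a foundational agent architecture pattern that interleaves reasoning and acting through the Thought-Action-Observation loop, and understanding it is the basis for understanding how agents work.

Last updated: March 18, 2026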
