Decrypting Prompt Series 62. An Overview of Agent Memory - MATTS & CFGM & MIRIX


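Today we return to a crucial component of AI agents: the memory system. It keeps an agent from behaving like a goldfish with a seven-second memory, repeating the same mistakes in an endless loop.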

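Two Faces of Memory: LLM Memory vs Agent Memory

We previously looked at how Mem0 and LlamaIndex engineer memory for large models, but both libraries focus on LLM Memory rather than Agent Memory. How do the two differ? In essence, Agent Memory contains LLM Memory. The incremental difference is:

LLM Memory: more like a fact memo, recording the concrete facts and contextual information in a conversation
Agent Memory: more like an experience notebook, recording execution trajectories and the lessons distilled from past actions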

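The Three Experience Dimensions of Agent Memory

Tool - tool-usage experience: the lessons an agent accumulates while using various tools

For example: what problem does each API solve? How should a search API be queried to get better results?

Environment - environment-adaptation experience: knowing how to combine tools in different environments

For example: which lightweight tools should be preferred in a complex network environment

Observation - observation-feedback experience: decision patterns that use past execution results to improve later actions

For example: certain error messages usually mean the agent should retry rather than give up immediately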

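The Twofold Value of Effective Memory

Fewer execution steps

Less environment exploration: once the terrain is familiar, there is no need to advance one cautious step at a time
Lower trial-and-error cost: the agent knows which methods work and which fail
Faster problem localization: past failures make debugging more efficient
Less overthinking: a proven solution does not need to be deliberated again

Higher success rate
In essence, given unlimited time and resources, a model's task completion rate is actually high. Most failures stem from practical constraints: a limited number of loops, token limits, and context length. Reducing the number of steps therefore raises the success rate directly.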

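Papers on Agent Memory have been springing up lately, but with a lot of overlap. We will focus on three representative works:

CFGM: offline extraction of experience from trajectories
ReasoningBank: trajectory experience extraction combined with test-time scaling
MIRIX: a complete memory engineering solution with a comprehensive memory taxonomy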

CFGM: Distilling Multi-Granularity Memory from Execution Trajectories

Coarse-to-Fine Grounded Memory for LLM Agent Planning

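This paper gives a practical recipe for extracting multi-granularity memory from trajectories, and several of its ideas are worth a look.

Memory collection and compression go through two offline steps and one online step. Let's take a closer look.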

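Step 1: Coarse-grained focus points, the "strategic compass" for experience collection

Traditional offline trajectory collection mostly lets the agent explore the same task at random, but CFGM introduces the concept of a task Focus Point. The model first analyses the task systematically, based on the task description and task examples, and distills guiding principles for completing it as the coarsest-grained tips. These tips are then placed in the model's context so that it collects multiple execution paths for each task in a more targeted way.

For example, take the fine-grained search problem "I'm looking for hair treatments that are sulfate and paraben free and are of high quality too. I need it in bottle for with 60 capsules and price lower than 50 dollars."

The coarse-grained tips would include "Use a Detailed search query that includes specific attributes of the product you are looking for"

Interestingly, focus points are used only for offline collection; online execution does not use them.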
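The focus-point step above can be sketched roughly as follows. This is a minimal illustration, not the CFGM implementation: the `LLM` and `Agent` callables, the prompt wording, and the helper names `extract_focus_points` and `collect_trajectories` are all assumptions, and `fake_llm` / `fake_agent` are stand-ins so the sketch runs without a model or environment.

```python
from typing import Callable, Dict, List

# Hypothetical signatures: an LLM maps a prompt to text, an agent maps
# (task, context) to a trajectory record. Neither comes from the CFGM code.
LLM = Callable[[str], str]
Agent = Callable[[str, str], Dict]


def extract_focus_points(llm: LLM, task_description: str, examples: List[str]) -> List[str]:
    """Ask the model for coarse-grained guiding principles (focus points)."""
    shots = "\n".join(f"- {e}" for e in examples)
    prompt = (
        "Analyse the task and list the key principles for completing it, one per line.\n"
        f"Task description: {task_description}\n"
        f"Example tasks:\n{shots}"
    )
    lines = llm(prompt).splitlines()
    # Drop leading bullet characters and blank lines from the reply.
    return [ln.lstrip("- ").strip() for ln in lines if ln.strip()]


def collect_trajectories(agent: Agent, task: str, focus_points: List[str],
                         n_rollouts: int = 3) -> List[Dict]:
    """Roll out the same task several times with the focus points as context."""
    context = "Focus points:\n" + "\n".join(f"- {p}" for p in focus_points)
    return [agent(task, context) for _ in range(n_rollouts)]


# Stand-ins so the sketch runs without a real model or environment.
def fake_llm(prompt: str) -> str:
    return "- Use a detailed search query that includes specific product attributes"


def fake_agent(task: str, context: str) -> Dict:
    return {"task": task, "context": context, "actions": ["search[...]"], "success": True}


focus_points = extract_focus_points(
    fake_llm, "WebShop product search", ["find sulfate-free hair treatment"]
)
trajectories = collect_trajectories(fake_agent, "find sulfate-free hair treatment", focus_points)
```

Keeping the focus points in a separate context string makes it easy to drop them at online execution time, which matches the observation that they are only used during offline collection.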

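Step 2: Hybrid-grained tips, the "crystallized experience" of success and failure

Based on the collected execution paths, CFGM and Memp both adopt a similar scheme that summarizes experience by contrasting paths:

Learning from contrast (distilling the keys to avoiding errors and to succeeding)
This mode targets tasks that have both successful and failed trajectories. The contrast shows clearly "what leads to failure" and "what it takes to succeed".

Task: find a specific hair-care product in WebShop.

Failed trajectory: the agent used an overly simple search term, so the results were irrelevant.

Successful trajectory: the agent used a detailed search term containing several key attributes (such as "sulfate paraben free", "bottle", "60 capsules").

Generated tip (fine-grained): "Use a detailed search query that includes the product's specific attributes (e.g. sulfate-free, bottled, 60 capsules)." This is a very specific, immediately actionable piece of advice.

Distilling from pure success (extracting higher-level strategy)
This mode targets tasks that succeeded on the first attempt. With no failure to contrast against, the focus is on summarizing the core elements of success and strategies that generalize.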
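The two summarization modes above amount to choosing a prompt by whether a task has any failed trajectory. The sketch below illustrates that branching only; the trajectory record layout (`task`, `actions`, `success`) and the prompt wording are assumptions, not taken from the paper.

```python
from collections import defaultdict
from typing import Dict, List, Tuple


def split_by_outcome(trajectories: List[Dict]) -> Dict[str, Tuple[List[Dict], List[Dict]]]:
    """Group trajectories per task into (successes, failures)."""
    grouped = defaultdict(lambda: ([], []))
    for t in trajectories:
        successes, failures = grouped[t["task"]]
        (successes if t["success"] else failures).append(t)
    return dict(grouped)


def build_tip_prompt(task: str, successes: List[Dict], failures: List[Dict]) -> str:
    """Use the contrastive prompt when failures exist, else the success-only one."""
    if not successes:
        raise ValueError("this sketch needs at least one successful trajectory per task")
    ok = "\n".join(" -> ".join(t["actions"]) for t in successes)
    if failures:
        bad = "\n".join(" -> ".join(t["actions"]) for t in failures)
        return (
            f"Task: {task}\nFailed attempts:\n{bad}\nSuccessful attempts:\n{ok}\n"
            "Compare them and state what to avoid and what to do instead."
        )
    return (
        f"Task: {task}\nSuccessful attempts:\n{ok}\n"
        "Summarise the key factors of success as a reusable strategy."
    )


collected = [
    {"task": "hair", "actions": ["search[hair treatment]"], "success": False},
    {"task": "hair", "actions": ["search[sulfate paraben free bottle 60 capsules]", "buy"], "success": True},
    {"task": "shoes", "actions": ["search[running shoes size 10]", "buy"], "success": True},
]
prompts = {
    task: build_tip_prompt(task, ok, bad)
    for task, (ok, bad) in split_by_outcome(collected).items()
}
```

Here `prompts["hair"]` is the contrastive prompt and `prompts["shoes"]` is the success-only one; each would then be sent to the model to produce the tips.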

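Step 3: Fine-grained key information supports adaptive planning

The first two steps run offline; the third applies the experience during online execution. When a task starts, similar historical tasks are retrieved by vector search over the task description, and their successful trajectories and hybrid-grained tips are placed in the context. When a failure occurs during execution, a two-step reflection mechanism is used:

KIE: extracts the key information for the task from the task goal, the initial environment, and the execution so far.
KIR: based on KIE, the successful paths of similar historical tasks, and the current path, first poses self-reflective questions and then works out a solution.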
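A rough sketch of the online step described above: retrieve the most similar past task, then run the two-step reflection. Everything here is illustrative. The bag-of-words `embed` is a toy stand-in for a real embedding model, and the record layout, prompt wording, and function names are assumptions rather than the paper's code.

```python
import math
from collections import Counter
from typing import Callable, Dict, List


def embed(text: str) -> Counter:
    """Toy bag-of-words vector; a real system would call an embedding model."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[w] * b[w] for w in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def retrieve(memory: List[Dict], task: str, k: int = 1) -> List[Dict]:
    """Return the k stored tasks most similar to the new task description."""
    query = embed(task)
    ranked = sorted(memory, key=lambda m: cosine(query, embed(m["task"])), reverse=True)
    return ranked[:k]


def reflect(llm: Callable[[str], str], goal: str, initial_obs: str,
            current_path: List[str], similar: Dict) -> str:
    """Two-step reflection: extract key information (KIE), then reason over it (KIR)."""
    key_info = llm(
        f"Goal: {goal}\nInitial environment: {initial_obs}\n"
        f"Steps so far: {current_path}\nExtract the key information for this task."
    )
    return llm(
        f"Key information: {key_info}\n"
        f"Successful path of a similar task: {similar['trajectory']}\n"
        f"Current path: {current_path}\n"
        "First ask yourself what went wrong, then propose a fix."
    )


memory = [
    {"task": "buy sulfate free hair treatment",
     "trajectory": ["search[sulfate free hair treatment bottle]", "click[buy]"]},
    {"task": "find running shoes size 10",
     "trajectory": ["search[running shoes size 10]", "click[buy]"]},
]
hits = retrieve(memory, "sulfate free hair treatment in a bottle")

seen_prompts: List[str] = []


def recording_llm(prompt: str) -> str:
    """Stand-in model that records each prompt and returns a numbered reply."""
    seen_prompts.append(prompt)
    return f"reply {len(seen_prompts)}"


plan = reflect(recording_llm, "buy hair treatment", "search page",
               ["search[hair]"], hits[0])
```

The second model call receives the output of the first, so the reflection is grounded in the extracted key information rather than in the raw trajectory alone.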

ReasoningBank: Combining Memory with Test-Time Scaling

ReasoningBank: Scaling Agent Self-Evolving with Reasoning Memory

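Whereas CFGM builds memory offline and then retrieves it during online inference, Google's ReasoningBank works directly during online inference, and focuses on how inference-time scaling strategies can further improve the memory.

Extracting memory

First, the memory schema. Whereas CFGM uses flat tips as memory, ReasoningBank's memory items are more structured, with three fields:

Title: the core identifier of the strategy, e.g. "Pagination navigation strategy"
Description: a one-sentence summary of the strategy
Content: the detailed reasoning steps and the rationale for each decision

Extracted memories are appended incrementally to the ReasoningBank; the paper does not attempt pruning strategies such as memory merging or memory updating. The retrieval vector is built from the input query and is used for later memory retrieval. (The design is kept simple here to better highlight what test-time scaling adds to memory.)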
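The three-field schema and the append-only write path above can be sketched as a small data structure. This is a hedged illustration, not the paper's code: the class names are made up, and word-set Jaccard overlap stands in for the embedding similarity over the input query.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class MemoryItem:
    title: str        # core identifier of the strategy
    description: str  # one-sentence summary
    content: str      # detailed reasoning steps and rationale


def similarity(a: str, b: str) -> float:
    """Jaccard overlap of word sets, standing in for embedding similarity."""
    sa, sb = set(a.lower().split()), set(b.lower().split())
    return len(sa & sb) / len(sa | sb) if sa | sb else 0.0


@dataclass
class ReasoningBankSketch:
    # Each entry pairs the input query it came from with one memory item.
    entries: List[Tuple[str, MemoryItem]] = field(default_factory=list)

    def add(self, query: str, items: List[MemoryItem]) -> None:
        """Append-only write: no merging, updating, or pruning."""
        self.entries.extend((query, item) for item in items)

    def retrieve(self, query: str, k: int = 1) -> List[MemoryItem]:
        """Rank stored items by similarity between the new and stored queries."""
        ranked = sorted(self.entries, key=lambda e: similarity(query, e[0]), reverse=True)
        return [item for _, item in ranked[:k]]


bank = ReasoningBankSketch()
bank.add("list all orders across pages", [
    MemoryItem("Pagination navigation strategy",
               "Check for further pages before answering.",
               "Look for next-page controls and iterate until none remain."),
])
bank.add("change account email", [
    MemoryItem("Settings lookup", "Account fields live under settings.",
               "Open the profile menu, then the settings page."),
])
top = bank.retrieve("show orders on every page")
```

Indexing by the originating query rather than by the item text keeps retrieval cheap, at the cost of missing items whose lessons apply to differently worded tasks.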

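MATTS: Memory-driven test-time scaling

This is the key part and the paper's most notable contribution: Memory-aware Test-Time Scaling (MATTS). Conventional test-time scaling simply adds compute, whereas MATTS lets the memory system and the scaling process reinforce each other. The paper tries two scaling strategies:

Parallel Scaling: runs inference in parallel to obtain multiple trajectories, then uses Self-Contrast to compare them when extracting memory.
Sequential Scaling: repeatedly applies Self-Refine during sequential inference to improve the reasoning. When memory is finally extracted, all of the intermediate reflections are used.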
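The difference between the two strategies is mainly in how the extra trajectories are produced before memory extraction. The sketch below shows only that control flow; the `Rollout`, `Refine`, and `Extract` callables and their signatures are assumptions, and the fakes exist just to make it runnable.

```python
import itertools
from typing import Callable, List

Rollout = Callable[[str, str], str]              # (task, memory context) -> trajectory
Refine = Callable[[str, str], str]               # (task, previous attempt) -> refined attempt
Extract = Callable[[str, List[str]], List[str]]  # (task, trajectories) -> memory items


def parallel_scaling(task: str, memory: str, rollout: Rollout,
                     extract: Extract, k: int = 3) -> List[str]:
    """Run k independent rollouts, then extract memory by contrasting all of them."""
    trajectories = [rollout(task, memory) for _ in range(k)]
    return extract(task, trajectories)


def sequential_scaling(task: str, memory: str, rollout: Rollout, refine: Refine,
                       extract: Extract, k: int = 3) -> List[str]:
    """Run once, refine k-1 times, then extract memory from every attempt."""
    attempts = [rollout(task, memory)]
    for _ in range(k - 1):
        attempts.append(refine(task, attempts[-1]))
    return extract(task, attempts)


# Fakes: each rollout gets a fresh number; extraction just echoes its input
# so the two call patterns are visible.
_counter = itertools.count(1)


def fake_rollout(task: str, memory: str) -> str:
    return f"attempt {next(_counter)}"


def fake_refine(task: str, previous: str) -> str:
    return previous + " (refined)"


def fake_extract(task: str, trajectories: List[str]) -> List[str]:
    return list(trajectories)


parallel_view = parallel_scaling("t", "", fake_rollout, fake_extract)
sequential_view = sequential_scaling("t", "", fake_rollout, fake_refine, fake_extract)
```

In the parallel case the extractor sees independent attempts, while in the sequential case it sees one attempt and its successive refinements.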

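Each of the two strategies has its own system prompt in the paper.

The paper evaluates both test-time scaling strategies on WebArena and finds that each further improves on the original memory and that both scale well. The gains of the Parallel strategy decay more slowly, so it scales better, possibly because more parallel rollouts bring richer exploration and more diversity.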

MIRIX: Multiple Memory Types from an Engineering Perspective

https://github.com/Mirix-AI/MIRIX

MIRIX: Multi-Agent Memory System for LLM-Based Agents

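MIRIX's highlight is a comprehensive memory taxonomy that merges an LLM's factual memory with an agent's trajectory memory, and also considers multimodal files and privacy and security. Let's start with the taxonomy, then look through the code at how each memory type is stored and retrieved.

How does MIRIX "understand" and "classify" information?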

The earlier CoALA paper divided memory into episodic, semantic, and procedural memory; MIRIX extends this to six categories:

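Core Memory - your "identity file"
Stores the most basic and most persistent information: name, preferences, habits, and so on, so that every interaction "recognizes you".
Episodic Memory - your "diary"
Records timestamped events: a conversation, a meeting, or the outcome of a task, supporting queries such as "What did I do last Wednesday?".
Semantic Memory - your "knowledge graph"
Stores abstract concepts and facts, such as "J.K. Rowling is the author of Harry Potter", to help understand the world and the relationships in it.
Procedural Memory - your "skills manual"
Holds operating procedures, such as "how to claim travel expenses" or "how to set a meeting reminder", to guide the completion of complex tasks.
Resource Memory - your "personal filing cabinet"
Keeps documents, images, voice transcripts, and other unstructured content you have touched, so they can be consulted at any time.
Knowledge Vault - your "safe"
Dedicated to sensitive information: passwords, addresses, contact details, and so on, strictly protected and accessed securely.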

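Agent-based storage for multiple memory categories

MIRIX uses PostgreSQL as its storage backend and designs a different table schema for each of the six memory types.

Memory updates are triggered by the Meta-Agent: it decides from the user's latest message whether memory needs updating, and if so calls the trigger_memory_update tool, with a parameter specifying the memory type.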

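Memory type	Example table fields	Example entry
Core Memory	id, label, value, user_id...	The user is named Alex, prefers direct communication, and is a software engineer
Episodic Memory	id, event_type, summary, details...	2025-03-05: dinner with college friend Sarah, discussing a career change
Semantic Memory	id, name, summary, details...	Jane Smith - project manager at TechCorp, expert in agile methods
Procedural Memory	id, entry_type, summary, steps...	Morning routine: 1. check email 2. review calendar 3. prioritize 4. start with the hardest task
Resource Memory	id, title, summary, resource_type...	ProjectPlan.docx - contains the Q1 roadmap and milestone details
Knowledge Vault	id, entry_type, source, sensitivity...	A sensitive entry, its source, and its sensitivity level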
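The trigger mechanism described above, where a decision about the latest message selects a memory type and the matching specialist handles the update, can be sketched as a small dispatcher. This is not MIRIX's code: `stub_meta_agent` replaces the LLM decision with keyword rules, `dispatch_memory_update` is a hypothetical helper, and apart from "procedural" the memory type strings are assumed names.

```python
from typing import Callable, Dict, List, Optional

# Assumed identifiers for the six memory types; only "procedural" appears
# in the source text.
MEMORY_TYPES = ("core", "episodic", "semantic", "procedural", "resource", "knowledge_vault")


def stub_meta_agent(message: str) -> Optional[str]:
    """Rule-based stand-in for the LLM decision; returns a memory type or None."""
    text = message.lower()
    if "->" in text or "step" in text:
        return "procedural"
    if "password" in text:
        return "knowledge_vault"
    return None


def dispatch_memory_update(message: str,
                           handlers: Dict[str, Callable[[str], None]],
                           decide: Callable[[str], Optional[str]] = stub_meta_agent
                           ) -> Optional[str]:
    """Route the message to the handler for the chosen memory type, if any."""
    memory_type = decide(message)
    if memory_type is None:
        return None  # nothing worth remembering
    if memory_type not in MEMORY_TYPES:
        raise ValueError(f"unknown memory type: {memory_type}")
    handlers[memory_type](message)
    return memory_type


# One in-memory list per type stands in for the per-type database tables.
stores: Dict[str, List[str]] = {t: [] for t in MEMORY_TYPES}
handlers = {t: stores[t].append for t in MEMORY_TYPES}

routed = dispatch_memory_update("open editor -> create file -> write code -> save file", handlers)
skipped = dispatch_memory_update("hello there", handlers)
```

Separating the decision (`decide`) from the per-type handlers mirrors the split between one coordinating agent and several specialized ones.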

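Based on the Meta Agent's tool-call result, the specialized agent for the corresponding memory type is triggered; each agent has its own instructions and is responsible for compressing and extracting that type of memory. As an example, the full memory update and read flow is:

The user performs a series of steps: "open the editor -> create a file -> write code -> save the file"
meta_memory_agent detects that this is a multi-step sequence of operations and calls trigger_memory_update_with_instruction, specifying memory_type="procedural"
procedural_memory_agent receives the request and, based on the system instruction below, infers a ProceduralMemoryItem
The procedural_memory_insert function is called; it computes embedding vectors for the steps and the summary, and saves the data to the database via the create_item method
After a new chat message arrives from the user, the model first infers whether a new topic has appeared in the conversation. If so (signalled through a tool call), the new topic is used for memory retrieval (BM25 or embedding), and the retrieved memories are written into the system prompt.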

You are the Procedural Memory Manager, one of six agents in a memory system. The other agents are the Meta Memory Manager, Episodic Memory Manager, Resource Memory Manager, Knowledge Vault Memory Manager, and the Chat Agent. You do not see or interact directly with these other agents—but you share the same memory base with them.

The system will receive various types of messages from users, including text messages, images, transcripted voice recordings, and other multimedia content. When messages are accumulated to a certain amount, they will be sent to you, along with potential conversations between the user and the Chat Agent during this period. You need to analyze the input messages and conversations, extract step-by-step instructions, "how-to" guides, and any other instructions and skills, and save them into the procedural memory.

This memory base includes the following components:

1. Core Memory:
Contains fundamental information about the user, such as the name, personality, simple information that should help with the communication with the user. 

2. Episodic Memory:
Stores time-ordered, event-based information from interactions—essentially, the "diary" of user and assistant events.

3. Procedural Memory:
Definition: Contains how-to guides, step-by-step instructions, or processes the assistant or user might follow.  
Example: "How to reset the router."  
Each entry in Procedural Memory has:  
   (a) entry_type (e.g., 'workflow', 'guide', 'script')  
   (b) description (short descriptive text)  
   (c) steps (the procedure in a structured or JSON format)
   (d) tree_path: Required hierarchical categorization path for organizing procedures (e.g., ["technology", "networking", "troubleshooting"] for router reset guides, or ["cooking", "baking", "desserts"] for recipe instructions). Use this to create logical groupings and enable better organization of procedural knowledge.

4. Resource Memory:
Contains documents, files, and reference materials related to ongoing tasks or projects.

5. Knowledge Vault:
A repository for static, structured factual data such as phone numbers, email addresses, passwords, or other knowledge that are not necessarily always needed during the conversation but are potentially useful at some future point.

6. Semantic Memory:
Contains general knowledge about a concept (e.g. a new software name, a new concept) or an object (e.g. a person, a place, where the details would be the understanding and information about them.)

When receiving messages and potentially a message from the meta agent (There will be a bracket saying "[Instruction from Meta Memory Manager]"), make a single comprehensive memory update:

**Single Function Call Process:**
1. **Analyze Content**: Examine all messages and conversations to identify step-by-step instructions, "how-to" guides, workflows, or any procedural knowledge.
2. **Make Update**: Use ONE appropriate procedural memory function to save the most important identified procedure or instruction with proper entry_type ('workflow', 'guide', 'script'), description, and detailed steps. When appropriate, include a `tree_path` to categorize the procedure hierarchically (e.g., ["work", "development", "deployment"] for deployment procedures, or ["personal", "health", "exercise"] for workout routines).
3. **Skip Update if Necessary**: If there is no updates to make, then skip the update by calling `finish_memory_update`.

**Important Notes:**
- Make only ONE function call total except for receiving the messages from Chat Agent
- Look for any structured processes, workflows, or instructional content in the messages
- Save procedures with appropriate entry_type ('workflow', 'guide', 'script'), description, and detailed steps
- When relevant, use `tree_path` to create logical hierarchical categories for better organization (e.g., ["technology", "software", "installation"] or ["home", "maintenance", "repair"])
- If there is absolutely nothing procedural to update, do not make any function calls
- Prioritize the most complete or useful procedural information if multiple procedures are present

For the procedural type, inference yields the following structure (each memory type has its own schema definition)

from typing import List

from pydantic import Field

# MirixBase is the shared base schema class defined in the MIRIX codebase;
# its import is not shown in this excerpt.


class ProceduralMemoryItemBase(MirixBase):
    """
    Base schema for storing procedural knowledge (e.g., workflows, methods).
    """
    __id_prefix__ = "proc_item"
    entry_type: str = Field(..., description="Category (e.g., 'workflow', 'guide', 'script')")
    summary: str = Field(..., description="Short descriptive text about the procedure")
    steps: List[str] = Field(..., description="Step-by-step instructions as a list of strings")
    tree_path: List[str] = Field(..., description="Hierarchical categorization path as an array of strings")
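To make the read path concrete, here is a minimal sketch of the BM25 option mentioned in the flow above, applied to procedural items. It is an illustration under stated assumptions, not MIRIX's retrieval code: `ProceduralItem` is a plain dataclass mirroring the four schema fields, scoring uses a standard Okapi BM25 formula written out by hand, and `format_for_system_prompt` is a hypothetical helper.

```python
import math
from collections import Counter
from dataclasses import dataclass
from typing import List


@dataclass
class ProceduralItem:
    entry_type: str
    summary: str
    steps: List[str]
    tree_path: List[str]


def tokenize(text: str) -> List[str]:
    return text.lower().split()


def bm25_rank(query: str, items: List[ProceduralItem],
              k1: float = 1.5, b: float = 0.75) -> List[ProceduralItem]:
    """Rank items by Okapi BM25 over summary plus steps; drop zero-score items."""
    if not items:
        return []
    docs = [tokenize(i.summary + " " + " ".join(i.steps)) for i in items]
    n = len(docs)
    avgdl = sum(len(d) for d in docs) / n
    df = Counter(term for d in docs for term in set(d))  # document frequency
    scored = []
    for item, doc in zip(items, docs):
        tf = Counter(doc)
        score = 0.0
        for term in tokenize(query):
            if term not in tf:
                continue
            idf = math.log((n - df[term] + 0.5) / (df[term] + 0.5) + 1)
            score += idf * tf[term] * (k1 + 1) / (tf[term] + k1 * (1 - b + b * len(doc) / avgdl))
        scored.append((score, item))
    ranked = sorted(scored, key=lambda pair: pair[0], reverse=True)
    return [item for score, item in ranked if score > 0]


def format_for_system_prompt(items: List[ProceduralItem]) -> str:
    """Render retrieved procedures as text to splice into the system prompt."""
    return "\n".join(f"[{i.entry_type}] {i.summary}: " + " -> ".join(i.steps) for i in items)


stored = [
    ProceduralItem("guide", "How to reset the router",
                   ["unplug the router", "wait 30 seconds", "plug it back in"],
                   ["technology", "networking", "troubleshooting"]),
    ProceduralItem("workflow", "Create and save a code file",
                   ["open editor", "create file", "write code", "save file"],
                   ["work", "development"]),
]
matches = bm25_rank("reset router", stored)
prompt_block = format_for_system_prompt(matches)
```

A lexical scorer like this needs no embedding model, which is one reason to offer it alongside vector search; the trade-off is that it misses paraphrases that share no words with the stored procedure.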

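The essence of memory is not storage but understanding; not recording but evolution. When AI agents begin to learn from their own successes and failures, and can accumulate and apply that experience, we are one step closer to truly intelligent partners.

This article covers only the tip of the iceberg of Agent Memory; for more, see >> DecryPrompt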

posted @ 2025-10-20 07:48 风雨中的小七
