<?xml version="1.0" encoding="utf-8" standalone="yes"?><rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>Agent Memory on ICE-ICE-BEAR-BLOG</title><link>https://ice-ice-bear.github.io/tags/agent-memory/</link><description>Recent content in Agent Memory on ICE-ICE-BEAR-BLOG</description><generator>Hugo -- gohugo.io</generator><language>en</language><lastBuildDate>Fri, 08 May 2026 00:00:00 +0900</lastBuildDate><atom:link href="https://ice-ice-bear.github.io/tags/agent-memory/index.xml" rel="self" type="application/rss+xml"/><item><title>The OS Layer for AI Coding Agents — agentmemory and agent-skills Land the Same Day</title><link>https://ice-ice-bear.github.io/posts/2026-05-08-agent-os-layer-memory-skills/</link><pubDate>Fri, 08 May 2026 00:00:00 +0900</pubDate><guid>https://ice-ice-bear.github.io/posts/2026-05-08-agent-os-layer-memory-skills/</guid><description>&lt;img src="https://ice-ice-bear.github.io/" alt="Featured image of post The OS Layer for AI Coding Agents — agentmemory and agent-skills Land the Same Day" /&gt;&lt;h2 id="overview"&gt;Overview
&lt;/h2&gt;&lt;p&gt;Two GitHub links, dropped 30 seconds apart. Both address ergonomic gaps in AI coding agents, but &lt;strong&gt;they target different gaps.&lt;/strong&gt; &lt;a class="link" href="https://github.com/rohitg00/agentmemory" target="_blank" rel="noopener"
 &gt;rohitg00/agentmemory&lt;/a&gt; tackles cross-session memory infrastructure; &lt;a class="link" href="https://github.com/addyosmani/agent-skills" target="_blank" rel="noopener"
 &gt;addyosmani/agent-skills&lt;/a&gt; tackles senior-engineer workflow enforcement. Read together, they sketch out an emerging OS layer for the agent era.&lt;/p&gt;
&lt;pre class="mermaid" style="visibility:hidden"&gt;graph TD
 Agent["AI coding agent"] --&gt; Memory["Memory / state layer"]
 Agent --&gt; Skills["Workflow / rules layer"]
 Agent --&gt; Model["Model layer"]
 Agent --&gt; UI["UI layer"]

 Memory --&gt; AM["agentmemory &amp;lt;br/&amp;gt; MCP + REST"]
 Skills --&gt; AS["agent-skills &amp;lt;br/&amp;gt; Markdown skill bundle"]
 Model --&gt; LLM["Claude / GPT / Gemini"]
 UI --&gt; Tools["Claude Code / Cursor / Cline"]&lt;/pre&gt;&lt;h2 id="1-agentmemory--persistent-memory-shared-across-every-agent-via-mcp"&gt;1. agentmemory — Persistent Memory Shared Across Every Agent via MCP
&lt;/h2&gt;&lt;p&gt;&lt;a class="link" href="https://github.com/rohitg00/agentmemory" target="_blank" rel="noopener"
 &gt;rohitg00/agentmemory&lt;/a&gt; brands itself as &lt;em&gt;&amp;ldquo;#1 Persistent memory for AI coding agents based on real-world benchmarks.&amp;rdquo;&lt;/em&gt; Created 2026-02-25, ~2,400 stars, Apache 2.0. Project home: &lt;a class="link" href="https://agent-memory.dev" target="_blank" rel="noopener"
 &gt;agent-memory.dev&lt;/a&gt;.&lt;/p&gt;
&lt;h3 id="the-problem-it-solves"&gt;The problem it solves
&lt;/h3&gt;&lt;ul&gt;
&lt;li&gt;Re-explaining the architecture to the agent every session&lt;/li&gt;
&lt;li&gt;Rediscovering the same bug&lt;/li&gt;
&lt;li&gt;Re-teaching the same preferences (library choices, code style)&lt;/li&gt;
&lt;li&gt;Built-in memory like &lt;code&gt;CLAUDE.md&lt;/code&gt; or &lt;code&gt;.cursorrules&lt;/code&gt; is &lt;strong&gt;capped at 200 lines and goes stale fast&lt;/strong&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;h3 id="how-it-works"&gt;How it works
&lt;/h3&gt;&lt;p&gt;The agent silently captures what it does → compresses → stores as searchable memory → injects only the relevant context at the start of the next session. The key trick: stand up a single &lt;a class="link" href="https://modelcontextprotocol.io/" target="_blank" rel="noopener"
 &gt;MCP&lt;/a&gt; server and 16+ agents share the same memory.&lt;/p&gt;
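&lt;p&gt;As an illustration of the single-server fan-out, an MCP client registration might look like the following. The package name comes from the project&amp;rsquo;s quick start; the exact config file and key names vary per client, so treat this as a sketch rather than agentmemory&amp;rsquo;s documented setup:&lt;/p&gt;

```json
{
  "mcpServers": {
    "agentmemory": {
      "command": "npx",
      "args": ["@agentmemory/agentmemory"]
    }
  }
}
```

&lt;p&gt;Point every client at the same server entry and they all read and write one shared memory store.&lt;/p&gt;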
&lt;p&gt;Supported clients:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a class="link" href="https://www.anthropic.com/claude-code" target="_blank" rel="noopener"
 &gt;Claude Code&lt;/a&gt; · &lt;a class="link" href="https://cursor.com/" target="_blank" rel="noopener"
 &gt;Cursor&lt;/a&gt; · &lt;a class="link" href="https://github.com/google-gemini/gemini-cli" target="_blank" rel="noopener"
 &gt;Gemini CLI&lt;/a&gt; · &lt;a class="link" href="https://openai.com/codex/" target="_blank" rel="noopener"
 &gt;Codex CLI&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a class="link" href="https://cline.bot/" target="_blank" rel="noopener"
 &gt;Cline&lt;/a&gt; · &lt;a class="link" href="https://block.github.io/goose/" target="_blank" rel="noopener"
 &gt;Goose&lt;/a&gt; · &lt;a class="link" href="https://windsurf.com/" target="_blank" rel="noopener"
 &gt;Windsurf&lt;/a&gt; · &lt;a class="link" href="https://roocode.com/" target="_blank" rel="noopener"
 &gt;Roo Code&lt;/a&gt; · &lt;a class="link" href="https://opencode.ai/" target="_blank" rel="noopener"
 &gt;OpenCode&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;Any agent without MCP can connect via REST (104 endpoints)&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Embeddings run locally with &lt;a class="link" href="https://huggingface.co/sentence-transformers/all-MiniLM-L6-v2" target="_blank" rel="noopener"
 &gt;&lt;code&gt;all-MiniLM-L6-v2&lt;/code&gt;&lt;/a&gt; — no API keys, free.&lt;/p&gt;
&lt;h3 id="benchmark--longmemeval-s"&gt;Benchmark — LongMemEval-S
&lt;/h3&gt;&lt;p&gt;Numbers on &lt;a class="link" href="https://arxiv.org/abs/2410.10813" target="_blank" rel="noopener"
 &gt;LongMemEval&lt;/a&gt; (ICLR 2025, 500 questions):&lt;/p&gt;
&lt;table&gt;
 &lt;thead&gt;
 &lt;tr&gt;
 &lt;th&gt;Metric&lt;/th&gt;
 &lt;th&gt;agentmemory&lt;/th&gt;
 &lt;th&gt;BM25 fallback&lt;/th&gt;
 &lt;/tr&gt;
 &lt;/thead&gt;
 &lt;tbody&gt;
 &lt;tr&gt;
 &lt;td&gt;R@5&lt;/td&gt;
 &lt;td&gt;95.2%&lt;/td&gt;
 &lt;td&gt;86.2%&lt;/td&gt;
 &lt;/tr&gt;
 &lt;tr&gt;
 &lt;td&gt;R@10&lt;/td&gt;
 &lt;td&gt;98.6%&lt;/td&gt;
 &lt;td&gt;—&lt;/td&gt;
 &lt;/tr&gt;
 &lt;tr&gt;
 &lt;td&gt;MRR&lt;/td&gt;
 &lt;td&gt;88.2%&lt;/td&gt;
 &lt;td&gt;—&lt;/td&gt;
 &lt;/tr&gt;
 &lt;/tbody&gt;
&lt;/table&gt;
&lt;p&gt;Hybrid embedding retrieval beats keyword-only BM25 by &lt;strong&gt;9 percentage points on R@5.&lt;/strong&gt;&lt;/p&gt;
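&lt;p&gt;For readers unfamiliar with the metrics: R@k is the fraction of relevant memories that appear in the top-k retrieved results, and MRR is the mean reciprocal rank of the first relevant hit. A minimal generic implementation (not agentmemory&amp;rsquo;s code):&lt;/p&gt;

```python
def recall_at_k(ranked, relevant, k):
    # fraction of the relevant items that show up in the top-k results
    hits = set(ranked[:k]).intersection(relevant)
    return len(hits) / len(relevant)

def mean_reciprocal_rank(queries):
    # queries: list of (ranked_results, relevant_set) pairs;
    # average 1/rank of the first relevant hit per query (0 if none found)
    total = 0.0
    for ranked, relevant in queries:
        rank = next((i + 1 for i, doc in enumerate(ranked) if doc in relevant), None)
        if rank is not None:
            total += 1.0 / rank
    return total / len(queries)
```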
&lt;h3 id="token-cost"&gt;Token cost
&lt;/h3&gt;&lt;table&gt;
 &lt;thead&gt;
 &lt;tr&gt;
 &lt;th&gt;Approach&lt;/th&gt;
 &lt;th&gt;Annual tokens&lt;/th&gt;
 &lt;th&gt;Annual cost&lt;/th&gt;
 &lt;/tr&gt;
 &lt;/thead&gt;
 &lt;tbody&gt;
 &lt;tr&gt;
 &lt;td&gt;Full context paste&lt;/td&gt;
 &lt;td&gt;19.5M+&lt;/td&gt;
 &lt;td&gt;Exceeds context window&lt;/td&gt;
 &lt;/tr&gt;
 &lt;tr&gt;
 &lt;td&gt;LLM-summarized&lt;/td&gt;
 &lt;td&gt;~650K&lt;/td&gt;
 &lt;td&gt;~$500&lt;/td&gt;
 &lt;/tr&gt;
 &lt;tr&gt;
 &lt;td&gt;&lt;strong&gt;agentmemory&lt;/strong&gt;&lt;/td&gt;
 &lt;td&gt;&lt;strong&gt;~170K&lt;/strong&gt;&lt;/td&gt;
 &lt;td&gt;&lt;strong&gt;~$10&lt;/strong&gt;&lt;/td&gt;
 &lt;/tr&gt;
 &lt;tr&gt;
 &lt;td&gt;agentmemory + local embeddings&lt;/td&gt;
 &lt;td&gt;~170K&lt;/td&gt;
 &lt;td&gt;&lt;strong&gt;$0&lt;/strong&gt;&lt;/td&gt;
 &lt;/tr&gt;
 &lt;/tbody&gt;
&lt;/table&gt;
&lt;h3 id="quick-start"&gt;Quick start
&lt;/h3&gt;&lt;div class="highlight"&gt;&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-bash" data-lang="bash"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;npx @agentmemory/agentmemory
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;h3 id="what-it-really-argues"&gt;What it really argues
&lt;/h3&gt;&lt;p&gt;The bet underneath agentmemory is one sentence — &lt;strong&gt;&amp;ldquo;memory belongs in the infrastructure layer, not the agent.&amp;rdquo;&lt;/strong&gt; Instead of every agent writing its own memory, one MCP server fans out to all of them. Whatever Claude Code learns flows into the next Cursor session intact. The project started about 50 days earlier as a viral GitHub gist (1,050 stars) and is essentially that design doc rendered into code: &lt;a class="link" href="https://github.com/karpathy" target="_blank" rel="noopener"
 &gt;Karpathy&amp;rsquo;s LLM Wiki pattern&lt;/a&gt; plus confidence scoring, memory lifecycle management, a knowledge graph, and hybrid search.&lt;/p&gt;
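&lt;p&gt;The &amp;ldquo;hybrid search&amp;rdquo; piece is easy to sketch: score each memory with BM25 and with embedding similarity, normalize both score sets, and blend them. The weighting and fusion rule below are illustrative, not agentmemory&amp;rsquo;s actual implementation:&lt;/p&gt;

```python
def minmax(scores):
    # rescale a {doc: score} map into [0, 1]
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        return {doc: 1.0 for doc in scores}
    return {doc: (s - lo) / (hi - lo) for doc, s in scores.items()}

def hybrid_rank(bm25_scores, dense_scores, alpha=0.7):
    # alpha weights keyword (BM25) against embedding similarity; 0.7 is arbitrary
    b, d = minmax(bm25_scores), minmax(dense_scores)
    docs = set(b) | set(d)
    fused = {doc: alpha * b.get(doc, 0.0) + (1 - alpha) * d.get(doc, 0.0)
             for doc in docs}
    return sorted(fused, key=fused.get, reverse=True)
```

&lt;p&gt;Keyword scores catch exact identifiers (function names, file paths); embeddings catch paraphrases — fusing both is what drives the R@5 gap over BM25 alone.&lt;/p&gt;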
&lt;h2 id="2-agent-skills--senior-engineer-workflow-as-a-skill-bundle"&gt;2. agent-skills — Senior-Engineer Workflow as a Skill Bundle
&lt;/h2&gt;&lt;p&gt;&lt;a class="link" href="https://github.com/addyosmani/agent-skills" target="_blank" rel="noopener"
 &gt;addyosmani/agent-skills&lt;/a&gt; calls itself &lt;em&gt;&amp;ldquo;Production-grade engineering skills for AI coding agents.&amp;rdquo;&lt;/em&gt; Created 2026-02-15, ~33,500 stars, MIT. As of this writing it has roughly 14× the stars of agentmemory, making it the strongest current candidate for an agent-workflow standard.&lt;/p&gt;
&lt;h3 id="the-problem-it-solves-1"&gt;The problem it solves
&lt;/h3&gt;&lt;p&gt;&amp;ldquo;The agent writes code, but it doesn&amp;rsquo;t write code like a senior would.&amp;rdquo;&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Skips the spec&lt;/li&gt;
&lt;li&gt;Skips the tests&lt;/li&gt;
&lt;li&gt;Doesn&amp;rsquo;t think about security&lt;/li&gt;
&lt;li&gt;Drops a giant PR all at once&lt;/li&gt;
&lt;/ul&gt;
&lt;h3 id="a-six-stage-lifecycle"&gt;A six-stage lifecycle
&lt;/h3&gt;&lt;div class="highlight"&gt;&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-fallback" data-lang="fallback"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;DEFINE → PLAN → BUILD → VERIFY → REVIEW → SHIP
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;/spec /plan /build /test /review /ship
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;Each slash command corresponds to one lifecycle stage and auto-activates the right skills.&lt;/p&gt;
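&lt;p&gt;Each skill ships as a plain Markdown file. The shape below is a hedged sketch of what one looks like — the frontmatter fields and wording are illustrative, not copied from the repo:&lt;/p&gt;

```markdown
---
name: test-driven-development
description: Enforce Red-Green-Refactor before any implementation code
---

# Test-Driven Development

- Write a failing test first (Red)
- Write the minimum code to pass (Green)
- Refactor while keeping tests green
- No implementation code lands without a failing test preceding it
```

&lt;p&gt;Because the whole contract is Markdown, any agent that can read a file can enforce it.&lt;/p&gt;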
&lt;h3 id="the-20-skills-by-stage"&gt;The 20 skills, by stage
&lt;/h3&gt;&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Define&lt;/strong&gt;: idea-refine, spec-driven-development&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Plan&lt;/strong&gt;: planning-and-task-breakdown&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Build&lt;/strong&gt;: incremental-implementation, test-driven-development, context-engineering, source-driven-development, frontend-ui-engineering, api-and-interface-design&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Verify&lt;/strong&gt;: browser-testing-with-devtools, debugging-and-error-recovery&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Review&lt;/strong&gt;: code-review-and-quality, code-simplification, security-and-hardening, performance-optimization&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Ship&lt;/strong&gt;: git-workflow-and-versioning, ci-cd-and-automation, deprecation-and-migration, documentation-and-adrs, shipping-and-launch&lt;/li&gt;
&lt;/ul&gt;
&lt;h3 id="where-it-runs"&gt;Where it runs
&lt;/h3&gt;&lt;ul&gt;
&lt;li&gt;&lt;a class="link" href="https://www.anthropic.com/claude-code" target="_blank" rel="noopener"
 &gt;Claude Code&lt;/a&gt; (recommended, marketplace-installed): &lt;code&gt;/plugin marketplace add addyosmani/agent-skills&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;a class="link" href="https://cursor.com/" target="_blank" rel="noopener"
 &gt;Cursor&lt;/a&gt;: copy SKILL.md into &lt;code&gt;.cursor/rules/&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;a class="link" href="https://github.com/google-gemini/gemini-cli" target="_blank" rel="noopener"
 &gt;Gemini CLI&lt;/a&gt; · &lt;a class="link" href="https://windsurf.com/" target="_blank" rel="noopener"
 &gt;Windsurf&lt;/a&gt; · &lt;a class="link" href="https://opencode.ai/" target="_blank" rel="noopener"
 &gt;OpenCode&lt;/a&gt; · &lt;a class="link" href="https://github.com/features/copilot" target="_blank" rel="noopener"
 &gt;GitHub Copilot&lt;/a&gt; · &lt;a class="link" href="https://kiro.dev/" target="_blank" rel="noopener"
 &gt;Kiro IDE&lt;/a&gt; · &lt;a class="link" href="https://openai.com/codex/" target="_blank" rel="noopener"
 &gt;Codex&lt;/a&gt; — &lt;strong&gt;anything that reads Markdown works&lt;/strong&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;h3 id="agent-personas"&gt;Agent personas
&lt;/h3&gt;&lt;ul&gt;
&lt;li&gt;&lt;code&gt;code-reviewer&lt;/code&gt; — Senior staff engineer lens, &amp;ldquo;would a staff engineer approve this?&amp;rdquo;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;test-engineer&lt;/code&gt; — QA discipline, the Prove-It pattern&lt;/li&gt;
&lt;li&gt;&lt;code&gt;security-auditor&lt;/code&gt; — &lt;a class="link" href="https://owasp.org/" target="_blank" rel="noopener"
 &gt;OWASP&lt;/a&gt;, threat modeling&lt;/li&gt;
&lt;/ul&gt;
&lt;h3 id="what-it-really-argues-1"&gt;What it really argues
&lt;/h3&gt;&lt;p&gt;agent-skills&amp;rsquo; bet is &lt;strong&gt;&amp;ldquo;the difference between agents isn&amp;rsquo;t the model weights — it&amp;rsquo;s how strictly the workflow is enforced.&amp;rdquo;&lt;/strong&gt; TDD here isn&amp;rsquo;t &amp;ldquo;you can do TDD&amp;rdquo; — it&amp;rsquo;s &amp;ldquo;no Red-Green-Refactor, no code.&amp;rdquo; Code review isn&amp;rsquo;t a vibe; it&amp;rsquo;s five-axis review, 100-line size limits, explicit Nit/Optional/FYI severity labels. By expressing all of this in Markdown, it stays &lt;strong&gt;agent-agnostic&lt;/strong&gt; — the same skill bundle works in Claude, Cursor, and Gemini. 33K stars is the market saying this is the closest thing to a workflow standard right now.&lt;/p&gt;
&lt;h2 id="3-side-by-side"&gt;3. Side by side
&lt;/h2&gt;&lt;table&gt;
 &lt;thead&gt;
 &lt;tr&gt;
 &lt;th&gt;Dimension&lt;/th&gt;
 &lt;th&gt;agentmemory&lt;/th&gt;
 &lt;th&gt;agent-skills&lt;/th&gt;
 &lt;/tr&gt;
 &lt;/thead&gt;
 &lt;tbody&gt;
 &lt;tr&gt;
 &lt;td&gt;Author&lt;/td&gt;
 &lt;td&gt;rohitg00&lt;/td&gt;
 &lt;td&gt;addyosmani&lt;/td&gt;
 &lt;/tr&gt;
 &lt;tr&gt;
 &lt;td&gt;Form&lt;/td&gt;
 &lt;td&gt;TypeScript library + MCP server&lt;/td&gt;
 &lt;td&gt;Markdown skill bundle&lt;/td&gt;
 &lt;/tr&gt;
 &lt;tr&gt;
 &lt;td&gt;License&lt;/td&gt;
 &lt;td&gt;Apache 2.0&lt;/td&gt;
 &lt;td&gt;MIT&lt;/td&gt;
 &lt;/tr&gt;
 &lt;tr&gt;
 &lt;td&gt;Stars (2026-05)&lt;/td&gt;
 &lt;td&gt;~2,400&lt;/td&gt;
 &lt;td&gt;~33,500&lt;/td&gt;
 &lt;/tr&gt;
 &lt;tr&gt;
 &lt;td&gt;Created&lt;/td&gt;
 &lt;td&gt;2026-02-25&lt;/td&gt;
 &lt;td&gt;2026-02-15&lt;/td&gt;
 &lt;/tr&gt;
 &lt;tr&gt;
 &lt;td&gt;Domain&lt;/td&gt;
 &lt;td&gt;Memory / state infrastructure&lt;/td&gt;
 &lt;td&gt;Engineering workflow&lt;/td&gt;
 &lt;/tr&gt;
 &lt;tr&gt;
 &lt;td&gt;Decoupling mechanism&lt;/td&gt;
 &lt;td&gt;MCP standard&lt;/td&gt;
 &lt;td&gt;Markdown standard&lt;/td&gt;
 &lt;/tr&gt;
 &lt;/tbody&gt;
&lt;/table&gt;
&lt;h2 id="4-the-combined-picture--an-os-layer-for-agents"&gt;4. The Combined Picture — An OS Layer for Agents
&lt;/h2&gt;&lt;pre class="mermaid" style="visibility:hidden"&gt;flowchart LR
 M["Memory / state"] --&gt; AM["agentmemory"]
 W["Workflow / rules"] --&gt; AS["agent-skills"]
 Mo["Model"] --&gt; LLM["Claude / GPT / Gemini"]
 UI["UI"] --&gt; Tools["Claude Code / Cursor / Cline"]&lt;/pre&gt;&lt;p&gt;Three or four years ago &amp;ldquo;which IDE do you use?&amp;rdquo; was the deciding question. Now it&amp;rsquo;s becoming &lt;strong&gt;&amp;ldquo;what&amp;rsquo;s your memory and skills setup?&amp;rdquo;&lt;/strong&gt; Both projects deliberately decouple from any one model — &lt;a class="link" href="https://modelcontextprotocol.io/" target="_blank" rel="noopener"
 &gt;MCP&lt;/a&gt; for one, plain Markdown for the other — designed so that &lt;strong&gt;models can be swapped, but the memory and skills accumulate.&lt;/strong&gt;&lt;/p&gt;
&lt;h2 id="insights"&gt;Insights
&lt;/h2&gt;&lt;p&gt;The headline of this digest isn&amp;rsquo;t either tool individually — it&amp;rsquo;s that two links shared 30 seconds apart fill exactly two distinct slots in the agent OS layer. agentmemory pulls &lt;strong&gt;state&lt;/strong&gt; down into the infrastructure; agent-skills pulls &lt;strong&gt;process&lt;/strong&gt; down into the infrastructure. The fact that both decouple from models in similar ways — one MCP server, one Markdown bundle — is the same bet from two angles: models are interchangeable but memory and skills must compound. The 33K vs 2.4K stars gap probably isn&amp;rsquo;t about timing; it&amp;rsquo;s a signal that the workflow-standard candidate is consolidating faster than the memory-infrastructure candidate. &lt;strong&gt;Two open questions for next quarter&lt;/strong&gt; — does memory standardize on MCP, and do skill bundles like agent-skills become a new SaaS category inside IDE marketplaces? The decision point has already started shifting from IDE choice to memory and skill setup.&lt;/p&gt;
&lt;h2 id="references"&gt;References
&lt;/h2&gt;&lt;p&gt;&lt;strong&gt;Core repos&lt;/strong&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a class="link" href="https://github.com/rohitg00/agentmemory" target="_blank" rel="noopener"
 &gt;rohitg00/agentmemory&lt;/a&gt; · home: &lt;a class="link" href="https://agent-memory.dev" target="_blank" rel="noopener"
 &gt;agent-memory.dev&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a class="link" href="https://github.com/addyosmani/agent-skills" target="_blank" rel="noopener"
 &gt;addyosmani/agent-skills&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;strong&gt;Related agents and clients&lt;/strong&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a class="link" href="https://www.anthropic.com/claude-code" target="_blank" rel="noopener"
 &gt;Claude Code&lt;/a&gt; · &lt;a class="link" href="https://cursor.com/" target="_blank" rel="noopener"
 &gt;Cursor&lt;/a&gt; · &lt;a class="link" href="https://cline.bot/" target="_blank" rel="noopener"
 &gt;Cline&lt;/a&gt; · &lt;a class="link" href="https://windsurf.com/" target="_blank" rel="noopener"
 &gt;Windsurf&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a class="link" href="https://github.com/google-gemini/gemini-cli" target="_blank" rel="noopener"
 &gt;Gemini CLI&lt;/a&gt; · &lt;a class="link" href="https://openai.com/codex/" target="_blank" rel="noopener"
 &gt;Codex&lt;/a&gt; · &lt;a class="link" href="https://opencode.ai/" target="_blank" rel="noopener"
 &gt;OpenCode&lt;/a&gt; · &lt;a class="link" href="https://block.github.io/goose/" target="_blank" rel="noopener"
 &gt;Goose&lt;/a&gt; · &lt;a class="link" href="https://roocode.com/" target="_blank" rel="noopener"
 &gt;Roo Code&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a class="link" href="https://github.com/features/copilot" target="_blank" rel="noopener"
 &gt;GitHub Copilot&lt;/a&gt; · &lt;a class="link" href="https://kiro.dev/" target="_blank" rel="noopener"
 &gt;Kiro IDE&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;strong&gt;Protocols and standards&lt;/strong&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a class="link" href="https://modelcontextprotocol.io/" target="_blank" rel="noopener"
 &gt;Model Context Protocol (MCP)&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a class="link" href="https://owasp.org/" target="_blank" rel="noopener"
 &gt;OWASP&lt;/a&gt; — basis for the security-auditor persona&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;strong&gt;Benchmarks and embeddings&lt;/strong&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Paper: &lt;a class="link" href="https://arxiv.org/abs/2410.10813" target="_blank" rel="noopener"
 &gt;LongMemEval (arXiv:2410.10813, ICLR 2025)&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a class="link" href="https://huggingface.co/sentence-transformers/all-MiniLM-L6-v2" target="_blank" rel="noopener"
 &gt;&lt;code&gt;sentence-transformers/all-MiniLM-L6-v2&lt;/code&gt;&lt;/a&gt; — local embedding model used by agentmemory&lt;/li&gt;
&lt;/ul&gt;</description></item><item><title>Three arxiv Papers That Drifted Through the Chat — Multiagent Debate, MIA, Husserlian Phenomenology</title><link>https://ice-ice-bear.github.io/posts/2026-05-06-arxiv-papers-pick-multiagent-debate-mia-husserl/</link><pubDate>Wed, 06 May 2026 00:00:00 +0900</pubDate><guid>https://ice-ice-bear.github.io/posts/2026-05-06-arxiv-papers-pick-multiagent-debate-mia-husserl/</guid><description>&lt;img src="https://ice-ice-bear.github.io/" alt="Featured image of post Three arxiv Papers That Drifted Through the Chat — Multiagent Debate, MIA, Husserlian Phenomenology" /&gt;&lt;h2 id="overview"&gt;Overview
&lt;/h2&gt;&lt;p&gt;Three &lt;a class="link" href="https://arxiv.org/" target="_blank" rel="noopener"
 &gt;arXiv&lt;/a&gt; papers landed within a few days of each other. Different eras, different topics, different methods — but read together they answer one question, &lt;strong&gt;&amp;ldquo;where do further gains in AI agent reasoning come from?&amp;rdquo;&lt;/strong&gt;, from three angles: cooperation, persistence, and structure. Right when single-model reasoning gains are visibly plateauing, this is a useful tour of where the next round&amp;rsquo;s keywords are coming from.&lt;/p&gt;
&lt;pre class="mermaid" style="visibility:hidden"&gt;graph TD
 Q["Where do reasoning gains come from?"] --&gt; Coop["Cooperation"]
 Q --&gt; Pers["Persistence"]
 Q --&gt; Struct["Structure"]

 Coop --&gt; P1["Multiagent Debate &amp;lt;br/&amp;gt; 2305.14325 (2023)"]
 Pers --&gt; P2["Memory Intelligence Agent &amp;lt;br/&amp;gt; 2604.04503 (2026)"]
 Struct --&gt; P3["Husserl + Active Inference &amp;lt;br/&amp;gt; 2208.09058 (2022)"]&lt;/pre&gt;&lt;table&gt;
 &lt;thead&gt;
 &lt;tr&gt;
 &lt;th&gt;#&lt;/th&gt;
 &lt;th&gt;Paper&lt;/th&gt;
 &lt;th&gt;Year&lt;/th&gt;
 &lt;th&gt;One-line summary&lt;/th&gt;
 &lt;/tr&gt;
 &lt;/thead&gt;
 &lt;tbody&gt;
 &lt;tr&gt;
 &lt;td&gt;1&lt;/td&gt;
 &lt;td&gt;&lt;a class="link" href="https://arxiv.org/abs/2305.14325" target="_blank" rel="noopener"
 &gt;Multiagent Debate&lt;/a&gt;&lt;/td&gt;
 &lt;td&gt;2023&lt;/td&gt;
 &lt;td&gt;Multiple LLM instances debating each other improve reasoning&lt;/td&gt;
 &lt;/tr&gt;
 &lt;tr&gt;
 &lt;td&gt;2&lt;/td&gt;
 &lt;td&gt;&lt;a class="link" href="https://arxiv.org/abs/2604.04503" target="_blank" rel="noopener"
 &gt;Memory Intelligence Agent (MIA)&lt;/a&gt;&lt;/td&gt;
 &lt;td&gt;2026&lt;/td&gt;
 &lt;td&gt;Deep Research Agents need an evolving memory system&lt;/td&gt;
 &lt;/tr&gt;
 &lt;tr&gt;
 &lt;td&gt;3&lt;/td&gt;
 &lt;td&gt;&lt;a class="link" href="https://arxiv.org/abs/2208.09058" target="_blank" rel="noopener"
 &gt;Husserlian Phenomenology + Active Inference&lt;/a&gt;&lt;/td&gt;
 &lt;td&gt;2022&lt;/td&gt;
 &lt;td&gt;The phenomenology of consciousness can be mapped to a computational model&lt;/td&gt;
 &lt;/tr&gt;
 &lt;/tbody&gt;
&lt;/table&gt;
&lt;h2 id="1-multiagent-debate--230514325"&gt;1. Multiagent Debate — 2305.14325
&lt;/h2&gt;&lt;p&gt;&lt;a class="link" href="https://yilundu.github.io/" target="_blank" rel="noopener"
 &gt;Yilun Du&lt;/a&gt;, Shuang Li, &lt;a class="link" href="https://groups.csail.mit.edu/vision/torralbalab/" target="_blank" rel="noopener"
 &gt;Antonio Torralba&lt;/a&gt;, &lt;a class="link" href="https://cocosci.mit.edu/josh" target="_blank" rel="noopener"
 &gt;Joshua B. Tenenbaum&lt;/a&gt;, &lt;a class="link" href="https://research.google/people/igor-mordatch/" target="_blank" rel="noopener"
 &gt;Igor Mordatch&lt;/a&gt; — &lt;a class="link" href="https://www.mit.edu/" target="_blank" rel="noopener"
 &gt;MIT&lt;/a&gt; (2023-05). Accepted at &lt;a class="link" href="https://iclr.cc/Conferences/2025" target="_blank" rel="noopener"
 &gt;ICLR 2025&lt;/a&gt;.&lt;/p&gt;
&lt;h3 id="the-idea"&gt;The idea
&lt;/h3&gt;&lt;p&gt;Instead of asking one LLM to reason harder, &lt;strong&gt;have several LLM instances propose answers and debate.&lt;/strong&gt; Across multiple rounds they converge on a shared answer. It is essentially &lt;a class="link" href="https://en.wikipedia.org/wiki/Marvin_Minsky" target="_blank" rel="noopener"
 &gt;Marvin Minsky&lt;/a&gt;&amp;rsquo;s &lt;a class="link" href="https://en.wikipedia.org/wiki/Society_of_Mind" target="_blank" rel="noopener"
 &gt;Society of Mind&lt;/a&gt; approach ported to LLMs.&lt;/p&gt;
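&lt;p&gt;The mechanics are simple enough to sketch. Below, each &amp;ldquo;agent&amp;rdquo; is a stub function standing in for an LLM call; the loop structure — answer independently, read the peers&amp;rsquo; answers, revise, repeat — is the part the paper contributes:&lt;/p&gt;

```python
def debate(agents, question, rounds=2):
    # round 0: every agent answers independently
    answers = [agent(question, []) for agent in agents]
    for _ in range(rounds):
        # each agent revises after reading the other agents' current answers
        answers = [agent(question, answers[:i] + answers[i + 1:])
                   for i, agent in enumerate(agents)]
    return answers

def make_agent(prior):
    # stub LLM: holds a prior answer but defers to its peers' majority vote
    def agent(question, peer_answers):
        if peer_answers:
            return max(peer_answers, key=peer_answers.count)
        return prior
    return agent
```

&lt;p&gt;With real LLM calls, the majority-vote stub becomes a prompt of the form &amp;ldquo;here are the other agents&amp;rsquo; answers; update yours.&amp;rdquo;&lt;/p&gt;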
&lt;h3 id="contribution"&gt;Contribution
&lt;/h3&gt;&lt;ul&gt;
&lt;li&gt;A multi-agent debate framework that improves mathematical and strategic reasoning&lt;/li&gt;
&lt;li&gt;Reduces hallucinations, improves factual validity&lt;/li&gt;
&lt;li&gt;Works on black-box LLMs as-is with the same prompt for every task — no fine-tuning required&lt;/li&gt;
&lt;li&gt;The first clean result that lifts reasoning by &lt;strong&gt;inter-instance cooperation&lt;/strong&gt; rather than single-model scaling&lt;/li&gt;
&lt;/ul&gt;
&lt;h3 id="why-now"&gt;Why now
&lt;/h3&gt;&lt;p&gt;Although it is a May 2023 paper, the 2026 vantage point makes it more relevant. Single-model reasoning gains are visibly plateauing, and this dovetails with the &lt;strong&gt;parallel tool call&lt;/strong&gt; push in &lt;a class="link" href="https://openai.com/index/advancing-voice-intelligence-with-new-models-in-the-api" target="_blank" rel="noopener"
 &gt;GPT-Realtime-2&lt;/a&gt;. It is also the theoretical justification for why infrastructure tools like agent-skills are designed assuming &lt;strong&gt;many agents running concurrently.&lt;/strong&gt;&lt;/p&gt;
&lt;h2 id="2-memory-intelligence-agent-mia--260404503"&gt;2. Memory Intelligence Agent (MIA) — 2604.04503
&lt;/h2&gt;&lt;p&gt;Jingyang Qiao et al. (2026-04). A memory architecture paper aimed squarely at the &lt;a class="link" href="https://openai.com/index/introducing-deep-research/" target="_blank" rel="noopener"
 &gt;Deep Research Agent&lt;/a&gt; family.&lt;/p&gt;
&lt;h3 id="the-idea-1"&gt;The idea
&lt;/h3&gt;&lt;p&gt;The weak link in Deep Research Agents — LLM reasoning combined with external tools — is memory. Conventional approaches (retrieving past trajectories) are inefficient, with storage and retrieval costs blowing up. MIA solves it with a &lt;strong&gt;Manager-Planner-Executor&lt;/strong&gt; three-tier architecture, plus non-parametric memory and two parametric agents.&lt;/p&gt;
&lt;pre class="mermaid" style="visibility:hidden"&gt;flowchart LR
 M["Manager &amp;lt;br/&amp;gt; (memory compression/management)"] --&gt; P["Planner &amp;lt;br/&amp;gt; (search planning)"]
 P --&gt; E["Executor &amp;lt;br/&amp;gt; (information analysis)"]
 E --&gt;|"trajectory"| M
 M -.-&gt;|"non-parametric ↔ parametric"| P
 M -.-&gt;|"non-parametric ↔ parametric"| E&lt;/pre&gt;&lt;h3 id="contribution-1"&gt;Contribution
&lt;/h3&gt;&lt;ul&gt;
&lt;li&gt;Non-parametric memory storing &lt;strong&gt;compressed search trajectories&lt;/strong&gt;&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Alternating reinforcement learning&lt;/strong&gt; — Planner and Executor are reinforced in alternation, separating search-plan synthesis from information analysis&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Test-time learning&lt;/strong&gt; — the Planner updates on-the-fly without pausing inference&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Bidirectional conversion between parametric and non-parametric memory&lt;/strong&gt; for efficient memory evolution&lt;/li&gt;
&lt;li&gt;Strong results across eleven benchmarks&lt;/li&gt;
&lt;/ul&gt;
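&lt;p&gt;The control flow of the three-tier split can be sketched with stubs. In the paper the Planner and Executor are trained models and the Manager&amp;rsquo;s compression is learned; everything below is an illustrative skeleton, not MIA&amp;rsquo;s code:&lt;/p&gt;

```python
def research(question, planner, executor, memory, steps=3):
    # Planner proposes memory-conditioned queries, Executor analyzes
    # results, and the Manager compresses the trajectory into memory.
    trajectory = []
    for _ in range(steps):
        query = planner(question, memory)   # search planning
        finding = executor(query)           # information analysis
        trajectory.append((query, finding))
    memory.append(compress(trajectory))     # Manager: compress and store
    return trajectory, memory

def compress(trajectory):
    # stand-in for the Manager's learned trajectory compression
    return " | ".join(query for query, _ in trajectory)
```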
&lt;h3 id="why-now-1"&gt;Why now
&lt;/h3&gt;&lt;p&gt;This is the academic background for tools like &lt;a class="link" href="https://github.com/rohitg00/agentmemory" target="_blank" rel="noopener"
 &gt;agentmemory&lt;/a&gt;. The fact that agentmemory and this paper landed within days of each other reflects the industry consensus that &lt;strong&gt;memory is the key differentiator for the next round of agents.&lt;/strong&gt; The Manager-Planner-Executor split looks like a strong candidate for a de facto standard pattern in future multi-agent frameworks. It should be read alongside the rise of standard tool interfaces like &lt;a class="link" href="https://modelcontextprotocol.io/" target="_blank" rel="noopener"
 &gt;MCP&lt;/a&gt;.&lt;/p&gt;
&lt;h2 id="3-husserlian-phenomenology--active-inference--220809058"&gt;3. Husserlian Phenomenology + Active Inference — 2208.09058
&lt;/h2&gt;&lt;p&gt;Mahault Albarracin, Riddhi J. Pitliya, &lt;a class="link" href="https://maxwelljdramstead.com/" target="_blank" rel="noopener"
 &gt;Maxwell J. D. Ramstead&lt;/a&gt;, Jeffrey Yoshimi (2022-08). A mapping of &lt;a class="link" href="https://www.fil.ion.ucl.ac.uk/~karl/" target="_blank" rel="noopener"
 &gt;Karl Friston&lt;/a&gt;&amp;rsquo;s &lt;a class="link" href="https://en.wikipedia.org/wiki/Free_energy_principle" target="_blank" rel="noopener"
 &gt;active inference&lt;/a&gt; framework onto &lt;a class="link" href="https://plato.stanford.edu/entries/husserl/" target="_blank" rel="noopener"
 &gt;Edmund Husserl&lt;/a&gt;&amp;rsquo;s &lt;a class="link" href="https://plato.stanford.edu/entries/phenomenology/" target="_blank" rel="noopener"
 &gt;phenomenology&lt;/a&gt;.&lt;/p&gt;
&lt;h3 id="the-idea-2"&gt;The idea
&lt;/h3&gt;&lt;p&gt;&lt;strong&gt;Phenomenology&lt;/strong&gt; is the rigorous descriptive study of conscious experience. The paper maps Husserl&amp;rsquo;s descriptions of consciousness onto the mathematical building blocks of &lt;strong&gt;active inference&lt;/strong&gt; — the neuroscience framework in which the brain predicts the world through a generative model.&lt;/p&gt;
&lt;h3 id="contribution-2"&gt;Contribution
&lt;/h3&gt;&lt;ul&gt;
&lt;li&gt;Connects Husserl&amp;rsquo;s theory of time consciousness — retention/protention — to active inference&lt;/li&gt;
&lt;li&gt;A theoretical bridge between phenomenological description and computational neuroscience models&lt;/li&gt;
&lt;li&gt;Reinterprets the structure of consciousness as components of a &lt;strong&gt;generative model&lt;/strong&gt;&lt;/li&gt;
&lt;li&gt;A push for &lt;strong&gt;computational phenomenology&lt;/strong&gt; as an interdisciplinary field&lt;/li&gt;
&lt;/ul&gt;
&lt;h3 id="why-now-2"&gt;Why now
&lt;/h3&gt;&lt;p&gt;This is the most abstract of the three but possibly the most interesting. As AI agents acquire &amp;ldquo;memory&amp;rdquo; and &amp;ldquo;reasoning,&amp;rdquo; &lt;strong&gt;how an agent structures its experience&lt;/strong&gt; becomes a philosophical question again.&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;MIA&amp;rsquo;s evolving memory ≈ Husserl&amp;rsquo;s retention/protention?&lt;/li&gt;
&lt;li&gt;Multiagent debate ≈ the self-reflective structure of consciousness?&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;The paper was shared as a direct PDF link (&lt;code&gt;/pdf/&lt;/code&gt;), which suggests &lt;strong&gt;somebody actually read the full text.&lt;/strong&gt; Likely a senior in the chat is betting that the next move for AI agents comes from cognitive science.&lt;/p&gt;
&lt;h2 id="reading-the-three-together"&gt;Reading the three together
&lt;/h2&gt;&lt;p&gt;The three papers point in the same direction: &lt;strong&gt;single-LLM limits → inter-instance cooperation + evolving memory + borrowed structure of consciousness.&lt;/strong&gt;&lt;/p&gt;
&lt;table&gt;
 &lt;thead&gt;
 &lt;tr&gt;
 &lt;th&gt;Axis&lt;/th&gt;
 &lt;th&gt;Answer&lt;/th&gt;
 &lt;th&gt;Paper&lt;/th&gt;
 &lt;/tr&gt;
 &lt;/thead&gt;
 &lt;tbody&gt;
 &lt;tr&gt;
 &lt;td&gt;Cooperation&lt;/td&gt;
 &lt;td&gt;Multi-instance debate&lt;/td&gt;
 &lt;td&gt;Multiagent Debate (2023)&lt;/td&gt;
 &lt;/tr&gt;
 &lt;tr&gt;
 &lt;td&gt;Persistence&lt;/td&gt;
 &lt;td&gt;Compressed/evolving memory&lt;/td&gt;
 &lt;td&gt;MIA (2026)&lt;/td&gt;
 &lt;/tr&gt;
 &lt;tr&gt;
 &lt;td&gt;Structure&lt;/td&gt;
 &lt;td&gt;Time consciousness → generative model&lt;/td&gt;
 &lt;td&gt;Husserl + Active Inference (2022)&lt;/td&gt;
 &lt;/tr&gt;
 &lt;/tbody&gt;
&lt;/table&gt;
&lt;p&gt;The chat&amp;rsquo;s pick of the week accidentally forms a clean three-layer stack. Set alongside agentmemory + agent-skills (previous post), it shows that &lt;strong&gt;research, tooling, and practice are converging in the same direction.&lt;/strong&gt;&lt;/p&gt;
&lt;h2 id="insights"&gt;Insights
&lt;/h2&gt;&lt;p&gt;The three papers come from different years and different topics, but read together they point at the same consensus — the way past the single-LLM reasoning plateau is not one more size class of model, but &lt;strong&gt;inter-instance cooperation, evolving memory, and explicit modeling of the structure of experience.&lt;/strong&gt; Multiagent Debate is the first clean answer to &amp;ldquo;how do we get instances to cooperate&amp;rdquo;; MIA answers &amp;ldquo;how do we accumulate that cooperation across time&amp;rdquo;; the Husserl + Active Inference mapping offers a longer-range coordinate for &amp;ldquo;what structure that accumulation should ultimately resemble.&amp;rdquo; The fact that practical tools like &lt;a class="link" href="https://github.com/rohitg00/agentmemory" target="_blank" rel="noopener"
 &gt;agentmemory&lt;/a&gt; and agent-skills surface alongside these three papers within days is itself a signal — &lt;strong&gt;research, tooling, and practice are converging in the same direction.&lt;/strong&gt; The differentiator in the next round is much more likely to be cooperation topology, memory evolution policy, and experience-structure modeling than raw model size.&lt;/p&gt;
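&lt;p&gt;The cooperation axis can be sketched as a short loop in the spirit of Multiagent Debate: several instances answer independently, then each revises its answer after reading the others&amp;rsquo;. This is a minimal sketch under stated assumptions — &lt;code&gt;ask_model&lt;/code&gt; is a hypothetical placeholder for a real LLM call, and the prompt format is invented, not the paper&amp;rsquo;s.&lt;/p&gt;

```python
# Minimal multiagent-debate loop (sketch, not the paper's exact setup):
# n_agents answer independently, then for n_rounds each agent revises
# its answer after seeing the other agents' current answers.
def ask_model(prompt: str) -> str:
    # Hypothetical placeholder: a real system would call an LLM API here.
    return f"answer to: {prompt[:30]}"

def debate(question: str, n_agents: int = 3, n_rounds: int = 2) -> list[str]:
    # Round 0: independent answers.
    answers = [ask_model(question) for _ in range(n_agents)]
    for _ in range(n_rounds):
        revised = []
        for i in range(n_agents):
            others = [a for j, a in enumerate(answers) if j != i]
            prompt = (
                f"Question: {question}\n"
                f"Other agents answered: {others}\n"
                f"Your previous answer: {answers[i]}\n"
                "Revise your answer considering the others."
            )
            revised.append(ask_model(prompt))
        answers = revised  # all agents update simultaneously each round
    return answers
```

&lt;p&gt;Even in this toy form, the knobs that matter are visible: how many instances, how many rounds, and &lt;em&gt;who reads whom&lt;/em&gt; — the &amp;ldquo;cooperation topology&amp;rdquo; named above.&lt;/p&gt;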
&lt;h2 id="references"&gt;References
&lt;/h2&gt;&lt;p&gt;&lt;strong&gt;Papers&lt;/strong&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a class="link" href="https://arxiv.org/abs/2305.14325" target="_blank" rel="noopener"
 &gt;Improving Factuality and Reasoning in Language Models through Multiagent Debate (2305.14325)&lt;/a&gt; — Du, Li, Torralba, Tenenbaum, Mordatch (&lt;a class="link" href="https://www.mit.edu/" target="_blank" rel="noopener"
 &gt;MIT&lt;/a&gt;, 2023)&lt;/li&gt;
&lt;li&gt;&lt;a class="link" href="https://arxiv.org/abs/2604.04503" target="_blank" rel="noopener"
 &gt;Memory Intelligence Agent (2604.04503)&lt;/a&gt; — Qiao et al. (2026)&lt;/li&gt;
&lt;li&gt;&lt;a class="link" href="https://arxiv.org/abs/2208.09058" target="_blank" rel="noopener"
 &gt;Mapping Husserlian Phenomenology onto Active Inference (2208.09058)&lt;/a&gt; — Albarracin, Pitliya, Ramstead, Yoshimi (2022)&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;strong&gt;Related concepts&lt;/strong&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a class="link" href="https://en.wikipedia.org/wiki/Society_of_Mind" target="_blank" rel="noopener"
 &gt;Society of Mind&lt;/a&gt; — &lt;a class="link" href="https://en.wikipedia.org/wiki/Marvin_Minsky" target="_blank" rel="noopener"
 &gt;Marvin Minsky&lt;/a&gt;&amp;rsquo;s multi-agent theory of cognition&lt;/li&gt;
&lt;li&gt;&lt;a class="link" href="https://openai.com/index/introducing-deep-research/" target="_blank" rel="noopener"
 &gt;Deep Research Agent&lt;/a&gt; — OpenAI&amp;rsquo;s tool-using agent system&lt;/li&gt;
&lt;li&gt;&lt;a class="link" href="https://en.wikipedia.org/wiki/Free_energy_principle" target="_blank" rel="noopener"
 &gt;Active Inference / Free Energy Principle&lt;/a&gt; — &lt;a class="link" href="https://www.fil.ion.ucl.ac.uk/~karl/" target="_blank" rel="noopener"
 &gt;Karl Friston&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a class="link" href="https://plato.stanford.edu/entries/husserl/" target="_blank" rel="noopener"
 &gt;Husserlian phenomenology (SEP)&lt;/a&gt; · &lt;a class="link" href="https://plato.stanford.edu/entries/phenomenology/" target="_blank" rel="noopener"
 &gt;Phenomenology (SEP)&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a class="link" href="https://modelcontextprotocol.io/" target="_blank" rel="noopener"
 &gt;Model Context Protocol (MCP)&lt;/a&gt; — emerging tool-interface standard&lt;/li&gt;
&lt;li&gt;&lt;a class="link" href="https://iclr.cc/Conferences/2025" target="_blank" rel="noopener"
 &gt;ICLR 2025&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;strong&gt;Background reading&lt;/strong&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a class="link" href="https://arxiv.org/" target="_blank" rel="noopener"
 &gt;arxiv.org&lt;/a&gt; — preprint server&lt;/li&gt;
&lt;li&gt;&lt;a class="link" href="https://yilundu.github.io/" target="_blank" rel="noopener"
 &gt;Yilun Du&lt;/a&gt; · &lt;a class="link" href="https://cocosci.mit.edu/josh" target="_blank" rel="noopener"
 &gt;Joshua Tenenbaum&lt;/a&gt; · &lt;a class="link" href="https://groups.csail.mit.edu/vision/torralbalab/" target="_blank" rel="noopener"
 &gt;Antonio Torralba&lt;/a&gt; · &lt;a class="link" href="https://research.google/people/igor-mordatch/" target="_blank" rel="noopener"
 &gt;Igor Mordatch&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a class="link" href="https://maxwelljdramstead.com/" target="_blank" rel="noopener"
 &gt;Maxwell J. D. Ramstead&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a class="link" href="https://openai.com/index/advancing-voice-intelligence-with-new-models-in-the-api" target="_blank" rel="noopener"
 &gt;GPT-Realtime-2 (parallel tool calls)&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;</description></item></channel></rss>