<?xml version="1.0" encoding="utf-8" standalone="yes"?><rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>Planning on ICE-ICE-BEAR-BLOG</title><link>https://ice-ice-bear.github.io/tags/planning/</link><description>Recent content in Planning on ICE-ICE-BEAR-BLOG</description><generator>Hugo -- gohugo.io</generator><language>en</language><lastBuildDate>Tue, 07 Apr 2026 00:00:00 +0900</lastBuildDate><atom:link href="https://ice-ice-bear.github.io/tags/planning/index.xml" rel="self" type="application/rss+xml"/><item><title>Claude Code Power User Guide — Ultra Planning, Karpathy's Obsidian RAG, and Context Optimization</title><link>https://ice-ice-bear.github.io/posts/2026-04-07-claude-code-power-user/</link><pubDate>Tue, 07 Apr 2026 00:00:00 +0900</pubDate><guid>https://ice-ice-bear.github.io/posts/2026-04-07-claude-code-power-user/</guid><description>&lt;img src="https://ice-ice-bear.github.io/" alt="Featured image of post Claude Code Power User Guide — Ultra Planning, Karpathy's Obsidian RAG, and Context Optimization" /&gt;&lt;h2 id="overview"&gt;Overview
&lt;/h2&gt;&lt;p&gt;Claude Code has rapidly evolved from a simple terminal-based coding assistant into a sophisticated development environment. This post covers four key developments that, taken together, represent a shift in how power users interact with AI coding agents: ultra plan mode for web-based planning, Karpathy&amp;rsquo;s surprisingly simple Obsidian RAG system, self-evolving memory for coding agents, and practical rules for context window optimization. These aren&amp;rsquo;t just incremental improvements — they address fundamental bottlenecks in the AI-assisted development workflow.&lt;/p&gt;
&lt;h2 id="ultra-plan-mode--planning-at-web-speed"&gt;Ultra Plan Mode — Planning at Web Speed
&lt;/h2&gt;&lt;p&gt;The first major development is &amp;ldquo;ultra plan mode,&amp;rdquo; which offloads Claude Code&amp;rsquo;s planning phase to the web interface. The core insight is simple but powerful: planning and implementation have fundamentally different computational profiles.&lt;/p&gt;
&lt;p&gt;When you plan locally in the terminal, Claude Code must work within the constraints of the CLI environment — sequential token generation, limited visual output, and the same context window that will later be used for implementation. Ultra plan mode breaks this coupling.&lt;/p&gt;
&lt;h3 id="how-it-works"&gt;How It Works
&lt;/h3&gt;&lt;ol&gt;
&lt;li&gt;&lt;strong&gt;Initiate planning&lt;/strong&gt; in the terminal as usual&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Planning transfers to Claude Code on the web&lt;/strong&gt;, where it runs with a dedicated context&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Web UI presents structured output&lt;/strong&gt;: context summaries, architecture diagrams, new file specifications, and modification plans&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Interactive review&lt;/strong&gt;: leave emoji reactions and comments on individual plan elements&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Approve the plan&lt;/strong&gt; on the web, which teleports execution back to the terminal&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;The speed difference is significant — roughly 1 minute on the web versus 4+ minutes locally. But speed isn&amp;rsquo;t the only benefit. The web interface enables a richer planning format that the terminal simply cannot display well. You get visual structure, expandable sections, and the ability to annotate specific parts of the plan before implementation begins.&lt;/p&gt;
&lt;h3 id="why-this-matters"&gt;Why This Matters
&lt;/h3&gt;&lt;p&gt;This is an early example of &lt;strong&gt;multi-surface AI workflows&lt;/strong&gt; — the idea that different phases of a task should happen in different environments optimized for that phase. Planning is a visual, iterative activity that benefits from rich UI. Implementation is a sequential, file-system-oriented activity that belongs in the terminal. Ultra plan mode respects this distinction.&lt;/p&gt;
&lt;h2 id="karpathys-obsidian-rag--the-anti-rag"&gt;Karpathy&amp;rsquo;s Obsidian RAG — The Anti-RAG
&lt;/h2&gt;&lt;p&gt;Andrej Karpathy&amp;rsquo;s approach to personal knowledge management with LLMs is notable for what it doesn&amp;rsquo;t use: no vector database, no embeddings, no chunking strategy, no retrieval pipeline. Instead, it uses Obsidian as a structured file system and Claude Code as the query layer.&lt;/p&gt;
&lt;h3 id="the-architecture"&gt;The Architecture
&lt;/h3&gt;&lt;pre class="mermaid" style="visibility:hidden"&gt;flowchart TD
 A["External Data Sources&amp;lt;br/&amp;gt;Articles, Papers, Repos"] --&gt; B["Data Ingestion&amp;lt;br/&amp;gt;Scripts and Automation"]
 B --&gt; C["Obsidian Vault&amp;lt;br/&amp;gt;Raw Directory"]
 C --&gt; D["Organized Notes&amp;lt;br/&amp;gt;Structured Markdown"]
 D --&gt; E["Claude Code&amp;lt;br/&amp;gt;Query and Reasoning"]
 E --&gt; F["Synthesized Knowledge&amp;lt;br/&amp;gt;Connections and Insights"]
 F -.-&gt; D

 style A fill:#f9f,stroke:#333
 style C fill:#bbf,stroke:#333
 style E fill:#bfb,stroke:#333&lt;/pre&gt;&lt;h3 id="why-it-works-without-embeddings"&gt;Why It Works Without Embeddings
&lt;/h3&gt;&lt;p&gt;Traditional RAG systems solve a specific problem: given a query, find the most relevant chunks from a large corpus. This requires embeddings to create a semantic search space. But Karpathy&amp;rsquo;s system sidesteps this entirely by relying on two things:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;File system structure as implicit indexing&lt;/strong&gt; — a well-organized directory tree with descriptive filenames and folders acts as a human-readable index. Claude Code can traverse this structure and read file names to narrow down relevant content without embeddings.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;LLM context windows are large enough&lt;/strong&gt; — with 200K+ token context windows, you can feed substantial amounts of raw text directly to the model. The LLM itself performs the &amp;ldquo;retrieval&amp;rdquo; by reading and reasoning over the content.&lt;/p&gt;
&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;This approach is essentially free to run, requires no infrastructure, and produces results comparable to traditional RAG for personal-scale knowledge bases. The tradeoff is that it doesn&amp;rsquo;t scale to millions of documents — but for a solo developer or small team, it rarely needs to.&lt;/p&gt;
&lt;h3 id="the-key-insight"&gt;The Key Insight
&lt;/h3&gt;&lt;p&gt;The file system is an underrated data structure for LLM interaction. A thoughtfully organized directory with clear naming conventions provides enough structure for an LLM to navigate efficiently. You don&amp;rsquo;t need a database when your file system &lt;em&gt;is&lt;/em&gt; the database.&lt;/p&gt;
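&lt;p&gt;A minimal sketch of that pattern in Python: build a tiny stand-in vault, narrow candidate files by path name alone, and concatenate the matches into a single blob for the model. The vault layout and the &lt;code&gt;retrieve&lt;/code&gt; helper are illustrative assumptions, not part of Karpathy&amp;rsquo;s actual setup.&lt;/p&gt;

```python
"""Filesystem-as-retrieval sketch: no embeddings, no vector store.
Folder and file names act as the index; matching files are read whole."""
import tempfile
from pathlib import Path

# Build a tiny example vault (a stand-in for a real Obsidian directory).
vault = Path(tempfile.mkdtemp()) / "vault"
(vault / "papers").mkdir(parents=True)
(vault / "papers" / "attention-is-all-you-need.md").write_text("Notes on transformers.")
(vault / "papers" / "lora-finetuning.md").write_text("Notes on LoRA.")
(vault / "daily").mkdir()
(vault / "daily" / "2026-04-07.md").write_text("Scratch notes.")

def retrieve(vault_dir, query_terms):
    """Narrow by path name first, then read only the matching files."""
    hits = []
    for path in sorted(vault_dir.rglob("*.md")):
        name = str(path.relative_to(vault_dir)).lower()
        if any(term in name for term in query_terms):
            hits.append(f"## {path.name}\n{path.read_text()}")
    return "\n\n".join(hits)  # this blob goes straight into the model context

context = retrieve(vault, ["transformer", "attention"])
print(context)
```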
&lt;h2 id="self-evolving-agent-memory"&gt;Self-Evolving Agent Memory
&lt;/h2&gt;&lt;p&gt;Building on Karpathy&amp;rsquo;s knowledge-base concept, the second referenced video applies the same pattern to Claude Code&amp;rsquo;s own memory — but with a critical twist. Instead of ingesting external data, the system captures and structures internal data from coding conversations.&lt;/p&gt;
&lt;h3 id="from-external-data-to-internal-knowledge"&gt;From External Data to Internal Knowledge
&lt;/h3&gt;&lt;p&gt;Karpathy&amp;rsquo;s original pattern:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Input&lt;/strong&gt;: articles, papers, repos (external)&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Storage&lt;/strong&gt;: Obsidian vault&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Query&lt;/strong&gt;: Claude Code reads the vault&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;The adapted pattern for coding agents:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Input&lt;/strong&gt;: conversation history, decisions made, patterns discovered (internal)&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Storage&lt;/strong&gt;: structured memory files in the project&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Query&lt;/strong&gt;: Claude Code reads its own memory on startup&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;This is fundamentally different from CLAUDE.md, which is a static instruction file. Self-evolving memory updates itself based on what happens during development sessions. When Claude Code discovers that a particular approach works well for your codebase, or learns about an architectural decision, that knowledge persists across sessions.&lt;/p&gt;
&lt;h3 id="practical-implementation"&gt;Practical Implementation
&lt;/h3&gt;&lt;p&gt;The memory system mirrors Karpathy&amp;rsquo;s vault structure:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Raw captures&lt;/strong&gt; from conversations (what was discussed, what was decided)&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Structured notes&lt;/strong&gt; organized by topic (architecture decisions, debugging patterns, user preferences)&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Cross-references&lt;/strong&gt; between related pieces of knowledge&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;The result is a coding agent that genuinely gets better at working with your specific codebase over time, rather than starting fresh with each conversation.&lt;/p&gt;
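&lt;p&gt;A sketch of what such a memory store could look like. The &lt;code&gt;.claude/memory/&lt;/code&gt; layout, the &lt;code&gt;capture&lt;/code&gt; helper, and the startup loader are hypothetical names for illustration, not an actual Claude Code format.&lt;/p&gt;

```python
"""Self-evolving memory sketch: raw captures append chronologically,
while distilled lessons accumulate in per-topic files that the agent
would re-read at the start of each session."""
import tempfile
from datetime import date
from pathlib import Path

project = Path(tempfile.mkdtemp())
memory = project / ".claude" / "memory"   # hypothetical layout
(memory / "raw").mkdir(parents=True)
(memory / "topics").mkdir()

def capture(session_summary, topic, lesson):
    """Append a raw capture, then file the distilled lesson under its topic."""
    raw = memory / "raw" / f"{date.today().isoformat()}.md"
    with raw.open("a") as f:
        f.write(f"- {session_summary}\n")
    topic_file = memory / "topics" / f"{topic}.md"
    with topic_file.open("a") as f:
        f.write(f"- {lesson}\n")

def load_memory():
    """What the agent reads on startup: topic files only, raw stays cold."""
    parts = []
    for path in sorted((memory / "topics").glob("*.md")):
        parts.append(f"# {path.stem}\n{path.read_text()}")
    return "\n".join(parts)

capture("Discussed retry logic in the API client",
        "architecture-decisions",
        "HTTP retries live in one wrapper, never in call sites")
startup_context = load_memory()
print(startup_context)
```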
&lt;h2 id="context-optimization--the-12-rules"&gt;Context Optimization — The 12 Rules
&lt;/h2&gt;&lt;p&gt;Context window management is the most underappreciated skill in AI-assisted development. Every file read, every tool call, every message consumes tokens. When context fills up with noise, the model&amp;rsquo;s attention degrades and output quality drops.&lt;/p&gt;
&lt;h3 id="the-context-bloat-problem"&gt;The Context Bloat Problem
&lt;/h3&gt;&lt;pre class="mermaid" style="visibility:hidden"&gt;flowchart LR
 A["Fresh Context&amp;lt;br/&amp;gt;100% Available"] --&gt; B["File Reads&amp;lt;br/&amp;gt;Large Files Consume Space"]
 B --&gt; C["Tool Outputs&amp;lt;br/&amp;gt;Error Messages and Logs"]
 C --&gt; D["Conversation History&amp;lt;br/&amp;gt;Back-and-Forth Messages"]
 D --&gt; E["Degraded Context&amp;lt;br/&amp;gt;Attention Spread Thin"]

 style A fill:#bfb,stroke:#333
 style E fill:#f99,stroke:#333&lt;/pre&gt;&lt;h3 id="key-rules-worth-highlighting"&gt;Key Rules Worth Highlighting
&lt;/h3&gt;&lt;p&gt;&lt;strong&gt;Rule 1: Shorten CLAUDE.md&lt;/strong&gt; — The difference between a 910-line CLAUDE.md and a 33-line one is approximately 4% of the context window. That sounds small, but it&amp;rsquo;s loaded on every single conversation. Over hundreds of sessions, that overhead compounds. Keep CLAUDE.md focused on what the agent needs to know for &lt;em&gt;every&lt;/em&gt; task, and move specialized knowledge into topic-specific files that are loaded on demand.&lt;/p&gt;
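&lt;p&gt;A back-of-envelope check of the 4% figure, assuming roughly 10 tokens per line and a 200K-token window (both assumptions, not measured values):&lt;/p&gt;

```python
"""Rough estimate of per-session CLAUDE.md overhead.
TOKENS_PER_LINE and CONTEXT_WINDOW are assumptions for illustration."""
TOKENS_PER_LINE = 10      # rough average for prose-style instruction files
CONTEXT_WINDOW = 200_000  # assumed window size

def overhead(lines):
    return lines * TOKENS_PER_LINE / CONTEXT_WINDOW

big = overhead(910)   # 910-line CLAUDE.md
small = overhead(33)  # 33-line CLAUDE.md
print(f"910 lines: {big:.1%} of context, every session")
print(f"33 lines:  {small:.1%} of context, every session")
print(f"saved:     {big - small:.1%}")
```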
&lt;p&gt;&lt;strong&gt;Rule 2: The 50% Threshold&lt;/strong&gt; — Add an instruction telling Claude to suggest starting a new conversation or using sub-agents when context exceeds 50%. This is counterintuitive — most users try to push through in a single session. But a fresh context with a clear, specific task consistently outperforms a bloated context trying to handle everything.&lt;/p&gt;
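&lt;p&gt;One way such an instruction might be phrased in CLAUDE.md; the exact wording and threshold are illustrative:&lt;/p&gt;

```markdown
## Context hygiene

- When context usage exceeds roughly 50%, pause and suggest either
  starting a fresh conversation with a short summary of progress, or
  delegating the remaining work to a sub-agent with a narrow brief.
```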
&lt;h3 id="the-mental-model"&gt;The Mental Model
&lt;/h3&gt;&lt;p&gt;Think of context as working memory, not storage. You wouldn&amp;rsquo;t try to hold an entire codebase in your head while debugging a single function. Similarly, an LLM works best when its context contains only what&amp;rsquo;s relevant to the current task.&lt;/p&gt;
&lt;p&gt;The 12 rules collectively push toward a principle: &lt;strong&gt;make the agent actively work to keep its context clean&lt;/strong&gt;, rather than passively accumulating everything it touches.&lt;/p&gt;
&lt;h2 id="connecting-the-dots"&gt;Connecting the Dots
&lt;/h2&gt;&lt;p&gt;These four topics form a coherent system:&lt;/p&gt;
&lt;table&gt;
 &lt;thead&gt;
 &lt;tr&gt;
 &lt;th&gt;Component&lt;/th&gt;
 &lt;th&gt;Problem Solved&lt;/th&gt;
 &lt;th&gt;Mechanism&lt;/th&gt;
 &lt;/tr&gt;
 &lt;/thead&gt;
 &lt;tbody&gt;
 &lt;tr&gt;
 &lt;td&gt;Ultra Plan Mode&lt;/td&gt;
 &lt;td&gt;Planning is slow and limited in terminal&lt;/td&gt;
 &lt;td&gt;Multi-surface workflow&lt;/td&gt;
 &lt;/tr&gt;
 &lt;tr&gt;
 &lt;td&gt;Obsidian RAG&lt;/td&gt;
 &lt;td&gt;Knowledge retrieval is overengineered&lt;/td&gt;
 &lt;td&gt;File system as database&lt;/td&gt;
 &lt;/tr&gt;
 &lt;tr&gt;
 &lt;td&gt;Self-Evolving Memory&lt;/td&gt;
 &lt;td&gt;Agent forgets between sessions&lt;/td&gt;
 &lt;td&gt;Structured conversation capture&lt;/td&gt;
 &lt;/tr&gt;
 &lt;tr&gt;
 &lt;td&gt;Context Optimization&lt;/td&gt;
 &lt;td&gt;Context fills with noise&lt;/td&gt;
 &lt;td&gt;Active context management&lt;/td&gt;
 &lt;/tr&gt;
 &lt;/tbody&gt;
&lt;/table&gt;
&lt;p&gt;The common thread is &lt;strong&gt;simplicity through structure&lt;/strong&gt;. Karpathy doesn&amp;rsquo;t need a vector database because his file system is well-organized. Ultra plan mode doesn&amp;rsquo;t need a complex orchestration system because it cleanly separates planning from implementation. Context optimization doesn&amp;rsquo;t need fancy token management because a few clear rules keep things lean.&lt;/p&gt;
&lt;p&gt;For developers building AI-assisted workflows, the takeaway is clear: before reaching for complex infrastructure, ask whether better organization of what you already have might solve the problem.&lt;/p&gt;
&lt;h2 id="references"&gt;References
&lt;/h2&gt;&lt;ul&gt;
&lt;li&gt;&lt;a class="link" href="https://www.youtube.com/watch?v=example1" target="_blank" rel="noopener"
 &gt;Planning In Claude Code Just Got a Huge Upgrade&lt;/a&gt; — nate herk&lt;/li&gt;
&lt;li&gt;&lt;a class="link" href="https://www.youtube.com/watch?v=example2" target="_blank" rel="noopener"
 &gt;I Built Self-Evolving Claude Code Memory w/ Karpathy&amp;rsquo;s LLM Knowledge Bases&lt;/a&gt; — nate herk&lt;/li&gt;
&lt;li&gt;&lt;a class="link" href="https://www.youtube.com/watch?v=example3" target="_blank" rel="noopener"
 &gt;Karpathy Just Replaced RAG With Obsidian + Claude Code&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a class="link" href="https://www.youtube.com/watch?v=example4" target="_blank" rel="noopener"
 &gt;How I Save Over 50% of My Claude Code Context (12 Rules)&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;</description></item></channel></rss>