<?xml version="1.0" encoding="utf-8" standalone="yes"?><rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>Ultra-Plan on ICE-ICE-BEAR-BLOG</title><link>https://ice-ice-bear.github.io/tags/ultra-plan/</link><description>Recent content in Ultra-Plan on ICE-ICE-BEAR-BLOG</description><generator>Hugo -- gohugo.io</generator><language>en</language><lastBuildDate>Fri, 10 Apr 2026 00:00:00 +0900</lastBuildDate><atom:link href="https://ice-ice-bear.github.io/tags/ultra-plan/index.xml" rel="self" type="application/rss+xml"/><item><title>Claude Code Ultra Plan and the AI Dev Ecosystem — YC, RunPod, and Open-Source Tools</title><link>https://ice-ice-bear.github.io/posts/2026-04-10-claude-ultraplan/</link><pubDate>Fri, 10 Apr 2026 00:00:00 +0900</pubDate><guid>https://ice-ice-bear.github.io/posts/2026-04-10-claude-ultraplan/</guid><description>&lt;img src="https://ice-ice-bear.github.io/" alt="Featured image of post Claude Code Ultra Plan and the AI Dev Ecosystem — YC, RunPod, and Open-Source Tools" /&gt;&lt;p&gt;An exploration of how Claude Code Ultra Plan is reshaping the AI development workflow — alongside Y Combinator&amp;rsquo;s AI-native startup model, RunPod serverless GPU deployment, and noteworthy developer tools in the ecosystem.&lt;/p&gt;
&lt;h2 id="ultra-plan-multi-agent-coding-workflow"&gt;Ultra Plan: Multi-Agent Coding Workflow
&lt;/h2&gt;&lt;p&gt;A YouTube video from 午後5時 walked through the &lt;strong&gt;Claude Code Ultra Plan&lt;/strong&gt; workflow in detail. The core idea is a multi-agent architecture: three exploration agents independently analyze the codebase, and a single critique agent consolidates and validates their findings.&lt;/p&gt;
&lt;pre class="mermaid" style="visibility:hidden"&gt;flowchart TD
 U["User Request (CLI)"] --&gt; P["Ultra Plan Generation"]
 P --&gt; W["Review / Edit Plan in Web UI"]
 W --&gt; E1["Exploration Agent #1"]
 W --&gt; E2["Exploration Agent #2"]
 W --&gt; E3["Exploration Agent #3"]
 E1 --&gt; C["Critique Agent"]
 E2 --&gt; C
 E3 --&gt; C
 C --&gt; R["Final Execution"]
 R --&gt; T["Send to Terminal or Run in Browser"]&lt;/pre&gt;&lt;p&gt;The reported outcome: &lt;strong&gt;tasks that took 15 minutes now take 5&lt;/strong&gt;. This isn&amp;rsquo;t just a speed improvement — the exploration agents run in parallel and try different approaches, which means edge cases that a single linear pass might miss tend to get caught. The web UI and desktop integration enable a smooth loop: generate a plan in the CLI, review and refine it in the browser, then send it back to the terminal for execution.&lt;/p&gt;
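&lt;p&gt;The fan-out/fan-in shape of that pipeline can be sketched in plain Python. This is a minimal illustration under stated assumptions, not Ultra Plan&amp;rsquo;s actual implementation: &lt;code&gt;explore&lt;/code&gt; and &lt;code&gt;critique&lt;/code&gt; are hypothetical stand-ins for real model calls.&lt;/p&gt;
&lt;pre&gt;&lt;code class="language-python"&gt;# Minimal sketch of the fan-out/fan-in pattern: three exploration passes
# run in parallel, and a single critique pass consolidates their findings.
# explore() and critique() are hypothetical stand-ins for real agent calls.
from concurrent.futures import ThreadPoolExecutor

def explore(agent_id, task):
    # Stand-in for one exploration agent analyzing the codebase.
    return f"agent-{agent_id}: findings for {task}"

def critique(findings):
    # Stand-in for the critique agent validating and merging results.
    return " / ".join(sorted(findings))

task = "refactor the auth module"
with ThreadPoolExecutor(max_workers=3) as pool:
    findings = list(pool.map(lambda i: explore(i, task), [1, 2, 3]))

plan = critique(findings)
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;The point of the shape, not the stubs: exploration is embarrassingly parallel, while critique is a deliberate single bottleneck that sees all three results at once.&lt;/p&gt;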
&lt;h2 id="y-combinator-and-the-ai-native-startup-model"&gt;Y Combinator and the AI-Native Startup Model
&lt;/h2&gt;&lt;p&gt;Y Combinator&amp;rsquo;s &amp;ldquo;The New Way To Build A Startup&amp;rdquo; video was even more striking. Anthropic engineers themselves write code using Claude Code, and the video describes &lt;strong&gt;individual engineers running 3 to 8 Claude instances simultaneously&lt;/strong&gt;. The claim that YC companies are shipping &amp;ldquo;dramatically faster&amp;rdquo; isn&amp;rsquo;t hype — it reflects a structural change in how software gets built.&lt;/p&gt;
&lt;p&gt;What this signals is a shift in the developer&amp;rsquo;s role: from &amp;ldquo;person who writes code&amp;rdquo; to &amp;ldquo;orchestrator of AI agents.&amp;rdquo; The core skill is no longer typing out implementations line by line — it&amp;rsquo;s distributing tasks across multiple AI instances and verifying the results.&lt;/p&gt;
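&lt;p&gt;That orchestrator loop (dispatch tasks to several instances, then verify each result before accepting it) can be sketched as follows. Both &lt;code&gt;run_instance&lt;/code&gt; and &lt;code&gt;verify&lt;/code&gt; are hypothetical placeholders, not a real Claude API.&lt;/p&gt;
&lt;pre&gt;&lt;code class="language-python"&gt;# Rough sketch of the orchestrator role: dispatch tasks to several
# instances in parallel, then accept only results that pass verification.
# run_instance() and verify() are hypothetical placeholders.
from concurrent.futures import ThreadPoolExecutor

def run_instance(task):
    # Stand-in for sending one task to one Claude instance.
    return {"task": task, "tests_pass": task != "flaky feature"}

def verify(result):
    # The orchestrator's core skill: checking output, not typing it.
    return result["tests_pass"]

tasks = ["fix login bug", "add pagination", "flaky feature"]
with ThreadPoolExecutor(max_workers=len(tasks)) as pool:
    results = list(pool.map(run_instance, tasks))

accepted = [r["task"] for r in results if verify(r)]
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;The human sits at the &lt;code&gt;verify&lt;/code&gt; step: the engineering judgment moves from writing each patch to deciding which results to merge.&lt;/p&gt;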
&lt;h2 id="runpod-serving-llms-on-gpu-cloud"&gt;RunPod: Serving LLMs on GPU Cloud
&lt;/h2&gt;&lt;p&gt;&lt;a class="link" href="https://runpod.io" target="_blank" rel="noopener"
 &gt;RunPod&lt;/a&gt; is a GPU cloud infrastructure provider — also an OpenAI infrastructure partner. A Korean blog post covering &lt;strong&gt;RunPod Serverless + vLLM&lt;/strong&gt; for LLM deployment was one of the more practical guides I found.&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-python" data-lang="python"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="c1"&gt;# Serving an LLM on RunPod Serverless with vLLM (conceptual)&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="c1"&gt;# Example: gemma-2-9b-it&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="c1"&gt;# 1. Package vLLM + model into a Docker image&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="c1"&gt;# 2. Create a RunPod Serverless endpoint&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="c1"&gt;# 3. Call it via OpenAI-compatible API&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;openai&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="n"&gt;client&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;openai&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;OpenAI&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="n"&gt;api_key&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;&amp;#34;your-runpod-api-key&amp;#34;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="n"&gt;base_url&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;&amp;#34;https://api.runpod.ai/v2/&lt;/span&gt;&lt;span class="si"&gt;{endpoint_id}&lt;/span&gt;&lt;span class="s2"&gt;/openai/v1&amp;#34;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="n"&gt;response&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;client&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;chat&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;completions&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;create&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;&amp;#34;google/gemma-2-9b-it&amp;#34;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="n"&gt;messages&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;[{&lt;/span&gt;&lt;span class="s2"&gt;&amp;#34;role&amp;#34;&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;&amp;#34;user&amp;#34;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;&amp;#34;content&amp;#34;&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;&amp;#34;Hello!&amp;#34;&lt;/span&gt;&lt;span class="p"&gt;}]&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;Serverless GPU pods mean you pay nothing during idle time, and the OpenAI-compatible API means existing code only needs a &lt;code&gt;base_url&lt;/code&gt; swap. The barrier to self-hosting an LLM has dropped considerably.&lt;/p&gt;
&lt;h2 id="dev-tools-worth-noting"&gt;Dev Tools Worth Noting
&lt;/h2&gt;&lt;p&gt;&lt;a class="link" href="https://github.com/siddharthvaddem/openscreen" target="_blank" rel="noopener"
 &gt;OpenScreen&lt;/a&gt; (27,132 stars) is a free, open-source alternative to Screen Studio — no watermarks, no subscription. Useful for developer tutorials and demo recordings.&lt;/p&gt;
&lt;p&gt;The &lt;strong&gt;self.md ui-design plugin&lt;/strong&gt; for Claude Code bundles nine UI/UX design skills into a single plugin — from design system setup to component architecture. It&amp;rsquo;s a good example of how Claude Code&amp;rsquo;s plugin ecosystem is maturing into something more than just task shortcuts.&lt;/p&gt;
&lt;p&gt;I also compared two different projects that both go by the name &lt;strong&gt;HarnessKit&lt;/strong&gt; — both in the AI agent harness space but with distinct approaches. Worth keeping an eye on as the agentic tooling landscape continues to fragment and consolidate.&lt;/p&gt;
&lt;h2 id="insights"&gt;Insights
&lt;/h2&gt;&lt;p&gt;The overarching theme is &lt;strong&gt;orchestration&lt;/strong&gt;. Claude Code Ultra Plan decomposes coding tasks into an exploration-critique-execution pipeline. YC startups run multiple AI instances in parallel. RunPod simplifies LLM serving with serverless GPU. The competitive edge is shifting from individual model performance to how well you compose and manage these systems.&lt;/p&gt;
&lt;p&gt;The developer role shift is equally significant. Time spent directly writing code is becoming less important than time spent distributing tasks across AI agents and verifying their output. Tools like OpenScreen and self.md reflect this same trend — automating other parts of the development workflow (demo recording, UI design) so the developer can focus on orchestration and judgment.&lt;/p&gt;</description></item></channel></rss>