<?xml version="1.0" encoding="utf-8" standalone="yes"?><rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>Devlog on ICE-ICE-BEAR-BLOG</title><link>https://ice-ice-bear.github.io/categories/devlog/</link><description>Recent content in Devlog on ICE-ICE-BEAR-BLOG</description><generator>Hugo -- gohugo.io</generator><language>en</language><lastBuildDate>Mon, 29 Jun 2026 00:00:00 +0900</lastBuildDate><atom:link href="https://ice-ice-bear.github.io/categories/devlog/index.xml" rel="self" type="application/rss+xml"/><item><title>hybrid-image-search-demo #22 — Inpaint Editor, Implied-Person Model Injection, and Login Gating</title><link>https://ice-ice-bear.github.io/posts/2026-06-29-hybrid-search-dev22/</link><pubDate>Mon, 29 Jun 2026 00:00:00 +0900</pubDate><guid>https://ice-ice-bear.github.io/posts/2026-06-29-hybrid-search-dev22/</guid><description>&lt;img src="https://ice-ice-bear.github.io/" alt="Featured image of post hybrid-image-search-demo #22 — Inpaint Editor, Implied-Person Model Injection, and Login Gating" /&gt;&lt;h2 id="overview"&gt;Overview
&lt;/h2&gt;&lt;p&gt;This cycle pushed the image-generation demo from a &amp;ldquo;one-shot generator&amp;rdquo; toward an &amp;ldquo;editable workstation.&amp;rdquo; I added a rectangle mask tool to the inpaint editor, taught the LLM to detect people the user only &lt;em&gt;implied&lt;/em&gt; (not named) and auto-inject the right model reference, and introduced allowlist-based login gating so the demo could be opened to external users. The cycle&amp;rsquo;s 54 commits fall into six threads: inpaint, model injection, block prompts, auth, library UX, and infra.&lt;/p&gt;
&lt;p&gt;&lt;a class="link" href="https://ice-ice-bear.github.io/posts/2026-05-29-hybrid-search-dev21/" &gt;Previous: #21 — hybrid-image-search-demo dev21&lt;/a&gt;&lt;/p&gt;
&lt;pre class="mermaid" style="visibility:hidden"&gt;graph TD
 U["User prompt"] --&gt; LLM["LLM intent analysis (intent_person)"]
 LLM --&gt;|"implied person detected"| INJ["auto-inject model reference"]
 INJ --&gt; GEN["generation engine (OpenAI / Gemini)"]
 GEN --&gt; NORM["PNG normalization + byte-format filenames"]
 NORM --&gt; GAL["library / gallery"]
 GAL --&gt; EDIT["inpaint editor (rectangle mask)"]
 EDIT --&gt; GEN
 AUTH["account allowlist gating"] -.-&gt;|"only if login passes"| GAL&lt;/pre&gt;&lt;hr&gt;
&lt;h2 id="inpaint-editor-a-mask-tool-for-partial-regeneration"&gt;Inpaint Editor: A Mask Tool for Partial Regeneration
&lt;/h2&gt;&lt;p&gt;The biggest feature was the inpaint editor. Redrawing only part of a generated image requires a mask that says &lt;em&gt;where&lt;/em&gt; to redraw, so this cycle added a &lt;strong&gt;rectangle mask tool&lt;/strong&gt;. I also placed an inpaint entry button in the action bar, so you can pick an image from the gallery and jump straight into the partial-edit flow.&lt;/p&gt;
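&lt;p&gt;To make the idea of a rectangle mask concrete, here is a minimal sketch in plain Python. It is not the editor&amp;rsquo;s actual code: the function name &lt;code&gt;make_rect_mask&lt;/code&gt; and the convention that 1 means &amp;ldquo;redraw this pixel&amp;rdquo; are assumptions for illustration, and a given generation engine may expect a different mask encoding.&lt;/p&gt;

```python
def make_rect_mask(width, height, x0, y0, x1, y1):
    """Return a height-by-width grid with 1 inside the rectangle and 0 elsewhere."""
    # Order the corners so a drag in any direction yields the same rectangle,
    # then clamp them to the image bounds.
    left, right = sorted((x0, x1))
    top, bottom = sorted((y0, y1))
    left, right = max(0, left), min(width, right)
    top, bottom = max(0, top), min(height, bottom)

    # Start fully "keep" (0) and mark the selected region as "redraw" (1).
    mask = [[0] * width for _ in range(height)]
    for y in range(top, bottom):
        for x in range(left, right):
            mask[y][x] = 1
    return mask
```

&lt;p&gt;Sorting the corners means the mask is the same whether the user drags from top-left to bottom-right or the other way around.&lt;/p&gt;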
&lt;h3 id="debugging"&gt;Debugging
&lt;/h3&gt;&lt;p&gt;Wiring up the inpaint entry button also surfaced a &lt;strong&gt;re-upload bug&lt;/strong&gt; — re-uploading an image didn&amp;rsquo;t cleanly reset the previous state. At one point an unbalanced JSX block in &lt;code&gt;InpaintEditor&lt;/code&gt; broke the build (&lt;code&gt;fix(frontend): close unbalanced InpaintEditor JSX block&lt;/code&gt;), a side effect of the component growing more conditional render branches. I then refined the inpaint comparison view and gallery controls so the original and the inpainted result sit side by side as an A/B pair.&lt;/p&gt;
&lt;hr&gt;
&lt;h2 id="implied-person-model-injection-trusting-llm-intent"&gt;Implied-Person Model Injection: Trusting LLM Intent
&lt;/h2&gt;&lt;p&gt;The most &amp;ldquo;AI-native&amp;rdquo; work this cycle was &lt;strong&gt;implied-person model injection&lt;/strong&gt;. When a user doesn&amp;rsquo;t explicitly name a model — saying &amp;ldquo;sitting in a café&amp;rdquo; rather than &amp;ldquo;place this person in…&amp;rdquo; — the LLM reads that intent (&lt;code&gt;intent_person&lt;/code&gt;) and automatically injects the appropriate model reference into the prompt.&lt;/p&gt;
&lt;h3 id="implementation"&gt;Implementation
&lt;/h3&gt;&lt;p&gt;It evolved in three steps. First came the base logic that auto-injects a model reference for implied-person prompts (&lt;code&gt;feat: auto-inject model ref for implied-person prompts&lt;/code&gt;). Then I changed it to &lt;strong&gt;trust the LLM&amp;rsquo;s &lt;code&gt;intent_person&lt;/code&gt; verdict directly&lt;/strong&gt; (&lt;code&gt;fix: trust LLM intent_person for model injection&lt;/code&gt;): an early heuristic re-check was occasionally overriding the LLM, so the LLM&amp;rsquo;s verdict now stands on its own. Finally I broadened the &lt;code&gt;intent_person&lt;/code&gt; definition itself so more phrasings of an implied person get recognized (&lt;code&gt;fix: broaden LLM intent_person definition&lt;/code&gt;).&lt;/p&gt;
&lt;p&gt;The lesson was about the trade-off between trusting an LLM&amp;rsquo;s structured output and filtering it again with rules. Trusting the model&amp;rsquo;s intent verdict, while widening the definition, produced the more natural UX.&lt;/p&gt;
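&lt;p&gt;The &amp;ldquo;trust the verdict&amp;rdquo; rule can be sketched roughly as follows. Only the field name &lt;code&gt;intent_person&lt;/code&gt; comes from the commits above. The &lt;code&gt;IntentVerdict&lt;/code&gt; shape, the &lt;code&gt;named_model_ref&lt;/code&gt; field, both function names, and the bracketed prompt suffix are hypothetical stand-ins, not the demo&amp;rsquo;s real schema.&lt;/p&gt;

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class IntentVerdict:
    """Hypothetical shape of the LLM's structured intent output."""
    intent_person: bool                    # the LLM judged that a person is implied
    named_model_ref: Optional[str] = None  # set only when the user named a model


def resolve_model_ref(verdict, default_model_ref):
    """Pick the model reference to inject, trusting the LLM's verdict as-is."""
    if verdict.named_model_ref is not None:
        return verdict.named_model_ref     # an explicit choice always wins
    if verdict.intent_person:
        return default_model_ref           # implied person: auto-inject
    return None                            # no person implied: inject nothing


def build_prompt(user_prompt, verdict, default_model_ref):
    """Append the chosen reference (if any) to the user's prompt."""
    ref = resolve_model_ref(verdict, default_model_ref)
    if ref is None:
        return user_prompt
    return f"{user_prompt} [model_ref: {ref}]"
```

&lt;p&gt;Note what is absent: there is no second keyword or regex check after the verdict. In this sketch, that absence is the whole design choice.&lt;/p&gt;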
&lt;hr&gt;
&lt;h2 id="block-based-prompts-and-dual-engine-preview"&gt;Block-Based Prompts and Dual-Engine Preview
&lt;/h2&gt;&lt;p&gt;Prompt authoring changed substantially too. &lt;strong&gt;Block-based prompts&lt;/strong&gt; treat a prompt not as one long string but as a composition of reusable blocks (&lt;code&gt;feat(model-gen): block-based prompts + dual-engine preview/select&lt;/code&gt;). The dual-engine preview generates OpenAI and Gemini previews simultaneously, and the user selects one of them.&lt;/p&gt;
&lt;p&gt;Follow-ups added a block-as-template UX, Korean localization, and a background generation lifecycle (&lt;code&gt;block-as-template UX + KO localization + bg lifecycle&lt;/code&gt;), surfaced the fully &lt;em&gt;resolved&lt;/em&gt; prompt on the preview wait screen (&lt;code&gt;show resolved prompt on preview wait screen&lt;/code&gt;), grouped resumed previews into per-pair summary blocks, and made each pair&amp;rsquo;s resolved blocks a collapsible summary.&lt;/p&gt;
&lt;hr&gt;
&lt;h2 id="login-gating-opening-the-demo-to-outsiders"&gt;Login Gating: Opening the Demo to Outsiders
&lt;/h2&gt;&lt;p&gt;To open the demo to external users, I reworked auth. The core is &lt;strong&gt;allowlist-based login gating&lt;/strong&gt;: Google login itself is allowed, but only accounts on an explicit allowlist actually enter the app (&lt;code&gt;feat(auth): gate Google login behind an explicit account allowlist&lt;/code&gt;). As part of the same work, a specific designer account (joonghodesign) was added to the allowlist.&lt;/p&gt;
&lt;p&gt;Alongside, I collapsed a two-account structure (&lt;code&gt;get rid of two accounts&lt;/code&gt;), added a confirmed logout flow (&lt;code&gt;feat(ui): add confirmed logout flow&lt;/code&gt;), and removed a default Korean race directive baked into the UI (&lt;code&gt;chore: remove default Korean race directive&lt;/code&gt;) so the demo no longer assumes a particular region/race as the default.&lt;/p&gt;
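&lt;p&gt;The allowlist check itself can be very small. The sketch below assumes the server has already verified the Google ID token and extracted its &lt;code&gt;email&lt;/code&gt; and &lt;code&gt;email_verified&lt;/code&gt; claims. The function names and the example address are hypothetical, and the real gate may key on something other than email.&lt;/p&gt;

```python
def normalize_email(email):
    """Lowercase and trim so allowlist matching is case-insensitive."""
    return email.strip().lower()


def is_allowed(email, email_verified, allowlist):
    """Return True only for verified accounts that appear on the allowlist."""
    if not email_verified:
        return False
    allowed = {normalize_email(entry) for entry in allowlist}
    return normalize_email(email) in allowed


# Hypothetical allowlist; a real deployment would load it from configuration.
ALLOWLIST = ["designer@example.com"]
```

&lt;p&gt;Denying by default keeps the failure mode safe: an account missing from the list is blocked even though its Google login succeeded.&lt;/p&gt;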
&lt;pre class="mermaid" style="visibility:hidden"&gt;graph LR
 A["Google login attempt"] --&gt; B{"account on allowlist?"}
 B --&gt;|Yes| C["enter app"]
 B --&gt;|No| D["access blocked"]&lt;/pre&gt;&lt;hr&gt;
&lt;h2 id="library--gallery-ux-and-infra-cleanup"&gt;Library / Gallery UX and Infra Cleanup
&lt;/h2&gt;&lt;p&gt;The library moved out of a separate panel into a &lt;strong&gt;tab switcher inside General mode&lt;/strong&gt; (&lt;code&gt;merge library panel into General mode with tab switcher&lt;/code&gt;), letting model and product references coexist in one General tab. I added prompt search over generated images and a flow to regenerate failed model generations in place from the library.&lt;/p&gt;
&lt;p&gt;On infra, I made the S3 reference-key cache build in the background at startup so the &lt;strong&gt;app no longer hangs on launch&lt;/strong&gt; (&lt;code&gt;build S3 ref-key cache in background so startup doesn't hang&lt;/code&gt;), and added telemetry that records how long callers wait to acquire the Gemini semaphore. Generated images are now named by their real byte format rather than a hardcoded extension, and all are normalized to PNG. Finally, I lowered OpenAI image quality from high to medium to tune cost and latency.&lt;/p&gt;
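&lt;p&gt;Naming a file by its real byte format usually means looking at the leading &amp;ldquo;magic bytes&amp;rdquo; instead of trusting an extension. The post does not say how the demo detects the format, so the following is only a standard-library sketch with hypothetical function names. The PNG normalization step would need an imaging library and is not shown.&lt;/p&gt;

```python
def sniff_image_extension(data):
    """Guess a file extension from the leading magic bytes, or None if unknown."""
    if data.startswith(b"\x89PNG\r\n\x1a\n"):
        return "png"
    if data.startswith(b"\xff\xd8\xff"):
        return "jpg"
    if data.startswith((b"GIF87a", b"GIF89a")):
        return "gif"
    # WebP is a RIFF container: "RIFF", a 4-byte size, then "WEBP".
    if data[:4] == b"RIFF" and data[8:12] == b"WEBP":
        return "webp"
    return None


def filename_for(stem, data, fallback="bin"):
    """Name a file after what its bytes actually are, not a hardcoded extension."""
    ext = sniff_image_extension(data) or fallback
    return f"{stem}.{ext}"
```

&lt;p&gt;With a check like this, an engine that returns JPEG bytes is never saved under a misleading &lt;code&gt;.png&lt;/code&gt; name.&lt;/p&gt;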
&lt;hr&gt;
&lt;h2 id="commit-log"&gt;Commit Log
&lt;/h2&gt;&lt;table&gt;
 &lt;thead&gt;
 &lt;tr&gt;
 &lt;th&gt;Area&lt;/th&gt;
 &lt;th&gt;Key work&lt;/th&gt;
 &lt;/tr&gt;
 &lt;/thead&gt;
 &lt;tbody&gt;
 &lt;tr&gt;
 &lt;td&gt;Inpaint&lt;/td&gt;
 &lt;td&gt;rectangle mask tool, action-bar entry + re-upload fix, comparison/gallery controls&lt;/td&gt;
 &lt;/tr&gt;
 &lt;tr&gt;
 &lt;td&gt;Model injection&lt;/td&gt;
 &lt;td&gt;auto-inject model ref for implied persons, trust LLM &lt;code&gt;intent_person&lt;/code&gt;, broaden definition&lt;/td&gt;
 &lt;/tr&gt;
 &lt;tr&gt;
 &lt;td&gt;Block prompts&lt;/td&gt;
 &lt;td&gt;block-based prompts + dual-engine preview, template UX, per-pair summaries&lt;/td&gt;
 &lt;/tr&gt;
 &lt;tr&gt;
 &lt;td&gt;Auth&lt;/td&gt;
 &lt;td&gt;allowlist-based Google login gating, confirmed logout, remove default race directive&lt;/td&gt;
 &lt;/tr&gt;
 &lt;tr&gt;
 &lt;td&gt;Library UX&lt;/td&gt;
 &lt;td&gt;General-tab consolidation, prompt search, regenerate failed gens, liked-gallery sync&lt;/td&gt;
 &lt;/tr&gt;
 &lt;tr&gt;
 &lt;td&gt;Infra&lt;/td&gt;
 &lt;td&gt;background S3 cache build, Gemini semaphore telemetry, PNG normalization, OpenAI quality tuning&lt;/td&gt;
 &lt;/tr&gt;
 &lt;/tbody&gt;
&lt;/table&gt;
&lt;hr&gt;
&lt;h2 id="insights"&gt;Insights
&lt;/h2&gt;&lt;p&gt;Two currents run through this cycle. One is &lt;strong&gt;editability&lt;/strong&gt;: inpaint masks, block prompts, in-place regeneration of failures — all moving away from &amp;ldquo;generate once, done&amp;rdquo; toward &amp;ldquo;grab the result and keep refining it.&amp;rdquo; That mirrors how the value of generative tools shifts from single-shot quality to the smoothness of the iterative edit loop.&lt;/p&gt;
&lt;p&gt;The other is &lt;strong&gt;robustness for going public&lt;/strong&gt;: allowlist gating, confirmed logout, non-blocking background S3 caching, and semaphore-wait instrumentation are exactly what you need when crossing from &amp;ldquo;a demo I use&amp;rdquo; to &amp;ldquo;a demo I show others.&amp;rdquo; The decision to trust the LLM&amp;rsquo;s &lt;code&gt;intent_person&lt;/code&gt;, accepting the model&amp;rsquo;s structured verdict instead of re-filtering it with a heuristic, is a clean example of a trade-off you keep meeting in modern agent design.&lt;/p&gt;</description></item></channel></rss>