<?xml version="1.0" encoding="utf-8" standalone="yes"?><rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>Searxng on ICE-ICE-BEAR-BLOG</title><link>https://ice-ice-bear.github.io/tags/searxng/</link><description>Recent content in Searxng on ICE-ICE-BEAR-BLOG</description><generator>Hugo -- gohugo.io</generator><language>en</language><lastBuildDate>Tue, 07 Apr 2026 00:00:00 +0900</lastBuildDate><atom:link href="https://ice-ice-bear.github.io/tags/searxng/index.xml" rel="self" type="application/rss+xml"/><item><title>Running a Free, Private AI Assistant — Gemma 4 + SearXNG + OpenClaw Setup Guide</title><link>https://ice-ice-bear.github.io/posts/2026-04-07-gemma4-openclaw/</link><pubDate>Tue, 07 Apr 2026 00:00:00 +0900</pubDate><guid>https://ice-ice-bear.github.io/posts/2026-04-07-gemma4-openclaw/</guid><description>&lt;h2 id="overview"&gt;Overview
&lt;/h2&gt;&lt;p&gt;Google just released Gemma 4, a family of open-source models closely related to the paid Gemini 3 service and the Nano Banana image generation system. When combined with SearXNG for private web search and OpenClaw for agentic orchestration, you get a fully self-hosted AI assistant that rivals cloud offerings on everyday tasks, completely free and with no query data leaving your machine.&lt;/p&gt;
&lt;p&gt;This post walks through the full setup: which Gemma 4 model to pick, how to run SearXNG locally, and how to wire everything into OpenClaw for an agentic AI workflow with web search capabilities.&lt;/p&gt;
&lt;h2 id="the-gemma-4-model-family"&gt;The Gemma 4 Model Family
&lt;/h2&gt;&lt;p&gt;Google released four new open-source models under the Gemma 4 umbrella. They split into two tiers based on size and modality support:&lt;/p&gt;
&lt;h3 id="small-models-mobile-capable"&gt;Small Models (Mobile-Capable)
&lt;/h3&gt;&lt;table&gt;
 &lt;thead&gt;
 &lt;tr&gt;
 &lt;th&gt;Model&lt;/th&gt;
 &lt;th&gt;Parameters&lt;/th&gt;
 &lt;th&gt;Modalities&lt;/th&gt;
 &lt;th&gt;Target Hardware&lt;/th&gt;
 &lt;/tr&gt;
 &lt;/thead&gt;
 &lt;tbody&gt;
 &lt;tr&gt;
 &lt;td&gt;&lt;strong&gt;E2B&lt;/strong&gt;&lt;/td&gt;
 &lt;td&gt;~2B&lt;/td&gt;
 &lt;td&gt;Text, Image, Video, Audio&lt;/td&gt;
 &lt;td&gt;Mobile phones&lt;/td&gt;
 &lt;/tr&gt;
 &lt;tr&gt;
 &lt;td&gt;&lt;strong&gt;E4B&lt;/strong&gt;&lt;/td&gt;
 &lt;td&gt;~4B&lt;/td&gt;
 &lt;td&gt;Text, Image, Video, Audio&lt;/td&gt;
 &lt;td&gt;Mobile phones&lt;/td&gt;
 &lt;/tr&gt;
 &lt;/tbody&gt;
&lt;/table&gt;
&lt;h3 id="large-models-desktopserver"&gt;Large Models (Desktop/Server)
&lt;/h3&gt;&lt;table&gt;
 &lt;thead&gt;
 &lt;tr&gt;
 &lt;th&gt;Model&lt;/th&gt;
 &lt;th&gt;Parameters&lt;/th&gt;
 &lt;th&gt;Modalities&lt;/th&gt;
 &lt;th&gt;Target Hardware&lt;/th&gt;
 &lt;/tr&gt;
 &lt;/thead&gt;
 &lt;tbody&gt;
 &lt;tr&gt;
 &lt;td&gt;&lt;strong&gt;26B&lt;/strong&gt;&lt;/td&gt;
 &lt;td&gt;~26B&lt;/td&gt;
 &lt;td&gt;Text, Image&lt;/td&gt;
 &lt;td&gt;Desktop GPU, server&lt;/td&gt;
 &lt;/tr&gt;
 &lt;tr&gt;
 &lt;td&gt;&lt;strong&gt;31B&lt;/strong&gt;&lt;/td&gt;
 &lt;td&gt;~31B&lt;/td&gt;
 &lt;td&gt;Text, Image&lt;/td&gt;
 &lt;td&gt;Desktop GPU, server&lt;/td&gt;
 &lt;/tr&gt;
 &lt;/tbody&gt;
&lt;/table&gt;
&lt;p&gt;The smaller E2B and E4B models are remarkable for their multimodal breadth — text, image, video, and audio processing in a package small enough for a phone. The larger 26B and 31B models trade audio/video support for deeper reasoning on text and image tasks.&lt;/p&gt;
&lt;p&gt;For OpenClaw&amp;rsquo;s agentic tool-calling workflow, the &lt;strong&gt;E4B model&lt;/strong&gt; stands out. Despite its small size, it handles structured function calls and multi-step reasoning with surprising competence. If you have the VRAM for the 26B or 31B, those will give better results on complex reasoning, but E4B is the sweet spot for most setups.&lt;/p&gt;
&lt;h2 id="architecture-how-the-pieces-fit-together"&gt;Architecture: How the Pieces Fit Together
&lt;/h2&gt;&lt;pre class="mermaid" style="visibility:hidden"&gt;graph TD
 User["User Query"] --&gt; OpenClaw["OpenClaw &amp;lt;br/&amp;gt; Agentic Orchestrator"]
 OpenClaw --&gt; Gemma["Gemma 4 Model &amp;lt;br/&amp;gt; Local Inference"]
 OpenClaw --&gt; SearXNG["SearXNG &amp;lt;br/&amp;gt; Private Web Search"]
 Gemma --&gt; ToolCall["Tool Call Decisions"]
 ToolCall --&gt; SearXNG
 SearXNG --&gt; Results["Search Results &amp;lt;br/&amp;gt; No Data Leaves Machine"]
 Results --&gt; Gemma
 Gemma --&gt; Response["Final Response"]
 Response --&gt; User

 style OpenClaw fill:#4a9eff,stroke:#333,color:#fff
 style Gemma fill:#34a853,stroke:#333,color:#fff
 style SearXNG fill:#ff6d3a,stroke:#333,color:#fff&lt;/pre&gt;&lt;p&gt;The flow is straightforward:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;&lt;strong&gt;User&lt;/strong&gt; sends a query to &lt;strong&gt;OpenClaw&lt;/strong&gt;&lt;/li&gt;
&lt;li&gt;OpenClaw routes the query to the &lt;strong&gt;Gemma 4&lt;/strong&gt; model running locally&lt;/li&gt;
&lt;li&gt;Gemma 4 decides whether it needs web search and issues &lt;strong&gt;tool calls&lt;/strong&gt; to SearXNG&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;SearXNG&lt;/strong&gt; runs the search itself, fetching results from upstream engines without cookies or identifiers, and without sending your query to any third-party AI API&lt;/li&gt;
&lt;li&gt;Results feed back into Gemma 4 for synthesis&lt;/li&gt;
&lt;li&gt;The final response returns to the user&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;Beyond the anonymized requests SearXNG forwards to upstream search engines, nothing leaves your machine: SearXNG acts as a meta-search proxy, and Gemma 4 inference runs entirely on local hardware.&lt;/p&gt;
&lt;h2 id="step-1-install-and-run-a-local-gemma-4-model"&gt;Step 1: Install and Run a Local Gemma 4 Model
&lt;/h2&gt;&lt;p&gt;You need a local inference server. The most common options are &lt;strong&gt;Ollama&lt;/strong&gt; and &lt;strong&gt;llama.cpp&lt;/strong&gt;. Ollama is simpler to set up:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-bash" data-lang="bash"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="c1"&gt;# Install Ollama (macOS/Linux)&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;curl -fsSL https://ollama.com/install.sh &lt;span class="p"&gt;|&lt;/span&gt; sh
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="c1"&gt;# Pull the E4B model (recommended for most setups)&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;ollama pull gemma4:e4b
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="c1"&gt;# Or pull the 26B model if you have sufficient VRAM (16GB+)&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;ollama pull gemma4:26b
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="c1"&gt;# Verify it&amp;#39;s running&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;ollama list
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;Ollama exposes an OpenAI-compatible API at &lt;code&gt;http://localhost:11434&lt;/code&gt; by default. OpenClaw can connect to this directly.&lt;/p&gt;
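&lt;p&gt;Before wiring in OpenClaw, it is worth confirming the endpoint actually answers. A minimal smoke test against Ollama&amp;rsquo;s OpenAI-compatible chat endpoint, assuming the default port and the &lt;code&gt;gemma4:e4b&lt;/code&gt; tag pulled above:&lt;/p&gt;

```shell
# One chat request against Ollama's OpenAI-compatible endpoint.
# Assumes Ollama is listening on the default port 11434 and that the
# gemma4:e4b tag from the pull step above is installed.
payload='{
  "model": "gemma4:e4b",
  "messages": [{"role": "user", "content": "In one sentence, what is a meta-search engine?"}],
  "stream": false
}'
curl -s http://localhost:11434/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d "$payload"
```

A successful response is an OpenAI-style JSON object with the model's reply under `choices`; a connection error means the Ollama server is not running.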
&lt;h3 id="vram-requirements"&gt;VRAM Requirements
&lt;/h3&gt;&lt;table&gt;
 &lt;thead&gt;
 &lt;tr&gt;
 &lt;th&gt;Model&lt;/th&gt;
 &lt;th&gt;Quantization&lt;/th&gt;
 &lt;th&gt;Minimum VRAM&lt;/th&gt;
 &lt;/tr&gt;
 &lt;/thead&gt;
 &lt;tbody&gt;
 &lt;tr&gt;
 &lt;td&gt;E2B&lt;/td&gt;
 &lt;td&gt;Q4_K_M&lt;/td&gt;
 &lt;td&gt;~2 GB&lt;/td&gt;
 &lt;/tr&gt;
 &lt;tr&gt;
 &lt;td&gt;E4B&lt;/td&gt;
 &lt;td&gt;Q4_K_M&lt;/td&gt;
 &lt;td&gt;~3 GB&lt;/td&gt;
 &lt;/tr&gt;
 &lt;tr&gt;
 &lt;td&gt;26B&lt;/td&gt;
 &lt;td&gt;Q4_K_M&lt;/td&gt;
 &lt;td&gt;~16 GB&lt;/td&gt;
 &lt;/tr&gt;
 &lt;tr&gt;
 &lt;td&gt;31B&lt;/td&gt;
 &lt;td&gt;Q4_K_M&lt;/td&gt;
 &lt;td&gt;~20 GB&lt;/td&gt;
 &lt;/tr&gt;
 &lt;/tbody&gt;
&lt;/table&gt;
&lt;p&gt;For Apple Silicon Macs, unified memory counts as VRAM. A 16GB M-series Mac can comfortably run E4B and potentially the 26B model with aggressive quantization.&lt;/p&gt;
&lt;h2 id="step-2-set-up-searxng-for-private-search"&gt;Step 2: Set Up SearXNG for Private Search
&lt;/h2&gt;&lt;p&gt;SearXNG is a free, open-source meta-search engine. It aggregates results from Google, Bing, DuckDuckGo, and dozens of other engines; your queries do reach those engines, but without the cookies, accounts, or other identifiers that would let them be tied back to you.&lt;/p&gt;
&lt;p&gt;The easiest deployment method is Docker:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-bash" data-lang="bash"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="c1"&gt;# Clone the SearXNG Docker setup&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;git clone https://github.com/searxng/searxng-docker.git
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="nb"&gt;cd&lt;/span&gt; searxng-docker
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="c1"&gt;# Edit the .env file to set your hostname&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="c1"&gt;# For local-only use, localhost is fine&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;cp .env.example .env
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="c1"&gt;# Start SearXNG&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;docker compose up -d
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;SearXNG will be available at &lt;code&gt;http://localhost:8080&lt;/code&gt;. You can verify it works by opening it in a browser and running a test search.&lt;/p&gt;
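&lt;p&gt;The browser check can also be scripted. This sketch assumes the default port mapping of 8080 from the compose setup and simply reports the HTTP status; expect &lt;code&gt;200&lt;/code&gt; once the container is healthy:&lt;/p&gt;

```shell
# Reachability check for the local SearXNG instance.
# Assumes the default docker-compose port mapping of 8080.
searx_url="http://localhost:8080/"
status=$(curl -s -o /dev/null -w "%{http_code}" "$searx_url")
echo "SearXNG HTTP status: $status"
```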
&lt;h3 id="key-searxng-configuration"&gt;Key SearXNG Configuration
&lt;/h3&gt;&lt;p&gt;Edit &lt;code&gt;searxng/settings.yml&lt;/code&gt; to enable the JSON API, which OpenClaw needs:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-yaml" data-lang="yaml"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="nt"&gt;server&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nt"&gt;secret_key&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;&amp;#34;your-random-secret-key&amp;#34;&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nt"&gt;limiter&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="kc"&gt;false&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="c"&gt;# Disable rate limiting for local use&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="nt"&gt;search&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nt"&gt;formats&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="w"&gt; &lt;/span&gt;- &lt;span class="l"&gt;html&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="w"&gt; &lt;/span&gt;- &lt;span class="l"&gt;json &lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="c"&gt;# Required for API access&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;Restart the container after editing:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-bash" data-lang="bash"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;docker compose restart
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;h2 id="step-3-wire-everything-into-openclaw"&gt;Step 3: Wire Everything into OpenClaw
&lt;/h2&gt;&lt;p&gt;OpenClaw is an agentic framework that connects local LLMs with tools. Configure it to use your local Gemma 4 instance and SearXNG:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-yaml" data-lang="yaml"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="c"&gt;# openclaw config&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="nt"&gt;llm&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nt"&gt;provider&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="l"&gt;ollama&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nt"&gt;model&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="l"&gt;gemma4:e4b&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nt"&gt;base_url&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="l"&gt;http://localhost:11434&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="nt"&gt;tools&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nt"&gt;web_search&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nt"&gt;provider&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="l"&gt;searxng&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nt"&gt;base_url&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="l"&gt;http://localhost:8080&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nt"&gt;format&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="l"&gt;json&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nt"&gt;categories&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="w"&gt; &lt;/span&gt;- &lt;span class="l"&gt;general&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="w"&gt; &lt;/span&gt;- &lt;span class="l"&gt;news&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="w"&gt; &lt;/span&gt;- &lt;span class="l"&gt;science&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;Once configured, launch OpenClaw and you have a fully functional AI assistant with web search — entirely self-hosted.&lt;/p&gt;
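&lt;p&gt;Under the hood, a &lt;code&gt;web_search&lt;/code&gt; tool call like the one configured above reduces to a GET request against SearXNG&amp;rsquo;s JSON endpoint. You can exercise that endpoint directly to see exactly what the model receives; this assumes the &lt;code&gt;json&lt;/code&gt; format was enabled in &lt;code&gt;settings.yml&lt;/code&gt; back in Step 2:&lt;/p&gt;

```shell
# The same kind of request the web_search tool issues, spelled out.
# Assumes SearXNG on port 8080 with the json format enabled (Step 2).
# --data-urlencode lets curl build the query string safely.
base="http://localhost:8080/search"
curl -s --get "$base" \
  --data-urlencode "q=gemma 4 release" \
  --data-urlencode "format=json" \
  --data-urlencode "categories=general"
```

The response is a JSON object whose `results` array carries `title`, `url`, and `content` fields per hit, which is what gets fed back into the model for synthesis.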
&lt;h2 id="performance-observations"&gt;Performance Observations
&lt;/h2&gt;&lt;p&gt;After running this setup, a few things stand out:&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;E4B Tool Calling is Surprisingly Good.&lt;/strong&gt; For a 4B parameter model, E4B handles agentic workflows well. It correctly decides when to search, formulates reasonable queries, and synthesizes results coherently. It is not at the level of GPT-4o or Claude for complex multi-step reasoning, but for a free, private, local model, the quality is impressive.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;SearXNG Latency is Acceptable.&lt;/strong&gt; Search queries typically return in 1-3 seconds. The bottleneck is usually the LLM inference, not the search.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Privacy is Genuine.&lt;/strong&gt; Running &lt;code&gt;tcpdump&lt;/code&gt; during a session confirms that no query data is sent to external AI APIs. SearXNG does make outbound requests to search engines, but these are standard web requests without persistent identifiers tied to your queries.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;The 26B/31B Models Are Noticeably Better&lt;/strong&gt; for complex reasoning tasks, but the E4B model is the right default for most people. The jump from E4B to 26B requires significantly more hardware but doesn&amp;rsquo;t always produce proportionally better results for straightforward Q&amp;amp;A with search.&lt;/p&gt;
&lt;h2 id="when-to-use-this-vs-cloud-ai"&gt;When to Use This vs. Cloud AI
&lt;/h2&gt;&lt;p&gt;This setup is ideal when:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Privacy is non-negotiable&lt;/strong&gt; — legal, medical, or financial queries you don&amp;rsquo;t want logged by any third party&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;You want zero recurring costs&lt;/strong&gt; — no API fees, no subscriptions&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;You&amp;rsquo;re on a restricted network&lt;/strong&gt; — environments where cloud AI services are blocked&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;You enjoy self-hosting&lt;/strong&gt; — the tinkering is part of the appeal&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Stick with cloud AI when:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;You need state-of-the-art reasoning on complex tasks&lt;/li&gt;
&lt;li&gt;You&amp;rsquo;re working with very long documents that exceed local model context windows&lt;/li&gt;
&lt;li&gt;Uptime and reliability matter more than privacy&lt;/li&gt;
&lt;/ul&gt;
&lt;h2 id="conclusion"&gt;Conclusion
&lt;/h2&gt;&lt;p&gt;The Gemma 4 + SearXNG + OpenClaw stack represents a meaningful milestone for self-hosted AI. A year ago, running a capable agentic AI assistant with web search locally would have required expensive hardware and produced mediocre results. Today, a laptop with 8GB of RAM can run E4B with SearXNG and get genuinely useful results — for free, with complete privacy.&lt;/p&gt;
&lt;p&gt;The setup takes about 15 minutes if you already have Docker and a package manager. For anyone who has been waiting for local AI to reach a practical threshold, this combination is worth trying.&lt;/p&gt;
&lt;h2 id="references"&gt;References
&lt;/h2&gt;&lt;ul&gt;
&lt;li&gt;&lt;a class="link" href="https://ai.google.dev/gemma" target="_blank" rel="noopener"
 &gt;Gemma 4 Model Card — Google&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a class="link" href="https://docs.searxng.org/" target="_blank" rel="noopener"
 &gt;SearXNG Documentation&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a class="link" href="https://github.com/openclaw" target="_blank" rel="noopener"
 &gt;OpenClaw GitHub Repository&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a class="link" href="https://ollama.com/" target="_blank" rel="noopener"
 &gt;Ollama — Local LLM Runner&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;</description></item></channel></rss>