
GPT-5.4 Prompt Guide Essentials — Lock Down the Contract Before Tuning Reasoning

Key takeaways from OpenAI's official GPT-5.4 prompt guide. Four techniques — output contract, tool persistence rules, completeness contract, and verification loop — outperform raw reasoning effort in cost-effectiveness.

Overview

GPT-5.4 excels at long-context retention, multi-step agentic tasks, and grounded synthesis. But these strengths don’t emerge automatically. OpenAI’s official prompt guide makes a clear point: “reduce drift” comes before “encourage deeper thinking.”

Four Core Techniques

Raising reasoning_effort should be the last knob you turn. In most cases, the four prompt techniques below deliver better cost-effectiveness first.

1. Output Contract

Prevents GPT-5.4 from breaking your format by inserting helpful-sounding explanations.

<output_contract>
- Return exactly the sections requested, in the requested order.
- If a format is required (JSON, Markdown, SQL, XML), output only that format.
</output_contract>

Think of it like a delivery spec. Stating “the deliverable must follow this template” reduces over-helpfulness and parsing failures.

2. Tool Persistence Rules

The most common agent failure pattern: skipping a prior lookup because the answer seems obvious. Three principles to prevent this:

  • Forced use: when accuracy, grounding, or completeness is at stake, tools must be used; if results come back empty, retry with a different strategy
  • Dependency check: before acting, verify whether a prior lookup is needed; never skip it even if the final state seems obvious
  • Parallel vs. sequential: run independent lookups in parallel for speed; run dependent steps sequentially for accuracy

The goal: make tool use a prerequisite, not an option.
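A sketch of how these principles might read as a prompt block, in the same spirit as the output contract above (the tag name and exact wording are illustrative, not taken from the guide):

<tool_persistence>
- When accuracy, grounding, or completeness is at stake, use the relevant tool; do not answer from memory.
- Before acting, check whether a prior lookup is required; never skip it because the outcome seems obvious.
- If a tool returns empty results, retry with a different strategy before concluding.
- Run independent lookups in parallel; run dependent steps sequentially.
</tool_persistence>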

3. Completeness Contract

Solves the problem of models “halfheartedly finishing” long batch tasks.

  • Every requested item must be processed, or marked [blocked] if not
  • When a search returns empty, don’t immediately conclude “nothing found” — try at least 1–2 alternative strategies first (different query, broader filter, prior lookup, different source)

These two rules give prompts the “endurance” to carry long tasks all the way through.
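One way this could be expressed as a prompt block (tag name and phrasing are illustrative):

<completeness_contract>
- Process every requested item; mark any item you cannot complete as [blocked].
- If a search returns nothing, try at least 1–2 alternative strategies (different query, broader filter, prior lookup, different source) before reporting “nothing found”.
</completeness_contract>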

4. Verification Loop

The guide recommends a four-pronged check just before completion.

Additional gating rule: “If needed information is missing, don’t guess — use a lookup tool if possible; if not, ask only the minimum necessary question.”
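A minimal sketch of such a block; the wording below is illustrative rather than quoted from the guide:

<verification_loop>
- Before finalizing, re-check the output against every requirement in the original request.
- If needed information is missing, do not guess: use a lookup tool if possible; otherwise ask only the minimum necessary question.
</verification_loop>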

Handling Mid-Conversation Direction Changes

Users frequently change course. The default policy:

  • Reversible and low-risk → proceed without asking
  • External impact / irreversible / sensitive data → ask for permission
  • Instruction priority: the latest user instruction overrides earlier style rules — except safety, honesty, and privacy, which are never overridden
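Expressed as a prompt block, the default policy might look like this (illustrative wording):

<direction_change_policy>
- If the new request is reversible and low-risk, proceed without asking.
- If it has external impact, is irreversible, or touches sensitive data, ask for permission first.
- The latest user instruction overrides earlier style rules; safety, honesty, and privacy are never overridden.
</direction_change_policy>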

Grounding Research Quality

The biggest risk in AI research: blurring the line between what was actually found and what was inferred.

  • Only cite sources actually retrieved in this workflow
  • Never fabricate URLs, IDs, or quotations
  • Place citations inline next to each claim, not bundled at the end
  • Enforce a 3-phase research cycle: decompose the question → search each sub-question + follow secondary leads → reconcile contradictions, then write with citations
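These rules can also be packaged as a prompt block (again, the tag and phrasing are illustrative):

<grounding_rules>
- Cite only sources actually retrieved in this workflow; never fabricate URLs, IDs, or quotations.
- Place citations inline next to the claims they support, not bundled at the end.
- Research in three phases: decompose the question → search each sub-question and follow secondary leads → reconcile contradictions, then write with citations.
</grounding_rules>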

reasoning_effort Tuning

Recommended level by task type:

  • Fast execution / extraction / classification / short transforms: none ~ low
  • Long synthesis / multi-document review / strategic writing: medium or above
  • xhigh: only when evals show a clear gain

Migration order: swap the model first → hold reasoning effort fixed → evaluate → add prompt blocks → adjust the reasoning knob one step at a time.

Insight

The lessons in this guide aren’t specific to GPT-5.4. They’re universal patterns that apply to Claude, Gemini, and any other LLM agent. The core strategy is to exploit the model’s ability to follow rules precisely and consistently: output contract, forced tool use, completeness contract, verification loop. Fix these four first, then raise reasoning effort only if you still need more. Ultimately, prompt engineering is not about telling an AI to think harder — it’s about eliminating the room for it to drift.
