It is Thursday afternoon. You paste the launch brief into Cursor and ask the obvious question:
Should we delay the v2 release by a week to fix performance, or ship on Friday as planned?
The reply looks helpful. It cites one metric, names a risk, throws in a morale worry, suggests a compromise, and ends with:
it depends on your risk tolerance
You read it twice. You still cannot tell whether the performance data drove the answer or whether the model just wanted to sound balanced.
That is not a bad model. It is a mixed-hat problem: facts, feelings, risks, upsides, and new ideas arrived in the same paragraph, so none of them is easy to audit.
The problem with one “helpful” answer
Hard decisions need more than multiple angles. They need angles kept apart long enough to trust.
Edward de Bono’s Six Thinking Hats is a form of parallel thinking: at each step, everyone focuses on the same kind of thinking, then the whole group switches together. White is only facts. Red is only gut reactions. Black is only risks. Yellow is only benefits. Green is only new options. The colors are just labels; the point is separation.
Once perspectives mix, basic audit questions get hard to answer:
- What do we actually know?
- What are we assuming?
- What are people worried about?
- Which ideas are new?
- Why did we pick this recommendation?
In a workshop, a facilitator enforces that structure. In chat, there is no facilitator. The model’s default is to be useful in one pass—which usually means folding every hat into one polished block. You get the shape of rigor without the separation that makes rigor checkable.
What the same question looks like with hats
Take the launch question again:
Should we delay the v2 release by a week to fix performance?
Instead of one answer trying to do everything, a Full run at Standard depth might open with Setup, then walk the hats in order (abbreviated below):
Setup
- Focus question: Should we delay the v2 release by a week to fix performance?
- Mode: Full
- Depth: Standard
- Hat order: Blue → White → Red → Black → Yellow → Green → Blue
White
- [KNOWN] Performance regression increased load times by 40%
- [UNKNOWN] Impact on churn
- [ASSUMED] Slower pages affect conversion
Red
- Stated: The team is anxious about missing Friday’s ship date again
- The launch might feel safer emotionally once performance is back in range
Black
- Risk: Shipping poor performance damages user trust — Mitigation: Fix highest-impact bottlenecks first
Yellow
- Benefit: Better launch experience — Condition: Performance improvements are noticeable to users
Green
- Delay only performance-heavy features
- Release to a smaller beta group first
- Stage the rollout
Blue (synthesis)
- Main tension: shipping momentum vs user experience
- Recommendation: staged rollout
- Next step: validate impact with a limited user cohort
Now you can see where the recommendation came from. You can disagree with it, challenge assumptions, expand one hat, or rerun a single section. The answer is built to be inspected, not merely to sound convincing.
A fix you can install once
The six-thinking-hats Agent Skill encodes this structure for Cursor, Claude Code, and any host that loads Agent Skills. Instead of improvising “be thorough,” the agent runs one hat per section, then closes with Blue synthesis that may only stitch together what earlier hats already said.
Same focus question every time. Same rules about what each hat may add. A rerun next week should look like the same process, not a new essay.
New to skills? Read Your AI Agent Shouldn't Start From Zero Every Session first.
Try it on a real question
Install:
npx skills add ysskrishna/ai-agent-skills --skill six-thinking-hats
Invoke with the focus question in the prompt:
/six-thinking-hats Should we delay the v2 release by a week to fix performance?
Plain language works too ("parallel thinking," "de Bono," "six thinking hats") when your host maps intent to the skill.
What you should see on the first run
Before any colored hat, the skill prints a short Setup block:
- Focus question (one sentence)
- Mode (which hat sequence; default Full)
- Depth (bullets per hat; default Standard)
- Hat order (the sequence it will follow)
If your brief is thin, it may ask up to three clarifying questions, then continue and label gaps. Even when you skip the details, mode and depth should still appear so you know what you got.
Then the hats run in order, each in its own section—not two hats in one block.
| Hat | Job in that pass |
|---|---|
| White | Known, assumed, or unknown—no interpretation |
| Red | Stated or carefully inferred feelings only |
| Black | Risks and mitigations |
| Yellow | Benefits and what must hold for them |
| Green | New options, no scoring yet |
| Blue | Frame at the start; synthesize at the end without new facts or ideas |
Blue bookends the run: it sets the question, then names tensions, a recommendation, and a next step drawn only from prior hats.
Pick a mode for the kind of decision
Most launch-or-invest calls want Full (default): Blue → White → Red → Black → Yellow → Green → Blue.
| Mode | When to reach for it |
|---|---|
| Creative | Ideation; skips Black so criticism does not kill ideas early |
| Risk | Failure prevention; White and Black only, then Blue |
| Decision | Go/no-go; facts, risks, upsides—no Red or Green |
| Custom | Your order, still closing with Blue |
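One way to picture the mode table is as a lookup from mode name to hat sequence. The sketch below is illustrative only and is not taken from the skill. Only the Full order is stated in this article; the Creative, Risk, and Decision sequences are inferred from the table, and the opening Blue in those three modes is an assumption. `MODE_SEQUENCES` and `hat_order` are hypothetical names.

```python
# Illustrative sketch only; not the skill's implementation.
FULL = ["Blue", "White", "Red", "Black", "Yellow", "Green", "Blue"]

MODE_SEQUENCES = {
    "Full": FULL,  # the one sequence the article states explicitly
    # Inferred from the mode table; the opening Blue is an assumption.
    "Creative": ["Blue", "White", "Red", "Yellow", "Green", "Blue"],  # Full minus Black
    "Risk": ["Blue", "White", "Black", "Blue"],
    "Decision": ["Blue", "White", "Black", "Yellow", "Blue"],  # no Red or Green
}


def hat_order(mode="Full", custom=None):
    """Return the hat sequence for a mode; Custom always closes with Blue."""
    if mode == "Custom":
        if not custom:
            raise ValueError("Custom mode needs an explicit hat list")
        order = list(custom)
        if order[-1] != "Blue":
            order.append("Blue")  # the table says Custom still closes with Blue
        return order
    return list(MODE_SEQUENCES[mode])
```

For example, `hat_order("Custom", ["Green", "Black"])` returns `["Green", "Black", "Blue"]`, which mirrors the rule that every mode ends in synthesis.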
Examples you can paste:
Six Thinking Hats: should we open-source the internal CLI?
Six Thinking Hats, Creative mode: newsletter topic ideas.
Six Thinking Hats, Risk mode: migrate auth this weekend?
Six Thinking Hats, Decision mode: ship the beta on Friday or delay by a week?
Match depth to stakes
| Depth | Bullets per hat | Synthesis |
|---|---|---|
| Quick | 2 | 3 bullets (tension, recommendation, next step) |
| Standard | 3 | Full Blue synthesis |
| Deep Dive | 4–5 | Richer synthesis for leadership readouts |
Quick is enough for a hallway call. Use Deep Dive when the write-up has to stand alone in Slack or a doc.
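If you want to spot-check depth mechanically, the bullet counts in the table translate into a small range check. This is a minimal sketch, assuming each hat section is plain text with one `- ` bullet per line; `DEPTH_BULLETS`, `count_bullets`, and `depth_matches` are hypothetical helper names, not part of the skill.

```python
# Bullet-count ranges per depth, copied from the depth table.
DEPTH_BULLETS = {
    "Quick": (2, 2),
    "Standard": (3, 3),
    "Deep Dive": (4, 5),
}


def count_bullets(section_text):
    """Count lines that start with '- ' once leading whitespace is ignored."""
    return sum(
        1 for line in section_text.splitlines() if line.lstrip().startswith("- ")
    )


def depth_matches(section_text, depth="Standard"):
    """True when the section's bullet count falls inside the range for this depth."""
    low, high = DEPTH_BULLETS[depth]
    return low <= count_bullets(section_text) <= high
```

Run it on one hat section at a time; a Green section with three options passes at Standard and fails at Quick.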
How to tell the run actually worked
Skim for structure, not eloquence. The SKILL.md checklist is the source of truth; these are the checks readers notice first.
Structure (before you judge the prose)
- Setup appears up front with all four fields: focus question, mode, depth, and hat order. If you did not specify mode or depth, they should still be stated explicitly (defaults: Full, Standard).
- Hats run in the declared order, each in its own section—never two hats blended into one block.
- The run closes with Blue synthesis after the last colored hat.
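The three structure checks above can be approximated in a few lines. The sketch below is a rough aid, not the SKILL.md checklist: it assumes the run is plain text laid out like the abbreviated example earlier (a `Setup` line, `- Field: value` bullets, and each hat name on its own heading line), and it assumes the opening Blue in the declared order is the Setup block itself. `check_structure` is a hypothetical name, and it cannot detect two hats blended under one heading.

```python
import re

SETUP_FIELDS = ("Focus question", "Mode", "Depth", "Hat order")
FIELD = re.compile(r"-\s*(Focus question|Mode|Depth|Hat order):\s*(.+)")
HEADING = re.compile(r"(White|Red|Black|Yellow|Green|Blue)(\s*\(.*\))?")


def check_structure(run_text):
    """Return a list of structural problems; an empty list means the skim passed."""
    lines = [ln.strip() for ln in run_text.splitlines() if ln.strip()]
    problems = []

    # Check 1: Setup comes first and states all four fields.
    if not lines or lines[0] != "Setup":
        problems.append("missing opening Setup block")
    fields = {}
    for ln in lines:
        m = FIELD.fullmatch(ln)
        if m:
            fields.setdefault(m.group(1), m.group(2))
    problems += [f"Setup is missing: {name}" for name in SETUP_FIELDS if name not in fields]

    # Check 2: section headings follow the declared order.
    # Assumption: the opening Blue in "Hat order" is the Setup block itself.
    declared = [h.strip() for h in fields.get("Hat order", "").split("→") if h.strip()]
    expected = declared[1:] if declared[:1] == ["Blue"] else declared
    found = [m.group(1) for m in map(HEADING.fullmatch, lines) if m]
    if declared and found != expected:
        problems.append(f"hat sections {found} do not match declared order {expected}")

    # Check 3: the run closes with Blue synthesis.
    if not found or found[-1] != "Blue":
        problems.append("run does not close with Blue synthesis")
    return problems
```

An empty list only says the skeleton is right; whether each hat stayed in its lane still needs a human read.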
Per-hat formats
| Hat | What to look for |
|---|---|
| White | Every bullet tagged [KNOWN], [ASSUMED], or [UNKNOWN]—facts and gaps only, no interpretation |
| Red | Stated feelings passed through verbatim; inferred feelings use “might,” “could,” or “may,” not asserted as fact |
| Black | Every bullet has Risk and Mitigation |
| Yellow | Every bullet has Benefit and Condition (what must hold for the upside) |
| Green | One distinct option per bullet; count matches depth (2 / 3 / 4–5); no scoring or ranking yet |
| Blue (close) | Names tensions (e.g. Yellow upside vs Black risk), a recommendation, and a next step—all drawn from earlier hats only |
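Several rows of this table are mechanical enough to lint. The sketch below is a hedged illustration, not the skill's own checker: it assumes bullets look like the abbreviated example earlier, with White tags in square brackets, `Risk:`/`Mitigation:` and `Benefit:`/`Condition:` labels, and a `Stated:` prefix on verbatim feelings (that prefix is taken from the example and may not be the skill's exact format). `lint_bullet` is a hypothetical name.

```python
import re

WHITE_TAG = re.compile(r"^\[(KNOWN|ASSUMED|UNKNOWN)\]\s*\S")
HEDGES = ("might", "could", "may")


def lint_bullet(hat, bullet):
    """Return None if the bullet fits its hat's format, else a short complaint."""
    text = re.sub(r"^\s*-\s*", "", bullet).strip()  # drop the leading "- "

    if hat == "White" and not WHITE_TAG.match(text):
        return "White bullet needs a [KNOWN], [ASSUMED], or [UNKNOWN] tag"
    if hat == "Black" and not ("Risk:" in text and "Mitigation:" in text):
        return "Black bullet needs both Risk and Mitigation"
    if hat == "Yellow" and not ("Benefit:" in text and "Condition:" in text):
        return "Yellow bullet needs both Benefit and Condition"
    if hat == "Red" and not text.startswith("Stated:"):
        # Anything not marked as stated is treated as inferred and must hedge.
        words = re.findall(r"[a-z]+", text.lower())
        if not any(hedge in words for hedge in HEDGES):
            return "inferred Red bullet should hedge with might, could, or may"
    return None  # Green and Blue are not checked here
```

Green's distinctness and Blue's "no new facts" rule are judgment calls, so this lint deliberately leaves them to the reader.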
When to ask for a rerun
- Hats mixed in one section (e.g. White, Black, and Yellow together)—say so and ask for one hat per section.
- Blue overstep—new fact, risk, idea, or recommendation that no earlier hat supported. Point at the checklist and request a structured rerun.
- Wrong order or missing Setup—the declared sequence was not followed, or the run skipped the opening Setup block.
When this skill is the wrong tool
Skip it for plain factual questions, for execution-only tasks that need no structured pass over perspectives, and for single-angle hot takes where a fixed hat-by-hat sequence would only get in the way.
What to do next
- Install with: npx skills add ysskrishna/ai-agent-skills --skill six-thinking-hats
- Run a real question on your desk in Full mode at Standard depth.
- If hats mix, link the SKILL.md and ask for a structured rerun.



