ADR-0027: Closed component vocabulary for plans-app spec review pages

Status

Proposed

Decision

Spec review pages in apps/plans use a closed component vocabulary: a fixed set of CSS classes declared in apps/plans/COMPONENT_CATALOG.md and styled in apps/plans/src/styles/spec-components.css. The agent authoring body_html MUST use only catalog classes — no inline style="" attributes, no <style> tags, no undeclared class names. When the agent encounters content that does not fit any existing class, it:

Picks the closest existing fallback class for the rendered output (so the page still renders).
Records a suggested_components[] entry in the spec’s JSON describing what’s missing: name, when_to_use, sample_html, sample_css, and used_in_this_spec: true.

Catalog extension is a manual decision. The orchestration script prints suggestions after each run; a human reviews them, edits COMPONENT_CATALOG.md + spec-components.css, then re-runs affected specs with --force if regeneration is wanted. The system does not auto-adopt and does not track which specs would benefit from a regen after adoption. The v1 catalog is 14 classes (see spec docs/superpowers/specs/2026-05-25-plan-html-review-app-design.md §6 for the table).

Why

The reference HTML at docs/superpowers/specs/2026-05-22-fieldforce-phase4-briefings-design.html is ~820 lines because the LLM that produced it invented all of its visual structure inline — bespoke color tokens, bespoke callout styles, bespoke grid system. The output is beautiful but it has three structural problems if used as the production pattern:

Every spec looks different. A reader scanning 7 specs has to re-learn the layout each time. The plan’s §1 goal — “simple and easy to understand by human quickly” — depends on visual consistency, not per-page visual novelty.
Design iteration cost compounds. When you want to tweak callout color, change the decision-card layout, or fix a heading hierarchy bug, the fix has to happen 7 times (once per spec) — or you regenerate everything and hope the next LLM run reproduces the previous look-and-feel, which it won’t.
Search and accessibility suffer. Stable class names mean predictable DOM for screen readers and predictable weighting for the MiniSearch index. Per-spec invented classes can’t be weighted meaningfully (.css-7f3a2b isn’t a signal).

A closed vocabulary fixes all three. But a purely closed vocabulary creates a different problem: real specs will eventually need visual patterns the v1 catalog doesn’t cover. The suggestion loop is the explicit pressure-release valve — the agent reports unmet needs, and the catalog grows under human judgment. Rejected alternatives:

Free-form HTML with inline styles. What the reference page does. Rejected for the consistency, iteration-cost, and accessibility reasons above. Beautiful one-offs are not what this workflow optimizes for.
Open catalog: agent invents classes; script merges new ones automatically. Rejected because LLM-invented CSS drifts in naming (.spec-warn vs .spec-warning vs .alert-warn), spacing tokens, and color logic. After 10 specs you have a catalog full of near-duplicates and no coherent design system.
Vocabulary + escape hatch (agent allowed to write inline styles for “unique cases”). Rejected because the escape will be over-used. The LLM will rationalize most non-trivial content as “unique,” and the catalog becomes advisory rather than authoritative. Better: ship a tight catalog and grow it deliberately.
Track which specs would benefit from a regen after a class is adopted. Rejected as premature bookkeeping. Class adoption is rare (probably less than 1 per month at steady state); the human running the adoption knows which specs requested the class — they can --force those few specs by hand.

How it works

Catalog as prompt input. The prompt template inlines COMPONENT_CATALOG.md verbatim. The agent sees the catalog every time. The prompt’s “Visual structure rules” section instructs: “Use ONLY the classes listed in the Component Catalog. Do not invent new class names. Do not write <style> tags. Do not use inline style="" attributes. If content does not fit any existing class, pick the closest fallback and record a suggestion in suggested_components[].” Suggestion payload shape:

type SuggestedComponent = {
  name: string;             // proposed class name, kebab-case, spec-* prefix
  when_to_use: string;      // 1–3 sentence rationale
  sample_html: string;      // minimal HTML using the proposed class
  sample_css: string;       // proposed CSS rules (illustrative only — human will rewrite)
  used_in_this_spec: true;  // present means: agent fell back to existing class for this spec
};
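
The payload shape above could be checked at runtime before the script prints suggestions. The following is a minimal sketch, not the script's actual behavior: validateSuggestion and isSuggestedComponent are hypothetical helper names, and the kebab-case regex is one plausible reading of the "kebab-case, spec-* prefix" comment.

```typescript
// Mirrors the SuggestedComponent type from this ADR.
type SuggestedComponent = {
  name: string;
  when_to_use: string;
  sample_html: string;
  sample_css: string;
  used_in_this_spec: true;
};

// Hypothetical helper (an assumption, not part of the described script).
// Returns a list of problems; an empty list means the payload looks valid.
function validateSuggestion(value: unknown): string[] {
  if (typeof value !== "object" || value === null) {
    return ["suggestion must be an object"];
  }
  const s = value as Record<string, unknown>;
  const problems: string[] = [];

  // All four descriptive fields must be non-empty strings.
  for (const key of ["name", "when_to_use", "sample_html", "sample_css"]) {
    const field = s[key];
    if (typeof field !== "string" || field.trim() === "") {
      problems.push(`${key} must be a non-empty string`);
    }
  }

  // Assumed reading of "kebab-case, spec-* prefix".
  const name = s["name"];
  if (typeof name === "string" && !/^spec-[a-z0-9]+(?:-[a-z0-9]+)*$/.test(name)) {
    problems.push("name must be kebab-case with the spec- prefix");
  }

  // The ADR requires this flag to be literally true on every suggestion.
  if (s["used_in_this_spec"] !== true) {
    problems.push("used_in_this_spec must be true");
  }
  return problems;
}

// Type guard built on the validator above.
function isSuggestedComponent(value: unknown): value is SuggestedComponent {
  return validateSuggestion(value).length === 0;
}
```

Returning a list of problems rather than a boolean lets the script print every issue with a malformed suggestion in one pass.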

Script behavior after a run with suggestions:

✓ Generated 2026-05-25-foo
  - Source hash: a1b2…
  - Diagram format: mermaid
  - 1 component suggestion: spec-risk-matrix
    "When the spec lists 3+ risks with severity and likelihood. Existing
     spec-callout--risk renders flat list; matrix would show severity×likelihood."
  - Review at: apps/plans/src/content/specs/2026-05-25-foo.json
  - To adopt: edit apps/plans/src/styles/spec-components.css + COMPONENT_CATALOG.md,
    then re-run with --force.
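
The suggestion lines in the output above could be produced by a small formatter. This is a sketch under assumptions: formatSuggestionLines is a hypothetical name, and the plural form and the one-rationale-per-line layout (without the line wrapping shown above) are guesses, since the ADR only shows the single-suggestion case.

```typescript
// Only the two fields that appear in the printed summary are needed here.
type SuggestionSummary = { name: string; when_to_use: string };

// Hypothetical formatter (an assumption, not the script's actual code).
function formatSuggestionLines(suggestions: readonly SuggestionSummary[]): string[] {
  if (suggestions.length === 0) {
    return []; // nothing to print when the agent recorded no suggestions
  }
  const noun =
    suggestions.length === 1 ? "component suggestion" : "component suggestions";
  const names = suggestions.map((s) => s.name).join(", ");

  // Header line, indented to match the other "  - " items in the run summary.
  const lines = [`  - ${suggestions.length} ${noun}: ${names}`];

  // One quoted rationale per suggestion, indented under the header.
  for (const s of suggestions) {
    lines.push(`    "${s.when_to_use}"`);
  }
  return lines;
}
```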

Adoption workflow (manual):

Read the suggestion payload in the JSON.
Decide whether the pattern is reusable (rule of thumb: would 2+ future specs use this?).
Add the class to apps/plans/src/styles/spec-components.css with proper design tokens, spacing, and dark-mode readiness (the app is light-only today, but write classes that wouldn’t need rewriting later).
Add an entry to apps/plans/COMPONENT_CATALOG.md with class name, when to use, sample HTML, and a screenshot or visual hint.
Re-run the affected specs with --force to have the agent use the new class.

Hot rules (enforced by review, not code):

Catalog classes use the spec- prefix (e.g. spec-callout, spec-decision-card). Modifiers use -- (e.g. spec-callout--decision).
Classes are semantic, not visual (--decision, not --blue). Re-skinning later doesn’t break the agent’s vocabulary.
A class is added when there’s a real pattern, not because the LLM suggested it once.
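
These rules are enforced by review, but the naming part of the first rule is mechanical enough that a reviewer could check it with a regex. A minimal sketch, assuming lowercase alphanumeric segments and at most one modifier per class; CATALOG_CLASS_NAME and followsNamingRules are hypothetical names.

```typescript
// Assumed pattern: spec- prefix, kebab-case base, optional single --modifier.
const CATALOG_CLASS_NAME =
  /^spec-[a-z0-9]+(?:-[a-z0-9]+)*(?:--[a-z0-9]+(?:-[a-z0-9]+)*)?$/;

// Hypothetical helper: true when a class name follows the naming convention.
function followsNamingRules(className: string): boolean {
  return CATALOG_CLASS_NAME.test(className);
}

// Examples taken from the rules above.
const examples = ["spec-callout", "spec-decision-card", "spec-callout--decision"];
const allValid = examples.every(followsNamingRules);
```

The check covers naming only. Whether a modifier is semantic rather than visual, and whether a pattern is real, still need human judgment. As written, the pattern would also reject element-style names containing a double underscore, so it would need extending if the catalog adopts them.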

Known limitations

Fallback rendering can look subtly wrong. When the agent picks spec-callout--warning because no spec-risk-matrix exists, the rendered page is “OK but not ideal” until adoption. This is a feature for v1 — we want unmet patterns to be visible (in suggested_components) rather than silently hidden — but it does mean the first run of a spec with novel content can look slightly off.
The agent might fail to suggest. If the prompt is unclear, or the agent simply takes the path of least resistance, it can shoehorn content into an ill-fitting class without recording a suggestion. Mitigation: PR review of the rendered page catches this. Long-term mitigation: tighten the suggestion-protocol section of the prompt template.
No central enforcement. A future agent could violate the rules by writing inline styles. Mitigation: add a lint pass in the orchestration script that scans body_html for forbidden patterns (<style>, style=, undeclared classes) and rejects the artifact. This is a v1.1 enhancement, not v1.
Catalog growth must be reviewed. Without discipline the catalog could bloat to 50+ classes that drift in style. Mitigation: keep the adoption bar high (2+ specs would use it), and audit the catalog quarterly.
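
The lint pass proposed under "No central enforcement" could start as a regex scan for the three forbidden patterns. This is a rough sketch rather than the planned implementation: lintBodyHtml is a hypothetical name, the catalog is assumed to be available as a set of class names, and regex matching can misfire on text content that happens to contain " style=", so a real pass would likely use an HTML parser.

```typescript
// Hypothetical lint (an assumption): returns violations found in body_html.
function lintBodyHtml(bodyHtml: string, catalog: ReadonlySet<string>): string[] {
  const violations: string[] = [];

  // Forbidden pattern 1: any <style> tag.
  if (/<style[\s>]/i.test(bodyHtml)) {
    violations.push("contains a <style> tag");
  }

  // Forbidden pattern 2: an inline style attribute on any element.
  if (/\sstyle\s*=/i.test(bodyHtml)) {
    violations.push("contains an inline style attribute");
  }

  // Forbidden pattern 3: class names that the catalog does not declare.
  const classAttr = /\sclass\s*=\s*(?:"([^"]*)"|'([^']*)')/gi;
  let match: RegExpExecArray | null;
  while ((match = classAttr.exec(bodyHtml)) !== null) {
    const value = match[1] ?? match[2] ?? "";
    for (const name of value.split(/\s+/).filter(Boolean)) {
      if (!catalog.has(name)) {
        violations.push(`undeclared class: ${name}`);
      }
    }
  }
  return violations;
}

// Usage with a two-class stand-in for the real catalog.
const demoCatalog = new Set(["spec-callout", "spec-callout--warning"]);
const demoViolations = lintBodyHtml(
  '<div class="spec-risk-heatmap">x</div>',
  demoCatalog,
);
// demoViolations is ["undeclared class: spec-risk-heatmap"]
```

The sketch does not implement the spec-mockup exception described later in this ADR; content inside those figures would need to be excluded before scanning.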

Rules for agents

body_html MUST use only classes declared in apps/plans/COMPONENT_CATALOG.md.
body_html MUST NOT contain <style> tags.
body_html MUST NOT contain inline style="" attributes.
body_html MUST NOT contain class="…" values that include any class not in the catalog. Native HTML elements that need no class, such as <details>, are permitted implicitly.
When content does not fit any existing class, the agent MUST record an entry in suggested_components[] AND render the section using the closest existing fallback class. The agent MUST set used_in_this_spec: true on every suggestion (a suggestion implies the agent had a real need).
The agent MUST NOT add classes to the catalog itself. Catalog edits are human-only.
Layout-owned chrome (hero, sticky TOC, footer) MUST NOT be authored in body_html. The Astro layout supplies these from the JSON metadata; the agent’s body fragment is content only.

Exception: UI mockup figures

The closed-vocabulary rule has one narrow exception: <figure class="spec-mockup"> containers, where custom CSS is permitted. This exists because real UI mockups (panel widgets, modals, dashboards, mobile screens) have visual layout that is the message — no catalog component can faithfully represent a screen depiction. Permitted inside <figure class="spec-mockup"> only:

Inline style="…" attributes on any element.
A <style>…</style> block.

Required for <style> blocks inside spec-mockup:

Every selector MUST descend from .spec-mockup (e.g. .spec-mockup .briefing-card, never bare .briefing-card).
The body-HTML lint in scripts/generate-plan-html.ts parses the <style> block, splits its rules, and rejects any selector that doesn’t include .spec-mockup. This is automated — agents can’t accidentally leak styles past the figure.
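
The selector check described above can be approximated as follows. This is an illustrative sketch, not the code in scripts/generate-plan-html.ts: findUnscopedSelectors is a hypothetical name, and the naive regex splitting handles plain and simply nested rules only. It would mis-split commas inside functional selectors such as :is(), and it would flag keyframe steps such as "from", so a real implementation would more likely use a CSS parser.

```typescript
// Hypothetical check (an assumption): returns every selector in the CSS text
// that does not include the .spec-mockup class.
function findUnscopedSelectors(css: string): string[] {
  // Drop comments so braces or commas inside them cannot confuse the split.
  const withoutComments = css.replace(/\/\*[\s\S]*?\*\//g, "");
  const unscoped: string[] = [];

  // Match "selector list { declarations }" blocks that contain no nested braces.
  const rule = /([^{}]+)\{[^{}]*\}/g;
  let match: RegExpExecArray | null;
  while ((match = rule.exec(withoutComments)) !== null) {
    for (const selector of (match[1] ?? "").split(",")) {
      const trimmed = selector.trim();
      if (trimmed === "") continue;
      // Require .spec-mockup as a whole class token, not a prefix of another name.
      if (!/\.spec-mockup(?![\w-])/.test(trimmed)) {
        unscoped.push(trimmed);
      }
    }
  }
  return unscoped;
}

// Usage: the second selector in the list leaks outside the figure.
const leaked = findUnscopedSelectors(
  ".spec-mockup .briefing-card, .briefing-title { color: red; }",
);
// leaked is [".briefing-title"]
```

An empty result means every selector is scoped. A non-empty result is the list a lint could print when rejecting the artifact.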

Still forbidden anywhere (including inside mockups):

<html>, <body>, <head> tags.

Why this exception is safe:

Bounded scope. The .spec-mockup selector requirement keeps custom CSS from leaking into other components or the layout chrome. A mockup with bad CSS breaks only itself.
Lint-enforced. It’s not a “trust the agent” exception — the script parses the CSS and rejects unscoped selectors before the artifact is promoted. A leaked rule fails generation.
Bounded use. Mockups are rare in specs. Most specs have zero; even UI-heavy ones have 1–3. The escape hatch isn’t load-bearing for ordinary content.
Visual mockups are not part of the design system. The catalog exists to keep cross-spec layout consistent for prose content. Mockups are illustrations, not prose — they’re meant to look like their own thing.

Why not extend the catalog to cover mockups instead? Considered and rejected:

Mockup layouts vary by surface (panel widget, mobile screen, modal, dashboard, email) — no single set of classes covers all surfaces without bloat.
Mockups need pixel-tweaked spacing to read as “this is the UI”; catalog components are tuned for prose, not UI fidelity.
Adding 30+ classes to handle UI variations would defeat the catalog’s “small, semantic, predictable” goal.
The lint enforcement makes a scoped escape hatch safer than trying to enumerate every possible mockup shape.

Bad pattern (do not generate)

<!-- Inline-styled callout: forbidden -->
<div style="background: #fef3c7; border-left: 4px solid #d97706; padding: 1rem;">
  <strong>Warning:</strong> The migration locks the table for ~30s.
</div>

<!-- Invented class: forbidden -->
<div class="spec-risk-heatmap">
  <div class="cell-high-likely">…</div>
</div>

<!-- <style> block in body_html: forbidden -->
<style>.callout-yellow { background: #fef3c7; }</style>
<div class="callout-yellow">…</div>

Good pattern

<!-- Use catalog class spec-callout--warning -->
<aside class="spec-callout spec-callout--warning">
  <strong>Migration lock:</strong> The migration locks the table for ~30s.
</aside>

<!-- Closest fallback when no risk-matrix class exists -->
<section class="spec-section" id="risks">
  <h2>Risks</h2>
  <aside class="spec-callout spec-callout--risk">
    Severity × likelihood matrix collapsed to flat list — see suggested_components.
  </aside>
  <ul>
    <li><strong>DB downtime:</strong> high severity, low likelihood — mitigation: …</li>
    <li><strong>Index drift:</strong> medium / medium — mitigation: …</li>
  </ul>
</section>

// In <slug>.json:
{
  "suggested_components": [
    {
      "name": "spec-risk-matrix",
      "when_to_use": "When a spec lists 3+ risks with severity and likelihood axes. Existing spec-callout--risk renders a flat list; a 2D matrix surfaces the high-severity/high-likelihood quadrant at a glance.",
      "sample_html": "<div class=\"spec-risk-matrix\"><div class=\"spec-risk-matrix__cell spec-risk-matrix__cell--hh\">…</div>…</div>",
      "sample_css": ".spec-risk-matrix { display: grid; grid-template-columns: repeat(2, 1fr); }",
      "used_in_this_spec": true
    }
  ]
}
