ADR-0027: Closed component vocabulary for plans-app spec review pages
Status
Proposed
Tags
docs, astro, plans, design-system, llm, ai, generation
Decision
Spec review pages in apps/plans use a closed component vocabulary: a fixed set of CSS classes declared in apps/plans/COMPONENT_CATALOG.md and styled in apps/plans/src/styles/spec-components.css. The agent authoring body_html MUST use only catalog classes: no inline style="" attributes, no <style> tags, no undeclared class names.
When the agent encounters content that does not fit any existing class, it:
- Picks the closest existing fallback class for the rendered output (so the page still renders).
- Records a suggested_components[] entry on the JSON describing what’s missing: name, when_to_use, sample_html, sample_css, used_in_this_spec: true.
Adoption is manual: a human adds the class to COMPONENT_CATALOG.md + spec-components.css, then re-runs affected specs with --force if regeneration is wanted. The system does not auto-adopt and does not track which specs would benefit from a regen after adoption.
The v1 catalog is 14 classes (see spec docs/superpowers/specs/2026-05-25-plan-html-review-app-design.md §6 for the table).
Why
The reference HTML at docs/superpowers/specs/2026-05-22-fieldforce-phase4-briefings-design.html is ~820 lines because the LLM that produced it invented all of its visual structure inline: bespoke color tokens, bespoke callout styles, a bespoke grid system. The output is beautiful, but it has three structural problems if used as the production pattern:
- Every spec looks different. A reader scanning 7 specs has to re-learn the layout each time. The plan’s §1 goal — “simple and easy to understand by human quickly” — depends on visual consistency, not per-page visual novelty.
- Design iteration cost compounds. When you want to tweak callout color, change the decision-card layout, or fix a heading hierarchy bug, the fix has to happen 7 times (once per spec) — or you regenerate everything and hope the next LLM run reproduces the previous look-and-feel, which it won’t.
- Search and accessibility suffer. Stable class names mean predictable DOM for screen readers and predictable weighting for the MiniSearch index. Per-spec invented classes can’t be weighted meaningfully (.css-7f3a2b isn’t a signal).
Alternatives considered
- Free-form HTML with inline styles. This is what the reference page does. Rejected for the consistency, iteration-cost, and accessibility reasons above. Beautiful one-offs are not what this workflow optimizes for.
- Open catalog: agent invents classes; script merges new ones automatically. Rejected because LLM-invented CSS drifts in naming (.spec-warn vs. .spec-warning vs. .alert-warn), spacing tokens, and color logic. After 10 specs you have a catalog full of near-duplicates and no coherent design system.
- Vocabulary + escape hatch (agent allowed to write inline styles for “unique cases”). Rejected because the escape will be over-used. The LLM will rationalize most non-trivial content as “unique,” and the catalog becomes advisory rather than authoritative. Better: ship a tight catalog and grow it deliberately.
- Track which specs would benefit from a regen after a class is adopted. Rejected as premature bookkeeping. Class adoption is rare (probably less than 1 per month at steady state), and the human running the adoption knows which specs requested the class, so they can --force those few specs by hand.
How it works
Catalog as prompt input. The prompt template inlines COMPONENT_CATALOG.md verbatim. The agent sees the catalog every time. The prompt’s “Visual structure rules” section instructs: “Use ONLY the classes listed in the Component Catalog. Do not invent new class names. Do not write <style> tags. Do not use inline style="" attributes. If content does not fit any existing class, pick the closest fallback and record a suggestion in suggested_components[].”
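A minimal sketch of how such a prompt could be assembled. The rule sentences are quoted from this ADR; the function name `buildPrompt`, its parameters, and the section headings are hypothetical, not taken from the actual prompt template.

```typescript
// Hypothetical prompt assembly. Only the rule sentences come from the ADR;
// the headings, names, and parameter order are assumptions for illustration.
const VISUAL_STRUCTURE_RULES: string = [
  "Use ONLY the classes listed in the Component Catalog.",
  "Do not invent new class names.",
  "Do not write <style> tags.",
  'Do not use inline style="" attributes.',
  "If content does not fit any existing class, pick the closest fallback " +
    "and record a suggestion in suggested_components[].",
].join(" ");

function buildPrompt(catalogMarkdown: string, specMarkdown: string): string {
  return [
    "## Component Catalog",
    catalogMarkdown, // inlined verbatim, never summarized or truncated
    "## Visual structure rules",
    VISUAL_STRUCTURE_RULES,
    "## Spec to render",
    specMarkdown,
  ].join("\n\n");
}
```

Placing the catalog ahead of the rules means the rules can refer to "the Component Catalog" as something the agent has already read; that ordering is a choice made for this sketch only.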
Suggestion payload shape:
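A sketch of one entry, assuming only the five fields named in the Decision section. The TypeScript types, the sample values, and the `isSuggestedComponent` guard are illustrative assumptions rather than the project's actual schema.

```typescript
// Sketch of one suggested_components[] entry. The five field names come from
// the Decision section; the types and the sample values are assumptions.
interface SuggestedComponent {
  name: string;            // proposed class name
  when_to_use: string;     // short description of the unmet pattern
  sample_html: string;     // markup the agent would have liked to emit
  sample_css: string;      // draft styling for the human to rework
  used_in_this_spec: true; // always true: a suggestion implies a real need
}

// Hypothetical example entry.
const suggestion: SuggestedComponent = {
  name: "spec-risk-matrix",
  when_to_use: "Likelihood-by-impact grids that a warning callout flattens.",
  sample_html: '<div class="spec-risk-matrix">...</div>',
  sample_css: ".spec-risk-matrix { display: grid; }",
  used_in_this_spec: true,
};

// Minimal runtime guard for entries parsed from the agent's JSON output.
function isSuggestedComponent(value: unknown): value is SuggestedComponent {
  if (typeof value !== "object" || value === null) return false;
  const v = value as Record<string, unknown>;
  return (
    typeof v.name === "string" &&
    typeof v.when_to_use === "string" &&
    typeof v.sample_html === "string" &&
    typeof v.sample_css === "string" &&
    v.used_in_this_spec === true
  );
}
```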
Adoption steps (human, manual):
- Read the suggestion payload in the JSON.
- Decide whether the pattern is reusable (rule of thumb: would 2+ future specs use this?).
- Add the class to apps/plans/src/styles/spec-components.css with proper design tokens, spacing, and dark-mode-readiness (the app is light-only today, but write classes that wouldn’t need rewriting later).
- Add an entry to apps/plans/COMPONENT_CATALOG.md with class name, when to use, sample HTML, and a screenshot or visual hint.
- Re-run the affected specs with --force to have the agent use the new class.
Naming conventions:
- Catalog classes use the spec- prefix (e.g. spec-callout, spec-decision-card). Modifiers use -- (e.g. spec-callout--decision).
- Classes are semantic, not visual (--decision, not --blue). Re-skinning later doesn’t break the agent’s vocabulary.
- A class is added when there’s a real pattern, not because the LLM suggested it once.
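The prefix and modifier conventions can be checked mechanically. A minimal sketch follows, assuming lowercase kebab-case names (the ADR does not state a casing rule) and using a hypothetical helper name. It cannot judge whether a name is semantic rather than visual; that remains a human call.

```typescript
// Hypothetical validator for the naming convention: "spec-" prefix,
// kebab-case block name, optional "--modifier". Lowercase-only is an
// assumption; the ADR does not specify casing.
const SPEC_CLASS = /^spec-[a-z0-9]+(?:-[a-z0-9]+)*(?:--[a-z0-9]+(?:-[a-z0-9]+)*)?$/;

function isConventionalClassName(name: string): boolean {
  return SPEC_CLASS.test(name);
}
```

Under these assumptions, spec-callout, spec-decision-card, and spec-callout--decision pass, while alert-warn and a dangling spec-callout-- do not.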
Known limitations
- Fallback rendering can look subtly wrong. When the agent picks spec-callout--warning because no spec-risk-matrix exists, the rendered page is “OK but not ideal” until adoption. This is a feature for v1 (we want unmet patterns to be visible in suggested_components rather than silently hidden), but it does mean the first run of a spec with novel content can look slightly off.
- The agent might fail to suggest. If the prompt is unclear or the agent is in a hurry, it can shoehorn content into an ill-fitting class without recording a suggestion. Mitigation: PR-review of the rendered page catches this. Long-term mitigation: tighten the suggestion-protocol section of the prompt template.
- No central enforcement. A future agent could violate the rules by writing inline styles. Mitigation: add a lint pass in the orchestration script that scans body_html for forbidden patterns (<style>, style=, undeclared classes) and rejects the artifact. This is a v1.1 enhancement, not v1.
- Catalog growth must be reviewed. Without discipline the catalog could bloat to 50+ classes that drift in style. Mitigation: keep the adoption bar high (2+ specs would use it), and audit the catalog quarterly.
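The lint pass proposed above could look roughly like the following sketch. It uses regex scanning for brevity, which is an assumption; a real pass would more likely walk a parsed DOM. The function name and the violation messages are hypothetical, and the spec-mockup exception is deliberately not handled here.

```typescript
// Sketch of the proposed lint: flags <style> tags, inline style attributes,
// and class tokens that are not in the catalog. Regex-based, so it can
// false-positive on prose that happens to contain ' style=' as text.
function lintBodyHtml(bodyHtml: string, catalog: ReadonlySet<string>): string[] {
  const violations: string[] = [];

  if (/<style[\s>]/i.test(bodyHtml)) {
    violations.push("<style> tag");
  }
  if (/\sstyle\s*=/i.test(bodyHtml)) {
    violations.push("inline style attribute");
  }

  // Collect every class token from class="..." and class='...' attributes.
  const classAttr = /\sclass\s*=\s*(?:"([^"]*)"|'([^']*)')/gi;
  let match: RegExpExecArray | null;
  while ((match = classAttr.exec(bodyHtml)) !== null) {
    const value = match[1] || match[2] || "";
    for (const token of value.split(/\s+/)) {
      if (token !== "" && !catalog.has(token)) {
        violations.push("undeclared class: " + token);
      }
    }
  }
  return violations;
}
```

An empty result would let the artifact through; any entry would reject it. Returning all violations rather than failing on the first gives the human one complete report per run.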
Rules for agents
- body_html MUST use only classes declared in apps/plans/COMPONENT_CATALOG.md.
- body_html MUST NOT contain <style> tags.
- body_html MUST NOT contain inline style="" attributes.
- body_html MUST NOT contain class="…" values that include any class not in the catalog (excluding native HTML semantics like <details>, which the catalog permits implicitly).
- When content does not fit any existing class, the agent MUST record an entry in suggested_components[] AND render the section using the closest existing fallback class. The agent MUST set used_in_this_spec: true on every suggestion (a suggestion implies the agent had a real need).
- The agent MUST NOT add classes to the catalog itself. Catalog edits are human-only.
- Layout-owned chrome (hero, sticky TOC, footer) MUST NOT be authored in body_html. The Astro layout supplies these from the JSON metadata; the agent’s body fragment is content only.
Exception: UI mockup figures
The closed-vocabulary rule has one narrow exception: <figure class="spec-mockup"> containers, where custom CSS is permitted. This exists because real UI mockups (panel widgets, modals, dashboards, mobile screens) have visual layout that is the message; no catalog component can faithfully represent a screen depiction.
Permitted inside <figure class="spec-mockup"> only:
- Inline style="…" attributes on any element.
- A <style>…</style> block.
<style> blocks inside spec-mockup:
- Every selector MUST descend from .spec-mockup (e.g. .spec-mockup .briefing-card, never bare .briefing-card).
- The body-HTML lint in scripts/generate-plan-html.ts parses the <style> block, splits its rules, and rejects any selector that doesn’t include .spec-mockup. This is automated, so agents can’t accidentally leak styles past the figure.
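The actual lint in scripts/generate-plan-html.ts is not reproduced in this ADR. As an illustration only, a scope check of this kind might be sketched as follows. The function name is hypothetical, and the sketch takes the stricter reading of the rule (each selector must start with .spec-mockup) and handles flat rules only: no @media blocks, comments, or nesting.

```typescript
// Naive sketch of a selector-scope check for CSS found inside a spec-mockup
// <style> block. Splitting on commas is an assumption that breaks on
// selectors such as :is(a, b); a real lint would use a CSS parser.
function findUnscopedSelectors(css: string): string[] {
  const unscoped: string[] = [];
  const rule = /([^{}]+)\{[^{}]*\}/g; // "selector list { declarations }"
  let match: RegExpExecArray | null;
  while ((match = rule.exec(css)) !== null) {
    for (const selector of match[1].split(",")) {
      const trimmed = selector.trim();
      // Must begin with the .spec-mockup class itself, not a longer name
      // such as .spec-mockup-wide.
      if (trimmed !== "" && !/^\.spec-mockup(?![\w-])/.test(trimmed)) {
        unscoped.push(trimmed);
      }
    }
  }
  return unscoped;
}
```

A non-empty result would fail generation. Checking each selector in a comma-separated list separately matters because a rule like `.briefing-card, .spec-mockup h3` contains the scoped class somewhere yet still leaks its first selector.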
Still forbidden, even inside spec-mockup: <html>, <body>, and <head> tags.
Why this exception is acceptable:
- Bounded scope. The .spec-mockup selector requirement keeps custom CSS from leaking into other components or the layout chrome. A mockup with bad CSS breaks only itself.
- Lint-enforced. It’s not a “trust the agent” exception: the script parses the CSS and rejects unscoped selectors before the artifact is promoted. A leaked rule fails generation.
- Bounded use. Mockups are rare in specs. Most specs have zero; even UI-heavy ones have 1–3. The escape hatch isn’t load-bearing for ordinary content.
- Visual mockups are not part of the design system. The catalog exists to keep cross-spec layout consistent for prose content. Mockups are illustrations, not prose — they’re meant to look like their own thing.
- Mockup layouts vary by surface (panel widget, mobile screen, modal, dashboard, email) — no single set of classes covers all surfaces without bloat.
- Mockups need pixel-tweaked spacing to read as “this is the UI”; catalog components are tuned for prose, not UI fidelity.
- Adding 30+ classes to handle UI variations would defeat the catalog’s “small, semantic, predictable” goal.
- The lint enforcement makes a scoped escape hatch safer than trying to enumerate every possible mockup shape.