NO. 002
SATURDAY · 09 MAY 2026 · field report on agentic web design
≈ 7 MIN READ

Markdown is over.
The agents prefer HTML.

Anthropic's Thariq Shihipar and Latent Space's swyx both shipped the same quiet pattern this week — skip the Markdown, generate the page. Meanwhile Codex's install curve goes vertical, METR clocks a 16-hour autonomous horizon, and a one-line inference fix gives Claude Code a 5× speedup on long contexts. The agentic stack is rearranging around its real artifact.

§01

Tooling

— what shipped, what shifted

Codex's install curve goes vertical

The week's shock chart: OpenAI Codex weekly installs surged past 86 million, dwarfing Claude Code's 7.2 million on the same a16z-sourced graphic circulating on X this week. Treat that number with care — Codex's GPT-5.5 rollout reset the install counter, and Claude Code's nightly auto-updates don't fully register the same way — but the directional signal is real. The New Stack's hands-on review calls Codex "the strongest Claude Code rival yet," with parity on most agentic tasks and a stronger long-context story since the GPT-5.5 update.

Design implicationThe agentic-CLI category is plural now. Tools come and go; the prompt files, rules of engagement, and project context you write for them are the durable artifacts in your repo.

METR clocks a 16-hour horizon

METR's evaluation of an early Claude Mythos Preview reports a 50% time horizon of at least 16 hours on agentic tasks — more than double the next-best model and pinned at the top of METR's measurement range. Sixteen hours is the difference between "an agent that drafts your page" and "an agent that ships your page, deploys it, watches the analytics, and fixes the broken card on the second pass." Designers should start sketching for the second case now: longer-running agents need legible status pages, structured plan files, and review surfaces — which is exactly the shift §03 is about.

Stable prompt prefixes are infrastructure

An infrastructure cameo worth noting: NVIDIA found an unstable billing header in Claude Code prompts that was busting KV-cache reuse in their Dynamo inference engine. Removing the variable header dropped time-to-first-token roughly 5× on 52k-token contexts. If you're authoring system prompts or skill files for an in-house agent, treat the first 10 KB like a cache key — bytes that change every request cost you measurable seconds, every request, forever.

§02

Technique

— prompts, patterns, evaluations

Character is a training problem

Two pieces this week converge on the same idea: agent character is a training problem, not a runtime one. AI engineer Victor Taelin published a draft "no-hacks" coding rulebook he wants pretrained into models on trillions of tokens — a flat ban on monkey patches, workarounds, and partial fixes. Read uncharitably it's a manifesto. Read charitably, it's an honest recognition that if you want an agent to refuse a quick hack, you cannot bolt that refusal on at the prompt — the model will know which way is downhill.

Stories work better than rules

The post-training counterpart shipped at Anthropic. Their misalignment-mitigation paper shows that pairing Claude's constitution with fictional stories of aligned AI behavior cuts agentic misalignment by more than 3× on tests like blackmail and financial crime.

Practical moveYour AGENTS.md shouldn't read like a SOC-2 policy. Write the rule, then write the example of an aligned agent obeying the rule under pressure. Future-you will thank past-you for the second paragraph.

Links are part of the truth signal

One cold splash from a small but pointed test: ChatGPT confidently dismissed real reporting as false when given excerpts without source URLs, and reversed itself once the links were attached. If you're building agents that read your CMS or your inbox, surfaced links aren't decoration — they are part of the truth signal the model uses to decide what to believe. Strip them at your peril.

§03

Workflow

— a studio practice worth stealing

Skip the Markdown, generate the page

The most quietly important workflow story of the week is short and almost banal. Anthropic engineer Thariq Shihipar wrote that he has stopped maintaining Markdown source files for most of his planning, review, and content artifacts; he asks Claude Code to generate full HTML instead. Simon Willison's writeup picked it up under the heading "the unreasonable effectiveness of HTML," and tech essayist swyx independently adopted the same pattern. The argument is mechanical, not aesthetic: anything you intend to review, click, compare, edit, or share is closer to a real workspace as HTML than as Markdown.

A Markdown file is a compressed sketch of a page. If your agent can write the page directly, the sketch was never load-bearing.

The cost-benefit is real

HTML takes two to four times longer to generate than Markdown, and Git diffs balloon. The benefit is that the model is now writing the actual artifact, not a lossy intermediate. Side-by-side draft comparisons, in-place editing widgets, expand/collapse outlines, a code-review surface tuned to your team rather than GitHub's defaults — all are easier to produce as a self-contained HTML file than to fight Markdown into. A Claude skill called html-artifacts is already on GitHub codifying the pattern; give it a few days and your repo's skills/ folder will have one too.

What to change in your studio

Keep Markdown for canonical text content (READMEs, blog posts, anything indexed by search), but stop maintaining Markdown intermediates for plans, reviews, dashboards, and one-off tools. Hand the agent a styled template HTML, give it a clear filename convention, and let it write each artifact whole.

Mental modelTreat the HTML the way you used to treat Notion pages: ephemeral, regeneratable, easier to throw away than to maintain.
§04

Prompt Lab

— a concrete prompt with commentary

Recreate this page

Paste this into your AI design or build tool to reproduce this issue's visual system.

Design a single self-contained HTML page in a Frutiger Aero idiom, the glossy aqua-era web of 2004-2008. The content is a daily design-news briefing: a top nav, an issue masthead with number, date, and read time, a hero headline with a one-line deck, a boxed art-direction note, numbered sections of linked news items with one or two sentences of context each, one pullquote, a monospace prompt block, a sources list, and a colophon. Treatment: a sky ground from #DCEEFF down to #B7DEF5 with drifting cumulus clouds and exactly one lens flare; glossy aqua panels with white inner highlights and soft bevels; beveled tabs for the nav; capsule buttons; a peach horizon glow with one sunset-orange accent #FF6B00; royal-blue links #0066CC and teal secondary accents #00A99D. Type: Tahoma or Verdana ultra-bold for display with a chrome gradient fill on the hero headline only, Verdana or Tahoma body, Times-style italic kickers, Courier-style monospace captions. Guardrails: body text at least 18px with line height 1.6 or more, prose in the body face and never in monospace, line length 60-75 characters, WCAG AA contrast on every surface, hover and focus states on real links, decoration in the margins and panels rather than under running prose, no fake readable text in images, and no default AI styling (no purple-blue gradients, no glow, no pill-shaped everything). Gloss and gradients live on panels, chrome, and the hero sky, never under body text, which sits on a plain light surface.

Works in v0, Lovable, Bolt, Figma Make, Beaver Builder AI, or as a Claude / GPT system brief.

§05

Field Note

— short editorial synthesis

The substrate of agent-built websites is HTML. The substrate of agent-built workflows is increasingly HTML too. The constitution that shapes agent behavior is plain text written like a manual. The pattern across all three: humans write the rule, agents write the artifact — and as the agents get better, the artifact moves down a level of abstraction, closer to the metal of the platform, while the human's work moves up, closer to intent. Markdown was a halfway house we built for ourselves; it turns out the agents don't need it.

§06

Sources

— full link list
No. 002 · 09 May 2026 Type set in Outfit & Public Sans, with DM Serif Display italic for accents. Palette: Frutiger Aero — sky, mint, royal aqua, sunset reflection.

A field experiment from the team behind Beaver Builder AI.