AI Builders Brief — 2026-05-09

Follow builders, not influencers.

25+ builders tracked

TL;DR

Albert said Claude’s time horizon jumped past rivals, while Askell argued alignment should teach models what to be, not just what to avoid. Levie pushed token budgets over headcount, and Tan said personal software finally turned real.

BUILDER INSIGHTS

Alex Albert AnthropicAI

Claude’s time horizon jumps past rivals

An early Claude Mythos Preview snapshot reportedly posts a time horizon more than double that of the next-best model on METR’s 80% success-rate benchmark. That’s a strong signal that Anthropic is pushing hard on long-horizon task execution, not just chat quality.

Peter Steinberger OpenClaw

More agent skills, less prompt babysitting

The more skills you give Codex, the less you have to micromanage it — that’s the core point here. It’s a clean argument for building agents with real capabilities instead of endlessly polishing prompts.

Amanda Askell AnthropicAI

Alignment should teach models what to be, not just what to avoid

She argues alignment work shouldn’t only be about stopping bad behavior — it should also give models a clear, positive vision of what they’re for and why. That’s a more constructive framing for Anthropic’s safety work: not just constraint, but shaping the model’s character.

Aaron Levie CEO, box

Enterprise will need token budgets, not just headcount

As agents take on longer-running work, token budgeting becomes a real enterprise planning problem — more like managing headcount, marketing spend, or lunch budgets than classic IT costs. He says companies will need new software and new controls to route compute to the highest-value work, and that this could spawn a startup category of its own.
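As a rough illustration of what "routing compute to the highest-value work" could look like, here is a minimal sketch that splits a fixed token budget across agent tasks by estimated value per token. Everything in it is an assumption made for illustration: the `AgentTask` fields, the `allocate_tokens` helper, the greedy rule, and the sample numbers do not come from Levie's post or any real product.

```python
from dataclasses import dataclass


@dataclass
class AgentTask:
    name: str
    requested_tokens: int   # tokens the task is expected to consume
    expected_value: float   # hypothetical business-value estimate, any unit


def allocate_tokens(tasks: list[AgentTask], budget: int) -> dict[str, int]:
    """Greedily fund tasks by value per token until the budget runs out."""
    allocation = {task.name: 0 for task in tasks}
    remaining = budget
    # Highest value per requested token goes first; max(..., 1) avoids dividing by zero.
    ranked = sorted(
        tasks,
        key=lambda t: t.expected_value / max(t.requested_tokens, 1),
        reverse=True,
    )
    for task in ranked:
        if remaining <= 0:
            break
        granted = min(task.requested_tokens, remaining)
        allocation[task.name] = granted
        remaining -= granted
    return allocation


# Hypothetical monthly plan: three agent workloads competing for 8M tokens.
tasks = [
    AgentTask("contract-review", requested_tokens=4_000_000, expected_value=80.0),
    AgentTask("support-triage", requested_tokens=2_000_000, expected_value=60.0),
    AgentTask("log-summaries", requested_tokens=5_000_000, expected_value=25.0),
]
plan = allocate_tokens(tasks, budget=8_000_000)
```

In this toy plan the two higher-value workloads are funded in full and the lowest-value one gets whatever is left. A real system would presumably add per-team caps, approvals, and mid-cycle rebalancing.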

Thariq AnthropicAI

HTML is replacing markdown for AI-first workflows

He says he’s stopped writing markdown for most things and now uses Claude Code to generate HTML instead. The take: HTML is becoming the better default for AI-assisted docs and notes, because it’s more flexible and easier for models to shape into something useful.

Zara Zhang

YouTube gets a realtime voice copilot

Zhang built a browser extension that watches YouTube with you and uses OpenAI’s Realtime API to answer questions about what was just said. The neat bit: it separates the video’s audio from your voice, so it stays quiet unless you actually ask something. Zhang also pushed 32 HTML slide templates into AnyGen, making them plug-and-play even without a coding agent.

Matt Turck FirstMarkCap

AI agents may need seat-like pricing, not tokens

He argues agent pricing won’t stay purely consumption-based: token costs matter, but enterprise agents also need identities, roles, auth, budgets, and audit logs. That starts to look less like metered API usage and more like a weird new kind of seat model.
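To make the "seat" idea concrete, here is a minimal sketch of an agent record that bundles an identity, a role, a token budget, and an audit log. The `AgentSeat` class, its field names, and the budget rule are hypothetical and only illustrate the list above. They are not drawn from Turck's post or any vendor's pricing model.

```python
from dataclasses import dataclass, field


@dataclass
class AgentSeat:
    agent_id: str        # hypothetical identity for the agent
    role: str            # hypothetical role label used for access decisions
    token_budget: int    # tokens this seat may spend in a billing period
    tokens_used: int = 0
    audit_log: list[str] = field(default_factory=list)

    def charge(self, action: str, tokens: int) -> bool:
        """Record an action against the seat; refuse it if it would exceed the budget."""
        allowed = self.tokens_used + tokens <= self.token_budget
        if allowed:
            self.tokens_used += tokens
        status = "ok" if allowed else "denied"
        # Every attempt is logged, including the denied ones.
        self.audit_log.append(
            f"{self.agent_id} [{self.role}] {action}: {tokens} tokens, {status}"
        )
        return allowed


seat = AgentSeat(agent_id="agent-007", role="invoice-reviewer", token_budget=10_000)
first = seat.charge("summarize invoice batch", 6_000)
second = seat.charge("re-run full audit", 6_000)
```

The first call fits within the budget and the second is denied, but both land in the audit log. The unit being managed is the seat and its permissions, with token cost as one attribute of it.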

Garry Tan CEO, ycombinator

Personal software is finally becoming real

He says the next wave is "personal software" — and points to a 1M-token coding agent that supposedly runs on a 128GB MacBook Pro as the proof point. The vibe: AI is pushing serious software creation onto a single machine, which is exactly the kind of shift YC loves to see.

Dan Shipper CEO, every

AI hype lags — trade the gap

He says the market is still 3–4 months behind the AI curve, which creates a real edge for people willing to act before the crowd catches up. His example: Codex will be mainstream hype soon, so the smart money should already be positioning now.

PODCAST HIGHLIGHTS

AI & I by Every

The Secrets of Claude's Platform From the Team Who Built It

Claude’s platform is becoming the agent infrastructure layer, not just an API

The Takeaway: Claude’s platform is shifting from “model access” to the full stack needed to ship autonomous agents fast.

The real product isn’t a completion endpoint anymore; it’s a set of opinionated primitives that help Claude get better outcomes with less work.
The hard part isn’t prompt tinkering — it’s production infrastructure: persistence, sandboxing, credentials, and keeping agents alive at scale.
Model and harness are getting tightly coupled, so “generic, hot-swappable” setups are losing ground to model-specific agent design.

Angela, head of product for Claude’s platform at Anthropic, and Caitlin, head of engineering, describe a philosophy that’s more pragmatic than flashy: make the model easier to use by baking in the boring parts. Their view is that the platform should evolve toward “whatever it’s like the set of primitives and infrastructure that enables you to basically get the outcome as fast as possible.” That means the Messages API, file systems, skills, code execution, web search, memory, and managed infrastructure, not just tokens in and out.

Their sharpest point is that most teams misjudge where the pain lives. People assume harness engineering is the hard part, but the wall usually shows up later: “everyone hits an infrastructure wall.” Once an agent works on a Mac mini or in a quick prototype, production becomes a mess of uptime, state, storage, security, and long-running jobs. That’s why Claude Managed Agents exists: Anthropic built the thing it kept rebuilding for itself.

They also argue the old “generic harness, swap models later” mindset is fading. As models diverge, the best results come from pairing the harness and model more deliberately. In other words, the platform is no longer neutral plumbing — it’s part of the model’s behavior, and that path dependence matters.
