AI Builders Brief

Follow builders, not influencers.

2026.04.20

25+ builders tracked

TL;DR

Rauch said an AI-accelerated attack exposed Vercel’s weak link, while Kothari warned AI will supercharge attacks too. Garry Tan called Claude Code the new app factory, and Peter Yang noted agents still flaked on boring cron jobs.

BUILDER INSIGHTS (8)
01
Guillermo Rauch, CEO, Vercel

AI-accelerated attack exposed Vercel’s weak link

He says a Vercel employee was compromised through a breached AI platform customer account, then the attacker used that foothold to reach Vercel environments. The company thinks the group was highly sophisticated — and possibly sped up by AI — but says customer impact looks limited and it’s already pushing new env-var security controls plus secret rotation guidance.

02
Aaron Levie, CEO, Box

AI won’t shrink jobs — it’ll make them harder

AI productivity gains won’t replace most roles so much as raise the bar: when everyone gets better tools, the job itself expands. He argues the engineer, paralegal, editor, and analyst of the future will be judged on bigger, more complex work — not the old baseline tasks.

03
Zara Zhang

AI shifts teams from building to listening

She argues product teams should spend way more time talking to users and customers, because AI is making execution cheap while problem selection gets more important. With smaller teams and fewer internal meetings, the real edge becomes understanding the problem — then handing the messy implementation to agents.

04
Nikunj Kothari, Partner, FPV Ventures

AI will supercharge attacks, not just defenses

Cybersecurity is headed for a bigger market because attack volume will keep rising as model capabilities improve. The real weak point stays the same: humans, which means infra providers and security teams are about to get a lot more pressure.

05
Peter Yang

AI agents still flake on boring cron jobs

He says OpenClaw + GPT couldn’t reliably handle a simple weekly stats recap email, even after a lot of back-and-forth and model switching. The takeaway is blunt: agentic workflows still feel brittle for mundane follow-through, and he’s hoping GPT-5.5 or similar finally makes them dependable.

06
Matt Turck, FirstMark

Serverless went headless; diligence got harder

He jokes that VCs should do more diligence, but software has gotten so abstract — first serverless, now headless — there’s less surface area left to inspect. The real point: in a world of thinner infrastructure, investors need sharper judgment, not just more checklists.

07
Garry Tan, CEO, Y Combinator

Claude Code is becoming the new app factory

He says he spots a need in his own workflow, has Claude Code build it, then ships it open source. The latest example is GStack v1.4, which adds a new /make-pdf skill and is built to work well with OpenClaw/Hermes and Claude Code as a tool. He’s also pushing OpenClaw to replace crons and subagents where possible, with better plugin APIs as the real fix.

08
Nan Yu, Head of Product, Linear

Bad PR can be the best PR

He argues that sounding bad at PR can actually make people trust you more, because it reads as less polished and less deceptive. He also frames some press releases as basically asking for a fake choice — like picking a red or green Lambo instead of questioning the purchase itself.

PODCAST HIGHLIGHTS (1)

OpenAI is betting on longer-horizon autonomy, not just smarter chatbots

The Takeaway: The real frontier isn’t chatty AI — it’s models that can work for days, verify progress, and discover things.

  • Math and coding became the proving ground because they’re hard but checkable; that same logic is now being pushed into messier domains like science, medicine, and law.
  • The next leap isn’t “more prompts,” it’s longer autonomy: models that can evaluate partial progress, use more compute at test time, and keep going on open-ended tasks.
  • OpenAI’s internal focus has shifted from benchmark bragging rights to practical research leverage, because “the models are going to drive a lot of that.”

Jakub Pachocki, OpenAI’s chief scientist, sounds less interested in hype than in the mechanics of making AI useful. His view is that coding tools like Codex are a signal, not the destination: OpenAI already uses them for most actual coding, and he expects the pattern to extend into research workflows. The same goes for math. Benchmarks like IMO problems mattered because they were a clean North Star (“Math is very measurable,” he says), but the deeper value was training models to reason over long, difficult, verifiable tasks.

That’s why he keeps returning to horizon length. A model doesn’t need to be told “go solve alignment” tomorrow; it needs to get better at making partial progress on a long project, checking itself, and staying useful over time. He thinks RL will matter beyond code, but not as a copy-paste of today’s pipelines. The bigger shift may be models that adapt through context and existing interfaces — Slack, tools, workflows — rather than forcing companies to build bespoke harnesses around them.

The most revealing line: “We are no longer really purely building brains in the sky.” The message is clear: the company is optimizing for models that can touch the real world, accelerate research, and eventually become collaborators, not just assistants.

ARCHIVE
2026-04-19 8 items

Rauch said design was becoming autonomous, not just a tool. Steinberger made CodexBar safer, faster, and lighter; Anthropic added Auto Mode to Claude Code and showed benchmark scores can swing with eval infra. Levie warned AI agents would force constant rewrites.

2026-04-18 13 items

Weil folded OpenAI for Science into core teams, while Google split Flow into music-making and Josh Woodward added remix control. Albert and Peter Yang showed Claude Design turning taste into production-grade assets, and Levie, Ryo Lu, and No Priors all argued AI wins when it serves workflows, not replaces them.

2026-04-17 15 items

Anthropic launched Managed Agents to decouple agent infra, while Claude Code defaulted to xhigh effort and got a usage-focused upgrade. Rauch said agents need durability over clever prompts, and Swyx split AI engineering into slop vs rigor.

2026-04-16 14 items

Rauch said teams were building their own design factories, while Steinberger called open-source AI security a full-time arms race. Masad priced OSS trust in compute, and Woodward shipped Gemini on Mac in 100 days.

2026-04-15 15 items

Woodward said Gemini’s turning into a test-prep machine, Albert called Claude Code the whole workspace, and Cat Wu shipped a desktop control center with parallel sessions and review tools. Rauch also argued agent builders need elastic Postgres, not vibes.

2026-04-14 10 items

Rauch said the moat moved from code to the code factory, while Levie argued every team now needed an agent wrangler. Cursor leaned into customizable multi-agent views, Replit added region controls, and No Priors backed Periodic Labs’ bet that AI could learn atoms by running experiments.

2026-04-13 10 items

Amjad Masad said Apple’s 50th has turned into a PR disaster, while Aaron Levie argued agents would create more work, not cut jobs. Rauch pushed engineers into the customer hot seat, and Claude warned teams to harden security fast.

2026-04-12 11 items

Thariq said Claude Code now handles TurboTax pain, while Rauch called microVM sandboxes the new compute layer. Aditya Agarwal pushed memory over loops, and Levie argued AI won’t shrink law—it’ll inflate it.

2026-04-11 16 items

Claude pushed into Word with tracked edits, and Claude Code moved planning to the web with auto mode approvals. Garry Tan called agents the Altair BASIC era, while Aaron Levie warned software without a real API gets left behind.

2026-04-10 12 items

Karpathy said free ChatGPT lagged while frontier coding models didn’t. Albert pushed cheap-to-smart escalation, Rauch said cloud infra went agent-native, and OpenAI’s next leap looked like autonomy—not chat.

2026-04-09 16 items

Woodward gave Gemini a second brain with Notebooks, while Anthropic shipped Managed Agents to move Claude from prompt to production. Rauch called the web AI’s native OS, and Levie, Masad, and Shipper all bet agents will do the work, not the people.

2026-04-08 12 items

Albert teased Anthropic’s Mythos Preview, Cat Wu juiced Claude Code’s CLI tricks, and Peter Steinberger patched CodexBar with 2 providers plus billing fixes. Levie said agents are eating knowledge work, while Nikunj Kothari preached retention over launch hype.

2026-04-07 8 items

Levie said agents won’t erase work, just push it up a layer; Yang argued they’ll shrink teams, not ambition. Garry Tan flagged an unpatched file leak in Claude’s coding env, while Kothari called Anthropic’s revenue ramp absurdly fast.

2026-04-06 10 items

Rauch said v0 now builds physics, not just UI, while Karpathy noted GitHub Gists have weirdly good comments. Levie argued AI efficiency creates more work, not less, and Tan called open source’s golden age.

2026-04-05 4 items

Karpathy pushed “your data, your files, your AI.” Levie argued context beat raw model IQ in enterprise AI. Garry Tan said GStack kept shipping security fixes fast, while No Priors spotlighted Periodic Labs’ bet on atoms, not just text.

2026-04-04 9 items

Claude plugged into Microsoft 365 everywhere, Swyx said Devin one-shot blog-to-code, and Peter Steinberger called out GitHub’s API as still not built for agents. Aaron Levie hit the context wall, while Garry Tan shipped a DX review tool from his own stack.

2026-04-03 10 items

Claude landed computer use on Windows, Karpathy argued LLMs should build your wiki, and Amjad Masad pushed Replit deeper into enterprise sales. Peter Yang said Cursor 3 got out of the agent’s way, while Peter Steinberger warned AI slop was flooding kernel security with real bugs.

2026-04-02 12 items

Steinberger called plan mode training wheels, while Thariq gave Claude Code a mouse-friendly renderer and Cat Wu showed sessions jumping phone-to-laptop. Masad framed Replit as an OS for agents, Rauch said Vercel signups compounded fast, and Anthropic’s infra tweaks swung coding scores by 6 points.

2026-04-01 4 items

Levie said AI productivity hit the enterprise risk wall, while Weil argued proofs got cleaner, not just better. Agarwal floated public source code as the new prod debugging, and Data Driven NYC claimed one founder could run a company if agents handled the layers below.

2026-03-31 15 items

Karpathy warned unpinned deps can turn one hack into mass pwnage, while Rauch and Levie said agents still need human guardrails and redesigned workflows. Meanwhile Claude Code got enterprise auto mode, Replit added built-in monetization, and Swyx spotted “Sign in with ChatGPT” already live.

2026-03-29 7 items

Andrej Karpathy highlighted how LLMs can argue any side, suggesting we use it as a feature. Guillermo Rauch finally shipped his dream text layout, bringing his vision to life. Meanwhile, Amjad Masad claimed AI is democratizing app building and elevating top engineers.

2026-03-28 7 items

Andrej Karpathy suggested leveraging LLMs' ability to argue any side as a feature. Guillermo Rauch turned text layout dreams into reality with Vercel's latest feature. Meanwhile, Amjad Masad claimed AI is democratizing app building, liberating top engineers for bigger challenges.