AI Builders Brief

Follow builders, not influencers.

2026.05.02

25+ builders tracked

TL;DR

Levie said agents will swell systems of record, not replace them. Masad marked Replit’s 10th with 24 hours free, Anthropic fixed Claude Code regressions and added connectors plus cross-session memory, and Steinberger called Codex’s /goal mode the real deal.

BUILDER INSIGHTS
8
01
Aaron Levie (CEO, Box)

Agents will swell systems of record, not replace them

He says the real AI winner is the software that sits underneath agent-heavy workflows: guardrails, security, compliance, data, and records. More agents means more code, contracts, invoices, and payments — which should lift the systems that manage all that work, not just the flashy front-end tools.

X
02
Garry Tan (CEO, Y Combinator)

Asset seizure talk could blow up California’s tax base

He argues that proposing asset-seizure measures is a fast way to scare off wealthy taxpayers and crater California’s revenue base. The point is blunt: middle-class taxpayers would end up eating the lost billions.

X
03
Nikunj Kothari (Partner, FPV Ventures)

AI + Stripe for a tiny paid side project

He says he built a house-hunting report product with Railway, Conductor, and Claude, then wired in Stripe so each report sells basically at cost. It’s a scrappy first paid project, and he’s even offering promo codes to people actively house hunting in exchange for feedback.

X
04
Aditya Agarwal (CTO, South Park Commons)

Ignore the product, and you’re already dying

He says the fastest way to kill a company is to obsess over everything except the product. It’s a blunt reminder from the Bevel Health co-founder and former Dropbox CTO: if the product isn’t getting better, nothing else matters.

X
05
Zara Zhang

Treat coding agents like cofounders, not employees

She says the best way to use coding agents is as a cofounder: don't just hand them tasks; bring them the problem and the context, and ask for their take. That's a useful mental shift for builders using AI: less command-and-control, more collaborative problem solving.

X
06
Peter Steinberger (OpenClaw)

Codex’s new /goal mode is the real deal

He says Codex’s new /goal feature “slaps,” which is about as strong a product endorsement as you’ll get from an AI tinkerer. He also joked that he had to pay up to get xAI working again, so the subtext is clear: these tools are getting good enough that people are willing to fight through friction to keep using them.

X
07
Amjad Masad (CEO, Replit)

Replit turns 10 and goes free for 24 hours

Replit is marking its 10th birthday by making the product free for 24 hours, a neat flex for a company that’s spent the last decade trying to make coding accessible. The bigger message: this wasn’t a quick startup win — it’s been a 2011-era mission to help millions learn and ship.

X
08
Dan Shipper (CEO, Every)

Humans still learn faster than models

He argues the key asymmetry in AI is simple: models know more than any one person, but people still learn faster than models do. That’s the real edge for builders at Every — use the model for breadth, then let human speed turn it into judgment.

X
BLOG UPDATES
3
Anthropic Engineering

An update on recent Claude Code quality reports

Anthropic fixes three Claude Code regressions, resets limits

Lead: Anthropic says recent quality complaints about Claude Code came from three separate product changes—not the API or core model—and all have now been fixed, with usage limits reset for subscribers.

Numbers:

  • Issues were resolved by April 20 (v2.1.116).
  • The default reasoning-effort change was rolled back on April 7.
  • The caching bug was fixed on April 10.
  • The verbosity prompt change was reverted on April 20.
  • Anthropic says one evaluation showed a 3% drop from the prompt change.

So What: The practical takeaway is that Claude Code users should now see restored behavior, with defaults set back to high effort for most models and xhigh for Opus 4.7. Anthropic also says the API was unaffected, but the product-layer bugs made Claude seem “forgetful,” repetitive, or less intelligent because different slices of traffic were hit at different times. The company is tightening prompt-change controls, broadening per-model evals, adding soak periods and gradual rollouts, and improving internal code review. As Anthropic put it, “This isn’t the experience users should expect from Claude Code.”

Claude Blog

New connectors in Claude for everyday life

Claude adds everyday-life connectors for travel, shopping, and more

Lead: Claude is expanding its connector ecosystem beyond work tools to include everyday apps like AllTrails, Instacart, Audible, Tripadvisor, TurboTax, Uber, and more, with smarter in-chat suggestions for the right service.

Numbers:

  • Claude’s directory has grown to 200+ connectors since launching in July 2025.
  • New connectors include AllTrails, Audible, Booking.com, Instacart, Intuit Credit Karma, Intuit TurboTax, Resy, Spotify, StubHub, Taskrabbit, Thumbtack, Tripadvisor, Uber, Uber Eats, and Viator.
  • Connectors are available on all plans; mobile is in beta.

So What: This makes Claude more useful as an action layer across daily life: ask for a hike, a reservation, a grocery cart, or a flight, and Claude can surface the right app inside the same conversation. Anthropic says the system is ad-free, has no sponsored answers, and that “before it books or purchases something on your behalf, it’s designed to check with you first.” For builders, the message is clear: if your product would benefit from being callable inside Claude, you can submit it to the directory and reach users where they already work and plan.

Claude Blog

Built-in memory for Claude Managed Agents

Claude Managed Agents now have built-in cross-session memory

Lead: Claude Managed Agents now ship with public beta memory, letting agents learn across sessions through a filesystem-based memory layer that’s designed for production control and portability.

Numbers:

  • Rakuten says memory cut first-pass errors by 97%.
  • Wisedocs reports verification is 30% faster.
  • Memory supports multiple agents concurrently on the same store without overwriting each other.

So What: For builders, this removes the need to bolt on custom retrieval or memory infrastructure: memories are stored as files, can be exported and managed via the API, and come with scoped permissions, audit logs, rollback, and redaction. Anthropic says the system is tuned so “our latest models save more comprehensive, well-organized memories and are more discerning about what to remember.” Teams can share org-wide or per-user stores, trace what each agent learned in the Claude Console, and use memory to carry context, avoid repeated mistakes, and speed up long-running workflows.

PODCAST HIGHLIGHTS
1

Inference is becoming the real AI moat, not just a commodity

The Takeaway: The winners in AI won’t just own models — they’ll own the workflow signal, the compute, and the post-training loop.

  • Custom models are already the default for serious AI companies; Baseten says 90-95% of its tokens are on tailored inference, not vanilla open-source weights.
  • The application layer survives because companies own unique user signal and workflow data that frontier labs can’t easily copy.
  • The real bottleneck isn’t just GPUs — it’s supply, operations, and capital structure, which is why inference is starting to look like an infrastructure-finance business.

Tuhin Srivastava, CEO of Baseten, is building what he calls the inference cloud, and his view is blunt: the market has moved from “can AI work?” to “how fast can we customize it?” He says open-source models have crossed a capability threshold, post-training is now mainstream, and customers increasingly want to “own their inference more and more.” That shift is why Baseten has scaled 30x in a year.

His core argument is that the durable moat lives in workflows, not just model weights. A company like Abridge, for example, captures clinician edits and downstream actions inside hospital systems — signal a frontier lab can’t access. That lets the application layer train better models on its own reward data. In Tuhin’s words, “the thing that is valuable to a company is the user signal that they can gather that only they can gather.”

He’s equally sharp on infrastructure. Baseten runs 90 clusters across 18 clouds and still sits in the mid-90s on utilization. Capacity is so tight that the company holds a daily 4 p.m. meeting just to manage supply. And because inference is sticky, the software layer matters: “GPUs as a service is not sticky,” but integrated inference software is.

His bet: the next moat is a mix of custom models, compute access, and the ability to turn production usage into better models faster.

STAY UPDATED

Daily builder insights, straight to your inbox.

Prefer RSS? Subscribe via RSS

ARCHIVE
2026-05-01 13 items

Karpathy said LLMs were new software, not just faster software. Rauch, Steinberger, and Cat Wu shipped agent upgrades, while Levie, Masad, and Agarwal bet agents ate UI, made Replit its own customer, and turned AI into both attack and defense.

2026-04-30 14 items

Josh Woodward said Gemini can now generate and export files, while Ryo Lu framed Cursor as a local-cloud agent stack. Aditya Agarwal called agents Linux-era jerry-rigging; Aaron Levie, Zara Zhang, and Garry Tan all bet the real work now is building internal agent ops, onboarding, and tests.

2026-04-29 9 items

Claude and Claude Code pushed into creative tools and papercuts, while Rauch said devtools now served agents more than humans. Masad warned free dev tools wouldn’t survive bot abuse, and Peter Steinberger showed AI commit bots already reviewed, fixed, and re-reviewed.

2026-04-27 8 items

Altman called for an agent-first reset of OSes and the internet, while Rauch said coding agents were the base layer of superintelligence. Levie argued AI hid hard parts instead of killing jobs, and Anthropic shipped Auto mode for safer no-prompt Claude Code.

2026-04-26 10 items

Altman said OpenAI still lags on frontend but wins on brains. Levie bet on weird future talent, Masad said every company turns into a cybersecurity company, and Tan showed Claude Code with a browser sidecar.

2026-04-25 16 items

Altman dropped GPT-5.5 into the API, and Cursor’s Ryo Lu bet on it plus Composer 2. Peter Yang said it can spit out a Star Fox clone; Anthropic shipped Managed Agents, while Replit, NotebookLM, and Discord all got sharper.

2026-04-24 13 items

Altman said Codex moved from demo to company-wide rollout, while Claude shipped persistent cross-session memory and everyday-life connectors. Masad shrugged off “Chinese distillation” panic, and Dan Shipper/Peter Yang said GPT-5.5 finally just does the work and clears game-build tests.

2026-04-23 13 items

Claude added interactive charts and Claude Code desktop with parallel sessions; Josh Woodward shipped Gemini conversation branching. Amjad Masad said static analysis lifted LLMs 90%+, while Aaron Levie and Guillermo Rauch framed agents and petabyte-scale hunts as the new battleground.

2026-04-22 10 items

Altman said OpenAI wanted you swimming in AI—and GPUs. Masad pushed for a fairer software market, Levie said enterprise agents needed humans to actually land, and Shipper showed agents could now read voice notes.

2026-04-21 10 items

Rauch said delete isn’t rotation, Levie argued agents need operators, not just users, and Steinberger kept OpenClaw pushing AI into real workflows. Shipper backed two-agent setups, while Claude warned teams to harden security now.

2026-04-20 9 items

Rauch said an AI-accelerated attack exposed Vercel’s weak link, while Kothari warned AI will supercharge attacks too. Garry Tan called Claude Code the new app factory, and Peter Yang noted agents still flaked on boring cron jobs.

2026-04-19 8 items

Rauch said design was becoming autonomous, not just a tool. Steinberger made CodexBar safer, faster, and lighter; Anthropic added Auto Mode to Claude Code and showed benchmark scores can swing with eval infra. Levie warned AI agents would force constant rewrites.

2026-04-18 13 items

Weil folded OpenAI for Science into core teams, while Google split Flow into music-making and Josh Woodward added remix control. Albert and Peter Yang showed Claude Design turning taste into production-grade assets, and Levie, Ryo Lu, and No Priors all argued AI wins when it serves workflows, not replaces them.

2026-04-17 15 items

Anthropic launched Managed Agents to decouple agent infra, while Claude Code defaulted to xhigh effort and got a usage-focused upgrade. Rauch said agents need durability over clever prompts, and Swyx split AI engineering into slop vs rigor.

2026-04-16 14 items

Rauch said teams were building their own design factories, while Steinberger called open-source AI security a full-time arms race. Masad priced OSS trust in compute, and Woodward shipped Gemini on Mac in 100 days.

2026-04-15 15 items

Woodward said Gemini’s turning into a test-prep machine, Albert called Claude Code the whole workspace, and Cat Wu shipped a desktop control center with parallel sessions and review tools. Rauch also argued agent builders need elastic Postgres, not vibes.

2026-04-14 10 items

Rauch said the moat moved from code to the code factory, while Levie argued every team now needed an agent wrangler. Cursor leaned into customizable multi-agent views, Replit added region controls, and No Priors backed Periodic Labs’ bet that AI could learn atoms by running experiments.

2026-04-13 10 items

Amjad Masad said Apple’s 50th has turned into a PR disaster, while Aaron Levie argued agents would create more work, not cut jobs. Rauch pushed engineers into the customer hot seat, and Claude warned teams to harden security fast.

2026-04-12 11 items

Thariq said Claude Code now handles TurboTax pain, while Rauch called microVM sandboxes the new compute layer. Aditya Agarwal pushed memory over loops, and Levie argued AI won’t shrink law—it’ll inflate it.

2026-04-11 16 items

Claude pushed into Word with tracked edits, and Claude Code moved planning to the web with auto mode approvals. Garry Tan called agents the Altair BASIC era, while Aaron Levie warned software without a real API gets left behind.

2026-04-10 12 items

Karpathy said free ChatGPT lagged while frontier coding models didn’t. Albert pushed cheap-to-smart escalation, Rauch said cloud infra went agent-native, and OpenAI’s next leap looked like autonomy—not chat.

2026-04-09 16 items

Woodward gave Gemini a second brain with Notebooks, while Anthropic shipped Managed Agents to move Claude from prompt to production. Rauch called the web AI’s native OS, and Levie, Masad, and Shipper all bet agents will do the work, not the people.

2026-04-08 12 items

Albert teased Anthropic’s Mythos Preview, Cat Wu juiced Claude Code’s CLI tricks, and Peter Steinberger patched CodexBar with 2 providers plus billing fixes. Levie said agents are eating knowledge work, while Nikunj Kothari preached retention over launch hype.

2026-04-07 8 items

Levie said agents won’t erase work, just push it up a layer; Yang argued they’ll shrink teams, not ambition. Garry Tan flagged an unpatched file leak in Claude’s coding env, while Kothari called Anthropic’s revenue ramp absurdly fast.

2026-04-06 10 items

Rauch said v0 now builds physics, not just UI, while Karpathy noted GitHub Gists have weirdly good comments. Levie argued AI efficiency creates more work, not less, and Tan called open source’s golden age.

2026-04-05 4 items

Karpathy pushed “your data, your files, your AI.” Levie argued context beat raw model IQ in enterprise AI. Garry Tan said GStack kept shipping security fixes fast, while No Priors spotlighted Periodic Labs’ bet on atoms, not just text.

2026-04-04 9 items

Claude plugged into Microsoft 365 everywhere, Swyx said Devin one-shot blog-to-code, and Peter Steinberger called out GitHub’s API as still not built for agents. Aaron Levie hit the context wall, while Garry Tan shipped a DX review tool from his own stack.

2026-04-03 10 items

Claude landed computer use on Windows, Karpathy argued LLMs should build your wiki, and Amjad Masad pushed Replit deeper into enterprise sales. Peter Yang said Cursor 3 got out of the agent’s way, while Peter Steinberger warned AI slop was flooding kernel security with real bugs.

2026-04-02 12 items

Steinberger called plan mode training wheels, while Thariq gave Claude Code a mouse-friendly renderer and Cat Wu showed sessions jumping phone-to-laptop. Masad framed Replit as an OS for agents, Rauch said Vercel signups compounded fast, and Anthropic’s infra tweaks swung coding scores by 6 points.

2026-04-01 4 items

Levie said AI productivity hit the enterprise risk wall, while Weil argued proofs got cleaner, not just better. Agarwal floated public source code as the new prod debugging, and Data Driven NYC claimed one founder could run a company if agents handled the layers below.

2026-03-31 15 items

Karpathy warned unpinned deps can turn one hack into mass pwnage, while Rauch and Levie said agents still need human guardrails and redesigned workflows. Meanwhile Claude Code got enterprise auto mode, Replit added built-in monetization, and Swyx spotted “Sign in with ChatGPT” already live.

2026-03-29 7 items

Andrej Karpathy highlighted how LLMs can argue any side, suggesting we use it as a feature. Guillermo Rauch finally shipped his dream text layout, bringing his vision to life. Meanwhile, Amjad Masad claimed AI is democratizing app building and elevating top engineers.

2026-03-28 7 items

Andrej Karpathy suggested leveraging LLMs' ability to argue any side as a feature. Guillermo Rauch turned text layout dreams into reality with Vercel's latest feature. Meanwhile, Amjad Masad claimed AI is democratizing app building, liberating top engineers for bigger challenges.