The Takeaway: The winners in AI won’t just own models — they’ll own the workflow signal, the compute, and the post-training loop.
- Custom models are already the default for serious AI companies; Baseten says 90-95% of its tokens are on tailored inference, not vanilla open-source weights.
- The application layer survives because companies own unique user signal and workflow data that frontier labs can’t easily copy.
- The real bottleneck isn't GPUs alone but supply, operations, and capital structure, which is why inference is starting to look like an infrastructure-finance business.
Tuhin Srivastava, CEO of Baseten, is building what he calls the inference cloud, and his view is blunt: the market has moved from "can AI work?" to "how fast can we customize it?" He says open-source models have crossed a capability threshold, post-training is now mainstream, and customers want to "own their inference more and more." That shift is why Baseten has scaled 30x in a year.
His core argument is that the durable moat lives in workflows, not just model weights. A company like Abridge, for example, captures clinician edits and downstream actions inside hospital systems — signal a frontier lab can’t access. That lets the application layer train better models on its own reward data. In Tuhin’s words, “the thing that is valuable to a company is the user signal that they can gather that only they can gather.”
He's equally sharp on infrastructure. Baseten runs 90 clusters across 18 clouds and still sits at mid-90s utilization. Capacity is so tight that the company holds a daily 4 p.m. meeting just to manage supply. And stickiness lives in the software layer, not raw compute: "GPUs as a service is not sticky," but integrated inference software is.
His bet: the next moat is a mix of custom models, compute access, and the ability to turn production usage into better models faster.