AI Builders Brief

Follow builders, not influencers.

2026.04.24

25+ builders tracked

TL;DR

Altman said Codex moved from demo to company-wide rollout, while Claude shipped persistent cross-session memory for Managed Agents and everyday-life connectors. Masad shrugged off “Chinese distillation” panic. Dan Shipper said GPT-5.5 finally just does the work, and Peter Yang said it cleared his game-build test.

BUILDER INSIGHTS
9
01
Sam Altman

Codex is moving from demo to company-wide rollout

He says OpenAI and NVIDIA just tested a new way to deploy Codex across an entire company, and it actually worked. That’s the interesting part: this is less about a flashy AI demo and more about pushing coding agents into real enterprise workflows.

X
02
Amjad Masad CEO, replit

Open AI research beats panic about “Chinese distillation”

He says US politicians are fearmongering about Chinese distillation while Chinese scientists are sharing real AI breakthroughs openly. His take: these advances aren’t about hoarding data, and they help everyone — including small and maybe even big US labs.

X
03
Claude anthropicai

Claude adds persistent memory for agents

Memory on Claude Managed Agents is now in public beta, so agents can learn from every session instead of starting from scratch. Anthropic says the memory layer is built to balance performance and flexibility, and developers can export and manage memories via the API to keep control.

X
04
Aaron Levie CEO, box

AI won’t cut work — it expands it

He says AI isn’t shrinking workloads; it’s making more work worth starting. At Box, he’s seeing agents turn “never got done” tasks into 3-hour rabbit holes, and even make some ongoing work economical to hire out. He also says GPT-5.5 is a real step up for enterprise knowledge work, with Box’s evals showing a 10-point accuracy jump on complex content tasks.

X
05
Aditya Agarwal CTO, SouthPkCommons

SF wins by packing weird builders together

He says San Francisco’s edge isn’t just talent or VC — it’s a culture where curious, humble builders keep showing up, jamming, and pushing weird ideas until they work. He points to a design talk at South Park Commons and a mind-bending pixel-generation demo as proof that the city’s density of builders is what keeps producing outsized breakthroughs.

X
06
Dan Shipper CEO, every

GPT-5.5 stops planning and just does the work

He says many models can outline a great plan, then hesitate — but OpenAI’s GPT-5.5 actually follows through. His take is that this is a real behavior shift, not just a benchmark bump, and he’s framing it as a practical upgrade for anyone using AI to get work done.

X
07
Garry Tan CEO, ycombinator

GBrain gets smarter with graph + vector search

He says GBrain’s new evals show a big jump when you layer graph search and vector search on top of grep across knowledge wikis. He’s also pushing more of his OpenClaw cron jobs and subagents onto GBrain Minions, with stability work aimed at making that infra stick.
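To make the layering concrete, here is a minimal sketch of keyword matching, vector similarity, and graph expansion combined into one ranking. It is not GBrain's implementation: the toy pages, the link graph, the bag-of-words stand-in for embeddings, and the `graph_bonus` weight are all assumptions made for illustration.

```python
import math
import re
from collections import Counter

# Toy wiki (made up): page name -> text. LINKS is the hypothetical page graph.
PAGES = {
    "agents": "Agents run cron jobs and call subagents for long tasks.",
    "cron": "Cron jobs schedule recurring work such as nightly evals.",
    "evals": "Evals measure retrieval quality across the knowledge wiki.",
    "lunch": "The office lunch menu rotates weekly.",
}
LINKS = {"agents": ["cron"], "cron": ["evals"], "evals": [], "lunch": []}


def tokenize(text):
    return re.findall(r"[a-z]+", text.lower())


def grep_hits(query):
    """Layer 1: exact keyword match, in the spirit of grep."""
    terms = set(tokenize(query))
    return {name for name, text in PAGES.items() if terms & set(tokenize(text))}


def cosine(a, b):
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def vector_scores(query):
    """Layer 2: similarity over bag-of-words vectors (a stand-in for embeddings)."""
    q = Counter(tokenize(query))
    return {name: cosine(q, Counter(tokenize(text))) for name, text in PAGES.items()}


def graph_neighbors(names):
    """Layer 3: pull in pages linked from the current hits."""
    return {dst for name in names for dst in LINKS.get(name, [])}


def hybrid_search(query, graph_bonus=0.1):
    hits = grep_hits(query)
    scores = vector_scores(query)
    neighbors = graph_neighbors(hits) - hits
    ranked = {name: 1.0 + scores[name] for name in hits}
    for name in neighbors:
        # Linked pages rank below direct hits but still surface.
        ranked[name] = graph_bonus + scores[name]
    return sorted(ranked, key=lambda n: (-ranked[n], n))
```

In this toy setup, `hybrid_search("cron jobs")` surfaces the "evals" page through a link even though it shares no keywords with the query, which is the kind of recall a grep-only search would miss.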

X
08
Peter Yang

GPT-5.5 finally clears the game-build test

He says GPT-5.5 plus Codex is the first model combo that actually built a working F-Zero-style game in his recurring benchmark. That’s a pretty clean signal that the new stack is moving from demos to real, playable output — and he’s already using it to spin up bots to race against.

X
09
Nikunj Kothari Partner, fpvventures

M&A is about to outpace fundraising

He says the startup market is tilting hard toward acquisitions: the seed-to-A gap is widening, 2021 zombiecorns are finally getting cleaned up, and talent is flowing to big token factories. His blunt takeaway as an FPV Ventures partner: there are plenty of founders, but very few real entrepreneurs — don’t start a company unless you can’t do anything else.

X
BLOG UPDATES
3
Anthropic Engineering

An update on recent Claude Code quality reports

Anthropic fixes three Claude Code regressions, resets limits

Lead: Anthropic says recent quality complaints about Claude Code came from three separate product-side changes—not the API or core models—and all have now been fixed in v2.1.116.

Numbers:

  • 3 distinct issues affected Claude Code, Claude Agent SDK, and Claude Cowork
  • Fixes landed on April 7, April 10, and April 20
  • The prompt change caused a 3% drop in broader evals
  • Usage limits are being reset for all subscribers as of April 23

So What: The company is tightening release controls because the regressions made Claude feel “less intelligent,” forgetful, and overly terse in some sessions. One bug repeatedly dropped prior reasoning after idle sessions, another defaulted users from high to medium effort, and a prompt tweak to reduce verbosity hurt coding quality. Anthropic says it will broaden internal testing on the exact public build, expand code review context, add per-model evals and prompt ablations, and use soak periods plus gradual rollouts for any change that could trade off against intelligence. As the post puts it, “We take reports about degradation very seriously.”
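For readers unfamiliar with gradual rollouts, one common pattern is deterministic percentage bucketing, sketched below. This is a generic illustration, not Anthropic's release process: the function name, the feature label, and the hashing scheme are assumptions.

```python
import hashlib


def in_rollout(user_id, feature, percent):
    """Return True if this user falls inside the first `percent` of 100 buckets.

    Hashing feature + user id gives each user a stable bucket, so raising
    `percent` only ever adds users and never flips earlier ones back.
    """
    digest = hashlib.sha256(f"{feature}:{user_id}".encode("utf-8")).hexdigest()
    bucket = int(digest[:8], 16) % 100
    return bucket < percent


def rollout_share(user_ids, feature, percent):
    """Fraction of the given users currently exposed to the change."""
    exposed = sum(in_rollout(u, feature, percent) for u in user_ids)
    return exposed / len(user_ids)
```

A soak period then amounts to holding `percent` at a low value and watching evals and user reports before raising it.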

Claude Blog

New connectors in Claude for everyday life

Claude adds everyday-life connectors for travel, shopping, and more

Lead: Claude is expanding its connector ecosystem beyond work tools to include everyday apps like AllTrails, Instacart, Audible, TripAdvisor, TurboTax, Uber, and more, so users can act on personal tasks directly inside chat.

Numbers:

  • Claude directory has grown to 200+ connectors since launching in July 2025
  • New connectors include 15+ consumer services, from travel and dining to taxes and rides
  • Connectors are available on all plans; mobile is in beta

So What: The big shift is that Claude now surfaces the right app dynamically based on your intent, context, and preferences, then keeps the workflow in one thread. Anthropic says, “Claude suggests the right app for what you’re doing,” and if multiple connectors fit, it shows options ranked by usefulness. For builders, this means a larger distribution surface for apps that can be installed into Claude’s directory, while users get a more agentic assistant that can recommend, compare, and prepare actions without leaving the conversation. Privacy and control remain central: no ads, no sponsored placements, app data isn’t used to train models, and Claude must ask before booking or purchasing on your behalf.

Claude Blog

Built-in memory for Claude Managed Agents

Claude Managed Agents get built-in cross-session memory

Lead: Claude Managed Agents now ship with public beta memory, letting agents learn from every session through a filesystem-based layer that’s designed for long-running, production use.

Numbers:

  • Public beta available today
  • Rakuten says first-pass errors fell by 97%
  • Wisedocs reports verification sped up by 30%
  • Memory stores can be shared across multiple agents with different access scopes

So What: This removes a major piece of custom infrastructure for teams building persistent agents: memory is portable, API-manageable, and auditable, with export, rollback, and redaction built in. Because memories are stored as files and mounted directly onto the filesystem, Claude can use the same bash and code execution tools it already relies on, while keeping “full control over what agents retain.” The practical payoff is better continuity across sessions, fewer repeated mistakes, and easier enterprise governance. Teams can use org-wide read-only stores, per-user read/write stores, and concurrent agents without overwriting each other. In short, Claude is positioning memory as a native capability for agents that need to improve over time, not a separate retrieval system you have to assemble yourself.
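As a rough picture of what file-backed memory with access scopes can look like, here is a minimal sketch. It does not use or reproduce Anthropic's API: the `MemoryStore` class, its method names, and the scope labels are hypothetical, and rollback is left out for brevity.

```python
import json
import tempfile
from pathlib import Path


class MemoryStore:
    """A directory of memory files plus a per-agent access scope (hypothetical)."""

    def __init__(self, root, scopes):
        self.root = Path(root)
        self.root.mkdir(parents=True, exist_ok=True)
        self.scopes = scopes  # agent id -> "read" or "read_write"

    def _check(self, agent, need_write):
        scope = self.scopes.get(agent)
        if scope is None:
            raise PermissionError(f"{agent} has no access to this store")
        if need_write and scope != "read_write":
            raise PermissionError(f"{agent} has read-only access")

    def write(self, agent, name, text):
        self._check(agent, need_write=True)
        (self.root / f"{name}.md").write_text(text, encoding="utf-8")

    def read(self, agent, name):
        self._check(agent, need_write=False)
        return (self.root / f"{name}.md").read_text(encoding="utf-8")

    def export(self, agent):
        """Dump every memory as JSON so it can be audited or moved elsewhere."""
        self._check(agent, need_write=False)
        files = sorted(self.root.glob("*.md"))
        return json.dumps({f.stem: f.read_text(encoding="utf-8") for f in files})

    def redact(self, agent, name):
        """Delete one memory file; requires write access."""
        self._check(agent, need_write=True)
        (self.root / f"{name}.md").unlink()


def demo():
    with tempfile.TemporaryDirectory() as tmp:
        # Org-wide store: one writer, one read-only agent.
        org = MemoryStore(Path(tmp) / "org", {"admin": "read_write", "agent_a": "read"})
        org.write("admin", "style_guide", "Use ISO dates.")
        seen = org.read("agent_a", "style_guide")
        try:
            org.write("agent_a", "style_guide", "overwritten")
            blocked = False
        except PermissionError:
            blocked = True
        return seen, blocked, json.loads(org.export("admin"))
```

Because each memory is an ordinary file, the same store can be inspected with standard shell tools, which is the property the post highlights.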

PODCAST HIGHLIGHTS
1

AI infra is stabilizing, but coding agents are just getting started

The Takeaway: The real shift isn’t “AI is everywhere” — it’s that coding agents have become the proving ground for a new market structure.

  • The infrastructure layer is finally settling into a usable pattern: agents now look like LLMs with tools, a file system, and “skills” as the minimal viable packaging format.
  • The biggest winners won’t just be model companies or apps; they’ll be the “outsourced AI teams” that sit between frontier models and messy enterprise workflows.
  • In coding, the market is still in capability-exploration mode, which means spending more, trying weirder things, and chasing speed can matter more than efficiency.
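The "LLM with tools, a file system, and skills" shape from the first bullet can be sketched in a few lines. Everything here is an assumption for illustration: the `Skill` dataclass, the two file tools, and `stub_model`, which replaces a real model call with a fixed two-step plan.

```python
import tempfile
from dataclasses import dataclass, field
from pathlib import Path


@dataclass
class Skill:
    """Hypothetical packaging unit: a name, instructions, and the tools it needs."""
    name: str
    instructions: str
    tools: list = field(default_factory=list)


def make_tools(workdir):
    """Tools are plain functions scoped to one working directory."""
    def write_file(name, text):
        (workdir / name).write_text(text, encoding="utf-8")
        return f"wrote {name}"

    def read_file(name):
        return (workdir / name).read_text(encoding="utf-8")

    return {"write_file": write_file, "read_file": read_file}


def stub_model(skill, history):
    """Stand-in for an LLM call: returns the next (tool, args) step, or None when done."""
    plan = [
        ("write_file", ("notes.txt", skill.instructions)),
        ("read_file", ("notes.txt",)),
    ]
    return plan[len(history)] if len(history) < len(plan) else None


def run_agent(skill, workdir, max_steps=5):
    # Only expose the tools the skill declares.
    tools = {k: v for k, v in make_tools(workdir).items() if k in skill.tools}
    history = []
    for _ in range(max_steps):
        step = stub_model(skill, history)
        if step is None:
            break
        tool_name, args = step
        history.append((tool_name, tools[tool_name](*args)))
    return history


def demo():
    skill = Skill("note_taker", "Summarize the meeting.", ["write_file", "read_file"])
    with tempfile.TemporaryDirectory() as tmp:
        return run_agent(skill, Path(tmp))
```

Swapping `stub_model` for a real model client is the only change needed to turn the loop into a working agent; the skill and tool plumbing stay the same.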

Swyx, the founder behind the AI Engineer events and a close observer of the developer ecosystem, argues that the last year has been less about neat product categories and more about constant adaptation. He thinks the infrastructure chaos is easing, but only because the industry has converged on a simple shape: “skills,” APIs, and agent-friendly tooling. That doesn’t mean the game is over; it means the rules are clearer.

His sharper point is that the AI coding wars are already enormous — with OpenAI, Anthropic, Cursor, and Cognition all fighting for a market that has exploded in under a year. He sees this as a momentum game, not a mean-reversion story. The mistake is assuming coding is saturated when it may still be compounding. “Why if it went from 10 to 50% in the past year, why can’t it keep going?” he asks.

That same logic applies to infra, chips, and even go-to-market. Agents are now the primary users in many systems, which means products need to be API-first, CLI-friendly, and built for machine consumption. The bigger lesson: if you want to know where AI is headed next, watch coding — because it’s the first place where the market is rewarding raw capability over polish.
