MCP won’t scale by exposing everything; it needs ruthless curation.
The Takeaway: The winning MCP strategy isn’t “more tools”; it’s fewer, sharper tools designed for how models actually behave.
- Exposing an entire API to an LLM is a trap: you burn context, confuse the model, and still don’t get reliable execution.
- The hard part isn’t wiring up endpoints — it’s product design, evals, and making the tool names, schemas, and outputs model-friendly.
- A practical workaround is to compress knowledge outside the model, like Alex Rattray’s Markdown “knowledge repo,” so the AI can reuse curated context instead of rediscovering it.
Alex Rattray, founder and CEO of Stainless, has spent years building the plumbing that lets computers talk to computers — first through APIs and SDKs for companies like Stripe, OpenAI, and Anthropic, and now through MCP servers. His view is blunt: the dream of agentic AI is real, but the current implementation is clumsy. If you hand an LLM every endpoint in a giant product like Stripe, “you’ve burned through your entire context budget” before the model even starts thinking.
His answer is not to expose more, but to expose better. That means fewer tools, precise names, tight schemas, and responses that return only what the model needs. It also means accepting that MCP is still a research problem, not a solved interface. A human developer can learn Python, but a tool designer can’t learn to “think like an LLM” from the outside, so the interface has to do the heavy lifting.
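To make "fewer tools, precise names, tight schemas, trimmed responses" concrete, here is a minimal sketch of what one curated tool might look like. It is an illustration, not Stainless's implementation: the tool name `get_invoice_status`, the payload fields, and the helpers `validate_args`, `trim_response`, and `call_tool` are all hypothetical, and the only thing borrowed from MCP is the tool shape (`name`, `description`, `inputSchema`).

```python
import json

# Hypothetical raw API payload. Field names are illustrative only,
# not a real Stripe response.
RAW_INVOICE = {
    "id": "inv_123",
    "status": "open",
    "amount_due": 4200,
    "currency": "usd",
    "metadata": {"internal_ref": "abc", "batch": "2024-07"},
    "lines": [{"id": f"li_{i}", "amount": 100} for i in range(40)],
    "footer": "Thank you for your business.",
}

# One narrowly scoped tool instead of a mirror of every endpoint.
TOOL = {
    "name": "get_invoice_status",
    "description": "Return the payment status and amount due for one invoice.",
    "inputSchema": {
        "type": "object",
        "properties": {
            "invoice_id": {"type": "string", "description": "ID of the invoice."},
        },
        "required": ["invoice_id"],
        "additionalProperties": False,
    },
}

# Only the fields the model needs to answer the question.
KEEP_FIELDS = ("id", "status", "amount_due", "currency")


def validate_args(args, schema):
    """Minimal check: required keys present, no unknown keys, strings are strings."""
    props = schema["properties"]
    missing = [k for k in schema["required"] if k not in args]
    unknown = [k for k in args if k not in props]
    if missing or unknown:
        raise ValueError(f"missing={missing} unknown={unknown}")
    for key, value in args.items():
        if props[key]["type"] == "string" and not isinstance(value, str):
            raise ValueError(f"{key} must be a string")


def trim_response(payload, keep=KEEP_FIELDS):
    """Drop everything except the whitelisted fields."""
    return {k: payload[k] for k in keep if k in payload}


def call_tool(args, fetch):
    """Validate arguments, call the (injected) backend, return a trimmed result."""
    validate_args(args, TOOL["inputSchema"])
    return trim_response(fetch(args["invoice_id"]))


if __name__ == "__main__":
    result = call_tool({"invoice_id": "inv_123"}, lambda _id: RAW_INVOICE)
    print(json.dumps(result))
    print(len(json.dumps(RAW_INVOICE)), "->", len(json.dumps(result)), "characters")
```

The design choice worth noticing is the whitelist: the response is built from the fields the model needs rather than by deleting fields it does not, so new backend fields never leak into the context window by default.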
The most interesting part is how Alex uses AI himself: not as a chatbot, but as a research assistant that writes into a curated Git repo of notes, quotes, and citations. That way, future questions don’t require another expensive search through live systems. It’s a very Stainless answer to the AI era: don’t just automate the task — design the interface so the machine can actually use it.
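The knowledge-repo idea can be sketched in a few lines. This is an assumption-heavy illustration rather than a description of Alex's actual setup: the file layout, the note format, and the helpers `slugify`, `add_note`, and `search_notes` are hypothetical, and the sketch assumes the target directory is already a Git checkout, leaving the commit step out.

```python
from pathlib import Path
import re


def slugify(title):
    """Turn a note title into a filesystem-friendly file stem."""
    return re.sub(r"[^a-z0-9]+", "-", title.lower()).strip("-")


def add_note(repo, title, summary, quotes):
    """Write one Markdown note; `quotes` is a list of (text, source) pairs."""
    lines = [f"# {title}", "", summary, "", "## Quotes", ""]
    for text, source in quotes:
        # Keep the citation next to the quote so later answers stay traceable.
        lines.append(f'- "{text}" (source: {source})')
    path = Path(repo) / f"{slugify(title)}.md"
    path.write_text("\n".join(lines) + "\n", encoding="utf-8")
    return path


def search_notes(repo, keyword):
    """Return the names of notes that mention `keyword`, case-insensitively."""
    needle = keyword.lower()
    return sorted(
        p.name
        for p in Path(repo).glob("*.md")
        if needle in p.read_text(encoding="utf-8").lower()
    )


if __name__ == "__main__":
    import tempfile

    with tempfile.TemporaryDirectory() as repo:
        add_note(
            repo,
            "MCP tool design",
            "Expose fewer, sharper tools instead of the whole API.",
            [("you've burned through your entire context budget", "interview notes")],
        )
        print(search_notes(repo, "context budget"))
```

A later question can then be answered by searching the notes first and only falling back to live systems when nothing relevant turns up.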