The Takeaway: Cerebras bet earlier than anyone else that speed would matter in AI, and that patience rather than hype would decide the winner.
Key Insights
- Radical performance usually requires a radical architecture; Feldman says you don’t get 15–20x better by making a small tweak to the GPU model.
- Hardware is brutally non-linear: Cerebras spent years “ahead of the market,” building a dinner-plate-sized wafer-scale chip while demand was basically zero.
- The real inflection came when AI stopped being a demo and became daily work; then slow inference became unacceptable, just like “slow search” or dial-up internet.
The Story
Andrew Feldman, cofounder and CEO of Cerebras, has spent more than a decade building what he calls AI computers: machines optimized for inference and training, not general-purpose computing. The company’s wager was contrarian from day one: a wafer-scale chip, a 46,000-square-millimeter design the size of a dinner plate, when everyone else was building postage-stamp silicon. “They told us we were out of our mind,” he says. “It would never work.”
It almost didn’t. Between 2017 and 2019, the team burned roughly $8 million a month trying to make the chip work, and at board meetings every six weeks it faced the same painful question: why isn’t it built yet? Then the chip finally yielded, and the company spent the next few years in an awkward place, technically ahead but commercially ignored. Feldman’s read is simple: “When it’s a novelty, nobody cares that you’re fast.”
That changed in 2025, when AI became embedded in real workflows and speed turned into a hard requirement. Cerebras was suddenly in demand across models and customers, from supercomputing labs to OpenAI and AWS. Feldman’s philosophy is blunt: love the hard road, hire fearlessly, and don’t let a company drift into safe mediocrity. His favorite line captures it: “I’m a professional David.”