Modal is building AI infrastructure that doesn't get in the way
A conversation with Erik Bernhardsson, CEO of Modal
Erik Bernhardsson spent years at Spotify building music recommendation systems. He saw a fundamental gap: traditional infrastructure wasn't built for data- and compute-intensive AI applications. During the pandemic, he started hacking on a solution. Then generative AI took off, and Modal was perfectly positioned.
Modal's thesis is simple: AI applications need better abstractions. They need to work with GPU capacity across the world, scale up and down dynamically, and handle large models efficiently. Traditional infrastructure makes all of this harder than it needs to be.
Function as a Service, but for heavy compute
The closest analogy to Modal is AWS Lambda, but built for compute-intensive workloads, with first-class GPU support and a better developer experience. Under the hood, Modal uses gVisor for workload isolation. For many companies, Modal replaces Kubernetes and Docker entirely, handling orchestration and scheduling while abstracting away the underlying clouds.
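To make that concrete, here is a minimal sketch of what the function-as-a-service model looks like in Modal's Python SDK. The app name, model, and function body are illustrative placeholders; the app/image/function pattern is the shape of the API.

```python
import modal

# Declare the container image with the dependencies the function needs;
# no Dockerfile or cluster configuration required.
image = modal.Image.debian_slim().pip_install("transformers", "torch")

app = modal.App("example-inference")

@app.function(image=image, gpu="A10G")
def generate(prompt: str) -> str:
    # Runs in an isolated container on a cloud GPU that Modal provisions.
    # Import inside the function: the library exists in the remote image,
    # not necessarily on the local machine.
    from transformers import pipeline
    pipe = pipeline("text-generation", model="gpt2")
    return pipe(prompt, max_new_tokens=50)[0]["generated_text"]

@app.local_entrypoint()
def main():
    # `modal run this_file.py` runs this locally and the function remotely.
    print(generate.remote("Serverless GPUs are"))
```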
What Modal has done particularly well is container speed. They can load models into GPUs and spin up containers extremely fast. This matters because inference demand is unpredictable. In the past, you had to buy rectangular blocks of reserved instances: a fixed amount of capacity for a fixed length of time. That's not how people want to consume inference; they want elasticity for spiky, bursty batch jobs. Modal can get you a thousand GPUs within minutes, sometimes seconds.
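That elasticity is visible in the programming model itself. Here is a hedged sketch of fanning one function out across many containers with Modal's map primitive; the function body and input count are placeholders.

```python
import modal

app = modal.App("burst-batch")

@app.function(gpu="A10G")
def score(item: int) -> int:
    # Placeholder for a GPU-bound inference step.
    return item * item

@app.local_entrypoint()
def main():
    # .map fans the inputs out across containers; Modal scales the pool
    # up for the burst and back down to zero when the batch finishes.
    results = list(score.map(range(1000)))
    print(sum(results))
```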
Where the revenue actually is
The adoption pattern for generative AI has been "upside down," as Erik puts it. VCs always say enterprise is where the revenue is, but the actual AI revenue today is in startups. Early-stage generative AI companies like Lovable are spending serious money on running models, and Modal manages their infrastructure.
That said, digital natives (food delivery, self-driving, media streaming companies) are starting to put generative AI into production at scale. Erik expects this segment to drive most growth over the next few years. Enterprise will follow eventually, but he estimates it's still two years out before Bank of America is spending heavily on GPUs.
The use cases that surprised him
Modal's customers span everything from face-swapping apps to companies literally trying to cure cancer. A recent customer does weather forecasting—running ensembles of simulations across a thousand GPUs in parallel to compute probability distributions over ten-day horizons.
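As a toy illustration of the ensemble idea (not the customer's actual model), assuming a simple random-walk surrogate for the forecast: run many perturbed simulations, then read the per-day probability distribution off the spread of outcomes.

```python
import numpy as np

def simulate(seed: int, horizon_days: int = 10) -> np.ndarray:
    # Toy surrogate for one ensemble member: a random walk in daily temperature.
    rng = np.random.default_rng(seed)
    return 15.0 + np.cumsum(rng.normal(0.0, 1.5, size=horizon_days))

# In production each member would be its own GPU job run in parallel;
# here we just loop locally over 1000 seeds.
members = np.stack([simulate(seed) for seed in range(1000)])

# The spread across members gives a distribution for each forecast day,
# e.g. the 10th/50th/90th percentile temperature over the ten-day horizon.
p10, p50, p90 = np.percentile(members, [10, 50, 90], axis=0)
print(p50)
```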
Audio and media are particularly strong verticals: transcription, text-to-speech, accent coaching, music generation. Computational biotech is growing fast. Vibe coding has become a major vertical.
When asked where AI is heading in a year, Erik gave an honest answer: "I don't know." The beauty of being an infrastructure company is that you don't have to predict which applications will win. You just run the compute, see it all, and grow with whatever takes off.
This interview was conducted at AWS re:Invent 2025.