The Productivity Paradox: When AI Tools Make Things Worse Before They Make Them Better
A panel with Linear, Amp, and Jam explores why 80% of companies get net negative value from AI coding tools—and what it takes to be in the winning 20%.
This post is part of our ERC 2025 Recap series. Read the full recap post here.
When Claire Vo opened the productivity panel at Enterprise Ready Conference with a lighthearted suggestion to "put a penny in a jar every time someone says 'enterprise,'" she was acknowledging something everyone in the room knew: enterprise productivity is the conversation in 2025.

But what followed wasn't another predictable discussion about how AI will make everyone 10x more productive. Instead, the panel—featuring Cristina Cordova, COO of Linear; Quinn Slack, CEO and daily committer at Amp; and Dani Grant, CEO of Jam—delivered something more valuable and considerably more uncomfortable: honesty about how poorly most organizations are actually using these tools.
The panel's most striking statistic came early: approximately 80% of people using agentic coding tools are getting net negative value from them, while perhaps 20% or less know how to use them effectively and extract significant value.

For companies building these tools, that's not just humbling—it's terrifying. It's also the most important admission in the room, because it acknowledges the gap between the promise of AI productivity tools and the reality of their implementation.
Defining Productivity When No One Knows What It Means
Before diving into why these tools fail most companies, the panel tackled a more fundamental question: how do you define productivity in the first place?
The Outcome-Based View
From the enterprise customer perspective, particularly in older, more traditional companies, productivity is often "ultimately outcome-based—do they feel that they're shipping more?" Many of these organizations don't use productivity measurement tools extensively or think about outcomes in terms of quantifiable metrics. Part of the challenge for productivity tool vendors is helping change that culture.
The panel offered a revealing definition of "enterprise" itself: it's less about company size and more about structure, "when there's someone who's picking the tools for other people and it's not just each person picks their own tools." In other words, enterprise is defined by decision-making structure and power dynamics rather than by headcount.

But when it comes to defining productivity, the panel was blunt: many enterprises don't know how to define it. "It's the age-old question and they need whoever's selling them that tool to define it for them, especially in this new world."
From Vibes to Metrics
The discussion highlighted a crucial transition happening in larger organizations. While many companies remain "vibe-based" when it comes to productivity assessments—even in organizations "where you think they really have their skis on"—more teams are now establishing dedicated builder tools teams or developer velocity teams.
Once you have a team responsible for developer productivity, you need metrics to justify its existence. That's when organizations start assembling KPIs like latency between committing code and code being in production, time between PR creation and when it gets reviewed, PR throughput, and percentage of PRs with an AI co-author.
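To make those KPIs concrete, here is a minimal sketch (not something shown on the panel) of how a developer-velocity team might compute them from exported pull request records. The field names and the list of AI co-author identities are illustrative assumptions; a real pipeline would pull this data from whatever source control and deployment systems the organization actually uses.

```python
# Hypothetical sketch: computing the velocity KPIs mentioned above from PR records.
# Field names and AI co-author identities are assumptions for illustration only.
from dataclasses import dataclass, field
from datetime import datetime, timedelta
from statistics import median
from typing import Optional

@dataclass
class PullRequest:
    created_at: datetime
    first_review_at: Optional[datetime] = None
    merged_at: Optional[datetime] = None
    deployed_at: Optional[datetime] = None
    co_authors: list[str] = field(default_factory=list)

def velocity_kpis(prs: list[PullRequest], window: timedelta,
                  ai_identities: frozenset[str] = frozenset({"copilot", "claude"})) -> dict:
    """Summarize the panel's example metrics: merge-to-production latency,
    PR-creation-to-first-review latency, PR throughput, and AI co-author share."""
    def hours(delta: timedelta) -> float:
        return delta.total_seconds() / 3600

    deployed = [pr for pr in prs if pr.merged_at and pr.deployed_at]
    reviewed = [pr for pr in prs if pr.first_review_at]
    merged = [pr for pr in prs if pr.merged_at]

    return {
        # Median hours from merge to production deploy (proxy for commit-to-prod latency).
        "merge_to_prod_hours": median(hours(pr.deployed_at - pr.merged_at)
                                      for pr in deployed) if deployed else None,
        # Median hours from PR creation to first review.
        "creation_to_first_review_hours": median(hours(pr.first_review_at - pr.created_at)
                                                 for pr in reviewed) if reviewed else None,
        # PRs merged per week over the reporting window.
        "prs_merged_per_week": len(merged) / (window / timedelta(weeks=1)),
        # Share of PRs listing a known AI identity as a co-author.
        "ai_coauthored_share": (sum(any(a.lower() in ai_identities for a in pr.co_authors)
                                    for pr in prs) / len(prs)) if prs else None,
    }
```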

This shift from vibes to metrics sounds like progress, but it introduces new problems. What happens when you optimize for easily measurable metrics like PR volume, but the tools generating those PRs are actually making things worse?
The 80/20 Problem in Agentic Coding
The statistic about 80% of users getting net negative value from agentic coding tools wasn't a throwaway line. It was the panel's central tension.
Among enterprises, many are probably getting net negative value from these tools right now. A small group, perhaps 20% or even less, knows how to use them effectively and gets tremendous value. But the majority is struggling.

The Prompting Problem That's Actually a Product Problem
The panel cut through typical AI discourse with a pointed observation: there's this classic dynamic where a tool isn't working for someone and the response is "well you're prompting it wrong and you don't know how to do that." But maybe that's actually a product problem, not a user problem.
The discussion used a seemingly trivial example to illustrate a deeper challenge: what do you call the smallest increment of work in a project management tool? Is it a task? An issue? A ticket? When AI tools require users to learn a specific vocabulary or framework to be effective, the burden of adoption increases dramatically. "What do I have to call this in this tool in order for it to start doing the work right?"
This nomenclature problem represents something larger. The tool might be powerful, but if it requires extensive training to use correctly, most users will never reach the point where it provides value.
Returning to the Old Model
The panel argued for a return to an earlier era of software: "I think in a lot of ways we have to get back to the mode where there was a lot of help in using some of these tools versus expectation that the customer is going to figure it out."

The difference with AI tools is the expectation cycle. Previously, if a traditional software product didn't work well at first, that customer probably wasn't coming back. With AI products, there's been an unusual dynamic: a tool might not work well initially, yet customers will try it again three months later and find it's somehow better because the underlying models improved.
The panel pushed back on this: "Whereas before it was like you had your one shot with that customer and if it didn't work they're never coming back again. I would like us to return to thinking that if you don't get it right the first time it's not going to go well."

This has significant implications for enterprise sales. If 80% of companies are getting net negative value from AI coding tools, and those tools are sold on the premise of improving productivity, there's a credibility problem that will compound over time. Eventually, enterprises will stop giving these tools second and third chances.
The Free Tier Calculation
The conversation shifted to go-to-market strategies, specifically the role of free tiers in enterprise-focused products. This revealed fundamentally different philosophies among the panelists about how to balance growth, conversion, and sustainable business models.
The Generous Free Tier Strategy

One approach treats a "very generous free plan" as essential to product strategy. The goal is letting people "try and feel the product," because some developer products offer something that's "hard to describe or explain" without hands-on experience.
But this free tier strategy goes deeper than trial functionality. Many people use productivity tools for free for personal tasks and side projects—users who will never reach a payment threshold. At first glance, these look like permanent free riders providing no value to the business.
The reality is more complex. These free users often work at other companies. When those companies start evaluating tools, having employees who already use and love the product for personal projects is invaluable. They become internal advocates who can onboard teammates and evangelize the product from within.
"If they already use [the product] for a bunch of other stuff, side projects, personal things, and they come back and they're like 'oh I would love to use that at work,' that's such a value add because they already know how to use the product. They're also going around a bunch of other different people in their company being like 'oh here's how you do this or here's how you do this' and they're an advocate for us."
This kind of organic advocacy—all generated from free tier users—represents significant marketing value that doesn't appear in traditional B2B SaaS metrics.
The Opposite Approach
Claire offered a contrasting perspective from running ChatPRD. Through extensive testing around monetization, she found that "for our format of product, buying is a leading indicator of adoption in a way that's very margin accretive."

Free customers, in her experience, "kick tires and feel no incentive to actually really go deep in your product, especially onesie-twosie one-off kind of individual users." A hard paywall filters out people who would never become paying customers anyway, allowing the team to focus on "really highly incentivized people" who swiped a credit card and are "going to give it the good old college try."
This isn't a contradiction with the generous free tier approach—it's context-dependent. Different business models and user profiles require different strategies. What matters isn't which approach is universally correct, but understanding which dynamics apply to your specific product and market.
The Ad-Supported Wild Card
The panel also discussed the newest experiment in developer tool monetization: Amp launched an ad-supported free tier just the week before the conference.
The model is straightforward—developers can use the coding agent for free, but they'll see ads for developer tools while coding. As the panel discussed, this addresses a real problem: "These companies want to reach developers, developers are hard to reach, and ads is one way to do that."
The company's position as the one paying for API tokens creates an interesting incentive alignment: "We have an incentive to solve that for users since we're the ones paying their tokens." The business is directly motivated to make the product more efficient and effective because compute costs come out of their margin.
It's too early to know if ad-supported coding agents will work long-term, but it represents a genuine innovation in business model experimentation. If nothing else, it demonstrates that the enterprise productivity software space is still wide open for new approaches to monetization and go-to-market strategy.
The Quality vs. Velocity Debate
An audience member asked what might be the panel's most important question: in a world where AI creates more PRs, and AI reviews those PRs, how do you measure quality in addition to velocity?
The panel painted the scenario vividly: "It's the AI quality singularity right. AI creates the code, AI creates the PR, AI reviews the PR, AI pushes the PR to prod, AI runs tests in prod, AI monitors your code in prod, AI catches the bug, creates the alert, fixes it—like whatever."
But the discussion quickly grounded this scenario back in reality: "If a human is going to use your product, a human at some point needs to do a quality check. A lot of bugs and issues are not actually technical logical problems, it's that something is a bad user experience. An AI can assume what a bad user experience is but I think as the surface area of products increase there's still going to be moments where the quality check is basic stuff like can users use this, do they retain, are they churning—it's still normal business metrics."
The Hot Take on Bad Code
The panel then delivered what might be the discussion's most provocative statement: "Bad code rarely kills companies but bad product definitely does."
They pointed to the AWS us-east-1 outage from that week. "AWS is going to survive that. The bad code is not the problem—they have a compelling product."
This perspective on quality is fundamentally optimistic: "More interesting things are going to be built that can solve real problems. We can solve performance, scalability, team ability to write the code and maintain the code—we solve all those problems on the back end. What you have to be really focused on from a quality perspective is do customers love it, use it, get value out of it, can they see and feel and touch it. The stuff on the back end will be solved."
The panel agreed on a simple principle: if people aren't shouting from the rooftops about how great your product is, you probably don't have a high quality product. That's a metric that can't be gamed by generating more PRs.
The Future Costs Problem
The discussion introduced a different dimension to quality: "There's future costs that people will pay and if you bring them forward then it can be net negative."
This is the dark side of AI-generated code velocity. You might ship faster today, but if that code is harder to maintain, creates technical debt, or introduces security vulnerabilities, those future costs can outweigh the immediate productivity gains.
But the panel was also pragmatic about measuring quality in enterprise sales processes: "If you're trying to prove or measure the quality of your coding agent versus another, I've been there done that and it's basically impossible and the decision is not actually made based on that even though I understand why it's valuable."
The enterprise buying decision for coding agents isn't primarily about quality metrics—it's about perceived value, organizational fit, and ultimately trust. That makes the 80% net negative value problem even more concerning, because enterprises aren't buying based on rigorous quality evaluation. They're buying on promise and demo, then discovering months later whether it actually works.
Building for the Enterprise Reality
By the end of the panel, several themes had emerged that paint a picture of enterprise productivity in 2025 quite different from the usual "AI will make everyone 10x more productive" narrative.
First, most companies are currently getting net negative value from the most hyped productivity tools in the market. This isn't a temporary problem that will be solved by better models. It's a product design, implementation, and change management problem that requires vendors to take responsibility for customer success.
Second, the definition of productivity itself remains contested and often vibe-based even in organizations that should know better. The shift toward metrics like PR volume can actually make the problem worse if those metrics are divorced from actual business outcomes.
Third, go-to-market strategies for enterprise productivity tools remain highly experimental. Generous free tiers, hard paywalls, and ad-supported models all work for their respective contexts, but there's no universal playbook.
Fourth, the quality vs. velocity debate ultimately resolves in favor of customer love and product-market fit. Bad code is a solvable problem. Bad product kills companies. The challenge is ensuring that productivity tools optimize for the right outcomes.
Claire closed the panel with a perfect summary of the practical advice shared: "Velocity, quality, free, get a senator on your board—these are all the tips we have for building excellent, good, productive enterprise companies."
Behind the humor was a serious point. Building enterprise productivity tools in the age of AI requires balancing competing priorities: shipping fast while maintaining quality, offering free tiers while building a sustainable business, automating workflows while keeping humans in the loop where it matters.
The companies that succeed won't be the ones with the most impressive demos or the most aggressive AI promises. They'll be the ones that honestly assess what's working and what isn't, take responsibility for the 80% of users currently getting net negative value, and build products that help enterprises actually become more productive rather than just appearing to.

For the audience of founders, engineers, and enterprise buyers at ERC, that honesty was probably worth more than any number of rosy productivity projections.
Watch more panels and sessions from Enterprise Ready Conference 2025 in our full event recap.