March 16, 2026

Compression is one of the core patterns of this era of LLMs

Knowledge compression, time compression, skill compression — understanding the unifying pattern behind what AI tooling is actually doing.

Every interesting thing happening with AI tooling right now is a form of the same thing: compression.

Not compression in the technical sense of making files smaller. Compression as a pattern — the systematic collapsing of previously vast things into tight, accessible, immediately useful forms. Knowledge that took days to find, timelines that took months to execute, skills that took years to develop — all getting compressed into something one person can wield in an afternoon.

Knowledge compression

Think about how a lawyer used to research case law. You had shelves — entire rooms — of legal texts, precedents, annotations. A junior associate would spend three days combing through volumes, flagging potentially relevant cases, summarizing arguments, and assembling a forty-page research memo that a partner would skim in ten minutes. The knowledge existed, but accessing it required an enormous amount of time and specialized effort.

Now consider what a RAG (retrieval-augmented generation) system does. You take that same corpus, chunk it, embed it into a vector database, and suddenly the most relevant passages for any given query surface in seconds. The lawyer describes the case, and the system retrieves the precise precedents that matter. Not a keyword search that returns a thousand results — a semantic search that understands what you're actually asking about and returns the material that's genuinely useful.
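The chunk-index-retrieve shape described above can be sketched in a few lines. This is a toy illustration, not a real implementation: a production system would use a learned embedding model and a vector database, while here a bag-of-words vector and a cosine-similarity scan stand in for both. The corpus snippets are invented for the example.

```python
# Minimal sketch of the retrieval step in a RAG pipeline.
# A real system would use a learned embedding model and a vector
# database; a toy bag-of-words vector stands in for both here,
# purely to show the shape of the pattern.
import math
from collections import Counter

def embed(text: str) -> Counter:
    """Toy embedding: a term-frequency vector over lowercase tokens."""
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse term-frequency vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0

# 1. Chunk the corpus (here, each document is already one chunk).
corpus = [
    "Precedent on contract termination for material breach",
    "Opinion limiting liability in negligence claims",
    "Ruling on enforceability of non-compete clauses",
]

# 2. Index: embed every chunk once, up front.
index = [(doc, embed(doc)) for doc in corpus]

# 3. Retrieve: embed the query, return the closest chunks.
def retrieve(query: str, k: int = 1) -> list[str]:
    q = embed(query)
    ranked = sorted(index, key=lambda item: cosine(q, item[1]), reverse=True)
    return [doc for doc, _ in ranked[:k]]

print(retrieve("can we enforce a non-compete clause"))
```

The point of the sketch is the division of labor: the expensive work (embedding the corpus) happens once at index time, and each query afterward is a cheap nearest-neighbor lookup. That asymmetry is the compression.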

That's compression of knowledge. The information hasn't changed. The case law is the same case law. But the distance between a person and the knowledge they need has collapsed from days to seconds. The entire research workflow — the assistant, the index cards, the late nights in the library — gets compressed into a retrieval call.

And this pattern is everywhere now. Every company building a support chatbot on top of their docs, every team indexing their internal knowledge base for semantic search, every developer using embeddings to make their data queryable in natural language — they're all doing the same thing. They're compressing the distance between a question and its answer.

Time and artifact compression

Here's where it gets more interesting. Once you have tools like Claude Code wired up with MCP servers and the right context, you're not just compressing knowledge retrieval anymore. You're compressing entire workflows.

Consider a concrete example. You need to build an internal dashboard that lets your team query customer usage data — filtering by plan tier, date range, feature adoption — without filing a ticket with engineering every time someone in sales or support needs an answer. The old version of this project has a familiar shape: a product brief, a design review, a backend engineer building the API endpoints, a frontend engineer wiring up the UI, a round of QA, a deploy pipeline. Spread across two or three sprints, minimum. Not because any single step is hard, but because the coordination overhead between steps is enormous.

With AI tooling, one person who understands the data model and knows what the team actually needs can move through the whole thing in a continuous session. Describe the schema, generate the query layer, scaffold the UI, iterate on the layout, test the edge cases, deploy. The parts that used to create drag — waiting for a handoff, writing a ticket to describe what you already know, sitting in a meeting to align on scope — simply don't happen. The gap between "I know exactly what this should do" and "it's running in production" collapses.

What's being compressed here isn't just time. It's the artifact chain. A multi-sprint project produces a trail of artifacts — design docs, tickets, PRs, review comments, deployment configs. When one person does it in a day with AI assistance, many of those intermediate artifacts simply never need to exist. The scaffolding we built to manage the gap between intent and implementation becomes unnecessary when the gap itself shrinks.

This is the same pattern we've applied to our own internal workflows at WorkOS — taking multi-step manual processes and compressing them into something that's immediately queryable or automated to the point where anyone on the team can use it, not just the person who originally understood the plumbing. The compression isn't just about speed. It's about removing the bottleneck of specialized access.

This is why it feels different from previous waves of developer tooling. Better frameworks and better CI/CD pipelines made teams faster. AI tooling compresses the work itself.

Skill compression

This is the one I think has the biggest implications, and it's also the one that makes people the most uncomfortable.

A single person can now operate at a level that previously required multiple specialists. Not because they've suddenly acquired deep expertise in design, backend engineering, DevOps, and security. Because the AI tools compress the skill gap between "I roughly know what needs to happen" and "I can actually execute it at a professional level."

A backend engineer can now produce a frontend that's good enough to ship to internal users and, with some iteration, to customers. Not pixel-perfect — but functional, accessible, and structured well enough that a designer can refine it rather than rebuild it from scratch. A solo founder can stand up infrastructure that used to require a dedicated platform team. A developer who's never written a Terraform config can describe what they need and get something that works, then learn from the output what they'd need to change next time.

The skills haven't disappeared. Someone still needs to know what good looks like. Someone still needs to evaluate the output, catch the mistakes, and make the judgment calls. But the barrier between "I understand the problem" and "I can ship the solution" has been compressed in a way that changes what one person can do.

This is a redistribution of capability. The specialists still matter — their taste, their judgment, their ability to handle the genuinely hard edge cases. But the floor has risen dramatically. The minimum viable skill set needed to execute across domains is much smaller than it was two years ago.

And this raises a question that's worth being honest about: what happens to the pipeline that produces specialists? The junior associate in the legal example wasn't just doing grunt work. They were learning — developing the pattern recognition and judgment that would eventually make them a senior attorney. If RAG compresses away the research phase, you have to ask what replaces it as a training ground. The same question applies to junior developers, junior designers, junior analysts. Compression makes experienced practitioners more powerful, but it may also narrow the path that creates them. I don't think anyone has a clean answer to this yet, but ignoring it makes you less credible when you talk about what compression enables.

The compounding effect

There's something more interesting happening than just three types of compression running in parallel. Each layer enables the next, and the feedback loop is what makes this moment different from "tools are getting better."

Knowledge compression gives you instant access to what humanity knows. That faster access to knowledge directly enables time compression — you can act on information at speeds that weren't possible when retrieval was the bottleneck. And time compression, in turn, enables skill compression, because when execution is fast and cheap, you can afford to iterate your way to competence in a domain that isn't your specialty. Try, evaluate, adjust, try again — a cycle that used to take weeks now takes minutes.

Then the loop closes: more people building more things, across more domains, generates more knowledge and more artifacts, which feeds back into the retrieval systems that started the whole chain. The corpus gets richer. The compressions get tighter.

This compounding is why thinking about compression as a pattern — rather than evaluating each AI tool in isolation — gives you a more useful mental model. When you look at a new AI product, ask: what is this compressing? When you look at your own workflow, ask: where is there still unnecessary distance between intent and outcome? The answers point you toward where the leverage actually is.

The thing about compression

There's a property of compression that's worth sitting with: it's lossy.

When you compress knowledge retrieval, you might miss the serendipitous discovery that came from browsing the stacks. A lawyer using a RAG system will find the cases most semantically similar to their query, but they might never encounter the unrelated opinion from a different circuit that reframes the whole argument — the kind of thing a human researcher would stumble onto while pulling a book off the shelf. Serendipity doesn't index well.

When you compress timelines, you might skip the slow thinking that would have caught a fundamental design flaw. That internal dashboard you built in a day works, and it ships, and the team uses it — but three months later you discover that the data model assumptions baked into the query layer don't hold for your enterprise tier, and now you're rearchitecting something that a longer design phase would have pressure-tested. Speed has a cost, and the cost is often invisible until it isn't.

When you compress skill requirements, you might not notice what you don't know until it bites you. The Terraform config that the AI generated works in staging but has a security group misconfiguration that a platform engineer would have caught on sight. The frontend that the backend engineer shipped is functional but has accessibility failures that a specialist would never have allowed past review. The gap between "it works" and "it's right" is exactly the gap that compressed-away expertise used to fill.

This isn't an argument against compression. It's an argument for being deliberate about it. The best practitioners I've seen in this new era aren't the ones who compress everything blindly. They're the ones who know what to compress and what to leave at full resolution — who move fast through the parts where speed is safe, and slow down for the parts where shortcuts compound into debt.

Compression is the pattern. Judgment about when to apply it is the skill.
