June 10, 2025

AI isn't magic. Context chaining is.

Professional knowledge workers use AI tools more efficiently because they understand how to manage context. Learn the best tactics to uplevel your entire organization.

You've purchased multiple AI tools for your team — Pro subscriptions, piles of API credits, even training sessions. But the productivity gains haven't materialized. Your team is still asking the same questions, performing the same rituals, and getting blocked on the same Slack replies.

The problem isn't the tools. It's your mental model.

Most people treat AI like a fancy search engine, firing off isolated requests and getting isolated answers.

Professionals treat it like a charged battery — they build up concentrated understanding in one conversation, then blast that energy across multiple outputs at lightning speed.

The difference isn't the tools. It's knowing how to build the charge.

A Six-Hour Field Test

Yesterday I began with zero practical knowledge of Model Context Protocol (MCP) documentation servers. By the end of a single workday we had shipped production code, a local testing guide, internal docs, a public announcement draft, and social assets, all spun out of one evolving LLM conversation thread.

Here's exactly how the context chain played out.

Reverse-engineering Mastra's reference implementation (Building the charge)

I dropped the entire @mastra/mcp-docs-server codebase into gitingest, an excellent tool I wrote about in Context is King, and then into Cursor's composer.

Three conversational turns later, the model and I had surfaced the build system architecture, identified a sneaky installation bug that appears when you append @latest during npx execution, and mapped out the complete API surface. A quick Slack with Mastra's team confirmed the repro and saved an hour of debugging.
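If you haven't looked inside an MCP docs server before, the surface area is smaller than you might expect: register a few documentation tools, then serve them over stdio. The sketch below is not Mastra's actual code, just a minimal illustration of the shape using the official TypeScript MCP SDK; the getDocs tool name, the slug parameter, and the local ./docs directory are placeholders.

```ts
// Minimal MCP docs server sketch (illustrative, not Mastra's implementation).
import { readFile } from "node:fs/promises";
import { McpServer } from "@modelcontextprotocol/sdk/server/mcp.js";
import { StdioServerTransport } from "@modelcontextprotocol/sdk/server/stdio.js";
import { z } from "zod";

const server = new McpServer({ name: "docs-server", version: "0.1.0" });

// Hypothetical tool: return a markdown page from a local ./docs directory by slug.
server.tool(
  "getDocs",
  "Fetch a documentation page by slug",
  { slug: z.string() },
  async ({ slug }) => {
    const text = await readFile(`./docs/${slug}.md`, "utf8");
    return { content: [{ type: "text" as const, text }] };
  }
);

// MCP clients (Cursor, Claude, and friends) talk to this process over stdio.
await server.connect(new StdioServerTransport());
```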

Note that my use of an LLM did not replace human communication, but it allowed me to spin up on the project and load it into my own head so I could ask better questions, faster.

Use LLMs to clear low-hanging fruit more quickly before bothering another human with a request complex enough to merit their attention.

Fork 1 – Scaffolding a WorkOS-aware clone

With Mastra's architecture now "charged" into context, I opened a new conversation that inherited the entire codebase understanding and prompted: "Same endpoints and MIME types, but serve our private workos-docs package. Prefer esbuild over bundling complexity and keep the container lean."

I talked through the nuances of how our own repo setup would require a slightly different build approach, and once I was satisfied that I'd considered all relevant angles, I asked the LLM to scaffold our docs server.

A few minutes later, I had an initial version that was sufficient to begin local testing and verification.
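For flavor, the build side of that scaffold is a single esbuild script rather than a heavier bundler chain. Here is roughly what that looks like; the entry point, Node target, and externals are assumptions for illustration, not the exact config we shipped.

```ts
// build.ts: bundle the server with esbuild (illustrative configuration).
import { build } from "esbuild";

await build({
  entryPoints: ["src/index.ts"],
  bundle: true,
  platform: "node",
  target: "node20",
  format: "esm",
  outfile: "dist/index.js",
  sourcemap: true,
  // Keep the MCP SDK external so the output (and the container) stays lean.
  external: ["@modelcontextprotocol/sdk"],
});
```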

Fork 2 – Internal testing guide

Still in the same context chain, I requested an internal testing guide that referenced the specific branch, flagged the exact preview commands, and included warnings about the @latest installation bug we'd discovered.

My colleagues are busy professionals with their own work to do, so I can best respect their time by preparing a comprehensive testing guide that shows them exactly how to get started and verify the server end to end locally in five minutes or less. This reduces the perceived and actual time investment required to satisfy my ask, making it more likely that more of my colleagues will agree to help me test.

Because every prior decision rode along in the conversation memory, it was effortless to request a local testing guide worthy of my colleagues.

I walked through the guide myself from a clean starting point to ensure it worked as intended. I remain fully responsible for verifying all of my artifacts: their functionality, their typos, and their defects.
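Concretely, the core of such a guide is a smoke test that spawns the local build and exercises a tool end to end. Here's a sketch using the MCP TypeScript SDK's client; the tool name, arguments, and output path carry over from the earlier illustrative snippets, not from our real server.

```ts
// smoke-test.ts: spawn the local build over stdio and exercise one tool.
import { Client } from "@modelcontextprotocol/sdk/client/index.js";
import { StdioClientTransport } from "@modelcontextprotocol/sdk/client/stdio.js";

const transport = new StdioClientTransport({
  command: "node",
  args: ["dist/index.js"], // path assumed from the build sketch above
});

const client = new Client({ name: "docs-smoke-test", version: "0.0.1" });
await client.connect(transport);

// Confirm the server advertises the tools we expect, then call one end to end.
const { tools } = await client.listTools();
console.log("tools:", tools.map((t) => t.name));

const result = await client.callTool({
  name: "getDocs",
  arguments: { slug: "sso/quickstart" }, // placeholder slug
});
console.log(result.content);

await client.close();
```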

Fork 3 – Instant teammate communications

Next prompt: "Write a clear request for my colleagues to use our local testing guide to help me verify our new MCP docs server".

A lot of engineers are meticulous with their code and architecture documents but sloppier with interpersonal and team-wide communications.

Asking an LLM to review or condense your Slack message is a great way to make it crisper and clearer, and to reduce the time it takes others to read and comprehend your requests.

It was easy to create a compelling Slack message pointing to the local testing guide and explaining how it would take five minutes or less to help me verify our new server. As a result, several teammates signed up to assist, giving us additional dogfooding opportunities and therefore more confidence we were ready for an initial release.

Fork 4 – Outward-facing content

Two final conversational turns, still within Cursor, generated the first draft of the announcement blog post and concise social media copy, each referencing specific branch names, port configurations, and known edge cases without me re-explaining any technical context.

Final tally

  • Elapsed time: Roughly six hours
  • Distinct conversation threads: four
  • Deliverables: production code, test suite, internal docs, public announcement, social assets

What is Context Chaining?

For three years, I've experimented with every available AI tool, consulted with senior engineers, designers, and writers, and observed a clear pattern among the most effective practitioners. They don't just use AI — they chain context.

Context chaining is the practice of building deep understanding in collaboration with AI, then systematically directing that context across every deliverable you need.

It's the difference between asking ChatGPT random questions and conducting a sustained intellectual partnership where understanding compounds across outputs.

How Context Chaining Actually Works

Build Primary Context First

Professional knowledge workers don't delegate understanding to AI. We use it to accelerate understanding.

The process starts the same way it always has — by exploring — but now we can move faster through discussion, code analysis, and rapid prototyping with AI as a thinking partner.

What NOT to do: Jump into ChatGPT and ask, "How do I build an MCP server?" You'll get generic documentation regurgitated back at you. No context. No understanding of your specific needs.

What TO do: Upload the codebase of an existing implementation. Walk through it systematically with AI. Ask questions at each layer: "Why is this structured this way? What would happen if we changed X? How does this compare to the Y approach?"

Build genuine comprehension through investigation, not passive consumption.

When I reverse-engineered the Mastra MCP server, I didn't ask for tutorials. I dissected real code with the LLM as my thinking partner, tested hypotheses about the architecture, maintained ownership of the learning process, and verified all inputs and outputs manually.

Apply Context Systematically

Once you have primary context, you direct it across every output your project needs. From my initial MCP server context, I generated:

  • Working implementation accounting for our specific monorepo architecture
  • Comprehensive local testing plan with executable verification steps
  • Internal documentation that turns complex setup into a 5-minute process
  • Team communication linking everything together and requesting verification support
  • External blog post and social media content announcing the release

Same context. Multiple applications. Half the time it would have taken traditionally.

What NOT to do: Start five separate ChatGPT conversations asking "write me a test plan," "write me documentation," "write me a blog post." Each output will be generic, disconnected, and require extensive revision.

What TO do: Keep everything in the same conversation thread. Reference your established context explicitly: "Using the MCP server architecture we just analyzed, create a testing plan that focuses on the three critical failure points we identified."

The LLM builds on shared understanding instead of starting from zero each time.

Preserve and Shuttle Context

The magic happens in how you manage context across conversations. You're not starting fresh each time — you're building on previous understanding, referencing earlier decisions, maintaining continuity of thought across multiple outputs.

What NOT to do: Let context windows fill up with irrelevant back-and-forth. When you hit token limits, start a completely fresh conversation and lose all your built-up understanding. Keep your AI conversations isolated from your actual work environment.

What TO do: Use the persistent workspaces that modern AI platforms like Claude, ChatGPT, and Cursor now offer, where you can upload key documents that get automatically referenced across conversations.

The real power comes from integrations that connect LLMs directly to your work environment. Claude can search your Google Drive, Gmail, and Calendar. ChatGPT connects to Slack, Notion, and dozens of other tools. Instead of copying and pasting information between systems, the LLM pulls live context from where your work actually lives.

When I was analyzing the MCP server architecture, the LLM could reference our existing codebase structure, pull in relevant team discussions, and understand project constraints from our previous conversations. This isn't just convenience — it's context multiplication.

When you've built up significant understanding in one conversation, create new threads that explicitly reference your key insights: "I've been analyzing MCP server architecture in our previous discussion. Key findings: [3-4 bullet points]. Now I need to create user-facing documentation that reflects this technical understanding."

You're context shepherding, not prompt writing — and the tools are getting better at helping you preserve that hard-won understanding.

Why This Changes Everything

Traditional knowledge work required extensive coordination between specialists. Product managers briefed designers, who briefed engineers, who briefed marketers. Context got lost in translation at every handoff.

Context chaining collapses these handoffs. One person with deep context can direct AI to produce deliverables across multiple disciplines, maintaining coherence throughout.

You become a conductor orchestrating AI capabilities rather than a user making isolated requests.

This isn't about AI replacing human judgment — it's about amplifying human context across more domains than any individual could traditionally handle.

When I shipped the WorkOS MCP server, I wasn't just writing code. I was maintaining technical accuracy across implementation, testing, documentation, team communication, and marketing — all from the same foundational understanding, all in a single workday.

The Mental Model Shift

Stop thinking about AI as a tool you use occasionally. Start thinking about it as a thinking partner that never forgets, never gets tired, and can instantly apply your shared context to new problems.

The professionals getting extraordinary results aren't using different AI tools. They're using the same tools with a fundamentally different approach: building context once, applying it everywhere, and maintaining intellectual continuity across entire projects.

That's why your team's productivity gains haven't materialized yet. You're still thinking in terms of individual tasks rather than sustained context.

Try It on Your Next WorkOS Integration

Spinning up Directory Sync or SSO? Upload the SDK documentation, walk through the integration flow once with AI, and keep that conversation alive. From the same thread, ask for unit tests, customer-facing code examples, or changelog updates. The context battery is already charged — just direct the energy where you need it.

Or better yet, try the WorkOS MCP docs server in your AI coding environment. Install it once, and your AI assistant will have persistent access to our latest documentation, code examples, and changelogs right where you're already working.

Fix the mental model, and the results follow.
