June 9, 2026

Directory sync beyond SCIM: Why "we support SCIM" isn't enough

What you're actually signing up for when a customer's IdP doesn't speak SCIM.


Most teams building enterprise software reach the same conclusion at roughly the same time: SCIM is the standard, so we'll support SCIM and call it done. It's a reasonable place to start. The contract is clean, the schema is standardized, and every major IdP claims to support it.

Then the customer paying you half a million dollars a year emails to say their directory doesn't work that way.

Four columns grouping directory providers by sync method: SCIM-native (Okta, Entra ID, JumpCloud, OneLogin, Rippling), pull-only (Google Workspace), on-prem (Active Directory, LDAP), and HRIS/CSV (Workday, BambooHR, HiBob, SFTP). Only the SCIM-native column pushes events to your app.

The part nobody tells you about SCIM

SCIM works well when both sides of the integration cooperate. The IdP pushes user create, update, and delete requests (POST, PUT or PATCH, and DELETE against your /Users endpoint), you reconcile against your user model, and you're done. In practice, a substantial share of enterprise directory deployments don't fit that mold.

  • Google Workspace has a rich Directory API, but it doesn't push SCIM events to your app. For changes originating in Google, you either poll the API or subscribe to push notifications on specific resources. There is no outbound SCIM feed to consume.
  • On-prem Active Directory and LDAP often have no webhook surface at all, and sometimes no internet egress. Companies running these systems aren't edge cases; they're often the largest, oldest, and most valuable customers you'll land.
  • HRIS platforms (Workday, BambooHR, HiBob, and dozens of others) are the real source of truth for employment status. They own who is hired, who is terminated, who is on leave. Each one has its own API shape, its own auth model, and sometimes the only "integration" on offer is a scheduled SFTP drop of a CSV file.

You can't tell any of these customers "we only support SCIM." So you either build the fallbacks, or you don't sell to them.

What building the fallbacks actually involves

This is the part that looks like a sprint and turns out to be a permanent engineering surface.

Polling infrastructure

When you can't be pushed to, you pull. Pulling means polling, and polling immediately raises questions that don't have obvious answers.

How frequently do you poll? Too often and you burn rate limits and money. Too infrequently and a terminated employee keeps access for hours after their last day, which is the exact scenario enterprise security teams lose sleep over.

How do you detect change efficiently? Full snapshot diffs are simple but expensive. Delta tokens (Google's syncToken, Microsoft Graph's @odata.deltaLink) are cheaper but require you to maintain state across runs. And delta streams drift. In practice, they drop events often enough that you cannot treat them as authoritative on their own. You need a nightly full reconciliation job running underneath as a correctness guarantee, with the delta stream only serving as a latency optimization on top.
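The nightly reconciliation can be sketched as a plain snapshot diff. This is a minimal illustration, not a production design: it assumes both the stored users and the fresh snapshot fit in memory as dictionaries keyed by a stable source ID, and the names `SnapshotDiff` and `diff_snapshot` are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class SnapshotDiff:
    created: list[str] = field(default_factory=list)
    updated: list[str] = field(default_factory=list)
    missing: list[str] = field(default_factory=list)


def diff_snapshot(stored: dict[str, dict], snapshot: dict[str, dict]) -> SnapshotDiff:
    """Compare locally stored users against a full snapshot from the source.

    Both arguments map a stable source ID to a dict of attributes.
    """
    diff = SnapshotDiff()
    for user_id, attrs in snapshot.items():
        if user_id not in stored:
            diff.created.append(user_id)
        elif stored[user_id] != attrs:
            diff.updated.append(user_id)
    # Users we hold that the source no longer returns were removed upstream,
    # possibly through an event the delta stream never delivered.
    diff.missing = [uid for uid in stored if uid not in snapshot]
    return diff


stored = {"u1": {"active": True}, "u2": {"active": True}}
snapshot = {"u1": {"active": False}, "u3": {"active": True}}
print(diff_snapshot(stored, snapshot))
# SnapshotDiff(created=['u3'], updated=['u1'], missing=['u2'])
```

Whatever the delta stream missed during the day shows up here as an entry in `updated` or `missing`, which is why the snapshot is the layer you trust.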

A tiered cadence that works: critical attributes (active status, group membership for high-privilege groups) every few minutes. Profile attributes like display name and title on the order of an hour. Full snapshots nightly.
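As a rough sketch, the tiers can live in a small table that a scheduler consults on every tick. The exact intervals below (5 minutes, 1 hour, 24 hours) are illustrative assumptions standing in for "every few minutes", "on the order of an hour", and "nightly", and `CADENCE` and `due_tiers` are hypothetical names.

```python
from datetime import datetime, timedelta

# Assumed intervals per tier; tune these per provider and rate limit.
CADENCE = {
    "critical": timedelta(minutes=5),      # active status, high-privilege groups
    "profile": timedelta(hours=1),         # display name, title
    "full_snapshot": timedelta(hours=24),  # nightly correctness pass
}


def due_tiers(last_run: dict[str, datetime], now: datetime) -> list[str]:
    """Return the tiers whose interval has elapsed since their last run."""
    due = []
    for tier, interval in CADENCE.items():
        previous = last_run.get(tier)
        # A tier that has never run is always due.
        if previous is None or now - previous >= interval:
            due.append(tier)
    return due


now = datetime(2026, 6, 9, 12, 0)
last_run = {
    "critical": now - timedelta(minutes=2),
    "profile": now - timedelta(hours=2),
}
print(due_tiers(last_run, now))
# ['profile', 'full_snapshot']
```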

Architecture diagram showing three sync tiers — webhooks (fast path, not guaranteed), delta polling (every few minutes), and full snapshot (nightly, correctness guarantee) — all feeding into a reconciliation engine that writes to a canonical user model with stable ID, active status, and group membership.

Webhooks, but not only webhooks

Some non-SCIM systems do offer webhooks. Google Workspace has push notifications for some Directory API resources. Several HRIS platforms emit change events. When they're available, you take them, but you never rely on them as your only signal.

Webhooks get lost. They get delivered out of order. They get delivered twice. The IdP's webhook subsystem goes down and you often don't find out until your own reconciliation catches the drift. The pattern that holds up is webhooks as a latency optimization on top of polling as a correctness guarantee. Webhook fires, you fetch the resource and reconcile. Polling runs anyway on its normal cadence.

Your reconciliation logic also has to be idempotent across all of this. If you process the same user update three times (once from a webhook, once from a delta poll, once from a nightly snapshot), the end state has to be identical every time.
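One way to get that property, sketched under simplifying assumptions: treat every trigger (webhook, delta poll, snapshot) as a hint that something changed, fetch the current state from the source, and write that state rather than the event payload. The `reconcile` function and the in-memory `store` are hypothetical stand-ins for your own persistence layer.

```python
from typing import Callable, Optional


def reconcile(
    store: dict[str, dict],
    user_id: str,
    fetch: Callable[[str], Optional[dict]],
) -> bool:
    """Fetch the current source state for one user and apply it.

    Returns True if the store changed. Because the write is "set to the
    current state" rather than "apply this event", repeating the call is
    harmless.
    """
    current = fetch(user_id)
    if current is None:
        return False  # removal is handled by a separate deactivation path
    if store.get(user_id) == current:
        return False  # already converged, nothing to write
    store[user_id] = dict(current)
    return True


source = {"u1": {"active": False, "title": "Engineer"}}
store: dict[str, dict] = {}

# The same change arrives via webhook, delta poll, and nightly snapshot.
results = [reconcile(store, "u1", source.get) for _ in range(3)]
print(results)
# [True, False, False]
```

Out-of-order and duplicate deliveries stop mattering with this shape, because the event only decides when you look, not what you write.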

Conflict resolution across sources

The harder problem isn't pulling data. It's deciding which pulled data wins.

A real enterprise customer often has an HRIS that owns employment status, an IdP that owns authentication and group membership, and an IT system that owns email and provisioning state. These systems disagree constantly. Someone is terminated in Workday on Friday afternoon but the IdP doesn't deactivate their account until Monday. A new hire exists in the HRIS a week before they have an email address. A contractor exists in the IdP but never in the HRIS.

You need an explicit priority model, configured per customer, that answers: for each attribute on the user, which source wins? Employment status comes from the HRIS. Group membership comes from the IdP. Email comes from the IdP. Department comes from the HRIS. And when two sources disagree about an attribute that should be authoritative, you need to log it and surface it, not silently pick one. Silent last-write-wins is how you end up with a terminated employee who still has SSO access.
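A minimal sketch of such a priority model follows. The attribute names, the source labels `hris` and `idp`, and the `resolve` function are assumptions made for illustration. In a real system the table would be loaded per customer and the conflicts written to an audit log rather than returned.

```python
from dataclasses import dataclass

# Hypothetical per-customer config: attribute -> sources in priority order.
PRIORITY = {
    "employment_status": ["hris", "idp"],
    "department": ["hris", "idp"],
    "email": ["idp", "hris"],
    "groups": ["idp"],
}


@dataclass
class Conflict:
    attribute: str
    winner: str
    values: dict[str, object]


def resolve(
    records: dict[str, dict], priority: dict[str, list[str]]
) -> tuple[dict, list[Conflict]]:
    """Pick each attribute from its highest-priority source and report disagreements."""
    resolved: dict = {}
    conflicts: list[Conflict] = []
    for attribute, sources in priority.items():
        # Values reported for this attribute, kept in priority order.
        reported = {
            s: records[s][attribute]
            for s in sources
            if attribute in records.get(s, {})
        }
        if not reported:
            continue
        winner = next(iter(reported))
        resolved[attribute] = reported[winner]
        # repr() lets us compare unhashable values such as group lists.
        if len({repr(v) for v in reported.values()}) > 1:
            conflicts.append(Conflict(attribute, winner, reported))
    return resolved, conflicts


records = {
    "hris": {"employment_status": "terminated", "department": "Sales"},
    "idp": {"employment_status": "active", "email": "a@example.com", "groups": ["admins"]},
}
resolved, conflicts = resolve(records, PRIORITY)
print(resolved["employment_status"])  # terminated
print(conflicts[0].attribute, conflicts[0].winner)  # employment_status hris
```

The Friday-termination case lands in `conflicts` instead of vanishing: the HRIS value wins, and the disagreement with the IdP is still on record.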

You also never hard-delete. When a source says a user is gone, you mark them inactive with a reason and a timestamp. You will need that record the day a customer asks why someone lost access at 3am on a Tuesday.
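In code, a soft delete can be as small as the sketch below. The `UserRecord` fields and the `deactivate` helper are hypothetical. The one design choice worth noting is that a repeated deactivation keeps the original reason and timestamp, so the audit trail reflects the first signal you received.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional


@dataclass
class UserRecord:
    source_id: str
    active: bool = True
    deactivated_at: Optional[datetime] = None
    deactivation_reason: Optional[str] = None


def deactivate(
    user: UserRecord, reason: str, now: Optional[datetime] = None
) -> UserRecord:
    """Mark a user inactive instead of deleting the row."""
    if not user.active:
        return user  # already inactive: preserve the first reason and timestamp
    user.active = False
    user.deactivated_at = now or datetime.now(timezone.utc)
    user.deactivation_reason = reason
    return user


user = UserRecord(source_id="emp-1042")
deactivate(user, "terminated in HRIS", now=datetime(2026, 6, 9, 3, 0, tzinfo=timezone.utc))
deactivate(user, "missing from IdP snapshot")  # no-op, record already inactive
print(user.deactivation_reason)
# terminated in HRIS
```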

Schema normalization

Every directory system models a user differently. SCIM defines userName, emails, active, groups. Google's Directory API uses primaryEmail and suspended. Active Directory, queried over LDAP, gives you sAMAccountName, userPrincipalName, and memberOf DNs. A CSV gives you whatever columns the customer's IT admin felt like exporting that month.

All of that has to map to one internal user model that the rest of your product can consume without knowing or caring where the data came from. Each source connector has to own the job of producing canonical output. You preserve the raw payload alongside the normalized one so that when a mapping turns out to be wrong (and it will) you can re-derive without re-fetching. And you key everything off a stable identifier that the source guarantees won't change: Google's user id, Active Directory's objectGUID, the HRIS employee ID. Not email addresses. Not display names. Both of those change.
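Here is a minimal sketch of two connectors producing the same canonical shape. The `CanonicalUser` model and the function names are assumptions. The payload fields used (`id`, `primaryEmail`, and `suspended` for Google; `id`, `userName`, `emails`, and `active` for SCIM) are the ones those schemas define, but a real connector would handle many more.

```python
from dataclasses import dataclass


@dataclass
class CanonicalUser:
    source: str
    source_id: str  # stable identifier from the source, never the email
    email: str
    active: bool
    raw: dict  # original payload, kept so mappings can be re-derived later


def from_google(payload: dict) -> CanonicalUser:
    return CanonicalUser(
        source="google",
        source_id=payload["id"],
        email=payload["primaryEmail"],
        active=not payload.get("suspended", False),
        raw=payload,
    )


def from_scim(payload: dict) -> CanonicalUser:
    emails = payload.get("emails", [])
    # Prefer the entry flagged primary, then the first entry, then userName.
    primary = next(
        (e for e in emails if e.get("primary")),
        emails[0] if emails else {},
    )
    return CanonicalUser(
        source="scim",
        source_id=payload["id"],
        email=primary.get("value", payload["userName"]),
        active=payload.get("active", True),
        raw=payload,
    )


google_user = from_google(
    {"id": "1089", "primaryEmail": "ana@example.com", "suspended": True}
)
scim_user = from_scim(
    {
        "id": "2819c223",
        "userName": "ana",
        "active": True,
        "emails": [{"value": "ana@example.com", "primary": True}],
    }
)
print(google_user.active, scim_user.email)
# False ana@example.com
```

Downstream code only ever sees `CanonicalUser`, and because `raw` travels with it, fixing a wrong mapping is a re-derivation rather than a re-sync.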

The actual cost

None of this is a single project. It's an ongoing maintenance surface. Delta API behavior changes when providers ship updates. HRIS vendors modify their export formats. Customers switch IdPs and you inherit a migration. A new enterprise prospect shows up with a directory system you've never seen before and a six-figure contract attached to the conversation.

The teams that treat directory sync as a one-time build eventually end up with a patchwork of connector-specific hacks, reconciliation jobs that nobody fully understands, and a growing list of customers who are "almost" fully provisioned.

What WorkOS handles for you

WorkOS Directory Sync is built around the reality that SCIM is one source among several, not the only one worth supporting.

Out of the box, it handles the provider-specific complexity: Google Workspace's pull model, Workday and BambooHR's HRIS APIs, SCIM providers including Okta, Entra ID, JumpCloud, and more, plus SFTP for the customers who hand you a CSV and call it integration. The connector layer, the polling infrastructure, the webhook handling, and the schema normalization are all WorkOS's problem, not yours.

What you get is a single, consistent API your app talks to regardless of what's on the other side. Directory updates arrive via webhooks or the Events API in a normalized shape. You don't write polling logic for Google, then different polling logic for Workday, then figure out SFTP ingestion for whoever comes next.

SCIM-first is still the right default when a customer's IdP supports it cleanly. But "SCIM or nothing" isn't a viable product position for anyone selling into the enterprise. The customers with the largest contracts are often the ones with the most complicated directories. WorkOS makes it possible to say yes to them without making directory sync a permanent line item on your roadmap.