If you’re here, you’ve likely been tasked with building SAML-based SSO as a requirement for an enterprise deal. If you’re just diving into the problem space of SSO / SAML, we’d first suggest checking out The Developer’s Guide to SSO. Otherwise, buckle up for a brief but titillating foray into why XML-based authentication is... challenging.
Why is SAML SSO so vulnerability-prone
The attack surface for SAML authentication is extensive, mostly due to the fact that SAML is XML-based.
XML is a semantic-free meta-language - it’s hard to form, hard to read, and hard to parse. Combined with the high complexity of the SAML specification and the number of parties involved in establishing authentication, we get what often feels like a big ball of mud and all the accompanying implications. Be prepared to tackle a steep learning curve, lots of bugs, high maintenance costs, attack vectors galore, and an absurd spread of edge cases.
Most SAML SSO security vulnerabilities are introduced by Service Providers (SPs) improperly validating and processing SAML responses received from Identity Providers (IdPs). This happens because SAML SSO is typically not a core-value feature for an application, nor is the implementation common knowledge for most developers. Unknowns become even less likely to be identified and addressed when the pressure is on to just deliver something to unblock a high-value contract - as is oftentimes the case. However, to build SAML SSO safely and securely in-house requires significant buy-in and investment by teams - on the scale of months, representing hundreds of thousands of dollars in developer time. If not done right, you expose your application and your customers to potentially huge security risks. To drive that home, here are just a few recently published SAML-related vulnerabilities:
January 27, 2020 - “... GitLab SAML integration had a validation issue that permitted an attacker to takeover another user's account.”
May 7, 2020 - “... an attacker could exploit this [SAML] vulnerability to bypass the authentication process and gain full administrative access to the system [IBM Data Risk Manager].”
June 19, 2020 - “An attacker could authenticate to a different user's [Mattermost] account via a crafted SAML response.”
June 29, 2020 - “... improper verification of signatures in PAN-OS SAML authentication enables an unauthenticated network-based attacker to access protected resources.”
It should be evident by now that oversights in SAML implementations are ubiquitous and problematic, even among experienced engineering teams.
So let’s dive into some of the more common security pitfalls developers building SAML-based SSO should be aware of, as well as cover a few suggested countermeasures. Just to be clear, this guide is by no means comprehensive and is meant to provide a starting point for SAML security considerations as well as some follow-on resources.
Brief anatomy of a SAML response
Let's say we're integrating our application with Okta via SAML. Below is an example of an XML document we might get when attempting to authenticate a user, containing a simplified but valid SAML response:
Like we mentioned earlier, the SAML spec is complex and responses can get lengthy, so this example is comparatively quite terse. Keeping things to what you should know before we go into SAML vulnerabilities, let’s walk through what the response (<saml2p:Response>) is communicating:
Line 2 begins the SAML response, which has the unique ID id72697176167120131651975993 and is intended for consumption by the workos-test service provider’s Assertion Consumer Service (ACS) URL, i.e. the endpoint https://api.workos-test.com/auth/okta/callback.
Line 3 specifies the issuer (<saml2:Issuer>) and contains the unique URI (also referred to as EntityID) of the IdP that issued the response, in this case http://www.okta.com/exk1klancwHzz1SNi357.
Line 7 begins the assertion (<saml2:Assertion>) with the unique ID id7269717616793800631152500. An assertion is a package of information asserting the identity of a user, often containing additional user attributes like first / last name, email, ID, etc.
Line 8 specifies the issuer of the assertion itself, in this case also http://www.okta.com/exk1klancwHzz1SNi357.
Lines 9 - 30 contain the digital signature (<ds:Signature>) over the assertion, which should be validated to determine the authenticity of the assertion.
Lines 31 - 36 specify the subject (<saml2:Subject>) of the assertion, i.e. the authenticated principal / user corresponding to the unique identifier found in <saml2:NameID>, who in this case is [email protected].
Line 37 (<saml2:Conditions>) defines the window of time for which the assertion should be considered valid, i.e. from NotBefore (inclusive) to NotOnOrAfter (exclusive).
For the purposes of readability, the SAML 2.0 XML snippets in the remainder of this blog will be simplified, use shorthand, and be stripped of nodes that would otherwise be required in reality but are not relevant to what’s being illustrated. We’ll use a mythical IssueTracker, ContractManager, and PayrollService as hypothetical SPs that have implemented SAML authentication, which you should think of as placeholders for your application or other SAML SSO-enabled apps.
Disable DTD processing
The first step in processing a SAML response is parsing the payload. Parsing and loading an XML document into memory is an inherently expensive set of operations, but can be unexpectedly costly due to a feature of XML that allows references to external or remote documents, i.e. Document Type Definitions (DTDs).
When a DTD is encountered, parsers will try to fetch and load the referenced document as well. If the referenced document is large enough or results in infinitely looping references, your server can be slowed or even brought down trying to complete the process. The same holds true if the payload itself is very large, DTDs or not.
Two low-hanging mitigations you should implement to prevent buffer overflows are:
Limiting SAML payload size to < 1MB. 1MB is a generous upper limit and should be tuned down based on average received payload size.
Configuring your XML parser to never fetch remote files or try to load and parse DTDs. Some XML parsers do so by default, for example, Python’s xml.etree.ElementTree module.
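As a rough illustration of both mitigations, here is a minimal sketch that uses only Python’s standard library. The names parse_saml_xml, UnsafeXMLError, and MAX_SAML_BYTES are hypothetical, and the 1 MB ceiling simply mirrors the figure above. The sketch pre-scans the payload with expat and refuses any document that carries a DOCTYPE declaration before handing it to ElementTree.

```python
import xml.etree.ElementTree as ET
from xml.parsers import expat

# Assumed ceiling taken from the guidance above; tune it down for your traffic.
MAX_SAML_BYTES = 1_000_000


class UnsafeXMLError(ValueError):
    """Raised when a payload is too large or declares a DTD (hypothetical name)."""


def _reject_doctype(name, system_id, public_id, has_internal_subset):
    # expat calls this handler as soon as it sees a <!DOCTYPE ...> declaration.
    raise UnsafeXMLError("DTDs are not allowed in SAML payloads")


def parse_saml_xml(xml_bytes: bytes) -> ET.Element:
    """Parse a decoded SAML payload, rejecting oversized or DTD-bearing input."""
    if len(xml_bytes) > MAX_SAML_BYTES:
        raise UnsafeXMLError("SAML payload exceeds the configured size limit")

    # First pass: scan with expat only to detect a DOCTYPE declaration.
    scanner = expat.ParserCreate()
    scanner.StartDoctypeDeclHandler = _reject_doctype
    scanner.Parse(xml_bytes, True)

    # Second pass: build the tree now that no DTD is present.
    return ET.fromstring(xml_bytes)
```

Parsing twice is wasteful but keeps the sketch short; a vetted package such as defusedxml applies the same idea in a single pass.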
XML processing, and thus by extension SAML response processing, is vulnerable to buffer overflow attacks from other scenarios described later on in this post. And unfortunately, a service outage is among the mildest of outcomes compared to what exploiting XML DTDs makes possible - it is a dark and anxiety-inducing rabbit hole.
So, if you’re not writing your own XML parser (generally not suggested), it’s important to vet the XML parser(s) your application and its dependencies use - ensure they handle other exploits like Billion Laughs and Zip Bombs.
Validate the SAML response schema first
The primary security mechanism in the SAML handshake is the cryptographic validation of XML Signatures (XML-DSig) - which establishes the trust chain between IdPs and SPs. XML-DSig validation should always be done prior to executing business logic; however, the separation between signature verification and operating on the rest of a SAML payload opens up SAML authentication to vulnerabilities exposed by what are called XML Signature Wrapping (XSW) attacks. These attacks have numerous permutations which can result in outcomes such as (but not limited to):
Denial-of-service by inserting arbitrary elements that lead to buffer overflows.
Escalating permissions by injecting assertions that allow an adversary to impersonate and be authenticated as another user, like an account admin.
The exploit here consists of modifying the payload without invalidating any signatures - think SQL Injection or Cross Site Scripting, but for XML.
Original response (pre-XSW):
Modified response (post-XSW):
The broadest countermeasure to XSW attacks is validating the schema of the SAML XML document. Payloads for SAML responses of any given IdP should have a deterministic standard schema that can be used as a reference in a schema compliance validation module, which should be executed prior to XML-DSig verification. Here are example schemas used by OneLogin’s python3-saml package to perform XML schema validation. Schemas should be vetted local copies as opposed to being fetched from 3rd party remote locations at runtime or on server start.
All of that being said, schema validation isn’t foolproof; there is room for error in the validation module logic itself, as well as in the syntactic rigor of the reference schema. A second low-hanging countermeasure to XSW attacks that should be employed for the sake of redundancy is to always use absolute XPath expressions to select elements in processes post-schema validation. Explicit absolute XPath expressions set an unambiguous expectation for the location of elements.
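To illustrate position-based selection, here is a minimal sketch using Python’s xml.etree.ElementTree. ElementTree supports only a subset of XPath and does not accept absolute paths on an element, so the sketch checks the root tag and then selects direct children only, which amounts to /Response/Assertion/Signature. The function name select_signed_assertions is hypothetical, and encrypted assertions are out of scope here.

```python
import xml.etree.ElementTree as ET

SAMLP = "urn:oasis:names:tc:SAML:2.0:protocol"
SAML = "urn:oasis:names:tc:SAML:2.0:assertion"
DS = "http://www.w3.org/2000/09/xmldsig#"
NS = {"samlp": SAMLP, "saml": SAML, "ds": DS}


def select_signed_assertions(root: ET.Element):
    """Return (assertion, signature) pairs found only at the expected positions."""
    if root.tag != f"{{{SAMLP}}}Response":
        raise ValueError("root element is not a SAML Response")

    # Direct children only: the equivalent of /Response/Assertion.
    direct = root.findall("./saml:Assertion", NS)
    # Every Assertion anywhere in the tree, including wrapped or nested ones.
    everywhere = list(root.iter(f"{{{SAML}}}Assertion"))
    if not direct or len(direct) != len(everywhere):
        raise ValueError("assertion found outside /Response/Assertion")

    pairs = []
    for assertion in direct:
        # The equivalent of /Response/Assertion/Signature.
        signatures = assertion.findall("./ds:Signature", NS)
        if len(signatures) != 1:
            raise ValueError("expected exactly one Signature per assertion")
        pairs.append((assertion, signatures[0]))
    return pairs
```

The returned pairs would then be handed to XML-DSig verification, so the assertion that gets processed is the same element whose signature was checked.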
Here’s an example of a valid response that’s been modified in an XSW attack (specifically a signature exclusion attack, more on that later):
This modification also exploits the common, incorrect, but not unreasonable assumption that a well-formed SAML response will only ever have a single assertion. So while XML-DSig verification would succeed for the signature returned by doc.getElementsByTagName("Signature"), the assertion returned and processed by doc.getElementsByTagName("Assertion") would be the injected snek assertion. This attack would have been more likely to fail if the XPath expression "/Response/Assertion/Signature" was used in the assertion signature validation logic.
Check that you’re the intended recipient
This sounds obvious, but make sure to check that a SAML response is intended for your app. This is low-hanging fruit that can prevent attacks exploiting IdPs that use a shared private signing key for all integrated SPs of a given tenant, as opposed to issuing unique keys per application. The most common attack entails the unauthorized lateral movement by a malicious user across an enterprise’s IdP-integrated apps:
A second scenario would be a third party impersonating your app and gaining user access. The likelihood of this attack vector being exploited is pretty low because the malicious party would need to be in possession of the IdP’s private signing key (among other things) - but we’re mentioning it for the sake of completeness:
There are Service Providers that don’t bother to check if they’re the intended recipient, relying only on the validity of assertion signatures to prove the sender is a trusted party and that the response is valid. But as we’ve illustrated above, valid signatures aren’t enough to prevent unwanted access.
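As a rough sketch of what checking the intended recipient can look like, the function below compares the response destination, the issuers, and the audience against values the SP already knows. It assumes signatures have been verified separately, and the function name and all parameters are hypothetical.

```python
import xml.etree.ElementTree as ET

NS = {
    "samlp": "urn:oasis:names:tc:SAML:2.0:protocol",
    "saml": "urn:oasis:names:tc:SAML:2.0:assertion",
}


def recipient_problems(root, expected_acs_url, trusted_issuers, sp_entity_id):
    """Return a list of reasons why this response is not meant for us."""
    problems = []

    # Destination must be present and equal to the ACS URL we expect.
    if root.get("Destination") != expected_acs_url:
        problems.append("Destination is missing or is not our ACS URL")

    # Response and assertion issuers must all be IdP EntityIDs we recognize.
    issuers = root.findall("./saml:Issuer", NS)
    issuers += root.findall("./saml:Assertion/saml:Issuer", NS)
    if not issuers or any((el.text or "").strip() not in trusted_issuers for el in issuers):
        problems.append("Issuer is missing or is not a recognized IdP EntityID")

    # Every assertion must name us as an audience.
    assertions = root.findall("./saml:Assertion", NS)
    if not assertions:
        problems.append("response contains no assertion")
    for assertion in assertions:
        audiences = {
            (el.text or "").strip()
            for el in assertion.findall(
                "./saml:Conditions/saml:AudienceRestriction/saml:Audience", NS
            )
        }
        if sp_entity_id not in audiences:
            problems.append("assertion does not list us as an audience")
    return problems
```

An empty list means all three checks passed; anything else should cause the response to be rejected.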
When dealing with security and authentication, stay paranoid my friend, and have some additional redundancies to catch edge cases. In this case, some easy-to-implement checks are:
The response destination is present, non-empty, and refers to an ACS URL that you are expecting.
The response and assertion issuers refer to an IdP EntityID you recognize.
You are the specified audience for any assertions.
Validate every signature
Like we mentioned earlier, cryptographic validation of signatures is the primary mechanism for determining the authenticity of SAML payloads. It’s a good idea to read through the W3C specs for XML signature processing because it anchors SAML security, but the pithy statement to remember when handling SAML responses is: only process entities that are validly signed.
There’s a class of attacks that exploit poorly implemented SP security logic known as signature exclusion attacks. These attacks will insert forged unsigned elements, banking on the possibility that the SP’s security logic will skip XML-DSig validation if no signature is found. Another common slipup is implementing validation logic that checks only the first assertion’s signature and then assumes remaining assertions are signed. Here are some rules to follow to avoid the most common oversights:
➞ The entire SAML response itself should be signed
➞ Every assertion should be signed
Something to note is that you should not assume a response will have only one assertion, and furthermore, each assertion should be signed in its entirety.
➞ Only accept signature algorithms from an explicitly defined set
JWT validation similarly can sometimes overlook this point, and in fact, there was a related Auth0 vulnerability exposed as recently as last year. If possible, we suggest hardcoding your validation logic to only accept RS256 as the signature algorithm. Otherwise, verify Algorithm attribute values are from a recognized set of URIs.
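The rules above can be sketched as a structural pre-check that runs before any cryptographic verification. This is not signature validation itself: it only confirms that the response and every assertion carry a signature, and that each signature names algorithms from an allowlist. The function name and the contents of the allowlist are assumptions; the URIs shown are the standard identifiers for exclusive canonicalization, RSA-SHA256, and SHA-256.

```python
import xml.etree.ElementTree as ET

DS = "http://www.w3.org/2000/09/xmldsig#"
SAML = "urn:oasis:names:tc:SAML:2.0:assertion"
NS = {"ds": DS, "saml": SAML}

# Assumed allowlist; extend it only with algorithms you have deliberately vetted.
ALLOWED_ALGORITHMS = {
    "CanonicalizationMethod": {"http://www.w3.org/2001/10/xml-exc-c14n#"},
    "SignatureMethod": {"http://www.w3.org/2001/04/xmldsig-more#rsa-sha256"},
    "DigestMethod": {"http://www.w3.org/2001/04/xmlenc#sha256"},
}


def signature_precheck(root: ET.Element):
    """Return problems that should stop processing before XML-DSig verification."""
    problems = []

    # The response itself must carry a signature as a direct child.
    if root.find("./ds:Signature", NS) is None:
        problems.append("response is not signed")

    # Every assertion, wherever it sits, must carry its own signature.
    for assertion in root.iter(f"{{{SAML}}}Assertion"):
        if assertion.find("./ds:Signature", NS) is None:
            problems.append(f"assertion {assertion.get('ID')} is not signed")

    # Every signature must declare only allowlisted algorithms.
    for signature in root.iter(f"{{{DS}}}Signature"):
        for name, allowed in ALLOWED_ALGORITHMS.items():
            elements = list(signature.iter(f"{{{DS}}}{name}"))
            if not elements:
                problems.append(f"signature has no {name}")
            for element in elements:
                if element.get("Algorithm") not in allowed:
                    problems.append(f"{name} uses a disallowed algorithm")
    return problems
```

A response that passes this pre-check still has to have each signature cryptographically verified against the IdP’s certificate.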
If you’re planning on using a third party library to do SAML processing and XML-DSig validation, be sure to vet what it does under the hood - especially if it does XML parsing and processing as well. As much as possible, try to avoid libraries that depend directly or indirectly on xmlsec1 or its dependency libxml2 - many SAML and XML libraries are just language-specific bindings for xmlsec1 and libxml2, and as a result, inherit all the same security vulnerabilities.
We highly suggest doing XML-DSig validation in native code - in fact, at WorkOS we built our SAML library from scratch for greater control, and so we can respond immediately to newly discovered SAML-related vulnerabilities.
Use the canonicalized XML
XML canonicalization is the process of transforming an XML document into a semantically identical but normalized representation (more on this later). The CanonicalizationMethod specifies which canonicalization algorithm to apply - the most commonly occurring one we’ve seen is xml-exc-c14n, which strips XML comments during transformation. This generally wouldn’t be a problem, except for the fact that most SAML libraries perform canonicalization prior to doing XML-DSig validation on the canonicalized assertions. Why is this a concern? Here’s what can happen if the library’s underlying XML element text extraction logic doesn’t consider inner comment nodes.
Suppose I’m a disgruntled developer who would dearly like a substantial raise from my company that uses Okta + PayrollService (this is all fictional, I’m not disgruntled). I used to work at PayrollService and so am pretty confident this exploit I’m about to attempt will work, because patching it never got prioritized in favor of feature work, and because no one external has noticed anything amiss, yet... Anyway.
I know that WorkOS IT always uses [email protected] as an administrative account for every app used within the organization (we don’t actually). So equipped with this knowledge and using my personal domain, I create a PayrollService account for a user [email protected] and set up SSO with Okta. Self-service free trials FTW.
Here’s a simplified SAML assertion authenticating me as a PayrollService user:
Now I can modify the SAML assertion by adding a comment:
This modified assertion doesn’t invalidate the signature because canonicalization will strip comments before XML-DSig verification - it will have the same canonical representation as the unmodified assertion. Great!
So now, believing the assertion is authentic, PayrollService checks to see which user is being authenticated. Its SAML library grabs the user identifier from the NameID element, but it, incorrectly, only reads the inner text of the element’s first node, i.e. [email protected]. Then PayrollService determines that [email protected] is indeed a user, and just like that, I’m in and ready to approve raises for myself and all my friends!
Duo Labs discovered this vulnerability back in 2018, and while some of the more commonly used open source SAML libraries have since addressed it, there undoubtedly remain many internal or open source libraries that haven’t. So echoing recommendations from before, vet your SAML and XML libraries.
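To make the flaw concrete, here is a small sketch using Python’s xml.dom.minidom, which keeps comment nodes in the tree. The NameID value and both domains are made-up placeholders and the two helper names are hypothetical; the point is only the difference between reading the first child node and reading every text node.

```python
from xml.dom import minidom

# Hypothetical tampered fragment: an attacker-owned identifier with an injected comment.
TAMPERED = (
    "<Assertion><Subject>"
    "<NameID>admin@victim.example<!-- injected -->.attacker.example</NameID>"
    "</Subject></Assertion>"
)


def name_id_first_node(xml_text: str) -> str:
    """Flawed extraction: reads only the first child node of NameID."""
    name_id = minidom.parseString(xml_text).getElementsByTagName("NameID")[0]
    return name_id.firstChild.data


def name_id_full_text(xml_text: str) -> str:
    """Safer extraction: joins every text node and skips comment nodes."""
    name_id = minidom.parseString(xml_text).getElementsByTagName("NameID")[0]
    return "".join(
        node.data
        for node in name_id.childNodes
        if node.nodeType in (node.TEXT_NODE, node.CDATA_SECTION_NODE)
    )
```

With the tampered fragment, the first helper yields the truncated identifier that belongs to someone else, while the second yields the full identifier the IdP actually signed. Tag-name lookup is used here only to keep the demonstration short.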
Ideally, comments wouldn’t be purged prior to XML-DSig validation, so that injected comments would indeed cause validation to fail - but that’s unrealistic or inadvisable to try to enforce for a couple reasons, which we’ll leave for another time. Instead, you’ll want to make sure that:
The canonicalized XML document is used in processes post-signature verification.
Or, barring the first, that full text-extraction is handled gracefully when inner comments exist.
Avoiding replay attacks
Replay attacks occur when a SAML response is captured and re-sent to the Service Provider for duplicate processing, which can have outcomes like denial-of-service for your users or, if the SP charges by API request, eating up request quotas. The most robust countermeasure against replay attacks is preventing the capture of SAML responses in the first place - which can be accomplished by using HTTPS (should be a given already) and never exposing the SAML response to the browser. Here’s what the authentication flow could look like:
However, very few IdPs actually support the Artifact Resolution Protocol, a requirement of back-channel SAML authentication. As a result, most SAML implementations rely entirely on the browser to relay SAML payloads between the SP and IdP:
Because the SAML response is exposed to the user agent, it becomes trivial to capture (by inspecting the dev console, XSS, or with malicious browser plugins) and replay a response. So another approach to mitigating replay attacks is to maintain a cache of previously seen assertion IDs, immediately rejecting responses containing any assertion with an ID that already exists in the cache. A cache item could have a TTL equal to the expiry datetime of the originating assertion, for example:
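One way to sketch such a cache is shown here, in memory and for a single process only. The class and method names are hypothetical; a real deployment would more likely use a shared store with an atomic set-if-absent operation so that every SP instance sees the same IDs.

```python
from datetime import datetime, timezone


class SeenAssertionCache:
    """In-memory record of assertion IDs, each kept until its assertion expires."""

    def __init__(self):
        self._expiry_by_id = {}

    def accept(self, assertion_id, not_on_or_after, now=None):
        """Return True the first time an unexpired assertion ID is seen."""
        now = now or datetime.now(timezone.utc)

        # Evict entries whose assertions have expired: that is the TTL.
        self._expiry_by_id = {
            seen_id: expiry
            for seen_id, expiry in self._expiry_by_id.items()
            if expiry > now
        }

        # An expired assertion is rejected outright, so it never needs caching.
        if now >= not_on_or_after:
            return False
        # A cached ID means this assertion was already processed: a replay.
        if assertion_id in self._expiry_by_id:
            return False

        self._expiry_by_id[assertion_id] = not_on_or_after
        return True
```

Because expired assertions are rejected before the lookup, entries can be dropped at expiry without reopening the replay window.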
A third much less robust but much faster to implement countermeasure (which should be implemented regardless) is logic that strictly enforces the validation window for assertions.
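A minimal sketch of that check follows, treating NotBefore as inclusive and NotOnOrAfter as exclusive. The function names are hypothetical, and the optional clock-skew allowance is an assumption that defaults to zero for strict enforcement.

```python
from datetime import datetime, timedelta, timezone


def parse_saml_instant(value: str) -> datetime:
    """Parse a UTC SAML timestamp such as 2020-07-01T12:00:00.000Z."""
    # Older Python versions do not accept a trailing "Z" in fromisoformat.
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def within_validity_window(not_before, not_on_or_after, now=None, skew=timedelta(0)):
    """Return True only if NotBefore <= now < NotOnOrAfter, widened by skew."""
    now = now or datetime.now(timezone.utc)
    starts = parse_saml_instant(not_before) - skew
    ends = parse_saml_instant(not_on_or_after) + skew
    return starts <= now < ends
```

Any skew allowance widens the replay window by the same amount, so it is worth keeping as small as your clocks permit.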
One last thing to note is that most SPs that implement SAML SSO use 3rd party open source SAML libraries for speed to value, yet are not protected against replay attacks because the strongest countermeasures require additional architectural changes.
As with most software engineering, building SAML SSO for enterprises follows the 90-90 rule. There’s a hill to climb to get to an MVP, and an entirely different hill if you’d like to sleep at night. SAML-based authentication is rife with sleeping dragons, of which this guide only introduces a very small subset - but hopefully it has been useful in helping you avoid some of them. If product requirements allow, try to avoid integrating with IdPs using SAML; a more modern, safer, and simpler alternative protocol is OpenID Connect. And if you’re thinking twice about building SAML SSO yourself in-house, then consider using a 3rd party vendor that makes it their business to provide a safe, performant, highly available, and super fast to integrate SSO API... like WorkOS!
Additional references and further reading