Rainbow table attacks: What they are and how to prevent them
What they are, how they work, and why modern password security has moved beyond them.
Rainbow table attacks are one of the most well-known techniques for cracking hashed passwords. While modern defenses have made them far less effective than they once were, understanding how rainbow tables work is essential for anyone building authentication systems or reasoning about password storage security.
This article breaks down the mechanics of rainbow table attacks, traces their history, and explains the specific countermeasures that neutralize them.
What is a hash function?
Before diving into rainbow tables, it helps to understand the problem they were designed to solve.
When a user creates a password, a well-designed system never stores that password in plaintext. Instead, it passes the password through a cryptographic hash function, a one-way mathematical operation that produces a fixed-length output called a digest or hash. For example, the SHA-256 hash of the string password123 is always the same 64-character hexadecimal value, regardless of when or where you compute it.
Hash functions have a few properties that make them useful for password storage:
- They are deterministic, meaning the same input always produces the same output.
- They are fast to compute in one direction but computationally infeasible to reverse.
- Small changes in input produce dramatically different outputs, a property known as the avalanche effect.
The idea is simple: store the hash, discard the password. When a user logs in, hash the password they provide and compare it to the stored hash. If they match, the password is correct. If the database is ever breached, an attacker gets hashes rather than usable passwords.
The problem is that "infeasible to reverse" does not mean "impossible to attack."
The brute-force baseline
The most straightforward attack on a hashed password is brute force. An attacker tries every possible input, hashes each one, and checks whether the result matches the target hash. For short or simple passwords, this works. For long, complex passwords, the search space grows exponentially and brute force becomes impractical.
But brute force is also wasteful. If an attacker wants to crack a second hash, they have to repeat all that computation from scratch. Every attack is independent, and none of the work from previous attempts carries over.
This inefficiency is exactly what rainbow tables were designed to address.
Precomputed hash tables: The naive approach
The simplest optimization over brute force is a lookup table. An attacker precomputes the hash of every possible password (up to some length or drawn from a dictionary) and stores the results in a massive table mapping hashes back to their plaintext inputs. Cracking a password then becomes a simple database lookup rather than a computation.
The problem is storage. A table covering all possible eight-character alphanumeric passwords would require petabytes of disk space. The computational cost shifts from CPU time to storage, and storage becomes the bottleneck.
Rainbow tables are a clever compromise between these two extremes.
How rainbow tables actually work
Rainbow tables were introduced by Philippe Oechslin in 2003 as a refinement of an earlier technique by Martin Hellman from the 1980s. The core insight is that you can trade off between computation time and storage space by using a chain-based structure.
Here is how the process works.
Building the table
The attacker starts with a set of initial plaintext values. For each starting value, they build a chain by alternating between two operations: hashing and reduction.
The hash step is straightforward: apply the hash function to produce a digest. The reduction step is less obvious. A reduction function takes a hash value and maps it back into the space of possible plaintexts. It is not the inverse of the hash function (no such inverse exists). Instead, it is an arbitrary but deterministic function that converts a hash into a string that could plausibly be a password.
The attacker alternates hashing and reducing for a fixed number of steps, say a few thousand, to create a chain. Only the first and last values in each chain are stored. The intermediate values are discarded. This is where the space savings come from: instead of storing every hash-to-plaintext mapping, the table stores only the endpoints of long chains.
The role of multiple reduction functions
The "rainbow" in rainbow tables comes from the use of a different reduction function at each position in the chain. Earlier chain-based approaches, known as Hellman tables, used a single reduction function. This caused a problem called chain merging: if two chains ever produced the same intermediate value, they would follow identical paths from that point forward, wasting storage on duplicate chains.
By using a position-dependent reduction function (conceptually, a different "color" at each step in the chain, hence the rainbow metaphor), the probability of chain merges drops dramatically. Two chains can share an intermediate value without merging, as long as the collision occurs at different positions.
Looking up a hash
To crack a target hash, the attacker works backward through the table. They apply the final reduction function to the hash and check whether the result appears as an endpoint in the table. If not, they apply the second-to-last reduction function followed by a hash and the final reduction, checking after each step. They continue extending backward through the chain positions.
If they find a match against a stored endpoint, they know the target hash falls somewhere in that chain. They then regenerate the chain from its stored starting point, stepping forward until they find the plaintext that produces the target hash.
This lookup process requires computation proportional to the chain length, but it is still far faster than a full brute-force search. And because the table can be reused across any number of target hashes, the upfront cost of building the table is amortized across many attacks.
The time-memory trade-off
Rainbow tables sit on a spectrum between two extremes. On one end is brute force: no storage, maximum computation per attack. On the other end is a full lookup table: massive storage, negligible computation per attack. Rainbow tables occupy a middle ground, using moderate storage with moderate computation per lookup.
The attacker can tune this trade-off by adjusting the chain length. Longer chains mean fewer chains need to be stored, reducing disk usage at the cost of more computation per lookup. Shorter chains mean faster lookups but a larger table. In practice, popular rainbow tables for common hash algorithms and character sets fit on a few terabytes of storage, well within reach for a motivated attacker.
Why salting defeats rainbow tables
The definitive defense against rainbow tables is salting. A salt is a random value generated uniquely for each password. Before hashing, the salt is concatenated with (or otherwise mixed into) the password. The resulting hash depends on both the password and the salt. The salt is stored alongside the hash, typically in the same database row.
Salting works because it makes precomputation infeasible. A rainbow table is built against a specific hash function operating on bare passwords. When each password has a unique salt, the attacker would need a separate rainbow table for every possible salt value. If the salt is, say, 16 bytes long, the number of possible salts is astronomically large, far too many to precompute tables for.
Even if two users choose the identical password, their different salts produce different hashes. This means the attacker cannot even identify duplicate passwords by looking at the hash database, a useful property beyond just rainbow table resistance.
It is worth noting that the salt does not need to be secret. Its purpose is not to add a secret to the hash, but to ensure that each hash is unique even for identical inputs. Knowing the salt does not help the attacker use a rainbow table, because they would still need a table specific to that salt.
Modern password hashing: Beyond simple salting
Salting alone is necessary but not sufficient for modern password security. Even with salts, an attacker who obtains a database of hashed passwords can still attempt brute-force or dictionary attacks against individual hashes. Modern password hashing algorithms address this with two additional properties: computational cost and memory hardness.
Algorithms like bcrypt, scrypt, and Argon2 are deliberately slow. Where SHA-256 can compute billions of hashes per second on modern hardware, bcrypt with a reasonable work factor might manage only a few thousand. This slowness is intentional: it has negligible impact on a legitimate login (a fraction of a second) but makes large-scale password cracking prohibitively expensive.
Argon2, the winner of the Password Hashing Competition in 2015, goes further by also requiring large amounts of memory during computation. This makes it resistant not only to CPU-based attacks but also to GPU-based and ASIC-based attacks, where the cost of memory is a more significant constraint than raw compute throughput.
Together, salting and slow hashing make rainbow tables entirely impractical. The salt eliminates precomputation, and the slow hash function makes per-target brute force expensive.
Are rainbow tables still relevant?
In a well-designed system, no. Any modern authentication implementation that uses bcrypt, scrypt, or Argon2 with proper salting is immune to rainbow table attacks.
However, rainbow tables remain relevant in a few contexts.
Legacy systems that use unsalted MD5 or SHA-1 hashes are still common, particularly in older applications that have not been updated. Precomputed rainbow tables for these algorithms are freely available online and can crack weak passwords in seconds.
Rainbow tables also appear in contexts beyond password storage. They have been used to attack Wi-Fi encryption (specifically WPA/WPA2 with predictable SSIDs), NTLM authentication hashes in Windows environments, and other systems where unsalted or weakly salted hashes are present.
Finally, rainbow tables remain a valuable teaching tool. They illustrate the time-memory trade-off elegantly and provide a concrete example of why naive hashing is insufficient for password security.
Practical recommendations
If you are building or maintaining an authentication system, the defenses against rainbow tables are well-established and should be treated as non-negotiable.
- Use a modern password hashing algorithm. Argon2id is the current best practice. bcrypt remains a solid choice if Argon2 is not available in your environment. Avoid MD5, SHA-1, and SHA-256 for password hashing, even with salts, because they are far too fast.
- Generate a unique, cryptographically random salt for every password. Most modern hashing libraries handle this automatically: bcrypt and Argon2 both incorporate salting into their standard APIs.
- Tune the work factor appropriately. The hash should take at least 100 milliseconds on your production hardware. This is imperceptible to users but devastating to attackers.
- Never roll your own password hashing scheme. The interaction between hashing, salting, and work factors is well-understood, and mature libraries exist for every major language and framework. Use them.
Conclusion
Rainbow table attacks represent an elegant application of the time-memory trade-off to the problem of password cracking. For a period in the early 2000s, they were devastatingly effective against the unsalted hash storage that was common at the time.
The security community responded with salting and slow hashing, which together render rainbow tables obsolete for any properly implemented system. But the lessons they teach about precomputation, trade-offs, and the inadequacy of naive security measures remain as relevant as ever. Understanding rainbow tables is not just about defending against a specific attack. It is about developing the intuition for why password security requires deliberate, layered engineering.