A de-identification technique that applies a function to a value to produce a fixed-length output known as a hash. The function is one-way, so the hash cannot be converted back to the original value. Similar inputs produce very different outputs, but the same input always produces the same hash.
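The properties described in this definition can be seen in a minimal Python sketch. It assumes SHA-256 from the standard `hashlib` module as the hash function; the entry itself does not name a specific function, and the helper `hash_value` and the sample identifiers are hypothetical.

```python
import hashlib


def hash_value(value: str) -> str:
    """Return the SHA-256 hash of a string as 64 hexadecimal characters."""
    # SHA-256 is an assumed choice; the glossary entry names no algorithm.
    return hashlib.sha256(value.encode("utf-8")).hexdigest()


# Hypothetical identifiers that differ by a single character.
h1 = hash_value("account-1001")
h2 = hash_value("account-1002")
h3 = hash_value("account-1001")

print(len(h1), len(hash_value("a" * 1000)))  # output length is the same for any input
print(h1 == h3)  # the same input gives the same hash
print(h1 == h2)  # similar inputs give different hashes
```

Because the same input always maps to the same hash, hashed identifiers can still be used to link records that refer to the same value.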

Although it is not feasible to go backward (from hash to original value), it is easy to go forward (from original value to hash). This makes hashing appealing as a fast solution for de-identifying direct identifiers. At the same time, it makes hashing vulnerable to so-called “dictionary” and “rainbow table” attacks: where the input domain is small and fixed in size (for example, an account number), an attacker can compute the hash of every possible value and thus uncover the correspondence between original values and hashes.
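To illustrate why a small input domain is a problem, the sketch below enumerates every possible value, hashes each one, and looks up a given hash in the resulting table. It is an illustration under stated assumptions, not a description of any particular attack tool: SHA-256, the four-digit account number format, and the names `hash_value` and `build_dictionary` are all hypothetical.

```python
import hashlib


def hash_value(value: str) -> str:
    """Unsalted SHA-256 hash of a string (assumed hash function)."""
    return hashlib.sha256(value.encode("utf-8")).hexdigest()


def build_dictionary(candidates):
    """Map the hash of every candidate value back to the value itself."""
    return {hash_value(candidate): candidate for candidate in candidates}


# Hypothetical domain: four-digit account numbers, "0000" through "9999".
candidates = [f"{n:04d}" for n in range(10_000)]
lookup = build_dictionary(candidates)

# A hash found in a de-identified data set (computed here for the demo).
observed_hash = hash_value("4821")

# The one-way function is never inverted; the table is simply searched.
recovered = lookup.get(observed_hash)
print(recovered)
```

The table has only 10,000 entries, so building it takes a fraction of a second on ordinary hardware.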

Therefore, it is recommended to use hashing together with a salt that is stored securely. A salt is a random string that is added to the original value before hashing. Because the salt makes the hashed input longer, there are many more possible inputs to try, which offers some protection against dictionary attacks.
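A minimal sketch of salted hashing, following the description above, might look like this. The choice of SHA-256, the 16-byte salt length, prepending the salt to the value, and the function names `generate_salt` and `salted_hash` are assumptions made for illustration.

```python
import hashlib
import secrets


def generate_salt(num_bytes: int = 16) -> bytes:
    """Create a random salt; 16 bytes is an assumed length."""
    return secrets.token_bytes(num_bytes)


def salted_hash(value: str, salt: bytes) -> str:
    """Hash the salt followed by the value (prepending is an assumed convention)."""
    return hashlib.sha256(salt + value.encode("utf-8")).hexdigest()


# The salt must be stored securely and reused so that the same value
# keeps producing the same hash across the data set.
salt = generate_salt()

unsalted = hashlib.sha256("4821".encode("utf-8")).hexdigest()
salted = salted_hash("4821", salt)

print(salted == salted_hash("4821", salt))  # repeatable with the same salt
print(salted == unsalted)                   # differs from the unsalted hash
```

Without knowledge of the salt, a table built from the bare account numbers no longer matches the stored hashes. If the salt is disclosed, that protection is lost, which is why secure storage matters.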
