A de-identification technique that adds random noise to field values or query results. Perturbation protects data against privacy attacks that rely on knowledge of specific values. It can be used to de-identify quasi-identifiers such as numeric values, dates, and timestamps.

Perturbation-based approaches to de-identifying datasets should keep the noise magnitude small enough to preserve the valuable insights in the dataset. For example, monetary transactions could be perturbed by a whole-unit amount drawn at random from a fixed range: an input value of $182 perturbed by up to +/- $5 yields an output value in the range $177 to $187.
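
The monetary example can be illustrated with a minimal Python sketch. The function name `perturb_amount` and the parameter `max_delta` are hypothetical and chosen only for illustration. The sketch assumes whole-unit amounts and noise drawn uniformly from the range. A real deployment would need to choose the noise distribution and range to suit its own privacy and utility requirements.

```python
import random


def perturb_amount(value, max_delta=5, rng=None):
    """Shift a whole-unit amount by a random integer offset.

    The offset is drawn uniformly from [-max_delta, +max_delta], so the
    result always lies within max_delta units of the input.
    """
    if rng is None:
        # Assumption: an OS-backed random source is used by default so
        # that the noise is harder to predict than a seeded generator.
        rng = random.SystemRandom()
    offset = rng.randint(-max_delta, max_delta)  # both endpoints included
    return value + offset


if __name__ == "__main__":
    # An input of 182 yields some value between 177 and 187 inclusive.
    print(perturb_amount(182))
```

The same idea carries over to dates and timestamps by adding a random number of days or seconds instead of currency units.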
