A property of a de-identified dataset: the dataset is k-anonymous if every combination of indirect-identifier values that appears in it is shared by at least k records. k-anonymity is often used to describe de-identified datasets in which indirect identifiers have been obfuscated to make them less specific.

k-anonymity is a useful property in a de-identification process because it shows that, based on the indirect identifiers alone, each person in the dataset cannot be distinguished from at least k-1 other individuals whose information also appears in the dataset.

For example, suppose a de-identified dataset contains two indirect identifiers for each individual: year of birth and two-character zip code. Such a dataset is k-anonymous with k=5 if every combination of year of birth and two-character zip code that appears in it is shared by at least five individuals.
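
The check described in this example can be sketched in a few lines of Python. This is a minimal illustration rather than a production implementation: the function name `k_anonymity`, the field names `year_of_birth` and `zip2`, and the sample records are all hypothetical and chosen only to mirror the example above.

```python
from collections import Counter


def k_anonymity(records, indirect_identifiers):
    """Return the size of the smallest group of records that share the same
    combination of indirect-identifier values (0 for an empty dataset)."""
    groups = Counter(
        tuple(record[name] for name in indirect_identifiers)
        for record in records
    )
    return min(groups.values(), default=0)


# Hypothetical records: five people share (1980, "10"), six share (1975, "94").
records = (
    [{"year_of_birth": 1980, "zip2": "10"}] * 5
    + [{"year_of_birth": 1975, "zip2": "94"}] * 6
)

k = k_anonymity(records, ["year_of_birth", "zip2"])
print(k)  # 5: the smallest group has five records, so the dataset is 5-anonymous
```

The dataset's k is the size of its smallest group, so a single record with a unique combination of year of birth and zip prefix would reduce k to 1.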


