Here you can find a short video interview with Charlie Cabot, Research Lead at Privitar, in which he talks about some of the risks of Differencing Attacks.
[0:10] One type of privacy attack is called a differencing attack, which targets aggregate statistics, like summary statistics, histograms and charts. The differencing attack works by singling out an individual from multiple aggregate statistics. As a real-world example, there is a US site that publishes cancer statistics, broken down by race, gender, county and year. In 2011, in one county, there was only one woman diagnosed with cancer whose race was not white. The system prevents querying about this woman directly, but the statistics leave her vulnerable to a differencing attack.
By querying about all women in the county and their types of cancer, and then about all white women, we learn the one non-white woman's type of cancer from the difference between the two results. This is sensitive information that the system did not intend to provide.
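The attack described above can be sketched in a few lines of Python. This is a toy illustration only: the records, the `query` function and the `MIN_GROUP_SIZE` threshold are all hypothetical and do not reflect the real site's data or interface. It assumes the system's only protection is refusing queries that match too few people.

```python
from collections import Counter

# Hypothetical toy records: (gender, race, cancer_type). Not real data.
records = [
    ("F", "white", "breast"),
    ("F", "white", "lung"),
    ("F", "white", "breast"),
    ("F", "non-white", "thyroid"),
    ("M", "white", "lung"),
]

# Assumed protection: refuse any query that matches fewer than 2 people.
MIN_GROUP_SIZE = 2


def query(gender=None, race=None):
    """Return counts per cancer type for the matching group."""
    matches = [
        r for r in records
        if (gender is None or r[0] == gender)
        and (race is None or r[1] == race)
    ]
    if len(matches) < MIN_GROUP_SIZE:
        raise PermissionError("group too small to report")
    return Counter(r[2] for r in matches)


# Both queries are allowed, because each group is large enough.
all_women = query(gender="F")
white_women = query(gender="F", race="white")

# Counter subtraction keeps only the positive differences, which here
# is exactly the one record that the direct query would have refused.
leaked = all_women - white_women
print(dict(leaked))
```

Asking `query(gender="F", race="non-white")` directly raises `PermissionError`, yet the two permitted queries reveal the same answer once they are subtracted.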
[1:00] How can you prevent differencing attacks?
To prevent differencing attacks you can add noise to the statistics, preventing the attacker from learning anything meaningful about an individual. The differencing attack becomes so imprecise that it's useless for the attacker.
Through differential privacy, the noise is calibrated to stop all differencing attacks, even ones more sophisticated than the one I described.
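As a rough sketch of what calibrated noise can look like, the example below adds Laplace noise to a count, which is one standard way of achieving differential privacy for counting queries. It is not necessarily the mechanism used by any particular product. The names `laplace_noise` and `noisy_count` and the value of `epsilon` are assumptions made for illustration, and the sketch ignores how a privacy budget is tracked across many queries.

```python
import random


def laplace_noise(scale, rng):
    # The difference of two independent exponential samples with mean
    # `scale` follows a Laplace distribution centred on 0 with that scale.
    return rng.expovariate(1 / scale) - rng.expovariate(1 / scale)


def noisy_count(true_count, epsilon, rng):
    # Adding or removing one person changes a count by at most 1,
    # so the sensitivity of a counting query is 1.
    sensitivity = 1.0
    return true_count + laplace_noise(sensitivity / epsilon, rng)


rng = random.Random(0)
epsilon = 0.5  # hypothetical privacy parameter; smaller means more noise

# True counts for one cancer type: 1 among all women, 0 among white women.
all_women = noisy_count(1, epsilon, rng)
white_women = noisy_count(0, epsilon, rng)

# The attacker's difference now mixes the true gap of 1 with noise of a
# comparable or larger size, so it says little about the individual.
print(all_women - white_women)

# A large group is barely affected in relative terms.
print(noisy_count(10_000, epsilon, rng))
```

The noise has the same typical size whether the true count is 1 or 10,000, which is why the individual's contribution is hidden while statistics about large groups remain useful.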
What's great is that the information about groups is still preserved: cancer researchers can still research trends in cancer rates across regions and over time. We preserve the information about groups but protect the sensitive information of individuals, and that's the unique benefit of privacy engineering.