Jan 29, 2019
The UK Office for National Statistics (ONS) recently published a methodology report on behalf of the Government Statistical Service (GSS). Members of Privitar’s research and policy teams were invited to contribute a chapter on an area of privacy engineering we are particularly excited about: differential privacy 1.
Differential privacy provides a guarantee that no one can learn anything significant about any individual from their inclusion in a dataset. This helps private companies, as well as government agencies, to share and monetise their data while protecting the privacy of the people in their datasets. This post outlines the key facts about differential privacy, why you need it, and what it can offer you.
Data about groups can reveal more than intended about specific individuals. This issue is increasingly serious in the modern world, where adversaries are in possession of powerful computers, sophisticated techniques, and large amounts of auxiliary data. In addition, regulatory demands on data holders, and public expectations of them, are rising. But what exactly are the risks private sector organisations should be trying to defend against?
Take a health insurer that covers the employees of multiple companies and wishes to provide its clients with data about trends in certain health conditions. One approach is to prevent access to the dataset itself and release only aggregate statistics such as sums, counts, and averages. The insurer might release to its clients the prevalence of different health issues amongst staff, broken down by demographic group, rather than the complete dataset.
Aggregates may at first glance appear impossible to attack, but they may be vulnerable to differencing or reconstruction attacks.
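To make the differencing attack concrete, here is a minimal sketch in Python. The records, names, and the `count_with_condition` helper are all hypothetical and invented for illustration. It shows how two exact counts, each harmless on its own, can be subtracted to isolate one person's value.

```python
# Hypothetical records: (employee name, department, has_condition).
records = [
    ("Alice", "Finance", True),
    ("Bob", "Finance", False),
    ("Carol", "Finance", False),
    ("Dan", "Finance", True),
]


def count_with_condition(rows):
    """Exact (un-noised) count of people who have the condition."""
    return sum(1 for _, _, has_condition in rows if has_condition)


# Query 1: everyone in Finance.
all_finance = count_with_condition(records)

# Query 2: everyone in Finance except Alice. In practice the attacker would
# use an innocuous-looking filter that happens to exclude only her.
without_alice = count_with_condition([r for r in records if r[0] != "Alice"])

# Subtracting the two aggregates isolates Alice's individual value.
alice_has_condition = (all_finance - without_alice) == 1
```

Neither query names Alice's diagnosis, yet together they reveal it. A reconstruction attack generalises this idea by combining many aggregates to recover many individuals' values at once.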
Differential privacy was developed to protect against these attacks, as well as all other attacks, whether we know about them yet or not. No other existing privacy approach is capable of doing this. It treats the cause of attacks (information leaked about individuals) rather than the attacks themselves, meaning your defence won’t be broken by the discovery of a new attack. Using this robust approach, otherwise inaccessible data can be safely analysed.
Differential privacy can be applied to anything from aggregate statistics to complex machine learning tasks.
When data sharing is taking place within more controlled environments, such as under strict contracts with only a few individuals, the risk of privacy attacks is lower. Without a controlled data environment, businesses must expect an adversary with potentially unlimited background information, state of the art techniques, and plenty of resources. Without differential privacy, it can be very hard to ensure that attacks on aggregate statistics are not possible in these situations.
An insurer using differential privacy to release aggregate statistics to clients about the health of their staff could give a provable mathematical guarantee that this reveals a limited amount about any individual employee. It is therefore safe for the staff to ‘opt in’, as they can be sure that doing so will not reveal enough information to determine their individual diagnosis.
Differential privacy gives you direct control over the balance between privacy and utility of data analysis. It adds a precise amount of probabilistic noise to your statistics, creating controlled uncertainty, and allows you to tune the level of this noise. More noise improves privacy, as less can be learned about individuals, but will reduce accuracy. A balance between accuracy and privacy must always be found, but no other method makes this so directly accessible.
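The tuning described above can be sketched with the Laplace mechanism, one standard way of achieving differential privacy for counts. This post does not name a specific mechanism, so treat the choice as an assumption. The parameter `epsilon`, the helper names, and the count of 120 are likewise illustrative. A smaller `epsilon` means more noise and stronger privacy.

```python
import random


def laplace_noise(scale, rng):
    """Sample Laplace(0, scale) as the difference of two exponential draws."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def noisy_count(true_count, epsilon, rng):
    """Release a count with Laplace noise of scale sensitivity / epsilon.

    Adding or removing one person changes a count by at most 1,
    so the sensitivity of a count is 1.
    """
    sensitivity = 1.0
    return true_count + laplace_noise(sensitivity / epsilon, rng)


rng = random.Random(0)  # fixed seed so the sketch is repeatable
true_count = 120        # hypothetical number of staff with a condition

# Average absolute error over many releases, for three privacy levels.
mean_abs_error = {}
for epsilon in (0.1, 1.0, 10.0):
    errors = [
        abs(noisy_count(true_count, epsilon, rng) - true_count)
        for _ in range(10_000)
    ]
    mean_abs_error[epsilon] = sum(errors) / len(errors)
```

The expected absolute error of Laplace noise equals its scale, so the average error shrinks roughly tenfold each time `epsilon` grows tenfold. That is the privacy and accuracy dial in its simplest form.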
Differential privacy protects multiple analyses of the same data by adding more noise to account for the accumulating risk. Each new analysis will be less accurate than the last. This is sometimes considered a negative point of differential privacy itself – a classic case of ‘shooting the messenger’. Privacy risk accumulates with each data release no matter what you do to protect the data, but only differential privacy allows you to explicitly detect and respond to accumulated risk. It’s an uncomfortable truth certainly, but one you should be aware of.
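One way to picture accumulating risk is as a fixed privacy budget that each release draws down. The sketch below assumes basic sequential composition, under which the `epsilon` values of successive releases simply add up. The `PrivacyBudget` class and its numbers are hypothetical, and real products may use tighter accounting.

```python
class PrivacyBudget:
    """Track cumulative privacy loss under basic sequential composition."""

    def __init__(self, total_epsilon):
        self.total = total_epsilon
        self.spent = 0.0

    def spend(self, epsilon):
        """Reserve epsilon for one release of a sensitivity-1 count.

        Returns the Laplace noise scale (1 / epsilon) that release must use,
        or raises RuntimeError if the release would exceed the total budget.
        """
        if epsilon <= 0:
            raise ValueError("epsilon must be positive")
        if self.spent + epsilon > self.total:
            raise RuntimeError("privacy budget exhausted")
        self.spent += epsilon
        return 1.0 / epsilon


budget = PrivacyBudget(total_epsilon=1.0)

# Each analysis gets half the epsilon of the previous one, so each is noisier.
noise_scales = [budget.spend(0.5), budget.spend(0.25), budget.spend(0.125)]

# A further release that would exceed the total budget is refused.
try:
    budget.spend(0.25)
    refused = False
except RuntimeError:
    refused = True
```

Here the noise scale doubles with each analysis, and the accountant refuses the fourth request outright. Without this kind of explicit accounting the accumulated risk would still be there, only unmeasured.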
Calibrating noise is part technical and part pragmatic. The aim is to prevent attacks without overly affecting insights drawn from the data. Be very wary of any vendor who offers you a differential privacy product without suggesting how to calibrate it. Be sceptical, too, of any differential privacy product which does not factor in the risk of multiple analyses. For further reading on this topic see our blog post.
Differential privacy does not depend on hiding any technical details. This means you can enjoy the benefits of being transparent without additional risk: downstream users can know exactly how noise was added and account for it, preventing false conclusions.
Consider differential privacy if you:
To find out more about how differential privacy can help organisations preserve the privacy and utility of data, read our executive summary of the GSS report.
1 This report was co-authored by Professor Kobbi Nissim of Georgetown who, as well as being one of Privitar’s academic advisors, is one of the creators of differential privacy. Along with his co-authors, Nissim was awarded the 2016 Test of Time award and the 2017 Gödel Prize, for his work on differential privacy.