Everyone has information that needs to be kept private. Far too often, this data is compromised due to the mistakes of others. At In:Confidence 2019, we heard about the top three privacy fails, what caused them, the harmful results they left behind – and how you can avoid similar situations in your own organisations.
When it comes to personal privacy, we’re sometimes reminded of the philosophy of former Google CEO Eric Schmidt: if you have something you don’t want anyone to know, maybe you shouldn’t be doing it in the first place.
This simply isn’t true. Everyone has information they want – and need – to keep private. Our bank statements, our credit rating, our medical histories, our text message histories – all of these are things that we might want to keep from even our closest friends. As Shoshana Zuboff closed In:Confidence 2019, her final statement was left ringing across the venue. “Anyone who has nothing to hide is nothing.”
It’s the reason we create products that help prevent privacy harm. But as more data is gathered from all of us – whether it’s from our shopping habits, hospital visits, or even our interests on Facebook – preventing privacy harm becomes a bigger challenge.
A problem in three parts
Privacy harm is easiest to understand in three parts: re-identification, unintended disclosure, and the harmful result.
1. Re-identification
When data is released anonymously, it can often be re-identified by analysing it alongside publicly available information from sources such as voter registration records or the Land Registry.
2. Unintended disclosure
Some datasets can reveal information about you that’s not explicitly in the data. For example, if a transaction record shows that you spent exactly £10.32 in Boots, and only one product there sells at that price, someone could deduce that you bought a pregnancy test.
3. The harmful result
When data is re-identified or information is worked out from a set of data, it can lead to incredibly harmful results – including theft of personal information, fraud, and even blackmail.
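To make unintended disclosure concrete, here is a minimal Python sketch of the £10.32 example. The price list, product names and function names are invented for illustration, and the sketch assumes a single-item purchase; a real inference would also have to rule out combinations of items.

```python
from collections import defaultdict

# Hypothetical price list (product -> price in pence).
# Products and prices are invented for illustration only.
PRICE_LIST = {
    "pregnancy test": 1032,
    "shampoo": 349,
    "toothpaste": 250,
    "vitamins": 599,
}


def products_by_price(price_list):
    """Group product names by price so that unique prices stand out."""
    groups = defaultdict(list)
    for product, pence in price_list.items():
        groups[pence].append(product)
    return groups


def infer_purchase(amount_pence, price_list):
    """Return the one product that explains the amount, or None if ambiguous.

    Assumption: the basket contains a single item.
    """
    candidates = products_by_price(price_list).get(amount_pence, [])
    return candidates[0] if len(candidates) == 1 else None


# A transaction of exactly £10.32 matches only one product in this list.
print(infer_purchase(1032, PRICE_LIST))
```

The transaction record never names the product, but a price that maps to exactly one item gives it away. Rounding or bucketing amounts before release is one way to blunt this kind of inference.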
This process happens on a major scale far too often, and its results can be catastrophic and irreversible. We’ve identified the top three privacy fails to see what went wrong, the data they revealed, and the lasting damage they caused.
Fail #1: New York Taxis
In 2014, following a Freedom of Information request from Chris Whong, NYC’s Taxi and Limousine Commission released records of 173 million trips made by New York taxis.
Every trip recorded contained a huge amount of specific data, including vendor IDs, pickup and drop-off dates and times, trip distance, and even coordinates for the pickup and drop-off locations.
Just by looking at the data, people were quickly able to re-identify passengers by their drop-off locations, discover their home addresses, and find passengers’ profiles on Facebook.
Unsurprisingly, this led to a huge violation of privacy for many people. Celebrities were identified and exposed to the risk of being stalked across New York City. Men leaving strip clubs in the early hours of the morning were identified and put at risk of being blackmailed – and of having some very awkward conversations with their families, friends and employers.
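As a rough illustration of why timestamps and coordinates are so revealing, the Python sketch below filters invented trip records for late-night pickups near one location and counts drop-off points that recur. The records, the venue coordinates, the tolerance and the function names are all assumptions made for this example, not part of the released dataset or of any analysis performed on it.

```python
from collections import Counter
from datetime import datetime
from math import hypot

# Invented trip records: pickup time, pickup and drop-off (lat, lon).
TRIPS = [
    {"pickup_time": datetime(2013, 5, 4, 2, 15),
     "pickup": (40.7500, -73.9900), "dropoff": (40.6800, -73.9500)},
    {"pickup_time": datetime(2013, 5, 11, 1, 40),
     "pickup": (40.7501, -73.9901), "dropoff": (40.6800, -73.9500)},
    {"pickup_time": datetime(2013, 5, 11, 14, 0),   # daytime trip
     "pickup": (40.7500, -73.9900), "dropoff": (40.7800, -73.9600)},
    {"pickup_time": datetime(2013, 5, 12, 3, 5),    # different pickup point
     "pickup": (40.7000, -74.0100), "dropoff": (40.7300, -73.9800)},
]

VENUE = (40.7500, -73.9900)  # hypothetical venue location


def near(a, b, tolerance=0.0005):
    """Crude proximity check in degrees; enough for a sketch."""
    return hypot(a[0] - b[0], a[1] - b[1]) <= tolerance


def repeated_dropoffs(trips, venue, min_count=2):
    """Count drop-off points for trips starting near the venue between
    midnight and 5am, keeping those seen at least min_count times."""
    counts = Counter(
        (round(t["dropoff"][0], 4), round(t["dropoff"][1], 4))
        for t in trips
        if near(t["pickup"], venue) and 0 <= t["pickup_time"].hour < 5
    )
    return {place: n for place, n in counts.items() if n >= min_count}


print(repeated_dropoffs(TRIPS, VENUE))
```

A drop-off point that keeps recurring is likely to be someone’s home, and from there an address lookup is enough to attach a name to the trips.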
Fail #2: Massachusetts hospitals
A similar case of re-identification compromised the private information of thousands of hospital patients in Massachusetts in 1997.
After anonymised hospital visit data was released that included patients’ zip code, gender, and date of birth, it quickly became clear that these three pieces of information could reveal the identity of almost 90% of the state’s population.
Latanya Sweeney, then a graduate student at MIT, managed to find the Massachusetts governor’s personal health records by analysing the data alongside an electoral roll database she bought for $20. Using the same process, people could discover intensely private information such as the diagnoses, procedures, and prescribed medication of thousands of patients in Massachusetts.
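The linking step can be sketched in a few lines of Python. Both datasets below are invented, as are the field and function names; the sketch only assumes that the "anonymised" records and the electoral roll share zip code, gender and date of birth, as described above.

```python
from collections import defaultdict

# Invented "anonymised" hospital records: no names, but three
# identifying fields remain alongside the sensitive one.
HOSPITAL = [
    {"zip": "02138", "gender": "M", "dob": "1950-01-02", "diagnosis": "condition A"},
    {"zip": "02139", "gender": "F", "dob": "1962-03-14", "diagnosis": "condition B"},
]

# Invented electoral roll: names plus the same three fields.
VOTERS = [
    {"name": "Person One", "zip": "02138", "gender": "M", "dob": "1950-01-02"},
    {"name": "Person Two", "zip": "02139", "gender": "F", "dob": "1962-03-14"},
    {"name": "Person Three", "zip": "02139", "gender": "F", "dob": "1962-03-14"},
]


def key(record):
    """The combination of fields shared by both datasets."""
    return (record["zip"], record["gender"], record["dob"])


def link(hospital, voters):
    """Attach a name to each hospital record whose key matches exactly one voter."""
    by_key = defaultdict(list)
    for voter in voters:
        by_key[key(voter)].append(voter["name"])

    matches = []
    for record in hospital:
        names = by_key.get(key(record), [])
        if len(names) == 1:  # a unique match means the record is re-identified
            matches.append((names[0], record["diagnosis"]))
    return matches


print(link(HOSPITAL, VOTERS))
```

In this toy data, the first record is re-identified because only one voter shares its three fields, while the second stays ambiguous because two voters do. The more people who share each combination, the weaker the link, which is the reason for coarsening fields such as date of birth before release.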
Fail #3: Facebook and Cambridge Analytica
One of the biggest privacy fails of the past decade, the Cambridge Analytica scandal dealt a huge blow to Facebook’s reputation and sparked public outrage over the use of personal data.
Cambridge Analytica was able to identify political and psychological traits in users from their profiles, just by looking at their likes and interests supplied by Facebook.
Facebook users never agreed to share this information, and they weren’t even aware it was happening. Despite their lack of consent, Cambridge Analytica used Facebook likes to automatically and accurately predict users’ sexual orientation, happiness, religious and political views, and more.
This information was then used to target specific individuals through Facebook during the 2016 US presidential election in an attempt to influence their vote. With a huge platform and population to work with, the process was performed on a massive scale.
Time for accountability
In retrospect, these three fails seem like they could easily have been avoided with more accountability for consumer data. But fails like these happen all the time – and it’s time to put a stop to them. Are you confident your organisation knows what it’s doing with its data?