Feb 14, 2018
With synthetic data often falling short of test requirements, there remains a persistent notion that only raw, sensitive data will do for fast and accurate Test and Dev. We make the case: you don’t need production data at all.
Test and Dev environments, by necessity, don’t have the same security controls as production environments. When you move your sensitive data into one, you open it up to a much wider group of users, who may or may not be employed by your business. The result is a massively increased risk of a breach, whether malicious or inadvertent, and of a hefty fine from a regulator.
There’s a misconception that raw production data is the only data fit for Test and Dev. The argument goes as follows:
You could use a synthetic data generator to create completely artificial datasets, using the same schema as your production data, but it won’t replicate your production data’s nuanced structure, complexity and referential integrity.
Use that synthetic data for Test and Dev, and you’ll find:
Given issue resolution and new developments are the two reasons most companies turn to their Test and Dev environment, the case against synthetic data is clear.
Another approach is to take a snapshot of your production data and mask it, anonymising the information that could be used to instantly identify an individual within the overall dataset.
But for most organisations, masking data remains unsophisticated, lengthy, manual work, mostly due to the one-off permission processes your team has to go through. What’s more, it rarely goes far enough.
Secondary identifiers (which, when taken together, still enable an individual to be identified) are almost always left untouched, because masking them requires much more sophisticated de-identification techniques, from introducing noise by perturbation to grouping values by generalisation.
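As a rough illustration of the two techniques just named, here is a minimal Python sketch. It assumes a hypothetical customer record with `age`, `postcode` and `balance` fields; the function names, the ten-year age band and the noise range are all illustrative choices, not a description of any particular product.

```python
import random


def generalise_age(age: int, band: int = 10) -> str:
    """Generalisation: replace an exact age with the band it falls in."""
    low = (age // band) * band
    return f"{low}-{low + band - 1}"


def generalise_postcode(postcode: str) -> str:
    """Generalisation: keep only the first part of a UK-style postcode."""
    return postcode.strip().split(" ")[0]


def perturb(value: float, spread: float, rng: random.Random) -> float:
    """Perturbation: add uniform noise in [-spread, +spread] to a number."""
    return round(value + rng.uniform(-spread, spread), 2)


def deidentify(record: dict, rng: random.Random) -> dict:
    """Apply the techniques above to the secondary identifiers of one record."""
    return {
        "age_band": generalise_age(record["age"]),
        "postcode_area": generalise_postcode(record["postcode"]),
        "balance": perturb(record["balance"], spread=50.0, rng=rng),
    }


# Hypothetical record, invented for this example.
record = {"age": 37, "postcode": "SW1A 1AA", "balance": 1250.00}
masked = deidentify(record, random.Random(42))
print(masked)
```

The age and postcode become coarser groups shared by many people, while the balance stays realistic but no longer matches the true value exactly. How wide the bands and the noise should be depends on the dataset and the risk you are willing to accept.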
Businesses have simply crossed their fingers and authorised the use of their most sensitive data. Today, however, thanks to growing regulatory pressure and a better understanding of privacy risks, these special dispensations are coming under ever closer scrutiny.
Simply put, there has to be another option.
It’s true that synthetic and manually masked data are no good, but that doesn’t mean you need to reach for raw data straight away.
It’s not the rawness of production data that makes it fit for Test and Dev purposes. It’s two characteristics that synthetic and manually masked data can’t match:
The first is richness: those structures and linkages within data sets, the referential integrity needed if you’re going to replicate bugs and issues effectively and have confidence in your test results.
To get more granular and to be truly valuable, Test and Dev data must:
(And that’s just for starters. See our checklist for a detailed rundown of what makes Test and Dev data useful and safe.)
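To make the point about linkages and referential integrity concrete, the sketch below shows one common way to hide identifiers while keeping tables joinable: deterministic, keyed tokenisation. This is an assumed technique chosen for illustration, not necessarily the method discussed in this article. The table contents, the `tokenise` helper and the key are all hypothetical.

```python
import hashlib
import hmac

# Hypothetical key for the example only; a real key would be stored securely.
SECRET_KEY = b"example-key-not-for-real-use"


def tokenise(value: str, key: bytes = SECRET_KEY) -> str:
    """Map an identifier to a token; the same input always gives the same token."""
    digest = hmac.new(key, value.encode("utf-8"), hashlib.sha256).hexdigest()
    return "tok_" + digest[:16]


# Two invented tables linked by customer_id.
customers = [
    {"customer_id": "C1001", "name": "Ada"},
    {"customer_id": "C1002", "name": "Grace"},
]
orders = [
    {"order_id": 1, "customer_id": "C1001"},
    {"order_id": 2, "customer_id": "C1001"},
    {"order_id": 3, "customer_id": "C1002"},
]

# Tokenise the key column in both tables and drop the direct identifier (name).
masked_customers = [{"customer_id": tokenise(c["customer_id"])} for c in customers]
masked_orders = [
    {"order_id": o["order_id"], "customer_id": tokenise(o["customer_id"])}
    for o in orders
]

# Every masked order should still point at a masked customer.
known_ids = {c["customer_id"] for c in masked_customers}
orphans = [o for o in masked_orders if o["customer_id"] not in known_ids]
print(f"orphaned orders after masking: {len(orphans)}")
```

Because the same input always produces the same token, a join between the masked tables returns the same rows as a join between the originals, so a bug that depends on those relationships can still be reproduced.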
The second is ready availability. When your production systems go down, you need to be able to act fast. Valuable Test and Dev data is data that’s readily available, allowing teams to work on patches and fixes the moment it’s clear something’s wrong.
Here’s the good news. A mature approach to data protection can help you rapidly provision rich data in a Test and Dev environment without relying on raw production data and opening your business up to a world of risk.
A more sophisticated method of anonymisation can preserve what matters in your production data (its richness and ready availability) while effectively protecting the sensitive information it contains, and even increasing its usefulness for Test and Dev. How? By:
The result isn’t just safer data for Test and Dev, it’s even more valuable Test and Dev data than you started with.
In the end, this isn’t just about Test and Dev. It’s about any transfer of production data to a secondary environment.
We’re living in the age of big data, but without a way to securely provision to Analytics, Machine Learning and Test and Dev environments, many organisations still aren’t feeling a big difference.
A faster, smarter approach to data anonymisation can help open up all of these data flows, while keeping access to sensitive data genuinely locked down. We know, because we’ve pioneered it and turned it into a packaged solution, and because our customers are proving its value every day.
If you’re working to quickly and safely provision data to your own Test and Dev environment, we’ve a checklist to help. It lays out six principles to follow to ensure your data is both safe and useful for Test and Dev purposes. You can download your copy here.