Fragility Forum: Minimizing Privacy Risks in Humanitarian Data

Mar 09, 2022
By Dr. Suzanne Weller, Head of Research at Privitar

This week sees the start of the World Bank’s Fragility Forum, an event that brings together policymakers and practitioners from humanitarian, development, peace, and security communities; the public and private sectors; academia; and civil society. The objective is to exchange innovative ideas and knowledge to improve development approaches in fragile, conflict- and violence-affected (FCV) settings and foster peace and stability.

As part of the Fragility Forum, I took part in a podcast interview together with Paddy Brock, a Senior Data Scientist at the World Bank-UNHCR Joint Data Center on Forced Displacement, and Jos Berens, Data Responsibility Officer at the UN OCHA Centre for Humanitarian Data. The podcast explores how the humanitarian and development community can ensure that responses and programs continue to be informed by high-quality data, evidence, and research, without exposing the vulnerable to harm. You can listen to the full podcast here, but I also want to share some key takeaways from our conversation.

Data helps us respond to crises

Data is an important part of how we respond to humanitarian crises. Useful data in this context includes information about affected people, their locations, the threats they face, and the assistance given, as well as data about transportation infrastructure, food prices, and the availability of health and education facilities. The State of Open Humanitarian Data 2022 describes how this type of data “reflects the reality of the world’s worst humanitarian emergencies, from persistent displacement to a lack of food and shelter for vulnerable populations.
The data also shows the response to these crises, from who is providing what assistance to funding levels and more.”

There is strong demand for this data, and it grew through the COVID-19 pandemic and the crisis in Ukraine. In 2021, 1.4 million people across 236 countries and territories used the Humanitarian Data Exchange (HDX), downloading datasets over 1.8 million times.

Data scientists and researchers from the humanitarian community use the data to understand crises and how to respond to them. Where and when should shelter, food, water, and other essentials be provided? The data can also be used for modeling purposes, such as developing anticipatory action trigger mechanisms that enable assistance before a shock occurs and reduce the impact on the lives of the people affected.

The privacy/utility tradeoff is acute

At the same time, the potential harm of getting privacy wrong is severe. These datasets contain information about some of the world’s most vulnerable people. Responsible data stewardship needs to ensure the safety of refugees and internally displaced people fleeing their homes due to persecution, conflict, and disaster.

The Centre for Humanitarian Data has explored the “mosaic effect,” analyzing HDX datasets to understand the extent to which information and attributes are shared between different datasets in the exchange. The mosaic effect risk is defined as “disparate items of information taking on added significance when combined with other items of information.” This describes a family of privacy attacks that, through linkage with other data, lead to the reconstruction of microdata from aggregate statistics, re-identification of individuals, or disclosure of sensitive attributes.

The opportunity costs of not using this data to inform humanitarian operations and development projects are unacceptably high. Operations, policy, and research in this area must continue to be informed by safely managed data. The challenge is clear.
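The mosaic effect can be made concrete with a toy sketch. The example below is entirely invented (all names, camps, and fields are hypothetical, not drawn from any real dataset): a survey with direct identifiers removed is linked to a separate registration log on shared quasi-identifiers, disclosing a sensitive attribute.

```python
# Hypothetical illustration of the mosaic effect: two datasets that look
# harmless in isolation can re-identify individuals when linked on shared
# quasi-identifiers. All records and field names here are invented.

health_survey = [  # "anonymized": names removed, sensitive attribute retained
    {"camp": "A", "age_band": "30-39", "arrival": "2021-06", "condition": "diabetes"},
    {"camp": "A", "age_band": "20-29", "arrival": "2021-07", "condition": "asthma"},
]

registration_log = [  # a separate dataset that still carries names
    {"name": "R. Example", "camp": "A", "age_band": "30-39", "arrival": "2021-06"},
    {"name": "S. Example", "camp": "B", "age_band": "20-29", "arrival": "2021-07"},
]

def link(records_a, records_b, keys):
    """Join two datasets on a set of shared quasi-identifier columns."""
    matches = []
    for a in records_a:
        for b in records_b:
            if all(a[k] == b[k] for k in keys):
                matches.append({**a, **b})
    return matches

quasi_identifiers = ["camp", "age_band", "arrival"]
reidentified = link(health_survey, registration_log, quasi_identifiers)
for row in reidentified:
    # The sensitive attribute is now attached to a name.
    print(row["name"], "->", row["condition"])
```

Neither dataset alone reveals who has which condition; the combination does, which is exactly the "added significance" the mosaic effect definition describes.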
PETs can help

Privacy Enhancing Technologies (PETs for short) are technologies that allow useful insights to be derived from data without requiring full access to the original, raw datasets. PETs can help mitigate the tension between data privacy and utility. For example, homomorphic encryption and trusted execution environments enable processing or analysis of encrypted data; federated analysis and multiparty computation allow the protected analysis of distributed data without the need to centralize it; and differential privacy and synthetic data enable safe dissemination of the outputs of analytics.

When thinking about how PETs can help counter future privacy risks from humanitarian data while still using that data to inform operations and planning, we need to look for ways of generating insights from sensitive personal data without that data being accessed or shared.

As a hypothetical humanitarian use case: analysts may want to pool information on real-time, cross-border and internal movements, allowing them to advise on how to direct supplies in a crisis. To support the analysis, researchers would need to collect and aggregate sensitive location information about individuals from different locations, even different organizations. Federated learning could help in this context. Researchers can train a predictive model on their local dataset, then send that model across several remote datasets at other organizations, for example mobile operators or other humanitarian organizations that collect data on people’s locations. The model then returns to the researcher having learned something new from those remote datasets, improving its ability to predict people’s movements. These distributed organizations never need to share their record-level data directly.

In examples such as these, PETs form a technological toolkit that reduces risk, makes data sharing simpler, and pushes the boundaries of what is possible in providing both strong privacy and high utility from a dataset.
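The federated pattern described above can be sketched in a few lines. This is a minimal, hypothetical illustration of federated averaging for a simple linear model, with synthetic numbers standing in for each organization's private data; real federated learning systems add secure aggregation, differential privacy, and far richer models.

```python
# Minimal sketch of federated averaging: each organization trains on its
# own local data and shares only model parameters with a coordinator,
# never record-level data. All datasets below are synthetic.

def local_update(w, b, data, lr=0.05, epochs=50):
    """Gradient descent on one organization's local (x, y) records."""
    for _ in range(epochs):
        dw = db = 0.0
        for x, y in data:
            err = (w * x + b) - y
            dw += 2 * err * x / len(data)
            db += 2 * err / len(data)
        w -= lr * dw
        b -= lr * db
    return w, b

# Three partners' private datasets, each roughly following y = 2x + 1.
org_datasets = [
    [(0.0, 1.0), (1.0, 3.1), (2.0, 4.9)],
    [(0.5, 2.0), (1.5, 4.1), (2.5, 6.0)],
    [(1.0, 2.9), (2.0, 5.1), (3.0, 7.0)],
]

w, b = 0.0, 0.0  # the shared global model
for _ in range(20):  # communication rounds
    updates = [local_update(w, b, d) for d in org_datasets]
    # The coordinator averages parameters; raw records never move.
    w = sum(u[0] for u in updates) / len(updates)
    b = sum(u[1] for u in updates) / len(updates)

# w and b should end up close to the underlying slope 2 and intercept 1.
print(f"w={w:.2f}, b={b:.2f}")
```

The privacy benefit is structural: only the parameter pairs `(w, b)` cross organizational boundaries, which is the property the use case above relies on.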
These technologies can’t solve everything on their own; it’s still important to take an approach that considers the ethics of projects, the training and vetting of trusted researchers, data sharing agreements and other legal controls, and the security of, and access to, the data. PETs are an important complementary part of the system of people, processes, and technology needed for modern data provisioning and responsible stewardship.

Listen to the World Bank’s Fragility Forum podcast interview with Dr. Suzanne Weller here.