Evaluating the Usability of Differential Privacy Tools with Data Practitioners
- URL: http://arxiv.org/abs/2309.13506v3
- Date: Tue, 13 Aug 2024 01:49:10 GMT
- Title: Evaluating the Usability of Differential Privacy Tools with Data Practitioners
- Authors: Ivoline C. Ngong, Brad Stenger, Joseph P. Near, Yuanyuan Feng
- Abstract summary: Differential privacy (DP) has become the gold standard in privacy-preserving data analytics, but implementing it in real-world datasets and systems remains challenging.
Recently developed DP tools aim to make DP implementation easier, but limited research has investigated these DP tools' usability.
We evaluated the usability of four Python-based open-source DP tools: DiffPrivLib, Tumult Analytics, PipelineDP, and OpenDP.
- Score: 4.072285093323275
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Differential privacy (DP) has become the gold standard in privacy-preserving data analytics, but implementing it in real-world datasets and systems remains challenging. Recently developed DP tools aim to make DP implementation easier, but limited research has investigated these DP tools' usability. Through a usability study with 24 US data practitioners with varying prior DP knowledge, we evaluated the usability of four Python-based open-source DP tools: DiffPrivLib, Tumult Analytics, PipelineDP, and OpenDP. Our results suggest that using DP tools in this study may help DP novices better understand DP; that Application Programming Interface (API) design and documentation are vital for successful DP implementation; and that user satisfaction correlates with how well participants completed study tasks with these DP tools. We provide evidence-based recommendations to improve DP tools' usability to broaden DP adoption.
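To illustrate the kind of task such tools support, here is a minimal sketch of a differentially private mean using DiffPrivLib, one of the four tools studied. The dataset, bounds, and epsilon value are illustrative, not from the study; the call follows DiffPrivLib's documented numpy-style tools API.

```python
# Minimal sketch: a differentially private mean with DiffPrivLib.
# Data, bounds, and epsilon are illustrative, not from the study.
import numpy as np
from diffprivlib import tools as dp_tools

rng = np.random.default_rng(0)
ages = rng.integers(18, 90, size=1000)   # toy "sensitive" dataset

# bounds clamp each value before noise is calibrated;
# epsilon is the privacy budget spent by this one query.
dp_mean = dp_tools.mean(ages, epsilon=1.0, bounds=(18, 90))
print(f"DP mean: {dp_mean:.2f}  (true mean: {ages.mean():.2f})")
```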
Related papers
- Machine Learning with Privacy for Protected Attributes [56.44253915927481]
We refine the definition of differential privacy (DP) to create a more general and flexible framework that we call feature differential privacy (FDP). Our definition is simulation-based and allows for both addition/removal and replacement variants of privacy, and can handle arbitrary separation of protected and non-protected features. We apply our framework to various machine learning tasks and show that it can significantly improve the utility of DP-trained models when public features are available.
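For context, the standard (epsilon, delta)-DP guarantee that FDP generalizes reads as follows; the FDP definition itself is simulation-based and is given in the paper.

```latex
% Standard (epsilon, delta)-DP: for all neighboring datasets D, D'
% and all measurable output sets S of a mechanism M,
\Pr[M(D) \in S] \le e^{\varepsilon} \, \Pr[M(D') \in S] + \delta
```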
arXiv Detail & Related papers (2025-06-24T17:53:28Z)
- What is the Cost of Differential Privacy for Deep Learning-Based Trajectory Generation? [20.540761983235868]
We show how DP-SGD affects the utility of state-of-the-art generative models. We propose a novel DP mechanism for conditional generation that provides formal guarantees, and assess its impact on utility. Our results show that DP-SGD significantly impacts performance, although some utility remains if the dataset is sufficiently large.
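For reference, the core DP-SGD update whose utility cost the paper measures can be sketched as below. This is a minimal numpy version with illustrative hyperparameters, not the paper's implementation.

```python
# Minimal numpy sketch of one DP-SGD step: clip each example's gradient,
# sum, add Gaussian noise, average. Hyperparameters are illustrative.
import numpy as np

def dp_sgd_step(params, per_example_grads, lr=0.1,
                clip_norm=1.0, noise_mult=1.0, rng=np.random.default_rng()):
    # Scale each per-example gradient so its L2 norm is at most clip_norm.
    norms = np.linalg.norm(per_example_grads, axis=1, keepdims=True)
    clipped = per_example_grads * np.minimum(1.0, clip_norm / np.maximum(norms, 1e-12))
    # Gaussian noise with std. dev. noise_mult * clip_norm masks any
    # single example's contribution to the summed gradient.
    noisy_sum = clipped.sum(axis=0) + rng.normal(
        0.0, noise_mult * clip_norm, size=params.shape)
    return params - lr * noisy_sum / len(per_example_grads)
```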
arXiv Detail & Related papers (2025-06-11T00:59:52Z)
- SoK: Usability Studies in Differential Privacy [3.4111656179349743]
Differential Privacy (DP) has emerged as a pivotal approach for safeguarding individual privacy in data analysis.
This paper presents a comprehensive systematization of existing research on the usability of and communication about DP.
arXiv Detail & Related papers (2024-12-22T02:21:57Z)
- DOPPLER: Differentially Private Optimizers with Low-pass Filter for Privacy Noise Reduction [47.65999101635902]
Differentially private (DP) training prevents the leakage of sensitive information in the collected training data from trained machine learning models.
We develop a new component, called DOPPLER, which works by effectively amplifying the gradient signal while suppressing DP noise in the frequency domain.
Our experiments show that the proposed DP optimizers with a low-pass filter outperform their counterparts without the filter by 3%-10% in test accuracy.
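The abstract does not specify DOPPLER's filter design; as a hedged illustration of the general idea, a first-order low-pass (exponential moving average) filter applied to noisy DP gradients looks like this:

```python
# Hedged sketch of the low-pass idea: smooth noisy DP gradients with a
# first-order (exponential moving average) filter. This is a generic
# low-pass filter, not DOPPLER's exact design, which the abstract omits.
import numpy as np

class LowPassGrad:
    def __init__(self, alpha=0.3):
        self.alpha = alpha       # smaller alpha = stronger smoothing
        self.state = None

    def __call__(self, noisy_grad):
        if self.state is None:
            self.state = np.zeros_like(noisy_grad)
        # The true gradient varies slowly across iterations (low frequency)
        # while the injected DP noise is independent each step (white),
        # so the filter passes the former and attenuates the latter.
        self.state = (1 - self.alpha) * self.state + self.alpha * noisy_grad
        return self.state
```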
arXiv Detail & Related papers (2024-08-24T04:27:07Z)
- How Private are DP-SGD Implementations? [61.19794019914523]
We show that there can be a substantial gap between the privacy analyses for the two types of batch sampling (shuffling versus Poisson subsampling).
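The two batch-sampling schemes in question can be sketched as follows: Poisson subsampling, assumed by most DP-SGD privacy analyses, versus shuffling into fixed-size batches, which is common in implementations.

```python
# Sketch of the two batch-sampling schemes whose privacy analyses differ.
import numpy as np

def poisson_batches(n, sample_rate, steps, rng):
    # Poisson subsampling: each example joins each batch independently
    # with probability sample_rate, so batch sizes are random.
    return [np.flatnonzero(rng.random(n) < sample_rate) for _ in range(steps)]

def shuffled_batches(n, batch_size, rng):
    # Shuffling: one random permutation per epoch, cut into fixed-size batches.
    perm = rng.permutation(n)
    return [perm[i:i + batch_size] for i in range(0, n, batch_size)]

rng = np.random.default_rng(0)
poisson = poisson_batches(n=1000, sample_rate=0.01, steps=5, rng=rng)
shuffled = shuffled_batches(n=1000, batch_size=10, rng=rng)
```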
arXiv Detail & Related papers (2024-03-26T13:02:43Z)
- Pre-training Differentially Private Models with Limited Public Data [54.943023722114134]
Differential privacy (DP) is a prominent method to gauge the degree of security provided to models.
DP is not yet capable of protecting a substantial portion of the data used during the initial pre-training stage.
We develop a novel DP continual pre-training strategy using only 10% of public data.
Our strategy can achieve DP accuracy of 41.5% on ImageNet-21k, as well as non-DP accuracy of 55.7% and 60.0% on the downstream tasks Places365 and iNaturalist-2021.
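A hedged sketch of the two-phase idea follows, assuming the strategy warm-starts on the small public subset without privacy cost before switching to DP updates on sensitive data; the paper's actual schedule may differ.

```python
# Hypothetical skeleton of the two-phase strategy: a non-private warm-up
# on public data, then DP-SGD on private data. Function names and loop
# structure are assumptions, not the paper's code.
def continual_pretrain(model, public_batches, private_batches,
                       sgd_step, dp_sgd_step):
    for batch in public_batches:     # phase 1: public data, no privacy cost
        model = sgd_step(model, batch)
    for batch in private_batches:    # phase 2: sensitive data, DP updates
        model = dp_sgd_step(model, batch)
    return model
```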
arXiv Detail & Related papers (2024-02-28T23:26:27Z)
- Selectivity Drives Productivity: Efficient Dataset Pruning for Enhanced Transfer Learning [66.20311762506702]
Dataset pruning (DP) has emerged as an effective way to improve data efficiency.
We propose two new DP methods, label mapping and feature mapping, for supervised and self-supervised pretraining settings.
We show that source data classes can be pruned by 40%-80% without sacrificing downstream performance.
arXiv Detail & Related papers (2023-10-13T00:07:49Z)
- ULDP-FL: Federated Learning with Cross-Silo User-Level Differential Privacy [19.017342515321918]
Differentially Private Federated Learning (DP-FL) has garnered attention as a collaborative machine learning approach that ensures formal privacy.
We present Uldp-FL, a novel FL framework designed to guarantee user-level DP in cross-silo FL where a single user's data may belong to multiple silos.
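The abstract does not detail the protocol; as a hedged sketch, user-level DP typically requires bounding each user's total contribution (here, a gradient summed across that user's records in all silos) before noise is added.

```python
# Hedged sketch of the user-level DP ingredient: bound each user's total
# contribution (summed across all silos holding that user's records)
# before aggregation; the actual Uldp-FL protocol is more involved.
import numpy as np

def clip_per_user(user_grads, clip_norm=1.0):
    """user_grads: dict mapping user id -> gradient summed across silos."""
    total = np.zeros_like(next(iter(user_grads.values())))
    for g in user_grads.values():
        # Per-user clipping caps the influence of any single user,
        # which is what user-level (rather than record-level) DP requires.
        total += g * min(1.0, clip_norm / (np.linalg.norm(g) + 1e-12))
    return total
```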
arXiv Detail & Related papers (2023-08-23T15:50:51Z)
- Personalized DP-SGD using Sampling Mechanisms [5.50042037663784]
We extend Differentially Private Stochastic Gradient Descent (DP-SGD) to support a recent privacy notion called $(\Phi,\Delta)$-Personalized Differential Privacy ($(\Phi,\Delta)$-PDP).
Our algorithm uses a multi-round personalized sampling mechanism and embeds it within the DP-SGD iteration.
Experiments on real datasets show that our algorithm outperforms DP-SGD and simple combinations of DP-SGD with existing PDP mechanisms.
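A hedged sketch of the sampling idea, assuming each example i has a personal budget epsilon_i and is sampled with probability scaled accordingly; the paper's multi-round mechanism is more involved.

```python
# Hedged sketch of personalized sampling: examples with larger personal
# privacy budgets (epsilon_i) are sampled more often. This conveys only
# the core idea, not the paper's exact multi-round mechanism.
import numpy as np

def personalized_sample(epsilons, base_rate, rng):
    # Scale per-example sampling probabilities by the personal budgets,
    # so the smallest budget gets exactly base_rate.
    eps = np.asarray(epsilons, dtype=float)
    probs = np.clip(base_rate * eps / eps.min(), 0.0, 1.0)
    return np.flatnonzero(rng.random(len(eps)) < probs)
```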
arXiv Detail & Related papers (2023-05-24T13:56:57Z)
- DPMLBench: Holistic Evaluation of Differentially Private Machine Learning [8.568872924668662]
Many studies have recently proposed improved algorithms based on DP-SGD to mitigate utility loss.
More importantly, there is a lack of comprehensive research to compare improvements in these DPML algorithms across utility, defensive capabilities, and generalizability.
We fill this gap by performing a holistic measurement of improved DPML algorithms on utility and defense capability against membership inference attacks (MIAs) on image classification tasks.
arXiv Detail & Related papers (2023-05-10T05:08:36Z)
- Exploring the Benefits of Visual Prompting in Differential Privacy [54.56619360046841]
Visual Prompting (VP) is an emerging and powerful technique that allows sample-efficient adaptation to downstream tasks by engineering a well-trained frozen source model.
We explore and integrate VP into canonical DP training methods and demonstrate its simplicity and efficiency.
arXiv Detail & Related papers (2023-03-22T01:01:14Z)
- Make Landscape Flatter in Differentially Private Federated Learning [69.78485792860333]
We propose a novel DPFL algorithm named DP-FedSAM, which leverages gradient perturbation to mitigate the negative impact of DP.
Specifically, DP-FedSAM integrates the Sharpness-Aware Minimization (SAM) optimizer to generate local flatness models with better stability and weight-perturbation robustness, which results in a small norm of local updates and robustness to DP noise.
Our algorithm achieves state-of-the-art (SOTA) performance compared with existing SOTA baselines in DPFL.
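A minimal sketch of the SAM step that DP-FedSAM builds on, assuming a generic gradient oracle; the DP clipping and noising that wrap this step in DP-FedSAM are omitted, and rho and lr are illustrative.

```python
# Minimal sketch of a Sharpness-Aware Minimization (SAM) step: take the
# gradient at a worst-case nearby point so the model settles into a
# flatter minimum.
import numpy as np

def sam_step(params, grad_fn, lr=0.1, rho=0.05):
    g = grad_fn(params)
    # Ascend to the (approximate) worst-case neighbor in an L2 ball of radius rho.
    perturb = rho * g / (np.linalg.norm(g) + 1e-12)
    # Descend using the gradient evaluated at the perturbed point.
    return params - lr * grad_fn(params + perturb)
```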
arXiv Detail & Related papers (2023-03-20T16:27:36Z)
- How to DP-fy ML: A Practical Guide to Machine Learning with Differential Privacy [22.906644117887133]
Differential Privacy (DP) has become a gold standard for making formal statements about data anonymization.
The adoption of DP is hindered by limited practical guidance on what DP protection entails, which privacy guarantees to aim for, and the difficulty of achieving good privacy-utility-computation trade-offs for ML models.
This work is a self-contained guide that gives an in-depth overview of the field of DP ML and presents information about achieving the best possible DP ML model with rigorous privacy guarantees.
arXiv Detail & Related papers (2023-03-01T16:56:39Z)
- Lifelong DP: Consistently Bounded Differential Privacy in Lifelong Machine Learning [28.68587691924582]
We show that the process of continually learning new tasks and memorizing previous tasks introduces unknown privacy risks and challenges to bound the privacy loss.
We introduce a formal definition of Lifelong DP, in which the participation of any data tuple in the training set of any task is protected.
We propose a scalable and heterogeneous algorithm, called L2DP-ML, to efficiently train and continue releasing new versions of an L2M model.
arXiv Detail & Related papers (2022-07-26T11:55:21Z)
This list is automatically generated from the titles and abstracts of the papers on this site.