Differentially Private Algorithms for Synthetic Power System Datasets
- URL: http://arxiv.org/abs/2303.11079v1
- Date: Mon, 20 Mar 2023 13:38:58 GMT
- Title: Differentially Private Algorithms for Synthetic Power System Datasets
- Authors: Vladimir Dvorkin and Audun Botterud
- Abstract summary: Power systems research relies on the availability of real-world network datasets.
Data owners are hesitant to share data due to security and privacy risks.
We develop privacy-preserving algorithms for the synthetic generation of optimization and machine learning datasets.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: While power systems research relies on the availability of real-world network
datasets, data owners (e.g., system operators) are hesitant to share data due
to security and privacy risks. To control these risks, we develop
privacy-preserving algorithms for the synthetic generation of optimization and
machine learning datasets. Taking a real-world dataset as input, the algorithms
output its noisy, synthetic version, which preserves the accuracy of the real
data on a specific downstream model or even a large population of those. We
control the privacy loss using Laplace and Exponential mechanisms of
differential privacy and preserve data accuracy using a post-processing convex
optimization. We apply the algorithms to generate synthetic network parameters
and wind power data.
Related papers
- Differentially Private Synthetic High-dimensional Tabular Stream [7.726042106665366]
We propose an algorithmic framework for streaming data that generates multiple synthetic datasets over time.
Our algorithm satisfies differential privacy for the entire input stream.
We show the utility of our method via experiments on real-world datasets.
arXiv Detail & Related papers (2024-08-31T01:31:59Z) - Mitigating the Privacy Issues in Retrieval-Augmented Generation (RAG) via Pure Synthetic Data [51.41288763521186]
Retrieval-augmented generation (RAG) enhances the outputs of language models by integrating relevant information retrieved from external knowledge sources.
RAG systems may face severe privacy risks when retrieving private data.
We propose using synthetic data as a privacy-preserving alternative for the retrieval data.
arXiv Detail & Related papers (2024-06-20T22:53:09Z) - FewFedPIT: Towards Privacy-preserving and Few-shot Federated Instruction Tuning [54.26614091429253]
Federated instruction tuning (FedIT) is a promising solution, by consolidating collaborative training across multiple data owners.
FedIT encounters limitations such as scarcity of instructional data and risk of exposure to training data extraction attacks.
We propose FewFedPIT, designed to simultaneously enhance privacy protection and model performance of federated few-shot learning.
arXiv Detail & Related papers (2024-03-10T08:41:22Z) - An Algorithm for Streaming Differentially Private Data [7.726042106665366]
We derive an algorithm for differentially private synthetic streaming data generation, especially curated towards spatial datasets.
The utility of our algorithm is verified on both real-world and simulated datasets.
arXiv Detail & Related papers (2024-01-26T00:32:31Z) - Decentralised, Scalable and Privacy-Preserving Synthetic Data Generation [8.982917734231165]
We build a novel system that allows the contributors of real data to autonomously participate in differentially private synthetic data generation.
Our solution is based on three building blocks namely: Solid (Social Linked Data), MPC (Secure Multi-Party Computation) and Trusted Execution Environments (TEEs)
We show how these three technologies can be effectively used to address various challenges in responsible and trustworthy synthetic data generation.
arXiv Detail & Related papers (2023-10-30T22:27:32Z) - On Responsible Machine Learning Datasets with Fairness, Privacy, and Regulatory Norms [56.119374302685934]
There have been severe concerns over the trustworthiness of AI technologies.
Machine and deep learning algorithms depend heavily on the data used during their development.
We propose a framework to evaluate the datasets through a responsible rubric.
arXiv Detail & Related papers (2023-10-24T14:01:53Z) - Differentially Private Data Generation with Missing Data [25.242190235853595]
We formalize the problems of differential privacy (DP) synthetic data with missing values.
We propose three effective adaptive strategies that significantly improve the utility of the synthetic data.
Overall, this study contributes to a better understanding of the challenges and opportunities for using private synthetic data generation algorithms.
arXiv Detail & Related papers (2023-10-17T19:41:54Z) - Differentially Private Synthetic Data Using KD-Trees [11.96971298978997]
We exploit space partitioning techniques together with noise perturbation and thus achieve intuitive and transparent algorithms.
We propose both data independent and data dependent algorithms for $epsilon$-differentially private synthetic data generation.
We show empirical utility improvements over the prior work, and discuss performance of our algorithm on a downstream classification task on a real dataset.
arXiv Detail & Related papers (2023-06-19T17:08:32Z) - Private Set Generation with Discriminative Information [63.851085173614]
Differentially private data generation is a promising solution to the data privacy challenge.
Existing private generative models are struggling with the utility of synthetic samples.
We introduce a simple yet effective method that greatly improves the sample utility of state-of-the-art approaches.
arXiv Detail & Related papers (2022-11-07T10:02:55Z) - Decentralized Stochastic Optimization with Inherent Privacy Protection [103.62463469366557]
Decentralized optimization is the basic building block of modern collaborative machine learning, distributed estimation and control, and large-scale sensing.
Since involved data, privacy protection has become an increasingly pressing need in the implementation of decentralized optimization algorithms.
arXiv Detail & Related papers (2022-05-08T14:38:23Z) - New Oracle-Efficient Algorithms for Private Synthetic Data Release [52.33506193761153]
We present three new algorithms for constructing differentially private synthetic data.
The algorithms satisfy differential privacy even in the worst case.
Compared to the state-of-the-art method High-Dimensional Matrix Mechanism citeMcKennaMHM18, our algorithms provide better accuracy in the large workload.
arXiv Detail & Related papers (2020-07-10T15:46:05Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.