Private Approximate Query over Horizontal Data Federation
- URL: http://arxiv.org/abs/2406.11421v1
- Date: Mon, 17 Jun 2024 11:19:58 GMT
- Title: Private Approximate Query over Horizontal Data Federation
- Authors: Ala Eddine Laouir, Abdessamad Imine,
- Abstract summary: Existing approaches rely on cryptography, which improves privacy, but at the expense of query response time.
We propose a new approach that considers a data distribution-aware online sampling technique to accelerate the execution of range queries.
Our solution is able of providing up to 8 times faster processing than the basic non-secure solution.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In many real-world scenarios, multiple data providers need to collaboratively perform analysis of their private data. The challenges of these applications, especially at the big data scale, are time and resource efficiency as well as end-to-end privacy with minimal loss of accuracy. Existing approaches rely primarily on cryptography, which improves privacy, but at the expense of query response time. However, current big data analytics frameworks require fast and accurate responses to large-scale queries, making cryptography-based solutions less suitable. In this work, we address the problem of combining Approximate Query Processing (AQP) and Differential Privacy (DP) in a private federated environment answering range queries on horizontally partitioned multidimensional data. We propose a new approach that considers a data distribution-aware online sampling technique to accelerate the execution of range queries and ensure end-to-end data privacy during and after analysis with minimal loss in accuracy. Through empirical evaluation, we show that our solution is able of providing up to 8 times faster processing than the basic non-secure solution while maintaining accuracy, formal privacy guarantees and resilience to learning-based attacks.
Related papers
- PeFAD: A Parameter-Efficient Federated Framework for Time Series Anomaly Detection [51.20479454379662]
We propose a.
Federated Anomaly Detection framework named PeFAD with the increasing privacy concerns.
We conduct extensive evaluations on four real datasets, where PeFAD outperforms existing state-of-the-art baselines by up to 28.74%.
arXiv Detail & Related papers (2024-06-04T13:51:08Z) - SPECIAL: Synopsis Assisted Secure Collaborative Analytics [6.5653818719859895]
SPECIAL is a secure collaborative analytics system that simultaneously ensures bounded privacy loss and advanced query planning.
It significantly outperforms cutting-edge SCAs, with up to 80X faster query times and over 900X smaller memory for complex queries.
It also achieves up to an 89X reduction in privacy loss under continual processing.
arXiv Detail & Related papers (2024-04-29T02:55:54Z) - Provable Privacy with Non-Private Pre-Processing [56.770023668379615]
We propose a general framework to evaluate the additional privacy cost incurred by non-private data-dependent pre-processing algorithms.
Our framework establishes upper bounds on the overall privacy guarantees by utilising two new technical notions.
arXiv Detail & Related papers (2024-03-19T17:54:49Z) - A Unified View of Differentially Private Deep Generative Modeling [60.72161965018005]
Data with privacy concerns comes with stringent regulations that frequently prohibited data access and data sharing.
Overcoming these obstacles is key for technological progress in many real-world application scenarios that involve privacy sensitive data.
Differentially private (DP) data publishing provides a compelling solution, where only a sanitized form of the data is publicly released.
arXiv Detail & Related papers (2023-09-27T14:38:16Z) - On Differential Privacy and Adaptive Data Analysis with Bounded Space [76.10334958368618]
We study the space complexity of the two related fields of differential privacy and adaptive data analysis.
We show that there exists a problem P that requires exponentially more space to be solved efficiently with differential privacy.
The line of work on adaptive data analysis focuses on understanding the number of samples needed for answering a sequence of adaptive queries.
arXiv Detail & Related papers (2023-02-11T14:45:31Z) - Private Set Generation with Discriminative Information [63.851085173614]
Differentially private data generation is a promising solution to the data privacy challenge.
Existing private generative models are struggling with the utility of synthetic samples.
We introduce a simple yet effective method that greatly improves the sample utility of state-of-the-art approaches.
arXiv Detail & Related papers (2022-11-07T10:02:55Z) - Reasoning over Public and Private Data in Retrieval-Based Systems [29.515915401413334]
State-of-the-art systems explicitly retrieve relevant information to a user question from a background corpus before producing an answer.
While today's retrieval systems assume the corpus is fully accessible, users are often unable or unwilling to expose their private data to entities hosting public data.
We first define the PUBLIC-PRIVATE AUTOREGRESSIVE Information RETRIEVAL (PAIR) privacy framework for the novel retrieval setting over multiple privacy scopes.
arXiv Detail & Related papers (2022-03-14T13:08:51Z) - Efficient Logistic Regression with Local Differential Privacy [0.0]
Internet of Things devices are expanding rapidly and generating huge amount of data.
There is an increasing need to explore data collected from these devices.
Collaborative learning provides a strategic solution for the Internet of Things settings but also raises public concern over data privacy.
arXiv Detail & Related papers (2022-02-05T22:44:03Z) - Differentially Private Query Release Through Adaptive Projection [19.449593001368193]
We propose, implement, and evaluate a new algorithm for releasing answers to very large numbers of statistical queries like $k$-way marginals.
Our algorithm makes adaptive use of a continuous relaxation of the Projection Mechanism, which answers queries on the private dataset using simple perturbation.
We find that our method outperforms existing algorithms in many cases, especially when the privacy budget is small or the query class is large.
arXiv Detail & Related papers (2021-03-11T12:43:18Z) - Robust and Differentially Private Mean Estimation [40.323756738056616]
Differential privacy has emerged as a standard requirement in a variety of applications ranging from the U.S. Census to data collected in commercial devices.
An increasing number of such databases consist of data from multiple sources, not all of which can be trusted.
This leaves existing private analyses vulnerable to attacks by an adversary who injects corrupted data.
arXiv Detail & Related papers (2021-02-18T05:02:49Z) - Straggler-Resilient Federated Learning: Leveraging the Interplay Between
Statistical Accuracy and System Heterogeneity [57.275753974812666]
Federated learning involves learning from data samples distributed across a network of clients while the data remains local.
In this paper, we propose a novel straggler-resilient federated learning method that incorporates statistical characteristics of the clients' data to adaptively select the clients in order to speed up the learning procedure.
arXiv Detail & Related papers (2020-12-28T19:21:14Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.