Accuracy and Privacy Evaluations of Collaborative Data Analysis
- URL: http://arxiv.org/abs/2101.11144v1
- Date: Wed, 27 Jan 2021 00:38:47 GMT
- Title: Accuracy and Privacy Evaluations of Collaborative Data Analysis
- Authors: Akira Imakura, Anna Bogdanova, Takaya Yamazoe, Kazumasa Omote, Tetsuya
Sakurai
- Abstract summary: A collaborative data analysis that shares only dimensionality-reduced representations of data has been proposed as a non-model-sharing type of federated learning.
This paper presents accuracy and privacy evaluations of this novel framework.
- Score: 4.987315310656657
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Distributed data analysis without revealing the individual data has recently attracted significant attention in several applications. A collaborative data analysis through sharing dimensionality-reduced representations of data has been proposed as a non-model-sharing type of federated learning. This paper presents accuracy and privacy evaluations of this novel framework. In the accuracy analysis, we provide sufficient conditions under which the collaborative data analysis is equivalent to the centralized analysis with dimensionality reduction. In the privacy analysis, we prove that the collaborative users' private datasets are protected by a double privacy layer against both insider and external attack scenarios.
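To make the framework concrete, below is a minimal sketch of the non-model-sharing idea: each party applies its own private dimensionality reduction to its data and to a shared anchor dataset, shares only the reduced representations, and an analyst aligns them into one collaboration space. PCA as the local reduction, a random anchor, and the use of party 0's anchor image as the alignment target are illustrative assumptions, not the authors' exact construction.

```python
import numpy as np

rng = np.random.default_rng(0)

def local_reduction(X, k):
    """A party's private dimensionality reduction (here: PCA via SVD)."""
    Xc = X - X.mean(axis=0)
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Vt[:k].T  # d x k projection matrix, never shared

d, k = 20, 5
# Three parties, each holding private data over the same d features.
parties = [rng.normal(size=(100, d)) for _ in range(3)]
# A shareable anchor dataset (e.g., random data) visible to every party.
anchor = rng.normal(size=(50, d))

# Each party shares only reduced representations of its data and the anchor.
shared = []
for X in parties:
    F = local_reduction(X, k)              # private map f_i
    shared.append((X @ F, anchor @ F))     # intermediate representations

# The analyst aligns parties via the anchor images. Using party 0's anchor
# image as the common target Z is a simplification; the actual method builds
# Z from all anchor representations (e.g., by SVD).
Z = shared[0][1]
collab_parts = []
for X_tilde, A_tilde in shared:
    G, *_ = np.linalg.lstsq(A_tilde, Z, rcond=None)  # align A_tilde @ G ~ Z
    collab_parts.append(X_tilde @ G)

X_collab = np.vstack(collab_parts)  # one shared space; centralized analysis follows
print(X_collab.shape)               # (300, 5)
```

Because only the reduced matrices leave each party, the raw data and the private maps form the two layers of protection the abstract refers to.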
Related papers
- Privacy-preserving recommender system using the data collaboration analysis for distributed datasets [2.9061423802698565]
We establish a framework for privacy-preserving recommender systems using the data collaboration analysis of distributed datasets.
Numerical experiments with two public rating datasets demonstrate that our privacy-preserving method for rating prediction can improve the prediction accuracy for distributed datasets.
arXiv Detail & Related papers (2024-05-24T07:43:00Z)
- Lazy Data Practices Harm Fairness Research [49.02318458244464]
We present a comprehensive analysis of fair ML datasets, demonstrating how unreflective practices hinder the reach and reliability of algorithmic fairness findings.
Our analyses identify three main areas of concern: (1) a lack of representation for certain protected attributes in both data and evaluations; (2) the widespread exclusion of minorities during data preprocessing; and (3) opaque data processing threatening the generalization of fairness research.
This study underscores the need for a critical reevaluation of data practices in fair ML and offers directions to improve both the sourcing and usage of datasets.
arXiv Detail & Related papers (2024-04-26T09:51:24Z)
- A Unified View of Differentially Private Deep Generative Modeling [60.72161965018005]
Data with privacy concerns comes with stringent regulations that frequently prohibit data access and data sharing.
Overcoming these obstacles is key for technological progress in many real-world application scenarios that involve privacy sensitive data.
Differentially private (DP) data publishing provides a compelling solution, where only a sanitized form of the data is publicly released.
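As a concrete instance of DP data publishing, here is a minimal sketch of the classic Laplace mechanism for releasing a sanitized query answer; it illustrates the general idea only and is not the mechanism surveyed in the paper.

```python
import numpy as np

def laplace_release(true_value, sensitivity, epsilon, rng=np.random.default_rng()):
    """Release a query answer with epsilon-DP via the Laplace mechanism."""
    return true_value + rng.laplace(loc=0.0, scale=sensitivity / epsilon)

# Publish a sanitized count of records that satisfy some predicate.
bits = np.array([1, 0, 1, 1, 0, 1])  # one bit per individual
true_count = bits.sum()              # a counting query has sensitivity 1
print(laplace_release(true_count, sensitivity=1.0, epsilon=0.5))
```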
arXiv Detail & Related papers (2023-09-27T14:38:16Z)
- Data Analytics with Differential Privacy [0.0]
We develop differentially private algorithms to analyze distributed and streaming data.
In the distributed model, we consider the particular problem of learning -- in a distributed fashion -- a global model of the data.
We offer one of the strongest privacy guarantees for the streaming model, user-level pan-privacy.
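For intuition about pan-privacy, the toy estimator below keeps its entire internal state as randomized-response bits, so the state is differentially private even against an intruder who inspects memory mid-stream. This is a simplified classroom version of pan-private density estimation, not the paper's algorithm; the slot count, epsilon, and debiasing are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(2)

class PanPrivateDensity:
    """Toy pan-private estimator of how many table slots a stream touches.
    Every slot always holds a randomized-response bit, so the internal
    state itself stays private even if inspected mid-stream."""
    def __init__(self, n_slots, eps):
        self.p = np.exp(eps) / (np.exp(eps) + 1.0)  # keep-truth probability
        self.n = n_slots
        # Initialize every slot as a randomized-response encoding of bit 0.
        self.state = rng.random(n_slots) < (1.0 - self.p)

    def update(self, user_id):
        slot = user_id % self.n  # toy "hash"
        # Re-encode the true bit 1 ("touched") with randomized response.
        self.state[slot] = True if rng.random() < self.p else False

    def estimate(self):
        # E[mean(state)] = (1-p) + f*(2p-1), where f = fraction touched.
        m = self.state.mean()
        f = (m - (1.0 - self.p)) / (2.0 * self.p - 1.0)
        return f * self.n  # ignores hash collisions

est = PanPrivateDensity(n_slots=2000, eps=1.0)
for uid in rng.integers(0, 500, size=5000):  # 500 distinct users
    est.update(int(uid))
print(round(est.estimate()))                 # roughly 500
```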
arXiv Detail & Related papers (2023-07-20T17:43:29Z)
- Differentially Private Federated Clustering over Non-IID Data [59.611244450530315]
The federated clustering (FedC) problem aims to accurately partition unlabeled data samples distributed over massive clients into finite clusters under the orchestration of a server.
We propose a novel FedC algorithm with a differential privacy guarantee, referred to as DP-FedC, in which partial client participation and multiple local updates are also considered.
Various properties of the proposed DP-FedC are obtained through theoretical analyses of its privacy protection, especially for the case of non-independent and identically distributed (non-i.i.d.) data.
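The sketch below shows one plausible shape of such an algorithm: a server orchestrates partially participating clients that send back noisy, clipped cluster statistics. The noise placement, clipping, and parameters are illustrative assumptions rather than the actual DP-FedC procedure.

```python
import numpy as np

rng = np.random.default_rng(1)

def client_update(X, centers, eps, clip=8.0):
    """Assign local points to centers, return noisy clipped statistics."""
    labels = np.argmin(((X[:, None, :] - centers[None]) ** 2).sum(-1), axis=1)
    k, d = centers.shape
    sums, counts = np.zeros((k, d)), np.zeros(k)
    for x, lab in zip(X, labels):
        x = x * min(1.0, clip / (np.linalg.norm(x) + 1e-12))  # bound sensitivity
        sums[lab] += x
        counts[lab] += 1
    # Placeholder noise: stands in for noise calibrated to a formal DP budget.
    sums += rng.normal(scale=clip / eps, size=sums.shape)
    counts += rng.normal(scale=1.0 / eps, size=counts.shape)
    return sums, counts

# Non-IID clients: each one's data sits around a different mean.
clients = [rng.normal(loc=m, size=(200, 2)) for m in (-4.0, 0.0, 4.0)]
centers = rng.normal(size=(3, 2))
for _ in range(10):
    # Partial participation: the server samples a subset of clients per round.
    chosen = rng.choice(len(clients), size=2, replace=False)
    stats = [client_update(clients[i], centers, eps=5.0) for i in chosen]
    S = sum(s for s, _ in stats)
    C = sum(c for _, c in stats)
    centers = S / np.maximum(C, 1.0)[:, None]  # noisy aggregated k-means step
print(centers)
```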
arXiv Detail & Related papers (2023-01-03T05:38:43Z)
- Private Set Generation with Discriminative Information [63.851085173614]
Differentially private data generation is a promising solution to the data privacy challenge.
Existing private generative models struggle with the utility of synthetic samples.
We introduce a simple yet effective method that greatly improves the sample utility of state-of-the-art approaches.
arXiv Detail & Related papers (2022-11-07T10:02:55Z)
- Non-readily identifiable data collaboration analysis for multiple datasets including personal information [7.315551060433141]
Data confidentiality and cross-institutional communication are critical for medical datasets.
In this study, the identifiability of the data collaboration analysis is investigated.
The proposed method exhibits non-readily identifiability while maintaining high recognition performance.
arXiv Detail & Related papers (2022-08-31T03:19:17Z)
- DP2-Pub: Differentially Private High-Dimensional Data Publication with Invariant Post Randomization [58.155151571362914]
We propose a differentially private high-dimensional data publication mechanism (DP2-Pub) that runs in two phases.
Splitting attributes into several low-dimensional clusters with high intra-cluster cohesion and low inter-cluster coupling helps obtain a reasonable privacy budget.
We also extend our DP2-Pub mechanism to the scenario with a semi-honest server which satisfies local differential privacy.
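A greedy, correlation-based sketch of the attribute-splitting idea follows; DP2-Pub's actual first phase uses its own clustering criterion, so thresholded correlation here is only an illustrative stand-in.

```python
import numpy as np

rng = np.random.default_rng(3)

def split_attributes(X, threshold=0.3):
    """Greedily group attributes whose pairwise |correlation| exceeds a
    threshold: cohesive inside each cluster, loosely coupled across them."""
    corr = np.abs(np.corrcoef(X, rowvar=False))
    unassigned = set(range(X.shape[1]))
    clusters = []
    while unassigned:
        cluster = [unassigned.pop()]
        for j in sorted(unassigned):
            if all(corr[j, i] >= threshold for i in cluster):
                cluster.append(j)
                unassigned.discard(j)
        clusters.append(cluster)
    return clusters

# Columns 0-1 and 2-3 are correlated blocks; column 4 is independent.
base = rng.normal(size=(500, 1))
walk = rng.normal(size=(500, 2)).cumsum(axis=1)
X = np.hstack([base, base + 0.1 * rng.normal(size=(500, 1)),
               walk, rng.normal(size=(500, 1))])
print(split_attributes(X))  # expected grouping: [[0, 1], [2, 3], [4]]
```

Each low-dimensional cluster can then be noised and published separately, which is what makes the per-cluster privacy budget manageable.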
arXiv Detail & Related papers (2022-08-24T17:52:43Z)
- Sensitivity analysis in differentially private machine learning using hybrid automatic differentiation [54.88777449903538]
We introduce a novel hybrid automatic differentiation (AD) system for sensitivity analysis.
This enables modelling the sensitivity of arbitrary differentiable function compositions, such as the training of neural networks on private data.
Our approach enables principled reasoning about privacy loss in data processing settings.
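As a toy illustration of using automatic differentiation for sensitivity analysis, the sketch below implements forward-mode dual numbers and empirically bounds the derivative of a small function composition over a grid of inputs; the paper's hybrid AD system is far more general.

```python
import numpy as np

class Dual:
    """Minimal forward-mode AD value: carries f(x) and f'(x) together."""
    def __init__(self, val, dot=0.0):
        self.val, self.dot = val, dot
    def __add__(self, other):
        other = other if isinstance(other, Dual) else Dual(other)
        return Dual(self.val + other.val, self.dot + other.dot)
    __radd__ = __add__
    def __mul__(self, other):
        other = other if isinstance(other, Dual) else Dual(other)
        return Dual(self.val * other.val,
                    self.dot * other.val + self.val * other.dot)
    __rmul__ = __mul__

def tanh(x):
    t = np.tanh(x.val)
    return Dual(t, (1.0 - t * t) * x.dot)  # chain rule through tanh

# A differentiable composition, e.g. a two-layer scalar "network".
def f(x, w1=0.7, w2=-1.3):
    return tanh(w2 * tanh(w1 * Dual(x, 1.0)))

# Sensitivity: bound |f'(x)| over a range of private inputs, which is
# the quantity DP noise calibration needs.
grid = np.linspace(-3.0, 3.0, 601)
bound = max(abs(f(x).dot) for x in grid)
print(f"empirical derivative bound: {bound:.4f}")
```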
arXiv Detail & Related papers (2021-07-09T07:19:23Z)
- Interpretable collaborative data analysis on distributed data [9.434133337939498]
This paper proposes an interpretable non-model-sharing collaborative data analysis method as a type of federated learning system.
By centralizing intermediate representations, which are individually constructed in each party, the proposed method obtains an interpretable model.
Numerical experiments indicate that the proposed method achieves better recognition performance for artificial and real-world problems than individual analysis.
arXiv Detail & Related papers (2020-11-09T13:59:32Z)
- A Critical Overview of Privacy-Preserving Approaches for Collaborative Forecasting [0.0]
Cooperation between different data owners may lead to an improvement in forecast quality.
Due to competitive business factors and personal data protection concerns, these data owners might be unwilling to share their data.
This paper analyses the state-of-the-art and unveils several shortcomings of existing methods in guaranteeing data privacy.
arXiv Detail & Related papers (2020-04-20T20:21:04Z)