Data Collaboration Analysis Over Matrix Manifolds
- URL: http://arxiv.org/abs/2403.02780v1
- Date: Tue, 5 Mar 2024 08:52:16 GMT
- Title: Data Collaboration Analysis Over Matrix Manifolds
- Authors: Keiyu Nosaka, Akiko Yoshise
- Abstract summary: Privacy-Preserving Machine Learning (PPML) addresses this challenge by safeguarding sensitive information.
NRI-DC framework emerges as an innovative approach, potentially resolving the 'data island' issue among institutions.
This study establishes a rigorous theoretical foundation for these collaboration functions and introduces new formulations.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The effectiveness of machine learning (ML) algorithms is deeply intertwined
with the quality and diversity of their training datasets. Improved datasets,
marked by superior quality, enhance the predictive accuracy and broaden the
applicability of models across varied scenarios. Researchers often integrate
data from multiple sources to mitigate biases and limitations of single-source
datasets. However, this extensive data amalgamation raises significant ethical
concerns, particularly regarding user privacy and the risk of unauthorized data
disclosure. Various global legislative frameworks have been established to
address these privacy issues. While crucial for safeguarding privacy, these
regulations can complicate the practical deployment of ML technologies.
Privacy-Preserving Machine Learning (PPML) addresses this challenge by
safeguarding sensitive information, from health records to geolocation data,
while enabling the secure use of this data in developing robust ML models.
Within this realm, the Non-Readily Identifiable Data Collaboration (NRI-DC)
framework emerges as an innovative approach, potentially resolving the 'data
island' issue among institutions through non-iterative communication and robust
privacy protections. However, in its current state, the NRI-DC framework faces
model performance instability due to theoretical unsteadiness in creating
collaboration functions. This study establishes a rigorous theoretical
foundation for these collaboration functions and introduces new formulations
through optimization problems on matrix manifolds and efficient solutions.
Empirical analyses demonstrate that the proposed approach, particularly the
formulation over orthogonal matrix manifolds, significantly enhances
performance, maintaining consistency and efficiency without compromising
communication efficiency or privacy protections.
Related papers
- The Data Minimization Principle in Machine Learning [61.17813282782266]
Data minimization aims to reduce the amount of data collected, processed or retained.
It has been endorsed by various global data protection regulations.
However, its practical implementation remains a challenge due to the lack of a rigorous formulation.
arXiv Detail & Related papers (2024-05-29T19:40:27Z) - Synergizing Privacy and Utility in Data Analytics Through Advanced Information Theorization [2.28438857884398]
We introduce three sophisticated algorithms: a Noise-Infusion Technique tailored for high-dimensional image data, a Variational Autoencoder (VAE) for robust feature extraction and an Expectation Maximization (EM) approach optimized for structured data privacy.
Our methods significantly reduce mutual information between sensitive attributes and transformed data, thereby enhancing privacy.
The research contributes to the field by providing a flexible and effective strategy for deploying privacy-preserving algorithms across various data types.
arXiv Detail & Related papers (2024-04-24T22:58:42Z) - New Solutions Based on the Generalized Eigenvalue Problem for the Data Collaboration Analysis [10.58933333850352]
Data Collaboration Analysis (DCA) is noted for its efficiency in terms of computational cost and communication load.
Existing optimization problems for determining the necessary collaborative functions have faced challenges.
This research addresses these issues by formulating the optimization problem through the segmentation of matrices into column vectors.
arXiv Detail & Related papers (2024-04-22T13:26:42Z) - State-of-the-Art Approaches to Enhancing Privacy Preservation of Machine Learning Datasets: A Survey [0.0]
This paper examines the evolving landscape of machine learning (ML) and its profound impact across various sectors.
It focuses on the emerging field of Privacy-preserving Machine Learning (PPML)
As ML applications become increasingly integral to industries like telecommunications, financial technology, and surveillance, they raise significant privacy concerns.
arXiv Detail & Related papers (2024-02-25T17:31:06Z) - Privacy-preserving Federated Primal-dual Learning for Non-convex and Non-smooth Problems with Model Sparsification [51.04894019092156]
Federated learning (FL) has been recognized as a rapidly growing area, where the model is trained over clients under the FL orchestration (PS)
In this paper, we propose a novel primal sparification algorithm for and guarantee non-smooth FL problems.
Its unique insightful properties and its analyses are also presented.
arXiv Detail & Related papers (2023-10-30T14:15:47Z) - A Unified View of Differentially Private Deep Generative Modeling [60.72161965018005]
Data with privacy concerns comes with stringent regulations that frequently prohibited data access and data sharing.
Overcoming these obstacles is key for technological progress in many real-world application scenarios that involve privacy sensitive data.
Differentially private (DP) data publishing provides a compelling solution, where only a sanitized form of the data is publicly released.
arXiv Detail & Related papers (2023-09-27T14:38:16Z) - Auditing and Generating Synthetic Data with Controllable Trust Trade-offs [54.262044436203965]
We introduce a holistic auditing framework that comprehensively evaluates synthetic datasets and AI models.
It focuses on preventing bias and discrimination, ensures fidelity to the source data, assesses utility, robustness, and privacy preservation.
We demonstrate the framework's effectiveness by auditing various generative models across diverse use cases.
arXiv Detail & Related papers (2023-04-21T09:03:18Z) - Decentralized Stochastic Optimization with Inherent Privacy Protection [103.62463469366557]
Decentralized optimization is the basic building block of modern collaborative machine learning, distributed estimation and control, and large-scale sensing.
Since involved data, privacy protection has become an increasingly pressing need in the implementation of decentralized optimization algorithms.
arXiv Detail & Related papers (2022-05-08T14:38:23Z) - Efficient Logistic Regression with Local Differential Privacy [0.0]
Internet of Things devices are expanding rapidly and generating huge amount of data.
There is an increasing need to explore data collected from these devices.
Collaborative learning provides a strategic solution for the Internet of Things settings but also raises public concern over data privacy.
arXiv Detail & Related papers (2022-02-05T22:44:03Z) - Linear Model with Local Differential Privacy [0.225596179391365]
Privacy preserving techniques have been widely studied to analyze distributed data across different agencies.
Secure multiparty computation has been widely studied for privacy protection with high privacy level but intense cost.
matrix masking technique is applied to encrypt data such that the secure schemes are against malicious adversaries.
arXiv Detail & Related papers (2022-02-05T01:18:00Z) - Distributed Machine Learning and the Semblance of Trust [66.1227776348216]
Federated Learning (FL) allows the data owner to maintain data governance and perform model training locally without having to share their data.
FL and related techniques are often described as privacy-preserving.
We explain why this term is not appropriate and outline the risks associated with over-reliance on protocols that were not designed with formal definitions of privacy in mind.
arXiv Detail & Related papers (2021-12-21T08:44:05Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.