Privacy-Preserving Data Fusion for Traffic State Estimation: A Vertical Federated Learning Approach
- URL: http://arxiv.org/abs/2401.11836v1
- Date: Mon, 22 Jan 2024 10:52:22 GMT
- Title: Privacy-Preserving Data Fusion for Traffic State Estimation: A Vertical Federated Learning Approach
- Authors: Qiqing Wang, Kaidi Yang
- Abstract summary: We propose a privacy-preserving data fusion method for traffic state estimation (TSE). We explicitly address the data privacy concerns that arise in collaboration and data sharing between multiple data owners.
- Score: 3.109306676759862
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: This paper proposes a privacy-preserving data fusion method for traffic state
estimation (TSE). Unlike existing works that assume all data sources to be
accessible by a single trusted party, we explicitly address data privacy
concerns that arise in the collaboration and data sharing between multiple data
owners, such as municipal authorities (MAs) and mobility providers (MPs). To
this end, we propose a novel vertical federated learning (FL) approach, FedTSE,
that enables multiple data owners to collaboratively train and apply a TSE
model without having to exchange their private data. To enhance the
applicability of the proposed FedTSE in common TSE scenarios with limited
availability of ground-truth data, we further propose a privacy-preserving
physics-informed FL approach, i.e., FedTSE-PI, that integrates traffic models
into FL. Real-world data validation shows that the proposed methods can protect
privacy while yielding similar accuracy to the oracle method without privacy
considerations.
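The core idea of vertical FL described in the abstract, in which parties holding different features of the same samples train a shared model without exchanging raw data, can be sketched as follows. This is an illustrative toy example with a linear model and synthetic data, not the FedTSE algorithm itself; all variable names and the two-party setup are assumptions for the sketch.

```python
import numpy as np

# Minimal vertical FL sketch (illustrative only, NOT FedTSE): two parties
# hold different feature columns of the SAME samples; only intermediate
# outputs and output-gradients are exchanged, never the raw features.

rng = np.random.default_rng(0)

# Synthetic data: party A holds 3 features, party B holds 2 features.
n = 200
XA = rng.normal(size=(n, 3))   # private to party A
XB = rng.normal(size=(n, 2))   # private to party B
true_w = rng.normal(size=5)
y = np.concatenate([XA, XB], axis=1) @ true_w + 0.1 * rng.normal(size=n)

wA = np.zeros(3)               # party A's local sub-model
wB = np.zeros(2)               # party B's local sub-model
lr = 0.05

for step in range(500):
    # Each party computes a partial prediction on its own features.
    zA = XA @ wA               # sent to the coordinator (not XA itself)
    zB = XB @ wB
    pred = zA + zB             # coordinator combines partial outputs
    grad_out = 2.0 * (pred - y) / n   # dLoss/dpred for squared error

    # Only the gradient w.r.t. each party's OUTPUT is sent back;
    # each party computes its own weight gradient locally.
    wA -= lr * (XA.T @ grad_out)
    wB -= lr * (XB.T @ grad_out)

mse = float(np.mean((zA + zB - y) ** 2))
print(f"final training MSE: {mse:.4f}")
```

With enough steps the combined model recovers the least-squares fit, so the final MSE approaches the noise floor of the synthetic labels, even though neither party ever sees the other's feature columns.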
Related papers
- Towards Split Learning-based Privacy-Preserving Record Linkage [49.1574468325115]
Split Learning has been introduced to facilitate applications where user data privacy is a requirement.
In this paper, we investigate the potential of Split Learning for Privacy-Preserving Record Matching.
arXiv Detail & Related papers (2024-09-02T09:17:05Z)
- Approximate Gradient Coding for Privacy-Flexible Federated Learning with Non-IID Data [9.984630251008868]
This work focuses on the challenges of non-IID data and stragglers/dropouts in federated learning.
We introduce and explore a privacy-flexible paradigm that models parts of the clients' local data as non-private.
arXiv Detail & Related papers (2024-04-04T15:29:50Z)
- FewFedPIT: Towards Privacy-preserving and Few-shot Federated Instruction Tuning [54.26614091429253]
Federated instruction tuning (FedIT) is a promising solution that consolidates collaborative training across multiple data owners.
FedIT encounters limitations such as scarcity of instructional data and risk of exposure to training data extraction attacks.
We propose FewFedPIT, designed to simultaneously enhance privacy protection and model performance of federated few-shot learning.
arXiv Detail & Related papers (2024-03-10T08:41:22Z)
- CaPS: Collaborative and Private Synthetic Data Generation from Distributed Sources [5.898893619901382]
We propose a framework for the collaborative and private generation of synthetic data from distributed data holders.
We replace the trusted aggregator with secure multi-party computation protocols and provide output privacy via differential privacy (DP).
We demonstrate the applicability and scalability of our approach for the state-of-the-art select-measure-generate algorithms MWEM+PGM and AIM.
arXiv Detail & Related papers (2024-02-13T17:26:32Z)
- Federated Learning Empowered by Generative Content [55.576885852501775]
Federated learning (FL) enables leveraging distributed private data for model training in a privacy-preserving way.
We propose a novel FL framework termed FedGC, designed to mitigate data heterogeneity issues by diversifying private data with generative content.
We conduct a systematic empirical study on FedGC, covering diverse baselines, datasets, scenarios, and modalities.
arXiv Detail & Related papers (2023-12-10T07:38:56Z)
- Using Decentralized Aggregation for Federated Learning with Differential Privacy [0.32985979395737774]
Federated Learning (FL) provides some level of privacy by retaining the data at the local node.
This research deploys an experimental environment for FL with Differential Privacy (DP) using benchmark datasets.
arXiv Detail & Related papers (2023-11-27T17:02:56Z)
- A Unified View of Differentially Private Deep Generative Modeling [60.72161965018005]
Data with privacy concerns comes with stringent regulations that frequently prohibit data access and data sharing.
Overcoming these obstacles is key for technological progress in many real-world application scenarios that involve privacy sensitive data.
Differentially private (DP) data publishing provides a compelling solution, where only a sanitized form of the data is publicly released.
arXiv Detail & Related papers (2023-09-27T14:38:16Z)
- PS-FedGAN: An Efficient Federated Learning Framework Based on Partially Shared Generative Adversarial Networks For Data Privacy [56.347786940414935]
Federated Learning (FL) has emerged as an effective learning paradigm for distributed computation.
This work proposes a novel FL framework that requires only partial GAN model sharing.
Named as PS-FedGAN, this new framework enhances the GAN releasing and training mechanism to address heterogeneous data distributions.
arXiv Detail & Related papers (2023-05-19T05:39:40Z)
- Benchmarking FedAvg and FedCurv for Image Classification Tasks [1.376408511310322]
This paper focuses on the problem of statistical heterogeneity of the data in the same federated network.
Several Federated Learning algorithms, such as FedAvg, FedProx and Federated Curvature (FedCurv) have already been proposed.
As a side product of this work, we release the non-IID versions of the datasets we used, so as to facilitate further comparisons within the FL community.
arXiv Detail & Related papers (2023-03-31T10:13:01Z)
- Distributed Machine Learning and the Semblance of Trust [66.1227776348216]
Federated Learning (FL) allows the data owner to maintain data governance and perform model training locally without having to share their data.
FL and related techniques are often described as privacy-preserving.
We explain why this term is not appropriate and outline the risks associated with over-reliance on protocols that were not designed with formal definitions of privacy in mind.
arXiv Detail & Related papers (2021-12-21T08:44:05Z)
- Data-driven Regularized Inference Privacy [33.71757542373714]
We propose a data-driven privacy-preserving inference framework to sanitize data.
We develop an inference privacy framework based on the variational method.
We present empirical methods to estimate the privacy metric.
arXiv Detail & Related papers (2020-10-10T08:42:59Z)
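Several of the entries above sanitize released values with differential privacy. The standard Laplace mechanism, which underlies many DP data-publishing schemes, can be sketched as follows. This is a generic illustration, not the mechanism of any specific paper listed here; the function name and the example query are assumptions for the sketch.

```python
import numpy as np

def laplace_mechanism(true_value, sensitivity, epsilon, rng):
    """Release true_value with epsilon-DP by adding Laplace noise
    whose scale is sensitivity / epsilon."""
    scale = sensitivity / epsilon
    return true_value + rng.laplace(loc=0.0, scale=scale)

rng = np.random.default_rng(42)
# Example: privately release a count query (sensitivity 1) at epsilon = 0.5.
noisy_count = laplace_mechanism(128.0, sensitivity=1.0, epsilon=0.5, rng=rng)
print(round(noisy_count, 2))
```

Smaller epsilon means larger noise and stronger privacy; the sensitivity is the maximum change one individual's data can cause in the query answer (1 for a counting query).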
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed content (including all information) and is not responsible for any consequences.