Privacy Preserving Vertical Federated Learning for Tree-based Models
- URL: http://arxiv.org/abs/2008.06170v1
- Date: Fri, 14 Aug 2020 02:32:36 GMT
- Title: Privacy Preserving Vertical Federated Learning for Tree-based Models
- Authors: Yuncheng Wu, Shaofeng Cai, Xiaokui Xiao, Gang Chen, Beng Chin Ooi
- Abstract summary: Federated learning enables multiple organizations to jointly train a model without revealing their private data to each other.
We propose Pivot, a novel solution for privacy preserving vertical decision tree training and prediction.
- Score: 30.808567035503994
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Federated learning (FL) is an emerging paradigm that enables multiple
organizations to jointly train a model without revealing their private data to
each other. This paper studies vertical federated learning, which tackles
the scenarios where (i) collaborating organizations own data of the same set of
users but with disjoint features, and (ii) only one organization holds the
labels. We propose Pivot, a novel solution for privacy preserving vertical
decision tree training and prediction, ensuring that no intermediate
information is disclosed other than what the clients have agreed to release
(i.e., the final tree model and the prediction output). Pivot does not rely on
any trusted third party and provides protection against a semi-honest adversary
that may compromise $m-1$ out of $m$ clients. We further identify two privacy
leakages when the trained decision tree model is released in plaintext and
propose an enhanced protocol to mitigate them. The proposed solution can also
be extended to tree ensemble models, e.g., random forest (RF) and gradient
boosting decision tree (GBDT) by treating single decision trees as building
blocks. Theoretical and experimental analyses suggest that Pivot is efficient
for the privacy achieved.
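For concreteness, the following is a minimal sketch of the vertical partition studied in the paper: two clients (hypothetical names and toy values, not from the paper) hold disjoint features for the same users, and only one client holds the labels. It illustrates the data layout only; Pivot replaces the plaintext join below with cryptographic protocols so it never happens in the clear.
```python
# Minimal sketch of a vertically partitioned dataset (toy data, assumed names).
import pandas as pd

client_a = pd.DataFrame({            # client A: holds the labels
    "user_id": [1, 2, 3, 4],
    "income":  [50_000, 82_000, 61_000, 45_000],
    "default": [0, 1, 0, 0],         # label column, visible only to client A
})
client_b = pd.DataFrame({            # client B: disjoint features, no labels
    "user_id": [1, 2, 3, 4],
    "monthly_minutes": [320, 95, 410, 150],
    "late_payments":   [0, 3, 1, 2],
})

# What a trusted aggregator would see; vertical FL avoids materializing this.
print(client_a.merge(client_b, on="user_id"))
```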
Related papers
- Pseudo-Probability Unlearning: Towards Efficient and Privacy-Preserving Machine Unlearning [59.29849532966454]
We propose Pseudo-Probability Unlearning (PPU), a novel method that enables models to forget data in a privacy-preserving manner.
Our method achieves over 20% improvement in forgetting error compared to the state-of-the-art.
arXiv Detail & Related papers (2024-11-04T21:27:06Z) - A collaborative ensemble construction method for federated random forest [3.245822581039027]
This paper presents a federated random forest approach that employs a novel ensemble construction method aimed at improving performance under non-IID data.
To preserve the privacy of the client's data, we confine the information stored in the leaf nodes to the majority class label identified from the samples of the client's local data that reach each node.
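A rough illustration of this leaf-confinement idea is sketched below, using scikit-learn as a stand-in for the client's local tree (the paper's own training procedure may differ): after fitting a local tree, each leaf is reduced to the majority class label of the local samples that reach it, and only that mapping is shared.
```python
# Sketch of confining leaf information to a majority class label
# (scikit-learn used as a local stand-in, not the paper's exact procedure).
import numpy as np
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))                  # toy local data for one client
y = (X[:, 0] + X[:, 1] > 0).astype(int)

local_tree = DecisionTreeClassifier(max_depth=3).fit(X, y)
leaf_ids = local_tree.apply(X)                 # leaf index reached by each sample

# Share only {leaf_id: majority class}, not class counts or raw samples.
majority_per_leaf = {
    int(leaf): int(np.bincount(y[leaf_ids == leaf]).argmax())
    for leaf in np.unique(leaf_ids)
}
print(majority_per_leaf)
```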
arXiv Detail & Related papers (2024-07-27T07:21:45Z) - Federated Face Forgery Detection Learning with Personalized Representation [63.90408023506508]
Deep generative technology can produce high-quality fake videos that are indistinguishable from real ones, posing a serious social threat.
Traditional forgery detection methods rely on directly centralizing data for training.
The paper proposes a novel federated face forgery detection learning framework with personalized representation.
arXiv Detail & Related papers (2024-06-17T02:20:30Z) - An Interpretable Client Decision Tree Aggregation process for Federated Learning [7.8973037023478785]
We propose an Interpretable Client Decision Tree aggregation process for Federated Learning scenarios.
This model is based on aggregating multiple decision paths of the decision trees and can be used on different decision tree types, such as ID3 and CART.
We carry out experiments on four datasets, and the analysis shows that the tree built with the model improves on the local models and outperforms the state-of-the-art.
arXiv Detail & Related papers (2024-04-03T06:53:56Z) - Effective and Efficient Federated Tree Learning on Hybrid Data [80.31870543351918]
We propose HybridTree, a novel federated learning approach that enables federated tree learning on hybrid data.
We observe the existence of consistent split rules in trees and show that the knowledge of parties can be incorporated into the lower layers of a tree.
Our experiments demonstrate that HybridTree can achieve comparable accuracy to the centralized setting with low computational and communication overhead.
arXiv Detail & Related papers (2023-10-18T10:28:29Z) - Differentially-Private Decision Trees and Provable Robustness to Data Poisoning [8.649768969060647]
Decision trees are interpretable models that are well-suited to non-linear learning problems.
Current state-of-the-art algorithms for training decision trees with differential privacy sacrifice much utility for a small privacy benefit.
We propose PrivaTree, an approach based on private histograms that chooses good splits while consuming a small privacy budget.
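The sketch below is a generic illustration of choosing a split threshold from Laplace-noised histogram counts, to show how split selection can consume only a small privacy budget; it is not the PrivaTree algorithm itself, and the noise calibration here is illustrative only.
```python
# Generic sketch of split selection from differentially private histograms
# (illustrative; PrivaTree's exact mechanism and calibration differ).
import numpy as np

rng = np.random.default_rng(0)

def gini(pos, neg):
    total = max(pos + neg, 1e-9)
    return 1.0 - (pos / total) ** 2 - (neg / total) ** 2

def dp_histogram_split(x, y, bins=16, epsilon=0.5):
    edges = np.linspace(x.min(), x.max(), bins + 1)
    pos, _ = np.histogram(x[y == 1], bins=edges)          # per-bin class counts
    neg, _ = np.histogram(x[y == 0], bins=edges)
    noisy_pos = np.maximum(pos + rng.laplace(scale=1.0 / epsilon, size=bins), 0.0)
    noisy_neg = np.maximum(neg + rng.laplace(scale=1.0 / epsilon, size=bins), 0.0)

    best_score, best_edge = np.inf, edges[1]
    for i in range(1, bins):                               # candidate thresholds
        lp, ln = noisy_pos[:i].sum(), noisy_neg[:i].sum()
        rp, rn = noisy_pos[i:].sum(), noisy_neg[i:].sum()
        total = max(lp + ln + rp + rn, 1e-9)
        score = ((lp + ln) * gini(lp, ln) + (rp + rn) * gini(rp, rn)) / total
        if score < best_score:
            best_score, best_edge = score, edges[i]
    return best_edge

x = rng.normal(size=500)
y = (x + rng.normal(scale=0.5, size=500) > 0).astype(int)
print("noisy split threshold:", dp_histogram_split(x, y))
```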
arXiv Detail & Related papers (2023-05-24T17:56:18Z) - Client-specific Property Inference against Secure Aggregation in Federated Learning [52.8564467292226]
Federated learning has become a widely used paradigm for collaboratively training a common model among different participants.
Many attacks have shown that it is still possible to infer sensitive information about participants, such as membership or properties, or even to reconstruct participant data outright.
We show that simple linear models can effectively capture client-specific properties only from the aggregated model updates.
arXiv Detail & Related papers (2023-03-07T14:11:01Z) - Federated Boosted Decision Trees with Differential Privacy [24.66980518231163]
We propose a general framework that captures and extends existing approaches for differentially private decision trees.
We show that with a careful choice of techniques it is possible to achieve very high utility while maintaining strong levels of privacy.
arXiv Detail & Related papers (2022-10-06T13:28:29Z) - Fed-EINI: An Efficient and Interpretable Inference Framework for Decision Tree Ensembles in Federated Learning [11.843365055516566]
Fed-EINI is an efficient and interpretable inference framework for federated decision tree models.
We propose to protect the decision paths with an efficient additively homomorphic encryption method.
Experiments show that the inference efficiency is improved by over 50% on average.
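As a rough illustration of the additive homomorphism being relied on, here is a textbook toy Paillier scheme (tiny, insecure parameters; not Fed-EINI's implementation): multiplying ciphertexts yields an encryption of the sum, which is what lets encrypted per-node values on a decision path be aggregated without ever decrypting them.
```python
# Toy textbook Paillier (demo-sized keys, insecure). Requires Python 3.8+.
import math
import random

def keygen(p=10007, q=10009):            # tiny primes, demonstration only
    n = p * q
    g = n + 1
    lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)   # lcm(p-1, q-1)
    mu = pow(lam, -1, n)                 # valid because g = n + 1
    return (n, g), (lam, mu, n)

def encrypt(pub, m):
    n, g = pub
    r = random.randrange(1, n)
    while math.gcd(r, n) != 1:
        r = random.randrange(1, n)
    return (pow(g, m, n * n) * pow(r, n, n * n)) % (n * n)

def decrypt(priv, c):
    lam, mu, n = priv
    return ((pow(c, lam, n * n) - 1) // n) * mu % n

pub, priv = keygen()
c1, c2 = encrypt(pub, 42), encrypt(pub, 58)
c_sum = (c1 * c2) % (pub[0] ** 2)        # homomorphic addition of plaintexts
print("Dec(Enc(42) * Enc(58)) =", decrypt(priv, c_sum))   # -> 100
```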
arXiv Detail & Related papers (2021-05-20T06:40:05Z) - Growing Deep Forests Efficiently with Soft Routing and Learned Connectivity [79.83903179393164]
This paper further extends the deep forest idea in several important aspects.
We employ a probabilistic tree whose nodes make probabilistic routing decisions, a.k.a. soft routing, rather than hard binary decisions.
Experiments on the MNIST dataset demonstrate that our empowered deep forests achieve performance better than or comparable to [1] and [3].
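A minimal sketch of soft routing in a single shallow tree follows (illustrative only, not the paper's full deep-forest architecture): each internal node emits a sigmoid gate probability, and the prediction mixes every leaf's class distribution by its path probability.
```python
# Minimal soft-routing sketch: sigmoid gates at inner nodes, leaves mixed
# by path probabilities (toy parameters, not the paper's model).
import numpy as np

rng = np.random.default_rng(1)
d, depth = 4, 2                              # feature dim, tree depth
n_inner, n_leaf = 2 ** depth - 1, 2 ** depth

W = rng.normal(size=(n_inner, d))            # one gating vector per inner node
b = rng.normal(size=n_inner)
leaf_logits = rng.normal(size=(n_leaf, 3))   # 3-class logits per leaf

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

def soft_tree_predict(x):
    p_right = 1.0 / (1.0 + np.exp(-(W @ x + b)))   # soft routing decisions
    probs = np.zeros(3)
    for leaf in range(n_leaf):
        path_p, node = 1.0, 0
        for bit in range(depth - 1, -1, -1):       # walk root -> leaf
            go_right = (leaf >> bit) & 1
            path_p *= p_right[node] if go_right else 1.0 - p_right[node]
            node = 2 * node + 1 + go_right         # heap-style child index
        probs += path_p * softmax(leaf_logits[leaf])
    return probs                                   # sums to 1 over the classes

print("class distribution:", soft_tree_predict(rng.normal(size=d)))
```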
arXiv Detail & Related papers (2020-12-29T18:05:05Z) - Toward Understanding the Influence of Individual Clients in Federated Learning [52.07734799278535]
Federated learning allows clients to jointly train a global model without sending their private data to a central server.
We define a new notion called Influence, quantify this influence over the model parameters, and propose an effective and efficient model to estimate this metric.
arXiv Detail & Related papers (2020-12-20T14:34:36Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of the information it contains and is not responsible for any consequences arising from its use.