A collaborative ensemble construction method for federated random forest
- URL: http://arxiv.org/abs/2407.19193v1
- Date: Sat, 27 Jul 2024 07:21:45 GMT
- Title: A collaborative ensemble construction method for federated random forest
- Authors: Penjan Antonio Eng Lim, Cheong Hee Park
- Abstract summary: This paper presents a federated random forest approach that employs a novel ensemble construction method aimed at improving performance under non-IID data.
To preserve the privacy of the client's data, we confine the information stored in the leaf nodes to the majority class label identified from the samples of the client's local data that reach each node.
- Score: 3.245822581039027
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Random forests are considered a cornerstone in machine learning for their robustness and versatility. Despite these strengths, their conventional centralized training is ill-suited for the modern landscape of data that is often distributed, sensitive, and subject to privacy concerns. Federated learning (FL) provides a compelling solution to this problem, enabling models to be trained across a group of clients while maintaining the privacy of each client's data. However, adapting tree-based methods like random forests to federated settings introduces significant challenges, particularly when it comes to non-identically distributed (non-IID) data across clients, which is a common scenario in real-world applications. This paper presents a federated random forest approach that employs a novel ensemble construction method aimed at improving performance under non-IID data. Instead of growing trees independently in each client, our approach ensures each decision tree in the ensemble is iteratively and collectively grown across clients. To preserve the privacy of the client's data, we confine the information stored in the leaf nodes to the majority class label identified from the samples of the client's local data that reach each node. This limited disclosure preserves the confidentiality of the underlying data distribution of clients, thereby enhancing the privacy of the federated learning process. Furthermore, our collaborative ensemble construction strategy allows the ensemble to better reflect the data's heterogeneity across different clients, enhancing its performance on non-IID data, as our experimental results confirm.
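The abstract does not spell out the growth protocol, so the following is only a minimal sketch of one way a single tree could be grown collectively: clients take turns extending a shared tree with splits chosen from their own local samples, and each leaf keeps nothing but a majority class label. The round-robin schedule, the Gini split criterion, and every name here (`Node`, `grow`, `build_tree`, `depth_per_visit`) are illustrative assumptions, not the authors' algorithm.

```python
from collections import Counter
from dataclasses import dataclass
from typing import List, Optional, Tuple

Sample = Tuple[List[float], int]  # (feature vector, class label)

@dataclass
class Node:
    label: int                       # majority class label; the only data-derived value at a leaf
    feature: Optional[int] = None    # split feature index (None for a leaf)
    threshold: Optional[float] = None
    left: Optional["Node"] = None
    right: Optional["Node"] = None

    def is_leaf(self) -> bool:
        return self.feature is None

def majority(samples: List[Sample]) -> int:
    return Counter(y for _, y in samples).most_common(1)[0][0]

def gini(samples: List[Sample]) -> float:
    n = len(samples)
    return 1.0 - sum((c / n) ** 2 for c in Counter(y for _, y in samples).values())

def best_split(samples: List[Sample]) -> Optional[Tuple[int, float]]:
    """Return the (feature, threshold) that most reduces weighted Gini, or None."""
    best, best_score, n = None, gini(samples), len(samples)
    for f in range(len(samples[0][0])):
        values = sorted({x[f] for x, _ in samples})
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2
            left = [s for s in samples if s[0][f] <= t]
            right = [s for s in samples if s[0][f] > t]
            score = (len(left) * gini(left) + len(right) * gini(right)) / n
            if score < best_score - 1e-12:   # require a strict improvement
                best, best_score = (f, t), score
    return best

def grow(node: Node, samples: List[Sample], depth_budget: int) -> None:
    """One client's turn: route local samples down the tree and extend reached leaves."""
    if not samples:
        return                               # this client has no data in this region
    if node.is_leaf():
        split = best_split(samples) if depth_budget > 0 else None
        if split is None:
            return                           # locally pure or budget spent: leaf unchanged
        node.feature, node.threshold = split
        depth_budget -= 1
    left = [s for s in samples if s[0][node.feature] <= node.threshold]
    right = [s for s in samples if s[0][node.feature] > node.threshold]
    if node.left is None:                    # freshly split: children store labels only
        node.left, node.right = Node(majority(left)), Node(majority(right))
    grow(node.left, left, depth_budget)
    grow(node.right, right, depth_budget)

def build_tree(clients: List[List[Sample]], rounds: int = 2, depth_per_visit: int = 1) -> Node:
    """Assumed schedule: the shared tree visits each client in turn for several rounds."""
    root = Node(label=majority(clients[0]))
    for _ in range(rounds):
        for local in clients:
            grow(root, local, depth_per_visit)
    return root

def predict(node: Node, x: List[float]) -> int:
    while not node.is_leaf():
        node = node.left if x[node.feature] <= node.threshold else node.right
    return node.label

# Toy non-IID setup: client A only sees classes 0 and 1, client B only classes 1 and 2.
client_a = [([0.0, 0.0], 0), ([1.0, 0.0], 0), ([4.0, 0.0], 1), ([5.0, 0.0], 1)]
client_b = [([4.0, 0.0], 1), ([5.0, 1.0], 1), ([4.0, 5.0], 2), ([5.0, 6.0], 2)]
tree = build_tree([client_a, client_b])
print([predict(tree, x) for x in ([0.5, 0.0], [4.5, 0.5], [4.5, 5.5])])  # [0, 1, 2]
```

In this toy run neither client sees all three classes, yet the shared tree separates them because client B refines a leaf that client A created. A full forest would repeat the procedure for many trees, and a real protocol would also need rules this sketch omits, for example what to do when a client's pure local samples disagree with an existing leaf label.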
Related papers
- Decoupled Subgraph Federated Learning [57.588938805581044]
We address the challenge of federated learning on graph-structured data distributed across multiple clients.
We present a novel framework for this scenario, named FedStruct, that harnesses deep structural dependencies.
We validate the effectiveness of FedStruct through experimental results conducted on six datasets for semi-supervised node classification.
arXiv Detail & Related papers (2024-02-29T13:47:23Z)
- DCFL: Non-IID awareness Data Condensation aided Federated Learning [0.8158530638728501]
Federated learning is a decentralized learning paradigm wherein a central server trains a global model iteratively by utilizing clients who possess a certain amount of private datasets.
The challenge lies in the fact that the client side private data may not be identically and independently distributed.
We propose DCFL which divides clients into groups by using the Centered Kernel Alignment (CKA) method, then uses dataset condensation methods with non-IID awareness to complete clients.
arXiv Detail & Related papers (2023-12-21T13:04:24Z)
- Personalized Privacy-Preserving Framework for Cross-Silo Federated Learning [0.0]
Federated learning (FL) is a promising decentralized deep learning (DL) framework that enables DL-based approaches to be trained collaboratively across clients without sharing private data.
In this paper, we propose a novel framework, namely Personalized Privacy-Preserving Federated Learning (PPPFL).
Our proposed framework outperforms multiple FL baselines on different datasets, including MNIST, Fashion-MNIST, CIFAR-10, and CIFAR-100.
arXiv Detail & Related papers (2023-02-22T07:24:08Z)
- Knowledge-Aware Federated Active Learning with Non-IID Data [75.98707107158175]
We propose a federated active learning paradigm to efficiently learn a global model with limited annotation budget.
The main challenge faced by federated active learning is the mismatch between the active sampling goal of the global model on the server and that of the local clients.
We propose Knowledge-Aware Federated Active Learning (KAFAL), which consists of Knowledge-Specialized Active Sampling (KSAS) and Knowledge-Compensatory Federated Update (KCFU).
arXiv Detail & Related papers (2022-11-24T13:08:43Z)
- Federated Learning with GAN-based Data Synthesis for Non-IID Clients [8.304185807036783]
Federated learning (FL) has recently emerged as a popular privacy-preserving collaborative learning paradigm.
We propose a novel framework, named Synthetic Data Aided Federated Learning (SDA-FL), to resolve this non-IID challenge by sharing synthetic data.
arXiv Detail & Related papers (2022-06-11T11:43:25Z)
- Straggler-Resilient Personalized Federated Learning [55.54344312542944]
Federated learning allows training models from samples distributed across a large network of clients while respecting privacy and communication restrictions.
We develop a novel algorithmic procedure with theoretical speedup guarantees that simultaneously handles two of these hurdles.
Our method relies on ideas from representation learning theory to find a global common representation using all clients' data and learn a user-specific set of parameters leading to a personalized solution for each client.
arXiv Detail & Related papers (2022-06-05T01:14:46Z)
- Federated Learning in Non-IID Settings Aided by Differentially Private Synthetic Data [20.757477553095637]
Federated learning (FL) is a privacy-promoting framework that enables clients to collaboratively train machine learning models.
A major challenge in federated learning arises when the local data is heterogeneous.
We propose FedDPMS, an FL algorithm in which clients deploy variational auto-encoders to augment local datasets with data synthesized using differentially private means of latent data representations.
arXiv Detail & Related papers (2022-06-01T18:00:48Z)
- FedDC: Federated Learning with Non-IID Data via Local Drift Decoupling and Correction [48.85303253333453]
Federated learning (FL) allows multiple clients to collectively train a high-performance global model without sharing their private data.
We propose a novel federated learning algorithm with local drift decoupling and correction (FedDC).
FedDC introduces only lightweight modifications in the local training phase, in which each client uses an auxiliary local drift variable to track the gap between the local model parameters and the global model parameters.
Experimental results and analysis demonstrate that FedDC yields faster convergence and better performance on various image classification tasks.
arXiv Detail & Related papers (2022-03-22T14:06:26Z)
- IFedAvg: Interpretable Data-Interoperability for Federated Learning [39.388223565330385]
In this work, we define and address low interoperability induced by underlying client data inconsistencies in federated learning for tabular data.
The proposed method, iFedAvg, builds on federated averaging, adding local element-wise affine layers to allow for a personalized and granular understanding of the collaborative learning process.
We evaluate iFedAvg using several public benchmarks and a collection of real-world datasets from the 2014-2016 West African Ebola epidemic, jointly forming the largest such dataset in the world.
arXiv Detail & Related papers (2021-07-14T09:54:00Z)
- Exploiting Shared Representations for Personalized Federated Learning [54.65133770989836]
We propose a novel federated learning framework and algorithm for learning a shared data representation across clients and unique local heads for each client.
Our algorithm harnesses the distributed computational power across clients to perform many local updates with respect to the low-dimensional local parameters for every update of the representation.
This result is of interest beyond federated learning to a broad class of problems in which we aim to learn a shared low-dimensional representation among data distributions.
arXiv Detail & Related papers (2021-02-14T05:36:25Z)
- Decentralised Learning from Independent Multi-Domain Labels for Person Re-Identification [69.29602103582782]
Deep learning has been successful for many computer vision tasks due to the availability of shared and centralised large-scale training data.
However, increasing awareness of privacy concerns poses new challenges to deep learning, especially for person re-identification (Re-ID).
We propose a novel paradigm called Federated Person Re-Identification (FedReID) to construct a generalisable global model (a central server) by simultaneously learning with multiple privacy-preserved local models (local clients).
This client-server collaborative learning process is iteratively performed under privacy control, enabling FedReID to realise decentralised learning without sharing distributed data nor collecting any
arXiv Detail & Related papers (2020-06-07T13:32:33Z)
This list is automatically generated from the titles and abstracts of the papers in this site.