Principled Federated Random Forests for Heterogeneous Data
- URL: http://arxiv.org/abs/2602.03258v1
- Date: Tue, 03 Feb 2026 08:41:59 GMT
- Title: Principled Federated Random Forests for Heterogeneous Data
- Authors: Rémi Khellaf, Erwan Scornet, Aurélien Bellet, Julie Josse
- Abstract summary: We propose FedForest, a new federated Random Forest algorithm for horizontally partitioned data. We prove that our splitting procedure, based on aggregating carefully chosen client statistics, closely approximates the split selected by a centralized algorithm. We also show that FedForest allows splits on client indicators, enabling a non-parametric form of personalization.
- Score: 19.544248463207612
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Random Forests (RF) are among the most powerful and widely used predictive models for centralized tabular data, yet few methods exist to adapt them to the federated learning setting. Unlike most federated learning approaches, the piecewise-constant nature of RF prevents exact gradient-based optimization. As a result, existing federated RF implementations rely on unprincipled heuristics: for instance, aggregating decision trees trained independently on clients fails to optimize the global impurity criterion, even under simple distribution shifts. We propose FedForest, a new federated RF algorithm for horizontally partitioned data that naturally accommodates diverse forms of client data heterogeneity, from covariate shift to more complex outcome shift mechanisms. We prove that our splitting procedure, based on aggregating carefully chosen client statistics, closely approximates the split selected by a centralized algorithm. Moreover, FedForest allows splits on client indicators, enabling a non-parametric form of personalization that is absent from prior federated random forest methods. Empirically, we demonstrate that the resulting federated forests closely match centralized performance across heterogeneous benchmarks while remaining communication-efficient.
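The abstract's idea of choosing a split from aggregated client statistics can be illustrated with a minimal sketch. This is not the FedForest procedure itself, since the abstract does not say which statistics are exchanged. It assumes a regression tree with squared-error impurity, a single feature, and a candidate threshold grid shared by all clients; the function names (`client_stats`, `federated_split`, `centralized_split`) are hypothetical.

```python
import numpy as np

def client_stats(x, y, thresholds):
    """Client side: count, sum and sum of squares of y for x <= t, plus totals."""
    left = x[None, :] <= thresholds[:, None]            # shape (T, n)
    per_threshold = np.stack([
        left.sum(axis=1).astype(float),                 # n_left
        (left * y[None, :]).sum(axis=1),                # sum of y on the left
        (left * y[None, :] ** 2).sum(axis=1),           # sum of y^2 on the left
    ])
    totals = np.array([float(len(y)), y.sum(), (y ** 2).sum()])
    return per_threshold, totals

def sse(n, s, q):
    """Sum of squared errors around the mean, from count, sum, sum of squares."""
    return q - s ** 2 / np.maximum(n, 1.0)

def federated_split(stats_list, thresholds):
    """Server side: add up client statistics and minimise the global impurity."""
    left = sum(s[0] for s in stats_list)                # shape (3, T)
    total = sum(s[1] for s in stats_list)               # shape (3,)
    right = total[:, None] - left
    cost = sse(*left) + sse(*right)
    valid = (left[0] > 0) & (right[0] > 0)              # skip empty children
    return thresholds[int(np.argmin(np.where(valid, cost, np.inf)))]

def centralized_split(x, y, thresholds):
    """Reference: the same criterion computed directly on pooled raw data."""
    best_t, best_cost = None, np.inf
    for t in thresholds:
        yl, yr = y[x <= t], y[x > t]
        if len(yl) == 0 or len(yr) == 0:
            continue
        cost = len(yl) * yl.var() + len(yr) * yr.var()
        if cost < best_cost:
            best_t, best_cost = t, cost
    return best_t

# Three simulated clients with covariate shift (different ranges of x).
rng = np.random.default_rng(0)
thresholds = np.linspace(0.0, 1.0, 21)
clients = []
for low, high in [(0.0, 0.4), (0.3, 0.8), (0.6, 1.0)]:
    x = rng.uniform(low, high, size=100)
    y = (x > 0.5).astype(float) + rng.normal(0.0, 0.05, size=100)
    clients.append((x, y))

t_fed = federated_split([client_stats(x, y, thresholds) for x, y in clients], thresholds)
x_all = np.concatenate([x for x, _ in clients])
y_all = np.concatenate([y for _, y in clients])
t_central = centralized_split(x_all, y_all, thresholds)
print(t_fed, t_central)
```

In this toy setting, counts, sums, and sums of squares are sufficient for the squared-error criterion, so the server can evaluate the global impurity on the shared grid without seeing raw data. By contrast, a tree trained on the first client alone never observes the jump at 0.5.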
Related papers
- Federated Learning Meets LLMs: Feature Extraction From Heterogeneous Clients [0.0]
Federated learning (FL) enables collaborative model training without sharing raw data. We propose FedLLM-Align, a framework that leverages pre-trained large language models (LLMs) as universal feature extractors. We evaluate FedLLM-Align on coronary heart disease prediction using partitioned datasets with simulated schema divergence.
arXiv Detail & Related papers (2025-09-29T14:06:52Z) - FedDuA: Doubly Adaptive Federated Learning [2.6108066206600555]
Federated learning is a distributed learning framework where clients collaboratively train a global model without sharing their raw data. We formalize the central server optimization procedure through the lens of mirror descent and propose a novel framework, called FedDuA. We prove that our proposed doubly adaptive step-size rule is minimax optimal and provide a convergence analysis for convex objectives.
arXiv Detail & Related papers (2025-05-16T11:15:27Z) - Client-Centric Federated Adaptive Optimization [78.30827455292827]
Federated Learning (FL) is a distributed learning paradigm where clients collaboratively train a model while keeping their own data private. We propose Federated-Centric Adaptive Optimization, which is a class of novel federated optimization approaches.
arXiv Detail & Related papers (2025-01-17T04:00:50Z) - Decentralized Directed Collaboration for Personalized Federated Learning [39.29794569421094]
We concentrate on Decentralized Personalized Federated Learning (DPFL), which performs distributed model training.
We propose a directed collaboration framework by incorporating Decentralized Federated Partial Gradient PedGP.
arXiv Detail & Related papers (2024-05-28T06:52:19Z) - Personalized Federated Learning under Mixture of Distributions [98.25444470990107]
We propose a novel approach to Personalized Federated Learning (PFL), which utilizes Gaussian mixture models (GMM) to fit the input data distributions across diverse clients.
FedGMM possesses an additional advantage of adapting to new clients with minimal overhead, and it also enables uncertainty quantification.
Empirical evaluations on synthetic and benchmark datasets demonstrate the superior performance of our method in both PFL classification and novel sample detection.
arXiv Detail & Related papers (2023-05-01T20:04:46Z) - Adaptive Federated Learning via New Entropy Approach [14.595709494370372]
Federated Learning (FL) has emerged as a prominent distributed machine learning framework.
In this paper, we propose an adaptive FEDerated learning algorithm based on ENTropy theory (FedEnt) to alleviate the parameter deviation among heterogeneous clients.
arXiv Detail & Related papers (2023-03-27T07:57:04Z) - FedSkip: Combatting Statistical Heterogeneity with Federated Skip Aggregation [95.85026305874824]
We introduce a data-driven approach called FedSkip to improve the client optima by periodically skipping federated averaging and scattering local models to the cross devices.
We conduct extensive experiments on a range of datasets to demonstrate that FedSkip achieves much higher accuracy, better aggregation efficiency and competitive communication efficiency.
arXiv Detail & Related papers (2022-12-14T13:57:01Z) - Beyond ADMM: A Unified Client-variance-reduced Adaptive Federated Learning Framework [82.36466358313025]
We propose a primal-dual FL algorithm, termed FedVRA, that allows one to adaptively control the variance-reduction level and bias of the global model.
Experiments based on (semi-supervised) image classification tasks demonstrate superiority of FedVRA over the existing schemes.
arXiv Detail & Related papers (2022-12-03T03:27:51Z) - Fed-CBS: A Heterogeneity-Aware Client Sampling Mechanism for Federated Learning via Class-Imbalance Reduction [76.26710990597498]
We show that the class-imbalance of the grouped data from randomly selected clients can lead to significant performance degradation.
Based on our key observation, we design an efficient client sampling mechanism, i.e., Federated Class-balanced Sampling (Fed-CBS).
In particular, we propose a measure of class-imbalance and then employ homomorphic encryption to derive this measure in a privacy-preserving way.
arXiv Detail & Related papers (2022-09-30T05:42:56Z) - Local Learning Matters: Rethinking Data Heterogeneity in Federated Learning [61.488646649045215]
Federated learning (FL) is a promising strategy for performing privacy-preserving, distributed learning with a network of clients (i.e., edge devices).
arXiv Detail & Related papers (2021-11-28T19:03:39Z) - Robustness and Personalization in Federated Learning: A Unified Approach via Regularization [4.7234844467506605]
We present a class of methods for robust, personalized federated learning, called Fed+.
The principal advantage of Fed+ is to better accommodate the real-world characteristics found in federated training.
We demonstrate the benefits of Fed+ through extensive experiments on benchmark datasets.
arXiv Detail & Related papers (2020-09-14T10:04:30Z)