When to Stop Federated Learning: Zero-Shot Generation of Synthetic Validation Data with Generative AI for Early Stopping
- URL: http://arxiv.org/abs/2511.11208v1
- Date: Fri, 14 Nov 2025 12:07:32 GMT
- Authors: Youngjoon Lee, Hyukjoon Lee, Jinu Gong, Yang Cao, Joonhyuk Kang
- Abstract summary: Federated Learning (FL) enables collaborative model training across decentralized devices. We introduce a zero-shot synthetic validation framework that leverages generative AI to monitor model performance and determine early stopping points. Our approach adaptively stops training near the optimal round, thereby conserving computational resources.
- Score: 5.0740578889286105
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Federated Learning (FL) enables collaborative model training across decentralized devices while preserving data privacy. However, FL methods typically run for a predefined number of global rounds, often leading to unnecessary computation when optimal performance is reached earlier. In addition, training may continue even when the model fails to achieve meaningful performance. To address this inefficiency, we introduce a zero-shot synthetic validation framework that leverages generative AI to monitor model performance and determine early stopping points. Our approach adaptively stops training near the optimal round, thereby conserving computational resources and enabling rapid hyperparameter adjustments. Numerical results on multi-label chest X-ray classification demonstrate that our method reduces training rounds by up to 74% while maintaining accuracy within 1% of the optimum.
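To make the idea concrete, here is a minimal, self-contained sketch of a plateau-based stopping rule evaluated on a synthetic validation set. Everything in it is a toy stand-in rather than the authors' implementation: the model is a single float, clients pull it toward their own optima, and `synthetic_val_score` substitutes for scoring generated chest X-rays; the `patience` and `max_rounds` values are illustrative assumptions.

```python
import random

def local_update(model, client_optimum):
    # Toy client step: nudge the model toward this client's local optimum.
    return model + 0.1 * (client_optimum - model) + random.gauss(0, 0.01)

def aggregate(updates):
    # FedAvg-style server aggregation: plain average of the client models.
    return sum(updates) / len(updates)

def synthetic_val_score(model, target=1.0):
    # Stand-in for evaluating on generated validation data (e.g., mean AUROC
    # on synthetic chest X-rays); peaks when the model reaches `target`.
    return -(model - target) ** 2

def train(max_rounds=200, patience=5):
    random.seed(0)
    client_optima = [0.8, 1.0, 1.2]  # heterogeneous clients, in toy form
    model, best_score, best_model, stale = 0.0, float("-inf"), 0.0, 0
    for rnd in range(1, max_rounds + 1):
        # One federated round: local updates on every client, then aggregate.
        model = aggregate([local_update(model, opt) for opt in client_optima])
        score = synthetic_val_score(model)
        if score > best_score:
            best_score, best_model, stale = score, model, 0
        else:
            stale += 1  # no improvement on the synthetic validation set
        if stale >= patience:
            break  # plateau reached: stop well before max_rounds
    return best_model, rnd

if __name__ == "__main__":
    model, stopped_at = train()
    print(f"stopped at round {stopped_at} with model {model:.3f}")
```

Because the score is computed entirely server-side on synthetic data, no real validation set has to be collected from the clients, which is the property the paper exploits.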
Related papers
- Beyond Fixed Rounds: Data-Free Early Stopping for Practical Federated Learning [4.643684319119214]
Federated Learning (FL) facilitates decentralized collaborative learning without transmitting raw data. We propose a data-free early stopping framework that determines the optimal stopping point by monitoring the task vector's growth rate using only server-side parameters.
arXiv Detail & Related papers (2026-01-30T07:42:13Z)
- FLOP-Efficient Training: Early Stopping Based on Test-Time Compute Awareness [5.2612663135589175]
Scaling training compute, measured in FLOPs, has long been shown to improve the accuracy of large language models. We introduce TTC-aware training, where an intermediate checkpoint and a corresponding TTC configuration can together match or exceed the accuracy of a fully trained model. Building on this insight, we propose an early stopping algorithm that jointly selects a checkpoint and TTC configuration to minimize training compute without sacrificing accuracy.
arXiv Detail & Related papers (2026-01-04T02:33:30Z)
- FedEL: Federated Elastic Learning for Heterogeneous Devices [14.499606660793239]
Federated learning (FL) enables distributed devices to collaboratively train machine learning models while maintaining data privacy. Existing solutions such as client selection, asynchronous FL, and partial training partially address the challenges of heterogeneous devices, but encounter issues such as reduced accuracy, stale updates, and compromised model performance due to inconsistent training contributions. We propose FedEL, a federated elastic learning framework that enhances training efficiency while maintaining model accuracy.
arXiv Detail & Related papers (2025-09-21T03:25:46Z)
- Instance-dependent Early Stopping [57.912273923450726]
We propose an Instance-dependent Early Stopping (IES) method that adapts the early stopping mechanism from the entire training set to the instance level. IES considers an instance as mastered if the second-order differences of its loss value remain within a small range around zero (a minimal sketch of this test appears after this list). IES can reduce backpropagation instances by 10%-50% while maintaining or even slightly improving the test accuracy and transfer learning performance of a model.
arXiv Detail & Related papers (2025-02-11T13:34:09Z)
- Feasible Learning [78.6167929413604]
We introduce Feasible Learning (FL), a sample-centric learning paradigm where models are trained by solving a feasibility problem that bounds the loss for each training sample. Our empirical analysis, spanning image classification, age regression, and preference optimization in large language models, demonstrates that models trained via FL can learn from data while displaying improved tail behavior compared to ERM, with only a marginal impact on average performance.
arXiv Detail & Related papers (2025-01-24T20:39:38Z)
- The Journey Matters: Average Parameter Count over Pre-training Unifies Sparse and Dense Scaling Laws [51.608402959163925]
We present the first systematic exploration of optimal sparse pre-training configurations for large language models. We find that initiating pruning at 25% of total training compute and concluding at 75% achieves near-optimal final evaluation loss. We propose a new scaling law that modifies the Chinchilla scaling law to use the average parameter count over pre-training.
arXiv Detail & Related papers (2025-01-21T20:23:22Z)
- Self-Contrastive Forward-Forward Algorithm [3.1361717406527667]
The Forward-Forward (FF) algorithm relies on feedforward operations to optimize layer-wise objectives. FF has failed to reach state-of-the-art performance on most standard benchmark tasks. We propose the Self-Contrastive Forward-Forward (SCFF) algorithm, a competitive training method aimed at closing this performance gap.
arXiv Detail & Related papers (2024-09-17T22:58:20Z)
- Fast-Convergent Federated Learning via Cyclic Aggregation [10.658882342481542]
Federated learning (FL) aims at optimizing a shared global model over multiple edge devices without transmitting (private) data to the central server.
This paper uses a cyclic learning rate at the server side to reduce the number of training iterations while improving performance.
Numerical results validate that simply plugging the proposed cyclic aggregation into existing FL algorithms effectively reduces the number of training iterations while improving performance.
arXiv Detail & Related papers (2022-10-29T07:20:59Z)
- FINETUNA: Fine-tuning Accelerated Molecular Simulations [5.543169726358164]
We present an online active learning framework for accelerating the simulation of atomic systems efficiently and accurately.
Experiments on 30 benchmark adsorbate-catalyst systems show that transfer learning, which incorporates prior information from pre-trained models, accelerates simulations by reducing the number of DFT calculations by 91%.
arXiv Detail & Related papers (2022-05-02T21:36:01Z)
- A Simple Fine-tuning Is All You Need: Towards Robust Deep Learning Via Adversarial Fine-tuning [90.44219200633286]
We propose a simple yet very effective adversarial fine-tuning approach based on a *slow start, fast decay* learning rate scheduling strategy.
Experimental results show that the proposed adversarial fine-tuning approach outperforms the state-of-the-art methods on CIFAR-10, CIFAR-100 and ImageNet datasets.
arXiv Detail & Related papers (2020-12-25T20:50:15Z)
- Predicting Training Time Without Training [120.92623395389255]
We tackle the problem of predicting the number of optimization steps that a pre-trained deep network needs to converge to a given value of the loss function.
We leverage the fact that the training dynamics of a deep network during fine-tuning are well approximated by those of a linearized model.
We are able to predict the time it takes to fine-tune a model to a given loss without having to perform any training.
arXiv Detail & Related papers (2020-08-28T04:29:54Z)
- Extrapolation for Large-batch Training in Deep Learning [72.61259487233214]
We show that a host of such variations can be covered by a unified framework that we propose.
We prove the convergence of this novel scheme and rigorously evaluate its empirical performance on ResNet, LSTM, and Transformer.
arXiv Detail & Related papers (2020-06-10T08:22:41Z)
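The mastery test from the Instance-dependent Early Stopping entry above is concrete enough to sketch. The snippet below is an illustration under assumed parameters, not the paper's code: `eps`, `window`, and the function name are chosen for readability.

```python
# Sketch of an IES-style mastery test: an instance counts as mastered once
# the second-order differences of its per-epoch loss stay near zero.
def is_mastered(loss_history, eps=1e-3, window=3):
    """loss_history: recorded per-epoch losses for ONE training instance."""
    if len(loss_history) < window + 2:
        return False  # need window + 2 points to form `window` second diffs
    recent = loss_history[-(window + 2):]
    first = [b - a for a, b in zip(recent, recent[1:])]   # first differences
    second = [b - a for a, b in zip(first, first[1:])]    # second differences
    return all(abs(d) <= eps for d in second)

# A loss curve whose curvature has flattened out: mastered.
print(is_mastered([0.9, 0.5, 0.3, 0.2005, 0.2003, 0.2002, 0.2001, 0.2001]))  # True
# A curve that is still bending sharply: not yet mastered.
print(is_mastered([0.9, 0.5, 0.3, 0.2, 0.15, 0.12, 0.1, 0.09]))  # False
```

Instances that pass such a test would be excluded from further backpropagation, which is how IES reduces backpropagated instances while leaving the rest of training untouched.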