Learn2Mix: Training Neural Networks Using Adaptive Data Integration
- URL: http://arxiv.org/abs/2412.16482v2
- Date: Thu, 13 Feb 2025 21:25:23 GMT
- Title: Learn2Mix: Training Neural Networks Using Adaptive Data Integration
- Authors: Shyam Venkatasubramanian, Vahid Tarokh,
- Abstract summary: learn2mix is a novel training strategy that adaptively adjusts class proportions within batches, focusing on classes with higher error rates.<n> Empirical evaluations conducted on benchmark datasets show that neural networks trained with learn2mix converge faster than those trained with existing approaches.
- Score: 24.082008483056462
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Accelerating model convergence within resource-constrained environments is critical to ensure fast and efficient neural network training. This work presents learn2mix, a novel training strategy that adaptively adjusts class proportions within batches, focusing on classes with higher error rates. Unlike classical training methods that use static class proportions, learn2mix continually adapts class proportions during training, leading to faster convergence. Empirical evaluations conducted on benchmark datasets show that neural networks trained with learn2mix converge faster than those trained with existing approaches, achieving improved results for classification, regression, and reconstruction tasks under limited training resources and with imbalanced classes. Our empirical findings are supported by theoretical analysis.
Related papers
- OmniLearn: A Framework for Distributed Deep Learning over Heterogeneous Clusters [1.4131700241686853]
We develop an adaptive batch-scaling framework called OmniLearn to mitigate the effects of heterogeneous resources.
Our approach is inspired by proportional controllers to balance across heterogeneous servers, and works under varying resource availability.
arXiv Detail & Related papers (2025-03-21T18:26:24Z) - Adaptive Class Emergence Training: Enhancing Neural Network Stability and Generalization through Progressive Target Evolution [0.0]
We propose a novel training methodology for neural networks in classification problems.
We evolve the target outputs from a null vector to one-hot encoded vectors throughout the training process.
This gradual transition allows the network to adapt more smoothly to the increasing complexity of the classification task.
arXiv Detail & Related papers (2024-09-04T03:25:48Z) - Stability and Generalization in Free Adversarial Training [9.831489366502302]
We analyze the interconnections between generalization and optimization in adversarial training using the algorithmic stability framework.
We compare the generalization gap of neural networks trained using the vanilla adversarial training method and the free adversarial training method.
Our empirical findings suggest that the free adversarial training method could lead to a smaller generalization gap over a similar number of training iterations.
arXiv Detail & Related papers (2024-04-13T12:07:20Z) - Diffusion-Based Neural Network Weights Generation [80.89706112736353]
D2NWG is a diffusion-based neural network weights generation technique that efficiently produces high-performing weights for transfer learning.
Our method extends generative hyper-representation learning to recast the latent diffusion paradigm for neural network weights generation.
Our approach is scalable to large architectures such as large language models (LLMs), overcoming the limitations of current parameter generation techniques.
arXiv Detail & Related papers (2024-02-28T08:34:23Z) - Simplifying Neural Network Training Under Class Imbalance [77.39968702907817]
Real-world datasets are often highly class-imbalanced, which can adversely impact the performance of deep learning models.
The majority of research on training neural networks under class imbalance has focused on specialized loss functions, sampling techniques, or two-stage training procedures.
We demonstrate that simply tuning existing components of standard deep learning pipelines, such as the batch size, data augmentation, and label smoothing, can achieve state-of-the-art performance without any such specialized class imbalance methods.
arXiv Detail & Related papers (2023-12-05T05:52:44Z) - Accurate Neural Network Pruning Requires Rethinking Sparse Optimization [87.90654868505518]
We show the impact of high sparsity on model training using the standard computer vision and natural language processing sparsity benchmarks.
We provide new approaches for mitigating this issue for both sparse pre-training of vision models and sparse fine-tuning of language models.
arXiv Detail & Related papers (2023-08-03T21:49:14Z) - TWINS: A Fine-Tuning Framework for Improved Transferability of
Adversarial Robustness and Generalization [89.54947228958494]
This paper focuses on the fine-tuning of an adversarially pre-trained model in various classification tasks.
We propose a novel statistics-based approach, Two-WIng NormliSation (TWINS) fine-tuning framework.
TWINS is shown to be effective on a wide range of image classification datasets in terms of both generalization and robustness.
arXiv Detail & Related papers (2023-03-20T14:12:55Z) - Efficient Augmentation for Imbalanced Deep Learning [8.38844520504124]
We study a convolutional neural network's internal representation of imbalanced image data.
We measure the generalization gap between a model's feature embeddings in the training and test sets, showing that the gap is wider for minority classes.
This insight enables us to design an efficient three-phase CNN training framework for imbalanced data.
arXiv Detail & Related papers (2022-07-13T09:43:17Z) - Learning Neural Network Subspaces [74.44457651546728]
Recent observations have advanced our understanding of the neural network optimization landscape.
With a similar computational cost as training one model, we learn lines, curves, and simplexes of high-accuracy neural networks.
With a similar computational cost as training one model, we learn lines, curves, and simplexes of high-accuracy neural networks.
arXiv Detail & Related papers (2021-02-20T23:26:58Z) - Training Sparse Neural Networks using Compressed Sensing [13.84396596420605]
We develop and test a novel method based on compressed sensing which combines the pruning and training into a single step.
Specifically, we utilize an adaptively weighted $ell1$ penalty on the weights during training, which we combine with a generalization of the regularized dual averaging (RDA) algorithm in order to train sparse neural networks.
arXiv Detail & Related papers (2020-08-21T19:35:54Z) - Step-Ahead Error Feedback for Distributed Training with Compressed
Gradient [99.42912552638168]
We show that a new "gradient mismatch" problem is raised by the local error feedback in centralized distributed training.
We propose two novel techniques, 1) step ahead and 2) error averaging, with rigorous theoretical analysis.
arXiv Detail & Related papers (2020-08-13T11:21:07Z) - A Hybrid Method for Training Convolutional Neural Networks [3.172761915061083]
We propose a hybrid method that uses both backpropagation and evolutionary strategies to train Convolutional Neural Networks.
We show that the proposed hybrid method is capable of improving upon regular training in the task of image classification.
arXiv Detail & Related papers (2020-04-15T17:52:48Z) - Understanding the Effects of Data Parallelism and Sparsity on Neural
Network Training [126.49572353148262]
We study two factors in neural network training: data parallelism and sparsity.
Despite their promising benefits, understanding of their effects on neural network training remains elusive.
arXiv Detail & Related papers (2020-03-25T10:49:22Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.