Balanced Training for Sparse GANs
- URL: http://arxiv.org/abs/2302.14670v2
- Date: Sat, 18 Nov 2023 17:39:52 GMT
- Title: Balanced Training for Sparse GANs
- Authors: Yite Wang, Jing Wu, Naira Hovakimyan, Ruoyu Sun
- Abstract summary: We propose a novel metric called the balance ratio (BR) to study the balance between the sparse generator and discriminator.
We also introduce a new method called balanced dynamic sparse training (ADAPT), which seeks to control the BR during GAN training to achieve a good trade-off between performance and computational cost.
- Score: 16.045866864231417
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Over the past few years, there has been growing interest in developing larger
and deeper neural networks, including deep generative models like generative
adversarial networks (GANs). However, GANs typically come with high
computational complexity, leading researchers to explore methods for reducing
the training and inference costs. One such approach gaining popularity in
supervised learning is dynamic sparse training (DST), which maintains good
performance while enjoying excellent training efficiency. Despite its potential
benefits, applying DST to GANs presents challenges due to the adversarial
nature of the training process. In this paper, we propose a novel metric called
the balance ratio (BR) to study the balance between the sparse generator and
discriminator. We also introduce a new method called balanced dynamic sparse
training (ADAPT), which seeks to control the BR during GAN training to achieve
a good trade-off between performance and computational cost. Our proposed
method shows promising results on multiple datasets, demonstrating its
effectiveness.
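As background for the abstract above: dynamic sparse training keeps the parameter budget fixed but periodically rewires which connections are active, typically by pruning low-magnitude weights and regrowing the same number elsewhere. The snippet below is a minimal, generic prune-and-regrow mask update for a single layer (e.g., in the generator), in the spirit of common DST methods; it is an illustrative sketch, not the paper's ADAPT procedure, which additionally monitors the balance ratio to decide how the sparse generator and discriminator should be adjusted.

```python
import torch

def dst_mask_update(weight: torch.Tensor,
                    mask: torch.Tensor,
                    grad: torch.Tensor,
                    prune_frac: float = 0.3) -> torch.Tensor:
    """Generic dynamic-sparse-training step for one layer (illustrative only).

    Prunes the smallest-magnitude active weights and regrows the same number
    of connections at inactive positions with the largest gradient magnitude,
    keeping the layer's overall sparsity constant.
    """
    active = mask.bool()
    n_active = int(active.sum())
    n_change = int(prune_frac * n_active)
    if n_change == 0:
        return mask

    # Prune: drop the smallest-magnitude weights among active connections.
    w_mag = weight.abs().masked_fill(~active, float("inf"))
    drop_idx = torch.topk(w_mag.flatten(), n_change, largest=False).indices

    # Grow: activate inactive positions with the largest gradient magnitude.
    g_mag = grad.abs().masked_fill(active, float("-inf"))
    grow_idx = torch.topk(g_mag.flatten(), n_change, largest=True).indices

    new_mask = mask.clone().flatten()
    new_mask[drop_idx] = 0.0
    new_mask[grow_idx] = 1.0
    return new_mask.view_as(mask)
```

In a sparse GAN, an update of this kind would be applied to generator and discriminator layers every so many iterations; per the abstract, ADAPT's contribution is to control the balance ratio while making such adjustments.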
Related papers
- Efficient Training of Deep Neural Operator Networks via Randomized Sampling [0.0]
The deep operator network (DeepONet) has demonstrated success in the real-time prediction of complex dynamics across various scientific and engineering applications.
We introduce a random sampling technique for the training of DeepONet, aimed at improving the generalization ability of the model while significantly reducing computational time.
Our results indicate that incorporating randomization in the trunk network inputs during training enhances the efficiency and robustness of DeepONet, offering a promising avenue for improving the framework's performance in modeling complex physical systems (a minimal sketch of this sampling idea appears after this list).
arXiv Detail & Related papers (2024-09-20T07:18:31Z)
- Always-Sparse Training by Growing Connections with Guided Stochastic Exploration [46.4179239171213]
We propose an efficient always-sparse training algorithm with excellent scaling to larger and sparser models.
We evaluate our method on CIFAR-10/100 and ImageNet using VGG and ViT models, and compare it against a range of sparsification methods.
arXiv Detail & Related papers (2024-01-12T21:32:04Z)
- Accurate Neural Network Pruning Requires Rethinking Sparse Optimization [87.90654868505518]
We show the impact of high sparsity on model training using the standard computer vision and natural language processing sparsity benchmarks.
We provide new approaches for mitigating this issue for both sparse pre-training of vision models and sparse fine-tuning of language models.
arXiv Detail & Related papers (2023-08-03T21:49:14Z)
- SPIDE: A Purely Spike-based Method for Training Feedback Spiking Neural Networks [56.35403810762512]
Spiking neural networks (SNNs) with event-based computation are promising brain-inspired models for energy-efficient applications on neuromorphic hardware.
We study spike-based implicit differentiation on the equilibrium state (SPIDE) that extends the recently proposed training method.
arXiv Detail & Related papers (2023-02-01T04:22:59Z)
- Intelligence Processing Units Accelerate Neuromorphic Learning [52.952192990802345]
Spiking neural networks (SNNs) have achieved orders of magnitude improvement in terms of energy consumption and latency.
We present an IPU-optimized release of our custom SNN Python package, snnTorch.
arXiv Detail & Related papers (2022-11-19T15:44:08Z)
- Don't Be So Dense: Sparse-to-Sparse GAN Training Without Sacrificing Performance [47.94567935516651]
Generative adversarial networks (GANs) have attracted surging interest since they were proposed, owing to the high quality of the data they generate.
For inference, existing model compression techniques can reduce model complexity while maintaining comparable performance.
In this paper, we explore the possibility of directly training sparse GANs from scratch without involving any dense or pre-training steps.
arXiv Detail & Related papers (2022-03-05T15:18:03Z)
- Class Balancing GAN with a Classifier in the Loop [58.29090045399214]
We introduce a novel theoretically motivated Class Balancing regularizer for training GANs.
Our regularizer makes use of the knowledge from a pre-trained classifier to ensure balanced learning of all the classes in the dataset.
We demonstrate the utility of our regularizer for learning representations of long-tailed distributions, achieving better performance than existing approaches across multiple datasets.
arXiv Detail & Related papers (2021-06-17T11:41:30Z)
- Regularizing Generative Adversarial Networks under Limited Data [88.57330330305535]
This work proposes a regularization approach for training robust GAN models on limited data.
We show a connection between the regularized loss and an f-divergence called LeCam-divergence, which we find is more robust under limited training data (a minimal sketch of this regularizer appears after this list).
arXiv Detail & Related papers (2021-04-07T17:59:06Z)
- DBS: Dynamic Batch Size For Distributed Deep Neural Network Training [19.766163856388694]
We propose the Dynamic Batch Size (DBS) strategy for the distributed training of Deep Neural Networks (DNNs).
Specifically, the performance of each worker is first evaluated using measurements from the previous epoch, and then the batch size and dataset partition are dynamically adjusted (a minimal sketch of this adjustment appears after this list).
The experimental results indicate that the proposed strategy can fully utilize the performance of the cluster, reduce the training time, and remain robust to disturbance from irrelevant tasks.
arXiv Detail & Related papers (2020-07-23T07:31:55Z)
- Distributed Training of Deep Neural Network Acoustic Models for Automatic Speech Recognition [33.032361181388886]
We provide an overview of distributed training techniques for deep neural network acoustic models for ASR.
Experiments are carried out on a popular public benchmark to study the convergence, speedup and recognition performance of the investigated strategies.
arXiv Detail & Related papers (2020-02-24T19:31:50Z)
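For the randomized-sampling entry at the top of this list ("Efficient Training of Deep Neural Operator Networks via Randomized Sampling"), the reported trunk-input randomization can be pictured as evaluating the trunk network on only a random subset of output coordinates each iteration. The sketch below is an assumed, minimal illustration of that idea; the tiny network, subset size, and MSE loss are placeholders, not the paper's setup.

```python
import torch
import torch.nn as nn

class TinyDeepONet(nn.Module):
    """Minimal DeepONet: branch encodes the input function, trunk encodes coordinates."""

    def __init__(self, n_sensors: int = 64, width: int = 64, p: int = 32):
        super().__init__()
        self.branch = nn.Sequential(nn.Linear(n_sensors, width), nn.ReLU(), nn.Linear(width, p))
        self.trunk = nn.Sequential(nn.Linear(1, width), nn.ReLU(), nn.Linear(width, p))

    def forward(self, u: torch.Tensor, y: torch.Tensor) -> torch.Tensor:
        # u: (batch, n_sensors) sampled input functions; y: (n_points, 1) coordinates.
        b = self.branch(u)   # (batch, p)
        t = self.trunk(y)    # (n_points, p)
        return b @ t.T       # (batch, n_points) predicted outputs

def training_step(model, optimizer, u, y_full, s_full, n_sub: int = 16):
    """One step with randomized trunk-input sampling (illustrative assumption):
    only a random subset of the output coordinates is used per iteration."""
    idx = torch.randperm(y_full.shape[0])[:n_sub]    # random subset of coordinates
    pred = model(u, y_full[idx])                     # evaluate the trunk only on the subset
    loss = torch.mean((pred - s_full[:, idx]) ** 2)  # MSE on the sampled points only
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```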
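The "Regularizing Generative Adversarial Networks under Limited Data" entry refers to a regularizer connected to the LeCam divergence. As I understand that work, it anchors the discriminator's current predictions to exponential moving averages of its past predictions on the opposite sample type; the snippet below is a hedged reconstruction of that idea, where the class name, decay value, and weighting are my assumptions rather than the paper's exact formulation.

```python
import torch

class LeCamRegularizer:
    """Sketch of a LeCam-style regularizer (details assumed, not the paper's exact form).

    Keeps exponential moving averages (EMAs) of the discriminator's outputs on
    real and generated samples, and penalizes the squared distance between the
    current predictions and the opposite EMA anchor, discouraging the
    discriminator from becoming over-confident when training data is limited.
    """

    def __init__(self, decay: float = 0.99):
        self.decay = decay
        self.ema_real = 0.0  # EMA of D(x) on real samples
        self.ema_fake = 0.0  # EMA of D(G(z)) on generated samples

    def __call__(self, d_real: torch.Tensor, d_fake: torch.Tensor) -> torch.Tensor:
        # Update the EMA anchors from the current batch (no gradients needed).
        with torch.no_grad():
            self.ema_real = self.decay * self.ema_real + (1 - self.decay) * d_real.mean().item()
            self.ema_fake = self.decay * self.ema_fake + (1 - self.decay) * d_fake.mean().item()

        # Penalize real predictions that drift from the fake anchor and
        # fake predictions that drift from the real anchor.
        return torch.mean((d_real - self.ema_fake) ** 2) + \
               torch.mean((d_fake - self.ema_real) ** 2)
```

A typical use would add `lambda_lc * lecam(d_real_logits, d_fake_logits)` to the discriminator loss, with `lambda_lc` a small weighting coefficient.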
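For the DBS entry, the core adjustment can be pictured as allocating each worker's next batch size in proportion to its measured throughput, so that all workers finish an iteration at roughly the same time. This is a minimal sketch under that assumption (function names and numbers are illustrative); the full DBS strategy also repartitions the dataset across workers.

```python
from typing import List

def allocate_batch_sizes(throughputs: List[float], global_batch: int) -> List[int]:
    """Split a fixed global batch across workers in proportion to their measured
    throughput (samples/sec) from the previous epoch, so faster workers get
    larger local batches and stragglers get smaller ones."""
    total = sum(throughputs)
    # Proportional share for each worker, rounded down.
    sizes = [int(global_batch * t / total) for t in throughputs]
    # Hand any rounding remainder to the fastest worker.
    sizes[max(range(len(sizes)), key=lambda i: throughputs[i])] += global_batch - sum(sizes)
    return sizes

# Example: three workers where the second one is roughly twice as fast.
print(allocate_batch_sizes([100.0, 210.0, 95.0], global_batch=256))  # -> [63, 133, 60]
```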