Balanced Training for Sparse GANs
- URL: http://arxiv.org/abs/2302.14670v2
- Date: Sat, 18 Nov 2023 17:39:52 GMT
- Title: Balanced Training for Sparse GANs
- Authors: Yite Wang, Jing Wu, Naira Hovakimyan, Ruoyu Sun
- Abstract summary: We propose a novel metric called the balance ratio (BR) to study the balance between the sparse generator and discriminator.
We also introduce a new method called balanced dynamic sparse training (ADAPT), which seeks to control the BR during GAN training to achieve a good trade-off between performance and computational cost.
- Score: 16.045866864231417
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Over the past few years, there has been growing interest in developing larger
and deeper neural networks, including deep generative models like generative
adversarial networks (GANs). However, GANs typically come with high
computational complexity, leading researchers to explore methods for reducing
the training and inference costs. One such approach gaining popularity in
supervised learning is dynamic sparse training (DST), which maintains good
performance while enjoying excellent training efficiency. Despite its potential
benefits, applying DST to GANs presents challenges due to the adversarial
nature of the training process. In this paper, we propose a novel metric called
the balance ratio (BR) to study the balance between the sparse generator and
discriminator. We also introduce a new method called balanced dynamic sparse
training (ADAPT), which seeks to control the BR during GAN training to achieve
a good trade-off between performance and computational cost. Our proposed
method shows promising results on multiple datasets, demonstrating its
effectiveness.
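As background for the abstract above: dynamic sparse training keeps the parameter budget fixed but periodically rewires which connections are active, typically by pruning low-magnitude weights and regrowing the same number elsewhere. The snippet below is a minimal, generic prune-and-regrow mask update for a single layer (e.g., in the generator), in the spirit of common DST methods; it is an illustrative sketch, not the paper's ADAPT procedure, which additionally monitors the balance ratio to decide how the sparse generator and discriminator should be adjusted.

```python
import torch

def dst_mask_update(weight: torch.Tensor,
                    mask: torch.Tensor,
                    grad: torch.Tensor,
                    prune_frac: float = 0.3) -> torch.Tensor:
    """Generic dynamic-sparse-training step for one layer (illustrative only).

    Prunes the smallest-magnitude active weights and regrows the same number
    of connections at inactive positions with the largest gradient magnitude,
    keeping the layer's overall sparsity constant.
    """
    active = mask.bool()
    n_active = int(active.sum())
    n_change = int(prune_frac * n_active)
    if n_change == 0:
        return mask

    # Prune: drop the smallest-magnitude weights among active connections.
    w_mag = weight.abs().masked_fill(~active, float("inf"))
    drop_idx = torch.topk(w_mag.flatten(), n_change, largest=False).indices

    # Grow: activate inactive positions with the largest gradient magnitude.
    g_mag = grad.abs().masked_fill(active, float("-inf"))
    grow_idx = torch.topk(g_mag.flatten(), n_change, largest=True).indices

    new_mask = mask.clone().flatten()
    new_mask[drop_idx] = 0.0
    new_mask[grow_idx] = 1.0
    return new_mask.view_as(mask)
```

In a sparse GAN, an update of this kind would be applied to generator and discriminator layers every so many iterations; per the abstract, ADAPT's contribution is to control the balance ratio while making such adjustments.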
Related papers
- Efficient Training of Deep Neural Operator Networks via Randomized Sampling [0.0]
The deep operator network (DeepONet) has demonstrated success in the real-time prediction of complex dynamics across various scientific and engineering applications.
We introduce a random sampling technique for the training of DeepONet, aimed at improving the generalization ability of the model while significantly reducing computational time.
Our results indicate that incorporating randomization in the trunk network inputs during training enhances the efficiency and robustness of DeepONet, offering a promising avenue for improving the framework's performance in modeling complex physical systems (a minimal sketch of this sampling idea appears after this list).
arXiv Detail & Related papers (2024-09-20T07:18:31Z)
- Always-Sparse Training by Growing Connections with Guided Stochastic Exploration [46.4179239171213]
We propose an efficient always-sparse training algorithm with excellent scaling to larger and sparser models.
We evaluate our method on CIFAR-10/100 and ImageNet using VGG and ViT models, and compare it against a range of sparsification methods.
arXiv Detail & Related papers (2024-01-12T21:32:04Z)
- Accurate Neural Network Pruning Requires Rethinking Sparse Optimization [87.90654868505518]
We show the impact of high sparsity on model training using the standard computer vision and natural language processing sparsity benchmarks.
We provide new approaches for mitigating this issue for both sparse pre-training of vision models and sparse fine-tuning of language models.
arXiv Detail & Related papers (2023-08-03T21:49:14Z)
- SPIDE: A Purely Spike-based Method for Training Feedback Spiking Neural Networks [56.35403810762512]
Spiking neural networks (SNNs) with event-based computation are promising brain-inspired models for energy-efficient applications on neuromorphic hardware.
We study spike-based implicit differentiation on the equilibrium state (SPIDE) that extends the recently proposed training method.
arXiv Detail & Related papers (2023-02-01T04:22:59Z)
- Intelligence Processing Units Accelerate Neuromorphic Learning [52.952192990802345]
Spiking neural networks (SNNs) have achieved orders of magnitude improvement in terms of energy consumption and latency.
We present an IPU-optimized release of our custom SNN Python package, snnTorch.
arXiv Detail & Related papers (2022-11-19T15:44:08Z)
- Don't Be So Dense: Sparse-to-Sparse GAN Training Without Sacrificing Performance [47.94567935516651]
Generative adversarial networks (GANs) have attracted surging interest since they were proposed, owing to the high quality of the data they generate.
For inference, existing model compression techniques can reduce model complexity while maintaining comparable performance.
In this paper, we explore the possibility of directly training sparse GANs from scratch without involving any dense or pre-training steps.
arXiv Detail & Related papers (2022-03-05T15:18:03Z)
- Class Balancing GAN with a Classifier in the Loop [58.29090045399214]
We introduce a novel theoretically motivated Class Balancing regularizer for training GANs.
Our regularizer makes use of the knowledge from a pre-trained classifier to ensure balanced learning of all the classes in the dataset.
We demonstrate the utility of our regularizer for learning representations of long-tailed distributions, achieving better performance than existing approaches across multiple datasets.
arXiv Detail & Related papers (2021-06-17T11:41:30Z)
- Regularizing Generative Adversarial Networks under Limited Data [88.57330330305535]
This work proposes a regularization approach for training robust GAN models on limited data.
We show a connection between the regularized loss and an f-divergence called LeCam-divergence, which we find is more robust under limited training data (a minimal sketch of this regularizer appears after this list).
arXiv Detail & Related papers (2021-04-07T17:59:06Z)
- DBS: Dynamic Batch Size For Distributed Deep Neural Network Training [19.766163856388694]
We propose the Dynamic Batch Size (DBS) strategy for the distributed training of Deep Neural Networks (DNNs).
Specifically, the performance of each worker is first evaluated using measurements from the previous epoch, and then the batch size and dataset partition are dynamically adjusted (a minimal sketch of this adjustment appears after this list).
The experimental results indicate that the proposed strategy can fully utilize the performance of the cluster, reduce the training time, and remain robust to disturbance from irrelevant tasks.
arXiv Detail & Related papers (2020-07-23T07:31:55Z)
- Distributed Training of Deep Neural Network Acoustic Models for Automatic Speech Recognition [33.032361181388886]
We provide an overview of distributed training techniques for deep neural network acoustic models for ASR.
Experiments are carried out on a popular public benchmark to study the convergence, speedup and recognition performance of the investigated strategies.
arXiv Detail & Related papers (2020-02-24T19:31:50Z)
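For the randomized-sampling entry at the top of this list ("Efficient Training of Deep Neural Operator Networks via Randomized Sampling"), the reported trunk-input randomization can be pictured as evaluating the trunk network on only a random subset of output coordinates each iteration. The sketch below is an assumed, minimal illustration of that idea; the tiny network, subset size, and MSE loss are placeholders, not the paper's setup.

```python
import torch
import torch.nn as nn

class TinyDeepONet(nn.Module):
    """Minimal DeepONet: branch encodes the input function, trunk encodes coordinates."""

    def __init__(self, n_sensors: int = 64, width: int = 64, p: int = 32):
        super().__init__()
        self.branch = nn.Sequential(nn.Linear(n_sensors, width), nn.ReLU(), nn.Linear(width, p))
        self.trunk = nn.Sequential(nn.Linear(1, width), nn.ReLU(), nn.Linear(width, p))

    def forward(self, u: torch.Tensor, y: torch.Tensor) -> torch.Tensor:
        # u: (batch, n_sensors) sampled input functions; y: (n_points, 1) coordinates.
        b = self.branch(u)   # (batch, p)
        t = self.trunk(y)    # (n_points, p)
        return b @ t.T       # (batch, n_points) predicted outputs

def training_step(model, optimizer, u, y_full, s_full, n_sub: int = 16):
    """One step with randomized trunk-input sampling (illustrative assumption):
    only a random subset of the output coordinates is used per iteration."""
    idx = torch.randperm(y_full.shape[0])[:n_sub]    # random subset of coordinates
    pred = model(u, y_full[idx])                     # evaluate the trunk only on the subset
    loss = torch.mean((pred - s_full[:, idx]) ** 2)  # MSE on the sampled points only
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```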
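The "Regularizing Generative Adversarial Networks under Limited Data" entry refers to a regularizer connected to the LeCam divergence. As I understand that work, it anchors the discriminator's current predictions to exponential moving averages of its past predictions on the opposite sample type; the snippet below is a hedged reconstruction of that idea, where the class name, decay value, and weighting are my assumptions rather than the paper's exact formulation.

```python
import torch

class LeCamRegularizer:
    """Sketch of a LeCam-style regularizer (details assumed, not the paper's exact form).

    Keeps exponential moving averages (EMAs) of the discriminator's outputs on
    real and generated samples, and penalizes the squared distance between the
    current predictions and the opposite EMA anchor, discouraging the
    discriminator from becoming over-confident when training data is limited.
    """

    def __init__(self, decay: float = 0.99):
        self.decay = decay
        self.ema_real = 0.0  # EMA of D(x) on real samples
        self.ema_fake = 0.0  # EMA of D(G(z)) on generated samples

    def __call__(self, d_real: torch.Tensor, d_fake: torch.Tensor) -> torch.Tensor:
        # Update the EMA anchors from the current batch (no gradients needed).
        with torch.no_grad():
            self.ema_real = self.decay * self.ema_real + (1 - self.decay) * d_real.mean().item()
            self.ema_fake = self.decay * self.ema_fake + (1 - self.decay) * d_fake.mean().item()

        # Penalize real predictions that drift from the fake anchor and
        # fake predictions that drift from the real anchor.
        return torch.mean((d_real - self.ema_fake) ** 2) + \
               torch.mean((d_fake - self.ema_real) ** 2)
```

A typical use would add `lambda_lc * lecam(d_real_logits, d_fake_logits)` to the discriminator loss, with `lambda_lc` a small weighting coefficient.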
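For the DBS entry, the core adjustment can be pictured as allocating each worker's next batch size in proportion to its measured throughput, so that all workers finish an iteration at roughly the same time. This is a minimal sketch under that assumption (function names and numbers are illustrative); the full DBS strategy also repartitions the dataset across workers.

```python
from typing import List

def allocate_batch_sizes(throughputs: List[float], global_batch: int) -> List[int]:
    """Split a fixed global batch across workers in proportion to their measured
    throughput (samples/sec) from the previous epoch, so faster workers get
    larger local batches and stragglers get smaller ones."""
    total = sum(throughputs)
    # Proportional share for each worker, rounded down.
    sizes = [int(global_batch * t / total) for t in throughputs]
    # Hand any rounding remainder to the fastest worker.
    sizes[max(range(len(sizes)), key=lambda i: throughputs[i])] += global_batch - sum(sizes)
    return sizes

# Example: three workers where the second one is roughly twice as fast.
print(allocate_batch_sizes([100.0, 210.0, 95.0], global_batch=256))  # -> [63, 133, 60]
```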