Mind the (optimality) Gap: A Gap-Aware Learning Rate Scheduler for
Adversarial Nets
- URL: http://arxiv.org/abs/2302.00089v1
- Date: Tue, 31 Jan 2023 20:36:40 GMT
- Title: Mind the (optimality) Gap: A Gap-Aware Learning Rate Scheduler for
Adversarial Nets
- Authors: Hussein Hazimeh, Natalia Ponomareva
- Abstract summary: Adversarial nets have proved to be powerful in various domains including generative modeling (GANs).
In this paper, we design a novel learning rate scheduler that dynamically adapts the learning rate of the adversary to maintain the right balance.
We run large-scale experiments to study the effectiveness of the scheduler on two popular applications: GANs for image generation and adversarial nets for domain adaptation.
- Score: 3.8073142980733
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Adversarial nets have proved to be powerful in various domains including
generative modeling (GANs), transfer learning, and fairness. However,
successfully training adversarial nets using first-order methods remains a
major challenge. Typically, careful choices of the learning rates are needed to
maintain the delicate balance between the competing networks. In this paper, we
design a novel learning rate scheduler that dynamically adapts the learning
rate of the adversary to maintain the right balance. The scheduler is driven by
the fact that the loss of an ideal adversarial net is a constant known a
priori. The scheduler is thus designed to keep the loss of the optimized
adversarial net close to that of an ideal network. We run large-scale
experiments to study the effectiveness of the scheduler on two popular
applications: GANs for image generation and adversarial nets for domain
adaptation. Our experiments indicate that adversarial nets trained with the
scheduler are less likely to diverge and require significantly less tuning. For
example, on CelebA, a GAN with the scheduler requires only one-tenth of the
tuning budget needed without a scheduler. Moreover, the scheduler leads to
statistically significant improvements in model quality, reaching up to $27\%$
in Frechet Inception Distance for image generation and $3\%$ in test accuracy
for domain adaptation.
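To make the mechanism concrete, below is a minimal illustrative sketch of a gap-aware scheduler as described in the abstract. For a GAN discriminator trained with the standard binary cross-entropy loss over real and fake samples, an ideal discriminator outputs 1/2 everywhere, so its loss is the known constant 2*log(2) ≈ 1.386; a scheduler can therefore lower the adversary's learning rate when the observed loss falls below that target (adversary too strong) and raise it when the loss rises above it (adversary too weak). The class name, the multiplicative update rule, and the decay/growth/clipping parameters below are assumptions for illustration, not the exact update rule from the paper.

```python
import math


class GapAwareLRScheduler:
    """Illustrative gap-aware learning rate scheduler for the adversary.

    Compares the adversary's observed loss to the loss of an ideal adversary,
    which is a known constant: for a GAN discriminator trained with binary
    cross-entropy over real and fake samples, an optimal discriminator outputs
    1/2 everywhere, giving a loss of 2 * log(2) ~= 1.386. The multiplicative
    update rule and all hyper-parameters here are assumptions, not the
    paper's exact rule.
    """

    def __init__(self, base_lr, ideal_loss=2 * math.log(2.0),
                 decay=0.9, growth=1.1, min_lr=1e-6, max_lr=1e-2):
        self.lr = base_lr
        self.ideal_loss = ideal_loss  # loss of an ideal adversary (known a priori)
        self.decay = decay            # shrink factor when the adversary is too strong
        self.growth = growth          # growth factor when the adversary is too weak
        self.min_lr = min_lr
        self.max_lr = max_lr

    def step(self, observed_adversary_loss):
        """Adjust the learning rate based on the gap to the ideal loss."""
        gap = observed_adversary_loss - self.ideal_loss
        if gap < 0:
            # Loss below the ideal constant: the adversary is ahead, slow it down.
            self.lr *= self.decay
        else:
            # Loss above the ideal constant: the adversary is lagging, speed it up.
            self.lr *= self.growth
        # Keep the learning rate within a sensible range.
        self.lr = min(max(self.lr, self.min_lr), self.max_lr)
        return self.lr
```

In a GAN training loop, one would call something like `new_lr = scheduler.step(d_loss)` after each discriminator update and pass the returned value to the discriminator's optimizer, while leaving the generator's learning rate untouched.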
Related papers
- Pruning In Time (PIT): A Lightweight Network Architecture Optimizer for
Temporal Convolutional Networks [20.943095081056857]
Temporal Convolutional Networks (TCNs) are promising Deep Learning models for time-series processing tasks.
We propose an automatic dilation optimizer, which tackles the problem as weight pruning along the time axis and learns dilation factors together with weights in a single training phase.
arXiv Detail & Related papers (2022-03-28T14:03:16Z)
- Continual Test-Time Domain Adaptation [94.51284735268597]
Test-time domain adaptation aims to adapt a source pre-trained model to a target domain without using any source data.
CoTTA is easy to implement and can be readily incorporated in off-the-shelf pre-trained models.
arXiv Detail & Related papers (2022-03-25T11:42:02Z)
- Joint inference and input optimization in equilibrium networks [68.63726855991052]
Deep equilibrium models are a class of models that forgo traditional network depth and instead compute the output of a network by finding the fixed point of a single nonlinear layer.
We show that there is a natural synergy between these two settings.
We demonstrate this strategy on various tasks such as training generative models while optimizing over latent codes, training models for inverse problems like denoising and inpainting, adversarial training and gradient based meta-learning.
arXiv Detail & Related papers (2021-11-25T19:59:33Z)
- LRTuner: A Learning Rate Tuner for Deep Neural Networks [10.913790890826785]
The choice of learning rate schedule determines the computational cost of getting close to a minimum, how close you actually get to it, and, most importantly, the kind of local minimum (wide/narrow) attained.
Current systems employ hand tuned learning rate schedules, which are painstakingly tuned for each network and dataset.
We present LRTuner, a method for tuning the learning rate schedule as training proceeds.
arXiv Detail & Related papers (2021-05-30T13:06:26Z)
- Better than the Best: Gradient-based Improper Reinforcement Learning for
Network Scheduling [60.48359567964899]
We consider the problem of scheduling in constrained queueing networks with a view to minimizing packet delay.
We use a policy gradient based reinforcement learning algorithm that produces a scheduler that performs better than the available atomic policies.
arXiv Detail & Related papers (2021-05-01T10:18:34Z)
- Smart Scheduling based on Deep Reinforcement Learning for Cellular
Networks [18.04856086228028]
We propose a smart scheduling scheme based on deep reinforcement learning (DRL).
We provide implementation-friendly designs, i.e., a scalable neural network design for the agent and a virtual environment training framework.
We show that the DRL-based smart scheduling outperforms the conventional scheduling method and can be adopted in practical systems.
arXiv Detail & Related papers (2021-03-22T02:09:16Z)
- S2-BNN: Bridging the Gap Between Self-Supervised Real and 1-bit Neural
Networks via Guided Distribution Calibration [74.5509794733707]
We present a novel guided learning paradigm that distills binary networks from real-valued networks by matching the final prediction distribution.
Our proposed method can boost the simple contrastive learning baseline by an absolute gain of 5.515% on BNNs.
Our method achieves substantial improvement over the simple contrastive learning baseline, and is even comparable to many mainstream supervised BNN methods.
arXiv Detail & Related papers (2021-02-17T18:59:28Z)
- A Simple Fine-tuning Is All You Need: Towards Robust Deep Learning Via
Adversarial Fine-tuning [90.44219200633286]
We propose a simple yet very effective adversarial fine-tuning approach based on a $\textit{slow start, fast decay}$ learning rate scheduling strategy (an illustrative sketch of such a schedule appears after this list).
Experimental results show that the proposed adversarial fine-tuning approach outperforms the state-of-the-art methods on CIFAR-10, CIFAR-100 and ImageNet datasets.
arXiv Detail & Related papers (2020-12-25T20:50:15Z)
- Side-Tuning: A Baseline for Network Adaptation via Additive Side
Networks [95.51368472949308]
Adaptation can be useful in cases when training data is scarce, or when one wishes to encode priors in the network.
In this paper, we propose a straightforward alternative: side-tuning.
arXiv Detail & Related papers (2019-12-31T18:52:32Z)
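As a further illustration, the adversarial fine-tuning entry above names a slow-start, fast-decay learning rate schedule. Below is a minimal sketch of one plausible shape for such a schedule, a linear warmup followed by exponential decay; the function name, warmup length, and decay rate are illustrative assumptions rather than the schedule actually used in that paper.

```python
def slow_start_fast_decay_lr(step, base_lr=0.01, warmup_steps=500, decay_rate=0.05):
    """Assumed slow-start, fast-decay shape: linear warmup, then exponential decay."""
    if step < warmup_steps:
        # "Slow start": ramp linearly from near 0 up to base_lr over the warmup period.
        return base_lr * (step + 1) / warmup_steps
    # "Fast decay": decay exponentially by decay_rate for every 100 post-warmup steps.
    steps_after_warmup = step - warmup_steps
    return base_lr * (1.0 - decay_rate) ** (steps_after_warmup / 100.0)
```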