Multilevel Minimization for Deep Residual Networks
- URL: http://arxiv.org/abs/2004.06196v1
- Date: Mon, 13 Apr 2020 20:52:26 GMT
- Title: Multilevel Minimization for Deep Residual Networks
- Authors: Lisa Gaedke-Merzhäuser, Alena Kopaničáková and Rolf Krause
- Abstract summary: We present a new multilevel minimization framework for the training of deep residual networks (ResNets)
Our framework is based on the dynamical systems viewpoint, which formulates a ResNet as the discretization of an initial value problem.
By design, our framework is independent of the training strategy chosen on each level of the multilevel hierarchy.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We present a new multilevel minimization framework for the training of deep
residual networks (ResNets), which has the potential to significantly reduce
training time and effort. Our framework is based on the dynamical systems
viewpoint, which formulates a ResNet as the discretization of an initial value
problem. The training process is then formulated as a time-dependent optimal
control problem, which we discretize using different time-discretization
parameters, eventually generating a multilevel hierarchy of auxiliary networks
with different resolutions. The training of the original ResNet is then
enhanced by training the auxiliary networks with reduced resolutions. By
design, our framework is independent of the training strategy chosen on each
level of the multilevel hierarchy. By means of numerical examples, we analyze
the convergence behavior of the proposed method and demonstrate its
robustness. For our examples we employ multilevel gradient-based methods.
Comparisons with standard single-level methods show a speedup of more than a
factor of three while achieving the same validation accuracy.
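The construction of the auxiliary hierarchy can be pictured with a short sketch. The following Python/NumPy snippet is an illustration only, not the authors' code: the names (block, resnet_forward, coarsen, prolong) and the forward-Euler/tanh choices are assumptions. It treats a ResNet as a time discretization of an initial value problem on [0, T], builds a coarse auxiliary network by halving the number of time steps, and maps coarse parameters back to the fine level.

```python
import numpy as np

def block(x, W, b, h):
    # One residual block read as a forward-Euler step: x + h * f(x; W, b).
    return x + h * np.tanh(W @ x + b)

def resnet_forward(x, params, T=1.0):
    # Interpret the N blocks as N Euler steps on the time interval [0, T].
    h = T / len(params)                      # time-discretization parameter
    for W, b in params:
        x = block(x, W, b, h)
    return x

def coarsen(params_fine):
    # Coarse auxiliary network: keep every other block, i.e. half the time
    # steps with twice the step size (a reduced-resolution discretization).
    return params_fine[::2]

def prolong(params_coarse):
    # Map coarse parameters back to the fine level by duplicating each block
    # (piecewise-constant interpolation in the artificial time variable).
    return [p for p in params_coarse for _ in range(2)]

rng = np.random.default_rng(0)
# Fine-level network: 8 residual blocks acting on 4-dimensional features.
params_fine = [(0.1 * rng.standard_normal((4, 4)), np.zeros(4)) for _ in range(8)]
params_coarse = coarsen(params_fine)         # 4 blocks, step size 2h

x0 = rng.standard_normal(4)
print(resnet_forward(x0, params_fine))       # fine-level forward pass
print(resnet_forward(x0, params_coarse))     # coarse-level forward pass
# A multilevel cycle would run cheap optimization steps on the coarse
# parameters and use prolong(params_coarse) to correct or re-initialize the
# fine-level parameters before continuing fine-level training.
```

In the paper's setting any training strategy can be applied independently on each such level; the sketch only shows how resolution is reduced and how parameters could transfer between levels.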
Related papers
- GFN: A graph feedforward network for resolution-invariant reduced operator learning in multifidelity applications [0.0]
This work presents a novel resolution-invariant model order reduction strategy for multifidelity applications.
We base our architecture on a novel neural network layer developed in this work, the graph feedforward network.
We exploit the method's capability of training and testing on different mesh sizes in an autoencoder-based reduction strategy for parametrised partial differential equations.
arXiv Detail & Related papers (2024-06-05T18:31:37Z) - Decentralized Learning Strategies for Estimation Error Minimization with Graph Neural Networks [94.2860766709971]
We address the challenge of sampling and remote estimation for autoregressive Markovian processes in a wireless network with statistically-identical agents.
Our goal is to minimize time-average estimation error and/or age of information with decentralized scalable sampling and transmission policies.
arXiv Detail & Related papers (2024-04-04T06:24:11Z) - Adaptive Depth Networks with Skippable Sub-Paths [1.8416014644193066]
We present a practical approach to adaptive depth networks with minimal training effort.
Our approach does not train every target sub-network in an iterative manner.
We provide a formal rationale for why the proposed training method can reduce overall prediction errors.
arXiv Detail & Related papers (2023-12-27T03:43:38Z) - Iterative Soft Shrinkage Learning for Efficient Image Super-Resolution [91.3781512926942]
Image super-resolution (SR) has witnessed extensive neural network designs from CNN to transformer architectures.
This work investigates the potential of network pruning for super-resolution to take advantage of off-the-shelf network designs and reduce the underlying computational overhead.
We propose a novel Iterative Soft Shrinkage-Percentage (ISS-P) method by optimizing the sparse structure of a randomly initialized network at each iteration and tweaking unimportant weights with a small amount proportional to the magnitude scale on-the-fly.
arXiv Detail & Related papers (2023-03-16T21:06:13Z) - Multilevel-in-Layer Training for Deep Neural Network Regression [1.6185544531149159]
We present a multilevel regularization strategy that constructs and trains a hierarchy of neural networks.
We experimentally show with PDE regression problems that our multilevel training approach is an effective regularizer.
arXiv Detail & Related papers (2022-11-11T23:53:46Z) - Globally Convergent Multilevel Training of Deep Residual Networks [0.0]
We propose a globally convergent multilevel training method for deep residual networks (ResNets)
The devised method operates in hybrid (stochastic-deterministic) settings by adaptively adjusting mini-batch sizes during the training.
arXiv Detail & Related papers (2021-07-15T19:08:58Z) - Manifold Regularized Dynamic Network Pruning [102.24146031250034]
This paper proposes a new paradigm that dynamically removes redundant filters by embedding the manifold information of all instances into the space of pruned networks.
The effectiveness of the proposed method is verified on several benchmarks, which shows better performance in terms of both accuracy and computational cost.
arXiv Detail & Related papers (2021-03-10T03:59:03Z) - All at Once Network Quantization via Collaborative Knowledge Transfer [56.95849086170461]
We develop a novel collaborative knowledge transfer approach for efficiently training the all-at-once quantization network.
Specifically, we propose an adaptive selection strategy to choose a high-precision "teacher" for transferring knowledge to the low-precision student.
To effectively transfer knowledge, we develop a dynamic block swapping method by randomly replacing the blocks in the lower-precision student network with the corresponding blocks in the higher-precision teacher network.
arXiv Detail & Related papers (2021-03-02T03:09:03Z) - Deep Unfolding Network for Image Super-Resolution [159.50726840791697]
This paper proposes an end-to-end trainable unfolding network which leverages both learning-based methods and model-based methods.
The proposed network inherits the flexibility of model-based methods to super-resolve blurry, noisy images for different scale factors via a single model.
arXiv Detail & Related papers (2020-03-23T17:55:42Z) - Subset Sampling For Progressive Neural Network Learning [106.12874293597754]
Progressive Neural Network Learning is a class of algorithms that incrementally construct the network's topology and optimize its parameters based on the training data.
We propose to speed up this process by exploiting subsets of training data at each incremental training step.
Experimental results in object, scene and face recognition problems demonstrate that the proposed approach speeds up the optimization procedure considerably.
arXiv Detail & Related papers (2020-02-17T18:57:33Z)