Controlled Descent Training
- URL: http://arxiv.org/abs/2303.09216v1
- Date: Thu, 16 Mar 2023 10:45:24 GMT
- Title: Controlled Descent Training
- Authors: Viktor Andersson, Balázs Varga, Vincent Szolnoky, Andreas Syrén, Rebecka Jörnsten, Balázs Kulcsár
- Abstract summary: A novel, model-based artificial neural network (ANN) training method is developed, supported by optimal control theory.
The method augments training labels in order to robustly guarantee training loss convergence and to improve the training convergence rate.
The applicability of the method is demonstrated on standard regression and classification problems.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In this work, a novel, model-based artificial neural network (ANN)
training method is developed, supported by optimal control theory. The method
augments training labels in order to robustly guarantee training loss
convergence and improve training convergence rate. Dynamic label augmentation
is proposed within the framework of gradient descent training where the
convergence of training loss is controlled. First, we capture the training
behavior with the help of empirical Neural Tangent Kernels (NTK) and borrow
tools from systems and control theory to analyze both the local and global
training dynamics (e.g. stability, reachability). Second, we propose to
dynamically alter the gradient descent training mechanism via fictitious labels
as control inputs and an optimal state feedback policy. In this way, we enforce
locally $\mathcal{H}_2$ optimal and convergent training behavior. The novel
algorithm, \textit{Controlled Descent Training} (CDT), guarantees local
convergence. CDT opens new possibilities in the analysis, interpretation, and
design of ANN architectures. The applicability of the method is demonstrated on
standard regression and classification problems.
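To make the control-theoretic view concrete, the sketch below rolls out gradient descent on a toy regression problem while augmenting the labels with a state-feedback term computed from the empirical NTK. It is an illustrative approximation only: the paper's $\mathcal{H}_2$-optimal synthesis is replaced here by a plain discrete-time LQR gain, the NTK is built from a finite-difference Jacobian of a tiny tanh network, and the step sizes, network size, and helper names (`forward`, `jacobian`, `lqr_gain`) are assumptions rather than the authors' implementation.

```python
# Illustrative sketch of controlled descent via fictitious label control.
# NOT the authors' implementation: the H2 synthesis is replaced by a plain
# discrete-time LQR gain, and the empirical NTK is formed from a
# finite-difference Jacobian of a tiny MLP (these are assumptions).
import numpy as np

rng = np.random.default_rng(0)

# Tiny regression problem and a 1-hidden-layer tanh network.
X = np.linspace(-1.0, 1.0, 20).reshape(-1, 1)
y = np.sin(np.pi * X[:, 0])

H = 5                                        # hidden width
w = rng.normal(scale=0.5, size=3 * H + 1)    # [W1(H), b1(H), W2(H), b2(1)]

def forward(w, X):
    W1, b1, W2, b2 = w[:H], w[H:2*H], w[2*H:3*H], w[3*H]
    return np.tanh(X @ W1[None, :] + b1) @ W2 + b2      # shape (n,)

def jacobian(w, X, eps=1e-6):
    """Finite-difference Jacobian of network outputs w.r.t. parameters."""
    f0 = forward(w, X)
    J = np.zeros((X.shape[0], w.size))
    for i in range(w.size):
        wp = w.copy()
        wp[i] += eps
        J[:, i] = (forward(wp, X) - f0) / eps
    return J

def lqr_gain(A, B, Q, R, iters=500):
    """Discrete-time LQR gain via fixed-point iteration of the Riccati equation."""
    P = Q.copy()
    for _ in range(iters):
        S = R + B.T @ P @ B
        P = Q + A.T @ P @ A - A.T @ P @ B @ np.linalg.solve(S, B.T @ P @ A)
    return np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)

# Controlled gradient descent on the (sum-of-squares) training loss.
eta, n = 0.02, X.shape[0]
for step in range(300):
    J = jacobian(w, X)
    theta = J @ J.T                  # empirical NTK, shape (n, n)
    f = forward(w, X)
    e = f - y                        # training residual: the "state"

    # Linearised residual dynamics: e+ = (I - eta*Theta) e + eta*Theta u,
    # with the fictitious label shift u acting as the control input.
    A = np.eye(n) - eta * theta
    B = eta * theta
    K = lqr_gain(A, B, Q=np.eye(n), R=0.1 * np.eye(n))
    u = -K @ e                       # state-feedback label augmentation

    # Gradient step on the augmented labels y + u.
    w -= eta * (J.T @ (e - u))

    if step % 100 == 0:
        print(f"step {step:3d}  loss {0.5 * np.mean(e ** 2):.5f}")
```

In this sketch the residual e plays the role of the state, the fictitious label shift u = -K e is the control input, and the closed-loop update e+ = (A - BK) e is what the linearised analysis predicts; the actual CDT algorithm designs this feedback to be locally $\mathcal{H}_2$ optimal rather than LQR-based.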
Related papers
- Adaptive Class Emergence Training: Enhancing Neural Network Stability and Generalization through Progressive Target Evolution [0.0]
We propose a novel training methodology for neural networks in classification problems.
We evolve the target outputs from a null vector to one-hot encoded vectors throughout the training process.
This gradual transition allows the network to adapt more smoothly to the increasing complexity of the classification task; a minimal sketch of this target schedule is given after the related-papers list.
arXiv Detail & Related papers (2024-09-04T03:25:48Z)
- Adaptive Federated Learning Over the Air [108.62635460744109]
We propose a federated version of adaptive gradient methods, particularly AdaGrad and Adam, within the framework of over-the-air model training.
Our analysis shows that the AdaGrad-based training algorithm converges to a stationary point at the rate of $\mathcal{O}\left(\ln(T) / T^{1 - \frac{1}{\alpha}}\right)$.
arXiv Detail & Related papers (2024-03-11T09:10:37Z)
- Harnessing Orthogonality to Train Low-Rank Neural Networks [0.07538606213726905]
This study explores the learning dynamics of neural networks by analyzing the singular value decomposition (SVD) of their weights throughout training.
We introduce Orthogonality-Informed Adaptive Low-Rank (OIALR) training, a novel training method exploiting the intrinsic orthogonality of neural networks.
arXiv Detail & Related papers (2024-01-16T17:07:22Z)
- TWINS: A Fine-Tuning Framework for Improved Transferability of Adversarial Robustness and Generalization [89.54947228958494]
This paper focuses on the fine-tuning of an adversarially pre-trained model in various classification tasks.
We propose a novel statistics-based approach, the Two-WIng NormliSation (TWINS) fine-tuning framework.
TWINS is shown to be effective on a wide range of image classification datasets in terms of both generalization and robustness.
arXiv Detail & Related papers (2023-03-20T14:12:55Z)
- Optimization-Derived Learning with Essential Convergence Analysis of Training and Hyper-training [52.39882976848064]
We design a Generalized Krasnoselskii-Mann (GKM) scheme based on fixed-point iterations as our fundamental ODL module.
Under the GKM scheme, a Bilevel Meta Optimization (BMO) algorithmic framework is constructed to solve the optimal training and hyper-training variables together.
arXiv Detail & Related papers (2022-06-16T01:50:25Z)
- Distributed Adversarial Training to Robustify Deep Neural Networks at Scale [100.19539096465101]
Current deep neural networks (DNNs) are vulnerable to adversarial attacks, where adversarial perturbations to the inputs can change or manipulate classification.
To defend against such attacks, an effective approach known as adversarial training (AT) has been shown to improve model robustness.
We propose a large-batch adversarial training framework implemented over multiple machines.
arXiv Detail & Related papers (2022-06-13T15:39:43Z)
- Learning in Feedback-driven Recurrent Spiking Neural Networks using full-FORCE Training [4.124948554183487]
We propose a supervised training procedure for RSNNs, where a second network is introduced only during training.
The proposed training procedure consists of generating targets for both recurrent and readout layers.
We demonstrate the improved performance and noise robustness of the proposed full-FORCE training procedure to model 8 dynamical systems.
arXiv Detail & Related papers (2022-05-26T19:01:19Z)
- Self-Progressing Robust Training [146.8337017922058]
Current robust training methods, such as adversarial training, explicitly use an "attack" to generate adversarial examples.
We propose a new framework called SPROUT, self-progressing robust training.
Our results shed new light on scalable, effective and attack-independent robust training methods.
arXiv Detail & Related papers (2020-12-22T00:45:24Z)
- Training Generative Adversarial Networks by Solving Ordinary Differential Equations [54.23691425062034]
We study the continuous-time dynamics induced by GAN training.
From this perspective, we hypothesise that instabilities in training GANs arise from the integration error.
We experimentally verify that well-known ODE solvers (such as Runge-Kutta) can stabilise training.
arXiv Detail & Related papers (2020-10-28T15:23:49Z)
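For the progressive target evolution referenced in the first related paper above (Adaptive Class Emergence Training), a minimal sketch of the label schedule is given below. The linear ramp, its length, and the helper name `progressive_targets` are illustrative assumptions, not the cited paper's exact schedule.

```python
# Minimal sketch of progressive target evolution for classification.
# The linear ramp from a null vector to the one-hot target is an assumed
# schedule for illustration; the cited paper may use a different progression.
import numpy as np

def progressive_targets(labels, num_classes, epoch, ramp_epochs):
    """Interpolate targets from the all-zero vector to one-hot encodings."""
    one_hot = np.eye(num_classes)[labels]      # (batch, num_classes)
    alpha = min(1.0, epoch / ramp_epochs)      # 0 at the start, 1 after the ramp
    return alpha * one_hot                     # null vector at epoch 0

# Example: targets for labels [2, 0] with 3 classes, halfway through the ramp.
print(progressive_targets(np.array([2, 0]), num_classes=3, epoch=5, ramp_epochs=10))
# [[0.  0.  0.5]
#  [0.5 0.  0. ]]
```

Training would then proceed with an ordinary regression-style loss against these evolving targets, so the network commits to the full one-hot structure only gradually.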