TAOTF: A Two-stage Approximately Orthogonal Training Framework in Deep
Neural Networks
- URL: http://arxiv.org/abs/2211.13902v1
- Date: Fri, 25 Nov 2022 05:22:43 GMT
- Title: TAOTF: A Two-stage Approximately Orthogonal Training Framework in Deep
Neural Networks
- Authors: Taoyong Cui, Jianze Li, Yuhan Dong and Li Liu
- Abstract summary: We propose a novel two-stage approximately orthogonal training framework (TAOTF) to solve this problem in noisy data scenarios.
We evaluate the proposed model-agnostic framework on both natural and medical image datasets, showing that our method achieves stable performance superior to existing methods.
- Score: 8.663152066918821
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The orthogonality constraints, including the hard and soft ones, have been
used to normalize the weight matrices of Deep Neural Network (DNN) models,
especially the Convolutional Neural Network (CNN) and Vision Transformer (ViT),
to reduce model parameter redundancy and improve training stability. However,
the robustness of these constrained models to noisy data is not always
satisfactory. In this work, we propose a novel two-stage approximately
orthogonal training framework (TAOTF) to find a trade-off between the
orthogonal solution space and the main task solution space to solve this
problem in noisy data scenarios. In the first stage, we propose a novel
algorithm called polar decomposition-based orthogonal initialization (PDOI) to
find a good initialization for the orthogonal optimization. In the second
stage, unlike other existing methods, we apply soft orthogonal constraints to
all layers of the DNN model. We evaluate the proposed model-agnostic framework
on both natural and medical image datasets, showing that our method achieves
stable performance superior to existing methods.
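The two stages described above can be illustrated with a minimal numpy sketch. The function names and the penalty weight are illustrative assumptions, not the paper's actual implementation: stage one projects a random weight matrix onto the nearest orthogonal matrix via the polar decomposition (computed through the SVD), and stage two would train with a soft orthogonality penalty of the common form lam * ||W^T W - I||_F^2.

```python
import numpy as np

def polar_orthogonal_init(W):
    """Project W onto the nearest matrix with orthonormal columns
    via the polar decomposition, computed here through the SVD."""
    U, _, Vt = np.linalg.svd(W, full_matrices=False)
    return U @ Vt

def soft_orthogonal_penalty(W, lam=1e-2):
    """Soft orthogonality regularizer: lam * ||W^T W - I||_F^2."""
    k = W.shape[1]
    gram_gap = W.T @ W - np.eye(k)
    return lam * np.sum(gram_gap ** 2)

rng = np.random.default_rng(0)
W = rng.normal(size=(8, 4))        # random initial weight matrix
W0 = polar_orthogonal_init(W)      # stage 1: orthogonal initialization
# stage 2 would then add soft_orthogonal_penalty(W) to the task loss,
# so training trades off the main task against near-orthogonality.
```

At initialization the penalty is essentially zero, since W0 has exactly orthonormal columns; soft constraints then let training drift slightly away from the orthogonal manifold when the main task benefits.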
Related papers
- Solving Inverse Problems with Model Mismatch using Untrained Neural Networks within Model-based Architectures [14.551812310439004]
We introduce an untrained forward model residual block within the model-based architecture to match the data consistency in the measurement domain for each instance.
Our approach offers a unified solution that is less parameter-sensitive, requires no additional data, and enables simultaneous fitting of the forward model and reconstruction in a single pass.
arXiv Detail & Related papers (2024-03-07T19:02:13Z) - A Two-Stage Training Method for Modeling Constrained Systems With Neural
Networks [3.072340427031969]
This paper describes in detail the two-stage training method for Neural ODEs.
The first stage aims at finding feasible NN parameters by minimizing a measure of constraints violation.
The second stage aims to find the optimal NN parameters by minimizing the loss function while staying inside the feasible region.
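The two-stage recipe above can be sketched on a toy scalar problem. The objective, constraint, penalty weight, and step counts below are all hypothetical choices for illustration: stage one ignores the loss and descends only on a measure of constraint violation, and stage two descends on the loss plus a penalty that keeps the parameter feasible.

```python
# Toy sketch of two-stage constrained training on one parameter theta,
# with a made-up loss (theta - 3)^2 and constraint theta >= 1.

def loss(theta):
    """Main objective to minimize."""
    return (theta - 3.0) ** 2

def violation(theta):
    """Squared constraint violation; zero when theta >= 1."""
    return max(0.0, 1.0 - theta) ** 2

theta, lr = -5.0, 0.1
for _ in range(200):   # stage 1: minimize constraint violation only
    grad = -2.0 * max(0.0, 1.0 - theta)
    theta -= lr * grad
for _ in range(200):   # stage 2: minimize loss + penalty on violation
    grad = 2.0 * (theta - 3.0) - 2.0 * 10.0 * max(0.0, 1.0 - theta)
    theta -= lr * grad
```

Stage one drives theta from -5 into the feasible region near 1; stage two then converges to the unconstrained optimum at 3, which happens to be feasible, so the penalty term stays inactive.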
arXiv Detail & Related papers (2024-03-05T07:37:47Z) - Double Duality: Variational Primal-Dual Policy Optimization for
Constrained Reinforcement Learning [132.7040981721302]
We study the constrained convex Markov Decision Process (MDP), where the goal is to minimize a convex functional of the visitation measure.
Designing algorithms for a constrained convex MDP faces several challenges, including handling the large state space.
arXiv Detail & Related papers (2024-02-16T16:35:18Z) - The Convex Landscape of Neural Networks: Characterizing Global Optima
and Stationary Points via Lasso Models [75.33431791218302]
Deep Neural Network (DNN) models are widely used for prediction tasks.
In this paper we examine the use of convex neural recovery models.
We show that all stationary points of the non-convex objective can be characterized as the global optima of a subsampled convex program.
arXiv Detail & Related papers (2023-12-19T23:04:56Z) - An Optimization-based Deep Equilibrium Model for Hyperspectral Image
Deconvolution with Convergence Guarantees [71.57324258813675]
We propose a novel methodology for addressing the hyperspectral image deconvolution problem.
A new optimization problem is formulated, leveraging a learnable regularizer in the form of a neural network.
The derived iterative solver is then expressed as a fixed-point calculation problem within the Deep Equilibrium framework.
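A fixed-point calculation of the kind the Deep Equilibrium framework performs can be sketched as follows. The map f below is a hypothetical contractive stand-in for the learned iterative solver described in the paper, not its actual model; the forward pass simply iterates until the output stops changing.

```python
import numpy as np

def f(z, x, W):
    """Hypothetical layer map; contractive when ||W|| < 1."""
    return np.tanh(W @ z + x)

def deq_forward(x, W, tol=1e-10, max_iter=1000):
    """Iterate z <- f(z, x, W) until it converges to a fixed point z*."""
    z = np.zeros_like(x)
    for _ in range(max_iter):
        z_next = f(z, x, W)
        if np.linalg.norm(z_next - z) < tol:
            return z_next
        z = z_next
    return z

W = 0.5 * np.eye(4)    # spectral norm < 1 guarantees a unique fixed point
x = 0.2 * np.ones(4)
z_star = deq_forward(x, W)
```

Because tanh is 1-Lipschitz and ||W|| < 1, the iteration is a contraction, which is the kind of property a convergence guarantee for such a solver rests on.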
arXiv Detail & Related papers (2023-06-10T08:25:16Z) - Iterative Soft Shrinkage Learning for Efficient Image Super-Resolution [91.3781512926942]
Image super-resolution (SR) has witnessed extensive neural network designs from CNN to transformer architectures.
This work investigates the potential of network pruning for super-resolution to take advantage of off-the-shelf network designs and reduce the underlying computational overhead.
We propose a novel Iterative Soft Shrinkage-Percentage (ISS-P) method, optimizing the sparse structure of a randomly initialized network at each iteration and tweaking unimportant weights on-the-fly by a small amount proportional to the magnitude scale.
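The soft-shrinkage idea can be sketched in a few lines, under the assumption (illustrative, not the paper's exact rule) that each pruning step multiplies the lowest-magnitude fraction of weights by a small shrink factor instead of hard-zeroing them, so they remain trainable.

```python
import numpy as np

def soft_shrink_step(W, prune_ratio=0.3, shrink=0.1):
    """One soft-shrinkage pruning step: scale down the smallest
    prune_ratio fraction of weights by `shrink` instead of zeroing."""
    flat = np.abs(W).ravel()
    k = int(prune_ratio * flat.size)
    threshold = np.partition(flat, k)[k]   # k-th smallest magnitude
    W = W.copy()
    small = np.abs(W) < threshold
    W[small] *= shrink                     # shrink, do not zero
    return W

rng = np.random.default_rng(1)
W = rng.normal(size=(10, 10))
W_pruned = soft_shrink_step(W)
```

Unlike hard pruning, no weight is irreversibly removed: shrunk weights can regrow in later training iterations if they turn out to matter, which is what makes the sparse structure adjustable on-the-fly.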
arXiv Detail & Related papers (2023-03-16T21:06:13Z) - An alternative approach to train neural networks using monotone
variational inequality [22.320632565424745]
We propose an alternative approach to neural network training using monotone variational inequalities.
Our approach can be used for more efficient fine-tuning of a pre-trained neural network.
arXiv Detail & Related papers (2022-02-17T19:24:20Z) - Optimal Transport Based Refinement of Physics-Informed Neural Networks [0.0]
We propose a refinement strategy to the well-known Physics-Informed Neural Networks (PINNs) for solving partial differential equations (PDEs) based on the concept of Optimal Transport (OT).
PINN solvers have been found to suffer from a host of issues: spectral bias in fully-connected architectures, unstable gradient pathologies, and difficulties with convergence and accuracy.
We present a novel training strategy for solving the Fokker-Planck-Kolmogorov Equation (FPKE) using OT-based sampling to supplement the existing PINNs framework.
arXiv Detail & Related papers (2021-05-26T02:51:20Z) - A Flexible Framework for Designing Trainable Priors with Adaptive
Smoothing and Game Encoding [57.1077544780653]
We introduce a general framework for designing and training neural network layers whose forward passes can be interpreted as solving non-smooth convex optimization problems.
We focus on convex games, solved by local agents represented by the nodes of a graph and interacting through regularization functions.
This approach is appealing for solving imaging problems, as it allows the use of classical image priors within deep models that are trainable end to end.
arXiv Detail & Related papers (2020-06-26T08:34:54Z) - Dense Non-Rigid Structure from Motion: A Manifold Viewpoint [162.88686222340962]
The Non-Rigid Structure-from-Motion (NRSfM) problem aims to recover the 3D geometry of a deforming object from its 2D feature correspondences across multiple frames.
We show that our approach significantly improves accuracy, scalability, and robustness against noise.
arXiv Detail & Related papers (2020-06-15T09:15:54Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.