TAOTF: A Two-stage Approximately Orthogonal Training Framework in Deep
Neural Networks
- URL: http://arxiv.org/abs/2211.13902v1
- Date: Fri, 25 Nov 2022 05:22:43 GMT
- Title: TAOTF: A Two-stage Approximately Orthogonal Training Framework in Deep
Neural Networks
- Authors: Taoyong Cui, Jianze Li, Yuhan Dong and Li Liu
- Abstract summary: We propose a novel two-stage approximately orthogonal training framework (TAOTF) to solve this problem in noisy data scenarios.
We evaluate the proposed model-agnostic framework on both natural and medical image datasets, showing that our method achieves stable performance superior to existing methods.
- Score: 8.663152066918821
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The orthogonality constraints, including the hard and soft ones, have been
used to normalize the weight matrices of Deep Neural Network (DNN) models,
especially the Convolutional Neural Network (CNN) and Vision Transformer (ViT),
to reduce model parameter redundancy and improve training stability. However,
the robustness of these constrained models to noisy data is not always
satisfactory. In this work, we propose a novel two-stage approximately
orthogonal training framework (TAOTF) to find a trade-off between the
orthogonal solution space and the main task solution space to solve this
problem in noisy data scenarios. In the first stage, we propose a novel
algorithm called polar decomposition-based orthogonal initialization (PDOI) to
find a good initialization for the orthogonal optimization. In the second
stage, unlike other existing methods, we apply soft orthogonal constraints to
all layers of the DNN model. We evaluate the proposed model-agnostic framework
on both natural and medical image datasets, showing that our method achieves
stable performance superior to existing methods.
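The two stages described above can be illustrated with a minimal numpy sketch. The function names and the penalty weight are illustrative assumptions, not the paper's actual implementation: stage one projects a random weight matrix onto the nearest orthogonal matrix via the polar decomposition (computed through the SVD), and stage two would train with a soft orthogonality penalty of the common form lam * ||W^T W - I||_F^2.

```python
import numpy as np

def polar_orthogonal_init(W):
    """Project W onto the nearest matrix with orthonormal columns
    via the polar decomposition, computed here through the SVD."""
    U, _, Vt = np.linalg.svd(W, full_matrices=False)
    return U @ Vt

def soft_orthogonal_penalty(W, lam=1e-2):
    """Soft orthogonality regularizer: lam * ||W^T W - I||_F^2."""
    k = W.shape[1]
    gram_gap = W.T @ W - np.eye(k)
    return lam * np.sum(gram_gap ** 2)

rng = np.random.default_rng(0)
W = rng.normal(size=(8, 4))        # random initial weight matrix
W0 = polar_orthogonal_init(W)      # stage 1: orthogonal initialization
# stage 2 would then add soft_orthogonal_penalty(W) to the task loss,
# so training trades off the main task against near-orthogonality.
```

At initialization the penalty is essentially zero, since W0 has exactly orthonormal columns; soft constraints then let training drift slightly away from the orthogonal manifold when the main task benefits.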
Related papers
- Solving Inverse Problems with Model Mismatch using Untrained Neural Networks within Model-based Architectures [14.551812310439004]
We introduce an untrained forward model residual block within the model-based architecture to match the data consistency in the measurement domain for each instance.
Our approach offers a unified solution that is less parameter-sensitive, requires no additional data, and enables simultaneous fitting of the forward model and reconstruction in a single pass.
arXiv Detail & Related papers (2024-03-07T19:02:13Z) - A Two-Stage Training Method for Modeling Constrained Systems With Neural
Networks [3.072340427031969]
This paper describes in detail the two-stage training method for Neural ODEs.
The first stage aims at finding feasible NN parameters by minimizing a measure of constraints violation.
The second stage aims to find the optimal NN parameters by minimizing the loss function while staying inside the feasible region.
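The two-stage recipe above can be sketched on a toy scalar problem. The objective, constraint, penalty weight, and step counts below are all hypothetical choices for illustration: stage one ignores the loss and descends only on a measure of constraint violation, and stage two descends on the loss plus a penalty that keeps the parameter feasible.

```python
# Toy sketch of two-stage constrained training on one parameter theta,
# with a made-up loss (theta - 3)^2 and constraint theta >= 1.

def loss(theta):
    """Main objective to minimize."""
    return (theta - 3.0) ** 2

def violation(theta):
    """Squared constraint violation; zero when theta >= 1."""
    return max(0.0, 1.0 - theta) ** 2

theta, lr = -5.0, 0.1
for _ in range(200):   # stage 1: minimize constraint violation only
    grad = -2.0 * max(0.0, 1.0 - theta)
    theta -= lr * grad
for _ in range(200):   # stage 2: minimize loss + penalty on violation
    grad = 2.0 * (theta - 3.0) - 2.0 * 10.0 * max(0.0, 1.0 - theta)
    theta -= lr * grad
```

Stage one drives theta from -5 into the feasible region near 1; stage two then converges to the unconstrained optimum at 3, which happens to be feasible, so the penalty term stays inactive.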
arXiv Detail & Related papers (2024-03-05T07:37:47Z) - Double Duality: Variational Primal-Dual Policy Optimization for
Constrained Reinforcement Learning [132.7040981721302]
We study the constrained convex Markov Decision Process (MDP), where the goal is to minimize a convex functional of the visitation measure.
Designing algorithms for a constrained convex MDP faces several challenges, including handling the large state space.
arXiv Detail & Related papers (2024-02-16T16:35:18Z) - The Convex Landscape of Neural Networks: Characterizing Global Optima
and Stationary Points via Lasso Models [75.33431791218302]
Deep Neural Network (DNN) models are widely used for prediction tasks.
In this paper we examine the use of convex neural recovery models.
We show that all stationary points of the non-convex objective can be characterized as the global optima of a subsampled convex program.
arXiv Detail & Related papers (2023-12-19T23:04:56Z) - An Optimization-based Deep Equilibrium Model for Hyperspectral Image
Deconvolution with Convergence Guarantees [71.57324258813675]
We propose a novel methodology for addressing the hyperspectral image deconvolution problem.
A new optimization problem is formulated, leveraging a learnable regularizer in the form of a neural network.
The derived iterative solver is then expressed as a fixed-point calculation problem within the Deep Equilibrium framework.
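A fixed-point calculation of the kind the Deep Equilibrium framework performs can be sketched as follows. The map f below is a hypothetical contractive stand-in for the learned iterative solver described in the paper, not its actual model; the forward pass simply iterates until the output stops changing.

```python
import numpy as np

def f(z, x, W):
    """Hypothetical layer map; contractive when ||W|| < 1."""
    return np.tanh(W @ z + x)

def deq_forward(x, W, tol=1e-10, max_iter=1000):
    """Iterate z <- f(z, x, W) until it converges to a fixed point z*."""
    z = np.zeros_like(x)
    for _ in range(max_iter):
        z_next = f(z, x, W)
        if np.linalg.norm(z_next - z) < tol:
            return z_next
        z = z_next
    return z

W = 0.5 * np.eye(4)    # spectral norm < 1 guarantees a unique fixed point
x = 0.2 * np.ones(4)
z_star = deq_forward(x, W)
```

Because tanh is 1-Lipschitz and ||W|| < 1, the iteration is a contraction, which is the kind of property a convergence guarantee for such a solver rests on.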
arXiv Detail & Related papers (2023-06-10T08:25:16Z) - Iterative Soft Shrinkage Learning for Efficient Image Super-Resolution [91.3781512926942]
Image super-resolution (SR) has witnessed extensive neural network designs from CNN to transformer architectures.
This work investigates the potential of network pruning for super-resolution to take advantage of off-the-shelf network designs and reduce the underlying computational overhead.
We propose a novel Iterative Soft Shrinkage-Percentage (ISS-P) method, optimizing the sparse structure of a randomly initialized network at each iteration and tweaking unimportant weights on-the-fly by a small amount proportional to the magnitude scale.
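The soft-shrinkage idea can be sketched in a few lines, under the assumption (illustrative, not the paper's exact rule) that each pruning step multiplies the lowest-magnitude fraction of weights by a small shrink factor instead of hard-zeroing them, so they remain trainable.

```python
import numpy as np

def soft_shrink_step(W, prune_ratio=0.3, shrink=0.1):
    """One soft-shrinkage pruning step: scale down the smallest
    prune_ratio fraction of weights by `shrink` instead of zeroing."""
    flat = np.abs(W).ravel()
    k = int(prune_ratio * flat.size)
    threshold = np.partition(flat, k)[k]   # k-th smallest magnitude
    W = W.copy()
    small = np.abs(W) < threshold
    W[small] *= shrink                     # shrink, do not zero
    return W

rng = np.random.default_rng(1)
W = rng.normal(size=(10, 10))
W_pruned = soft_shrink_step(W)
```

Unlike hard pruning, no weight is irreversibly removed: shrunk weights can regrow in later training iterations if they turn out to matter, which is what makes the sparse structure adjustable on-the-fly.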
arXiv Detail & Related papers (2023-03-16T21:06:13Z) - An alternative approach to train neural networks using monotone
variational inequality [22.320632565424745]
We propose an alternative approach to neural network training using monotone variational inequalities.
Our approach can be used for more efficient fine-tuning of a pre-trained neural network.
arXiv Detail & Related papers (2022-02-17T19:24:20Z) - Optimal Transport Based Refinement of Physics-Informed Neural Networks [0.0]
We propose a refinement strategy to the well-known Physics-Informed Neural Networks (PINNs) for solving partial differential equations (PDEs) based on the concept of Optimal Transport (OT).
PINN solvers have been found to suffer from a host of issues: spectral bias in fully-connected architectures, unstable gradient pathologies, and difficulties with convergence and accuracy.
We present a novel training strategy for solving the Fokker-Planck-Kolmogorov Equation (FPKE) using OT-based sampling to supplement the existing PINNs framework.
arXiv Detail & Related papers (2021-05-26T02:51:20Z) - A Flexible Framework for Designing Trainable Priors with Adaptive
Smoothing and Game Encoding [57.1077544780653]
We introduce a general framework for designing and training neural network layers whose forward passes can be interpreted as solving non-smooth convex optimization problems.
We focus on convex games, solved by local agents represented by the nodes of a graph and interacting through regularization functions.
This approach is appealing for solving imaging problems, as it allows the use of classical image priors within deep models that are trainable end to end.
arXiv Detail & Related papers (2020-06-26T08:34:54Z) - Dense Non-Rigid Structure from Motion: A Manifold Viewpoint [162.88686222340962]
The Non-Rigid Structure-from-Motion (NRSfM) problem aims to recover the 3D geometry of a deforming object from its 2D feature correspondences across multiple frames.
We show that our approach significantly improves accuracy, scalability, and robustness against noise.
arXiv Detail & Related papers (2020-06-15T09:15:54Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.