Non-Gradient Manifold Neural Network
- URL: http://arxiv.org/abs/2106.07905v1
- Date: Tue, 15 Jun 2021 06:39:13 GMT
- Title: Non-Gradient Manifold Neural Network
- Authors: Rui Zhang and Ziheng Jiao and Hongyuan Zhang and Xuelong Li
- Abstract summary: Deep neural network (DNN) generally takes thousands of iterations to optimize via gradient descent.
We propose a novel manifold neural network based on non-gradient optimization.
- Score: 79.44066256794187
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Deep neural network (DNN) generally takes thousands of iterations to optimize
via gradient descent and thus has a slow convergence. In addition, softmax, as
a decision layer, may ignore the distribution information of the data during
classification. Aiming to tackle the referred problems, we propose a novel
manifold neural network based on non-gradient optimization, i.e., the
closed-form solutions. Considering that the activation function is generally
invertible, we reconstruct the network via forward ridge regression and low
rank backward approximation, which achieve the rapid convergence. Moreover, by
unifying the flexible Stiefel manifold and adaptive support vector machine, we
devise the novel decision layer which efficiently fits the manifold structure
of the data and label information. Consequently, a jointly non-gradient
optimization method is designed to generate the network with closed-form
results. Eventually, extensive experiments validate the superior performance of
the model.
Related papers
- Explicit Foundation Model Optimization with Self-Attentive Feed-Forward
Neural Units [4.807347156077897]
Iterative approximation methods using backpropagation enable the optimization of neural networks, but they remain computationally expensive when used at scale.
This paper presents an efficient alternative for optimizing neural networks that reduces the costs of scaling neural networks and provides high-efficiency optimizations for low-resource applications.
arXiv Detail & Related papers (2023-11-13T17:55:07Z) - Globally Optimal Training of Neural Networks with Threshold Activation
Functions [63.03759813952481]
We study weight decay regularized training problems of deep neural networks with threshold activations.
We derive a simplified convex optimization formulation when the dataset can be shattered at a certain layer of the network.
arXiv Detail & Related papers (2023-03-06T18:59:13Z) - Implicit Bias in Leaky ReLU Networks Trained on High-Dimensional Data [63.34506218832164]
In this work, we investigate the implicit bias of gradient flow and gradient descent in two-layer fully-connected neural networks with ReLU activations.
For gradient flow, we leverage recent work on the implicit bias for homogeneous neural networks to show that leakyally, gradient flow produces a neural network with rank at most two.
For gradient descent, provided the random variance is small enough, we show that a single step of gradient descent suffices to drastically reduce the rank of the network, and that the rank remains small throughout training.
arXiv Detail & Related papers (2022-10-13T15:09:54Z) - Variational Inference for Infinitely Deep Neural Networks [0.4061135251278187]
unbounded depth neural network (UDN)
We introduce the unbounded depth neural network (UDN), an infinitely deep probabilistic model that adapts its complexity to the training data.
We study the UDN on real and synthetic data.
arXiv Detail & Related papers (2022-09-21T03:54:34Z) - Deep Manifold Learning with Graph Mining [80.84145791017968]
We propose a novel graph deep model with a non-gradient decision layer for graph mining.
The proposed model has achieved state-of-the-art performance compared to the current models.
arXiv Detail & Related papers (2022-07-18T04:34:08Z) - EvoPruneDeepTL: An Evolutionary Pruning Model for Transfer Learning
based Deep Neural Networks [15.29595828816055]
We propose an evolutionary pruning model for Transfer Learning based Deep Neural Networks.
EvoPruneDeepTL replaces the last fully-connected layers with sparse layers optimized by a genetic algorithm.
Results show the contribution of EvoPruneDeepTL and feature selection to the overall computational efficiency of the network.
arXiv Detail & Related papers (2022-02-08T13:07:55Z) - Acceleration techniques for optimization over trained neural network
ensembles [1.0323063834827415]
We study optimization problems where the objective function is modeled through feedforward neural networks with rectified linear unit activation.
We present a mixed-integer linear program based on existing popular big-$M$ formulations for optimizing over a single neural network.
arXiv Detail & Related papers (2021-12-13T20:50:54Z) - Short-Term Memory Optimization in Recurrent Neural Networks by
Autoencoder-based Initialization [79.42778415729475]
We explore an alternative solution based on explicit memorization using linear autoencoders for sequences.
We show how such pretraining can better support solving hard classification tasks with long sequences.
We show that the proposed approach achieves a much lower reconstruction error for long sequences and a better gradient propagation during the finetuning phase.
arXiv Detail & Related papers (2020-11-05T14:57:16Z) - MSE-Optimal Neural Network Initialization via Layer Fusion [68.72356718879428]
Deep neural networks achieve state-of-the-art performance for a range of classification and inference tasks.
The use of gradient combined nonvolutionity renders learning susceptible to novel problems.
We propose fusing neighboring layers of deeper networks that are trained with random variables.
arXiv Detail & Related papers (2020-01-28T18:25:15Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.