A ReLU Dense Layer to Improve the Performance of Neural Networks
- URL: http://arxiv.org/abs/2010.13572v1
- Date: Thu, 22 Oct 2020 11:56:01 GMT
- Title: A ReLU Dense Layer to Improve the Performance of Neural Networks
- Authors: Alireza M. Javid, Sandipan Das, Mikael Skoglund, and Saikat Chatterjee
- Abstract summary: We propose ReDense as a simple and low complexity way to improve the performance of trained neural networks.
We experimentally show that ReDense can improve the training and testing performance of various neural network architectures.
- Score: 40.2470651460466
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We propose ReDense as a simple and low complexity way to improve the
performance of trained neural networks. We use a combination of random weights
and rectified linear unit (ReLU) activation function to add a ReLU dense
(ReDense) layer to the trained neural network such that it can achieve a lower
training loss. The lossless flow property (LFP) of ReLU is the key to achieving
the lower training loss while keeping the generalization error small. ReDense
does not suffer from the vanishing gradient problem during training due to its
shallow structure. We experimentally show that ReDense can improve the training
and testing performance of various neural network architectures with different
optimization losses and activation functions. Finally, we test ReDense on some of
the state-of-the-art architectures and show the performance improvement on
benchmark datasets.
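The abstract describes appending a dense layer with fixed random weights followed by ReLU on top of an already trained network and then learning a new output mapping, so that the training loss can only decrease while the added structure stays shallow. Below is a minimal PyTorch sketch of that construction under our own assumptions (the base network is frozen, the random layer is built as [V; -V] so its ReLU output remains linearly invertible, and only the new head is trained); it illustrates the idea in the abstract, not the authors' implementation.

```python
import torch
import torch.nn as nn

def add_redense_layer(trained_net, out_dim, hidden_dim, num_classes, seed=0):
    """Append a random-weight ReLU dense (ReDense) layer on top of a trained
    network and return a new forward function plus its trainable head.

    Illustrative sketch only: the random layer is stacked as [V; -V], so that
    relu(Vz) - relu(-Vz) = Vz, one simple way to keep the information flow
    lossless (LFP) as long as hidden_dim >= out_dim and V has full rank.
    """
    torch.manual_seed(seed)
    V = torch.randn(hidden_dim, out_dim) / out_dim ** 0.5
    W_random = torch.cat([V, -V], dim=0)            # fixed, never trained
    head = nn.Linear(2 * hidden_dim, num_classes)   # the only trainable part

    def redense_forward(x):
        with torch.no_grad():                       # base network stays frozen
            z = trained_net(x)                      # output of the trained net
        h = torch.relu(z @ W_random.t())            # ReLU dense layer, random weights
        return head(h)

    return redense_forward, head

# Usage: train only `head` with the original task loss for a few epochs.
# base = ...  # an already trained classifier producing `out_dim` outputs
# forward, head = add_redense_layer(base, out_dim=10, hidden_dim=64, num_classes=10)
# opt = torch.optim.Adam(head.parameters(), lr=1e-3)
```

Because the ReLU output of the [V; -V] layer keeps the frozen network's outputs linearly recoverable, the trained head can at worst reproduce the original predictions, which is the intuition behind reaching a lower training loss without a vanishing-gradient issue.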
Related papers
- A Coefficient Makes SVRG Effective [55.104068027239656]
Stochastic Variance Reduced Gradient (SVRG) is a theoretically compelling optimization method.
In this work, we demonstrate the potential of SVRG in optimizing real-world neural networks.
Our analysis finds that, for deeper networks, the strength of the variance reduction term in SVRG should be smaller and decrease as training progresses.
arXiv Detail & Related papers (2023-11-09T18:47:44Z) - Globally Optimal Training of Neural Networks with Threshold Activation
Functions [63.03759813952481]
We study weight decay regularized training problems of deep neural networks with threshold activations.
We derive a simplified convex optimization formulation when the dataset can be shattered at a certain layer of the network.
arXiv Detail & Related papers (2023-03-06T18:59:13Z) - Learning in Feedback-driven Recurrent Spiking Neural Networks using
full-FORCE Training [4.124948554183487]
We propose a supervised training procedure for RSNNs, where a second network is introduced only during training.
The proposed training procedure consists of generating targets for both recurrent and readout layers.
We demonstrate the improved performance and noise robustness of the proposed full-FORCE training procedure to model 8 dynamical systems.
arXiv Detail & Related papers (2022-05-26T19:01:19Z) - The Impact of Reinitialization on Generalization in Convolutional Neural
Networks [3.462210753108297]
We study the impact of different reinitialization methods in several convolutional architectures across 12 benchmark image classification datasets.
We introduce a new layerwise reinitialization algorithm that outperforms previous methods.
Our takeaway message is that the accuracy of convolutional neural networks can be improved for small datasets using bottom-up layerwise reinitialization.
arXiv Detail & Related papers (2021-09-01T09:25:57Z) - Over-and-Under Complete Convolutional RNN for MRI Reconstruction [57.95363471940937]
Recent deep learning-based methods for MR image reconstruction usually leverage a generic auto-encoder architecture.
We propose an Over-and-Under Complete Convolutional Recurrent Neural Network (OUCR), which consists of an overcomplete and an undercomplete Convolutional Recurrent Neural Network (CRNN).
The proposed method achieves significant improvements over compressed sensing and popular deep learning-based methods with fewer trainable parameters.
arXiv Detail & Related papers (2021-06-16T15:56:34Z) - Enabling Incremental Training with Forward Pass for Edge Devices [0.0]
We introduce a method using an evolutionary strategy (ES) that can partially retrain the network, enabling it to adapt to changes and recover after an error has occurred.
This technique enables training on an inference-only hardware without the need to use backpropagation and with minimal resource overhead.
arXiv Detail & Related papers (2021-03-25T17:43:04Z) - Improving Computational Efficiency in Visual Reinforcement Learning via
Stored Embeddings [89.63764845984076]
We present Stored Embeddings for Efficient Reinforcement Learning (SEER).
SEER is a simple modification of existing off-policy deep reinforcement learning methods.
We show that SEER does not degrade the performance of RL agents while significantly saving computation and memory.
arXiv Detail & Related papers (2021-03-04T08:14:10Z) - Accelerated MRI with Un-trained Neural Networks [29.346778609548995]
We address the reconstruction problem arising in accelerated MRI with un-trained neural networks.
We propose a highly optimized un-trained recovery approach based on a variation of the Deep Decoder.
We find that our un-trained algorithm achieves similar performance to a baseline trained neural network, but a state-of-the-art trained network outperforms the un-trained one.
arXiv Detail & Related papers (2020-07-06T00:01:25Z) - Retrospective Loss: Looking Back to Improve Training of Deep Neural
Networks [15.329684157845872]
We introduce a new retrospective loss to improve the training of deep neural network models.
Minimizing the retrospective loss, along with the task-specific loss, pushes the parameter state at the current training step towards the optimal parameter state.
Although it is a simple idea, we analyze the method and conduct comprehensive sets of experiments across domains.
arXiv Detail & Related papers (2020-06-24T10:16:36Z) - Multi-fidelity Neural Architecture Search with Knowledge Distillation [69.09782590880367]
We propose a Bayesian multi-fidelity method for neural architecture search: MF-KD.
Knowledge distillation adds a term to the loss function that forces a network to mimic some teacher network.
We show that training for a few epochs with such a modified loss function leads to a better selection of neural architectures than training for a few epochs with a logistic loss.
arXiv Detail & Related papers (2020-06-15T12:32:38Z)
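The last entry above describes knowledge distillation as adding a term to the loss that forces a network to mimic a teacher. A generic distillation objective of that kind is sketched below in PyTorch; the temperature `T` and mixing weight `lam` are illustrative choices, not necessarily the exact MF-KD formulation.

```python
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, targets, T=4.0, lam=0.5):
    """Task loss plus a term pushing the student to mimic the teacher.

    Generic knowledge-distillation objective of the kind the MF-KD summary
    describes; T (temperature) and lam (mixing weight) are illustrative.
    """
    ce = F.cross_entropy(student_logits, targets)
    kd = F.kl_div(
        F.log_softmax(student_logits / T, dim=1),
        F.log_softmax(teacher_logits / T, dim=1),
        reduction="batchmean",
        log_target=True,
    ) * (T * T)                 # standard temperature scaling of the KD term
    return (1.0 - lam) * ce + lam * kd
```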
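Further up the list, "A Coefficient Makes SVRG Effective" argues that the variance-reduction term in SVRG should be scaled down, and scaled further as training progresses. A minimal NumPy sketch of one SVRG outer iteration with such a coefficient `alpha` is given below; the per-example gradient function `grad_i` and any schedule for `alpha` are placeholders rather than the paper's exact algorithm.

```python
import numpy as np

def svrg_epoch(w, grad_i, n, lr, alpha):
    """One SVRG outer iteration with a coefficient `alpha` scaling the
    variance-reduction term (alpha=1 recovers plain SVRG; the entry above
    suggests alpha < 1, decaying over training, for deeper networks).

    grad_i(w, i) returns the gradient of the i-th example's loss at w.
    """
    w_snap = w.copy()
    # full gradient at the snapshot parameters
    mu = np.mean([grad_i(w_snap, i) for i in range(n)], axis=0)
    for i in np.random.permutation(n):
        # stochastic gradient corrected by a scaled control variate
        g = grad_i(w, i) - alpha * (grad_i(w_snap, i) - mu)
        w = w - lr * g
    return w
```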