Retrospective Loss: Looking Back to Improve Training of Deep Neural
Networks
- URL: http://arxiv.org/abs/2006.13593v1
- Date: Wed, 24 Jun 2020 10:16:36 GMT
- Title: Retrospective Loss: Looking Back to Improve Training of Deep Neural
Networks
- Authors: Surgan Jandial, Ayush Chopra, Mausoom Sarkar, Piyush Gupta, Balaji
Krishnamurthy, Vineeth Balasubramanian
- Abstract summary: We introduce a new retrospective loss to improve the training of deep neural network models.
Minimizing the retrospective loss, along with the task-specific loss, pushes the parameter state at the current training step towards the optimal parameter state.
Although the idea is simple, we analyze the method and conduct comprehensive sets of experiments across domains.
- Score: 15.329684157845872
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Deep neural networks (DNNs) are powerful learning machines that have enabled
breakthroughs in several domains. In this work, we introduce a new
retrospective loss to improve the training of deep neural network models by
utilizing the prior experience available in past model states during training.
Minimizing the retrospective loss, along with the task-specific loss, pushes
the parameter state at the current training step towards the optimal parameter
state while pulling it away from the parameter state at a previous training
step. Although the idea is simple, we analyze the method and conduct
comprehensive sets of experiments across domains - images, speech, text, and
graphs - to show that the proposed loss results in improved performance across
input domains, tasks, and architectures.
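The abstract describes the loss only in words, so the following is a minimal sketch rather than the authors' implementation. It assumes the retrospective term compares model outputs instead of raw parameters, uses an L1 distance, and weights the two parts with a hypothetical factor `kappa`; the function names and all numbers are illustrative.

```python
def l1_distance(a, b):
    """Sum of absolute differences between two equal-length vectors."""
    return sum(abs(x - y) for x, y in zip(a, b))


def retrospective_loss(current_out, past_out, target, kappa=2.0):
    """Assumed retrospective term: small when the current output is close to
    the target and far from the output of an earlier model state."""
    pull_to_target = l1_distance(current_out, target)
    push_from_past = l1_distance(current_out, past_out)
    return (kappa + 1.0) * pull_to_target - kappa * push_from_past


def total_loss(task_loss, current_out, past_out, target, kappa=2.0):
    """Task-specific loss plus the retrospective term (assumed simple sum)."""
    return task_loss + retrospective_loss(current_out, past_out, target, kappa)


# Illustrative values only; none of these come from the paper.
target = [1.0, 0.0]
past_out = [0.4, 0.6]      # output of a stored earlier checkpoint
near_target = [0.9, 0.1]   # current output that moved toward the target
near_past = [0.5, 0.5]     # current output that barely moved

loss_good = retrospective_loss(near_target, past_out, target)
loss_bad = retrospective_loss(near_past, past_out, target)
print(loss_good < loss_bad)  # True
```

Under these assumptions, an output that has moved toward the target and away from the earlier checkpoint scores lower than one that stayed near the checkpoint, which is the behaviour the abstract attributes to the loss.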
Related papers
- Efficient Training of Deep Neural Operator Networks via Randomized Sampling [0.0]
Deep operator network (DeepONet) has demonstrated success in the real-time prediction of complex dynamics across various scientific and engineering applications.
We introduce a random sampling technique to be adopted in the training of DeepONet, aimed at improving the generalization ability of the model while significantly reducing computational time.
Our results indicate that incorporating randomization in the trunk network inputs during training enhances the efficiency and robustness of DeepONet, offering a promising avenue for improving the framework's performance in modeling complex physical systems.
arXiv Detail & Related papers (2024-09-20T07:18:31Z)
- A Novel Method for improving accuracy in neural network by reinstating traditional back propagation technique [0.0]
We propose a novel instant parameter update methodology that eliminates the need for computing gradients at each layer.
Our approach accelerates learning, avoids the vanishing gradient problem, and outperforms state-of-the-art methods on benchmark data sets.
arXiv Detail & Related papers (2023-08-09T16:41:00Z)
- Example Forgetting: A Novel Approach to Explain and Interpret Deep Neural Networks in Seismic Interpretation [12.653673008542155]
Deep neural networks are an attractive component for the common interpretation pipeline.
Deep neural networks are frequently met with distrust due to their property of producing semantically incorrect outputs when exposed to sections the model was not trained on.
We introduce a method that effectively relates semantically malfunctioned predictions to their respective positions within the neural network representation manifold.
arXiv Detail & Related papers (2023-02-24T19:19:22Z)
- Learning to Learn with Generative Models of Neural Network Checkpoints [71.06722933442956]
We construct a dataset of neural network checkpoints and train a generative model on the parameters.
We find that our approach successfully generates parameters for a wide range of loss prompts.
We apply our method to different neural network architectures and tasks in supervised and reinforcement learning.
arXiv Detail & Related papers (2022-09-26T17:59:58Z)
- Recursive Least-Squares Estimator-Aided Online Learning for Visual Tracking [58.14267480293575]
We propose a simple yet effective online learning approach for few-shot online adaptation without requiring offline training.
It allows an in-built memory retention mechanism for the model to remember the knowledge about the object seen before.
We evaluate our approach based on two networks in the online learning families for tracking, i.e., multi-layer perceptrons in RT-MDNet and convolutional neural networks in DiMP.
arXiv Detail & Related papers (2021-12-28T06:51:18Z)
- Is Deep Image Prior in Need of a Good Education? [57.3399060347311]
Deep image prior was introduced as an effective prior for image reconstruction.
Despite its impressive reconstructive properties, the approach is slow when compared to learned or traditional reconstruction techniques.
We develop a two-stage learning paradigm to address the computational challenge.
arXiv Detail & Related papers (2021-11-23T15:08:26Z)
- Analytically Tractable Inference in Deep Neural Networks [0.0]
The Tractable Approximate Gaussian Inference (TAGI) algorithm was shown to be a viable and scalable alternative to backpropagation for shallow fully-connected neural networks.
We demonstrate how TAGI matches or exceeds the performance of backpropagation for training classic deep neural network architectures.
arXiv Detail & Related papers (2021-03-09T14:51:34Z)
- A ReLU Dense Layer to Improve the Performance of Neural Networks [40.2470651460466]
We propose ReDense as a simple and low complexity way to improve the performance of trained neural networks.
We experimentally show that ReDense can improve the training and testing performance of various neural network architectures.
arXiv Detail & Related papers (2020-10-22T11:56:01Z)
- Modeling from Features: a Mean-field Framework for Over-parameterized Deep Neural Networks [54.27962244835622]
This paper proposes a new mean-field framework for over-parameterized deep neural networks (DNNs).
In this framework, a DNN is represented by probability measures and functions over its features in the continuous limit.
We illustrate the framework via the standard DNN and the Residual Network (Res-Net) architectures.
arXiv Detail & Related papers (2020-07-03T01:37:16Z)
- Auto-Rectify Network for Unsupervised Indoor Depth Estimation [119.82412041164372]
We establish that the complex ego-motions exhibited in handheld settings are a critical obstacle for learning depth.
We propose a data pre-processing method that rectifies training images by removing their relative rotations for effective learning.
Our results outperform the previous unsupervised SOTA method by a large margin on the challenging NYUv2 dataset.
arXiv Detail & Related papers (2020-06-04T08:59:17Z)
- Binary Neural Networks: A Survey [126.67799882857656]
The binary neural network serves as a promising technique for deploying deep models on resource-limited devices.
The binarization inevitably causes severe information loss, and even worse, its discontinuity brings difficulty to the optimization of the deep network.
We present a survey of these algorithms, mainly categorized into the native solutions directly conducting binarization, and the optimized ones using techniques like minimizing the quantization error, improving the network loss function, and reducing the gradient error.
arXiv Detail & Related papers (2020-03-31T16:47:20Z)
This list is automatically generated from the titles and abstracts of the papers on this site.