Adma: A Flexible Loss Function for Neural Networks
- URL: http://arxiv.org/abs/2007.12499v1
- Date: Thu, 23 Jul 2020 02:41:09 GMT
- Title: Adma: A Flexible Loss Function for Neural Networks
- Authors: Aditya Shrivastava
- Abstract summary: We propose that loss functions, rather than being the static plug-ins they currently are, should be flexible by default.
A flexible loss function can be a more insightful navigator for neural networks, leading to higher convergence rates.
We introduce a novel flexible loss function for neural networks.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Greatly increased interest in Artificial Neural Networks (ANNs) has resulted
in impressively wide-ranging improvements to their structure. In this work, we
propose that loss functions, rather than being the static plug-ins they
currently are, should be flexible by default. A flexible loss function can be a
more insightful navigator for neural networks, leading to higher convergence
rates and therefore reaching the optimum accuracy more quickly. The insights
needed to decide the degree of flexibility can be derived from the complexity
of the ANN, the data distribution, the selection of hyper-parameters, and so
on. Accordingly, we introduce a novel flexible loss function for neural
networks. The function is shown to exhibit a range of fundamentally unique
properties of which the properties of other loss functions are only a subset,
and varying the flexibility parameter allows the function to emulate the loss
curves and learning behavior of prevalent static loss functions. Extensive
experimentation with the loss function demonstrates that it achieves
state-of-the-art performance on selected data sets. Thus, both the idea of
flexibility itself and the proposed function built upon it have the potential
to open an interesting new chapter in deep learning research.
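To make the idea of a flexibility parameter concrete, below is a minimal illustrative sketch of a parameterized regression loss. It does not reproduce the Adma formula from the paper; the single parameter `alpha` and the interpolation between MAE-like and MSE-like behavior are assumptions chosen purely for illustration.

```python
import torch

def flexible_loss(pred, target, alpha=1.5):
    """Illustrative parameterized loss (not the Adma formula from the paper).

    alpha controls the shape of the loss curve:
      alpha = 1.0 behaves like mean absolute error,
      alpha = 2.0 behaves like mean squared error,
      intermediate values interpolate between the two.
    """
    return torch.mean(torch.abs(pred - target) ** alpha)

# Sweep the flexibility parameter and compare the resulting loss values.
pred = torch.linspace(-2.0, 2.0, steps=5)
target = torch.zeros(5)
for alpha in (1.0, 1.5, 2.0):
    print(alpha, flexible_loss(pred, target, alpha).item())
```

Varying `alpha` changes how strongly large errors are penalized, which is the kind of emulation of different static loss curves that the abstract describes.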
Related papers
- Empirical Loss Landscape Analysis of Neural Network Activation Functions [0.0]
Activation functions play a significant role in neural network design by enabling non-linearity.
This study empirically investigates neural network loss landscapes associated with hyperbolic tangent, rectified linear unit, and exponential linear unit activation functions.
arXiv Detail & Related papers (2023-06-28T10:46:14Z)
- Evaluating the Impact of Loss Function Variation in Deep Learning for Classification [0.0]
The loss function is arguably among the most important hyper-parameters for a neural network.
We consider deep neural networks in a supervised classification setting and analyze the impact the choice of loss function has on the training result.
While certain loss functions perform suboptimally, our work empirically shows that under-represented losses can outperform the state-of-the-art choices significantly.
arXiv Detail & Related papers (2022-10-28T09:10:10Z)
- Adaptive Self-supervision Algorithms for Physics-informed Neural Networks [59.822151945132525]
Physics-informed neural networks (PINNs) incorporate physical knowledge from the problem domain as a soft constraint on the loss function.
We study the impact of the location of the collocation points on the trainability of these models.
We propose a novel adaptive collocation scheme which progressively allocates more collocation points to areas where the model is making higher errors.
arXiv Detail & Related papers (2022-07-08T18:17:06Z)
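The adaptive collocation idea summarized in the entry above can be illustrated with a small sketch: resample collocation points in proportion to the current residual so that high-error regions receive more points. The `residual` function and the Gaussian jitter used here are placeholders, not the scheme from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

def residual(x):
    # Stand-in for a PINN's PDE residual; a real model would evaluate the
    # physics loss at each collocation point.
    return np.sin(3.0 * x) ** 2

# Start from uniformly drawn collocation points on the domain [0, 1].
points = rng.uniform(0.0, 1.0, size=1000)

for step in range(5):
    err = residual(points)
    # Draw new points in proportion to the current error, so regions where
    # the model errs more receive more collocation points.
    probs = err / err.sum()
    idx = rng.choice(len(points), size=len(points), p=probs)
    points = np.clip(points[idx] + rng.normal(0.0, 0.01, size=len(points)), 0.0, 1.0)
```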
- Data-Driven Learning of Feedforward Neural Networks with Different Activation Functions [0.0]
This work contributes to the development of a new data-driven method (D-DM) of feedforward neural networks (FNNs) learning.
arXiv Detail & Related papers (2021-07-04T18:20:27Z)
- Non-Gradient Manifold Neural Network [79.44066256794187]
A deep neural network (DNN) generally takes thousands of iterations to optimize via gradient descent.
We propose a novel manifold neural network based on non-gradient optimization.
arXiv Detail & Related papers (2021-06-15T06:39:13Z)
- The Connection Between Approximation, Depth Separation and Learnability in Neural Networks [70.55686685872008]
We study the connection between learnability and approximation capacity.
We show that learnability with deep networks of a target function depends on the ability of simpler classes to approximate the target.
arXiv Detail & Related papers (2021-01-31T11:32:30Z)
- Topological obstructions in neural networks learning [67.8848058842671]
We study global properties of the gradient flow of the loss function.
We use topological data analysis of the loss function and its Morse complex to relate local behavior along gradient trajectories with global properties of the loss surface.
arXiv Detail & Related papers (2020-12-31T18:53:25Z)
- A Use of Even Activation Functions in Neural Networks [0.35172332086962865]
We propose an alternative approach to integrate existing knowledge or hypotheses of data structure by constructing custom activation functions.
We show that using an even activation function in one of the fully connected layers improves neural network performance.
arXiv Detail & Related papers (2020-11-23T20:33:13Z)
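To illustrate what an even activation in a fully connected layer looks like, here is a minimal sketch. The choice of f(x) = x^2 and the layer sizes are assumptions for illustration; the cited paper's specific activation and architecture may differ.

```python
import torch
import torch.nn as nn

class EvenActivationMLP(nn.Module):
    """Tiny MLP whose second hidden layer uses an even activation f(x) = x^2.

    Any function with f(-x) = f(x) qualifies as 'even'; x^2 is used here only
    as an example.
    """
    def __init__(self, in_dim=10, hidden=32, out_dim=2):
        super().__init__()
        self.fc1 = nn.Linear(in_dim, hidden)
        self.fc2 = nn.Linear(hidden, hidden)
        self.fc3 = nn.Linear(hidden, out_dim)

    def forward(self, x):
        x = torch.relu(self.fc1(x))
        x = torch.square(self.fc2(x))  # even activation applied here
        return self.fc3(x)

model = EvenActivationMLP()
print(model(torch.randn(4, 10)).shape)  # torch.Size([4, 2])
```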
- Flexible Transmitter Network [84.90891046882213]
Current neural networks are mostly built upon the MP model, which usually formulates the neuron as executing an activation function on the real-valued weighted aggregation of signals received from other neurons.
We propose the Flexible Transmitter (FT) model, a novel bio-plausible neuron model with flexible synaptic plasticity.
We present the Flexible Transmitter Network (FTNet), which is built on the most common fully-connected feed-forward architecture.
arXiv Detail & Related papers (2020-04-08T06:55:12Z)
- Rational neural networks [3.4376560669160394]
We consider neural networks with rational activation functions.
We prove that rational neural networks approximate smooth functions more efficiently than ReLU networks, requiring exponentially smaller depth.
arXiv Detail & Related papers (2020-04-04T10:36:11Z)
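A rational activation, as in the entry above, replaces a fixed nonlinearity with a trainable ratio of polynomials. The sketch below is a rough illustration: the polynomial degrees, the positivity trick for the denominator, and the initial coefficients are all assumptions, not the construction from the cited paper.

```python
import torch
import torch.nn as nn

class RationalActivation(nn.Module):
    """Trainable rational activation p(x) / q(x) with learnable coefficients."""

    def __init__(self):
        super().__init__()
        self.p = nn.Parameter(torch.tensor([0.0, 1.0, 0.0, 0.1]))  # numerator coefficients (degree 3)
        self.q = nn.Parameter(torch.tensor([0.0, 0.1]))            # denominator coefficients (degree 2)

    def forward(self, x):
        num = self.p[0] + self.p[1] * x + self.p[2] * x**2 + self.p[3] * x**3
        # Keep the denominator strictly positive so the activation has no poles.
        den = 1.0 + torch.abs(self.q[0] * x + self.q[1] * x**2)
        return num / den

act = RationalActivation()
print(act(torch.linspace(-3.0, 3.0, 7)))
```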
- Beyond Dropout: Feature Map Distortion to Regularize Deep Neural Networks [107.77595511218429]
In this paper, we investigate the empirical Rademacher complexity related to intermediate layers of deep neural networks.
We propose a feature distortion method (Disout) for addressing the aforementioned problem.
The superiority of the proposed feature map distortion for producing deep neural networks with higher test performance is analyzed and demonstrated.
arXiv Detail & Related papers (2020-02-23T13:59:13Z)
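Disout's exact distortion rule is not reproduced in the summary above. As a rough illustration of the general idea of perturbing intermediate feature maps during training (rather than zeroing units as dropout does), here is a sketch; the selection probability and noise scale are illustrative assumptions.

```python
import torch
import torch.nn as nn

class FeatureDistortion(nn.Module):
    """Illustrative feature-map distortion layer (not the Disout algorithm).

    During training, randomly selected feature-map elements receive additive
    noise instead of being zeroed out as in dropout.
    """
    def __init__(self, prob=0.1, noise_scale=0.5):
        super().__init__()
        self.prob = prob
        self.noise_scale = noise_scale

    def forward(self, x):
        if not self.training:
            return x  # no distortion at evaluation time
        mask = (torch.rand_like(x) < self.prob).float()
        noise = torch.randn_like(x) * self.noise_scale
        return x + mask * noise

layer = FeatureDistortion()
layer.train()
print(layer(torch.randn(2, 8, 4, 4)).shape)  # e.g., a conv feature map
```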
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information provided and is not responsible for any consequences arising from its use.