Hybridised Loss Functions for Improved Neural Network Generalisation
- URL: http://arxiv.org/abs/2204.12244v1
- Date: Tue, 26 Apr 2022 11:52:11 GMT
- Title: Hybridised Loss Functions for Improved Neural Network Generalisation
- Authors: Matthew C. Dickson, Anna S. Bosman and Katherine M. Malan
- Abstract summary: Loss functions play an important role in the training of artificial neural networks (ANNs).
It has been shown that the cross entropy and sum squared error loss functions result in different training dynamics.
A hybrid of the cross entropy and sum squared error loss functions could combine the advantages of the two functions, while limiting their disadvantages.
- Score: 0.0
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Loss functions play an important role in the training of artificial neural
networks (ANNs), and can affect the generalisation ability of the ANN model,
among other properties. Specifically, it has been shown that the cross entropy
and sum squared error loss functions result in different training dynamics, and
exhibit different properties that are complementary to one another. It has
previously been suggested that a hybrid of the cross entropy and sum squared error
loss functions could combine the advantages of the two functions, while
limiting their disadvantages. The effectiveness of such hybrid loss functions
is investigated in this study. It is shown that hybridisation of the two loss
functions improves the generalisation ability of the ANNs on all problems
considered. The hybrid loss function that starts training with the sum squared
error loss function and later switches to the cross entropy error loss function
is shown to either perform the best on average, or to not be significantly
different than the best loss function tested for all problems considered. This
study shows that the minima discovered by the sum squared error loss function
can be further exploited by switching to the cross entropy loss function. It
can thus be concluded that hybridisation of the two loss functions could lead
to better performance in ANNs.
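As an illustration of the switching scheme described above, the sketch below trains with a sum squared error loss for the first part of training and hands over to cross entropy afterwards. It is a minimal sketch, not the authors' code: the architecture, data, learning rate and switch epoch are arbitrary placeholders, and applying the SSE loss to softmax outputs against one-hot targets is only one plausible reading of the setup.

```python
import torch
import torch.nn as nn

# Illustrative placeholders only; not the paper's configuration.
SWITCH_EPOCH, NUM_EPOCHS, NUM_CLASSES = 20, 60, 3

# Tiny synthetic classification problem so the sketch runs end to end.
X = torch.randn(256, 8)
y = torch.randint(0, NUM_CLASSES, (256,))

model = nn.Sequential(nn.Linear(8, 16), nn.ReLU(), nn.Linear(16, NUM_CLASSES))
optimizer = torch.optim.SGD(model.parameters(), lr=0.05)

sse = nn.MSELoss(reduction="sum")   # sum squared error over the batch
ce = nn.CrossEntropyLoss()          # cross entropy on raw logits

def hybrid_loss(logits, targets, epoch):
    """SSE against one-hot targets early in training, cross entropy afterwards."""
    if epoch < SWITCH_EPOCH:
        one_hot = nn.functional.one_hot(targets, NUM_CLASSES).float()
        return sse(torch.softmax(logits, dim=1), one_hot)
    return ce(logits, targets)

for epoch in range(NUM_EPOCHS):
    optimizer.zero_grad()
    loss = hybrid_loss(model(X), y, epoch)
    loss.backward()
    optimizer.step()
```

The switch epoch is the main free choice here; the study's finding is that this SSE-first, CE-later ordering either performs best on average or is not significantly different from the best single loss tested.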
Related papers
- Enhancing Hypergradients Estimation: A Study of Preconditioning and Reparameterization [49.73341101297818]
Bilevel optimization aims to optimize an outer objective function that depends on the solution to an inner optimization problem.
The conventional method to compute the so-called hypergradient of the outer problem is to use the Implicit Function Theorem (IFT).
We study the error of the IFT method and analyze two strategies to reduce this error.
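For orientation, the hypergradient that the IFT approach computes has the standard form below; this is a textbook identity under generic smoothness assumptions, not a result quoted from the paper. The inner problem is $\theta^*(\lambda)=\arg\min_\theta g(\theta,\lambda)$ and the outer objective is $F(\lambda)=f(\theta^*(\lambda),\lambda)$:

```latex
% Standard IFT hypergradient (background identity, not taken from the paper):
\nabla_\lambda F(\lambda)
  = \nabla_\lambda f(\theta^*, \lambda)
  - \nabla^2_{\lambda\theta} g(\theta^*, \lambda)\,
    \big[\nabla^2_{\theta\theta} g(\theta^*, \lambda)\big]^{-1}
    \nabla_\theta f(\theta^*, \lambda),
\qquad \theta^* = \theta^*(\lambda).
```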
arXiv Detail & Related papers (2024-02-26T17:09:18Z)
- Alternate Loss Functions for Classification and Robust Regression Can Improve the Accuracy of Artificial Neural Networks [6.452225158891343]
This paper shows that training speed and final accuracy of neural networks can significantly depend on the loss function used to train neural networks.
Two new classification loss functions that significantly improve performance on a wide variety of benchmark tasks are proposed.
arXiv Detail & Related papers (2023-03-17T12:52:06Z)
- A survey and taxonomy of loss functions in machine learning [60.41650195728953]
Most state-of-the-art machine learning techniques revolve around the optimisation of loss functions.
This survey aims to provide a reference of the most essential loss functions for both beginner and advanced machine learning practitioners.
arXiv Detail & Related papers (2023-01-13T14:38:24Z)
- Xtreme Margin: A Tunable Loss Function for Binary Classification Problems [0.0]
We provide an overview of a novel loss function, the Xtreme Margin loss function.
Unlike the binary cross-entropy and the hinge loss functions, this loss function provides researchers and practitioners flexibility with their training process.
arXiv Detail & Related papers (2022-10-31T22:39:32Z)
- Evaluating the Impact of Loss Function Variation in Deep Learning for Classification [0.0]
The loss function is arguably among the most important hyperparameters of a neural network.
We consider deep neural networks in a supervised classification setting and analyze the impact the choice of loss function has on the training result.
While certain loss functions perform suboptimally, our work empirically shows that under-represented losses can significantly outperform the state-of-the-art choices.
arXiv Detail & Related papers (2022-10-28T09:10:10Z)
- The Geometry and Calculus of Losses [10.451984251615512]
We develop the theory of loss functions for binary and multiclass classification and class probability estimation problems.
The perspective provides three novel opportunities.
First, it enables the development of a fundamental relationship between losses and (anti)-norms that appears not to have been noticed before.
Second, it enables the development of a calculus of losses induced by the calculus of convex sets.
Third, the perspective leads to a natural theory of "polar" loss functions, which are derived from the polar dual of the convex set defining the loss.
arXiv Detail & Related papers (2022-09-01T05:57:19Z)
- Inference on Strongly Identified Functionals of Weakly Identified Functions [71.42652863687117]
We study a novel condition for the functional to be strongly identified even when the nuisance function is not.
We propose penalized minimax estimators for both the primary and debiasing nuisance functions.
arXiv Detail & Related papers (2022-08-17T13:38:31Z)
- Loss Function Discovery for Object Detection via Convergence-Simulation Driven Search [101.73248560009124]
We propose an effective convergence-simulation driven evolutionary search algorithm, CSE-Autoloss, for speeding up the search progress.
We conduct extensive evaluations of loss function search on popular detectors and validate the good generalization capability of searched losses.
Our experiments show that the best-discovered loss function combinations outperform default combinations by 1.1% and 0.8% in terms of mAP for two-stage and one-stage detectors.
arXiv Detail & Related papers (2021-02-09T08:34:52Z)
- $\sigma^2$R Loss: a Weighted Loss by Multiplicative Factors using Sigmoidal Functions [0.9569316316728905]
We introduce a new loss function called squared reduction loss ($\sigma^2$R loss), which is regulated by a sigmoid function to inflate/deflate the error per instance.
Our loss has a clear intuition and geometric interpretation, and we demonstrate its effectiveness through experiments.
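The summary above only states that a sigmoid multiplicatively inflates or deflates the per-instance error. The sketch below is one plausible reading of that idea, not the paper's exact definition; the parameters `alpha` and `tau` are hypothetical and control the steepness of the sigmoid and the error level at which the multiplicative factor crosses 1.

```python
import torch

def sigma2r_like_loss(pred, target, alpha=1.0, tau=0.5):
    """Sigmoid-modulated squared loss, sketched from the abstract's description.

    NOT the paper's exact formulation: each instance's squared error is scaled
    by a sigmoidal factor in (0, 2), inflating errors above the hypothetical
    threshold `tau` and deflating those below it.
    """
    sq_err = (pred - target) ** 2                         # per-instance squared error
    factor = 2.0 * torch.sigmoid(alpha * (sq_err - tau))  # multiplicative weight
    return (factor * sq_err).mean()

# Example usage on dummy regression-style targets:
loss = sigma2r_like_loss(torch.randn(32), torch.randn(32))
```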
arXiv Detail & Related papers (2020-09-18T12:34:40Z)
- An Equivalence between Loss Functions and Non-Uniform Sampling in Experience Replay [72.23433407017558]
We show that any loss function evaluated with non-uniformly sampled data can be transformed into another uniformly sampled loss function.
Surprisingly, we find in some environments PER can be replaced entirely by this new loss function without impact to empirical performance.
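The identity below is the standard importance-weighting argument on which such an equivalence rests; it is stated here as background rather than quoted from the paper. For a replay buffer of N transitions sampled with probabilities p_i and per-transition loss ℓ(i):

```latex
% Background importance-weighting identity (not quoted from the paper):
\mathbb{E}_{i \sim p}\!\left[\ell(i)\right]
  = \sum_{i=1}^{N} p_i\,\ell(i)
  = \frac{1}{N}\sum_{i=1}^{N} N p_i\,\ell(i)
  = \mathbb{E}_{i \sim \mathrm{Uniform}\{1,\dots,N\}}\!\left[\tilde{\ell}(i)\right],
\qquad \tilde{\ell}(i) := N p_i\,\ell(i).
```

In expectation, prioritised sampling of ℓ therefore corresponds to uniform sampling of the reweighted loss ℓ̃.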
arXiv Detail & Related papers (2020-07-12T17:45:24Z)
- Approximation Schemes for ReLU Regression [80.33702497406632]
We consider the fundamental problem of ReLU regression.
The goal is to output the best-fitting ReLU with respect to square loss, given draws from some unknown distribution.
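As a concrete picture of the problem being approximated, the sketch below fits a single ReLU to samples from a planted model by plain gradient descent on the square loss. It is purely illustrative: the data, initialisation and optimiser are assumptions, and gradient descent is not the approximation scheme the paper analyses, nor is it guaranteed to reach the best-fitting ReLU.

```python
import torch

# Synthetic data from a planted ReLU; an illustrative setup, not the paper's.
torch.manual_seed(0)
d, n = 5, 2000
w_true = torch.randn(d)
X = torch.randn(n, d)
y = torch.relu(X @ w_true)

# Square-loss objective for a single ReLU: mean over draws of (relu(w.x) - y)^2.
w = torch.randn(d, requires_grad=True)   # non-zero init so gradients can flow through the ReLU
opt = torch.optim.SGD([w], lr=0.05)
for _ in range(500):
    opt.zero_grad()
    loss = ((torch.relu(X @ w) - y) ** 2).mean()
    loss.backward()
    opt.step()
```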
arXiv Detail & Related papers (2020-05-26T16:26:17Z)
This list is automatically generated from the titles and abstracts of the papers on this site.