Next Generation Loss Function for Image Classification
- URL: http://arxiv.org/abs/2404.12948v1
- Date: Fri, 19 Apr 2024 15:26:36 GMT
- Title: Next Generation Loss Function for Image Classification
- Authors: Shakhnaz Akhmedova, Nils Körber
- Abstract summary: We experimentally challenge the well-known loss functions, including cross entropy (CE) loss, by utilizing the genetic programming (GP) approach.
One function, denoted as Next Generation Loss (NGL), clearly stood out, showing the same or better performance on all tested datasets.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Neural networks are trained by minimizing a loss function that defines the discrepancy between the predicted model output and the target value. The selection of the loss function is crucial for achieving task-specific behaviour and strongly influences the capability of the model. A variety of loss functions have been proposed for a wide range of tasks, affecting training and model performance. For classification tasks, cross entropy is the de facto standard and usually the first choice. Here, we experimentally challenge the well-known loss functions, including the cross entropy (CE) loss, by utilizing the genetic programming (GP) approach, a population-based evolutionary algorithm. GP constructs loss functions from a set of operators and leaf nodes, and these functions are repeatedly recombined and mutated to find an optimal structure. Experiments were carried out on the small-sized datasets CIFAR-10, CIFAR-100 and Fashion-MNIST using an Inception model. The five best functions found were evaluated with different model architectures on a set of standard datasets ranging from 2 to 102 classes and of very different sizes. One function, denoted the Next Generation Loss (NGL), clearly stood out, showing the same or better performance than CE on all tested datasets. To evaluate the NGL function on a large-scale dataset, we tested its performance on ImageNet-1k, where it showed improved top-1 accuracy compared to models trained with identical settings and other losses. Finally, NGL was applied to a downstream segmentation task on the Pascal VOC 2012 and COCO-Stuff164k datasets, improving the underlying model performance.
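The abstract describes the search recipe (loss functions built from operators and leaf nodes, then repeatedly recombined and mutated) but does not spell out the discovered NGL expression. Purely as a hedged illustration of what such a GP loop could look like, the sketch below evolves loss-expression trees over a toy primitive set; the operator and leaf choices, the population settings, and the stubbed fitness are assumptions for illustration only. In the paper's actual setup, each candidate loss is scored by training an Inception model and measuring validation accuracy, which is far more expensive than the sanity check used here.

```python
import random
import numpy as np

# Illustrative primitive sets; the operator/leaf sets used in the paper may differ.
BINARY = {"add": np.add, "sub": np.subtract, "mul": np.multiply}
UNARY = {"neg": np.negative,
         "log": lambda a: np.log(np.clip(a, 1e-12, None)),
         "square": np.square}
LEAVES = ["y", "p", "one"]  # one-hot target, softmax output, constant 1


def random_tree(depth=3):
    """Grow a random loss-expression tree."""
    if depth == 0 or random.random() < 0.3:
        return random.choice(LEAVES)
    if random.random() < 0.5:
        return [random.choice(list(UNARY)), random_tree(depth - 1)]
    return [random.choice(list(BINARY)), random_tree(depth - 1), random_tree(depth - 1)]


def evaluate(tree, y, p):
    """Evaluate a tree element-wise on targets y and predicted probabilities p."""
    if tree == "y":
        return y
    if tree == "p":
        return p
    if tree == "one":
        return np.ones_like(p)
    op, *args = tree
    fn = UNARY.get(op) or BINARY.get(op)
    return fn(*(evaluate(a, y, p) for a in args))


def mutate(tree):
    """Replace random subtrees with freshly grown ones.
    Crossover (subtree exchange between two parents) is omitted for brevity."""
    if isinstance(tree, str) or random.random() < 0.3:
        return random_tree(depth=2)
    return [tree[0]] + [mutate(a) for a in tree[1:]]


def fitness(tree):
    """Stand-in fitness: check the candidate yields a finite loss on random data.
    The paper instead trains a small model with the candidate loss and uses its
    validation accuracy as the fitness signal."""
    y = np.eye(10)[np.random.randint(0, 10, size=64)]  # one-hot targets
    p = np.random.dirichlet(np.ones(10), size=64)      # softmax-like outputs
    loss = float(np.mean(np.sum(evaluate(tree, y, p), axis=1)))
    return -abs(loss) if np.isfinite(loss) else -np.inf


# Minimal (mu + lambda)-style evolutionary loop.
population = [random_tree() for _ in range(20)]
for _ in range(10):
    population.sort(key=fitness, reverse=True)
    survivors = population[:10]
    population = survivors + [mutate(random.choice(survivors)) for _ in range(10)]
population.sort(key=fitness, reverse=True)
print("best candidate expression:", population[0])
```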
Related papers
- Accelerated Neural Network Training with Rooted Logistic Objectives [13.400503928962756]
We derive a novel sequence of strictly convex functions that are at least as strict as the logistic loss.
Our results illustrate that training with the rooted loss function converges faster and yields performance improvements.
arXiv Detail & Related papers (2023-10-05T20:49:48Z) - Alternate Loss Functions for Classification and Robust Regression Can Improve the Accuracy of Artificial Neural Networks [6.452225158891343]
This paper shows that the training speed and final accuracy of neural networks can depend significantly on the loss function used to train them.
Two new classification loss functions that significantly improve performance on a wide variety of benchmark tasks are proposed.
arXiv Detail & Related papers (2023-03-17T12:52:06Z) - Xtreme Margin: A Tunable Loss Function for Binary Classification Problems [0.0]
We provide an overview of a novel loss function, the Xtreme Margin loss function.
Unlike the binary cross-entropy and hinge loss functions, this loss function provides researchers and practitioners with flexibility in their training process.
arXiv Detail & Related papers (2022-10-31T22:39:32Z) - Adaptive Self-supervision Algorithms for Physics-informed Neural Networks [59.822151945132525]
Physics-informed neural networks (PINNs) incorporate physical knowledge from the problem domain as a soft constraint on the loss function.
We study the impact of the location of the collocation points on the trainability of these models.
We propose a novel adaptive collocation scheme which progressively allocates more collocation points to areas where the model is making higher errors.
arXiv Detail & Related papers (2022-07-08T18:17:06Z) - Self-Supervised Pre-Training for Transformer-Based Person Re-Identification [54.55281692768765]
Transformer-based supervised pre-training achieves great performance in person re-identification (ReID).
Due to the domain gap between ImageNet and ReID datasets, a larger pre-training dataset is usually needed to boost performance.
This work aims to mitigate the gap between the pre-training and ReID datasets from the perspective of data and model structure.
arXiv Detail & Related papers (2021-11-23T18:59:08Z) - Adversarial Feature Augmentation and Normalization for Visual Recognition [109.6834687220478]
Recent advances in computer vision take advantage of adversarial data augmentation to improve the generalization ability of classification models.
Here, we present an effective and efficient alternative that advocates adversarial augmentation on intermediate feature embeddings.
We validate the proposed approach across diverse visual recognition tasks with representative backbone networks.
arXiv Detail & Related papers (2021-03-22T20:36:34Z) - Loss Function Discovery for Object Detection via Convergence-Simulation Driven Search [101.73248560009124]
We propose an effective convergence-simulation driven evolutionary search algorithm, CSE-Autoloss, for speeding up the search progress.
We conduct extensive evaluations of loss function search on popular detectors and validate the good generalization capability of searched losses.
Our experiments show that the best-discovered loss function combinations outperform the default combinations by 1.1% and 0.8% in terms of mAP for two-stage and one-stage detectors, respectively.
arXiv Detail & Related papers (2021-02-09T08:34:52Z) - FastIF: Scalable Influence Functions for Efficient Model Interpretation and Debugging [112.19994766375231]
Influence functions approximate the 'influences' of training data points on test predictions.
We present FastIF, a set of simple modifications to influence functions that significantly improves their run-time.
Our experiments demonstrate the potential of influence functions in model interpretation and correcting model errors.
arXiv Detail & Related papers (2020-12-31T18:02:34Z) - Deep F-measure Maximization for End-to-End Speech Understanding [52.36496114728355]
We propose a differentiable approximation to the F-measure and train the network with this objective using standard backpropagation.
We perform experiments on two standard fairness datasets, Adult and Communities and Crime, as well as on speech-to-intent detection on the ATIS dataset and speech-to-image concept classification on the Speech-COCO dataset.
In all four of these tasks, training with the F-measure objective results in improved micro-F1 scores, with absolute improvements of up to 8% compared to models trained with the cross-entropy loss function.
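The entry above turns on making the F-measure differentiable so it can be optimized by backpropagation. A common way to do that, shown here as a minimal PyTorch sketch rather than the paper's exact formulation, is to replace hard true/false positive counts with their probability-weighted analogues when computing the F1 score:

```python
import torch


def soft_f1_loss(probs: torch.Tensor, targets: torch.Tensor, eps: float = 1e-8) -> torch.Tensor:
    """Differentiable surrogate for the micro-F1 score.

    probs:   (batch, classes) predicted probabilities (softmax or sigmoid outputs)
    targets: (batch, classes) one-hot or multi-hot ground truth
    Hard counts of true/false positives and false negatives are replaced by soft,
    probability-weighted sums, so the expression stays differentiable end to end.
    """
    tp = (probs * targets).sum()
    fp = (probs * (1.0 - targets)).sum()
    fn = ((1.0 - probs) * targets).sum()
    f1 = 2.0 * tp / (2.0 * tp + fp + fn + eps)
    return 1.0 - f1  # minimizing this maximizes the soft F1


# Usage with an arbitrary classifier head.
logits = torch.randn(8, 5, requires_grad=True)
targets = torch.nn.functional.one_hot(torch.randint(0, 5, (8,)), num_classes=5).float()
loss = soft_f1_loss(torch.softmax(logits, dim=-1), targets)
loss.backward()
```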
arXiv Detail & Related papers (2020-08-08T03:02:27Z) - Optimizing Loss Functions Through Multivariate Taylor Polynomial Parameterization [16.8615211682877]
Loss functions are a type of metaknowledge that is crucial to effective training of deep neural network (DNN) architectures.
This paper proposes continuous CMA-ES optimization of Taylor parameterizations.
In MNIST, CIFAR-10, and SVHN benchmark tasks, TaylorGLO finds new loss functions that outperform functions previously discovered through GP.
arXiv Detail & Related papers (2020-01-31T21:25:37Z)
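TaylorGLO replaces the tree-based GP search with a fixed functional form: the loss is a multivariate Taylor polynomial whose coefficients are tuned by CMA-ES. The sketch below shows one plausible third-order parameterization in the target y and prediction p; the degree, expansion point, and coefficient layout are illustrative assumptions, and the expensive step of training a model to score each coefficient vector is omitted.

```python
import numpy as np


def taylor_loss(theta, y, p, center=(0.5, 0.5)):
    """Third-order bivariate Taylor polynomial in (target y, prediction p).

    theta is a flat coefficient vector; a black-box optimizer such as CMA-ES
    searches over theta, scoring each candidate by the validation accuracy of
    a model trained with the resulting loss (omitted here).
    """
    a, b = y - center[0], p - center[1]
    terms = [np.ones_like(a), a, b,                 # constant and linear terms
             a * a, a * b, b * b,                   # quadratic terms
             a ** 3, a * a * b, a * b * b, b ** 3]  # cubic terms
    per_class = sum(c * t for c, t in zip(theta, terms))
    return float(np.mean(np.sum(per_class, axis=-1)))


# Evaluate one random coefficient vector on random one-hot targets and softmax outputs.
rng = np.random.default_rng(0)
theta = rng.normal(size=10)
y = np.eye(10)[rng.integers(0, 10, size=32)]
p = rng.dirichlet(np.ones(10), size=32)
print(taylor_loss(theta, y, p))
```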