Enhancing Cross Entropy with a Linearly Adaptive Loss Function for Optimized Classification Performance
- URL: http://arxiv.org/abs/2507.10574v1
- Date: Thu, 10 Jul 2025 16:38:57 GMT
- Title: Enhancing Cross Entropy with a Linearly Adaptive Loss Function for Optimized Classification Performance
- Authors: Jae Wan Shim
- Abstract summary: The Linearly Adaptive Cross Entropy Loss function is a novel measure derived from information theory. The proposed loss has an additional term that depends on the predicted probability of the true class. Preliminary results show that it consistently outperforms the standard cross entropy loss function in terms of classification accuracy.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We propose the Linearly Adaptive Cross Entropy Loss function, a novel measure derived from information theory. Compared with the standard cross entropy loss function, the proposed loss has an additional term that depends on the predicted probability of the true class. This feature enhances the optimization process in classification tasks involving one-hot encoded class labels. The proposed loss has been evaluated on a ResNet-based model using the CIFAR-100 dataset. Preliminary results show that it consistently outperforms the standard cross entropy loss function in terms of classification accuracy. Moreover, it maintains simplicity, achieving practically the same efficiency as the traditional cross entropy loss. These findings suggest that our approach could broaden the scope of future research into loss function design.
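The abstract does not spell out the extra term, so the following PyTorch sketch is only one plausible reading: standard cross entropy plus a term linear in the predicted true-class probability, with a hypothetical weight `alpha`.

```python
import torch
import torch.nn.functional as F

def linearly_adaptive_ce(logits: torch.Tensor, target: torch.Tensor,
                         alpha: float = 1.0) -> torch.Tensor:
    """Standard CE plus a term linear in the predicted true-class probability.

    The exact form of the paper's extra term is not given in the abstract;
    alpha * (1 - p_true) is one plausible instance, adding extra pressure
    when the true class is predicted with low probability.
    """
    log_probs = F.log_softmax(logits, dim=-1)                # (N, C)
    p_true = log_probs.gather(1, target.unsqueeze(1)).exp()  # (N, 1)
    ce = F.nll_loss(log_probs, target, reduction="none")     # (N,)
    return (ce + alpha * (1.0 - p_true.squeeze(1))).mean()

# usage: drop-in replacement for F.cross_entropy
logits = torch.randn(8, 100, requires_grad=True)  # e.g. CIFAR-100 logits
target = torch.randint(0, 100, (8,))
linearly_adaptive_ce(logits, target).backward()
```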
Related papers
- Self-Boost via Optimal Retraining: An Analysis via Approximate Message Passing [58.52119063742121]
Retraining a model using its own predictions together with the original, potentially noisy labels is a well-known strategy for improving model performance. This paper addresses the question of how to optimally combine the model's predictions and the provided labels. Our main contribution is the derivation of the Bayes optimal aggregator function to combine the current model's predictions and the given labels.
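The paper derives its aggregator via approximate message passing; as a hedged illustration only, the sketch below computes the Bayes posterior that combines a model's predicted probability with a binary label observed under symmetric noise with flip probability `eps` (our simplifying assumption, not the paper's setting).

```python
import torch

def aggregate_targets(p_model: torch.Tensor, noisy_label: torch.Tensor,
                      eps: float = 0.2) -> torch.Tensor:
    """Posterior P(y=1 | model prob, observed label) under symmetric label
    noise with flip probability eps -- a simple special case of a Bayes
    optimal aggregator, usable as a soft retraining target."""
    nl = noisy_label.float()
    like1 = nl * (1 - eps) + (1 - nl) * eps  # P(observed label | y = 1)
    like0 = nl * eps + (1 - nl) * (1 - eps)  # P(observed label | y = 0)
    return p_model * like1 / (p_model * like1 + (1 - p_model) * like0)

# usage: soft targets for the next retraining round
p_model = torch.tensor([0.9, 0.3, 0.6])
noisy_label = torch.tensor([0, 1, 1])
print(aggregate_targets(p_model, noisy_label))
```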
arXiv Detail & Related papers (2025-05-21T07:16:44Z)
- Next Generation Loss Function for Image Classification [0.0]
We experimentally challenge well-known loss functions, including the cross entropy (CE) loss, by utilizing a genetic programming (GP) approach.
One function, denoted as Next Generation Loss (NGL), clearly stood out, showing the same or better performance for all tested datasets.
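As a toy stand-in for the paper's genetic-programming search, the sketch below samples random combinations of loss primitives applied to the true-class probability and keeps the candidate with the best fitness on synthetic data; the primitives, the fitness proxy, and the search loop (no crossover or mutation) are all our simplifications.

```python
import random
import torch
import torch.nn.functional as F

# Hypothetical unary transforms of the true-class probability p; the paper's
# GP searches a much richer expression space.
PRIMITIVES = {
    "log":    lambda p: -torch.log(p + 1e-8),
    "square": lambda p: (1 - p) ** 2,
    "linear": lambda p: 1 - p,
}

def random_loss():
    """Sample a candidate loss as a random positive combination of primitives."""
    terms = random.sample(list(PRIMITIVES), k=random.randint(1, 3))
    weights = [random.uniform(0.1, 2.0) for _ in terms]
    def loss_fn(logits, target):
        p = F.softmax(logits, dim=-1).gather(1, target.unsqueeze(1)).squeeze(1)
        return sum(w * PRIMITIVES[t](p) for w, t in zip(weights, terms)).mean()
    loss_fn.desc = " + ".join(f"{w:.2f}*{t}" for w, t in zip(weights, terms))
    return loss_fn

def fitness(loss_fn, steps=200):
    """Train a linear model on fixed toy data; crude fitness = train accuracy."""
    torch.manual_seed(0)
    x, y = torch.randn(512, 10), torch.randint(0, 3, (512,))
    model = torch.nn.Linear(10, 3)
    opt = torch.optim.Adam(model.parameters(), lr=1e-2)
    for _ in range(steps):
        opt.zero_grad(); loss_fn(model(x), y).backward(); opt.step()
    return (model(x).argmax(1) == y).float().mean().item()

best = max((random_loss() for _ in range(10)), key=fitness)
print("best candidate:", best.desc)
```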
arXiv Detail & Related papers (2024-04-19T15:26:36Z)
- $f$-Divergence Based Classification: Beyond the Use of Cross-Entropy [4.550290285002704]
We adopt a Bayesian perspective and formulate the classification task as a maximum a posteriori probability problem.
We propose a class of objective functions based on the variational representation of the $f$-divergence.
We theoretically analyze the objective functions proposed and numerically test them in three application scenarios.
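The paper's objectives are specific constructions, but they rest on the standard variational (f-GAN style) representation $D_f(P\|Q) \ge \sup_T \mathbb{E}_P[T] - \mathbb{E}_Q[f^*(T)]$. The sketch below estimates the forward-KL case, where $f^*(t) = e^{t-1}$, with a small critic network; the setup is ours, not the paper's classification objective.

```python
import torch
import torch.nn as nn

# Variational lower bound on an f-divergence (f-GAN style):
#   D_f(P || Q) >= sup_T  E_P[T(x)] - E_Q[f*(T(x))]
# For forward KL the convex conjugate is f*(t) = exp(t - 1).

critic = nn.Sequential(nn.Linear(2, 64), nn.ReLU(), nn.Linear(64, 1))
opt = torch.optim.Adam(critic.parameters(), lr=1e-3)

def kl_lower_bound(x_p: torch.Tensor, x_q: torch.Tensor) -> torch.Tensor:
    return critic(x_p).mean() - torch.exp(critic(x_q) - 1.0).mean()

# samples from two Gaussians; maximizing the bound estimates KL(P || Q) = 1.0
for _ in range(500):
    x_p = torch.randn(256, 2) + 1.0   # P = N(1, I)
    x_q = torch.randn(256, 2)         # Q = N(0, I)
    loss = -kl_lower_bound(x_p, x_q)  # ascend the bound
    opt.zero_grad(); loss.backward(); opt.step()

print("estimated KL lower bound:",
      kl_lower_bound(torch.randn(4096, 2) + 1.0, torch.randn(4096, 2)).item())
```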
arXiv Detail & Related papers (2024-01-02T16:14:02Z)
- Causal Feature Selection via Transfer Entropy [59.999594949050596]
Causal discovery aims to identify causal relationships between features with observational data.
We introduce a new causal feature selection approach that relies on the forward and backward feature selection procedures.
We provide theoretical guarantees on the regression and classification errors for both the exact and the finite-sample cases.
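A minimal illustration of the ingredients: a plug-in transfer entropy estimate for discrete series, used to greedily score candidate features. The single-pass scoring below is a simplification of the paper's forward/backward procedure, and the toy data generator is ours.

```python
import numpy as np
from collections import Counter

def transfer_entropy(x: np.ndarray, y: np.ndarray) -> float:
    """Plug-in estimate of TE(X -> Y) with 1-step histories:
    sum over (y+, y, x) of p(y+, y, x) * log[ p(y+ | y, x) / p(y+ | y) ]."""
    triples = Counter(zip(y[1:], y[:-1], x[:-1]))
    pairs_yx = Counter(zip(y[:-1], x[:-1]))
    pairs_yy = Counter(zip(y[1:], y[:-1]))
    singles_y = Counter(y[:-1])
    n = len(y) - 1
    te = 0.0
    for (yp, yc, xc), c in triples.items():
        p_cond_full = c / pairs_yx[(yc, xc)]
        p_cond_hist = pairs_yy[(yp, yc)] / singles_y[yc]
        te += (c / n) * np.log(p_cond_full / p_cond_hist)
    return te

# toy data: y is driven by feature 2 at lag 1 (with 10% noise)
rng = np.random.default_rng(0)
X = rng.integers(0, 2, size=(1000, 5))
y = np.where(rng.random(1000) < 0.9, np.roll(X[:, 2], 1),
             rng.integers(0, 2, 1000))
scores = [transfer_entropy(X[:, j], y) for j in range(X.shape[1])]
print("selected feature:", int(np.argmax(scores)))  # expected: 2
```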
arXiv Detail & Related papers (2023-10-17T08:04:45Z)
- Boosting Differentiable Causal Discovery via Adaptive Sample Reweighting [62.23057729112182]
Differentiable score-based causal discovery methods learn a directed acyclic graph from observational data.
We propose a model-agnostic framework to boost causal discovery performance by dynamically learning the adaptive weights for the Reweighted Score function, ReScore.
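A hedged sketch of what a reweighted score can look like: a NOTEARS-style least-squares score with learnable per-sample weights, updated here by a simple min-max loop that up-weights high-residual samples. ReScore's actual bilevel scheme differs, and every detail below is an assumption.

```python
import torch

d, n = 5, 200
X = torch.randn(n, d)                          # toy observational data
W = torch.zeros(d, d, requires_grad=True)      # candidate adjacency matrix
w_logits = torch.zeros(n, requires_grad=True)  # learnable sample weights

def reweighted_score(lmbda: float = 0.1) -> torch.Tensor:
    weights = torch.softmax(w_logits, dim=0)               # simplex weights
    resid = ((X - X @ W) ** 2).sum(dim=1)                  # per-sample fit
    acyclicity = torch.trace(torch.matrix_exp(W * W)) - d  # NOTEARS penalty
    return (weights * resid).sum() + lmbda * acyclicity

opt_graph = torch.optim.Adam([W], lr=1e-2)
opt_weights = torch.optim.Adam([w_logits], lr=1e-2)
for _ in range(200):
    # minimize the score over the graph ...
    opt_graph.zero_grad(); reweighted_score().backward(); opt_graph.step()
    # ... and maximize it over the weights, emphasizing hard samples
    opt_weights.zero_grad(); (-reweighted_score()).backward(); opt_weights.step()
```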
arXiv Detail & Related papers (2023-03-06T14:49:59Z)
- Cut your Losses with Squentropy [19.924900110707284]
We propose the "squentropy" loss, which is the sum of two terms: the cross-entropy loss and the average square loss over the incorrect classes.
We show that the squentropy loss outperforms both the pure cross entropy and rescaled square losses in terms of the classification accuracy.
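The description above is concrete enough to sketch directly; in the version below the square term acts on the raw logits of the incorrect classes (our reading of the summary), pushing them toward zero.

```python
import torch
import torch.nn.functional as F

def squentropy(logits: torch.Tensor, target: torch.Tensor) -> torch.Tensor:
    """Squentropy: cross entropy plus the average squared logit over the
    incorrect classes."""
    ce = F.cross_entropy(logits, target)
    num_classes = logits.size(1)
    mask = F.one_hot(target, num_classes).bool()  # true-class positions
    sq = (logits.masked_fill(mask, 0.0) ** 2).sum(dim=1) / (num_classes - 1)
    return ce + sq.mean()

# usage: drop-in replacement for cross entropy
logits = torch.randn(8, 10, requires_grad=True)
target = torch.randint(0, 10, (8,))
squentropy(logits, target).backward()
```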
arXiv Detail & Related papers (2023-02-08T09:21:13Z)
- Contrastive Classification and Representation Learning with Probabilistic Interpretation [5.979778557940212]
Cross entropy loss has served as the main objective function for classification-based tasks.
We propose a new version of supervised contrastive training that jointly learns the parameters of the classifier and the backbone of the network.
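The paper introduces its own probabilistic variant; the sketch below only illustrates the generic shape of such a joint objective, combining standard cross entropy on the classifier head with a supervised contrastive term on the embeddings (the weight `lam` is hypothetical).

```python
import torch
import torch.nn.functional as F

def supcon_loss(z: torch.Tensor, labels: torch.Tensor,
                tau: float = 0.1) -> torch.Tensor:
    """Standard supervised contrastive loss on embeddings z of shape (N, D)."""
    z = F.normalize(z, dim=1)
    sim = z @ z.t() / tau                                # (N, N) similarities
    logits_mask = ~torch.eye(len(z), dtype=torch.bool)   # exclude self-pairs
    pos_mask = (labels[:, None] == labels[None, :]) & logits_mask
    log_prob = sim - torch.logsumexp(
        sim.masked_fill(~logits_mask, -1e9), dim=1, keepdim=True)
    mean_log_prob_pos = (pos_mask * log_prob).sum(1) / pos_mask.sum(1).clamp(min=1)
    return -mean_log_prob_pos.mean()

def joint_loss(logits, z, labels, lam: float = 0.5):
    """Classifier CE on the head plus a contrastive term on the backbone."""
    return F.cross_entropy(logits, labels) + lam * supcon_loss(z, labels)
```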
arXiv Detail & Related papers (2022-11-07T15:57:24Z)
- Generalizing Bayesian Optimization with Decision-theoretic Entropies [102.82152945324381]
We consider a generalization of Shannon entropy from work in statistical decision theory.
We first show that special cases of this entropy lead to popular acquisition functions used in BO procedures.
We then show how alternative choices for the loss yield a flexible family of acquisition functions.
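The decision-theoretic entropy here is the Bayes risk $H_\ell(p) = \inf_a \mathbb{E}_p[\ell(\theta, a)]$; the sketch below estimates it from posterior samples for two losses, showing how the squared-error case reduces to the posterior variance (one of the familiar special cases the summary alludes to).

```python
import numpy as np

def decision_entropy(theta_samples: np.ndarray, loss: str) -> float:
    """H_l(p) = inf_a E_p[ l(theta, a) ], the Bayes risk of posterior p,
    estimated from posterior samples."""
    if loss == "squared":   # Bayes act = posterior mean -> entropy = variance
        return float(np.var(theta_samples))
    if loss == "absolute":  # Bayes act = posterior median
        med = np.median(theta_samples)
        return float(np.mean(np.abs(theta_samples - med)))
    raise ValueError(loss)

# For squared-error loss, "expected entropy reduction" acquisition collapses
# to variance reduction; other losses yield other acquisition functions.
post = np.random.default_rng(0).normal(0.0, 1.0, size=10000)
print(decision_entropy(post, "squared"), decision_entropy(post, "absolute"))
```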
arXiv Detail & Related papers (2022-10-04T04:43:58Z)
- Fine-grained Retrieval Prompt Tuning [149.9071858259279]
Fine-grained Retrieval Prompt Tuning steers a frozen pre-trained model to perform fine-grained retrieval from the perspectives of sample prompting and feature adaptation.
Our FRPT, with fewer learnable parameters, achieves state-of-the-art performance on three widely-used fine-grained datasets.
arXiv Detail & Related papers (2022-07-29T04:10:04Z)
- EXACT: How to Train Your Accuracy [6.144680854063938]
We propose a new optimization framework by introducing stochasticity to a model's output and optimizing expected accuracy.
Experiments on linear models and deep image classification show that the proposed optimization method is a powerful alternative to widely used classification losses.
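A Monte-Carlo surrogate in the spirit of this summary: add Gaussian noise to the logits and estimate expected accuracy with a differentiable soft argmax. EXACT's actual estimator differs, and `sigma`, `tau`, and the sample count are our choices.

```python
import torch

def expected_accuracy(logits: torch.Tensor, target: torch.Tensor,
                      sigma: float = 1.0, samples: int = 64,
                      tau: float = 0.1) -> torch.Tensor:
    """Monte-Carlo surrogate for P(argmax(logits + noise) == target).

    A low-temperature softmax relaxes the argmax so the estimate stays
    differentiable; all hyperparameters here are illustrative."""
    noise = sigma * torch.randn(samples, *logits.shape)
    noisy = logits.unsqueeze(0) + noise            # (S, N, C)
    soft_hit = torch.softmax(noisy / tau, dim=-1)  # approx one-hot argmax
    idx = target.view(1, -1, 1).expand(samples, -1, 1)
    return soft_hit.gather(-1, idx).mean()

# maximize expected accuracy instead of minimizing cross entropy
logits = torch.randn(8, 10, requires_grad=True)
target = torch.randint(0, 10, (8,))
(-expected_accuracy(logits, target)).backward()
```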
arXiv Detail & Related papers (2022-05-19T15:13:00Z)
- A Flatter Loss for Bias Mitigation in Cross-dataset Facial Age Estimation [37.107335288543624]
We advocate a cross-dataset protocol for age estimation benchmarking.
We propose a novel loss function that is more effective for neural network training.
arXiv Detail & Related papers (2020-10-20T15:22:29Z)
- Pseudo-Convolutional Policy Gradient for Sequence-to-Sequence Lip-Reading [96.48553941812366]
Lip-reading aims to infer the speech content from the lip movement sequence.
The traditional learning process of seq2seq models suffers from two problems.
We propose a novel pseudo-convolutional policy gradient (PCPG) based method to address these two problems.
arXiv Detail & Related papers (2020-03-09T09:12:26Z)