Rethinking Loss Functions for Fact Verification
- URL: http://arxiv.org/abs/2403.08174v1
- Date: Wed, 13 Mar 2024 01:56:32 GMT
- Title: Rethinking Loss Functions for Fact Verification
- Authors: Yuta Mukobara, Yutaro Shigeto, Masashi Shimbo
- Abstract summary: We develop two task-specific objectives tailored to FEVER.
Experimental results confirm that the proposed objective functions outperform the standard cross-entropy.
- Score: 1.2983290324156112
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We explore loss functions for fact verification in the FEVER shared task.
While the cross-entropy loss is a standard objective for training verdict
predictors, it fails to capture the heterogeneity among the FEVER verdict
classes. In this paper, we develop two task-specific objectives tailored to
FEVER. Experimental results confirm that the proposed objective functions
outperform the standard cross-entropy. Performance is further improved when
these objectives are combined with simple class weighting, which effectively
overcomes the imbalance in the training data. The source code is available at
https://github.com/yuta-mukobara/RLF-KGAT
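As a rough illustration of the class-weighting idea described in the abstract (not the paper's task-specific objectives themselves), a minimal PyTorch sketch might look like the following; the class order and weight values are hypothetical placeholders, and in practice the weights would be derived from the class frequencies in the FEVER training data.

```python
import torch
import torch.nn as nn

# FEVER verdict classes: SUPPORTS, REFUTES, NOT ENOUGH INFO.
# Illustrative placeholder weights, not values from the paper;
# they would normally come from inverse class frequencies.
class_weights = torch.tensor([1.0, 1.5, 2.0])

# PyTorch's cross-entropy accepts per-class weights directly,
# which counteracts the imbalance among verdict classes.
criterion = nn.CrossEntropyLoss(weight=class_weights)

logits = torch.randn(4, 3)            # batch of 4 claims, 3 verdict classes
labels = torch.tensor([0, 2, 1, 2])   # gold verdict labels
loss = criterion(logits, labels)
print(loss.item())
```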
Related papers
- Next Generation Loss Function for Image Classification [0.0]
We experimentally challenge the well-known loss functions, including cross entropy (CE) loss, by utilizing the genetic programming (GP) approach.
One function, denoted as Next Generation Loss (NGL), clearly stood out, showing the same or better performance on all tested datasets.
arXiv Detail & Related papers (2024-04-19T15:26:36Z) - Class Anchor Margin Loss for Content-Based Image Retrieval [97.81742911657497]
We propose a novel repeller-attractor loss that falls in the metric learning paradigm, yet directly optimizes the L2 metric without the need to generate pairs.
We evaluate the proposed objective in the context of few-shot and full-set training on the CBIR task, by using both convolutional and transformer architectures.
arXiv Detail & Related papers (2023-06-01T12:53:10Z) - Decoupled Kullback-Leibler Divergence Loss [75.31157286595517]
Kullback-Leibler (KL) Divergence loss is shown to be equivalent to the Decoupled Kullback-Leibler (DKL) Divergence loss.
We introduce global information into DKL for intra-class consistency regularization.
The proposed approach achieves new state-of-the-art performance on both tasks, demonstrating the substantial practical merits.
arXiv Detail & Related papers (2023-05-23T11:17:45Z) - SuSana Distancia is all you need: Enforcing class separability in metric
learning via two novel distance-based loss functions for few-shot image
classification [0.9236074230806579]
We propose two loss functions which consider the importance of the embedding vectors by looking at the intra-class and inter-class distances among the few available data points.
Our results show a significant improvement in accuracy on the miniImageNet benchmark compared to other metric-based few-shot learning methods, by a margin of 2%.
arXiv Detail & Related papers (2023-05-15T23:12:09Z) - Contrastive Classification and Representation Learning with
Probabilistic Interpretation [5.979778557940212]
Cross entropy loss has served as the main objective function for classification-based tasks.
We propose a new version of supervised contrastive training that jointly learns the parameters of the classifier and the backbone of the network.
arXiv Detail & Related papers (2022-11-07T15:57:24Z) - Bridging the Gap Between Target Networks and Functional Regularization [61.051716530459586]
We propose an explicit Functional Regularization that is a convex regularizer in function space and can easily be tuned.
We analyze the convergence of our method theoretically and empirically demonstrate that replacing Target Networks with the more theoretically grounded Functional Regularization approach leads to better sample efficiency and performance improvements.
arXiv Detail & Related papers (2022-10-21T22:27:07Z) - On Training Targets and Activation Functions for Deep Representation
Learning in Text-Dependent Speaker Verification [18.19207291891767]
Key considerations include training targets, activation functions, and loss functions.
We study a range of loss functions when speaker identity is used as the training target.
We experimentally show that GELU is able to reduce the error rates of TD-SV significantly compared to sigmoid.
arXiv Detail & Related papers (2022-01-17T14:32:51Z) - Mixing between the Cross Entropy and the Expectation Loss Terms [89.30385901335323]
Cross entropy loss tends to focus on hard-to-classify samples during training.
We show that adding the expectation loss to the optimization goal helps the network achieve better accuracy.
Our experiments show that the new training protocol improves performance across a diverse set of classification domains.
arXiv Detail & Related papers (2021-09-12T23:14:06Z) - Learning Stable Classifiers by Transferring Unstable Features [59.06169363181417]
We study transfer learning in the presence of spurious correlations.
We experimentally demonstrate that directly transferring the stable feature extractor learned on the source task may not eliminate these biases for the target task.
We hypothesize that the unstable features in the source task and those in the target task are directly related.
arXiv Detail & Related papers (2021-06-15T02:41:12Z) - Optimized Loss Functions for Object detection: A Case Study on Nighttime
Vehicle Detection [0.0]
In this paper, we optimize the two loss functions for classification and localization simultaneously.
Compared to existing studies, in which the correlation is only applied to improve localization accuracy for positive samples, this paper utilizes the correlation to mine truly hard negative samples.
A novel localization loss named MIoU is proposed by incorporating the Mahalanobis distance between the predicted box and the target box, which eliminates the gradient inconsistency problem in the DIoU loss.
arXiv Detail & Related papers (2020-11-11T03:00:49Z) - Deep F-measure Maximization for End-to-End Speech Understanding [52.36496114728355]
We propose a differentiable approximation to the F-measure and train the network with this objective using standard backpropagation.
We perform experiments on two standard fairness datasets, Adult and Communities and Crime, and also on speech-to-intent detection on the ATIS dataset and speech-to-image concept classification on the Speech-COCO dataset.
In all four of these tasks, F-measure maximization results in improved micro-F1 scores, with absolute improvements of up to 8%, compared to models trained with the cross-entropy loss function.
arXiv Detail & Related papers (2020-08-08T03:02:27Z)
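For reference, a common way to make the F-measure differentiable, in the spirit of the Deep F-measure Maximization entry above, is to replace hard counts with softmax probabilities. The sketch below is a generic soft micro-F1 loss under that assumption and is not taken from that paper's code.

```python
import torch

def soft_f1_loss(logits: torch.Tensor, targets: torch.Tensor, eps: float = 1e-8) -> torch.Tensor:
    """Differentiable (soft) micro-F1 loss for multi-class classification.

    logits:  (batch, num_classes) raw scores
    targets: (batch,) integer class labels
    """
    probs = torch.softmax(logits, dim=-1)
    onehot = torch.nn.functional.one_hot(targets, num_classes=probs.size(-1)).float()

    # Soft true positives, false positives, and false negatives, summed over the batch.
    tp = (probs * onehot).sum()
    fp = (probs * (1.0 - onehot)).sum()
    fn = ((1.0 - probs) * onehot).sum()

    soft_f1 = 2.0 * tp / (2.0 * tp + fp + fn + eps)
    return 1.0 - soft_f1  # minimizing 1 - F1 maximizes the soft F-measure
```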
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.