A request for clarity over the End of Sequence token in the
Self-Critical Sequence Training
- URL: http://arxiv.org/abs/2305.12254v1
- Date: Sat, 20 May 2023 18:01:47 GMT
- Title: A request for clarity over the End of Sequence token in the
Self-Critical Sequence Training
- Authors: Jia Cheng Hu, Roberto Cavicchioli and Alessandro Capotondi
- Abstract summary: This work proposes to solve the problem by spreading awareness of the issue itself.
In particular, we invite future works to share a simple and informative signature with the help of a library called SacreEOS.
- Score: 69.3939291118954
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: The Image Captioning research field is currently compromised by the lack of
transparency and awareness over the End-of-Sequence token (<Eos>) in the
Self-Critical Sequence Training. If the <Eos> token is omitted, a model can
boost its performance up to +4.1 CIDEr-D using trivial sentence fragments.
While this phenomenon poses an obstacle to a fair evaluation and comparison of
established works, people involved in new projects are given the arduous choice
between lower scores and unsatisfactory descriptions due to the competitive
nature of the research. This work proposes to solve the problem by spreading
awareness of the issue itself. In particular, we invite future works to share a
simple and informative signature with the help of a library called SacreEOS.
Code available at
\emph{\href{https://github.com/jchenghu/sacreeos}{https://github.com/jchenghu/sacreeos}}
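The core issue the abstract describes is whether the <Eos> token is included in the captions that the reward metric sees during Self-Critical Sequence Training. Below is a minimal pure-Python sketch of a generic REINFORCE-with-baseline SCST setup illustrating that choice; the helper names are illustrative assumptions, not taken from the paper or the SacreEOS library.

```python
# Minimal sketch of the SCST loss and the <eos> choice in the reward
# computation. Helper names are hypothetical, for illustration only.

def prepare_for_reward(caption_tokens, eos="<eos>", keep_eos=True):
    """Return the token list the reward metric (e.g. CIDEr-D) will see.

    keep_eos=True  -> the terminator participates in n-gram matching;
    keep_eos=False -> the variant the paper warns about: the metric never
    sees the terminator, so trailing trivial fragments go unpenalized.
    """
    if keep_eos:
        return list(caption_tokens)
    return [t for t in caption_tokens if t != eos]


def scst_loss(sampled_log_prob_sum, sampled_reward, greedy_reward):
    """SCST policy-gradient loss: negative advantage times log-likelihood.

    The greedy-decoded caption's reward serves as the baseline, so only
    sampled captions that beat greedy decoding are reinforced.
    """
    advantage = sampled_reward - greedy_reward
    return -advantage * sampled_log_prob_sum
```

Whichever branch a project takes, sampled and greedy captions must be treated consistently before scoring; recording that convention is exactly what a shared SacreEOS signature is meant to make explicit.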
Related papers
- Keypoint Promptable Re-Identification [76.31113049256375]
Occluded Person Re-Identification (ReID) is a metric learning task that involves matching occluded individuals based on their appearance.
We introduce Keypoint Promptable ReID (KPR), a novel formulation of the ReID problem that explicitly complements the input bounding box with a set of semantic keypoints.
We release custom keypoint labels for four popular ReID benchmarks. Experiments on person retrieval, as well as on pose tracking, demonstrate that our method systematically surpasses previous state-of-the-art approaches.
arXiv Detail & Related papers (2024-07-25T15:20:58Z) - DenoSent: A Denoising Objective for Self-Supervised Sentence
Representation Learning [59.4644086610381]
We propose a novel denoising objective that takes a different perspective, namely the intra-sentence perspective.
By introducing both discrete and continuous noise, we generate noisy sentences and then train our model to restore them to their original form.
Our empirical evaluations demonstrate that this approach delivers competitive results on both semantic textual similarity (STS) and a wide range of transfer tasks.
arXiv Detail & Related papers (2024-01-24T17:48:45Z) - Divide & Bind Your Attention for Improved Generative Semantic Nursing [19.67265541441422]
We propose Divide & Bind to address the challenges posed by complex prompts and scenarios involving multiple entities.
Our approach stands out in its ability to faithfully synthesize desired objects with improved attribute alignment from complex prompts.
arXiv Detail & Related papers (2023-07-20T13:33:28Z) - Task-Adaptive Saliency Guidance for Exemplar-free Class Incremental Learning [60.501201259732625]
We introduce task-adaptive saliency for EFCIL and propose a new framework, which we call Task-Adaptive Saliency Supervision (TASS).
Our experiments demonstrate that our method can better preserve saliency maps across tasks and achieve state-of-the-art results on the CIFAR-100, Tiny-ImageNet, and ImageNet-Subset EFCIL benchmarks.
arXiv Detail & Related papers (2022-12-16T02:43:52Z) - A Sentence is Worth 128 Pseudo Tokens: A Semantic-Aware Contrastive
Learning Framework for Sentence Embeddings [28.046786376565123]
We propose a semantics-aware contrastive learning framework for sentence embeddings, termed Pseudo-Token BERT (PT-BERT).
We exploit the pseudo-token space (i.e., latent semantic space) representation of a sentence while eliminating the impact of superficial features such as sentence length and syntax.
Our model outperforms the state-of-the-art baselines on six standard semantic textual similarity (STS) tasks.
arXiv Detail & Related papers (2022-03-11T12:29:22Z) - Consensus Synergizes with Memory: A Simple Approach for Anomaly
Segmentation in Urban Scenes [132.16748656557013]
Anomaly segmentation is a crucial task for safety-critical applications, such as autonomous driving in urban scenes.
We propose a novel and simple approach named Consensus Synergizes with Memory (CosMe) to address this challenge.
Experimental results on several urban scene anomaly segmentation datasets show that CosMe outperforms previous approaches by large margins.
arXiv Detail & Related papers (2021-11-24T10:01:20Z) - Egocentric Action Recognition by Video Attention and Temporal Context [83.57475598382146]
We present the submission of Samsung AI Centre Cambridge to the CVPR 2020 EPIC-Kitchens Action Recognition Challenge.
In this challenge, action recognition is posed as the problem of simultaneously predicting a single 'verb' and 'noun' class label given an input trimmed video clip.
Our solution achieves strong performance on the challenge metrics without using object-specific reasoning or extra training data.
arXiv Detail & Related papers (2020-07-03T18:00:32Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences arising from its use.