Negative Sample is Negative in Its Own Way: Tailoring Negative Sentences
for Image-Text Retrieval
- URL: http://arxiv.org/abs/2111.03349v1
- Date: Fri, 5 Nov 2021 09:36:41 GMT
- Title: Negative Sample is Negative in Its Own Way: Tailoring Negative Sentences
for Image-Text Retrieval
- Authors: Zhihao Fan, Zhongyu Wei, Zejun Li, Siyuan Wang, Jianqing Fan
- Abstract summary: We propose TAiloring neGative Sentences with Discrimination and Correction (TAGS-DC) to automatically generate synthetic sentences as negative samples.
To maintain difficulty during training, we mutually improve the retrieval and generation through parameter sharing.
In experiments, we verify the effectiveness of our model on MS-COCO and Flickr30K compared with current state-of-the-art models.
- Score: 19.161248757493386
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: A matching model is essential to an image-text retrieval framework.
Existing research usually trains the model with a triplet loss and explores
various strategies to retrieve hard negative sentences from the dataset. We
argue that the current retrieval-based approach to negative sample construction
is limited by the scale of the dataset and thus fails to identify highly
difficult negative samples for every image. We propose TAiloring neGative
Sentences with Discrimination and Correction (TAGS-DC) to automatically
generate synthetic sentences as negative samples. TAGS-DC is composed of
masking and refilling steps that generate synthetic negative sentences of
higher difficulty. To maintain this difficulty during training, we mutually
improve the retrieval and generation models through parameter sharing. To
further exploit the fine-grained semantics of the mismatches in a negative
sentence, we propose two auxiliary tasks, word discrimination and word
correction, to improve training. In experiments, we verify the effectiveness
of our model on MS-COCO and Flickr30K against current state-of-the-art models
and demonstrate its robustness and faithfulness in further analysis. Our code
is available at https://github.com/LibertFan/TAGS.
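The core mask-and-refill step lends itself to a short sketch. Below is a minimal, illustrative Python version (not the authors' released implementation): it masks one word of a matched caption and lets an off-the-shelf masked language model refill it, producing a fluent but mismatched sentence. The choice of bert-base-uncased and the single-word masking policy are assumptions of this sketch.

```python
# Illustrative mask-and-refill negative generation (a sketch, not the
# TAGS-DC code): mask a word in a matched caption and let a masked LM
# propose a plausible but mismatched replacement.
import random
from transformers import pipeline

fill_mask = pipeline("fill-mask", model="bert-base-uncased")

def tailor_negative(caption: str) -> str:
    """Replace one random word with an LM refill to form a hard negative."""
    tokens = caption.split()
    i = random.randrange(len(tokens))
    original = tokens[i]
    tokens[i] = fill_mask.tokenizer.mask_token  # "[MASK]" for BERT
    # Take the top-scoring refill that differs from the original word.
    for cand in fill_mask(" ".join(tokens)):
        if cand["token_str"].strip() != original:
            tokens[i] = cand["token_str"].strip()
            return " ".join(tokens)
    return caption  # fall back to the positive if every refill matches

print(tailor_negative("a black dog runs across the grassy field"))
```

In the full method, the retrieval model must then reject such tailored sentences, and the word discrimination and correction tasks supervise which refilled words are mismatched.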
Related papers
- Mitigating the Impact of False Negatives in Dense Retrieval with
Contrastive Confidence Regularization [15.204113965411777]
We propose a novel contrastive confidence regularizer for the Noise Contrastive Estimation (NCE) loss.
Our analysis shows that the regularizer helps dense retrieval models become more robust against false negatives, with a theoretical guarantee.
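As a rough sketch of where such a regularizer attaches, the snippet below computes a standard NCE-style contrastive loss in PyTorch and adds a hypothetical entropy-based confidence penalty; the paper's actual regularizer is not reproduced here, and the penalty form and weight are assumptions.

```python
# NCE-style contrastive loss with a *hypothetical* confidence regularizer
# (illustrative only; not the paper's exact formulation).
import torch
import torch.nn.functional as F

def nce_with_confidence_reg(q, pos, negs, tau=0.05, lam=0.1):
    """q: (B, D) queries; pos: (B, D) positives; negs: (B, K, D) negatives."""
    q = F.normalize(q, dim=-1)
    s_pos = (q * F.normalize(pos, dim=-1)).sum(-1, keepdim=True)      # (B, 1)
    s_neg = torch.einsum("bd,bkd->bk", q, F.normalize(negs, dim=-1))  # (B, K)
    logits = torch.cat([s_pos, s_neg], dim=1) / tau
    labels = torch.zeros(q.size(0), dtype=torch.long)  # positive at index 0
    nce = F.cross_entropy(logits, labels)
    # Hypothetical regularizer: penalize over-confident predictions, which
    # are where mislabeled (false) negatives do the most damage.
    p = logits.softmax(dim=1)
    entropy = -(p * p.clamp_min(1e-9).log()).sum(dim=1).mean()
    return nce - lam * entropy  # higher entropy => softer confidence
```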
arXiv Detail & Related papers (2023-12-30T08:01:57Z)
- Active Mining Sample Pair Semantics for Image-text Matching [6.370886833310617]
This paper proposes a novel image-text matching model, the Active Mining Sample Pair Semantics image-text matching model (AMSPS).
In contrast to the single semantic learning mode of commonsense learning models with a triplet loss function, AMSPS follows an active learning idea.
arXiv Detail & Related papers (2023-11-09T15:03:57Z)
- Enhancing Multimodal Compositional Reasoning of Visual Language Models with Generative Negative Mining [58.379339799777064]
Large-scale visual language models (VLMs) exhibit strong representation capacities, making them ubiquitous for enhancing image and text understanding tasks.
We propose a framework that not only mines in both directions but also generates challenging negative samples in both modalities.
Our code and dataset are released at https://ugorsahin.github.io/enhancing-multimodal-compositional-reasoning-of-vlm.html.
arXiv Detail & Related papers (2023-11-07T13:05:47Z)
- Your Negative May not Be True Negative: Boosting Image-Text Matching with False Negative Elimination [62.18768931714238]
We propose a novel False Negative Elimination (FNE) strategy to select negatives via sampling.
The results demonstrate the superiority of our proposed false negative elimination strategy.
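One plausible reading of such a sampling strategy (an assumption of this sketch, not necessarily the paper's exact scheme) is to treat the negatives most similar to the query as suspected false negatives, skip them, and sample the rest with probability increasing in similarity so the chosen negatives stay hard:

```python
# Hedged sketch of false-negative-aware negative sampling: skip the most
# similar candidates (likely false negatives), then sample hard negatives.
import torch

def sample_negatives(sim, num_samples=8, skip_top=2, temperature=0.1):
    """sim: (N,) similarities between the query and N candidate negatives."""
    order = sim.argsort(descending=True)
    pool = order[skip_top:]                       # drop suspected false negatives
    probs = (sim[pool] / temperature).softmax(0)  # harder => more likely
    idx = torch.multinomial(probs, num_samples, replacement=False)
    return pool[idx]

neg_idx = sample_negatives(torch.randn(100))  # 8 hard-but-likely-true negatives
```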
arXiv Detail & Related papers (2023-08-08T16:31:43Z)
- Debiased Contrastive Learning of Unsupervised Sentence Representations [88.58117410398759]
Contrastive learning is effective in improving pre-trained language models (PLMs) to derive high-quality sentence representations.
Previous works mostly adopt in-batch negatives or sample from training data at random.
We present a new framework, DCLR, to alleviate the influence of these improper negatives.
arXiv Detail & Related papers (2022-05-02T05:07:43Z)
- Exploring Negatives in Contrastive Learning for Unpaired Image-to-Image Translation [12.754320302262533]
We introduce a new negative Pruning technology for Unpaired image-to-image Translation (PUT) by sparsifying and ranking the patches.
The proposed algorithm is efficient and flexible, and it enables the model to stably learn essential information between corresponding patches.
arXiv Detail & Related papers (2022-04-23T08:31:18Z)
- Exploring the Impact of Negative Samples of Contrastive Learning: A Case Study of Sentence Embedding [14.295787044482136]
We present a momentum contrastive learning model with a negative sample queue for sentence embedding, namely MoCoSE.
We define a maximum traceable distance metric, through which we learn to what extent text contrastive learning benefits from the historical information of negative samples.
Our experiments find that the best results are obtained when the maximum traceable distance lies within a certain range, demonstrating that there is an optimal range of historical information for a negative sample queue.
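A minimal version of such a negative sample queue (a MoCo-style FIFO buffer, sketched here under assumed sizes rather than taken from MoCoSE) looks as follows; on this reading, the maximum traceable distance would bound how many update steps old an embedding may be before it leaves the queue.

```python
# MoCo-style FIFO queue of past sentence embeddings used as negatives
# (an illustrative sketch, not the MoCoSE implementation).
import torch
import torch.nn.functional as F

class NegativeQueue:
    def __init__(self, dim: int, size: int = 4096):
        self.buf = F.normalize(torch.randn(size, dim), dim=1)
        self.ptr = 0

    @torch.no_grad()
    def enqueue(self, emb: torch.Tensor):
        """Insert a batch of embeddings, overwriting the oldest entries."""
        idx = torch.arange(self.ptr, self.ptr + emb.size(0)) % self.buf.size(0)
        self.buf[idx] = F.normalize(emb, dim=1)
        self.ptr = (self.ptr + emb.size(0)) % self.buf.size(0)

    def negatives(self) -> torch.Tensor:
        return self.buf  # (size, dim) extra negatives for the contrastive loss
```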
arXiv Detail & Related papers (2022-02-26T08:29:25Z)
- Instance-wise Hard Negative Example Generation for Contrastive Learning in Unpaired Image-to-Image Translation [102.99799162482283]
We present instance-wise hard Negative Example Generation for Contrastive learning in Unpaired image-to-image Translation (NEGCUT).
Specifically, we train a generator to produce negative examples online. The generator is novel from two perspectives: 1) it is instance-wise, meaning that the generated examples are based on the input image, and 2) it can generate hard negative examples since it is trained with an adversarial loss.
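A compact way to picture the adversarial setup (a sketch under assumed feature shapes, not the released NEGCUT code): a small per-instance generator maps the input's features to a negative and is trained to make that negative as confusable with the anchor as possible, while the main encoder is trained against it.

```python
# Sketch of an instance-wise adversarial negative generator: the generator
# pushes its output *toward* the anchor so the encoder must learn finer
# distinctions (the encoder's opposing update is omitted here).
import torch
import torch.nn.functional as F

gen = torch.nn.Sequential(
    torch.nn.Linear(256, 256), torch.nn.ReLU(), torch.nn.Linear(256, 256))
opt_gen = torch.optim.Adam(gen.parameters(), lr=1e-4)

def generator_step(anchor_feat: torch.Tensor) -> float:
    """anchor_feat: (B, 256) features of the current input instances."""
    neg = F.normalize(gen(anchor_feat.detach()), dim=1)
    anchor = F.normalize(anchor_feat.detach(), dim=1)
    loss = -(anchor * neg).sum(dim=1).mean()  # maximize cosine similarity
    opt_gen.zero_grad(); loss.backward(); opt_gen.step()
    return loss.item()
```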
arXiv Detail & Related papers (2021-08-10T09:44:59Z)
- Contrastive Learning with Hard Negative Samples [80.12117639845678]
We develop a new family of unsupervised sampling methods for selecting hard negative samples.
A limiting case of this sampling results in a representation that tightly clusters each class, and pushes different classes as far apart as possible.
The proposed method improves downstream performance across multiple modalities, requires only a few additional lines of code to implement, and introduces no computational overhead.
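The sampling idea can be emulated inside a standard contrastive objective by importance-weighting in-batch negatives with weights proportional to exp(beta * similarity), so harder negatives dominate the normalizer; the beta and temperature values below, and the omission of the paper's debiasing correction, are simplifications of this sketch.

```python
# Hard-negative reweighted contrastive loss (sketch): negatives are
# weighted by softmax(beta * similarity), emphasizing the hardest ones.
import torch
import torch.nn.functional as F

def hard_neg_contrastive(q, pos, negs, tau=0.1, beta=1.0):
    """q: (B, D); pos: (B, D); negs: (B, K, D)."""
    q = F.normalize(q, dim=-1)
    s_pos = (q * F.normalize(pos, dim=-1)).sum(-1)                    # (B,)
    s_neg = torch.einsum("bd,bkd->bk", q, F.normalize(negs, dim=-1))  # (B, K)
    w = (beta * s_neg).softmax(dim=1)            # harder => larger weight
    neg_term = s_neg.size(1) * (w * (s_neg / tau).exp()).sum(dim=1)
    loss = -s_pos / tau + torch.log((s_pos / tau).exp() + neg_term)
    return loss.mean()
```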
arXiv Detail & Related papers (2020-10-09T14:18:53Z)
- Adaptive Offline Quintuplet Loss for Image-Text Matching [102.50814151323965]
Existing image-text matching approaches typically leverage triplet loss with online hard negatives to train the model.
We propose solutions by sampling negatives offline from the whole training set.
We evaluate the proposed training approach on three state-of-the-art image-text models on the MS-COCO and Flickr30K datasets.
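Offline mining of this kind boils down to precomputing a similarity matrix over the whole training set and keeping the top-scoring non-matching captions per image; a minimal sketch (the top-k value and the dense matrix, rather than chunked scoring, are assumptions):

```python
# Offline hard-negative mining over the full training set (sketch).
import torch
import torch.nn.functional as F

@torch.no_grad()
def mine_offline_negatives(img_emb, txt_emb, match, k=10):
    """img_emb: (N, D), txt_emb: (M, D) precomputed embeddings;
    match: (N, M) bool, True where caption j describes image i.
    Returns indices of the k hardest negative captions per image."""
    sim = F.normalize(img_emb, dim=1) @ F.normalize(txt_emb, dim=1).T  # (N, M)
    sim[match] = float("-inf")           # never select a true match
    return sim.topk(k, dim=1).indices    # (N, k) offline hard negatives
```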
arXiv Detail & Related papers (2020-03-07T22:09:11Z)