Generating Enhanced Negatives for Training Language-Based Object Detectors
- URL: http://arxiv.org/abs/2401.00094v2
- Date: Sat, 13 Apr 2024 02:21:10 GMT
- Title: Generating Enhanced Negatives for Training Language-Based Object Detectors
- Authors: Shiyu Zhao, Long Zhao, Vijay Kumar B. G, Yumin Suh, Dimitris N. Metaxas, Manmohan Chandraker, Samuel Schulter
- Abstract summary: We propose to leverage the vast knowledge built into modern generative models to automatically build negatives that are more relevant to the original data.
Specifically, we use large language models to generate negative text descriptions, and text-to-image diffusion models to generate the corresponding negative images.
Our experimental analysis confirms the relevance of the generated negative data, and its use in language-based detectors improves performance on two complex benchmarks.
- Score: 86.1914216335631
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The recent progress in language-based open-vocabulary object detection can be largely attributed to finding better ways of leveraging large-scale data with free-form text annotations. Training such models with a discriminative objective function has proven successful, but requires good positive and negative samples. However, the free-form nature and the open vocabulary of object descriptions make the space of negatives extremely large. Prior works randomly sample negatives or use rule-based techniques to build them. In contrast, we propose to leverage the vast knowledge built into modern generative models to automatically build negatives that are more relevant to the original data. Specifically, we use large language models to generate negative text descriptions, and text-to-image diffusion models to generate the corresponding negative images. Our experimental analysis confirms the relevance of the generated negative data, and its use in language-based detectors improves performance on two complex benchmarks. Code is available at https://github.com/xiaofeng94/Gen-Enhanced-Negs.
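To make the pipeline the abstract describes concrete, here is a minimal sketch, not the authors' actual implementation: an LLM rewrites a positive object description into near-miss negative captions, and a text-to-image diffusion model renders an image for each of them. The `ask_llm` helper is a hypothetical placeholder, and the diffusion checkpoint name is an assumption; any comparable model would do.

```python
from diffusers import StableDiffusionPipeline

def ask_llm(prompt: str) -> list[str]:
    """Hypothetical LLM helper; wire in any chat/completion client and
    return one negative caption per line of the model's reply."""
    raise NotImplementedError

def make_negatives(positive_caption: str, n: int = 3):
    # Ask the LLM for near-miss variants of the positive description.
    prompt = (
        f"Here is an object description: '{positive_caption}'.\n"
        f"Write {n} similar descriptions that are factually different "
        "(change the object's color, material, or category), one per line."
    )
    negative_captions = ask_llm(prompt)

    # Render one "negative image" per negative caption.
    pipe = StableDiffusionPipeline.from_pretrained("runwayml/stable-diffusion-v1-5")
    negative_images = [pipe(caption).images[0] for caption in negative_captions]
    return negative_captions, negative_images
```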
Related papers
- Large, Small or Both: A Novel Data Augmentation Framework Based on Language Models for Debiasing Opinion Summarization [32.814792889137145]
Current opinion summarization approaches are reluctant to generate negative summaries when given negative input texts.
We propose a novel data augmentation framework based on both large and small language models.
Our framework alleviates emotional bias as effectively as using only large models, but more economically.
arXiv Detail & Related papers (2024-03-12T14:37:03Z)
- Enhancing Multimodal Compositional Reasoning of Visual Language Models with Generative Negative Mining [58.379339799777064]
Large-scale visual language models (VLMs) exhibit strong representation capacities, making them ubiquitous for enhancing image and text understanding tasks.
We propose a framework that not only mines in both directions but also generates challenging negative samples in both modalities.
Our code and dataset are released at https://ugorsahin.github.io/enhancing-multimodal-compositional-reasoning-of-vlm.html.
arXiv Detail & Related papers (2023-11-07T13:05:47Z)
- Can large language models generate salient negative statements? [18.577880767789097]
We examine the ability of large language models to generate salient (interesting) negative statements about real-world entities.
We probe the LLMs using zero- and k-shot unconstrained probes, and compare with traditional methods for negation generation.
We measure the correctness and salience of the generated lists about subjects from different domains.
arXiv Detail & Related papers (2023-05-26T09:13:59Z)
- Debiasing Vision-Language Models via Biased Prompts [79.04467131711775]
We propose a general approach for debiasing vision-language foundation models by projecting out biased directions in the text embedding.
We show that debiasing only the text embedding with a calibrated projection matrix suffices to yield robust classifiers and fair generative models.
arXiv Detail & Related papers (2023-01-31T20:09:33Z)
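The projection step described above has a compact closed form. As a minimal sketch, assuming each column of `B` holds the text embedding of one biased concept; note this is the plain orthogonal projection, whereas the paper's calibrated variant adds a calibration step not shown here:

```python
import numpy as np

def debias_projection(B: np.ndarray) -> np.ndarray:
    """Return P = I - B (B^T B)^{-1} B^T, the projector onto the
    orthogonal complement of the biased directions stored as the
    columns of B (shape d x k)."""
    d = B.shape[0]
    return np.eye(d) - B @ np.linalg.solve(B.T @ B, B.T)

# Usage: embed bias prompts (e.g. "a photo of a man", "a photo of a woman"),
# stack the embeddings as columns of B, then debias any text embedding t
# via P @ t before computing image-text similarities.
```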
- Robust Contrastive Learning Using Negative Samples with Diminished Semantics [23.38896719740166]
We show that by generating carefully designed negative samples, contrastive learning can learn more robust representations.
We develop two methods, texture-based and patch-based augmentations, to generate negative samples.
We also analyze our method and the generated texture-based samples, showing that texture features are indispensable in classifying particular ImageNet classes.
arXiv Detail & Related papers (2021-10-27T05:38:00Z)
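As one plausible instantiation of the patch-based augmentation named above (a sketch, not necessarily the paper's exact recipe): shuffling fixed-size tiles preserves local texture while destroying global object structure, producing a negative view for contrastive training.

```python
import numpy as np

def patch_shuffle(img: np.ndarray, patch: int = 16, rng=None) -> np.ndarray:
    """Split an HxWxC image into patch x patch tiles and permute them.
    Local texture is kept, but global shape/semantics are destroyed."""
    rng = rng or np.random.default_rng()
    h, w, c = img.shape
    gh, gw = h // patch, w // patch
    # Crop to a multiple of the patch size, then cut into tiles.
    tiles = (img[:gh * patch, :gw * patch]
             .reshape(gh, patch, gw, patch, c)
             .transpose(0, 2, 1, 3, 4)
             .reshape(gh * gw, patch, patch, c))
    tiles = tiles[rng.permutation(len(tiles))]
    # Reassemble the shuffled tiles into an image.
    return (tiles.reshape(gh, gw, patch, patch, c)
                 .transpose(0, 2, 1, 3, 4)
                 .reshape(gh * patch, gw * patch, c))
```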
- Contrastive Learning with Adversarial Perturbations for Conditional Text Generation [49.055659008469284]
We propose a principled method to generate positive and negative samples for contrastive learning of seq2seq models.
Specifically, we generate negative examples by adding small perturbations to the input sequence to minimize its conditional likelihood.
We empirically show that our proposed method significantly improves the generalization of seq2seq models on three text generation tasks.
arXiv Detail & Related papers (2020-12-14T06:20:27Z)
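A minimal sketch of that perturbation step, assuming a HuggingFace-style seq2seq model that accepts `inputs_embeds` and `labels` and returns an output with a `.loss` field (that interface is an assumption, not the paper's exact setup): one signed-gradient step on the source embeddings lowers the conditional likelihood of the target, yielding a hard negative.

```python
import torch

def perturbed_negative(model, src_embeds, tgt_ids, eps: float = 1e-2):
    """FGSM-style negative: perturb source embeddings in the direction
    that increases the seq2seq loss, i.e. decreases p(target | source)."""
    src_embeds = src_embeds.clone().detach().requires_grad_(True)
    loss = model(inputs_embeds=src_embeds, labels=tgt_ids).loss
    loss.backward()
    # Step along the gradient sign to minimize the conditional likelihood.
    return (src_embeds + eps * src_embeds.grad.sign()).detach()
```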
- Detecting and Exorcising Statistical Demons from Language Models with Anti-Models of Negative Data [13.392212395386933]
We find that within a model family, as the number of parameters, training epochs, and data set size increase, so does a model's ability to generalize to negative n-gram data.
We propose a form of inductive bias that attenuates such undesirable signals with negative data distributions automatically learned from positive data.
arXiv Detail & Related papers (2020-10-22T16:45:32Z)
- Comparison of Interactive Knowledge Base Spelling Correction Models for Low-Resource Languages [81.90356787324481]
Spelling normalization for low-resource languages is challenging because the patterns are hard to predict.
This work compares a neural model and character language models trained with varying amounts of target-language data.
Our usage scenario is interactive correction with nearly zero training examples, improving the models as more data is collected.
arXiv Detail & Related papers (2020-10-20T17:31:07Z)
- Reinforced Negative Sampling over Knowledge Graph for Recommendation [106.07209348727564]
We develop a new negative sampling model, Knowledge Graph Policy Network (kgPolicy), which works as a reinforcement learning agent to explore high-quality negatives.
kgPolicy navigates from the target positive interaction, adaptively receives knowledge-aware negative signals, and ultimately yields a potential negative item to train the recommender.
arXiv Detail & Related papers (2020-03-12T12:44:30Z)
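For intuition only, a much-simplified one-hop sketch of the sampling interface described above; the actual kgPolicy navigates the knowledge graph over multiple hops with reinforcement learning, which this toy scorer does not attempt.

```python
import torch
import torch.nn as nn

class TinyKGPolicy(nn.Module):
    """Toy policy: score the KG neighbors of a positive item against the
    user embedding and sample one neighbor as a hard negative."""
    def __init__(self, dim: int):
        super().__init__()
        self.score = nn.Bilinear(dim, dim, 1)

    def pick_negative(self, user_emb: torch.Tensor,
                      neighbor_embs: torch.Tensor) -> int:
        # user_emb: (dim,), neighbor_embs: (n, dim)
        logits = self.score(user_emb.expand_as(neighbor_embs), neighbor_embs)
        probs = torch.softmax(logits.squeeze(-1), dim=0)
        # Sample a negative item index according to the policy distribution.
        return torch.multinomial(probs, 1).item()
```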
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.