Negation-Aware Test-Time Adaptation for Vision-Language Models
- URL: http://arxiv.org/abs/2507.19064v2
- Date: Tue, 05 Aug 2025 03:36:50 GMT
- Title: Negation-Aware Test-Time Adaptation for Vision-Language Models
- Authors: Haochen Han, Alex Jinpeng Wang, Fangming Liu, Jun Zhu
- Abstract summary: We study a practical but less-touched problem in Vision-Language Models (VLMs): negation understanding. Many real-world applications require models to explicitly identify what is false or non-existent. We propose a Negation-Aware Test-Time Adaptation (NEAT) method to efficiently adjust distribution-related parameters during inference.
- Score: 26.043679706381646
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In this paper, we study a practical but less-touched problem in Vision-Language Models (VLMs), i.e., negation understanding. Specifically, many real-world applications require models to explicitly identify what is false or non-existent, e.g., radiologists may search for images that exclude specific conditions. Despite the impressive transferability VLMs gain through large-scale training, they suffer from a critical limitation: they fail to handle negation. To address this challenge, existing methods attribute its root cause to the scarcity of negation training data and propose to fine-tune VLMs on massive data containing explicit negation. Such data-centric solutions demand substantial data and computational resources, limiting their sustainable, widespread adoption. To tackle negation in a low-carbon manner, we empirically observe that the key obstacle lies in the dual-concept shifts between the affirmation and negation distributions. Therefore, we propose a Negation-Aware Test-Time Adaptation (NEAT) method that efficiently adjusts distribution-related parameters during inference. In brief, NEAT reduces distribution shift in consistent semantics while eliminating false distributional consistency in unrelated semantics. Extensive experiments on various negation understanding tasks verify the effectiveness of the proposed method. Remarkably, with less than 0.01% of trainable parameters, NEAT achieves comparable or superior performance to state-of-the-art post-training approaches. Our code is available at https://github.com/hhc1997/NEAT.
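The abstract describes NEAT only at a high level, so the following is a minimal, hypothetical sketch of what negation-aware test-time adaptation could look like on frozen CLIP-style embeddings: a tiny set of affine parameters is updated at inference to align images with affirmative captions while breaking the spurious similarity between affirmative and negated captions. The variable names, loss terms, and optimizer settings are illustrative assumptions, not the authors' released implementation (see the GitHub link above for that).

```python
# Hypothetical sketch of negation-aware test-time adaptation on frozen embeddings.
# Not the authors' code; see https://github.com/hhc1997/NEAT for the real method.
import torch
import torch.nn.functional as F

torch.manual_seed(0)
dim, n_images = 512, 8

# Stand-ins for frozen CLIP embeddings (in practice: model.encode_image / encode_text).
image_feats  = F.normalize(torch.randn(n_images, dim), dim=-1)
affirm_feats = F.normalize(torch.randn(n_images, dim), dim=-1)  # e.g., "a photo with a dog"
negate_feats = F.normalize(torch.randn(n_images, dim), dim=-1)  # e.g., "a photo without a dog"

# "Distribution-related" parameters: a per-dimension scale and shift on text features,
# far below 0.01% of the VLM's parameter count.
scale = torch.nn.Parameter(torch.ones(dim))
shift = torch.nn.Parameter(torch.zeros(dim))
opt = torch.optim.Adam([scale, shift], lr=1e-3)

def adapt(text_feats):
    # Apply the learnable affine transform, then renormalize.
    return F.normalize(text_feats * scale + shift, dim=-1)

for step in range(100):
    aff = adapt(affirm_feats)
    neg = adapt(negate_feats)
    # Reduce the shift between image and affirmative-text distributions...
    align_loss = (1 - (image_feats * aff).sum(-1)).mean()
    # ...while suppressing the false consistency between affirmative and negated
    # captions that share surface form but differ in meaning.
    sep_loss = (aff * neg).sum(-1).mean()
    loss = align_loss + sep_loss
    opt.zero_grad()
    loss.backward()
    opt.step()
```

The loss terms above merely mirror the abstract's stated intuition (reduce shift in consistent semantics, eliminate false consistency in unrelated semantics); the paper's actual objective and choice of adapted parameters may differ.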
Related papers
- A Sample Efficient Conditional Independence Test in the Presence of Discretization [54.047334792855345]
Applying Conditional Independence (CI) tests directly to discretized data can lead to incorrect conclusions. Recent advancements have sought to infer the correct CI relationship between the latent variables by binarizing observed data. Motivated by this, this paper introduces a sample-efficient CI test that does not rely on the binarization process.
arXiv Detail & Related papers (2025-06-10T12:41:26Z) - Model Utility Law: Evaluating LLMs beyond Performance through Mechanism Interpretable Metric [99.56567010306807]
Large Language Models (LLMs) have become indispensable across academia, industry, and daily applications. One core challenge of evaluation in the LLM era is the generalization issue. We propose the Model Utilization Index (MUI), a mechanism-interpretability-enhanced metric that complements traditional performance scores.
arXiv Detail & Related papers (2025-04-10T04:09:47Z) - Fast Controlled Generation from Language Models with Adaptive Weighted Rejection Sampling [90.86991492288487]
Evaluating the constraint on every token can be prohibitively expensive. LCD can distort the global distribution over strings, sampling tokens based only on local information. We show that our approach is superior to state-of-the-art baselines.
arXiv Detail & Related papers (2025-04-07T18:30:18Z) - Know "No" Better: A Data-Driven Approach for Enhancing Negation Awareness in CLIP [38.17750132434983]
We introduce data generation pipelines that employ a large language model (LLM) and a multimodal LLM to produce negation-inclusive captions. Fine-tuning CLIP with data generated from our pipelines, we develop NegationCLIP, which enhances negation awareness while preserving generality. Experiments on various CLIP architectures validate the effectiveness of our data generation pipelines in enhancing CLIP's ability to perceive negation accurately.
arXiv Detail & Related papers (2025-01-19T01:17:05Z) - Vision-Language Models Do Not Understand Negation [50.27667000027403]
NegBench is a benchmark designed to evaluate negation understanding across 18 task variations and 79k examples. We show that this approach can result in a 10% increase in recall on negated queries and a 28% boost in accuracy on multiple-choice questions with negated captions.
arXiv Detail & Related papers (2025-01-16T09:55:42Z) - Revisiting subword tokenization: A case study on affixal negation in large language models [57.75279238091522]
We measure the impact of affixal negation on modern English large language models (LLMs).
We conduct experiments using LLMs with different subword tokenization methods.
We show that models can, on the whole, reliably recognize the meaning of affixal negation.
arXiv Detail & Related papers (2024-04-03T03:14:27Z) - Test-Time Distribution Normalization for Contrastively Learned Vision-Language Models [39.66329310098645]
One of the most representative approaches proposed recently, known as CLIP, has garnered widespread adoption due to its effectiveness.
This paper reveals that the common downstream practice of taking a dot product is only a zeroth-order approximation of the optimization goal, resulting in a loss of information at test time.
We propose Distribution Normalization (DN), where we approximate the mean representation of a batch of test samples and use this mean to play a role analogous to the negative samples in the InfoNCE loss (a toy sketch of this centering idea appears after this list).
arXiv Detail & Related papers (2023-02-22T01:14:30Z) - Contrastive Error Attribution for Finetuned Language Models [35.80256755393739]
Noisy and mis-annotated data are a core cause of hallucinations and unfaithful outputs in Natural Language Generation (NLG) tasks.
We introduce a framework to identify and remove low-quality training instances that lead to undesirable outputs.
We show that existing approaches for error tracing, such as gradient-based influence measures, do not perform reliably for detecting faithfulness errors.
arXiv Detail & Related papers (2022-12-21T02:28:07Z) - Improving negation detection with negation-focused pre-training [58.32362243122714]
Negation is a common linguistic feature that is crucial in many language understanding tasks.
Recent work has shown that state-of-the-art NLP models underperform on samples containing negation.
We propose a new negation-focused pre-training strategy, involving targeted data augmentation and negation masking.
arXiv Detail & Related papers (2022-05-09T02:41:11Z) - NoiER: An Approach for Training more Reliable Fine-Tuned Downstream Task Models [54.184609286094044]
We propose noise entropy regularisation (NoiER) as an efficient learning paradigm that solves the problem without auxiliary models and additional data.
The proposed approach improved traditional OOD detection evaluation metrics by 55% on average compared to the original fine-tuned models.
arXiv Detail & Related papers (2021-08-29T06:58:28Z)
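For the Distribution Normalization entry above, the summary already pins down the core operation: estimate mean embeddings from a batch of test samples and use them in place of explicit negatives when scoring image-text pairs. The snippet below is a toy, hedged instantiation of that centering idea with randomly generated stand-in embeddings; the function and variable names are hypothetical and may differ from the paper's exact formulation.

```python
# Toy sketch of test-time Distribution Normalization (DN): center image and text
# embeddings by means estimated from a batch of test samples before the dot product.
import torch
import torch.nn.functional as F

def dn_similarity(image_feats, text_feats, image_bank, text_bank):
    """Score image-text pairs after centering by test-set mean embeddings.

    image_feats: (N, D) L2-normalized image embeddings from a frozen VLM
    text_feats:  (M, D) L2-normalized text embeddings
    image_bank / text_bank: unlabeled test samples used to estimate the means,
    which play a role analogous to negatives in the InfoNCE objective.
    """
    mu_img = image_bank.mean(dim=0, keepdim=True)
    mu_txt = text_bank.mean(dim=0, keepdim=True)
    return (image_feats - mu_img) @ (text_feats - mu_txt).T

# Usage with random stand-ins for CLIP embeddings.
img = F.normalize(torch.randn(4, 512), dim=-1)
txt = F.normalize(torch.randn(6, 512), dim=-1)
scores = dn_similarity(img, txt, image_bank=img, text_bank=txt)  # (4, 6) score matrix
```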