NevIR: Negation in Neural Information Retrieval
- URL: http://arxiv.org/abs/2305.07614v2
- Date: Mon, 26 Feb 2024 20:55:25 GMT
- Title: NevIR: Negation in Neural Information Retrieval
- Authors: Orion Weller, Dawn Lawrie, Benjamin Van Durme
- Abstract summary: Negation is a common everyday phenomenon and has been a consistent area of weakness for language models (LMs)
We construct a benchmark asking IR models to rank two documents that differ only by negation.
We show that the results vary widely according to the type of IR architecture: cross-encoders perform best, followed by late-interaction models, and in last place are bi-encoder and sparse neural architectures.
- Score: 45.9442701147499
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Negation is a common everyday phenomenon and has been a consistent area of
weakness for language models (LMs). Although the Information Retrieval (IR)
community has adopted LMs as the backbone of modern IR architectures, there has
been little to no research in understanding how negation impacts neural IR. We
therefore construct a straightforward benchmark on this theme: asking IR models
to rank two documents that differ only by negation. We show that the results
vary widely according to the type of IR architecture: cross-encoders perform
best, followed by late-interaction models, and in last place are bi-encoder and
sparse neural architectures. We find that most information retrieval models
(including SOTA ones) do not consider negation, performing the same or worse
than a random ranking. We show that although the obvious approach of continued
fine-tuning on a dataset of contrastive documents containing negations
increases performance (as does model size), there is still a large gap between
machine and human performance.
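The pairwise evaluation the abstract describes can be sketched as follows. The scorer and the example pair below are illustrative stand-ins (a toy bag-of-words cosine, not any model or data from the paper); a model counts a pair as correct only if it ranks the matching document first for both queries.

```python
# Sketch of a NevIR-style pairwise negation check. The bag-of-words
# cosine scorer is a toy stand-in for a neural retriever.
from collections import Counter
import math

def bow_score(query: str, doc: str) -> float:
    """Cosine similarity over word counts (illustrative scorer)."""
    q, d = Counter(query.lower().split()), Counter(doc.lower().split())
    dot = sum(q[w] * d[w] for w in q)
    norm = math.sqrt(sum(v * v for v in q.values())) * \
           math.sqrt(sum(v * v for v in d.values()))
    return dot / norm if norm else 0.0

def pairwise_accuracy(pairs, score=bow_score):
    """Fraction of pairs where BOTH queries rank their matching doc first."""
    correct = 0
    for q_pos, q_neg, doc_pos, doc_neg in pairs:
        if score(q_pos, doc_pos) > score(q_pos, doc_neg) and \
           score(q_neg, doc_neg) > score(q_neg, doc_pos):
            correct += 1
    return correct / len(pairs)

# Hypothetical pair: two documents differing only by negation.
pairs = [(
    "which drug was approved",
    "which drug was not approved",
    "The drug was approved by the agency.",
    "The drug was not approved by the agency.",
)]
print(pairwise_accuracy(pairs))  # → 1.0 for this toy pair
```

Here the toy scorer happens to succeed because the negated query literally contains "not"; the paper's point is that dense neural scorers often assign near-identical representations to the two documents and fail this check.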
Related papers
- Reproducing NevIR: Negation in Neural Information Retrieval [5.950812862331131]
Negation is a fundamental aspect of human communication, yet it remains a challenge for Language Models in Information Retrieval (IR).
We reproduce and extend the findings of NevIR, a benchmark study that revealed most IR models perform at or below the level of random ranking when dealing with negation.
Our findings show that a recently emerging category, listwise Large Language Model (LLM) re-rankers, outperforms other models but still falls short of human performance.
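Listwise re-ranking, the category highlighted above, gives the model the query and all candidates at once rather than scoring each passage independently. The prompt format and parser below are hypothetical illustrations, not the ones used in the paper:

```python
# Hypothetical listwise re-ranking sketch: build one prompt over all
# candidates, then parse an ordering like "2 > 1" out of the answer.
def listwise_prompt(query: str, passages: list[str]) -> str:
    lines = [f"Query: {query}",
             "Rank the passages from most to least relevant."]
    lines += [f"[{i}] {p}" for i, p in enumerate(passages, start=1)]
    lines.append("Answer with the passage numbers in order, e.g. 2 > 1.")
    return "\n".join(lines)

def parse_ranking(answer: str, n: int) -> list[int]:
    # Keep only tokens that are valid passage indices.
    order = [int(tok) for tok in answer.replace(">", " ").split() if tok.isdigit()]
    return [i for i in order if 1 <= i <= n]

prompt = listwise_prompt(
    "which drug was not approved",
    ["The drug was approved.", "The drug was not approved."],
)
```

Seeing both passages side by side is plausibly what lets a listwise LLM notice the negation contrast that independent per-passage scoring misses.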
arXiv Detail & Related papers (2025-02-19T07:50:59Z)
- Vision-Language Models Do Not Understand Negation [50.27667000027403]
NegBench is a benchmark designed to evaluate negation understanding across 18 task variations and 79k examples spanning image, video, and medical datasets.
We show that this approach can result in a 10% increase in recall on negated queries and a 40% boost in accuracy on multiple-choice questions with negated captions.
arXiv Detail & Related papers (2025-01-16T09:55:42Z)
- Robust Neural Information Retrieval: An Adversarial and Out-of-distribution Perspective [111.58315434849047]
The robustness of neural information retrieval (IR) models has garnered significant attention.
We view the robustness of IR to be a multifaceted concept, emphasizing its necessity against adversarial attacks, out-of-distribution (OOD) scenarios and performance variance.
We provide an in-depth discussion of existing methods, datasets, and evaluation metrics, shedding light on challenges and future directions in the era of large language models.
arXiv Detail & Related papers (2024-07-09T16:07:01Z)
- Explainable AI for Comparative Analysis of Intrusion Detection Models [20.683181384051395]
This research applies various machine learning models to the tasks of binary and multi-class classification for intrusion detection from network traffic.
We trained all models to an accuracy of 90% on the UNSW-NB15 dataset.
We also discover that Random Forest provides the best performance in terms of accuracy, time efficiency and robustness.
arXiv Detail & Related papers (2024-06-14T03:11:01Z)
- Evaluating Machine Learning Models with NERO: Non-Equivariance Revealed on Orbits [19.45052971156096]
We propose a novel evaluation workflow, named Non-Equivariance Revealed on Orbits (NERO) Evaluation.
The NERO evaluation consists of a task-agnostic interactive interface and a set of visualizations, called NERO plots.
We present case studies showing how NERO evaluation can be applied to multiple research areas, including 2D digit recognition, object detection, particle image velocimetry (PIV), and 3D point cloud classification.
arXiv Detail & Related papers (2023-05-31T14:24:35Z)
- Mind the Backbone: Minimizing Backbone Distortion for Robust Object Detection [52.355018626115346]
Building object detectors that are robust to domain shifts is critical for real-world applications.
We propose to use Relative Gradient Norm as a way to measure the vulnerability of a backbone to feature distortion.
We present recipes to boost OOD robustness for both types of backbones.
arXiv Detail & Related papers (2023-03-26T14:50:43Z)
- Towards Regression-Free Neural Networks for Diverse Compute Platforms [50.64489250972764]
We introduce REGression constrained Neural Architecture Search (REG-NAS) to design a family of highly accurate models that engender fewer negative flips.
REG-NAS consists of two components: (1) A novel architecture constraint that enables a larger model to contain all the weights of the smaller one thus maximizing weight sharing.
We demonstrate that REG-NAS can successfully find desirable architectures with few negative flips in three popular architecture search spaces.
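A "negative flip" is a test sample the old model classified correctly that the new model gets wrong; regression-free search aims to keep this rate near zero even as overall accuracy holds or improves. A minimal sketch of the metric, with illustrative data:

```python
# Negative flip rate: fraction of samples where the old model was
# right but the new model is wrong. Labels/predictions are toy data.
def negative_flip_rate(labels, old_preds, new_preds):
    flips = sum(
        1 for y, old, new in zip(labels, old_preds, new_preds)
        if old == y and new != y
    )
    return flips / len(labels)

labels    = [0, 1, 1, 0, 2]
old_preds = [0, 1, 0, 0, 2]  # 4/5 correct
new_preds = [0, 1, 1, 1, 2]  # equally accurate overall, yet sample 4 regressed
print(negative_flip_rate(labels, old_preds, new_preds))  # → 0.2
```

The example shows why aggregate accuracy alone is insufficient: both models score 4/5, but the new one silently breaks a previously correct prediction.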
arXiv Detail & Related papers (2022-09-27T23:19:16Z)
- Entity-Conditioned Question Generation for Robust Attention Distribution in Neural Information Retrieval [51.53892300802014]
We show that supervised neural information retrieval models are prone to learning sparse attention patterns over passage tokens.
Using a novel targeted synthetic data generation method, we teach neural IR to attend more uniformly and robustly to all entities in a given passage.
arXiv Detail & Related papers (2022-04-24T22:36:48Z)
- Characterizing and Understanding the Behavior of Quantized Models for Reliable Deployment [32.01355605506855]
Quantization-aware training can produce more stable models than standard, adversarial, and Mixup training.
Disagreements often have closer top-1 and top-2 output probabilities, and $Margin$ is a better indicator than other uncertainty metrics for distinguishing disagreements.
We open-source our code and models as a new benchmark for further study of quantized models.
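The $Margin$ metric mentioned above is the gap between the top-1 and top-2 output probabilities; small margins flag inputs where a model and its quantized copy are likely to disagree. A minimal sketch with illustrative probabilities:

```python
# Margin = top-1 probability minus top-2 probability.
# The probability vectors below are made-up examples.
def margin(probs: list[float]) -> float:
    top1, top2 = sorted(probs, reverse=True)[:2]
    return top1 - top2

confident  = [0.90, 0.06, 0.04]  # clear winner: likely stable under quantization
borderline = [0.48, 0.45, 0.07]  # near-tie: likely disagreement candidate
print(round(margin(confident), 2))   # → 0.84
print(round(margin(borderline), 2))  # → 0.03
```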
arXiv Detail & Related papers (2022-04-08T11:19:16Z)
- Match Your Words! A Study of Lexical Matching in Neural Information Retrieval [11.930815087240479]
We study the behavior of different state-of-the-art neural IR models, focusing on whether they are able to perform lexical matching when it's actually useful.
We show that neural IR models fail to properly generalize term importance on out-of-domain collections or terms almost unseen during training.
arXiv Detail & Related papers (2021-12-10T16:49:49Z)
- Learning from Context or Names? An Empirical Study on Neural Relation Extraction [112.06614505580501]
We study the effect of two main information sources in text: textual context and entity mentions (names).
We propose an entity-masked contrastive pre-training framework for relation extraction (RE).
Our framework can improve the effectiveness and robustness of neural models in different RE scenarios.
arXiv Detail & Related papers (2020-10-05T11:21:59Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information it provides and is not responsible for any consequences of its use.