INSTRUCTIR: A Benchmark for Instruction Following of Information
Retrieval Models
- URL: http://arxiv.org/abs/2402.14334v1
- Date: Thu, 22 Feb 2024 06:59:50 GMT
- Title: INSTRUCTIR: A Benchmark for Instruction Following of Information
Retrieval Models
- Authors: Hanseok Oh, Hyunji Lee, Seonghyeon Ye, Haebin Shin, Hansol Jang,
Changwook Jun, Minjoon Seo
- Abstract summary: Retrievers often only prioritize query information without delving into the users' intended search context.
We propose a novel benchmark, INSTRUCTIR, specifically designed to evaluate instruction-following ability in information retrieval tasks.
We observe that retrievers fine-tuned to follow task-style instructions, such as INSTRUCTOR, can underperform compared to their non-instruction-tuned counterparts.
- Score: 32.16908034520376
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Despite the critical need to align search targets with users' intention,
retrievers often only prioritize query information without delving into the
users' intended search context. Enhancing the capability of retrievers to
understand intentions and preferences of users, akin to language model
instructions, has the potential to yield more aligned search targets. Prior
studies restrict the application of instructions in information retrieval to a
task description format, neglecting the broader context of diverse and evolving
search scenarios. Furthermore, the prevailing benchmarks utilized for
evaluation lack explicit tailoring to assess instruction-following ability,
thereby hindering progress in this field. In response to these limitations, we
propose a novel benchmark, INSTRUCTIR, specifically designed to evaluate
instruction-following ability in information retrieval tasks. Our approach
focuses on user-aligned instructions tailored to each query instance,
reflecting the diverse characteristics inherent in real-world search scenarios.
Through experimental analysis, we observe that retrievers fine-tuned to follow
task-style instructions, such as INSTRUCTOR, can underperform compared to their
non-instruction-tuned counterparts. This underscores potential overfitting
issues inherent in constructing retrievers trained on existing
instruction-aware retrieval datasets.
Related papers
- Beyond Content Relevance: Evaluating Instruction Following in Retrieval Models [17.202017214385826]
This study evaluates the instruction-following capabilities of various retrieval models beyond content relevance.
We develop a novel retrieval evaluation benchmark spanning six document-level attributes.
Our findings reveal that while reranking models generally surpass retrieval models in instruction following, they still face challenges in handling certain attributes.
arXiv Detail & Related papers (2024-10-31T11:47:21Z)
- Understanding the User: An Intent-Based Ranking Dataset [2.6145315573431214]
This paper proposes an approach to augmenting such datasets to annotate informative query descriptions.
Our methodology involves utilizing state-of-the-art LLMs to analyze and comprehend the implicit intent within individual queries.
By extracting key semantic elements, we construct detailed and contextually rich descriptions for these queries.
arXiv Detail & Related papers (2024-08-30T08:40:59Z)
- Beyond Relevance: Evaluate and Improve Retrievers on Perspective Awareness [56.42192735214931]
Retrievers are expected to not only rely on the semantic relevance between the documents and the queries but also recognize the nuanced intents or perspectives behind a user query.
In this work, we study whether retrievers can recognize and respond to different perspectives of the queries.
We show that current retrievers have limited awareness of subtly different perspectives in queries and can also be biased toward certain perspectives.
arXiv Detail & Related papers (2024-05-04T17:10:00Z)
- ExcluIR: Exclusionary Neural Information Retrieval [74.08276741093317]
We present ExcluIR, a set of resources for exclusionary retrieval.
The evaluation benchmark includes 3,452 high-quality exclusionary queries.
The training set contains 70,293 exclusionary queries, each paired with a positive document and a negative document.
arXiv Detail & Related papers (2024-04-26T09:43:40Z)
- RAR-b: Reasoning as Retrieval Benchmark [7.275757292756447]
We transform reasoning tasks into retrieval tasks to evaluate reasoning abilities stored in retriever models.
Recent decoder-based embedding models show great promise in narrowing the gap.
We release Reasoning as Retrieval Benchmark (RAR-b), a holistic suite of tasks and settings to evaluate the reasoning abilities stored in retriever models.
arXiv Detail & Related papers (2024-04-09T14:34:48Z)
- FollowIR: Evaluating and Teaching Information Retrieval Models to Follow Instructions [71.5977045423177]
We study the use of instructions in Information Retrieval systems.
We introduce our dataset FollowIR, which contains a rigorous instruction evaluation benchmark.
We show that it is possible for IR models to learn to follow complex instructions.
arXiv Detail & Related papers (2024-03-22T14:42:29Z)
- Instruct and Extract: Instruction Tuning for On-Demand Information Extraction [86.29491354355356]
On-Demand Information Extraction aims to fulfill the personalized demands of real-world users.
We present a benchmark named InstructIE, which includes both automatically generated training data and a human-annotated test set.
Building on InstructIE, we further develop an On-Demand Information Extractor, ODIE.
arXiv Detail & Related papers (2023-10-24T17:54:25Z)
- I3: Intent-Introspective Retrieval Conditioned on Instructions [83.91776238599824]
I3 is a unified retrieval system that performs Intent-Introspective retrieval across various tasks conditioned on Instructions without task-specific training.
I3 incorporates a pluggable introspector in a parameter-isolated manner to comprehend specific retrieval intents.
It utilizes extensive LLM-generated data to train I3 phase-by-phase, embodying two key designs: progressive structure pruning and drawback-based data refinement.
arXiv Detail & Related papers (2023-08-19T14:17:57Z)
- Task-aware Retrieval with Instructions [91.87694020194316]
We study the problem of retrieval with instructions, where users of a retrieval system explicitly describe their intent along with their queries.
We present TART, a multi-task retrieval system trained on the diverse retrieval tasks with instructions.
TART shows strong capabilities to adapt to a new task via instructions and advances the state of the art on two zero-shot retrieval benchmarks.
arXiv Detail & Related papers (2022-11-16T23:13:22Z)