Leveraging LLMs for Utility-Focused Annotation: Reducing Manual Effort for Retrieval and RAG
- URL: http://arxiv.org/abs/2504.05220v2
- Date: Tue, 08 Apr 2025 02:11:05 GMT
- Title: Leveraging LLMs for Utility-Focused Annotation: Reducing Manual Effort for Retrieval and RAG
- Authors: Hengran Zhang, Minghao Tang, Keping Bi, Jiafeng Guo, Shihao Liu, Daiting Shi, Dawei Yin, Xueqi Cheng
- Abstract summary: We investigate whether Large Language Models (LLMs) can effectively replace human annotations in training retrieval models. Our experiments show that retrievers trained on utility-focused annotations significantly outperform those trained on human annotations in the out-of-domain setting. Just 20% human-annotated data enables retrievers trained with utility-focused annotations to match the performance of models trained entirely with human annotations.
- Score: 69.51637252264277
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Retrieval models typically rely on costly human-labeled query-document relevance annotations for training and evaluation. To reduce this cost and leverage the potential of Large Language Models (LLMs) in relevance judgments, we aim to explore whether LLM-generated annotations can effectively replace human annotations in training retrieval models. Retrieval usually emphasizes relevance, which indicates "topic-relatedness" of a document to a query, while in RAG, the value of a document (or utility) depends on how it contributes to answer generation. Recognizing this mismatch, some researchers use LLM performance on downstream tasks with documents as labels, but this approach requires manual answers for specific tasks, leading to high costs and limited generalization. In another line of work, prompting LLMs to select useful documents as RAG references eliminates the need for human annotation and is not task-specific. If we leverage LLMs' utility judgments to annotate retrieval data, we may retain cross-task generalization without human annotation in large-scale corpora. Therefore, we investigate utility-focused annotation via LLMs for large-scale retriever training data across both in-domain and out-of-domain settings on the retrieval and RAG tasks. To reduce the impact of low-quality positives labeled by LLMs, we design a novel loss function, i.e., Disj-InfoNCE. Our experiments reveal that: (1) Retrievers trained on utility-focused annotations significantly outperform those trained on human annotations in the out-of-domain setting on both tasks, demonstrating superior generalization capabilities. (2) LLM annotation does not replace human annotation in the in-domain setting. However, incorporating just 20% human-annotated data enables retrievers trained with utility-focused annotations to match the performance of models trained entirely with human annotations.
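The Disj-InfoNCE loss is named but not defined in this abstract; the sketch below shows one plausible reading, assuming the disjunctive variant pools the scores of the multiple LLM-labeled positives with a logsumexp (a soft-OR) in the numerator of a standard InfoNCE objective, so that a single low-quality positive need not dominate the gradient. The function name `disj_infonce_loss`, the tensor shapes, and the temperature value are illustrative assumptions, not taken from the paper.

```python
import torch


def disj_infonce_loss(query_emb: torch.Tensor,
                      pos_embs: torch.Tensor,
                      neg_embs: torch.Tensor,
                      temperature: float = 0.05) -> torch.Tensor:
    """Disjunctive InfoNCE-style loss over multiple, possibly noisy, positives.

    query_emb: (d,)   query embedding
    pos_embs:  (P, d) embeddings of LLM-annotated positive documents
    neg_embs:  (N, d) embeddings of negative documents
    """
    pos_scores = query_emb @ pos_embs.T / temperature  # (P,)
    neg_scores = query_emb @ neg_embs.T / temperature  # (N,)

    # Soft-OR over positives: the query is rewarded if *any* positive scores
    # high, so one mislabeled positive does not dominate the gradient.
    pos_mass = torch.logsumexp(pos_scores, dim=0)
    all_mass = torch.logsumexp(torch.cat([pos_scores, neg_scores]), dim=0)
    return all_mass - pos_mass  # = -log(sum_pos / (sum_pos + sum_neg))


# Illustrative usage with normalized embeddings.
q = torch.nn.functional.normalize(torch.randn(768), dim=0)
pos = torch.nn.functional.normalize(torch.randn(3, 768), dim=1)
neg = torch.nn.functional.normalize(torch.randn(16, 768), dim=1)
loss = disj_infonce_loss(q, pos, neg)
```

With a single positive document the expression reduces to the standard InfoNCE loss commonly used for dense-retriever training.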
Related papers
- Rank-R1: Enhancing Reasoning in LLM-based Document Rerankers via Reinforcement Learning [76.50690734636477]
We introduce Rank-R1, a novel LLM-based reranker that reasons over both the user query and candidate documents before performing the ranking task.
Our experiments on the TREC DL and BRIGHT datasets show that Rank-R1 is highly effective, especially for complex queries.
arXiv Detail & Related papers (2025-03-08T03:14:26Z) - From Human Annotation to LLMs: SILICON Annotation Workflow for Management Research [13.818244562506138]
Large Language Models (LLMs) provide a cost-effective and efficient alternative to human annotation.
This paper introduces the "SILICON" (Systematic Inference with LLMs for Information Classification and Notation) workflow.
The workflow integrates established principles of human annotation with systematic prompt optimization and model selection.
arXiv Detail & Related papers (2024-12-19T02:21:41Z) - Invar-RAG: Invariant LLM-aligned Retrieval for Better Generation [43.630437906898635]
We propose a novel two-stage fine-tuning architecture called Invar-RAG.
In the retrieval stage, an LLM-based retriever is constructed by integrating LoRA-based representation learning.
In the generation stage, a refined fine-tuning method is employed to improve LLM accuracy in generating answers based on retrieved information.
arXiv Detail & Related papers (2024-11-11T14:25:37Z) - DIRAS: Efficient LLM Annotation of Document Relevance in Retrieval Augmented Generation [37.823892101215684]
Domain-specific queries usually need nuanced definitions of relevance beyond shallow semantic relevance. Human or GPT-4 annotation is costly and cannot cover all (query, document) pairs. We propose DIRAS (Domain-specific Information Retrieval with Scalability), a manual-annotation-free schema.
arXiv Detail & Related papers (2024-06-20T10:04:09Z) - PromptReps: Prompting Large Language Models to Generate Dense and Sparse Representations for Zero-Shot Document Retrieval [76.50690734636477]
We propose PromptReps, which combines the advantages of both categories: no need for training and the ability to retrieve from the whole corpus.
The retrieval system harnesses both dense text embedding and sparse bag-of-words representations.
arXiv Detail & Related papers (2024-04-29T04:51:30Z) - Large Language Models for Data Annotation and Synthesis: A Survey [49.8318827245266]
This survey focuses on the utility of Large Language Models for data annotation and synthesis. It includes an in-depth taxonomy of data types that LLMs can annotate, a review of learning strategies for models utilizing LLM-generated annotations, and a detailed discussion of the primary challenges and limitations associated with using LLMs for data annotation and synthesis.
arXiv Detail & Related papers (2024-02-21T00:44:04Z) - LLatrieval: LLM-Verified Retrieval for Verifiable Generation [67.93134176912477]
Verifiable generation aims to let the large language model (LLM) generate text with supporting documents.
We propose LLatrieval (Large Language Model Verified Retrieval), where the LLM updates the retrieval result until it verifies that the retrieved documents can sufficiently support answering the question.
Experiments show that LLatrieval significantly outperforms extensive baselines and achieves state-of-the-art results.
arXiv Detail & Related papers (2023-11-14T01:38:02Z) - CoAnnotating: Uncertainty-Guided Work Allocation between Human and Large Language Models for Data Annotation [94.59630161324013]
We propose CoAnnotating, a novel paradigm for Human-LLM co-annotation of unstructured texts at scale.
Our empirical study on different datasets shows CoAnnotating to be an effective means of allocating work, with up to a 21% performance improvement over a random baseline.
arXiv Detail & Related papers (2023-10-24T08:56:49Z) - Automated Annotation with Generative AI Requires Validation [0.0]
Generative large language models (LLMs) can be a powerful tool for augmenting text annotation procedures.
We outline a workflow to harness the annotation potential of LLMs in a principled, efficient way.
We find that LLM performance for text annotation is promising but highly contingent on both the dataset and the type of annotation task.
arXiv Detail & Related papers (2023-05-31T20:50:45Z) - Query Rewriting for Retrieval-Augmented Large Language Models [139.242907155883]
Large Language Models (LLMs) serve as powerful, black-box readers in the retrieve-then-read pipeline.
This work introduces a new framework, Rewrite-Retrieve-Read, which replaces the previous retrieve-then-read pipeline for retrieval-augmented LLMs.
arXiv Detail & Related papers (2023-05-23T17:27:50Z) - Description-Based Text Similarity [59.552704474862004]
We identify the need to search for texts based on abstract descriptions of their content.
We propose an alternative model that significantly improves performance when used in standard nearest-neighbor search.
arXiv Detail & Related papers (2023-05-21T17:14:31Z) - Active Learning for Abstractive Text Summarization [50.79416783266641]
We propose the first effective query strategy for Active Learning in abstractive text summarization.
We show that using our strategy in AL annotation helps to improve the model performance in terms of ROUGE and consistency scores.
arXiv Detail & Related papers (2023-01-09T10:33:14Z) - Assisted Text Annotation Using Active Learning to Achieve High Quality with Little Effort [9.379650501033465]
We propose a tool that enables researchers to create large, high-quality, annotated datasets with only a few manual annotations.
We combine an active learning (AL) approach with a pre-trained language model to semi-automatically identify annotation categories.
Our preliminary results show that employing AL strongly reduces the number of annotations for correct classification of even complex and subtle frames.
arXiv Detail & Related papers (2021-12-15T13:14:58Z)
This list is automatically generated from the titles and abstracts of the papers in this site.