Related papers: Crafting the Path: Robust Query Rewriting for Information Retrieval

Crafting the Path: Robust Query Rewriting for Information Retrieval

URL: http://arxiv.org/abs/2407.12529v2
Date: Mon, 26 Aug 2024 04:28:41 GMT
Title: Crafting the Path: Robust Query Rewriting for Information Retrieval
Authors: Ingeol Baek, Jimin Lee, Joonho Yang, Hwanhee Lee,
Abstract summary: We propose a novel structured query rewriting method called Crafting the Path tailored for retrieval systems. We demonstrate that our method is less dependent on the internal parameter knowledge of the model and generates queries with fewer factual inaccuracies.
Score: 4.252699657665555
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Query rewriting aims to generate a new query that can complement the original query to improve the information retrieval system. Recent studies on query rewriting, such as query2doc, query2expand and querey2cot, rely on the internal knowledge of Large Language Models (LLMs) to generate a relevant passage to add information to the query. Nevertheless, the efficacy of these methodologies may markedly decline in instances where the requisite knowledge is not encapsulated within the model's intrinsic parameters. In this paper, we propose a novel structured query rewriting method called Crafting the Path tailored for retrieval systems. Crafting the Path involves a three-step process that crafts query-related information necessary for finding the passages to be searched in each step. Specifically, the Crafting the Path begins with Query Concept Comprehension, proceeds to Query Type Identification, and finally conducts Expected Answer Extraction. Experimental results show that our method outperforms previous rewriting methods, especially in less familiar domains for LLMs. We demonstrate that our method is less dependent on the internal parameter knowledge of the model and generates queries with fewer factual inaccuracies. Furthermore, we observe that \name{} demonstrates superior performance in the retrieval-augmented generation scenarios.

Related papers

Rethinking On-policy Optimization for Query Augmentation [49.87723664806526]
We present the first systematic comparison of prompting-based and RL-based query augmentation across diverse benchmarks.<n>We introduce a novel hybrid method, On-policy Pseudo-document Query Expansion (OPQE), which learns to generate a pseudo-document that maximizes retrieval performance.
arXiv Detail & Related papers (2025-10-20T04:16:28Z)
DOGR: Leveraging Document-Oriented Contrastive Learning in Generative Retrieval [10.770281363775148]
We propose a novel and general generative retrieval framework, namely Leveraging Document-Oriented Contrastive Learning in Generative Retrieval (DOGR) It adopts a two-stage learning strategy that captures the relationship between queries and documents comprehensively through direct interactions. Negative sampling methods and corresponding contrastive learning objectives are implemented to enhance the learning of semantic representations.
arXiv Detail & Related papers (2025-02-11T03:25:42Z)
QUIDS: Query Intent Generation via Dual Space Modeling [12.572815037915348]
We propose a dual-space model that uses semantic relevance and irrelevance information in the returned documents to explain the understanding of the query intent. Our methods produce high-quality query intent descriptions, outperforming existing methods for this task, as well as state-of-the-art query-based summarization methods.
arXiv Detail & Related papers (2024-10-16T09:28:58Z)
Query-oriented Data Augmentation for Session Search [71.84678750612754]
We propose query-oriented data augmentation to enrich search logs and empower the modeling. We generate supplemental training pairs by altering the most important part of a search context. We develop several strategies to alter the current query, resulting in new training data with varying degrees of difficulty.
arXiv Detail & Related papers (2024-07-04T08:08:33Z)
A Surprisingly Simple yet Effective Multi-Query Rewriting Method for Conversational Passage Retrieval [14.389703823471574]
We propose the use of a neural query rewriter to generate multiple queries and show how to integrate those queries in the passage retrieval pipeline efficiently. The main strength of our approach lies in its simplicity: it leverages how the beam search algorithm works and can produce multiple query rewrites at no additional cost.
arXiv Detail & Related papers (2024-06-27T07:43:03Z)
LLM-R2: A Large Language Model Enhanced Rule-based Rewrite System for Boosting Query Efficiency [65.01402723259098]
We propose a novel method of query rewrite named LLM-R2, adopting a large language model (LLM) to propose possible rewrite rules for a database rewrite system. Experimental results have shown that our method can significantly improve the query execution efficiency and outperform the baseline methods.
arXiv Detail & Related papers (2024-04-19T13:17:07Z)
DIVKNOWQA: Assessing the Reasoning Ability of LLMs via Open-Domain Question Answering over Knowledge Base and Text [73.68051228972024]
Large Language Models (LLMs) have exhibited impressive generation capabilities, but they suffer from hallucinations when relying on their internal knowledge. Retrieval-augmented LLMs have emerged as a potential solution to ground LLMs in external knowledge.
arXiv Detail & Related papers (2023-10-31T04:37:57Z)
MILL: Mutual Verification with Large Language Models for Zero-Shot Query Expansion [39.24969189479343]
We propose a novel zero-shot query expansion framework utilizing large language models (LLMs) for mutual verification. Our proposed method is fully zero-shot, and extensive experiments on three public benchmark datasets are conducted to demonstrate its effectiveness.
arXiv Detail & Related papers (2023-10-29T16:04:10Z)
Enhancing Conversational Search: Large Language Model-Aided Informative Query Rewriting [42.35788605017555]
We propose utilizing large language models (LLMs) as query rewriters. We define four essential properties for well-formed rewrites and incorporate all of them into the instruction. We introduce the role of rewrite editors for LLMs when initial query rewrites are available, forming a "rewrite-then-edit" process.
arXiv Detail & Related papers (2023-10-15T03:04:17Z)
ConvGQR: Generative Query Reformulation for Conversational Search [37.54018632257896]
ConvGQR is a new framework to reformulate conversational queries based on generative pre-trained language models. We propose a knowledge infusion mechanism to optimize both query reformulation and retrieval.
arXiv Detail & Related papers (2023-05-25T01:45:06Z)
Query Rewriting for Retrieval-Augmented Large Language Models [139.242907155883]
Large Language Models (LLMs) play powerful, black-box readers in the retrieve-then-read pipeline. This work introduces a new framework, Rewrite-Retrieve-Read instead of the previous retrieve-then-read for the retrieval-augmented LLMs.
arXiv Detail & Related papers (2023-05-23T17:27:50Z)
CAPSTONE: Curriculum Sampling for Dense Retrieval with Document Expansion [68.19934563919192]
We propose a curriculum sampling strategy that utilizes pseudo queries during training and progressively enhances the relevance between the generated query and the real query. Experimental results on both in-domain and out-of-domain datasets demonstrate that our approach outperforms previous dense retrieval models.
arXiv Detail & Related papers (2022-12-18T15:57:46Z)
Query Understanding via Intent Description Generation [75.64800976586771]
We propose a novel Query-to-Intent-Description (Q2ID) task for query understanding. Unlike existing ranking tasks which leverage the query and its description to compute the relevance of documents, Q2ID is a reverse task which aims to generate a natural language intent description. We demonstrate the effectiveness of our model by comparing with several state-of-the-art generation models on the Q2ID task.
arXiv Detail & Related papers (2020-08-25T08:56:40Z)

This list is automatically generated from the titles and abstracts of the papers in this site.