Decomposing Complex Queries for Tip-of-the-tongue Retrieval
- URL: http://arxiv.org/abs/2305.15053v1
- Date: Wed, 24 May 2023 11:43:40 GMT
- Title: Decomposing Complex Queries for Tip-of-the-tongue Retrieval
- Authors: Kevin Lin and Kyle Lo and Joseph E. Gonzalez and Dan Klein
- Abstract summary: Complex queries describe content elements (e.g., book characters or events), information beyond the document text.
This retrieval setting, called tip of the tongue (TOT), is especially challenging for models reliant on lexical and semantic overlap between query and document text.
We introduce a simple yet effective framework for handling such complex queries by decomposing the query into individual clues, routing those as sub-queries to specialized retrievers, and ensembling the results.
- Score: 72.07449449115167
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: When re-finding items, users who forget or are uncertain about identifying
details often rely on creative strategies for expressing their information
needs -- complex queries that describe content elements (e.g., book characters
or events), information beyond the document text (e.g., descriptions of book
covers), or personal context (e.g., when they read a book). This retrieval
setting, called tip of the tongue (TOT), is especially challenging for models
heavily reliant on lexical and semantic overlap between query and document
text. In this work, we introduce a simple yet effective framework for handling
such complex queries by decomposing the query into individual clues, routing
those as sub-queries to specialized retrievers, and ensembling the results.
This approach allows us to take advantage of off-the-shelf retrievers (e.g.,
CLIP for retrieving images of book covers) or incorporate retriever-specific
logic (e.g., date constraints). We show that our framework incorportating query
decompositions into retrievers can improve gold book recall up to 7% relative
again for Recall@5 on a new collection of 14,441 real-world query-book pairs
from an online community for resolving TOT inquiries.
Related papers
- Conversational Query Reformulation with the Guidance of Retrieved Documents [4.438698005789677]
We introduce GuideCQR, a framework that utilizes guided documents to refine queries.
Specifically, we augment keywords, generate expected answers from the re-ranked documents, and unify them with the filtering process.
Experimental results show that queries enhanced by guided documents outperform previous CQR methods.
arXiv Detail & Related papers (2024-07-17T07:39:16Z) - Database-Augmented Query Representation for Information Retrieval [59.57065228857247]
We present a novel retrieval framework called Database-Augmented Query representation (DAQu)
DAQu augments the original query with various (query-related) metadata across multiple tables.
We validate DAQu in diverse retrieval scenarios that can incorporate metadata from the relational database.
arXiv Detail & Related papers (2024-06-23T05:02:21Z) - Redefining Information Retrieval of Structured Database via Large Language Models [9.65171883231521]
This paper introduces a novel retrieval augmentation framework called ChatLR.
It primarily employs the powerful semantic understanding ability of Large Language Models (LLMs) as retrievers to achieve precise and concise information retrieval.
Experimental results demonstrate the effectiveness of ChatLR in addressing user queries, achieving an overall information retrieval accuracy exceeding 98.8%.
arXiv Detail & Related papers (2024-05-09T02:37:53Z) - Ask Optimal Questions: Aligning Large Language Models with Retriever's
Preference in Conversational Search [25.16282868262589]
RetPO is designed to optimize a language model (LM) for reformulating search queries in line with the preferences of the target retrieval systems.
We construct a large-scale dataset called Retrievers' Feedback on over 410K query rewrites across 12K conversations.
The resulting model achieves state-of-the-art performance on two recent conversational search benchmarks.
arXiv Detail & Related papers (2024-02-19T04:41:31Z) - Beyond Extraction: Contextualising Tabular Data for Efficient
Summarisation by Language Models [0.0]
The conventional use of the Retrieval-Augmented Generation architecture has proven effective for retrieving information from diverse documents.
This research introduces an innovative approach to enhance the accuracy of complex table queries in RAG-based systems.
arXiv Detail & Related papers (2024-01-04T16:16:14Z) - Decoding a Neural Retriever's Latent Space for Query Suggestion [28.410064376447718]
We show that it is possible to decode a meaningful query from its latent representation and, when moving in the right direction in latent space, to decode a query that retrieves the relevant paragraph.
We employ the query decoder to generate a large synthetic dataset of query reformulations for MSMarco.
On this data, we train a pseudo-relevance feedback (PRF) T5 model for the application of query suggestion.
arXiv Detail & Related papers (2022-10-21T16:19:31Z) - Graph Enhanced BERT for Query Understanding [55.90334539898102]
query understanding plays a key role in exploring users' search intents and facilitating users to locate their most desired information.
In recent years, pre-trained language models (PLMs) have advanced various natural language processing tasks.
We propose a novel graph-enhanced pre-training framework, GE-BERT, which can leverage both query content and the query graph.
arXiv Detail & Related papers (2022-04-03T16:50:30Z) - Tree-Augmented Cross-Modal Encoding for Complex-Query Video Retrieval [98.62404433761432]
The rapid growth of user-generated videos on the Internet has intensified the need for text-based video retrieval systems.
Traditional methods mainly favor the concept-based paradigm on retrieval with simple queries.
We propose a Tree-augmented Cross-modal.
method by jointly learning the linguistic structure of queries and the temporal representation of videos.
arXiv Detail & Related papers (2020-07-06T02:50:27Z) - Query Resolution for Conversational Search with Limited Supervision [63.131221660019776]
We propose QuReTeC (Query Resolution by Term Classification), a neural query resolution model based on bidirectional transformers.
We show that QuReTeC outperforms state-of-the-art models, and furthermore, that our distant supervision method can be used to substantially reduce the amount of human-curated data required to train QuReTeC.
arXiv Detail & Related papers (2020-05-24T11:37:22Z) - Open-Retrieval Conversational Question Answering [62.11228261293487]
We introduce an open-retrieval conversational question answering (ORConvQA) setting, where we learn to retrieve evidence from a large collection before extracting answers.
We build an end-to-end system for ORConvQA, featuring a retriever, a reranker, and a reader that are all based on Transformers.
arXiv Detail & Related papers (2020-05-22T19:39:50Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.