Related papers: HyQE: Ranking Contexts with Hypothetical Query Embeddings

URL: http://arxiv.org/abs/2410.15262v1
Date: Sun, 20 Oct 2024 03:15:01 GMT
Title: HyQE: Ranking Contexts with Hypothetical Query Embeddings
Authors: Weichao Zhou, Jiaxin Zhang, Hilaf Hasson, Anu Singh, Wenchao Li
Abstract summary: In retrieval-augmented systems, context ranking techniques are commonly employed to reorder the retrieved contexts based on their relevance to a user query. Large language models (LLMs) have been used for ranking contexts. We introduce a scalable ranking framework that combines embedding similarity and LLM capabilities without requiring LLM fine-tuning.
Score: 9.23634055123276
License: http://creativecommons.org/licenses/by/4.0/
Abstract: In retrieval-augmented systems, context ranking techniques are commonly employed to reorder the retrieved contexts based on their relevance to a user query. A standard approach is to measure this relevance through the similarity between contexts and queries in the embedding space. However, such similarity often fails to capture the relevance. Alternatively, large language models (LLMs) have been used for ranking contexts. However, they can encounter scalability issues when the number of candidate contexts grows and the context window sizes of the LLMs remain constrained. Additionally, these approaches require fine-tuning LLMs with domain-specific data. In this work, we introduce a scalable ranking framework that combines embedding similarity and LLM capabilities without requiring LLM fine-tuning. Our framework uses a pre-trained LLM to hypothesize the user query based on the retrieved contexts and ranks the context based on the similarity between the hypothesized queries and the user query. Our framework is efficient at inference time and is compatible with many other retrieval and ranking techniques. Experimental results show that our method improves the ranking performance across multiple benchmarks. The complete code and data are available at https://github.com/zwc662/hyqe
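The ranking idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation (see the linked repository for that). Everything named here is an assumption made for illustration: `embed` is a toy bag-of-words stand-in for a real embedding model, `fake_llm_hypothesize` returns canned queries in place of a pre-trained LLM, and taking the maximum similarity over a context's hypothesized queries is one possible aggregation choice that the abstract does not specify.

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy bag-of-words embedding (assumption: stands in for a real embedding model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def score_context(query, context, hypothesize):
    """Score a context by how close its hypothesized queries are to the user query.

    Assumption: aggregate with max over the hypothesized queries.
    """
    q_vec = embed(query)
    sims = [cosine(q_vec, embed(h)) for h in hypothesize(context)]
    return max(sims, default=0.0)


def rank_contexts(query, contexts, hypothesize):
    """Return contexts sorted from most to least relevant to the query."""
    return sorted(
        contexts,
        key=lambda ctx: score_context(query, ctx, hypothesize),
        reverse=True,
    )


# Hypothetical canned output standing in for an LLM prompted with
# "what question could this passage answer?".
CANNED_QUERIES = {
    "Paris is the capital and largest city of France.": [
        "What is the capital of France?",
        "Which city is the largest in France?",
    ],
    "The Eiffel Tower was completed in 1889.": [
        "When was the Eiffel Tower completed?",
    ],
    "France is famous for its wine and cheese.": [
        "What foods is France known for?",
    ],
}


def fake_llm_hypothesize(context):
    """Stub for the LLM call; returns an empty list for unknown contexts."""
    return CANNED_QUERIES.get(context, [])


contexts = list(CANNED_QUERIES)
ranked = rank_contexts("What is the capital of France?", contexts, fake_llm_hypothesize)
for ctx in ranked:
    print(ctx)
```

In this sketch the hypothesized queries depend only on the contexts, so they could be generated once and reused across user queries; only the query embedding and the similarity computation are needed per request.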

Related papers

Likert or Not: LLM Absolute Relevance Judgments on Fine-Grained Ordinal Scales [3.4068099825211986]
The two most common prompts to elicit relevance judgments are pointwise scoring and listwise ranking. The current research community consensus is that listwise ranking yields superior performance. In tension with this hypothesis, we find that the gap between pointwise scoring and listwise ranking shrinks when pointwise scoring is implemented using a sufficiently large ordinal relevance label space.
arXiv Detail & Related papers (2025-05-25T21:41:35Z)
Maybe you are looking for CroQS: Cross-modal Query Suggestion for Text-to-Image Retrieval [15.757140563856675]
This work introduces a novel task that focuses on suggesting minimal textual modifications needed to explore visually consistent subsets of the collection. To facilitate the evaluation and development of methods, we present a tailored benchmark named CroQS. Baseline methods from related fields, such as image captioning and content summarization, are adapted for this task to provide reference performance scores.
arXiv Detail & Related papers (2024-12-18T13:24:09Z)
Context-DPO: Aligning Language Models for Context-Faithfulness [80.62221491884353]
We propose the first alignment method specifically designed to enhance large language models' context-faithfulness. By leveraging faithful and stubborn responses to questions with provided context from ConFiQA, our Context-DPO aligns LLMs through direct preference optimization. Extensive experiments demonstrate that our Context-DPO significantly improves context-faithfulness, achieving 35% to 280% improvements on popular open-source models.
arXiv Detail & Related papers (2024-12-18T04:08:18Z)
Data Fusion of Synthetic Query Variants With Generative Large Language Models [1.864807003137943]
This work explores the feasibility of using synthetic query variants generated by instruction-tuned Large Language Models in data fusion experiments. We introduce a lightweight, unsupervised, and cost-efficient approach that exploits principled prompting and data fusion techniques. Our analysis shows that data fusion based on synthetic query variants is significantly better than baselines with single queries and also outperforms pseudo-relevance feedback methods.
arXiv Detail & Related papers (2024-11-06T12:54:27Z)
SocialGPT: Prompting LLMs for Social Relation Reasoning via Greedy Segment Optimization [70.11167263638562]
Social relation reasoning aims to identify relation categories such as friends, spouses, and colleagues from images. We first present a simple yet well-crafted framework named SocialGPT, which combines the perception capability of Vision Foundation Models (VFMs) and the reasoning capability of Large Language Models (LLMs) within a modular framework.
arXiv Detail & Related papers (2024-10-28T18:10:26Z)
RARe: Retrieval Augmented Retrieval with In-Context Examples [40.963703726988946]
We introduce a simple approach to enable retrievers to use in-context examples. RARe fine-tunes a pre-trained model with in-context examples whose query is semantically similar to the target query. We find RARe exhibits stronger out-of-domain generalization compared to models using queries without in-context examples.
arXiv Detail & Related papers (2024-10-26T05:46:20Z)
CHIQ: Contextual History Enhancement for Improving Query Rewriting in Conversational Search [67.6104548484555]
We introduce CHIQ, a two-step method that leverages the capabilities of open-source large language models (LLMs) to resolve ambiguities in the conversation history before query rewriting. We demonstrate on five well-established benchmarks that CHIQ leads to state-of-the-art results across most settings.
arXiv Detail & Related papers (2024-06-07T15:23:53Z)
ERBench: An Entity-Relationship based Automatically Verifiable Hallucination Benchmark for Large Language Models [46.07900122810749]
Large language models (LLMs) have achieved unprecedented performances in various applications, yet evaluating them is still challenging. We contend that utilizing existing relational databases is a promising approach for constructing benchmarks. We propose ERBench, which uses these integrity constraints to convert any database into an LLM benchmark.
arXiv Detail & Related papers (2024-03-08T12:42:36Z)
LLMs for Test Input Generation for Semantic Caches [1.8628177380024746]
Large language models (LLMs) enable state-of-the-art semantic capabilities to be added to software systems. At scale, the cost of serving thousands of users increases massively, which also affects user experience. We present VaryGen, an approach for using LLMs for test input generation that produces similar questions from unstructured text documents.
arXiv Detail & Related papers (2024-01-16T06:16:33Z)
Retrieval meets Long Context Large Language Models [59.431200671427064]
Extending the context window of large language models (LLMs) has recently become popular. Retrieval-augmentation versus long context window: which one is better for downstream tasks? Can both methods be combined to get the best of both worlds? Our best model, retrieval-augmented Llama2-70B with 32K context window, outperforms GPT-3.5-turbo-16k and Davinci003 in terms of average score on nine long context tasks.
arXiv Detail & Related papers (2023-10-04T17:59:41Z)
Context Aware Query Rewriting for Text Rankers using LLM [5.164642900490078]
We analyze the utility of large language models for improved query rewriting for text ranking tasks. We adopt a simple, yet surprisingly effective, approach called context aware query rewriting (CAR). We find that fine-tuning a ranker using rewritten queries offers a significant improvement of up to 33% on the passage ranking task and up to 28% on the document ranking task.
arXiv Detail & Related papers (2023-08-31T14:19:50Z)
Allies: Prompting Large Language Model with Beam Search [107.38790111856761]
In this work, we propose a novel method called ALLIES. Given an input query, ALLIES leverages LLMs to iteratively generate new queries related to the original query. By iteratively refining and expanding the scope of the original query, ALLIES captures and utilizes hidden knowledge that may not be directly obtainable through retrieval.
arXiv Detail & Related papers (2023-05-24T06:16:44Z)
Large Language Models are Strong Zero-Shot Retriever [89.16756291653371]
We propose a simple method that applies a large language model (LLM) to large-scale retrieval in zero-shot scenarios. Our method, the Language Model as Retriever (LameR), is built upon no other neural models but an LLM.
arXiv Detail & Related papers (2023-04-27T14:45:55Z)
Query Expansion Using Contextual Clue Sampling with Language Models [69.51976926838232]
We propose a combination of an effective filtering strategy and fusion of the retrieved documents based on the generation probability of each context. Our lexical matching based approach achieves a similar top-5/top-20 retrieval accuracy and higher top-100 accuracy compared with the well-established dense retrieval model DPR. For end-to-end QA, the reader model also benefits from our method and achieves the highest Exact-Match score against several competitive baselines.
arXiv Detail & Related papers (2022-10-13T15:18:04Z)

This list is automatically generated from the titles and abstracts of the papers on this site.