Related papers: Revisiting Feedback Models for HyDE

Revisiting Feedback Models for HyDE

URL: http://arxiv.org/abs/2511.19349v1
Date: Mon, 24 Nov 2025 17:50:18 GMT
Title: Revisiting Feedback Models for HyDE
Authors: Nour Jedidi, Jimmy Lin,
Abstract summary: HyDE is a method that enriches query representations with LLM-generated hypothetical answer documents.<n>Our experiments show that HyDE's effectiveness can be substantially improved when leveraging feedback algorithms such as Rocchio to extract and weight expansion terms.
Score: 49.53124785319461
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Recent approaches that leverage large language models (LLMs) for pseudo-relevance feedback (PRF) have generally not utilized well-established feedback models like Rocchio and RM3 when expanding queries for sparse retrievers like BM25. Instead, they often opt for a simple string concatenation of the query and LLM-generated expansion content. But is this optimal? To answer this question, we revisit and systematically evaluate traditional feedback models in the context of HyDE, a popular method that enriches query representations with LLM-generated hypothetical answer documents. Our experiments show that HyDE's effectiveness can be substantially improved when leveraging feedback algorithms such as Rocchio to extract and weight expansion terms, providing a simple way to further enhance the accuracy of LLM-based PRF methods.

Related papers

Small Reward Models via Backward Inference [100.59075794599768]
FLIP (FLipped Inference for Prompt Reconstruction) is a reference-free and rubric-free reward modeling approach.<n>It reformulates reward modeling through backward inference: inferring the instruction that would most plausibly produce a given response.
arXiv Detail & Related papers (2026-02-14T01:55:39Z)
DiffuRank: Effective Document Reranking with Diffusion Language Models [71.16830004674513]
We propose DiffuRank, a reranking framework built upon diffusion language models (dLLMs)<n>dLLMs support more flexible decoding and generation processes that are not constrained to a left-to-right order.<n>We show dLLMs achieve performance comparable to, and in some cases exceeding, that of autoregressive LLMs with similar model sizes.
arXiv Detail & Related papers (2026-02-13T02:18:14Z)
LLM-Assisted Pseudo-Relevance Feedback [5.10348690267577]
Pseudo-relevance feedback methods, such as RM3, estimate an expanded query model from the top-ranked documents.<n>We propose a hybrid alternative that preserves the robustness and interpretability of classical PRF while leveraging semantic judgement.
arXiv Detail & Related papers (2026-01-16T12:31:43Z)
Generalized Pseudo-Relevance Feedback [29.669164314207947]
We introduce an assumption-relaxed framework: textitGeneralized Pseudo Relevance Feedback (GPRF)<n>GPRF performs model-free, natural language rewriting based on retrieved documents.<n>Experiments across multiple benchmarks and retrievers demonstrate GPRF consistently outperforms strong baselines.
arXiv Detail & Related papers (2025-10-29T13:08:35Z)
Rethinking On-policy Optimization for Query Augmentation [49.87723664806526]
We present the first systematic comparison of prompting-based and RL-based query augmentation across diverse benchmarks.<n>We introduce a novel hybrid method, On-policy Pseudo-document Query Expansion (OPQE), which learns to generate a pseudo-document that maximizes retrieval performance.
arXiv Detail & Related papers (2025-10-20T04:16:28Z)
How Do LLM-Generated Texts Impact Term-Based Retrieval Models? [76.92519309816008]
This paper investigates the influence of large language models (LLMs) on term-based retrieval models.<n>Our linguistic analysis reveals that LLM-generated texts exhibit smoother high-frequency and steeper low-frequency Zipf slopes.<n>Our study further explores whether term-based retrieval models demonstrate source bias, concluding that these models prioritize documents whose term distributions closely correspond to those of the queries.
arXiv Detail & Related papers (2025-08-25T06:43:27Z)
R$^2$ec: Towards Large Recommender Models with Reasoning [59.32598867813266]
We propose R$2$ec, a unified large recommender model with intrinsic reasoning capability.<n>R$2$ec introduces a dual-head architecture that supports both reasoning chain generation and efficient item prediction in a single model.<n>To overcome the lack of annotated reasoning data, we design RecPO, a reinforcement learning framework.
arXiv Detail & Related papers (2025-05-22T17:55:43Z)
LLM-VPRF: Large Language Model Based Vector Pseudo Relevance Feedback [31.017301950179295]
Vector Pseudo Relevance Feedback (VPRF) has shown promising results in improving BERT-based dense retrieval systems.<n>This paper investigates the generalizability of VPRF to Large Language Model (LLM) based dense retrievers.
arXiv Detail & Related papers (2025-04-02T08:02:01Z)
Pseudo Relevance Feedback is Enough to Close the Gap Between Small and Large Dense Retrieval Models [29.934928091542375]
Scaling dense retrievers to larger large language model (LLM) backbones has been a dominant strategy for improving their retrieval effectiveness.<n>We introduce PromptPRF, a feature-based pseudo-relevance feedback (PRF) framework that enables small LLM-based dense retrievers to achieve effectiveness comparable to much larger models.
arXiv Detail & Related papers (2025-03-19T04:30:20Z)
GenCRF: Generative Clustering and Reformulation Framework for Enhanced Intent-Driven Information Retrieval [20.807374287510623]
We propose GenCRF: a Generative Clustering and Reformulation Framework to capture diverse intentions adaptively. We show that GenCRF achieves state-of-the-art performance, surpassing previous query reformulation SOTAs by up to 12% on nDCG@10.
arXiv Detail & Related papers (2024-09-17T05:59:32Z)
RaFe: Ranking Feedback Improves Query Rewriting for RAG [83.24385658573198]
We propose a framework for training query rewriting models free of annotations. By leveraging a publicly available reranker, oursprovides feedback aligned well with the rewriting objectives.
arXiv Detail & Related papers (2024-05-23T11:00:19Z)
Regression-aware Inference with LLMs [52.764328080398805]
We show that an inference strategy can be sub-optimal for common regression and scoring evaluation metrics. We propose alternate inference strategies that estimate the Bayes-optimal solution for regression and scoring metrics in closed-form from sampled responses.
arXiv Detail & Related papers (2024-03-07T03:24:34Z)

This list is automatically generated from the titles and abstracts of the papers in this site.