OptiSet: Unified Optimizing Set Selection and Ranking for Retrieval-Augmented Generation
- URL: http://arxiv.org/abs/2601.05027v1
- Date: Thu, 08 Jan 2026 15:35:01 GMT
- Title: OptiSet: Unified Optimizing Set Selection and Ranking for Retrieval-Augmented Generation
- Authors: Yi Jiang, Sendong Zhao, Jianbo Li, Bairui Hu, Yanrui Du, Haochun Wang, Bing Qin
- Abstract summary: Retrieval-Augmented Generation (RAG) improves generation quality by incorporating evidence retrieved from large external corpora. We propose OptiSet, a set-centric framework that unifies set selection and set-level ranking for RAG.
- Score: 46.01696202049653
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Retrieval-Augmented Generation (RAG) improves generation quality by incorporating evidence retrieved from large external corpora. However, most existing methods rely on statically selecting top-k passages based on individual relevance, which fails to exploit combinatorial gains among passages and often introduces substantial redundancy. To address this limitation, we propose OptiSet, a set-centric framework that unifies set selection and set-level ranking for RAG. OptiSet adopts an "Expand-then-Refine" paradigm: it first expands a query into multiple perspectives to enable a diverse candidate pool and then refines the candidate pool via re-selection to form a compact evidence set. We then devise a self-synthesis strategy without strong LLM supervision to derive preference labels from the set conditional utility changes of the generator, thereby identifying complementary and redundant evidence. Finally, we introduce a set-list wise training strategy that jointly optimizes set selection and set-level ranking, enabling the model to favor compact, high-gain evidence sets. Extensive experiments demonstrate that OptiSet improves performance on complex combinatorial problems and makes generation more efficient. The source code is publicly available.
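The "refine" half of the Expand-then-Refine paradigm can be illustrated with a minimal sketch. This is not the paper's implementation: the utility oracle, the greedy loop, and the stopping rule are all assumptions for illustration. The idea is that each passage is kept only if it adds positive set-conditional utility over what is already selected, which is how complementary evidence is distinguished from redundant evidence.

```python
from typing import Callable, List


def refine_candidates(
    candidates: List[str],
    utility: Callable[[List[str]], float],
    budget: int,
) -> List[str]:
    """Greedily refine an expanded candidate pool into a compact evidence set.

    At each step, add the passage with the largest marginal utility gain
    (utility of the set including the passage minus utility of the current
    set). Stop when no passage yields a positive gain (all remaining
    passages are redundant) or when the size budget is reached.
    """
    selected: List[str] = []
    remaining = list(candidates)
    while remaining and len(selected) < budget:
        # Set-conditional utility change for each remaining passage.
        gains = [(utility(selected + [p]) - utility(selected), p) for p in remaining]
        best_gain, best_passage = max(gains, key=lambda t: t[0])
        if best_gain <= 0:  # no passage adds complementary information
            break
        selected.append(best_passage)
        remaining.remove(best_passage)
    return selected
```

With a toy utility that counts distinct tokens covered, a duplicate passage contributes zero marginal gain and is dropped, while a passage covering new tokens is kept; in OptiSet the analogous signal would come from the generator's utility change, not token overlap.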
Related papers
- RecNet: Self-Evolving Preference Propagation for Agentic Recommender Systems [109.9061591263748]
RecNet is a self-evolving preference propagation framework for recommender systems. It proactively propagates real-time preference updates across related users and items. In the backward phase, the feedback-driven propagation optimization mechanism simulates a multi-agent reinforcement learning framework.
arXiv Detail & Related papers (2026-01-29T12:14:31Z) - Building Coding Agents via Entropy-Enhanced Multi-Turn Preference Optimization [13.271737599933147]
We introduce EntroPO, an entropy-enhanced framework that adapts existing preference optimization algorithms to the multi-turn, tool-assisted setting. We validate EntroPO by fine-tuning a diverse suite of models from different families and sizes. On the SWE-bench leaderboard, our approach establishes new state-of-the-art results among open-weight models.
arXiv Detail & Related papers (2025-09-15T20:36:19Z) - Shifting from Ranking to Set Selection for Retrieval Augmented Generation [16.374737228461125]
Retrieval in Retrieval-Augmented Generation must ensure that retrieved passages are not only individually relevant but also collectively form a comprehensive set. We propose a set-wise passage selection approach and introduce SETR, which explicitly identifies the information requirements of a query through Chain-of-Thought reasoning. Experiments on multi-hop RAG benchmarks show that SETR outperforms both proprietary LLM-based rerankers and open-source baselines in terms of answer correctness and retrieval quality.
arXiv Detail & Related papers (2025-07-09T13:35:36Z) - Zooming from Context to Cue: Hierarchical Preference Optimization for Multi-Image MLLMs [74.74767980885758]
We propose Context-to-Cue Direct Preference Optimization (CcDPO), a multi-level preference optimization framework. CcDPO enhances per-image perception in multi-image settings by zooming into visual clues, from sequential context to local details. Experiments show that CcDPO significantly reduces hallucinations and yields consistent performance gains.
arXiv Detail & Related papers (2025-05-28T14:24:02Z) - Rethinking LLM-Based Recommendations: A Personalized Query-Driven Parallel Integration [22.650609670923732]
We propose a parallel recommendation framework that decouples large language models from candidate pre-selection. Our framework connects LLMs and recommendation models in a parallel manner, allowing each component to independently utilize its strengths.
arXiv Detail & Related papers (2025-04-16T09:17:45Z) - AMPO: Active Multi-Preference Optimization for Self-play Preference Selection [16.230186347702737]
Multi-preference optimization enriches language-model alignment beyond pairwise preferences by contrasting entire sets of helpful and undesired responses. We propose Active Multi-Preference Optimization (AMPO), a novel approach that combines on-policy generation, a multi-preference group-contrastive loss, and active subset selection. AMPO achieves state-of-the-art results on AlpacaEval using Llama 8B and Mistral 7B.
arXiv Detail & Related papers (2025-02-25T15:29:51Z) - An incremental preference elicitation-based approach to learning potentially non-monotonic preferences in multi-criteria sorting [53.36437745983783]
We first construct a max-margin optimization-based model to model potentially non-monotonic preferences.
We devise information amount measurement methods and question selection strategies to pinpoint the most informative alternative in each iteration.
Two incremental preference elicitation-based algorithms are developed to learn potentially non-monotonic preferences.
arXiv Detail & Related papers (2024-09-04T14:36:20Z) - Training Greedy Policy for Proposal Batch Selection in Expensive Multi-Objective Combinatorial Optimization [52.80408805368928]
We introduce a novel greedy-style subset selection algorithm for batch acquisition.
Our experiments on red fluorescent proteins show that our proposed method matches baseline performance with 1.69x fewer queries.
arXiv Detail & Related papers (2024-06-21T05:57:08Z) - Efficient Prompt Optimization Through the Lens of Best Arm Identification [50.56113809171805]
This work provides a principled framework, TRIPLE, to efficiently perform prompt selection under an explicit budget constraint.
It is built on a novel connection established between prompt optimization and fixed-budget best arm identification (BAI-FB) in multi-armed bandits (MAB).
arXiv Detail & Related papers (2024-02-15T05:31:13Z) - Relative Preference Optimization: Enhancing LLM Alignment through Contrasting Responses across Identical and Diverse Prompts [95.09994361995389]
Relative Preference Optimization (RPO) is designed to discern between more and less preferred responses derived from both identical and related prompts.
RPO has demonstrated a superior ability to align large language models with user preferences and to improve their adaptability during the training process.
arXiv Detail & Related papers (2024-02-12T22:47:57Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.