Rationale-Augmented Retrieval with Constrained LLM Re-Ranking for Task Discovery
- URL: http://arxiv.org/abs/2510.05131v1
- Date: Wed, 01 Oct 2025 01:28:59 GMT
- Title: Rationale-Augmented Retrieval with Constrained LLM Re-Ranking for Task Discovery
- Authors: Bowen Wei,
- Abstract summary: Head Start programs utilizing GoEngage face significant challenges when new or rotating staff attempt to locate appropriate Tasks on the platform homepage.<n>These difficulties arise from domain-specific jargon, system-specific nomenclature, and the inherent limitations of lexical search in handling typos and varied word ordering.<n>We propose a pragmatic hybrid semantic search system that combines lightweight typo-tolerant lexical retrieval, embedding-based vector similarity, and constrained large language model (LLM) re-ranking.
- Score: 4.061135251278187
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Head Start programs utilizing GoEngage face significant challenges when new or rotating staff attempt to locate appropriate Tasks (modules) on the platform homepage. These difficulties arise from domain-specific jargon (e.g., IFPA, DRDP), system-specific nomenclature (e.g., Application Pool), and the inherent limitations of lexical search in handling typos and varied word ordering. We propose a pragmatic hybrid semantic search system that synergistically combines lightweight typo-tolerant lexical retrieval, embedding-based vector similarity, and constrained large language model (LLM) re-ranking. Our approach leverages the organization's existing Task Repository and Knowledge Base infrastructure while ensuring trustworthiness through low false-positive rates, evolvability to accommodate terminological changes, and economic efficiency via intelligent caching, shortlist generation, and graceful degradation mechanisms. We provide a comprehensive framework detailing required resources, a phased implementation strategy with concrete milestones, an offline evaluation protocol utilizing curated test cases (Hit@K, Precision@K, Recall@K, MRR), and an online measurement methodology incorporating query success metrics, zero-result rates, and dwell-time proxies.
Related papers
- Failure is Feedback: History-Aware Backtracking for Agentic Traversal in Multimodal Graphs [13.855117422052315]
Open-domain multimodal document retrieval aims to retrieve specific components from large and interconnected document corpora.<n>Existing graph-based retrieval approaches rely on a uniform similarity metric that overlooks hop-specific semantics.<n>We propose Failure is Feedback (FiF), which casts subgraph retrieval as a sequential decision process.<n>FiF achieves state-of-the-art retrieval on the benchmarks of MultimodalQA, MMCoQA and WebQA.
arXiv Detail & Related papers (2026-02-03T11:54:38Z) - Beyond Monolithic Architectures: A Multi-Agent Search and Knowledge Optimization Framework for Agentic Search [56.78490647843876]
Agentic search has emerged as a promising paradigm for complex information seeking by enabling Large Language Models (LLMs) to interleave reasoning with tool use.<n>We propose bfM-ASK, a framework that explicitly decouples agentic search into two complementary roles: Search Behavior Agents, which plan and execute search actions, and Knowledge Management Agents, which aggregate, filter, and maintain a compact internal context.
arXiv Detail & Related papers (2026-01-08T08:13:27Z) - KBQA-R1: Reinforcing Large Language Models for Knowledge Base Question Answering [64.62317305868264]
We present textbfKBQA-R1, a framework that shifts the paradigm from text imitation to interaction optimization via Reinforcement Learning.<n>Treating KBQA as a multi-turn decision process, our model learns to navigate the knowledge base using a list of actions.<n>Experiments on WebQSP, GrailQA, and GraphQuestions demonstrate that KBQA-R1 achieves state-of-the-art performance.
arXiv Detail & Related papers (2025-12-10T17:45:42Z) - Retrieval Augmented Generation (RAG) for Fintech: Agentic Design and Evaluation [0.16754194618631593]
This paper introduces an agentic RAG architecture to address domain-specific and dense terminology challenges.<n>We evaluate our approach against a standard RAG baseline using a curated dataset of 85 question-answer-reference triples from an enterprise knowledge base.
arXiv Detail & Related papers (2025-10-29T13:41:36Z) - Semantic Caching for Low-Cost LLM Serving: From Offline Learning to Online Adaptation [54.61034867177997]
Caching inference responses allows them to be retrieved without another forward pass through the Large Language Models.<n>Traditional exact-match caching overlooks the semantic similarity between queries, leading to unnecessary recomputation.<n>We present a principled, learning-based framework for semantic cache eviction under unknown query and cost distributions.
arXiv Detail & Related papers (2025-08-11T06:53:27Z) - DAMR: Efficient and Adaptive Context-Aware Knowledge Graph Question Answering with LLM-Guided MCTS [28.828541350757714]
This paper proposes Dynamically Adaptive MCTS-based Reasoning (DAMR) for Knowledge Graph Question Answering (KGQA)<n>DAMR integrates Monte Carlo Tree Search (MCTS) with adaptive path evaluation to enable context-aware KGQA.<n>Experiments on multiple KGQA benchmarks show DAMR significantly outperforms SOTA methods.
arXiv Detail & Related papers (2025-08-01T15:38:21Z) - KinyaColBERT: A Lexically Grounded Retrieval Model for Low-Resource Retrieval-Augmented Generation [5.236553729261855]
We propose a new retriever model, KinyaColBERT, which integrates two key concepts: late word-level interactions between queries and documents, and a morphology-based tokenization coupled with two-tier transformer encoding.<n>Our evaluation results indicate that KinyaColBERT outperforms strong baselines and leading commercial text embedding APIs on a Kinyarwanda agricultural retrieval benchmark.
arXiv Detail & Related papers (2025-07-04T01:18:08Z) - Tree-Based Text Retrieval via Hierarchical Clustering in RAGFrameworks: Application on Taiwanese Regulations [0.0]
We propose a hierarchical clustering-based retrieval method that eliminates the need to predefine k.<n>Our approach maintains the accuracy and relevance of system responses while adaptively selecting semantically relevant content.<n>Our framework is simple to implement and easily integrates with existing RAG pipelines, making it a practical solution for real-world applications under limited resources.
arXiv Detail & Related papers (2025-06-16T15:34:29Z) - SweRank: Software Issue Localization with Code Ranking [109.3289316191729]
SweRank is an efficient retrieve-and-rerank framework for software issue localization.<n>We construct SweLoc, a large-scale dataset curated from public GitHub repositories.<n>We show that SweRank achieves state-of-the-art performance, outperforming both prior ranking models and costly agent-based systems.
arXiv Detail & Related papers (2025-05-07T19:44:09Z) - New Dataset and Methods for Fine-Grained Compositional Referring Expression Comprehension via Specialist-MLLM Collaboration [49.180693704510006]
Referring Expression (REC) is a cross-modal task that evaluates the interplay of language understanding, image comprehension, and language-to-image grounding.<n>It serves as an essential testing ground for Multimodal Large Language Models (MLLMs)
arXiv Detail & Related papers (2025-02-27T13:58:44Z) - In-context Demonstration Matters: On Prompt Optimization for Pseudo-Supervision Refinement [71.60563181678323]
Large language models (LLMs) have achieved great success across diverse tasks, and fine-tuning is sometimes needed to further enhance generation quality.<n>To handle these challenges, a direct solution is to generate high-confidence'' data from unsupervised downstream tasks.<n>We propose a novel approach, pseudo-supervised demonstrations aligned prompt optimization (PAPO) algorithm, which jointly refines both the prompt and the overall pseudo-supervision.
arXiv Detail & Related papers (2024-10-04T03:39:28Z) - Can Long-Context Language Models Subsume Retrieval, RAG, SQL, and More? [54.667202878390526]
Long-context language models (LCLMs) have the potential to revolutionize our approach to tasks traditionally reliant on external tools like retrieval systems or databases.
We introduce LOFT, a benchmark of real-world tasks requiring context up to millions of tokens designed to evaluate LCLMs' performance on in-context retrieval and reasoning.
Our findings reveal LCLMs' surprising ability to rival state-of-the-art retrieval and RAG systems, despite never having been explicitly trained for these tasks.
arXiv Detail & Related papers (2024-06-19T00:28:58Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.