PolicyBot - Reliable Question Answering over Policy Documents
- URL: http://arxiv.org/abs/2511.13489v1
- Date: Mon, 17 Nov 2025 15:26:10 GMT
- Title: PolicyBot - Reliable Question Answering over Policy Documents
- Authors: Gautam Nagarajan, Omir Kumar, Sudarsun Santhiappan
- Abstract summary: This work presents PolicyBot, a retrieval-augmented generation (RAG) system designed to answer user queries over policy documents. The system combines domain-specific semantic chunking, multilingual dense embeddings, multi-stage retrieval with reranking, and source-aware generation to provide responses grounded in the original documents.
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: All citizens of a country are affected by the laws and policies introduced by their government. These laws and policies serve essential functions for citizens, such as granting them certain rights or imposing specific obligations. However, these documents are often lengthy, complex, and difficult to navigate, making it challenging for citizens to locate and understand relevant information. This work presents PolicyBot, a retrieval-augmented generation (RAG) system designed to answer user queries over policy documents with a focus on transparency and reproducibility. The system combines domain-specific semantic chunking, multilingual dense embeddings, multi-stage retrieval with reranking, and source-aware generation to provide responses grounded in the original documents. We implemented citation tracing to reduce hallucinations and improve user trust, and evaluated alternative retrieval and generation configurations to identify effective design choices. The end-to-end pipeline is built entirely with open-source tools, enabling easy adaptation to other domains requiring document-grounded question answering. This work highlights design considerations, practical challenges, and lessons learned in deploying trustworthy RAG systems for governance-related contexts.
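The abstract describes a pipeline of chunking, dense retrieval, reranking, and source-aware generation with citation tracing. A minimal toy sketch of that shape is given below; it is not the authors' implementation — the chunker, the bag-of-words "embedding", and the citation-returning answer function are all simplified stand-ins for the semantic chunking, multilingual dense encoder, and LLM the paper actually uses.

```python
# Toy sketch of a PolicyBot-style RAG pipeline: chunk the policy document,
# retrieve the most relevant chunks, and return an answer grounded in a
# cited chunk. All components are illustrative stand-ins.
import math
import re
from collections import Counter

def chunk(document: str, max_words: int = 40) -> list[str]:
    """Naive stand-in for semantic chunking: pack sentences into
    chunks of at most max_words words."""
    sentences = [s.strip() for s in document.split(".") if s.strip()]
    chunks, current = [], []
    for s in sentences:
        if current and sum(len(c.split()) for c in current) + len(s.split()) > max_words:
            chunks.append(". ".join(current) + ".")
            current = []
        current.append(s)
    if current:
        chunks.append(". ".join(current) + ".")
    return chunks

def embed(text: str) -> Counter:
    """Toy bag-of-words 'embedding'; a real system would use a
    multilingual dense encoder."""
    return Counter(re.findall(r"\w+", text.lower()))

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(query: str, chunks: list[str], k: int = 3) -> list[tuple[int, str]]:
    """First-stage retrieval: rank chunks by similarity and keep the
    top-k, with chunk ids so the answer can cite its sources."""
    q = embed(query)
    scored = sorted(enumerate(chunks), key=lambda ic: cosine(q, embed(ic[1])), reverse=True)
    return scored[:k]

def answer(query: str, document: str) -> str:
    """Source-aware generation stand-in: return the best chunk with a
    citation marker instead of calling an LLM."""
    cid, text = retrieve(query, chunk(document))[0]
    return f"{text} [chunk {cid}]"

# Hypothetical policy snippet for illustration.
policy = ("Citizens may appeal a decision within 30 days. "
          "Appeals must be filed in writing. Fees are waived for appeals.")
print(answer("How long do I have to appeal?", policy))
```

A production version would swap `embed` for a dense encoder, add a cross-encoder reranking stage between `retrieve` and `answer`, and pass the cited chunks to an LLM prompt rather than echoing them.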
Related papers
- DAVE: A Policy-Enforcing LLM Spokesperson for Secure Multi-Document Data Sharing
DAVE is a usage policy-enforcing spokesperson that answers questions over private documents on behalf of a data provider. We formalize policy-violating information disclosure in this setting, drawing on usage control and information flow security. Our contribution is primarily architectural: we do not yet implement or empirically evaluate the full enforcement pipeline.
arXiv Detail & Related papers (2026-02-19T14:43:48Z)
- Long-Context Long-Form Question Answering for Legal Domain
We address the challenges of long-context question answering in the context of long-form answers, given the idiosyncrasies of legal documents. We propose a question answering system that can (a) deconstruct domain-specific vocabulary for better retrieval from source documents, (b) parse complex document layouts while isolating sections and footnotes and linking them appropriately, and (c) generate comprehensive answers using precise domain-specific vocabulary.
arXiv Detail & Related papers (2026-02-06T20:51:13Z)
- Doc-PP: Document Policy Preservation Benchmark for Large Vision-Language Models
We introduce Doc-PP, a novel benchmark constructed from real-world reports requiring reasoning across heterogeneous visual and textual elements under strict non-disclosure policies. Our evaluation highlights a systemic Reasoning-Induced Safety Gap: models frequently leak sensitive information when answers must be inferred through complex synthesis or aggregated across modalities. We propose DVA, a structural inference framework that decouples reasoning from policy verification.
arXiv Detail & Related papers (2026-01-07T13:45:39Z)
- Query Decomposition for RAG: Balancing Exploration-Exploitation
RAG systems address complex user requests by decomposing them into subqueries, retrieving potentially relevant documents for each, and then aggregating them to generate an answer. We formulate query decomposition and document retrieval in an exploitation-exploration setting, where retrieving one document at a time builds a belief about the utility of a given sub-query. Our main finding is that estimating document relevance using rank information and human judgments yields a 35% gain in document-level precision, a 15% increase in alpha-nDCG, and better performance on the downstream task of long-form generation.
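The exploitation-exploration framing above can be illustrated with a bandit-style sketch, where subqueries are arms and each retrieved document updates a belief about its subquery's utility. This is only a generic UCB illustration under assumed components — the subquery names, `retrieve_one` scoring, and the UCB rule are hypothetical, not the paper's algorithm.

```python
# Toy bandit sketch: allocate a retrieval budget across subqueries,
# retrieving one document at a time and updating a per-subquery belief.
import math
import random

random.seed(0)

def retrieve_one(subquery: str) -> float:
    # Hypothetical relevance score of the next document for this subquery;
    # a real system would query an index and judge the returned document.
    base = {"definitions": 0.8, "penalties": 0.5, "exemptions": 0.2}
    return max(0.0, min(1.0, random.gauss(base[subquery], 0.1)))

def ucb_retrieval(subqueries: list[str], budget: int) -> dict[str, list[float]]:
    counts = {q: 0 for q in subqueries}
    means = {q: 0.0 for q in subqueries}
    collected = {q: [] for q in subqueries}
    for t in range(1, budget + 1):
        def ucb(q):
            if counts[q] == 0:
                return float("inf")  # explore untried subqueries first
            # Exploit the running mean, plus an exploration bonus.
            return means[q] + math.sqrt(2 * math.log(t) / counts[q])
        q = max(subqueries, key=ucb)
        score = retrieve_one(q)
        collected[q].append(score)
        counts[q] += 1
        means[q] += (score - means[q]) / counts[q]
    return collected

docs = ucb_retrieval(["definitions", "penalties", "exemptions"], budget=12)
```

Under this rule, each subquery is tried at least once, and the remaining budget drifts toward the subqueries whose retrieved documents have proven most useful.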
arXiv Detail & Related papers (2025-10-21T13:37:11Z)
- Fishing for Answers: Exploring One-shot vs. Iterative Retrieval Strategies for Retrieval Augmented Generation
Retrieval-Augmented Generation (RAG) based on Large Language Models (LLMs) is a powerful solution for understanding and querying an industry's closed-source documents. However, basic RAG often struggles with complex QA tasks in legal and regulatory domains. We explore two strategies to improve evidence coverage and answer quality.
arXiv Detail & Related papers (2025-09-05T05:44:50Z)
- All for law and law for all: Adaptive RAG Pipeline for Legal Research
Retrieval-Augmented Generation (RAG) has transformed how we approach text generation tasks. This work introduces a novel end-to-end RAG pipeline that improves upon previous baselines.
arXiv Detail & Related papers (2025-08-18T17:14:03Z)
- RichRAG: Crafting Rich Responses for Multi-faceted Queries in Retrieval-Augmented Generation
We propose a novel RAG framework, namely RichRAG.
It includes a sub-aspect explorer to identify potential sub-aspects of input questions, a retriever to build a candidate pool of diverse external documents related to these sub-aspects, and a generative list-wise ranker.
Experimental results on two publicly available datasets prove that our framework effectively and efficiently provides comprehensive and satisfying responses to users.
arXiv Detail & Related papers (2024-06-18T12:52:51Z)
- DAPR: A Benchmark on Document-Aware Passage Retrieval
We propose and name this task Document-Aware Passage Retrieval (DAPR).
While analyzing the errors of the State-of-The-Art (SoTA) passage retrievers, we find the major errors (53.5%) are due to missing document context.
Our created benchmark enables future research on developing and comparing retrieval systems for the new task.
arXiv Detail & Related papers (2023-05-23T10:39:57Z)
- Generate rather than Retrieve: Large Language Models are Strong Context Generators
We present a novel perspective for solving knowledge-intensive tasks by replacing document retrievers with large language model generators.
We call our method generate-then-read (GenRead), which first prompts a large language model to generate contextual documents based on a given question, and then reads the generated documents to produce the final answer.
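The two-step generate-then-read flow just described can be sketched as follows. This is a toy illustration only: `call_llm` is a hypothetical stand-in that returns canned text, not a real model call, and the prompts are invented for the example.

```python
# Sketch of the generate-then-read (GenRead) idea: instead of retrieving
# documents, prompt a language model to generate a contextual document,
# then read that document to answer the question.
def call_llm(prompt: str) -> str:
    # Placeholder: a real system would call an actual LLM here. This
    # stand-in returns canned text keyed on the prompt's first word.
    canned = {
        "generate": "The Eiffel Tower, completed in 1889, is located in Paris.",
        "read": "Paris",
    }
    return canned["generate"] if prompt.startswith("Generate") else canned["read"]

def generate_then_read(question: str) -> str:
    # Step 1: generate a contextual document conditioned on the question.
    context = call_llm(f"Generate a background document for: {question}")
    # Step 2: read the generated document to produce the final answer.
    return call_llm(f"Read the document and answer.\n"
                    f"Document: {context}\nQuestion: {question}")

final_answer = generate_then_read("Where is the Eiffel Tower?")
```

The design point is that the generated document serves the same grounding role that retrieved passages play in a standard RAG pipeline.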
arXiv Detail & Related papers (2022-09-21T01:30:59Z)
- GERE: Generative Evidence Retrieval for Fact Verification
We propose GERE, the first system that retrieves evidence in a generative fashion.
The experimental results on the FEVER dataset show that GERE achieves significant improvements over the state-of-the-art baselines.
arXiv Detail & Related papers (2022-04-12T03:49:35Z)
- Design Challenges for a Multi-Perspective Search Engine
We study a new perspective-oriented document retrieval paradigm.
We discuss and assess the inherent natural language understanding challenges in order to achieve the goal.
We use the prototype system to conduct a user survey in order to assess the utility of our paradigm.
arXiv Detail & Related papers (2021-12-15T18:59:57Z)
- Privacy Policy Question Answering Assistant: A Query-Guided Extractive Summarization Approach
We propose an automated privacy policy question answering assistant that extracts a summary in response to the input user query.
This is a challenging task because users articulate their privacy-related questions in a very different language than the legal language of the policy.
Our pipeline is able to find an answer for 89% of the user queries in the privacyQA dataset.
arXiv Detail & Related papers (2021-09-29T18:00:09Z)
- PolicyQA: A Reading Comprehension Dataset for Privacy Policies
We present PolicyQA, a dataset that contains 25,017 reading comprehension style examples curated from an existing corpus of 115 website privacy policies.
We evaluate two existing neural QA models and perform rigorous analysis to reveal the advantages and challenges offered by PolicyQA.
arXiv Detail & Related papers (2020-10-06T09:04:58Z)
- Knowledge-Aided Open-Domain Question Answering
We propose a knowledge-aided open-domain QA (KAQA) method which targets at improving relevant document retrieval and answer reranking.
During document retrieval, a candidate document is scored by considering its relationship to the question and other documents.
During answer reranking, a candidate answer is reranked using not only its own context but also the clues from other documents.
arXiv Detail & Related papers (2020-06-09T13:28:57Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences arising from its use.