LegalRAG: A Hybrid RAG System for Multilingual Legal Information Retrieval
- URL: http://arxiv.org/abs/2504.16121v1
- Date: Sat, 19 Apr 2025 06:09:54 GMT
- Title: LegalRAG: A Hybrid RAG System for Multilingual Legal Information Retrieval
- Authors: Muhammad Rafsan Kabir, Rafeed Mohammad Sultan, Fuad Rahman, Mohammad Ruhul Amin, Sifat Momen, Nabeel Mohammed, Shafin Rahman
- Abstract summary: We develop an efficient bilingual question-answering framework for regulatory documents, specifically the Bangladesh Police Gazettes. Our approach employs modern Retrieval Augmented Generation (RAG) pipelines to enhance information retrieval and response generation. This system enables efficient searching for specific government legal notices, making legal information more accessible.
- Score: 7.059964549363294
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Natural Language Processing (NLP) and computational linguistic techniques are increasingly being applied across various domains, yet their use in legal and regulatory tasks remains limited. To address this gap, we develop an efficient bilingual question-answering framework for regulatory documents, specifically the Bangladesh Police Gazettes, which contain both English and Bangla text. Our approach employs modern Retrieval Augmented Generation (RAG) pipelines to enhance information retrieval and response generation. In addition to conventional RAG pipelines, we propose an advanced RAG-based approach that improves retrieval performance, leading to more precise answers. This system enables efficient searching for specific government legal notices, making legal information more accessible. We evaluate both our proposed and conventional RAG systems on a diverse test set on Bangladesh Police Gazettes, demonstrating that our approach consistently outperforms existing methods across all evaluation metrics.
Related papers
- Multilingual Retrieval-Augmented Generation for Knowledge-Intensive Task [73.35882908048423]
Retrieval-augmented generation (RAG) has become a cornerstone of contemporary NLP. This paper investigates the effectiveness of RAG across multiple languages by proposing novel approaches for multilingual open-domain question-answering.
arXiv Detail & Related papers (2025-04-04T17:35:43Z)
- LexRAG: Benchmarking Retrieval-Augmented Generation in Multi-Turn Legal Consultation Conversation [19.633769905100113]
Retrieval-augmented generation (RAG) has proven highly effective in improving large language models (LLMs) across various domains. There is no benchmark specifically designed to assess the effectiveness of RAG in the legal domain. We propose LexRAG, the first benchmark to evaluate RAG systems for multi-turn legal consultations.
arXiv Detail & Related papers (2025-02-28T01:46:32Z)
- Optimizing Multi-Stage Language Models for Effective Text Retrieval [0.0]
We introduce a novel two-phase text retrieval pipeline optimized for Japanese legal datasets. Our method leverages advanced language models to achieve state-of-the-art performance. To further enhance robustness and adaptability, we incorporate an ensemble model that integrates multiple retrieval strategies.
arXiv Detail & Related papers (2024-12-26T16:05:19Z)
- CORAL: Benchmarking Multi-turn Conversational Retrieval-Augmentation Generation [68.81271028921647]
We introduce CORAL, a benchmark designed to assess RAG systems in realistic multi-turn conversational settings.
CORAL includes diverse information-seeking conversations automatically derived from Wikipedia.
It supports three core tasks of conversational RAG: passage retrieval, response generation, and citation labeling.
arXiv Detail & Related papers (2024-10-30T15:06:32Z)
- KRAG Framework for Enhancing LLMs in the Legal Domain [0.48451657575793666]
This paper introduces Knowledge Representation Augmented Generation (KRAG).
KRAG is a framework designed to enhance the capabilities of Large Language Models (LLMs) within domain-specific applications.
We present Soft PROLEG, an implementation model under KRAG, which uses inference graphs to aid LLMs in delivering structured legal reasoning.
arXiv Detail & Related papers (2024-10-10T02:48:06Z)
- Ground Every Sentence: Improving Retrieval-Augmented LLMs with Interleaved Reference-Claim Generation [51.8188846284153]
RAG has been widely adopted to enhance Large Language Models (LLMs).
Attributed Text Generation (ATG), which provides citations to support the model's responses in RAG, has attracted growing attention.
This paper proposes a fine-grained ATG method called ReClaim (Refer & Claim), which alternates the generation of references and answers step by step.
arXiv Detail & Related papers (2024-07-01T20:47:47Z)
- InternLM-Law: An Open Source Chinese Legal Large Language Model [72.2589401309848]
InternLM-Law is a specialized LLM tailored for addressing diverse legal queries related to Chinese laws.
We meticulously construct a dataset in the Chinese legal domain, encompassing over 1 million queries.
InternLM-Law achieves the highest average performance on LawBench, outperforming state-of-the-art models, including GPT-4, on 13 out of 20 subtasks.
arXiv Detail & Related papers (2024-06-21T06:19:03Z)
- CRUD-RAG: A Comprehensive Chinese Benchmark for Retrieval-Augmented Generation of Large Language Models [49.16989035566899]
Retrieval-Augmented Generation (RAG) is a technique that enhances the capabilities of large language models (LLMs) by incorporating external knowledge sources.
This paper constructs a large-scale and more comprehensive benchmark, and evaluates all the components of RAG systems in various RAG application scenarios.
arXiv Detail & Related papers (2024-01-30T14:25:32Z)
- Finding the Law: Enhancing Statutory Article Retrieval via Graph Neural Networks [3.5880535198436156]
We propose a novel graph-augmented dense statute retriever (G-DSR) model that incorporates the structure of legislation via a graph neural network to improve dense retrieval performance.
Experimental results show that our approach outperforms strong retrieval baselines on a real-world expert-annotated SAR dataset.
arXiv Detail & Related papers (2023-01-30T12:59:09Z)
- LEGAL-BERT: The Muppets straight out of Law School [52.53830441117363]
We explore approaches for applying BERT models to downstream legal tasks, evaluating on multiple datasets.
Our findings indicate that the previous guidelines for pre-training and fine-tuning, often blindly followed, do not always generalize well in the legal domain.
We release LEGAL-BERT, a family of BERT models intended to assist legal NLP research, computational law, and legal technology applications.
arXiv Detail & Related papers (2020-10-06T09:06:07Z)
This list is automatically generated from the titles and abstracts of the papers on this site.