Bringing legal knowledge to the public by constructing a legal question bank using large-scale pre-trained language model
- URL: http://arxiv.org/abs/2505.04132v1
- Date: Wed, 07 May 2025 05:07:38 GMT
- Title: Bringing legal knowledge to the public by constructing a legal question bank using large-scale pre-trained language model
- Authors: Mingruo Yuan, Ben Kao, Tien-Hsuan Wu, Michael M. K. Cheung, Henry W. H. Chan, Anne S. Y. Cheung, Felix W. H. Chan, Yongxi Chen
- Abstract summary: We formulate a three-step approach for bringing legal knowledge to laypersons. First, we translate selected sections of the law into snippets (called CLIC-pages). Second, we construct a Legal Question Bank (LQB), which is a collection of legal questions whose answers can be found in the CLIC-pages. Third, we design an interactive CLIC Recommender (CRec).
- Score: 5.4204929130712145
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Access to legal information is fundamental to access to justice. Yet accessibility refers not only to making legal documents available to the public, but also to rendering legal information comprehensible to them. A vexing problem in bringing legal information to the public is how to turn formal legal documents such as legislation and judgments, which are often highly technical, into easily navigable and comprehensible knowledge for those without legal education. In this study, we formulate a three-step approach for bringing legal knowledge to laypersons, tackling the issues of navigability and comprehensibility. First, we translate selected sections of the law into snippets (called CLIC-pages), each a short article that explains a technical legal concept in layperson's terms. Second, we construct a Legal Question Bank (LQB), which is a collection of legal questions whose answers can be found in the CLIC-pages. Third, we design an interactive CLIC Recommender (CRec). Given a user's verbal description of a legal situation that requires a legal solution, CRec interprets the user's input, shortlists the questions from the question bank that are most likely relevant to that situation, and recommends the corresponding CLIC-pages where relevant legal knowledge can be found. In this paper we focus on the technical aspects of creating an LQB. We show how large-scale pre-trained language models, such as GPT-3, can be used to generate legal questions. We compare machine-generated questions (MGQs) against human-composed questions (HCQs) and find that MGQs are more scalable, more cost-effective, and more diversified, while HCQs are more precise. We also show a prototype of CRec and illustrate through an example how our three-step approach effectively brings relevant legal knowledge to the public.
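The shortlisting step described for CRec can be illustrated with a minimal sketch. The abstract does not say how CRec matches a description to questions, so the TF-IDF cosine similarity used here is a stand-in chosen for simplicity, not the authors' method. The question bank entries, the page identifiers, and the names `QUESTION_BANK`, `tokenize`, `cosine`, and `shortlist` are all hypothetical.

```python
import math
import re
from collections import Counter

# Hypothetical question bank: (question, CLIC-page identifier).
# These entries are invented for illustration and are not from the paper.
QUESTION_BANK = [
    ("Can my landlord raise the rent during the tenancy?", "page-tenancy-rent"),
    ("What can I do if my employer does not pay my wages?", "page-employment-wages"),
    ("Can my landlord keep my deposit after I move out?", "page-tenancy-deposit"),
]


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(weight * b.get(term, 0.0) for term, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def shortlist(description, bank, top_k=2):
    """Return the top_k (score, question, page) triples for a description."""
    docs = [tokenize(question) for question, _ in bank]
    n_docs = len(docs)
    # Document frequency: in how many questions each term appears.
    df = Counter(term for doc in docs for term in set(doc))
    # Unsmoothed IDF: a term found in every question gets weight 0.
    idf = {term: math.log(n_docs / count) for term, count in df.items()}

    def vectorize(tokens):
        # Terms unseen in the bank get weight 0.
        return {t: c * idf.get(t, 0.0) for t, c in Counter(tokens).items()}

    query = vectorize(tokenize(description))
    scored = [
        (cosine(query, vectorize(doc)), question, page)
        for doc, (question, page) in zip(docs, bank)
    ]
    scored.sort(key=lambda item: item[0], reverse=True)
    return scored[:top_k]


if __name__ == "__main__":
    hits = shortlist("My landlord refuses to return my deposit", QUESTION_BANK)
    for score, question, page in hits:
        print(f"{score:.3f}  {page}  {question}")
```

A bag-of-words measure like this only matches on shared vocabulary, so a layperson's description that paraphrases the legal terms would need a richer representation.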
Related papers
- LEXam: Benchmarking Legal Reasoning on 340 Law Exams [61.344330783528015]
LEXam is a novel benchmark derived from 340 law exams spanning 116 law school courses across a range of subjects and degree levels. The dataset comprises 4,886 law exam questions in English and German, including 2,841 long-form, open-ended questions and 2,045 multiple-choice questions.
arXiv Detail & Related papers (2025-05-19T08:48:12Z)
- AnnoCaseLaw: A Richly-Annotated Dataset For Benchmarking Explainable Legal Judgment Prediction [56.797874973414636]
AnnoCaseLaw is a first-of-its-kind dataset of 471 meticulously annotated U.S. Appeals Court negligence cases. Our dataset lays the groundwork for more human-aligned, explainable Legal Judgment Prediction models. Results demonstrate that LJP remains a formidable task, with application of legal precedent proving particularly difficult.
arXiv Detail & Related papers (2025-02-28T19:14:48Z)
- LawPal : A Retrieval Augmented Generation Based System for Enhanced Legal Accessibility in India [0.0]
Access to legal knowledge in India is often hindered by a lack of awareness, misinformation and limited accessibility to judicial resources. We propose a Retrieval-Augmented Generation (RAG)-based legal chatbot powered by a FAISS vector store. Our model is trained using an extensive dataset comprising legal books, official documentation and the Indian Constitution.
arXiv Detail & Related papers (2025-02-23T13:45:47Z)
- DeliLaw: A Chinese Legal Counselling System Based on a Large Language Model [16.63238943983347]
DeliLaw is a Chinese legal counselling system based on a large language model.
Users can ask professional legal questions and search for legal articles and relevant judgement cases on the DeliLaw system in a dialogue mode.
arXiv Detail & Related papers (2024-08-01T07:54:52Z)
- LawLuo: A Multi-Agent Collaborative Framework for Multi-Round Chinese Legal Consultation [1.9857357818932064]
LawLuo is a multi-agent framework for multi-turn Chinese legal consultations. LawLuo includes four agents: the receptionist agent, which assesses user intent and selects a lawyer agent; the lawyer agent, which interacts with the user; the secretary agent, which organizes conversation records and generates consultation reports. These agents' interactions mimic the operations of real law firms.
arXiv Detail & Related papers (2024-07-23T07:40:41Z)
- LeKUBE: A Legal Knowledge Update BEnchmark [30.62956609611883]
How to update the legal knowledge of Large Language Models (LLMs) has become an important research problem in practice.
Existing benchmarks for evaluating knowledge update methods are mostly designed for the open domain.
We introduce the Legal Knowledge Update BEnchmark, i.e. LeKUBE, which evaluates knowledge update methods for legal LLMs across five dimensions.
arXiv Detail & Related papers (2024-07-19T10:40:10Z)
- InternLM-Law: An Open Source Chinese Legal Large Language Model [72.2589401309848]
InternLM-Law is a specialized LLM tailored for addressing diverse legal queries related to Chinese laws.
We meticulously construct a dataset in the Chinese legal domain, encompassing over 1 million queries.
InternLM-Law achieves the highest average performance on LawBench, outperforming state-of-the-art models, including GPT-4, on 13 out of 20 subtasks.
arXiv Detail & Related papers (2024-06-21T06:19:03Z)
- DELTA: Pre-train a Discriminative Encoder for Legal Case Retrieval via Structural Word Alignment [55.91429725404988]
We introduce DELTA, a discriminative model designed for legal case retrieval.
We leverage shallow decoders to create information bottlenecks, aiming to enhance the representation ability.
Our approach can outperform existing state-of-the-art methods in legal case retrieval.
arXiv Detail & Related papers (2024-03-27T10:40:14Z)
- Interpretable Long-Form Legal Question Answering with Retrieval-Augmented Large Language Models [10.834755282333589]
The Long-form Legal Question Answering (LLeQA) dataset comprises 1,868 expert-annotated legal questions in the French language.
Our experimental results demonstrate promising performance on automatic evaluation metrics.
As one of the only comprehensive, expert-annotated long-form LQA datasets, LLeQA has the potential to not only accelerate research towards resolving a significant real-world issue, but also act as a rigorous benchmark for evaluating NLP models in specialized domains.
arXiv Detail & Related papers (2023-09-29T08:23:19Z)
- SAILER: Structure-aware Pre-trained Language Model for Legal Case Retrieval [75.05173891207214]
Legal case retrieval plays a core role in the intelligent legal system.
Most existing language models have difficulty understanding the long-distance dependencies between different structures.
We propose a new Structure-Aware pre-traIned language model for LEgal case Retrieval.
arXiv Detail & Related papers (2023-04-22T10:47:01Z)
- Lawformer: A Pre-trained Language Model for Chinese Legal Long Documents [56.40163943394202]
We release a Longformer-based pre-trained language model, named Lawformer, for understanding long Chinese legal documents.
We evaluate Lawformer on a variety of LegalAI tasks, including judgment prediction, similar case retrieval, legal reading comprehension, and legal question answering.
arXiv Detail & Related papers (2021-05-09T09:39:25Z)
- How Does NLP Benefit Legal System: A Summary of Legal Artificial Intelligence [81.04070052740596]
Legal Artificial Intelligence (LegalAI) focuses on applying the technology of artificial intelligence, especially natural language processing, to benefit tasks in the legal domain.
This paper introduces the history, the current state, and the future directions of research in LegalAI.
arXiv Detail & Related papers (2020-04-25T14:45:15Z)
This list is automatically generated from the titles and abstracts of the papers on this site.