Exploring Possibilities of AI-Powered Legal Assistance in Bangladesh through Large Language Modeling
- URL: http://arxiv.org/abs/2410.17210v1
- Date: Tue, 22 Oct 2024 17:34:59 GMT
- Title: Exploring Possibilities of AI-Powered Legal Assistance in Bangladesh through Large Language Modeling
- Authors: Azmine Toushik Wasi, Wahid Faisal, Mst Rafia Islam, Mahathir Mohammad Bappy,
- Abstract summary: This research seeks to develop a specialized Large Language Model (LLM) to assist in the Bangladeshi legal system.
We created UKIL-DB-EN, an English corpus of Bangladeshi legal documents, by collecting and scraping data on various legal acts.
We fine-tuned the GPT-2 model on this dataset to develop GPT2-UKIL-EN, an LLM focused on providing legal assistance in English.
- Score: 0.0
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Purpose: Bangladesh's legal system struggles with major challenges like delays, complexity, high costs, and millions of unresolved cases, which deter many from pursuing legal action due to lack of knowledge or financial constraints. This research seeks to develop a specialized Large Language Model (LLM) to assist in the Bangladeshi legal system. Methods: We created UKIL-DB-EN, an English corpus of Bangladeshi legal documents, by collecting and scraping data on various legal acts. We fine-tuned the GPT-2 model on this dataset to develop GPT2-UKIL-EN, an LLM focused on providing legal assistance in English. Results: The model was rigorously evaluated using semantic assessments, including case studies supported by expert opinions. The evaluation provided promising results, demonstrating the potential for the model to assist in legal matters within Bangladesh. Conclusion: Our work represents the first structured effort toward building an AI-based legal assistant for Bangladesh. While the results are encouraging, further refinements are necessary to improve the model's accuracy, credibility, and safety. This is a significant step toward creating a legal AI capable of serving the needs of a population of 180 million.
Related papers
- LegalOne: A Family of Foundation Models for Reliable Legal Reasoning [54.57434222018289]
We present LegalOne, a family of foundational models specifically tailored for the Chinese legal domain.<n>LegalOne is developed through a comprehensive three-phase pipeline designed to master legal reasoning.<n>We publicly release the LegalOne weights and the LegalKit evaluation framework to advance the field of Legal AI.
arXiv Detail & Related papers (2026-01-31T10:18:32Z) - Large Language Models' Complicit Responses to Illicit Instructions across Socio-Legal Contexts [54.15982476754607]
Large language models (LLMs) are now deployed at unprecedented scale, assisting millions of users in daily tasks.<n>This study defines complicit facilitation as the provision of guidance or support that enables illicit user instructions.<n>Using real-world legal cases and established legal frameworks, we construct an evaluation benchmark spanning 269 illicit scenarios and 50 illicit intents.
arXiv Detail & Related papers (2025-11-25T16:01:31Z) - Assessing the Reliability of Large Language Models in the Bengali Legal Context: A Comparative Evaluation Using LLM-as-Judge and Legal Experts [0.0]
Generative AI models like OpenAI GPT-4.1 Mini, Gemini 2.0 Flash, Meta Llama 3 70B, and DeepSeek R1 could potentially democratize legal assistance.<n>In this study, we collected 250 authentic legal questions from the Facebook group "Know Your Rights"<n>We evaluated each AI-generated response across four critical dimensions: factual accuracy, legal appropriateness, completeness, and clarity.
arXiv Detail & Related papers (2025-11-07T02:44:00Z) - Mina: A Multilingual LLM-Powered Legal Assistant Agent for Bangladesh for Empowering Access to Justice [5.215285027585101]
Existing AI legal assistants lack Bengali-language support and jurisdiction-specific adaptation.<n>We developed Mina, a multilingual LLM-based legal assistant tailored for the Bangladeshi context.<n>It employs multilingual embeddings and a RAG-based chain-of-tools framework for retrieval, reasoning, translation, and document generation.
arXiv Detail & Related papers (2025-11-04T20:43:28Z) - ClaimGen-CN: A Large-scale Chinese Dataset for Legal Claim Generation [56.79698529022327]
Legal claims refer to the plaintiff's demands in a case and are essential to guiding judicial reasoning and case resolution.<n>This paper explores the problem of legal claim generation based on the given case's facts.<n>We construct ClaimGen-CN, the first dataset for Chinese legal claim generation task.
arXiv Detail & Related papers (2025-08-24T07:19:25Z) - GLARE: Agentic Reasoning for Legal Judgment Prediction [60.13483016810707]
Legal judgment prediction (LJP) has become increasingly important in the legal field.<n>Existing large language models (LLMs) have significant problems of insufficient reasoning due to a lack of legal knowledge.<n>We introduce GLARE, an agentic legal reasoning framework that dynamically acquires key legal knowledge by invoking different modules.
arXiv Detail & Related papers (2025-08-22T13:38:12Z) - A Data Science Approach to Calcutta High Court Judgments: An Efficient LLM and RAG-powered Framework for Summarization and Similar Cases Retrieval [2.359291431338925]
This research presents a framework to analyze Calcutta High Court verdicts.<n>By fine-tuning the Pegasus model, we achieve significant improvements in the summarization of legal cases.<n>The RAG-powered framework efficiently retrieves similar cases in response to user queries, offering thorough overviews and summaries.
arXiv Detail & Related papers (2025-06-28T20:24:34Z) - Legal Assist AI: Leveraging Transformer-Based Model for Effective Legal Assistance [0.18749305679160366]
Many citizens in India struggle to leverage their legal rights due to limited awareness and access to relevant legal information.<n>This paper introduces Legal Assist AI, a transformer-based model designed to bridge this gap by offering effective legal assistance through large language models (LLMs)<n>The model was evaluated against state-of-the-art models such as GPT-3.5 Turbo and Mistral 7B, achieving a 60.08% score on the AIBE.
arXiv Detail & Related papers (2025-05-28T06:06:53Z) - A Law Reasoning Benchmark for LLM with Tree-Organized Structures including Factum Probandum, Evidence and Experiences [76.73731245899454]
We propose a transparent law reasoning schema enriched with hierarchical factum probandum, evidence, and implicit experience.
Inspired by this schema, we introduce the challenging task, which takes a textual case description and outputs a hierarchical structure justifying the final decision.
This benchmark paves the way for transparent and accountable AI-assisted law reasoning in the Intelligent Court''
arXiv Detail & Related papers (2025-03-02T10:26:54Z) - AnnoCaseLaw: A Richly-Annotated Dataset For Benchmarking Explainable Legal Judgment Prediction [56.797874973414636]
AnnoCaseLaw is a first-of-its-kind dataset of 471 meticulously annotated U.S. Appeals Court negligence cases.
Our dataset lays the groundwork for more human-aligned, explainable Legal Judgment Prediction models.
Results demonstrate that LJP remains a formidable task, with application of legal precedent proving particularly difficult.
arXiv Detail & Related papers (2025-02-28T19:14:48Z) - AI Adoption to Combat Financial Crime: Study on Natural Language Processing in Adverse Media Screening of Financial Services in English and Bangla multilingual interpretation [0.0]
This report measures the effectiveness of NLP is promising with an accuracy around 94%.
Some AML & CFT concerns are already being addressed by AI technology.
This investigation underscores the potential of AI-driven NLP solutions in fortifying efforts to prevent financial crimes in Bangladesh.
arXiv Detail & Related papers (2024-12-12T07:17:05Z) - NyayaAnumana & INLegalLlama: The Largest Indian Legal Judgment Prediction Dataset and Specialized Language Model for Enhanced Decision Analysis [5.790242888372048]
This paper introduces NyayaAnumana, the largest and most diverse corpus of Indian legal cases compiled for legal judgment prediction (LJP)
NyayaAnumana includes a wide range of cases from the Supreme Court, High Courts, Tribunal Courts, District Courts, and Daily Orders.
In addition to the dataset, we present INLegalLlama, a domain-specific generative large language model (LLM) tailored to the intricacies of the Indian legal system.
arXiv Detail & Related papers (2024-12-11T13:50:17Z) - CLERC: A Dataset for Legal Case Retrieval and Retrieval-Augmented Analysis Generation [44.67578050648625]
We transform a large open-source legal corpus into a dataset supporting information retrieval (IR) and retrieval-augmented generation (RAG)
This dataset CLERC is constructed for training and evaluating models on their ability to (1) find corresponding citations for a given piece of legal analysis and to (2) compile the text of these citations into a cogent analysis that supports a reasoning goal.
arXiv Detail & Related papers (2024-06-24T23:57:57Z) - InternLM-Law: An Open Source Chinese Legal Large Language Model [72.2589401309848]
InternLM-Law is a specialized LLM tailored for addressing diverse legal queries related to Chinese laws.
We meticulously construct a dataset in the Chinese legal domain, encompassing over 1 million queries.
InternLM-Law achieves the highest average performance on LawBench, outperforming state-of-the-art models, including GPT-4, on 13 out of 20 subtasks.
arXiv Detail & Related papers (2024-06-21T06:19:03Z) - The Power of Question Translation Training in Multilingual Reasoning: Broadened Scope and Deepened Insights [108.40766216456413]
We propose a question alignment framework to bridge the gap between large language models' English and non-English performance.
Experiment results show it can boost multilingual performance across diverse reasoning scenarios, model families, and sizes.
We analyze representation space, generated response and data scales, and reveal how question translation training strengthens language alignment within LLMs.
arXiv Detail & Related papers (2024-05-02T14:49:50Z) - INACIA: Integrating Large Language Models in Brazilian Audit Courts:
Opportunities and Challenges [7.366861473623427]
INACIA is a groundbreaking system designed to integrate Large Language Models (LLMs) into the operational framework of Brazilian Federal Court of Accounts (TCU)
We demonstrate INACIA's potential in extracting relevant information from case documents, evaluating its legal plausibility, and formulating propositions for judicial decision-making.
arXiv Detail & Related papers (2024-01-10T17:13:28Z) - Report of the 1st Workshop on Generative AI and Law [78.62063815165968]
This report presents the takeaways of the inaugural Workshop on Generative AI and Law (GenLaw)
A cross-disciplinary group of practitioners and scholars from computer science and law convened to discuss the technical, doctrinal, and policy challenges presented by law for Generative AI.
arXiv Detail & Related papers (2023-11-11T04:13:37Z) - Automated Argument Generation from Legal Facts [6.057773749499076]
The number of cases submitted to the law system is far greater than the available number of legal professionals in a country.
In this study we partcularly focus on helping legal professionals in the process of analyzing a legal case.
Experimental results show that the generated arguments from the best performing method have on average 63% overlap with the benchmark set gold standard annotations.
arXiv Detail & Related papers (2023-10-09T12:49:35Z) - Chatlaw: A Multi-Agent Collaborative Legal Assistant with Knowledge Graph Enhanced Mixture-of-Experts Large Language Model [30.30848216845138]
Chatlaw is an innovative legal assistant utilizing a Mixture-of-Experts (MoE) model and a multi-agent system.
By integrating knowledge graphs with artificial screening, we construct a high-quality legal dataset to train the MoE model.
Our MoE model outperforms GPT-4 in the Lawbench and Unified Exam Qualification for Legal Professionals by 7.73% in accuracy and 11 points, respectively.
arXiv Detail & Related papers (2023-06-28T10:48:34Z) - Large Language Models as Tax Attorneys: A Case Study in Legal
Capabilities Emergence [5.07013500385659]
This paper explores Large Language Models' (LLMs) capabilities in applying tax law.
Our experiments demonstrate emerging legal understanding capabilities, with improved performance in each subsequent OpenAI model release.
Findings indicate that LLMs, particularly when combined with prompting enhancements and the correct legal texts, can perform at high levels of accuracy but not yet at expert tax lawyer levels.
arXiv Detail & Related papers (2023-06-12T12:40:48Z) - Understand Legal Documents with Contextualized Large Language Models [16.416510744265086]
We present our systems for SemEval-2023 Task 6: understanding legal texts.
We first develop the Legal-BERT-HSLN model that considers the comprehensive context information in both intra- and inter-sentence levels.
We then train a Legal-LUKE model, which is legal-contextualized and entity-aware, to recognize legal entities.
arXiv Detail & Related papers (2023-03-21T18:48:11Z) - Lawformer: A Pre-trained Language Model for Chinese Legal Long Documents [56.40163943394202]
We release the Longformer-based pre-trained language model, named as Lawformer, for Chinese legal long documents understanding.
We evaluate Lawformer on a variety of LegalAI tasks, including judgment prediction, similar case retrieval, legal reading comprehension, and legal question answering.
arXiv Detail & Related papers (2021-05-09T09:39:25Z) - How Does NLP Benefit Legal System: A Summary of Legal Artificial
Intelligence [81.04070052740596]
Legal Artificial Intelligence (LegalAI) focuses on applying the technology of artificial intelligence, especially natural language processing, to benefit tasks in the legal domain.
This paper introduces the history, the current state, and the future directions of research in LegalAI.
arXiv Detail & Related papers (2020-04-25T14:45:15Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.