PARAMANU-AYN: An Efficient Novel Generative and Instruction-tuned Language Model for Indian Legal Case Documents
- URL: http://arxiv.org/abs/2403.13681v1
- Date: Wed, 20 Mar 2024 15:39:54 GMT
- Title: PARAMANU-AYN: An Efficient Novel Generative and Instruction-tuned Language Model for Indian Legal Case Documents
- Authors: Mitodru Niyogi, Arnab Bhattacharya
- Abstract summary: Paramanu-Ayn is a language model based exclusively on case documents of the Supreme Court of India, the Constitution of India, and the Indian Penal Code.
Our model can be run on CPU and achieved 42.46 tokens/sec CPU inference speed.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: In this paper, we present PARAMANU-AYN, a language model based exclusively on case documents of the Supreme Court of India, the Constitution of India, and the Indian Penal Code. The novel Auto Regressive (AR) decoder-based model is pretrained from scratch at a context size of 8192. We evaluated our pretrained legal model on perplexity metrics. We also instruction-tuned our pretrained model on a set of 10,763 instructions covering various legal tasks such as legal reasoning, judgement explanation, legal clause generation, legal drafting, legal contract drafting, case summarization, constitutional question-answering, etc. We also had GPT-3.5-Turbo evaluate the instruction-tuned model's responses on clarity, relevance, completeness, and legal reasoning, each on a scale of 10. Our model can be run on CPU and achieves an inference speed of 42.46 tokens/sec on CPU. We found that our models, despite not being pretrained on legal books, various legal contracts, and legal documents, were able to learn the domain knowledge required for drafting various legal contracts and legal clauses, and generalize to draft legal contracts and legal clauses with limited instruction tuning. Hence, we conclude that very large amounts of data are not required to develop a strong domain-specialized generative language model (such as a legal one) from scratch. We believe that this work is the first attempt to make a dedicated generative legal language model from scratch for the Indian Supreme Court jurisdiction, or in legal NLP overall. We plan to release our Paramanu-Ayn model at https://www.bharatgpts.com.
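The abstract reports perplexity as the evaluation metric for the pretrained model. As a reminder of what that number measures, here is a minimal, self-contained sketch (not the authors' code) of computing perplexity from a model's per-token log-probabilities:

```python
import math

def perplexity(token_log_probs):
    # Perplexity is the exponential of the average negative
    # log-likelihood per token: lower means the model assigns
    # higher probability to the observed text.
    n = len(token_log_probs)
    avg_nll = -sum(token_log_probs) / n
    return math.exp(avg_nll)

# A model that spreads probability uniformly over 4 candidate
# tokens at every step has perplexity exactly 4.
lp = [math.log(0.25)] * 10
print(round(perplexity(lp), 6))  # → 4.0
```

In practice these log-probabilities come from the language model's softmax outputs over a held-out corpus; the toy values above are only for illustration.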
Related papers
- IL-TUR: Benchmark for Indian Legal Text Understanding and Reasoning [16.12863746776168]
Legal systems worldwide are inundated with exponential growth in cases and documents.
There is an imminent need to develop NLP and ML techniques for automatically processing and understanding legal documents.
This paper proposes IL-TUR: Benchmark for Indian Legal Text Understanding and Reasoning.
arXiv Detail & Related papers (2024-07-07T14:55:04Z)
- LawGPT: A Chinese Legal Knowledge-Enhanced Large Language Model [44.71845500433037]
We introduce LawGPT, the first open-source model specifically designed for Chinese legal applications.
LawGPT comprises two key components: legal-oriented pre-training and legal supervised fine-tuning.
Our experimental results demonstrate that LawGPT outperforms the open-source LLaMA 7B model.
arXiv Detail & Related papers (2024-06-07T03:52:56Z)
- Leveraging open-source models for legal language modeling and analysis: a case study on the Indian constitution [0.0]
This paper presents a novel approach to legal language modeling (LLM) and analysis using open-source models from Hugging Face.
We leverage Hugging Face embeddings via LangChain and Sentence Transformers.
We then demonstrate the application of this model by extracting insights from the official Constitution of India.
arXiv Detail & Related papers (2024-04-10T05:35:47Z)
- Towards Explainability in Legal Outcome Prediction Models [64.00172507827499]
We argue that precedent is a natural way of facilitating explainability for legal NLP models.
By developing a taxonomy of legal precedent, we are able to compare human judges and neural models.
We find that while the models learn to predict outcomes reasonably well, their use of precedent is unlike that of human judges.
arXiv Detail & Related papers (2024-03-25T15:15:41Z)
- SLJP: Semantic Extraction based Legal Judgment Prediction [0.0]
Legal Judgment Prediction (LJP) is a judicial assistance system that recommends legal components such as applicable statutes, prison terms, and penalties.
Most of the existing Indian models did not adequately concentrate on the semantics embedded in the fact description (FD) that impacts the decision.
The proposed semantic extraction based LJP (SLJP) model provides the advantages of pretrained transformers for complex unstructured legal case document understanding.
arXiv Detail & Related papers (2023-12-13T08:50:02Z)
- Precedent-Enhanced Legal Judgment Prediction with LLM and Domain-Model Collaboration [52.57055162778548]
Legal Judgment Prediction (LJP) has become an increasingly crucial task in Legal AI.
Precedents are the previous legal cases with similar facts, which are the basis for the judgment of the subsequent case in national legal systems.
Recent advances in deep learning have enabled a variety of techniques to be used to solve the LJP task.
arXiv Detail & Related papers (2023-10-13T16:47:20Z)
- SAILER: Structure-aware Pre-trained Language Model for Legal Case Retrieval [75.05173891207214]
Legal case retrieval plays a core role in the intelligent legal system.
Most existing language models have difficulty understanding the long-distance dependencies between different structures.
We propose a new Structure-Aware pre-traIned language model for LEgal case Retrieval.
arXiv Detail & Related papers (2023-04-22T10:47:01Z)
- Do Charge Prediction Models Learn Legal Theory? [59.74220430434435]
We argue that trustworthy charge prediction models should take legal theories into consideration.
We propose three principles that trustworthy models should follow in this task: sensitivity, selectivity, and the presumption of innocence.
Our findings indicate that, while existing charge prediction models meet the selective principle on a benchmark dataset, most of them are still not sensitive enough and do not satisfy the presumption of innocence.
arXiv Detail & Related papers (2022-10-31T07:32:12Z)
- Pre-trained Language Models for the Legal Domain: A Case Study on Indian Law [7.366081387295463]
We re-train two popular legal PLMs, LegalBERT and CaseLawBERT, on Indian legal data, as well as train a model from scratch with a vocabulary based on Indian legal text.
We observe that our approach not only enhances performance on the new domain (Indian texts) but also on the original domain (European and UK texts).
arXiv Detail & Related papers (2022-09-13T15:01:11Z)
- Lawformer: A Pre-trained Language Model for Chinese Legal Long Documents [56.40163943394202]
We release Lawformer, a Longformer-based pre-trained language model for understanding long Chinese legal documents.
We evaluate Lawformer on a variety of LegalAI tasks, including judgment prediction, similar case retrieval, legal reading comprehension, and legal question answering.
arXiv Detail & Related papers (2021-05-09T09:39:25Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information provided and is not responsible for any consequences of its use.