Related papers: Lawformer: A Pre-trained Language Model for Chinese Legal Long Documents

URL: http://arxiv.org/abs/2105.03887v1
Date: Sun, 9 May 2021 09:39:25 GMT
Title: Lawformer: A Pre-trained Language Model for Chinese Legal Long Documents
Authors: Chaojun Xiao, Xueyu Hu, Zhiyuan Liu, Cunchao Tu, Maosong Sun
Abstract summary: We release the Longformer-based pre-trained language model, named as Lawformer, for Chinese legal long documents understanding. We evaluate Lawformer on a variety of LegalAI tasks, including judgment prediction, similar case retrieval, legal reading comprehension, and legal question answering.
Score: 56.40163943394202
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Legal artificial intelligence (LegalAI) aims to benefit legal systems with the technology of artificial intelligence, especially natural language processing (NLP). Recently, inspired by the success of pre-trained language models (PLMs) in the generic domain, many LegalAI researchers devote their effort to apply PLMs to legal tasks. However, utilizing PLMs to address legal tasks is still challenging, as the legal documents usually consist of thousands of tokens, which is far longer than the length that mainstream PLMs can process. In this paper, we release the Longformer-based pre-trained language model, named as Lawformer, for Chinese legal long documents understanding. We evaluate Lawformer on a variety of LegalAI tasks, including judgment prediction, similar case retrieval, legal reading comprehension, and legal question answering. The experimental results demonstrate that our model can achieve promising improvement on tasks with long documents as inputs.

Related papers

LegalOne: A Family of Foundation Models for Reliable Legal Reasoning [54.57434222018289]
We present LegalOne, a family of foundational models specifically tailored for the Chinese legal domain. LegalOne is developed through a comprehensive three-phase pipeline designed to master legal reasoning. We publicly release the LegalOne weights and the LegalKit evaluation framework to advance the field of Legal AI.
arXiv Detail & Related papers (2026-01-31T10:18:32Z)
LexGenius: An Expert-Level Benchmark for Large Language Models in Legal General Intelligence [74.05988707492058]
Legal general intelligence (GI) refers to artificial intelligence (AI) that encompasses legal understanding, reasoning, and decision-making. Existing benchmarks are result-oriented and fail to systematically evaluate the legal intelligence of large language models (LLMs). We propose LexGenius, an expert-level Chinese legal benchmark for evaluating legal GI in LLMs.
arXiv Detail & Related papers (2025-12-04T08:48:02Z)
Evaluating the Role of Large Language Models in Legal Practice in India [0.0]
The integration of Artificial Intelligence into the legal profession raises significant questions about the capacity of Large Language Models to perform key legal tasks. I empirically evaluate how well LLMs, such as GPT, Claude, and Llama, perform key legal tasks in the Indian context. I conclude that while LLMs can augment certain legal tasks, human expertise remains essential for nuanced reasoning and the precise application of law.
arXiv Detail & Related papers (2025-08-13T11:04:48Z)
LawLLM: Law Large Language Model for the US Legal System [43.13850456765944]
We introduce the Law Large Language Model (LawLLM), a multi-task model specifically designed for the US legal domain. LawLLM excels at Similar Case Retrieval (SCR), Precedent Case Recommendation (PCR), and Legal Judgment Prediction (LJP). We propose customized data preprocessing techniques for each task that transform raw legal data into a trainable format.
arXiv Detail & Related papers (2024-07-27T21:51:30Z)
It Cannot Be Right If It Was Written by AI: On Lawyers' Preferences of Documents Perceived as Authored by an LLM vs a Human [0.6827423171182154]
Large Language Models (LLMs) enable a future in which certain types of legal documents may be generated automatically. This study is the necessary analysis of the ongoing transition towards mature generative AI systems. Our analysis revealed a clear preference for documents perceived as crafted by a human over those believed to be generated by AI.
arXiv Detail & Related papers (2024-07-09T12:11:25Z)
InternLM-Law: An Open Source Chinese Legal Large Language Model [72.2589401309848]
InternLM-Law is a specialized LLM tailored for addressing diverse legal queries related to Chinese laws. We meticulously construct a dataset in the Chinese legal domain, encompassing over 1 million queries. InternLM-Law achieves the highest average performance on LawBench, outperforming state-of-the-art models, including GPT-4, on 13 out of 20 subtasks.
arXiv Detail & Related papers (2024-06-21T06:19:03Z)
Legal Documents Drafting with Fine-Tuned Pre-Trained Large Language Model [1.3812010983144798]
This paper shows that we can leverage a large number of annotation-free legal documents without Chinese word segmentation to fine-tune a large-scale language model. It can also generate legal document drafts while protecting information privacy and improving information security.
arXiv Detail & Related papers (2024-06-06T16:00:20Z)
Large Language Models in Law: A Survey [34.785207813971134]
The application of legal large language models (LLMs) is still in its nascent stage. We provide an overview of AI technologies in the legal field and showcase the recent research in LLMs. We explore the limitations of legal LLMs, including data, algorithms, and judicial practice.
arXiv Detail & Related papers (2023-11-26T00:48:12Z)
Precedent-Enhanced Legal Judgment Prediction with LLM and Domain-Model Collaboration [52.57055162778548]
Legal Judgment Prediction (LJP) has become an increasingly crucial task in Legal AI. Precedents are the previous legal cases with similar facts, which are the basis for the judgment of the subsequent case in national legal systems. Recent advances in deep learning have enabled a variety of techniques to be used to solve the LJP task.
arXiv Detail & Related papers (2023-10-13T16:47:20Z)
Large Language Models as Tax Attorneys: A Case Study in Legal Capabilities Emergence [5.07013500385659]
This paper explores Large Language Models' (LLMs) capabilities in applying tax law. Our experiments demonstrate emerging legal understanding capabilities, with improved performance in each subsequent OpenAI model release. Findings indicate that LLMs, particularly when combined with prompting enhancements and the correct legal texts, can perform at high levels of accuracy but not yet at expert tax lawyer levels.
arXiv Detail & Related papers (2023-06-12T12:40:48Z)
SAILER: Structure-aware Pre-trained Language Model for Legal Case Retrieval [75.05173891207214]
Legal case retrieval plays a core role in the intelligent legal system. Most existing language models have difficulty understanding the long-distance dependencies between different structures. We propose a new Structure-Aware pre-traIned language model for LEgal case Retrieval.
arXiv Detail & Related papers (2023-04-22T10:47:01Z)
A Short Survey of Viewing Large Language Models in Legal Aspect [0.0]
Large language models (LLMs) have transformed many fields, including natural language processing, computer vision, and reinforcement learning. The integration of LLMs into the legal field has also raised several legal problems, including privacy concerns, bias, and explainability.
arXiv Detail & Related papers (2023-03-16T08:01:22Z)
How Does NLP Benefit Legal System: A Summary of Legal Artificial Intelligence [81.04070052740596]
Legal Artificial Intelligence (LegalAI) focuses on applying the technology of artificial intelligence, especially natural language processing, to benefit tasks in the legal domain. This paper introduces the history, the current state, and the future directions of research in LegalAI.
arXiv Detail & Related papers (2020-04-25T14:45:15Z)

This list is automatically generated from the titles and abstracts of the papers on this site.