Understand Legal Documents with Contextualized Large Language Models
- URL: http://arxiv.org/abs/2303.12135v4
- Date: Wed, 19 Jul 2023 05:30:31 GMT
- Title: Understand Legal Documents with Contextualized Large Language Models
- Authors: Xin Jin, Yuchen Wang
- Abstract summary: We present our systems for SemEval-2023 Task 6: understanding legal texts.
We first develop the Legal-BERT-HSLN model that considers the comprehensive context information in both intra- and inter-sentence levels.
We then train a Legal-LUKE model, which is legal-contextualized and entity-aware, to recognize legal entities.
- Score: 16.416510744265086
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The growth of pending legal cases in populous countries, such as India, has
become a major issue. Developing effective techniques to process and understand
legal documents is extremely useful in resolving this problem. In this paper,
we present our systems for SemEval-2023 Task 6: understanding legal texts (Modi
et al., 2023). Specifically, we first develop the Legal-BERT-HSLN model that
considers the comprehensive context information in both intra- and
inter-sentence levels to predict rhetorical roles (subtask A) and then train a
Legal-LUKE model, which is legal-contextualized and entity-aware, to recognize
legal entities (subtask B). Our evaluations demonstrate that our designed
models are more accurate than baselines, e.g., with up to a 15.0% higher F1
score in subtask B. We achieved notable performance on the task leaderboard,
e.g., a 0.834 micro F1 score, and ranked No. 5 out of 27 teams in subtask A.
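For readers unfamiliar with the micro F1 metric quoted in the abstract, the following is a minimal sketch of how a micro-averaged F1 score can be computed for sentence-level labels. It is not the task's official scorer; the function name `micro_f1` and the label names in the example are hypothetical and chosen only for illustration.

```python
def micro_f1(gold, pred):
    """Micro-averaged F1: pool true positives, false positives, and
    false negatives over all labels, then compute a single F1."""
    if len(gold) != len(pred):
        raise ValueError("gold and pred must have the same length")
    tp = fp = fn = 0
    for label in set(gold) | set(pred):
        for g, p in zip(gold, pred):
            if p == label and g == label:
                tp += 1
            elif p == label and g != label:
                fp += 1
            elif p != label and g == label:
                fn += 1
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical rhetorical-role labels, one per sentence (illustrative only).
gold = ["FACTS", "ARGUMENT", "FACTS", "RULING"]
pred = ["FACTS", "FACTS", "FACTS", "RULING"]
score = micro_f1(gold, pred)
print(score)
```

When every sentence receives exactly one label, as in this example, each wrong prediction counts once as a false positive and once as a false negative, so micro F1 coincides with plain accuracy.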
Related papers
- LawLLM: Law Large Language Model for the US Legal System [43.13850456765944]
We introduce the Law Large Language Model (LawLLM), a multi-task model specifically designed for the US legal domain.
LawLLM excels at Similar Case Retrieval (SCR), Precedent Case Recommendation (PCR), and Legal Judgment Prediction (LJP).
We propose customized data preprocessing techniques for each task that transform raw legal data into a trainable format.
arXiv Detail & Related papers (2024-07-27T21:51:30Z)
- InternLM-Law: An Open Source Chinese Legal Large Language Model [72.2589401309848]
InternLM-Law is a specialized LLM tailored for addressing diverse legal queries related to Chinese laws.
We meticulously construct a dataset in the Chinese legal domain, encompassing over 1 million queries.
InternLM-Law achieves the highest average performance on LawBench, outperforming state-of-the-art models, including GPT-4, on 13 out of 20 subtasks.
arXiv Detail & Related papers (2024-06-21T06:19:03Z)
- DELTA: Pre-train a Discriminative Encoder for Legal Case Retrieval via Structural Word Alignment [55.91429725404988]
We introduce DELTA, a discriminative model designed for legal case retrieval.
We leverage shallow decoders to create information bottlenecks, aiming to enhance the representation ability.
Our approach can outperform existing state-of-the-art methods in legal case retrieval.
arXiv Detail & Related papers (2024-03-27T10:40:14Z)
- Precedent-Enhanced Legal Judgment Prediction with LLM and Domain-Model Collaboration [52.57055162778548]
Legal Judgment Prediction (LJP) has become an increasingly crucial task in Legal AI.
Precedents are the previous legal cases with similar facts, which are the basis for the judgment of the subsequent case in national legal systems.
Recent advances in deep learning have enabled a variety of techniques to be used to solve the LJP task.
arXiv Detail & Related papers (2023-10-13T16:47:20Z)
- NeCo@ALQAC 2023: Legal Domain Knowledge Acquisition for Low-Resource Languages through Data Enrichment [2.441072488254427]
This paper presents NeCo Team's solutions to the Vietnamese text processing tasks provided in the Automated Legal Question Answering Competition 2023 (ALQAC 2023).
Our methods for the legal document retrieval task employ a combination of similarity ranking and deep learning models, while for the second task, we propose a range of adaptive techniques to handle different question types.
Our approaches achieve outstanding results on both tasks of the competition, demonstrating the potential benefits and effectiveness of question answering systems in the legal field.
arXiv Detail & Related papers (2023-09-11T14:43:45Z)
- NOWJ at COLIEE 2023 -- Multi-Task and Ensemble Approaches in Legal Information Processing [1.5593460008414899]
We present the NOWJ team's approach to the COLIEE 2023 Competition, which focuses on advancing legal information processing techniques.
We employ state-of-the-art machine learning models and innovative approaches, such as BERT, Longformer, BM25-ranking algorithm, and multi-task learning models.
arXiv Detail & Related papers (2023-06-08T03:10:49Z)
- THUIR@COLIEE 2023: Incorporating Structural Knowledge into Pre-trained Language Models for Legal Case Retrieval [16.191450092389722]
This paper summarizes the approach of the championship team THUIR in COLIEE 2023.
To be specific, we design structure-aware pre-trained language models to enhance the understanding of legal cases.
In the end, learning-to-rank methods are employed to merge features with different dimensions.
arXiv Detail & Related papers (2023-05-11T14:08:53Z)
- SAILER: Structure-aware Pre-trained Language Model for Legal Case Retrieval [75.05173891207214]
Legal case retrieval plays a core role in the intelligent legal system.
Most existing language models have difficulty understanding the long-distance dependencies between different structures.
We propose a new Structure-Aware pre-traIned language model for LEgal case Retrieval.
arXiv Detail & Related papers (2023-04-22T10:47:01Z)
- SemEval 2023 Task 6: LegalEval - Understanding Legal Texts [2.172613863157655]
There is a need for developing NLP-based techniques for processing and automatically understanding legal documents.
The LegalEval task has three sub-tasks: Task-A (Rhetorical Roles Labeling) is about automatically structuring legal documents into semantically coherent units; Task-B (Legal Named Entity Recognition) deals with identifying relevant entities in a legal document; and Task-C (Court Judgement Prediction with Explanation) explores the possibility of automatically predicting the outcome of a legal case.
In each of the sub-tasks, the proposed systems outperformed the baselines; however, there is a lot of scope for
arXiv Detail & Related papers (2023-04-19T10:28:32Z)
- Lawformer: A Pre-trained Language Model for Chinese Legal Long Documents [56.40163943394202]
We release Lawformer, a Longformer-based pre-trained language model for understanding long Chinese legal documents.
We evaluate Lawformer on a variety of LegalAI tasks, including judgment prediction, similar case retrieval, legal reading comprehension, and legal question answering.
arXiv Detail & Related papers (2021-05-09T09:39:25Z)