A Hierarchical Neural Framework for Classification and its Explanation in Large Unstructured Legal Documents
- URL: http://arxiv.org/abs/2309.10563v3
- Date: Thu, 27 Jun 2024 22:40:45 GMT
- Title: A Hierarchical Neural Framework for Classification and its Explanation in Large Unstructured Legal Documents
- Authors: Nishchal Prasad, Mohand Boughanem, Taoufik Dkaki
- Abstract summary: We define this problem as "scarce annotated legal documents".
We propose a deep-learning-based classification framework which we call MESc.
We also propose an explanation extraction algorithm named ORSE.
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Automatic legal judgment prediction and its explanation suffer from the problem of long case documents, which often exceed tens of thousands of words and have a non-uniform structure. Predicting judgments from such documents and extracting their explanation becomes a challenging task, more so on documents with no structural annotation. We define this problem as "scarce annotated legal documents" and address their lack of structural information and their long lengths with a deep-learning-based classification framework which we call MESc ("Multi-stage Encoder-based Supervised with-clustering") for judgment prediction. We explore the adaptability of LLMs with multi-billion parameters (GPT-Neo and GPT-J) to legal texts and their intra-domain (legal) transfer learning capacity. Alongside this, we compare their performance and adaptability with MESc and the impact of combining embeddings from their last layers. For such hierarchical models, we also propose an explanation extraction algorithm named ORSE (Occlusion sensitivity-based Relevant Sentence Extractor), which uses the input-occlusion sensitivity of the model to explain predictions with the most relevant sentences from the document. We explore these methods and test their effectiveness with extensive experiments and ablation studies on legal documents from India, the European Union, and the United States, using the ILDC dataset and a subset of the LexGLUE dataset. MESc achieves a minimum total performance gain of approximately 2 points over previous state-of-the-art methods, while ORSE applied on MESc achieves a total average gain of 50% over the baseline explainability scores.
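The abstract does not spell out ORSE's exact procedure, but the input-occlusion idea it builds on can be sketched simply: remove one sentence at a time, re-score the document with the classifier, and rank sentences by how much the predicted-class probability drops. A minimal illustration follows; `classify` is a hypothetical stand-in for a trained judgment classifier, not the authors' model.

```python
# Sketch of occlusion-based sentence relevance scoring (the general
# input-occlusion-sensitivity idea; not the authors' exact ORSE algorithm).
# `classify` is a hypothetical callable mapping a list of sentences to the
# probability the model assigns to its predicted class.

def occlusion_scores(sentences, classify):
    """Score each sentence by the drop in predicted-class probability
    when that sentence is removed from the document."""
    base = classify(sentences)
    scores = []
    for i in range(len(sentences)):
        occluded = sentences[:i] + sentences[i + 1:]
        scores.append(base - classify(occluded))
    return scores

def top_k_relevant(sentences, classify, k=2):
    """Return the k sentences whose removal hurts the prediction most."""
    scores = occlusion_scores(sentences, classify)
    ranked = sorted(range(len(sentences)), key=lambda i: scores[i], reverse=True)
    return [sentences[i] for i in ranked[:k]]
```

With a real model, `classify` would re-encode the occluded document each time, so the cost grows linearly with the number of sentences; hierarchical encoders make this tractable by reusing chunk embeddings where possible.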
Related papers
- LegalSeg: Unlocking the Structure of Indian Legal Judgments Through Rhetorical Role Classification [6.549338652948716]
We introduce LegalSeg, the largest annotated dataset for this task, comprising over 7,000 documents and 1.4 million sentences, labeled with 7 rhetorical roles.
Our results demonstrate that models incorporating broader context, structural relationships, and sequential sentence information outperform those relying solely on sentence-level features.
arXiv Detail & Related papers (2025-02-09T10:07:05Z) - HEAL: Hierarchical Embedding Alignment Loss for Improved Retrieval and Representation Learning [6.2751089721877955]
RAG enhances Large Language Models (LLMs) by integrating external document retrieval to provide domain-specific or up-to-date knowledge.
The effectiveness of RAG depends on the relevance of retrieved documents, which is influenced by the semantic alignment of embeddings with the domain's specialized content.
This paper introduces Hierarchical Embedding Alignment Loss (HEAL), a novel method that leverages hierarchical fuzzy clustering with matrix factorization within contrastive learning.
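HEAL's full formulation (hierarchical fuzzy clustering with matrix factorization) is more involved than this summary shows, but its core mechanism, pulling embeddings toward others in the same cluster with a contrastive objective, can be sketched with a flat, cluster-supervised InfoNCE-style loss. This is a simplified stand-in for illustration, not HEAL's actual loss.

```python
import numpy as np

def cluster_contrastive_loss(embeddings, cluster_ids, temperature=0.1):
    """Simplified cluster-supervised contrastive loss: embeddings that share
    a cluster id are treated as positives and pulled together; all others
    act as negatives. A flat stand-in for the idea of aligning embeddings
    with a clustering; not HEAL's hierarchical formulation."""
    # L2-normalize so the dot product is cosine similarity.
    z = embeddings / np.linalg.norm(embeddings, axis=1, keepdims=True)
    sim = z @ z.T / temperature
    n = len(z)
    np.fill_diagonal(sim, -np.inf)  # exclude self-similarity
    # Log-softmax over each row: log p(j | i).
    log_prob = sim - np.log(np.exp(sim).sum(axis=1, keepdims=True))
    losses = []
    for i in range(n):
        pos = [j for j in range(n) if j != i and cluster_ids[j] == cluster_ids[i]]
        if pos:  # average negative log-likelihood over the positives
            losses.append(-log_prob[i, pos].mean())
    return float(np.mean(losses))
```

Correctly clustered labels should yield a lower loss than mismatched ones, which is the signal a training loop would exploit.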
arXiv Detail & Related papers (2024-12-05T23:10:56Z) - Are Large Language Models Good Classifiers? A Study on Edit Intent Classification in Scientific Document Revisions [62.12545440385489]
Large language models (LLMs) have brought substantial advancements in text generation, but their potential for enhancing classification tasks remains underexplored.
We propose a framework for thoroughly investigating fine-tuning LLMs for classification, including both generation- and encoding-based approaches.
We instantiate this framework in edit intent classification (EIC), a challenging and underexplored classification task.
arXiv Detail & Related papers (2024-10-02T20:48:28Z) - The CLC-UKET Dataset: Benchmarking Case Outcome Prediction for the UK Employment Tribunal [0.41044181091229565]
The study employs a large language model (LLM) for automatic annotation, resulting in the creation of the CLC-UKET dataset.
The dataset consists of approximately 19,000 UKET cases and their metadata.
Empirical results indicate that finetuned transformer models outperform zero-shot and few-shot LLMs on the UKET prediction task.
arXiv Detail & Related papers (2024-09-12T14:51:43Z) - Con-ReCall: Detecting Pre-training Data in LLMs via Contrastive Decoding [118.75567341513897]
Existing methods typically analyze target text in isolation or solely with non-member contexts.
We propose Con-ReCall, a novel approach that leverages the asymmetric distributional shifts induced by member and non-member contexts.
arXiv Detail & Related papers (2024-09-05T09:10:38Z) - Exploring Large Language Models and Hierarchical Frameworks for Classification of Large Unstructured Legal Documents [0.6349503549199403]
We explore the classification of large legal documents and their lack of structural information with a deep-learning-based hierarchical framework.
Specifically, we divide a document into parts to extract their embeddings from the last four layers of a custom fine-tuned Large Language Model.
Our approach achieves a minimum total performance gain of approximately 2 points over previous state-of-the-art methods.
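The entry above compresses the mechanism into one sentence: a long document is split into parts, each part is encoded, and per-part embeddings from the last four layers of the fine-tuned model are combined before classification. A minimal sketch of that plumbing follows; the chunk size, overlap, and concatenation as the way of combining layers are illustrative assumptions, not the paper's exact configuration.

```python
import numpy as np

def chunk(tokens, size=512, overlap=100):
    """Split a long token sequence into fixed-size overlapping chunks.
    Illustrative parameters; the paper's exact chunking may differ."""
    step = size - overlap
    return [tokens[i:i + size] for i in range(0, len(tokens), step)]

def combine_last_layers(hidden_states, k=4):
    """hidden_states: list of per-layer chunk embeddings (e.g. the [CLS]
    vector per chunk), each of shape (num_chunks, hidden_dim), ordered
    from first to last layer. Concatenates the last k layers per chunk,
    giving a (num_chunks, k * hidden_dim) matrix for the downstream
    classifier. Concatenation is one plausible combination strategy."""
    return np.concatenate(hidden_states[-k:], axis=1)
```

In practice the per-layer embeddings would come from an encoder run on each chunk (e.g. by requesting all hidden states from the model); here they are just arrays, so the combination step can be shown in isolation.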
arXiv Detail & Related papers (2024-03-11T16:24:08Z) - Multi-perspective Improvement of Knowledge Graph Completion with Large Language Models [95.31941227776711]
We propose MPIKGC to compensate for the deficiency of contextualized knowledge and improve KGC by querying large language models (LLMs).
We conducted extensive evaluation of our framework based on four description-based KGC models and four datasets, for both link prediction and triplet classification tasks.
arXiv Detail & Related papers (2024-03-04T12:16:15Z) - Hierarchical Indexing for Retrieval-Augmented Opinion Summarization [60.5923941324953]
We propose a method for unsupervised abstractive opinion summarization that combines the attributability and scalability of extractive approaches with the coherence and fluency of Large Language Models (LLMs).
Our method, HIRO, learns an index structure that maps sentences to a path through a semantically organized discrete hierarchy.
At inference time, we populate the index and use it to identify and retrieve clusters of sentences containing popular opinions from input reviews.
arXiv Detail & Related papers (2024-03-01T10:38:07Z) - InfoCSE: Information-aggregated Contrastive Learning of Sentence Embeddings [61.77760317554826]
This paper proposes an information-aggregated contrastive learning framework for learning unsupervised sentence embeddings, termed InfoCSE.
We evaluate the proposed InfoCSE on several benchmark datasets for the semantic textual similarity (STS) task.
Experimental results show that InfoCSE outperforms SimCSE by an average Spearman correlation of 2.60% on BERT-base and 1.77% on BERT-large.
arXiv Detail & Related papers (2022-10-08T15:53:19Z) - Towards Making the Most of Context in Neural Machine Translation [112.9845226123306]
We argue that previous research did not make clear use of the global context.
We propose a new document-level NMT framework that deliberately models the local context of each sentence.
arXiv Detail & Related papers (2020-02-19T03:30:00Z)
This list is automatically generated from the titles and abstracts of the papers on this site.