Related papers: HiCuLR: Hierarchical Curriculum Learning for Rhetorical Role Labeling of Legal Documents

HiCuLR: Hierarchical Curriculum Learning for Rhetorical Role Labeling of Legal Documents

URL: http://arxiv.org/abs/2409.18647v1
Date: Fri, 27 Sep 2024 11:28:01 GMT
Title: HiCuLR: Hierarchical Curriculum Learning for Rhetorical Role Labeling of Legal Documents
Authors: T. Y. S. S. Santosh, Apolline Isaia, Shiyu Hong, Matthias Grabmair,
Abstract summary: HiCuLR is a hierarchical curriculum learning framework for Rhetorical Role Labeling. It nests two curricula: Rhetorical Role-level Curriculum (RC) on the outer layer and Document-level Curriculum (DC) on the inner layer.
Score: 1.2562034805037443
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Rhetorical Role Labeling (RRL) of legal documents is pivotal for various downstream tasks such as summarization, semantic case search and argument mining. Existing approaches often overlook the varying difficulty levels inherent in legal document discourse styles and rhetorical roles. In this work, we propose HiCuLR, a hierarchical curriculum learning framework for RRL. It nests two curricula: Rhetorical Role-level Curriculum (RC) on the outer layer and Document-level Curriculum (DC) on the inner layer. DC categorizes documents based on their difficulty, utilizing metrics like deviation from a standard discourse structure and exposes the model to them in an easy-to-difficult fashion. RC progressively strengthens the model to discern coarse-to-fine-grained distinctions between rhetorical roles. Our experiments on four RRL datasets demonstrate the efficacy of HiCuLR, highlighting the complementary nature of DC and RC.

Related papers

DISRetrieval: Harnessing Discourse Structure for Long Document Retrieval [51.89673002051528]
DISRetrieval is a novel hierarchical retrieval framework that leverages linguistic discourse structure to enhance long document understanding.<n>Our studies confirm that discourse structure significantly enhances retrieval effectiveness across different document lengths and query types.
arXiv Detail & Related papers (2025-05-26T14:45:12Z)
Rank-R1: Enhancing Reasoning in LLM-based Document Rerankers via Reinforcement Learning [76.50690734636477]
We introduce Rank-R1, a novel LLM-based reranker that performs reasoning over both the user query and candidate documents before performing the ranking task. Our experiments on the TREC DL and BRIGHT datasets show that Rank-R1 is highly effective, especially for complex queries.
arXiv Detail & Related papers (2025-03-08T03:14:26Z)
Contextualizing Search Queries In-Context Learning for Conversational Rewriting with LLMs [0.0]
This paper introduces Prompt-Guided In-Context Learning, a novel approach for few-shot conversational query rewriting. Our method employs carefully designed prompts, incorporating task descriptions, input/output format specifications, and a small set of illustrative examples. Experiments on benchmark datasets, TREC and Taskmaster-1, demonstrate that our approach significantly outperforms strong baselines.
arXiv Detail & Related papers (2025-02-20T20:02:42Z)
LegalSeg: Unlocking the Structure of Indian Legal Judgments Through Rhetorical Role Classification [6.549338652948716]
We introduce LegalSeg, the largest annotated dataset for this task, comprising over 7,000 documents and 1.4 million sentences, labeled with 7 rhetorical roles. Our results demonstrate that models incorporating broader context, structural relationships, and sequential sentence information outperform those relying solely on sentence-level features.
arXiv Detail & Related papers (2025-02-09T10:07:05Z)
CORAL: Benchmarking Multi-turn Conversational Retrieval-Augmentation Generation [68.81271028921647]
We introduce CORAL, a benchmark designed to assess RAG systems in realistic multi-turn conversational settings. CORAL includes diverse information-seeking conversations automatically derived from Wikipedia. It supports three core tasks of conversational RAG: passage retrieval, response generation, and citation labeling.
arXiv Detail & Related papers (2024-10-30T15:06:32Z)
Eliciting Critical Reasoning in Retrieval-Augmented Language Models via Contrastive Explanations [4.697267141773321]
Retrieval-augmented generation (RAG) has emerged as a critical mechanism in contemporary NLP to support Large Language Models (LLMs) in systematically accessing richer factual context. Recent studies have shown that LLMs still struggle to critically analyse RAG-based in-context information, a limitation that may lead to incorrect inferences and hallucinations. In this paper, we investigate how to elicit critical reasoning in RAG via contrastive explanations.
arXiv Detail & Related papers (2024-10-30T10:11:53Z)
Ground Every Sentence: Improving Retrieval-Augmented LLMs with Interleaved Reference-Claim Generation [51.8188846284153]
RAG has been widely adopted to enhance Large Language Models (LLMs) Attributed Text Generation (ATG) has attracted growing attention, which provides citations to support the model's responses in RAG. This paper proposes a fine-grained ATG method called ReClaim(Refer & Claim), which alternates the generation of references and answers step by step.
arXiv Detail & Related papers (2024-07-01T20:47:47Z)
Mind Your Neighbours: Leveraging Analogous Instances for Rhetorical Role Labeling for Legal Documents [1.2562034805037443]
This study introduces novel techniques to enhance Rhetorical Role Labeling (RRL) performance. For inference-based methods, we explore techniques that bolster label predictions without re-training. While in training-based methods, we integrate learning with our novel discourse-aware contrastive method that work directly on embedding spaces.
arXiv Detail & Related papers (2024-03-31T08:10:45Z)
Hierarchical Indexing for Retrieval-Augmented Opinion Summarization [60.5923941324953]
We propose a method for unsupervised abstractive opinion summarization that combines the attributability and scalability of extractive approaches with the coherence and fluency of Large Language Models (LLMs) Our method, HIRO, learns an index structure that maps sentences to a path through a semantically organized discrete hierarchy. At inference time, we populate the index and use it to identify and retrieve clusters of sentences containing popular opinions from input reviews.
arXiv Detail & Related papers (2024-03-01T10:38:07Z)
Rhetorical Role Labeling of Legal Documents using Transformers and Graph Neural Networks [1.290382979353427]
This paper presents the approaches undertaken to perform the task of rhetorical role labelling on Indian Court Judgements as part of SemEval Task 6: understanding legal texts, shared subtask A.
arXiv Detail & Related papers (2023-05-06T17:04:51Z)
Zero-Shot Listwise Document Reranking with a Large Language Model [58.64141622176841]
We propose Listwise Reranker with a Large Language Model (LRL), which achieves strong reranking effectiveness without using any task-specific training data. Experiments on three TREC web search datasets demonstrate that LRL not only outperforms zero-shot pointwise methods when reranking first-stage retrieval results, but can also act as a final-stage reranker.
arXiv Detail & Related papers (2023-05-03T14:45:34Z)
Joint Span Segmentation and Rhetorical Role Labeling with Data Augmentation for Legal Documents [1.4072904523937537]
Rhetorical Role Labeling of legal judgements play a crucial role in retrieval and adjacent tasks. We reformulate the task at span level as identifying spans of multiple consecutive sentences that share the same rhetorical role label. We employ semi-Markov Conditional Random Fields (CRF) to jointly learn span segmentation and span label assignment.
arXiv Detail & Related papers (2023-02-13T15:28:02Z)
Contextualize Me -- The Case for Context in Reinforcement Learning [49.794253971446416]
Contextual Reinforcement Learning (cRL) provides a framework to model such changes in a principled manner. We show how cRL contributes to improving zero-shot generalization in RL through meaningful benchmarks and structured reasoning about generalization tasks.
arXiv Detail & Related papers (2022-02-09T15:01:59Z)
Semantic Segmentation of Legal Documents via Rhetorical Roles [3.285073688021526]
This paper proposes a Rhetorical Roles (RR) system for segmenting a legal document into semantically coherent units. We develop a multitask learning-based deep learning model with document rhetorical role label shift as an auxiliary task for segmenting a legal document.
arXiv Detail & Related papers (2021-12-03T10:49:19Z)
Detecting and Classifying Malevolent Dialogue Responses: Taxonomy, Data and Methodology [68.8836704199096]
Corpus-based conversational interfaces are able to generate more diverse and natural responses than template-based or retrieval-based agents. With their increased generative capacity of corpusbased conversational agents comes the need to classify and filter out malevolent responses. Previous studies on the topic of recognizing and classifying inappropriate content are mostly focused on a certain category of malevolence.
arXiv Detail & Related papers (2020-08-21T22:43:27Z)

This list is automatically generated from the titles and abstracts of the papers in this site.