Coupling Local Context and Global Semantic Prototypes via a Hierarchical Architecture for Rhetorical Roles Labeling
- URL: http://arxiv.org/abs/2603.03856v1
- Date: Wed, 04 Mar 2026 09:05:24 GMT
- Title: Coupling Local Context and Global Semantic Prototypes via a Hierarchical Architecture for Rhetorical Roles Labeling
- Authors: Anas Belfathi, Nicolas Hernandez, Laura Monceaux, Warren Bonnard, Mary Catherine Lavissiere, Christine Jacquin, Richard Dufour,
- Abstract summary: Rhetorical Role Labeling (RRL) identifies the functional role of each sentence in a document. We propose two prototype-based methods that integrate local context with global representations. Experiments on legal, medical, and scientific benchmarks show consistent improvements over strong baselines.
- Score: 5.444158140267451
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Rhetorical Role Labeling (RRL) identifies the functional role of each sentence in a document, a key task for discourse understanding in domains such as law and medicine. While hierarchical models capture local dependencies effectively, they are limited in modeling global, corpus-level features. To address this limitation, we propose two prototype-based methods that integrate local context with global representations. Prototype-Based Regularization (PBR) learns soft prototypes through a distance-based auxiliary loss to structure the latent space, while Prototype-Conditioned Modulation (PCM) constructs corpus-level prototypes and injects them during training and inference. Given the scarcity of RRL resources, we introduce SCOTUS-Law, the first dataset of U.S. Supreme Court opinions annotated with rhetorical roles at three levels of granularity: category, rhetorical function, and step. Experiments on legal, medical, and scientific benchmarks show consistent improvements over strong baselines, with gains of 4 Macro-F1 points on low-frequency roles. We further analyze the implications in the era of Large Language Models and complement our findings with expert evaluation.
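The abstract's two methods can be illustrated with a minimal sketch. The code below is an illustrative reconstruction, not the paper's implementation: `pbr_auxiliary_loss` approximates PBR's distance-based auxiliary loss by pulling each sentence embedding toward the prototype of its gold role, and `pcm_modulate` approximates PCM by injecting a similarity-weighted mix of corpus-level prototypes into each embedding. The function names, the squared-Euclidean distance, and the softmax temperature are all assumptions.

```python
import numpy as np

def pbr_auxiliary_loss(embeddings, prototypes, labels):
    """Distance-based auxiliary loss (PBR sketch): mean squared distance
    of each sentence embedding to the prototype of its gold role.
    The paper's exact formulation may differ."""
    diffs = embeddings - prototypes[labels]           # (n, d)
    return float(np.mean(np.sum(diffs ** 2, axis=1)))

def pcm_modulate(embeddings, prototypes, tau=1.0):
    """Prototype-Conditioned Modulation (PCM sketch): add to each
    embedding a softmax-weighted combination of corpus prototypes,
    injecting a global, corpus-level signal."""
    sims = embeddings @ prototypes.T / tau            # (n, k) similarities
    weights = np.exp(sims - sims.max(axis=1, keepdims=True))
    weights /= weights.sum(axis=1, keepdims=True)     # softmax over prototypes
    return embeddings + weights @ prototypes          # (n, d)

# Toy usage: 3 sentence embeddings (d=4), 2 role prototypes (invented data).
emb = np.array([[1.0, 0.0, 0.0, 0.0],
                [0.0, 1.0, 0.0, 0.0],
                [0.9, 0.1, 0.0, 0.0]])
protos = np.array([[1.0, 0.0, 0.0, 0.0],
                   [0.0, 1.0, 0.0, 0.0]])
loss = pbr_auxiliary_loss(emb, protos, np.array([0, 1, 0]))
mod = pcm_modulate(emb, protos)
```

In this sketch the auxiliary loss would be added to the classification loss during training, while the modulated embeddings would feed the classifier at both training and inference time, matching the division of labor the abstract describes.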
Related papers
- On Theoretically-Driven LLM Agents for Multi-Dimensional Discourse Analysis [0.0]
This paper presents a comparative multi-agent framework designed to quantify the benefits of incorporating explicit theoretical knowledge for this task. We utilise a dataset of annotated political debates to establish a new standard encompassing four distinct rephrase functions. We evaluate two parallel LLM-based agent systems: one enhanced by argumentation theory via Retrieval-Augmented Generation (RAG), and an identical zero-shot baseline.
arXiv Detail & Related papers (2026-02-14T10:30:39Z) - Divide, Conquer and Unite: Hierarchical Style-Recalibrated Prototype Alignment for Federated Medical Image Segmentation [66.82598255715696]
Federated learning enables multiple medical institutions to train a global model without sharing data. Current approaches primarily focus on final-layer features, overlooking critical multi-level cues. We propose FedBCS to bridge feature representation gaps via domain-invariant contextual prototype alignment.
arXiv Detail & Related papers (2025-11-14T04:15:34Z) - Speech-DRAME: A Framework for Human-Aligned Benchmarks in Speech Role-Play [68.54773980519457]
Speech-DRAME is a unified framework that contributes at three levels. It provides the first comprehensive, reproducible foundation for assessing spoken role-play.
arXiv Detail & Related papers (2025-11-03T06:12:40Z) - Beyond Chunking: Discourse-Aware Hierarchical Retrieval for Long Document Question Answering [51.7493726399073]
We present a discourse-aware hierarchical framework to enhance long document question answering. The framework involves three key innovations: specialized discourse parsing for lengthy documents, LLM-based enhancement of discourse relation nodes, and structure-guided hierarchical retrieval.
arXiv Detail & Related papers (2025-05-26T14:45:12Z) - Multi-Scale Probabilistic Generation Theory: A Unified Information-Theoretic Framework for Hierarchical Structure in Large Language Models [1.0117553823134735]
Large Language Models (LLMs) exhibit remarkable emergent abilities but remain poorly understood at a mechanistic level. This paper introduces the Multi-Scale Probabilistic Generation Theory (MSPGT). MSPGT posits that standard language modeling objectives implicitly optimize multi-scale information compression.
arXiv Detail & Related papers (2025-05-23T16:55:35Z) - LegalSeg: Unlocking the Structure of Indian Legal Judgments Through Rhetorical Role Classification [6.549338652948716]
We introduce LegalSeg, the largest annotated dataset for this task, comprising over 7,000 documents and 1.4 million sentences, labeled with 7 rhetorical roles. Our results demonstrate that models incorporating broader context, structural relationships, and sequential sentence information outperform those relying solely on sentence-level features.
arXiv Detail & Related papers (2025-02-09T10:07:05Z) - HiCuLR: Hierarchical Curriculum Learning for Rhetorical Role Labeling of Legal Documents [1.2562034805037443]
HiCuLR is a hierarchical curriculum learning framework for Rhetorical Role Labeling.
It nests two curricula: Rhetorical Role-level Curriculum (RC) on the outer layer and Document-level Curriculum (DC) on the inner layer.
arXiv Detail & Related papers (2024-09-27T11:28:01Z) - Do Language Models Learn about Legal Entity Types during Pretraining? [4.604003661048267]
We show that Llama2 performs well on certain entities and exhibits potential for substantial improvement with optimized prompt templates.
Llama2 appears to frequently overlook syntactic cues, a shortcoming less present in BERT-based architectures.
arXiv Detail & Related papers (2023-10-19T18:47:21Z) - Disco-Bench: A Discourse-Aware Evaluation Benchmark for Language Modelling [70.23876429382969]
We propose a benchmark that can evaluate intra-sentence discourse properties across a diverse set of NLP tasks.
Disco-Bench consists of 9 document-level testsets in the literature domain, which contain rich discourse phenomena.
For linguistic analysis, we also design a diagnostic test suite that can examine whether the target models learn discourse knowledge.
arXiv Detail & Related papers (2023-07-16T15:18:25Z) - Global and Local Hierarchy-aware Contrastive Framework for Implicit Discourse Relation Recognition [8.143877598684528]
Implicit discourse relation recognition (IDRR) is a challenging task in discourse analysis.
Recent methods tend to integrate the whole hierarchical information of senses into discourse relation representations.
We propose a novel GlObal and Local Hierarchy-aware Contrastive Framework (GOLF), to model two kinds of hierarchies.
arXiv Detail & Related papers (2022-11-25T03:19:03Z) - A Multi-level Supervised Contrastive Learning Framework for Low-Resource Natural Language Inference [54.678516076366506]
Natural Language Inference (NLI) is an increasingly essential task in natural language understanding.
Here we propose a multi-level supervised contrastive learning framework named MultiSCL for low-resource natural language inference.
arXiv Detail & Related papers (2022-05-31T05:54:18Z) - UnifieR: A Unified Retriever for Large-Scale Retrieval [84.61239936314597]
Large-scale retrieval aims to recall relevant documents from a huge collection given a query.
Recent retrieval methods based on pre-trained language models (PLM) can be coarsely categorized into either dense-vector or lexicon-based paradigms.
We propose a new learning framework, UnifieR, which unifies dense-vector and lexicon-based retrieval in one model with a dual-representing capability.
arXiv Detail & Related papers (2022-05-23T11:01:59Z) - Self-supervised Answer Retrieval on Clinical Notes [68.87777592015402]
We introduce CAPR, a rule-based self-supervision objective for training Transformer language models for domain-specific passage matching.
We apply our objective in four Transformer-based architectures: Contextual Document Vectors, Bi-, Poly- and Cross-encoders.
We report that CAPR outperforms strong baselines in the retrieval of domain-specific passages and effectively generalizes across rule-based and human-labeled passages.
arXiv Detail & Related papers (2021-08-02T10:42:52Z) - BURT: BERT-inspired Universal Representation from Learning Meaningful Segment [46.51685959045527]
This work introduces and explores universal representation learning, i.e., embedding different levels of linguistic units in a uniform vector space.
We present a universal representation model, BURT, to encode different levels of linguistic unit into the same vector space.
Specifically, we extract and mask meaningful segments based on point-wise mutual information (PMI) to incorporate different granular objectives into the pre-training stage.
arXiv Detail & Related papers (2020-12-28T16:02:28Z)
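BURT's segment extraction (last entry above) relies on point-wise mutual information, which can be sketched roughly as follows. This is an illustrative toy, not the paper's procedure: it scores adjacent token pairs by PMI, the kind of signal used to decide which multi-token spans to treat as a single maskable segment; the bigram restriction and the example corpus are invented for the demonstration.

```python
import math
from collections import Counter

def bigram_pmi(tokens):
    """Score adjacent token pairs by point-wise mutual information:
    PMI(a, b) = log( P(a, b) / (P(a) * P(b)) ).
    Pairs that co-occur more often than chance get higher scores."""
    unigrams = Counter(tokens)
    bigrams = Counter(zip(tokens, tokens[1:]))
    n_uni, n_bi = len(tokens), len(tokens) - 1
    pmi = {}
    for (a, b), count in bigrams.items():
        p_ab = count / n_bi
        p_a, p_b = unigrams[a] / n_uni, unigrams[b] / n_uni
        pmi[(a, b)] = math.log(p_ab / (p_a * p_b))
    return pmi

# Toy corpus (invented): high-PMI pairs would be candidate segments to mask.
toks = "the supreme court held that the supreme court may".split()
scores = bigram_pmi(toks)
```

In a BURT-style setup, spans whose PMI exceeds a threshold would be masked as whole units during pre-training, so the model learns segment-level rather than only token-level representations.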
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed papers (including all information) and is not responsible for any consequences.