MEIC-DT: Memory-Efficient Incremental Clustering for Long-Text Coreference Resolution with Dual-Threshold Constraints
- URL: http://arxiv.org/abs/2512.24711v1
- Date: Wed, 31 Dec 2025 08:26:34 GMT
- Title: MEIC-DT: Memory-Efficient Incremental Clustering for Long-Text Coreference Resolution with Dual-Threshold Constraints
- Authors: Kangyang Luo, Shuzheng Si, Yuzhuo Bai, Cheng Gao, Zhitong Wang, Cheng Huang, Yingli Shen, Yufeng Han, Wenhao Li, Cunliang Kong, Maosong Sun
- Abstract summary: MEIC-DT is a memory-efficient incremental clustering approach based on a lightweight Transformer. We show that MEIC-DT achieves highly competitive coreference performance under stringent memory constraints.
- Score: 42.81232562487108
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: In the era of large language models (LLMs), supervised neural methods remain the state-of-the-art (SOTA) for Coreference Resolution. Yet, their full potential is underexplored, particularly in incremental clustering, which faces the critical challenge of balancing efficiency with performance for long texts. To address the limitation, we propose MEIC-DT, a novel dual-threshold, memory-efficient incremental clustering approach based on a lightweight Transformer. MEIC-DT features a dual-threshold constraint mechanism designed to precisely control the Transformer's input scale within a predefined memory budget. This mechanism incorporates a Statistics-Aware Eviction Strategy (SAES), which utilizes distinct statistical profiles from the training and inference phases for intelligent cache management. Furthermore, we introduce an Internal Regularization Policy (IRP) that strategically condenses clusters by selecting the most representative mentions, thereby preserving semantic integrity. Extensive experiments on common benchmarks demonstrate that MEIC-DT achieves highly competitive coreference performance under stringent memory constraints.
Related papers
- Beyond the Flat Sequence: Hierarchical and Preference-Aware Generative Recommendations [35.58864660038236]
We propose a novel framework named HPGR (Hierarchical and Preference-aware Generative Recommender). First, a structure-aware pre-training stage employs a session-based Masked Item Modeling objective to learn a hierarchically-informed and semantically rich item representation space. Second, a preference-aware fine-tuning stage leverages these powerful representations to implement a Preference-Guided Sparse Attention mechanism.
arXiv Detail & Related papers (2026-03-01T08:15:34Z)
- Rethinking Multi-Condition DiTs: Eliminating Redundant Attention via Position-Alignment and Keyword-Scoping [61.459927600301654]
Multi-condition control is bottlenecked by the conventional "concatenate-and-attend" strategy. Our analysis reveals that much of this cross-modal interaction is spatially or semantically redundant. We propose Position-aligned and Keyword-scoped Attention (PKA), a highly efficient framework designed to eliminate these redundancies.
arXiv Detail & Related papers (2026-02-06T16:39:10Z)
- Training-free Context-adaptive Attention for Efficient Long Context Modeling [57.703159205740185]
Training-free Context-adaptive Attention (TCA-Attention) is a training-free sparse attention mechanism that selectively attends to only the informative tokens for efficient long-context inference. TCA-Attention achieves a 2.8× speedup and reduces KV cache by 61% at 128K context length while maintaining performance comparable to full attention.
arXiv Detail & Related papers (2025-12-10T01:54:57Z)
- Adapformer: Adaptive Channel Management for Multivariate Time Series Forecasting [49.40321003932633]
Adapformer is an advanced Transformer-based framework that merges the benefits of CI and CD methodologies through effective channel management. Adapformer achieves superior performance over existing models, enhancing both predictive accuracy and computational efficiency.
arXiv Detail & Related papers (2025-11-18T16:24:05Z)
- Memory- and Latency-Constrained Inference of Large Language Models via Adaptive Split Computing [8.705453442427585]
Large language models (LLMs) have achieved near-human performance across diverse reasoning tasks. Their deployment on resource-constrained Internet-of-Things (IoT) devices remains impractical due to massive parameter footprints and memory-intensive autoregressive decoding. This work introduces the first autoregressive-aware split computing framework designed explicitly for LLM deployment on edge devices.
arXiv Detail & Related papers (2025-11-06T02:55:07Z)
- Label-independent hyperparameter-free self-supervised single-view deep subspace clustering [0.0]
Deep subspace clustering (DSC) algorithms face several challenges that hinder their widespread adoption across domains. We introduce a novel single-view DSC approach that minimizes a layer-wise self-expression loss using a joint representation matrix. We evaluate the proposed method on six datasets representing faces, digits, and objects.
arXiv Detail & Related papers (2025-04-25T08:54:34Z)
- Dynamic Memory-enhanced Transformer for Hyperspectral Image Classification [3.5093938502961763]
Hyperspectral image (HSI) classification remains a challenging task due to the intricate spatial-spectral correlations. Existing transformer models excel in capturing long-range dependencies but often suffer from information redundancy and attention inefficiencies. MemFormer introduces a memory-enhanced multi-head attention mechanism that iteratively refines a dynamic memory module. A dynamic memory enrichment strategy progressively captures complex spatial and spectral dependencies, leading to more expressive feature representations.
arXiv Detail & Related papers (2025-04-17T17:43:34Z)
- Exploiting Mixture-of-Experts Redundancy Unlocks Multimodal Generative Abilities [69.26544016976396]
We exploit the redundancy within Mixture-of-Experts (MoEs) as a source of additional capacity for learning a new modality. We preserve the original language generation capabilities by applying low-rank adaptation exclusively to the tokens of the new modality.
arXiv Detail & Related papers (2025-03-28T15:21:24Z)
- Contextual Compression Encoding for Large Language Models: A Novel Framework for Multi-Layered Parameter Space Pruning [0.0]
Contextual Compression Encoding (CCE) introduced a multi-stage encoding mechanism that dynamically restructured parameter distributions. CCE retained linguistic expressivity and coherence, maintaining accuracy across a range of text generation and classification tasks.
arXiv Detail & Related papers (2025-02-12T11:44:19Z)
- Structured Token Retention and Computational Memory Paths in Large Language Models [0.0]
This paper introduces a probabilistic selection framework that dynamically adjusts token persistence based on contextual significance. It is extended through hierarchical memory allocation, refining retention efficiency through structured reallocation of token embeddings. The integration of STR and CMP into an open-source model illustrates the adaptability of structured memory retention methodologies.
arXiv Detail & Related papers (2025-02-05T11:59:22Z)
- Large-scale Fully-Unsupervised Re-Identification [78.47108158030213]
We propose two strategies to learn from large-scale unlabeled data.
The first strategy performs a local neighborhood sampling to reduce the dataset size in each iteration without violating neighborhood relationships.
A second strategy leverages a novel Re-Ranking technique, which has a lower time upper bound complexity and reduces the memory complexity from O(n^2) to O(kn) with k ≪ n.
arXiv Detail & Related papers (2023-07-26T16:19:19Z)
This list is automatically generated from the titles and abstracts of the papers in this site.