DOREMI: Optimizing Long Tail Predictions in Document-Level Relation Extraction
- URL: http://arxiv.org/abs/2601.11190v1
- Date: Fri, 16 Jan 2026 11:04:18 GMT
- Title: DOREMI: Optimizing Long Tail Predictions in Document-Level Relation Extraction
- Authors: Laura Menotti, Stefano Marchesin, Gianmaria Silvello
- Abstract summary: We introduce DOcument-level Relation Extraction optiMizing the long taIl (DOREMI). DOREMI enhances underrepresented relations through minimal yet targeted manual annotations. It can be applied to any existing DocRE model and is effective at mitigating long-tail biases.
- Score: 3.370733427756227
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Document-Level Relation Extraction (DocRE) presents significant challenges due to its reliance on cross-sentence context and the long-tail distribution of relation types, where many relations have scarce training examples. In this work, we introduce DOcument-level Relation Extraction optiMizing the long taIl (DOREMI), an iterative framework that enhances underrepresented relations through minimal yet targeted manual annotations. Unlike previous approaches that rely on large-scale noisy data or heuristic denoising, DOREMI actively selects the most informative examples to improve training efficiency and robustness. DOREMI can be applied to any existing DocRE model and is effective at mitigating long-tail biases, offering a scalable solution to improve generalization on rare relations.
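As a rough illustration of the kind of iterative loop the abstract describes, the sketch below pairs uncertainty-based example selection with a human-annotation oracle. The entropy criterion restricted to long-tail types, the budget, and all names (`tail_relations`, `select_for_annotation`, `oracle`) are illustrative assumptions, not DOREMI's published procedure.

```python
import math
from collections import Counter

def entropy(probs):
    """Shannon entropy of a list of probabilities."""
    return -sum(p * math.log(p) for p in probs if p > 0)

def tail_relations(labels, tail_fraction=0.5):
    """Relation types whose training frequency falls in the bottom fraction."""
    counts = Counter(labels)
    ranked = sorted(counts, key=counts.get)
    return set(ranked[: max(1, int(len(ranked) * tail_fraction))])

def select_for_annotation(pool, predict_proba, tail, budget):
    """Rank unlabeled examples by uncertainty restricted to tail relations."""
    def tail_uncertainty(example):
        probs = predict_proba(example)  # dict: relation -> probability
        return entropy([p for rel, p in probs.items() if rel in tail])
    return sorted(pool, key=tail_uncertainty, reverse=True)[:budget]

def annotation_loop(model, train, pool, oracle, rounds=5, budget=50):
    """Retrain, select informative examples, ask a human oracle to label them."""
    for _ in range(rounds):
        model.fit(train)
        tail = tail_relations([label for _, label in train])
        batch = select_for_annotation(pool, model.predict_proba, tail, budget)
        train = train + [(x, oracle(x)) for x in batch]  # targeted manual labels
        pool = [x for x in pool if x not in batch]
    return model
```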
Related papers
- Multimodal Large Language Models with Adaptive Preference Optimization for Sequential Recommendation [60.33386541343322]
We propose a Multimodal Large Language Models framework that integrates Hardness-aware and Noise-regularized preference optimization for Recommendation (HaNoRec). Specifically, HaNoRec dynamically adjusts optimization weights based on both the estimated hardness of each training sample and the policy model's real-time responsiveness.
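A toy rendering of hardness-aware loss weighting in the spirit described above; the softmax-over-hardness weighting and the `temperature` knob are assumptions for illustration, not HaNoRec's actual objective.

```python
import torch
import torch.nn.functional as F

def hardness_weighted_loss(logits, targets, temperature=1.0):
    """Weight each sample's loss by its estimated hardness (its own
    detached per-sample loss), softly normalized across the batch."""
    per_sample = F.cross_entropy(logits, targets, reduction="none")
    weights = torch.softmax(per_sample.detach() / temperature, dim=0)
    return (weights * per_sample).sum()
```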
arXiv Detail & Related papers (2025-11-24T04:10:46Z)
- COMM: Concentrated Margin Maximization for Robust Document-Level Relation Extraction [5.291403671224172]
Document-level relation extraction (DocRE) is the process of identifying and extracting relations between entities that span multiple sentences within a document. The complexity inherent in DocRE makes the labeling process prone to errors, compounded by the extreme sparsity of positive relation samples. We have developed a robust framework called COMM to better solve DocRE.
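Margin maximization over multi-label relation scores could look roughly like the hinge sketch below; this is a generic pairwise-margin objective for illustration, not COMM's concentrated formulation.

```python
import torch

def pairwise_margin_loss(scores, gold, margin=1.0):
    """Require the lowest-scoring gold relation to beat the
    highest-scoring non-gold relation by at least `margin`.
    `scores` is (batch, n_relations) float; `gold` is a bool mask."""
    pos_min = scores.masked_fill(~gold, float("inf")).min(dim=1).values
    neg_max = scores.masked_fill(gold, float("-inf")).max(dim=1).values
    return torch.relu(neg_max - pos_min + margin).mean()
```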
arXiv Detail & Related papers (2025-03-18T04:31:57Z)
- Few-Shot, No Problem: Descriptive Continual Relation Extraction [27.296604792388646]
Few-shot Continual Relation Extraction is a crucial challenge for enabling AI systems to identify and adapt to evolving relationships in real-world domains. Traditional memory-based approaches often overfit to limited samples, failing to reinforce old knowledge. We propose a novel retrieval-based solution, starting with a large language model to generate descriptions for each relation.
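The retrieve-by-description idea can be illustrated with a toy nearest-description classifier; the bag-of-words similarity and the two hand-written descriptions stand in for LLM-generated descriptions and a real embedding model.

```python
import math
from collections import Counter

DESCRIPTIONS = {  # hand-written stand-ins for LLM-generated descriptions
    "founded_by": "a person who established an organization",
    "born_in": "the place where a person was born",
}

def bow(text):
    return Counter(text.lower().split())

def cosine(a, b):
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * \
           math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def classify(context):
    """Retrieve the relation whose description best matches the context."""
    return max(DESCRIPTIONS, key=lambda r: cosine(bow(context), bow(DESCRIPTIONS[r])))

print(classify("Globex was established by a person"))  # -> founded_by
```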
arXiv Detail & Related papers (2025-02-27T23:44:30Z)
- Rethinking Relation Extraction: Beyond Shortcuts to Generalization with a Debiased Benchmark [53.876493664396506]
Benchmarks are crucial for evaluating machine learning algorithm performance, facilitating comparison and identifying superior solutions. This paper addresses the issue of entity bias in relation extraction tasks, where models tend to rely on entity mentions rather than context. We propose a debiased relation extraction benchmark, DREB, that breaks the pseudo-correlation between entity mentions and relation types through entity replacement. To establish a new baseline on DREB, we introduce MixDebias, a debiasing method combining data-level and model training-level techniques.
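A minimal sketch of entity replacement of the kind DREB relies on, assuming a hypothetical same-type entity inventory (`ENTITY_POOL`); the replacement policy is illustrative.

```python
import random

ENTITY_POOL = {  # hypothetical same-type entity inventory
    "PERSON": ["Ada Lovelace", "Alan Turing", "Grace Hopper"],
    "ORG": ["Acme Corp", "Globex", "Initech"],
}

def replace_entities(text, mentions, seed=0):
    """Swap each tagged mention for a random same-type entity, so a
    model must rely on context rather than memorized entity names."""
    rng = random.Random(seed)
    for surface, etype in mentions:
        candidates = [e for e in ENTITY_POOL.get(etype, []) if e != surface]
        if candidates:
            text = text.replace(surface, rng.choice(candidates))
    return text

print(replace_entities("Alan Turing worked at Globex.",
                       [("Alan Turing", "PERSON"), ("Globex", "ORG")]))
```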
arXiv Detail & Related papers (2025-01-02T17:01:06Z)
- Improving Long Tailed Document-Level Relation Extraction via Easy Relation Augmentation and Contrastive Learning [66.83982926437547]
We argue that mitigating the long-tailed distribution problem is crucial for DocRE in real-world scenarios.
Motivated by this problem, we propose an Easy Relation Augmentation (ERA) method for improving DocRE.
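Contrastive learning over relation representations, as used alongside ERA, typically takes a SupCon-like form such as the sketch below; the paper's exact loss may differ, and `temperature` is an illustrative choice.

```python
import torch
import torch.nn.functional as F

def supervised_contrastive_loss(reps, labels, temperature=0.1):
    """Pull together pairs that share a relation label, push apart the
    rest. `reps` is (batch, dim); `labels` is (batch,) long."""
    reps = F.normalize(reps, dim=1)
    logits = (reps @ reps.T) / temperature
    logits = logits.masked_fill(
        torch.eye(len(reps), dtype=torch.bool), float("-inf"))  # drop self-pairs
    log_prob = logits - torch.logsumexp(logits, dim=1, keepdim=True)
    pos = (labels.unsqueeze(0) == labels.unsqueeze(1))
    pos.fill_diagonal_(False)
    per_anchor = -log_prob.masked_fill(~pos, 0.0).sum(1) / pos.sum(1).clamp(min=1)
    return per_anchor[pos.any(1)].mean()  # average over anchors with positives
```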
arXiv Detail & Related papers (2022-05-21T06:15:11Z)
- SAIS: Supervising and Augmenting Intermediate Steps for Document-Level Relation Extraction [51.27558374091491]
We propose to explicitly teach the model to capture relevant contexts and entity types by supervising and augmenting intermediate steps (SAIS) for relation extraction.
Based on a broad spectrum of carefully designed tasks, our proposed SAIS method not only extracts relations of better quality due to more effective supervision, but also retrieves the corresponding supporting evidence more accurately.
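Supervising intermediate steps usually amounts to adding auxiliary losses; the sketch below combines a relation loss with an evidence-sentence loss under an assumed weight `lam`, as a simplified stand-in for SAIS's richer set of tasks.

```python
import torch.nn.functional as F

def multitask_loss(rel_logits, rel_labels, ev_logits, ev_labels, lam=0.5):
    """Relation classification plus auxiliary evidence supervision;
    `ev_labels` is a float 0/1 matrix marking supporting sentences."""
    rel_loss = F.cross_entropy(rel_logits, rel_labels)
    ev_loss = F.binary_cross_entropy_with_logits(ev_logits, ev_labels)
    return rel_loss + lam * ev_loss
```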
arXiv Detail & Related papers (2021-09-24T17:37:35Z)
- Contrastive Self-supervised Sequential Recommendation with Robust Augmentation [101.25762166231904]
Sequential Recommendation describes a set of techniques to model dynamic user behavior in order to predict future interactions in sequential user data.
Old and new issues remain, including data sparsity and noisy data.
We propose Contrastive Self-Supervised Learning for sequential Recommendation (CoSeRec).
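Robust augmentation for sequences can be illustrated with simple crop and mask operations that produce two views of one interaction history; the operation set and rates are assumptions, not CoSeRec's exact augmentations.

```python
import random

def crop(seq, rate=0.8, rng=random):
    """Keep a random contiguous window covering `rate` of the sequence."""
    n = max(1, int(len(seq) * rate))
    start = rng.randrange(len(seq) - n + 1)
    return seq[start:start + n]

def mask(seq, rate=0.3, mask_token=0, rng=random):
    """Randomly replace items with a mask token."""
    return [mask_token if rng.random() < rate else item for item in seq]

history = [12, 7, 33, 5, 18, 42]               # one user's interaction sequence
view_a, view_b = crop(history), mask(history)  # two views for a contrastive pair
```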
arXiv Detail & Related papers (2021-08-14T07:15:25Z)
- Distantly-Supervised Long-Tailed Relation Extraction Using Constraint Graphs [16.671606030727975]
In this paper, we introduce constraint graphs to model the dependencies between relation labels.
We also propose a novel constraint graph-based relation extraction framework (CGRE) to handle the two challenges (noisy labels from distant supervision and the long-tail distribution) simultaneously.
CGRE employs graph convolutional networks (GCNs) to propagate information from data-rich relation nodes to data-poor relation nodes.
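The propagation step CGRE attributes to GCNs can be sketched as one symmetrically normalized graph-convolution layer over relation nodes; the toy adjacency and feature sizes in the usage lines are assumptions.

```python
import torch

def gcn_layer(adj, feats, weight):
    """One graph-convolution step: D^-1/2 (A + I) D^-1/2 X W."""
    a_hat = adj + torch.eye(adj.size(0))      # add self-loops
    d = a_hat.sum(dim=1).pow(-0.5)
    norm = d.unsqueeze(1) * a_hat * d.unsqueeze(0)
    return torch.relu(norm @ feats @ weight)

adj = torch.tensor([[0., 1.], [1., 0.]])  # a data-poor node tied to a data-rich one
out = gcn_layer(adj, torch.randn(2, 8), torch.randn(8, 8))
```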
arXiv Detail & Related papers (2021-05-24T12:02:32Z)
- RH-Net: Improving Neural Relation Extraction via Reinforcement Learning and Hierarchical Relational Searching [2.1828601975620257]
We propose a novel framework named RH-Net, which utilizes Reinforcement learning and a Hierarchical relational searching module to improve relation extraction.
We then propose the hierarchical relational searching module to share semantics from correlated instances between data-rich and data-poor classes.
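One hedged reading of sharing semantics between data-rich and data-poor classes is blending class prototypes along a relation hierarchy, as below; the hierarchy, the mixing rule, and `alpha` are illustrative assumptions, not RH-Net's actual mechanism.

```python
HIERARCHY = {  # illustrative relation hierarchy
    "/people/person/place_of_birth": "/people/person",
    "/people/person/nationality": "/people/person",
}

def blended_prototype(rel, protos, alpha=0.7):
    """Mix a rare relation's prototype with its parent's, letting
    data-poor classes borrow statistics from data-rich relatives."""
    parent = HIERARCHY.get(rel)
    if parent is None or parent not in protos:
        return protos[rel]
    return [alpha * a + (1 - alpha) * b
            for a, b in zip(protos[rel], protos[parent])]
```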
arXiv Detail & Related papers (2020-10-27T12:50:27Z)
- Improving Long-Tail Relation Extraction with Collaborating Relation-Augmented Attention [63.26288066935098]
We propose a novel neural network, Collaborating Relation-augmented Attention (CoRA), to handle both wrong labeling and long-tail relations.
In the experiments on the popular benchmark dataset NYT, the proposed CoRA improves the prior state-of-the-art performance by a large margin.
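CoRA belongs to the selective-attention family for distantly supervised sentence bags; the sketch below shows the generic pattern of down-weighting noisy sentences with a relation-conditioned query, not CoRA's exact relation-augmented formulation.

```python
import torch

def bag_attention(sent_reps, rel_query):
    """Attend over a bag of sentence vectors with a relation-conditioned
    query; noisy (wrongly labeled) sentences receive low weight."""
    alpha = torch.softmax(sent_reps @ rel_query, dim=0)  # (n_sentences,)
    return alpha @ sent_reps                             # bag representation

bag = torch.randn(4, 16)   # four sentences mentioning the same entity pair
query = torch.randn(16)    # query for the candidate relation
rep = bag_attention(bag, query)
```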
arXiv Detail & Related papers (2020-10-08T05:34:43Z)