Related papers: Do not be greedy, Think Twice: Sampling and Selection for Document-level Information Extraction

URL: http://arxiv.org/abs/2601.18395v1
Date: Mon, 26 Jan 2026 11:53:08 GMT
Title: Do not be greedy, Think Twice: Sampling and Selection for Document-level Information Extraction
Authors: Mikel Zubillaga, Oscar Sainz, Oier Lopez de Lacalle, Eneko Agirre
Abstract summary: Document-level Information Extraction (DocIE) aims to produce an output template with the entities and relations of interest occurring in the given document. Standard practices include prompting decoder-only LLMs using greedy decoding to avoid output variability. We show that sampling can produce substantially better solutions than greedy decoding, especially when using reasoning models.
Score: 19.989502176674183
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Document-level Information Extraction (DocIE) aims to produce an output template with the entities and relations of interest occurring in the given document. Standard practices include prompting decoder-only LLMs using greedy decoding to avoid output variability. Rather than treating this variability as a limitation, we show that sampling can produce substantially better solutions than greedy decoding, especially when using reasoning models. We thus propose ThinkTwice, a sampling and selection framework in which the LLM generates multiple candidate templates for a given document, and a selection module chooses the most suitable one. We introduce both an unsupervised method that exploits agreement across generated outputs, and a supervised selection method using reward models trained on labeled DocIE data. To address the scarcity of golden reasoning trajectories for DocIE, we propose a rejection-sampling-based method to generate silver training data that pairs output templates with reasoning traces. Our experiments show the validity of unsupervised and supervised ThinkTwice, consistently outperforming greedy baselines and the state-of-the-art.
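
As a rough illustration of the unsupervised selection idea in the abstract (choosing among sampled templates by their agreement with one another), here is a minimal sketch. Everything in it is an assumption made for illustration: templates are modeled as sets of hypothetical (slot, value) pairs, agreement is measured as mean set-overlap F1 against the other samples, and the names `f1` and `select_by_agreement` are invented here. The paper's actual scoring and template format may differ.

```python
from typing import FrozenSet, List, Tuple

# Hypothetical representation: a template is a set of (slot, value) pairs.
Fact = Tuple[str, str]
Template = FrozenSet[Fact]


def f1(a: Template, b: Template) -> float:
    """Set-overlap F1 between two templates; two empty templates agree fully."""
    if not a and not b:
        return 1.0
    overlap = len(a & b)
    if overlap == 0:
        return 0.0
    precision = overlap / len(a)
    recall = overlap / len(b)
    return 2 * precision * recall / (precision + recall)


def select_by_agreement(candidates: List[Template]) -> Template:
    """Return the candidate with the highest mean F1 against the other samples.

    Ties are broken in favor of the earliest candidate.
    """
    if not candidates:
        raise ValueError("need at least one candidate")
    if len(candidates) == 1:
        return candidates[0]

    def mean_agreement(i: int) -> float:
        others = [c for j, c in enumerate(candidates) if j != i]
        return sum(f1(candidates[i], o) for o in others) / len(others)

    best = max(range(len(candidates)), key=mean_agreement)
    return candidates[best]


# Usage: three sampled templates for the same document (made-up values).
samples = [
    frozenset({("attacker", "A"), ("target", "B")}),
    frozenset({("attacker", "A"), ("target", "B")}),
    frozenset({("attacker", "C")}),
]
chosen = select_by_agreement(samples)
```

In this toy run the two identical samples support each other, so one of them is chosen over the outlier. A supervised variant would replace `mean_agreement` with a score from a trained reward model.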

Related papers

DiffuRank: Effective Document Reranking with Diffusion Language Models [71.16830004674513]
We propose DiffuRank, a reranking framework built upon diffusion language models (dLLMs). dLLMs support more flexible decoding and generation processes that are not constrained to a left-to-right order. We show dLLMs achieve performance comparable to, and in some cases exceeding, that of autoregressive LLMs with similar model sizes.
arXiv Detail & Related papers (2026-02-13T02:18:14Z)
DiffuGR: Generative Document Retrieval with Diffusion Language Models [80.78126312115087]
We propose generative document retrieval with diffusion language models, dubbed DiffuGR. For inference, DiffuGR attempts to generate DocID tokens in parallel and refine them through a controllable number of denoising steps. In contrast to conventional left-to-right auto-regressive decoding, DiffuGR provides a novel mechanism to first generate more confident DocID tokens.
arXiv Detail & Related papers (2025-11-11T12:00:09Z)
A Tale of Two Experts: Cooperative Learning for Source-Free Unsupervised Domain Adaptation [59.88864205383671]
Source-Free Unsupervised Domain Adaptation (SFUDA) addresses the realistic challenge of adapting a source-trained model to a target domain without access to the source data. Existing SFUDA methods either exploit only the source model's predictions or fine-tune large multimodal models. We propose Experts Cooperative Learning (EXCL) to exploit complementary insights and the latent structure of target data.
arXiv Detail & Related papers (2025-09-26T11:39:50Z)
SelfJudge: Faster Speculative Decoding via Self-Supervised Judge Verification [28.63435151584449]
We propose SelfJudge, which trains judge verifiers via self-supervision of the target model. Our method measures semantic preservation by assessing whether token-substituted responses preserve the meaning of original responses.
arXiv Detail & Related papers (2025-09-26T02:21:12Z)
Ranked from Within: Ranking Large Multimodal Models Without Labels [73.96543593298426]
We show that uncertainty scores derived from softmax distributions provide a robust basis for ranking models across various tasks. This facilitates the ranking of LMMs on unlabeled data, providing a practical approach for selecting models for diverse target domains without requiring manual annotation.
arXiv Detail & Related papers (2024-12-09T13:05:43Z)
Graph-DPEP: Decomposed Plug and Ensemble Play for Few-Shot Document Relation Extraction with Graph-of-Thoughts Reasoning [34.85741925091139]
The Graph-DPEP framework is grounded in the reasoning behind triplet explanation thoughts presented in natural language. We develop "ensemble-play", reapplying generation on the entire type list by leveraging the reasoning thoughts embedded in a sub-graph.
arXiv Detail & Related papers (2024-11-05T07:12:36Z)
Permissive Information-Flow Analysis for Large Language Models [21.563132267220073]
Large Language Models (LLMs) are rapidly becoming commodity components of larger software systems. This poses natural security and privacy problems: poisoned data retrieved from one component can change the model's behavior and compromise the entire system. We propose a novel, more permissive approach to propagate information flow labels through LLM queries.
arXiv Detail & Related papers (2024-10-04T00:25:43Z)
Lemur: Log Parsing with Entropy Sampling and Chain-of-Thought Merging [18.823038918091207]
We introduce a cutting-edge Log parsing framework with Entropy sampling and chain-of-thought Merging (Lemur). To discard the tedious manual rules, we propose a novel sampling method inspired by information entropy, which efficiently clusters typical logs. We have conducted experiments on large-scale public datasets.
arXiv Detail & Related papers (2024-02-28T09:51:55Z)
Label-Efficient Model Selection for Text Generation [14.61636207880449]
We introduce DiffUse, a method to make an informed decision between candidate text generation models based on preference annotations. In a series of experiments over hundreds of model pairs, we demonstrate that DiffUse can dramatically reduce the required number of annotations.
arXiv Detail & Related papers (2024-02-12T18:54:02Z)
Multi-Candidate Speculative Decoding [82.05519287513444]
Large language models have shown impressive capabilities across a variety of NLP tasks, yet generating text autoregressively is time-consuming. One way to speed them up is speculative decoding, which generates candidate segments from a fast draft model that are then verified in parallel by the target model. This paper proposes sampling multiple candidates from a draft model and then organising them in batches for verification. We design algorithms for efficient multi-candidate verification while maintaining the distribution of the target model.
arXiv Detail & Related papers (2024-01-12T17:15:23Z)
DORE: Document Ordered Relation Extraction based on Generative Framework [56.537386636819626]
This paper investigates the root cause of the underwhelming performance of existing generative DocRE models. We propose to generate a symbolic and ordered sequence from the relation matrix, which is deterministic and easier for the model to learn. Experimental results on four datasets show that our proposed method can improve the performance of generative DocRE models.
arXiv Detail & Related papers (2022-10-28T11:18:10Z)

This list is automatically generated from the titles and abstracts of the papers on this site.