Combining Constrained and Unconstrained Decoding via Boosting: BoostCD and Its Application to Information Extraction
- URL: http://arxiv.org/abs/2506.14901v1
- Date: Tue, 17 Jun 2025 18:16:17 GMT
- Title: Combining Constrained and Unconstrained Decoding via Boosting: BoostCD and Its Application to Information Extraction
- Authors: Marija Šakota, Robert West
- Abstract summary: Boosted Constrained Decoding combines constrained and unconstrained decoding in two phases. We demonstrate the power of BoostCD by applying it to closed information extraction.
- Score: 11.996681571362744
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Many recent approaches to structured NLP tasks use an autoregressive language model $M$ to map unstructured input text $x$ to output text $y$ representing structured objects (such as tuples, lists, trees, code, etc.), where the desired output structure is enforced via constrained decoding. During training, these approaches do not require the model to be aware of the constraints, which are merely implicit in the training outputs $y$. This is advantageous as it allows for dynamic constraints without requiring retraining, but can lead to low-quality output during constrained decoding at test time. We overcome this problem with Boosted Constrained Decoding (BoostCD), which combines constrained and unconstrained decoding in two phases: Phase 1 decodes from the base model $M$ twice, in constrained and unconstrained mode, obtaining two weak predictions. In phase 2, a learned autoregressive boosted model combines the two weak predictions into one final prediction. The mistakes made by the base model with vs. without constraints tend to be complementary, which the boosted model learns to exploit for improved performance. We demonstrate the power of BoostCD by applying it to closed information extraction. Our model, BoostIE, outperforms prior approaches both in and out of distribution, addressing several common errors identified in those approaches.
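As a rough illustration of the two-phase scheme described in the abstract, the sketch below wires the pieces together with toy stand-ins. All function names (`boost_cd`, `toy_base_decode`, `toy_boosted`) are hypothetical placeholders, not the authors' implementation; a real system would use an autoregressive LM with grammar-constrained token masking.

```python
# Hypothetical sketch of the two-phase BoostCD pipeline from the abstract.
# All names here are illustrative placeholders, not the authors' API.

def boost_cd(base_decode, boosted_model, x, grammar):
    """Phase 1: two weak predictions from the same base model;
    Phase 2: a learned model combines them into the final output."""
    y_free = base_decode(x, grammar=None)      # unconstrained decoding
    y_con = base_decode(x, grammar=grammar)    # constrained decoding
    return boosted_model(x, y_free, y_con)

# Toy stand-ins so the pipeline runs end to end.
def toy_base_decode(x, grammar=None):
    tokens = x.split()
    if grammar is None:
        return tokens                           # may contain invalid tokens
    return [t for t in tokens if t in grammar]  # grammar-valid tokens only

def toy_boosted(x, y_free, y_con):
    # Exploit complementarity: keep the constrained output, then append
    # unconstrained tokens it missed.
    return y_con + [t for t in y_free if t not in y_con]

print(boost_cd(toy_base_decode, toy_boosted, "a b c", {"a", "c"}))
# -> ['a', 'c', 'b']
```

The key property the paper exploits is that the constrained and unconstrained predictions make complementary mistakes; the boosted model learns when to trust each.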
Related papers
- Fast Controlled Generation from Language Models with Adaptive Weighted Rejection Sampling [90.86991492288487]
Evaluating constraints on every token can be prohibitively expensive. LCD can distort the global distribution over strings, sampling tokens based only on local information. We show that our approach is superior to state-of-the-art baselines.
arXiv Detail & Related papers (2025-04-07T18:30:18Z)
- Scaling Embedding Layers in Language Models [52.47659840377581]
SCONE is a new method for extending input embedding layers to enhance language model performance. Embeddings provide a contextualized representation for each input token and are learned with a separate model during training. SCONE enables two new scaling strategies: increasing the number of $n$-gram embeddings and scaling the model used to learn them, both while maintaining fixed accelerator usage during inference.
arXiv Detail & Related papers (2025-02-03T18:59:32Z)
- PLPP: Prompt Learning with Perplexity Is Self-Distillation for Vision-Language Models [8.480318790780037]
We propose a plug-in prompt-regularization method called PLPP, which uses a perplexity loss to regularize prompt learning. Experiments conducted on four classification tasks indicate that PLPP exhibits superior performance compared to existing methods.
arXiv Detail & Related papers (2024-12-18T03:08:53Z)
- SuffixDecoding: Extreme Speculative Decoding for Emerging AI Applications [9.143856130336783]
Speculative decoding is widely adopted to reduce latency in large language model (LLM) inference. Agentic frameworks submit repetitive inference requests, such as multi-agent pipelines performing similar subtasks or self-refinement loops iteratively enhancing outputs. We introduce SuffixDecoding, a novel method that utilizes efficient suffix trees to cache long token sequences.
arXiv Detail & Related papers (2024-11-07T18:49:33Z)
- COrAL: Order-Agnostic Language Modeling for Efficient Iterative Refinement [80.18490952057125]
Iterative refinement has emerged as an effective paradigm for enhancing the capabilities of large language models (LLMs) on complex tasks.
We propose Context-Wise Order-Agnostic Language Modeling (COrAL) to overcome these challenges.
Our approach models multiple token dependencies within manageable context windows, enabling the model to perform iterative refinement internally.
arXiv Detail & Related papers (2024-10-12T23:56:19Z)
- Confident Adaptive Language Modeling [95.45272377648773]
CALM is a framework for dynamically allocating different amounts of compute per input and generation timestep.
We demonstrate the efficacy of our framework in reducing compute -- a potential speedup of up to $\times 3$ -- while provably maintaining high performance.
arXiv Detail & Related papers (2022-07-14T17:00:19Z)
- Improving Robustness and Generality of NLP Models Using Disentangled Representations [62.08794500431367]
Supervised neural networks first map an input $x$ to a single representation $z$, and then map $z$ to the output label $y$.
We present methods to improve robustness and generality of NLP models from the standpoint of disentangled representation learning.
We show that models trained with the proposed criteria provide better robustness and domain adaptation ability in a wide range of supervised learning tasks.
arXiv Detail & Related papers (2020-09-21T02:48:46Z)
- The Right Tool for the Job: Matching Model and Instance Complexities [62.95183777679024]
As NLP models become larger, executing a trained model requires significant computational resources, incurring monetary and environmental costs.
We propose a modification to contextual representation fine-tuning which, during inference, allows for an early (and fast) "exit".
We test our proposed modification on five different datasets in two tasks: three text classification datasets and two natural language inference benchmarks.
arXiv Detail & Related papers (2020-04-16T04:28:08Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information it contains and is not responsible for any consequences of its use.