Improving Constrained Generation in Language Models via Self-Distilled Twisted Sequential Monte Carlo
- URL: http://arxiv.org/abs/2507.02315v1
- Date: Thu, 03 Jul 2025 05:00:21 GMT
- Title: Improving Constrained Generation in Language Models via Self-Distilled Twisted Sequential Monte Carlo
- Authors: Sooyeon Kim, Giung Nam, Juho Lee
- Abstract summary: In constrained generation settings, learning becomes challenging due to sparse and uninformative reward signals. We show that iteratively refining the base model through self-distillation alleviates this issue by making the model progressively more aligned with the target.
- Score: 15.169258833686413
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Recent work has framed constrained text generation with autoregressive language models as a probabilistic inference problem. Among these, Zhao et al. (2024) introduced a promising approach based on twisted Sequential Monte Carlo, which incorporates learned twist functions and twist-induced proposals to guide the generation process. However, in constrained generation settings where the target distribution concentrates on outputs that are unlikely under the base model, learning becomes challenging due to sparse and uninformative reward signals. We show that iteratively refining the base model through self-distillation alleviates this issue by making the model progressively more aligned with the target, leading to substantial gains in generation quality.
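The abstract describes two pieces: twisted SMC decoding with a twist-induced proposal (following Zhao et al., 2024), and an outer loop that self-distills the resulting samples back into the base model. The sketch below is a minimal illustration under strong simplifying assumptions: `TinyLM`, `TwistHead`, and `self_distill_step` are hypothetical stand-ins, the twist network is assumed to be already trained (the twist-learning objective is not shown), and the final-step reweighting by the true constraint reward is omitted. It is not the authors' implementation.

```python
# Hedged sketch of twisted SMC decoding plus one self-distillation round.
# All module and function names here are illustrative stand-ins.
import torch
import torch.nn.functional as F

VOCAB, HID = 100, 32

class TinyLM(torch.nn.Module):
    """Stand-in autoregressive base model p0(x_t | x_<t)."""
    def __init__(self):
        super().__init__()
        self.emb = torch.nn.Embedding(VOCAB, HID)
        self.rnn = torch.nn.GRU(HID, HID, batch_first=True)
        self.out = torch.nn.Linear(HID, VOCAB)

    def next_logits(self, seqs):              # seqs: (N, t) token ids
        h, _ = self.rnn(self.emb(seqs))
        return self.out(h[:, -1])             # (N, VOCAB)

class TwistHead(torch.nn.Module):
    """Learned twist: log psi_t(s_<t, x_t), one value per candidate token."""
    def __init__(self):
        super().__init__()
        self.emb = torch.nn.Embedding(VOCAB, HID)
        self.rnn = torch.nn.GRU(HID, HID, batch_first=True)
        self.out = torch.nn.Linear(HID, VOCAB)

    def log_psi(self, seqs):
        h, _ = self.rnn(self.emb(seqs))
        return self.out(h[:, -1])             # (N, VOCAB)

@torch.no_grad()
def twisted_smc(base, twist, prompt, n_particles=8, steps=16):
    seqs = prompt.repeat(n_particles, 1)                      # (N, t0)
    log_w = torch.zeros(n_particles)                          # importance weights
    prev_log_psi = torch.zeros(n_particles)                   # log psi of chosen prefix
    for _ in range(steps):
        base_logp = F.log_softmax(base.next_logits(seqs), -1)
        log_psi = twist.log_psi(seqs)
        # Twist-induced proposal q(x | s) ∝ p0(x | s) * psi(s, x)
        prop_logits = base_logp + log_psi
        x = torch.multinomial(F.softmax(prop_logits, -1), 1)  # (N, 1)
        # Incremental weight: log sum_x p0(x|s) psi(s,x) - log psi(prefix)
        log_w += torch.logsumexp(prop_logits, -1) - prev_log_psi
        prev_log_psi = log_psi.gather(1, x).squeeze(1)
        seqs = torch.cat([seqs, x], dim=1)
        # Multinomial resampling when the effective sample size collapses
        w = F.softmax(log_w, 0)
        if 1.0 / (w ** 2).sum() < n_particles / 2:
            idx = torch.multinomial(w, n_particles, replacement=True)
            seqs, prev_log_psi = seqs[idx], prev_log_psi[idx]
            log_w = torch.zeros(n_particles)
    return seqs, log_w

def self_distill_step(base, seqs, lr=1e-3):
    """One simplified self-distillation round: fine-tune the base model on its
    own twisted-SMC samples so the next round starts closer to the target."""
    opt = torch.optim.Adam(base.parameters(), lr=lr)
    logits = base.out(base.rnn(base.emb(seqs[:, :-1]))[0])    # (N, T-1, VOCAB)
    loss = F.cross_entropy(logits.reshape(-1, VOCAB), seqs[:, 1:].reshape(-1))
    loss.backward(); opt.step(); opt.zero_grad()
    return loss.item()

if __name__ == "__main__":
    base, twist = TinyLM(), TwistHead()
    prompt = torch.randint(1, VOCAB, (1, 4))
    samples, weights = twisted_smc(base, twist, prompt)
    print(samples.shape, weights)
    print("distill loss:", self_distill_step(base, samples))
```

In the paper's setting, alternating twisted-SMC sampling with rounds of such a distillation step progressively moves the base model toward the constrained target, which in turn makes the reward signal seen during twist learning less sparse.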
Related papers
- Simulated Annealing Enhances Theory-of-Mind Reasoning in Autoregressive Language Models [1.4323566945483497]
Theory of Mind (ToM) tasks crucially depend on reasoning about latent mental states of oneself and others. We show that strong ToM capability can be recovered directly from the base model without any additional weight updates or verifications.
arXiv Detail & Related papers (2026-01-18T05:51:30Z) - Visual Self-Refinement for Autoregressive Models [27.0373357661741]
This work proposes a plug-and-play refinement module to enhance the complex spatial correspondence modeling. Experiments demonstrate that the proposed method improves the generation quality, enhancing the model's ability to produce semantically consistent results.
arXiv Detail & Related papers (2025-10-01T15:03:32Z) - Self-Correcting Code Generation Using Small Language Models [11.4397549365277]
Self-correction has demonstrated potential in code generation by allowing language models to revise and improve their outputs through successive refinement. We introduce CoCoS, an approach designed to enhance the ability of small language models for multi-turn code correction. With 1B-scale models, CoCoS achieves improvements of 35.8% on MBPP and 27.7% on HumanEval compared to the baselines.
arXiv Detail & Related papers (2025-05-29T04:04:44Z) - Self-Improvement in Language Models: The Sharpening Mechanism [70.9248553790022]
We offer a new perspective on the capabilities of self-improvement through a lens we refer to as sharpening. Motivated by the observation that language models are often better at verifying response quality than they are at generating correct responses, we formalize self-improvement as using the model itself as a verifier during post-training. We analyze two natural families of self-improvement algorithms based on SFT and RLHF.
arXiv Detail & Related papers (2024-12-02T20:24:17Z) - Enhancing Pre-Trained Generative Language Models with Question Attended Span Extraction on Machine Reading Comprehension [6.602323571343169]
Integrated during the fine-tuning phase of pre-trained generative language models (PLMs), QASE significantly enhances their performance.
The efficacy of the QASE module has been rigorously tested across various datasets.
arXiv Detail & Related papers (2024-04-27T19:42:51Z) - Heat Death of Generative Models in Closed-Loop Learning [63.83608300361159]
We study the learning dynamics of generative models that are fed back their own produced content in addition to their original training dataset.
We show that, unless a sufficient amount of external data is introduced at each iteration, any non-trivial temperature leads the model to degenerate.
arXiv Detail & Related papers (2024-04-02T21:51:39Z) - Non-autoregressive Sequence-to-Sequence Vision-Language Models [59.445765313094434]
We propose a parallel decoding sequence-to-sequence vision-language model that marginalizes over multiple inference paths in the decoder. The model achieves performance on-par with its state-of-the-art autoregressive counterpart, but is faster at inference time.
arXiv Detail & Related papers (2024-03-04T17:34:59Z) - Calibrating Likelihoods towards Consistency in Summarization Models [22.023863165579602]
We argue that the main reason for such behavior is that the summarization models trained with maximum likelihood objective assign high probability to plausible sequences given the context.
In this work, we solve this problem by calibrating the likelihood of model generated sequences to better align with a consistency metric measured by natural language inference (NLI) models.
arXiv Detail & Related papers (2023-10-12T23:17:56Z) - PLANNER: Generating Diversified Paragraph via Latent Language Diffusion Model [37.2192243883707]
We propose PLANNER, a model that combines latent semantic diffusion with autoregressive generation to generate fluent text.
Results on semantic generation, text completion and summarization show its effectiveness in generating high-quality long-form text.
arXiv Detail & Related papers (2023-06-05T01:36:39Z) - On the Generalization and Adaption Performance of Causal Models [99.64022680811281]
Differentiable causal discovery has proposed to factorize the data generating process into a set of modules.
We study the generalization and adaption performance of such modular neural causal models.
Our analysis shows that the modular neural causal models outperform other models on both zero and few-shot adaptation in low data regimes.
arXiv Detail & Related papers (2022-06-09T17:12:32Z) - Improving Non-autoregressive Generation with Mixup Training [51.61038444990301]
We present a non-autoregressive generation model based on pre-trained transformer models.
We propose a simple and effective iterative training method called MIx Source and pseudo Target.
Our experiments on three generation benchmarks including question generation, summarization and paraphrase generation, show that the proposed framework achieves the new state-of-the-art results.
arXiv Detail & Related papers (2021-10-21T13:04:21Z) - Goal-directed Generation of Discrete Structures with Conditional Generative Models [85.51463588099556]
We introduce a novel approach to directly optimize a reinforcement learning objective, maximizing an expected reward.
We test our methodology on two tasks: generating molecules with user-defined properties and identifying short python expressions which evaluate to a given target value.
arXiv Detail & Related papers (2020-10-05T20:03:13Z)