Corrective Diffusion Language Models
- URL: http://arxiv.org/abs/2512.15596v1
- Date: Wed, 17 Dec 2025 17:04:38 GMT
- Title: Corrective Diffusion Language Models
- Authors: Shuibai Zhang, Fred Zhangzhi Peng, Yiheng Zhang, Jin Pan, Grigorios G. Chrysos,
- Abstract summary: We study corrective behavior in diffusion language models, defined as the ability to assign lower confidence to incorrect tokens and iteratively refine them while preserving correct content.
We propose a correction-oriented post-training principle that explicitly supervises visible incorrect tokens, enabling error-aware confidence and targeted refinement.
- Score: 12.724100711773593
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Diffusion language models are structurally well-suited for iterative error correction, as their non-causal denoising dynamics allow arbitrary positions in a sequence to be revised. However, standard masked diffusion language model (MDLM) training fails to reliably induce this behavior, as models often cannot identify unreliable tokens in a complete input, rendering confidence-guided refinement ineffective. We study corrective behavior in diffusion language models, defined as the ability to assign lower confidence to incorrect tokens and iteratively refine them while preserving correct content. We show that this capability is not induced by conventional masked diffusion objectives and propose a correction-oriented post-training principle that explicitly supervises visible incorrect tokens, enabling error-aware confidence and targeted refinement. To evaluate corrective behavior, we introduce the Code Revision Benchmark (CRB), a controllable and executable benchmark for assessing error localization and in-place correction. Experiments on code revision tasks and controlled settings demonstrate that models trained with our approach substantially outperform standard MDLMs in correction scenarios, while also improving pure completion performance. Our code is publicly available at https://github.com/zhangshuibai/CDLM.
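The confidence-guided refinement the abstract describes (score every visible token, remask the unreliable ones, re-denoise, repeat) can be sketched as a short loop. The denoiser below is a toy oracle stand-in, not the paper's model: `toy_denoiser`, `corrective_decode`, the confidence values, and the threshold are all illustrative assumptions; a real masked diffusion model would derive proposals and confidences from its own logits.

```python
MASK = "<mask>"

def toy_denoiser(tokens, reference):
    """Oracle stand-in for a masked diffusion denoiser: it proposes the
    reference token at masked positions and reports low confidence
    wherever its proposal disagrees with the reference (illustrative
    only; a trained model computes both from its logits)."""
    proposals = [ref if tok == MASK else tok
                 for tok, ref in zip(tokens, reference)]
    confidences = [1.0 if p == ref else 0.2
                   for p, ref in zip(proposals, reference)]
    return proposals, confidences

def corrective_decode(tokens, reference, steps=4, threshold=0.5):
    """Confidence-guided refinement: fill masks, then remask any token
    the denoiser flags as unreliable and denoise again, leaving the
    confident (presumed correct) positions untouched."""
    for _ in range(steps):
        tokens, conf = toy_denoiser(tokens, reference)
        remasked = [t if c >= threshold else MASK
                    for t, c in zip(tokens, conf)]
        if remasked == tokens:  # every position is trusted: done
            break
        tokens = remasked
    return tokens

corrupted = ["def", "add", "(", "a", ",", "b", ")", ":", "retrun", "a", "+", "b"]
target    = ["def", "add", "(", "a", ",", "b", ")", ":", "return", "a", "+", "b"]
print(corrective_decode(corrupted, target))
```

The key property the paper argues standard MDLM training lacks is the first line of the loop: the model must assign low confidence to the visible-but-wrong token (`"retrun"` here) so that remasking targets it rather than the correct context around it.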
Related papers
- CORE: Context-Robust Remasking for Diffusion Language Models [51.59514489363897]
We propose Context-Robust Remasking (CORE), a training-free framework for inference-time revision.
Rather than trusting static token probabilities, CORE identifies context-brittle tokens by probing their sensitivity to targeted masked-context perturbations.
On LLaDA-8B-Base, CORE delivers consistent improvements across reasoning and code benchmarks, outperforming compute-matched baselines and improving MBPP by up to 9.2 percentage points.
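The sensitivity probing described in this summary can be illustrated with a tiny sketch: treat a token's brittleness as how much its score moves when individual context positions are masked out. Everything here (`context_brittleness`, `toy_score`, the uniform averaging) is a simplified assumption for illustration, not CORE's actual procedure.

```python
MASK = "<mask>"

def context_brittleness(score_fn, tokens, position):
    """Average absolute shift in a token's score when each other
    position is masked in turn: a large shift means the token's
    score leans heavily on fragile context."""
    base = score_fn(tokens, position)
    shifts = []
    for j in range(len(tokens)):
        if j == position:
            continue
        perturbed = list(tokens)
        perturbed[j] = MASK  # targeted masked-context perturbation
        shifts.append(abs(score_fn(perturbed, position) - base))
    return sum(shifts) / len(shifts)

def toy_score(tokens, position):
    """Illustrative scorer: 'bank' is confident only while 'river'
    is visible; 'the' is confident regardless of context."""
    if tokens[position] == "bank":
        return 0.9 if "river" in tokens else 0.4
    return 0.95

sentence = ["the", "river", "bank", "flooded"]
print(context_brittleness(toy_score, sentence, 2))  # context-brittle token
print(context_brittleness(toy_score, sentence, 0))  # context-robust token
```

A decoder following this idea would prefer to remask and revise positions with high brittleness scores rather than those with merely low static probability.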
arXiv Detail & Related papers (2026-02-04T00:12:30Z) - Training-Free Self-Correction for Multimodal Masked Diffusion Models [61.84305395626145]
We propose a training-free self-correction framework that exploits the inductive biases of pre-trained masked diffusion models.
Our method significantly improves generation quality on text-to-image generation and multimodal understanding tasks with reduced sampling steps.
arXiv Detail & Related papers (2026-02-02T23:58:15Z) - From Denoising to Refining: A Corrective Framework for Vision-Language Diffusion Model [72.73512218682187]
We introduce ReDiff, a refining-enhanced diffusion framework that teaches the model to identify and correct its own errors.
Our approach features a two-stage training process: first, we instill a foundational revision capability by training the model to revise synthetic errors; second, we implement a novel online self-correction loop.
This mistake-driven learning endows the model with the crucial ability to revisit and refine its already generated output, effectively breaking the error cascade.
arXiv Detail & Related papers (2025-10-22T06:58:55Z) - Adapting Language Balance in Code-Switching Speech [60.296574524609575]
Large foundational models still struggle against code-switching test cases.
We use differentiable surrogates to mitigate context bias during generation.
Experiments with Arabic and Chinese-English showed that the models are able to predict the switching places more correctly.
arXiv Detail & Related papers (2025-10-21T15:23:55Z) - Mechanistic Interpretability of Code Correctness in LLMs via Sparse Autoencoders [0.0]
We apply sparse autoencoders to decompose Large Language Models, identifying directions that correspond to code correctness.
We find that code correctness directions in LLMs reliably predict incorrect code, while correction capabilities, though statistically significant, involve tradeoffs between fixing errors and preserving correct code.
Our mechanistic insights suggest three practical applications: prompting strategies should prioritize test examples over elaborate problem descriptions, predictor directions can serve as error alarms for developer review, and these same predictors can guide selective steering, intervening only when errors are anticipated so as to avoid the code corruption that constant steering causes.
arXiv Detail & Related papers (2025-10-03T11:44:21Z) - Latent Veracity Inference for Identifying Errors in Stepwise Reasoning [78.29317733206643]
We introduce Veracity Search (VS), a discrete search algorithm over veracity assignments.
It performs otherwise intractable inference in the posterior distribution over latent veracity values.
It generalizes VS, enabling accurate zero-shot veracity inference in novel contexts.
arXiv Detail & Related papers (2025-05-17T04:16:36Z) - Conformal Linguistic Calibration: Trading-off between Factuality and Specificity [41.45862052156885]
We present a framework connecting abstention and linguistic calibration through the lens of linguistic pragmatics.
Results demonstrate our method produces calibrated outputs with conformal guarantees on factual accuracy.
arXiv Detail & Related papers (2025-02-26T13:01:49Z) - Selective Learning: Towards Robust Calibration with Dynamic Regularization [79.92633587914659]
Miscalibration in deep learning refers to a discrepancy between a model's predicted confidence and its actual performance.
We introduce Dynamic Regularization (DReg), which aims to learn what should be learned during training, thereby circumventing the confidence-adjustment trade-off.
arXiv Detail & Related papers (2024-02-13T11:25:20Z) - uChecker: Masked Pretrained Language Models as Unsupervised Chinese Spelling Checkers [23.343006562849126]
We propose a framework named uChecker to conduct unsupervised spelling error detection and correction.
Masked pretrained language models such as BERT are introduced as the backbone model.
Benefiting from various and flexible MASKing operations, we propose a Confusionset-guided masking strategy to fine-tune the masked language model.
arXiv Detail & Related papers (2022-09-15T05:57:12Z) - Tail-to-Tail Non-Autoregressive Sequence Prediction for Chinese Grammatical Error Correction [49.25830718574892]
We present a new framework named Tail-to-Tail (TtT) non-autoregressive sequence prediction.
It builds on the observation that most tokens are correct and can be conveyed directly from source to target, while the error positions can be estimated and corrected.
Experimental results on standard datasets, especially on the variable-length datasets, demonstrate the effectiveness of TtT in terms of sentence-level Accuracy, Precision, Recall, and F1-Measure.
arXiv Detail & Related papers (2021-06-03T05:56:57Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information it contains and is not responsible for any consequences of its use.