Related papers: Automatic Syntax Error Repair for Discrete Controller Synthesis using Large Language Model

Automatic Syntax Error Repair for Discrete Controller Synthesis using Large Language Model

URL: http://arxiv.org/abs/2512.07261v1
Date: Mon, 08 Dec 2025 07:57:15 GMT
Title: Automatic Syntax Error Repair for Discrete Controller Synthesis using Large Language Model
Authors: Yusei Ishimizu, Takuto Yamauchi, Sinan Chen, Jinyu Cai, Jialong Li, Kenji Tei,
Abstract summary: This paper presents an automated approach that leverages Large Language Models (LLMs) to repair syntax errors in DCS models.<n>It equips the LLM with DCS-specific domain knowledge, including formal grammar rules and illustrative examples, to guide accurate corrections.<n>It achieves a speedup of 3.46 times compared to human developers.
Score: 8.741815980649667
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Discrete Controller Synthesis (DCS) is a powerful formal method for automatically generating specifications of discrete event systems. However, its practical adoption is often hindered by the highly specialized nature of formal models written in languages such as FSP and FLTL. In practice, syntax errors in modeling frequently become an important bottleneck for developers-not only disrupting the workflow and reducing productivity, but also diverting attention from higher-level semantic design. To this end, this paper presents an automated approach that leverages Large Language Models (LLMs) to repair syntax errors in DCS models using a well-designed, knowledge-informed prompting strategy. Specifically, the prompting is derived from a systematic empirical study of common error patterns, identified through expert interviews and student workshops. It equips the LLM with DCS-specific domain knowledge, including formal grammar rules and illustrative examples, to guide accurate corrections. To evaluate our method, we constructed a new benchmark by systematically injecting realistic syntax errors into validated DCS models. The quantitative evaluation demonstrates the high effectiveness of the proposed approach in terms of repair accuracy and its practical utility regarding time, achieving a speedup of 3.46 times compared to human developers. The experimental replication suite, including the benchmark and prompts, is available at https://github.com/Uuusay1432/DCSModelRepair.git

Related papers

Confusion-Aware Rubric Optimization for LLM-based Automated Grading [31.353360036776976]
We introduce Confusion-Aware Optimization (CARO), a novel framework that enhances accuracy and computational efficiency.<n>CARO decomposes monolithic error signals into distinct modes, allowing for unambiguous diagnosis and repair of specific misclassification patterns.<n>These results suggest that replacing mixed-error aggregation with surgical, mode-specific repair yields robust improvements in automated assessment scalability and precision.
arXiv Detail & Related papers (2026-02-28T04:17:12Z)
Teaching by Failure: Counter-Example-Driven Curricula for Transformer Self-Improvement [0.0]
Transformer models often exhibit brittle extrapolation, failing on inputs that are longer or structurally more complex than those seen during training.<n>We introduce Counter-Example-Driven Curricula (CEDC), an automated framework that improves model robustness by iteratively focusing on its own failures.
arXiv Detail & Related papers (2025-12-01T02:00:41Z)
Autoformalizer with Tool Feedback [52.334957386319864]
Autoformalization addresses the scarcity of data for Automated Theorem Proving (ATP) by translating mathematical problems from natural language into formal statements.<n>Existing formalizer still struggles to consistently generate valid statements that meet syntactic validity and semantic consistency.<n>We propose the Autoformalizer with Tool Feedback (ATF), a novel approach that incorporates syntactic and consistency information as tools into the formalization process.
arXiv Detail & Related papers (2025-10-08T10:25:12Z)
PAT-Agent: Autoformalization for Model Checking [17.082027022913998]
PAT-Agent is an end-to-end framework for natural language autoformalization and formal model repair.<n>It combines the generative capabilities of large language models with the rigor of formal verification.
arXiv Detail & Related papers (2025-09-28T06:32:14Z)
Re:Form -- Reducing Human Priors in Scalable Formal Software Verification with RL in LLMs: A Preliminary Study on Dafny [78.1575956773948]
Large Language Models (LLMs) trained with Reinforcement Learning (RL) face a significant challenge: their verification processes are neither reliable nor scalable.<n>A promising yet largely uncharted alternative is formal language-based reasoning.<n>Grounding LLMs in rigorous formal systems where generative models operate in formal language spaces (e.g., Dafny) enables the automatic and mathematically provable verification of their reasoning processes and outcomes.
arXiv Detail & Related papers (2025-07-22T08:13:01Z)
Reference Trustable Decoding: A Training-Free Augmentation Paradigm for Large Language Models [79.41139393080736]
Large language models (LLMs) have rapidly advanced and demonstrated impressive capabilities. In-Context Learning (ICL) and. Efficient Fine-Tuning (PEFT) are currently two mainstream methods for augmenting. LLMs to downstream tasks. We propose Reference Trustable Decoding (RTD), a paradigm that allows models to quickly adapt to new tasks without fine-tuning.
arXiv Detail & Related papers (2024-09-30T10:48:20Z)
InferAligner: Inference-Time Alignment for Harmlessness through Cross-Model Guidance [56.184255657175335]
We develop textbfInferAligner, a novel inference-time alignment method that utilizes cross-model guidance for harmlessness alignment. Experimental results show that our method can be very effectively applied to domain-specific models in finance, medicine, and mathematics. It significantly diminishes the Attack Success Rate (ASR) of both harmful instructions and jailbreak attacks, while maintaining almost unchanged performance in downstream tasks.
arXiv Detail & Related papers (2024-01-20T10:41:03Z)
Quantifying Language Models' Sensitivity to Spurious Features in Prompt Design or: How I learned to start worrying about prompt formatting [68.19544657508509]
Large language models (LLMs) are adopted as a fundamental component of language technologies. We find that several widely used open-source LLMs are extremely sensitive to subtle changes in prompt format in few-shot settings. We propose an algorithm that rapidly evaluates a sampled set of plausible prompt formats for a given task, and reports the interval of expected performance without accessing model weights.
arXiv Detail & Related papers (2023-10-17T15:03:30Z)
The Devil is in the Errors: Leveraging Large Language Models for Fine-grained Machine Translation Evaluation [93.01964988474755]
AutoMQM is a prompting technique which asks large language models to identify and categorize errors in translations. We study the impact of labeled data through in-context learning and finetuning. We then evaluate AutoMQM with PaLM-2 models, and we find that it improves performance compared to just prompting for scores.
arXiv Detail & Related papers (2023-08-14T17:17:21Z)

This list is automatically generated from the titles and abstracts of the papers in this site.