Related papers: Combining Logic with Large Language Models for Automatic Debugging and Repair of ASP Programs

URL: http://arxiv.org/abs/2410.20962v1
Date: Mon, 28 Oct 2024 12:30:48 GMT
Title: Combining Logic with Large Language Models for Automatic Debugging and Repair of ASP Programs
Authors: Ricardo Brancas, Vasco Manquinho, Ruben Martins
Abstract summary: FormHe is a tool that combines logic-based techniques and Large Language Models to identify and correct issues in Answer Set Programming submissions. We show that FormHe accurately detects faults in 94% of cases and successfully repairs 58% of incorrect submissions.
Score: 1.0650780147044159
License: http://creativecommons.org/licenses/by-nc-sa/4.0/
Abstract: Logic programs are a powerful approach for solving NP-Hard problems. However, due to their declarative nature, debugging logic programs poses significant challenges. Unlike procedural paradigms, which allow for step-by-step inspection of program state, logic programs require reasoning about logical statements for fault localization. This complexity is amplified in learning environments due to students' inexperience. We introduce FormHe, a novel tool that combines logic-based techniques and Large Language Models to identify and correct issues in Answer Set Programming submissions. FormHe consists of two components: a fault localization module and a program repair module. First, the fault localizer identifies a set of faulty program statements requiring modification. Subsequently, FormHe employs program mutation techniques and Large Language Models to repair the flawed ASP program. These repairs can then serve as guidance for students to correct their programs. Our experiments with real buggy programs submitted by students show that FormHe accurately detects faults in 94% of cases and successfully repairs 58% of incorrect submissions.

Related papers

HoarePrompt: Structural Reasoning About Program Correctness in Natural Language [6.0749049701897295]
HoarePrompt is a novel approach that adapts fundamental ideas from program analysis and verification to natural language artifacts. To manage loops, we propose few-shot-driven k-induction, an adaptation of the k-induction method widely used in model checking. Our experiments show that HoarePrompt improves the MCC by 62% compared to directly using Zero-shot-CoT prompts for correctness classification.
arXiv Detail & Related papers (2025-03-25T12:30:30Z)
Relating Answer Set Programming and Many-sorted Logics for Formal Verification [1.223779595809275]
My research agenda has been focused on addressing three issues with the intention of making ASP verification an accessible, routine task. I have investigated alternative semantics for ASP based on translations into the logic of here-and-there and many-sorted first-order logic. These semantics promote a modular understanding of logic programs, bypass grounding, and enable us to use automated theorem provers to automatically verify properties of programs.
arXiv Detail & Related papers (2025-02-13T11:52:40Z)
Logic Error Localization in Student Programming Assignments Using Pseudocode and Graph Neural Networks [31.600659350609476]
We develop a system designed to localize logic errors within student programming assignments at the line level. We employ a graph neural network to both localize and suggest corrections for logic errors. Our experimental results are promising, demonstrating a localization accuracy of 99.2% for logic errors within the top-10 suspected lines.
arXiv Detail & Related papers (2024-10-11T01:46:24Z)
Multi-Task Program Error Repair and Explanatory Diagnosis [28.711745671275477]
We present a novel machine-learning approach for Multi-task Program Error Repair and Explanatory Diagnosis (mPRED). A pre-trained language model is used to encode the source code, and a downstream model is specifically designed to identify and repair errors. To aid in visualizing and analyzing the program structure, we use a graph neural network for program structure visualization.
arXiv Detail & Related papers (2024-10-09T05:09:24Z)
Improving LLM Classification of Logical Errors by Integrating Error Relationship into Prompts [1.7095867620640115]
A key aspect of programming education is understanding and dealing with error messages. 'Logical errors', in which the program operates against the programmer's intentions, do not receive error messages from the compiler. We propose an effective approach for detecting logical errors with LLMs that makes use of relations among error types in the Chain-of-Thought and Tree-of-Thought prompts.
arXiv Detail & Related papers (2024-04-30T08:03:22Z)
NExT: Teaching Large Language Models to Reason about Code Execution [50.93581376646064]
Large language models (LLMs) of code are typically trained on the surface textual form of programs. We propose NExT, a method to teach LLMs to inspect the execution traces of programs and reason about their run-time behavior.
arXiv Detail & Related papers (2024-04-23T01:46:32Z)
Language Models can be Logical Solvers [99.40649402395725]
We introduce LoGiPT, a novel language model that directly emulates the reasoning processes of logical solvers. LoGiPT is fine-tuned on a newly constructed instruction-tuning dataset derived from revealing and refining the invisible reasoning process of deductive solvers.
arXiv Detail & Related papers (2023-11-10T16:23:50Z)
Guess & Sketch: Language Model Guided Transpilation [59.02147255276078]
Learned transpilation offers an alternative to manual re-writing and engineering efforts. Probabilistic neural language models (LMs) produce plausible outputs for every input, but do so at the cost of guaranteed correctness. Guess & Sketch extracts alignment and confidence information from features of the LM then passes it to a symbolic solver to resolve semantic equivalence.
arXiv Detail & Related papers (2023-09-25T15:42:18Z)
Fact-Checking Complex Claims with Program-Guided Reasoning [99.7212240712869]
Program-Guided Fact-Checking (ProgramFC) is a novel fact-checking model that decomposes complex claims into simpler sub-tasks. We first leverage the in-context learning ability of large language models to generate reasoning programs. We execute the program by delegating each sub-task to the corresponding sub-task handler.
arXiv Detail & Related papers (2023-05-22T06:11:15Z)
System Predictor: Grounding Size Estimator for Logic Programs under Answer Set Semantics [0.5801044612920815]
We present the system Predictor for estimating the grounding size of programs. We evaluate the impact of Predictor when used as a guide for rewritings produced by the answer set programming rewriting tools Projector and Lpopt.
arXiv Detail & Related papers (2023-03-29T20:49:40Z)
Enforcing Consistency in Weakly Supervised Semantic Parsing [68.2211621631765]
We explore the use of consistency between the output programs for related inputs to reduce the impact of spurious programs. We find that a more consistent formalism leads to improved model performance even without consistency-based training.
arXiv Detail & Related papers (2021-07-13T03:48:04Z)
Graph-based, Self-Supervised Program Repair from Diagnostic Feedback [108.48853808418725]
We introduce a program-feedback graph, which connects symbols relevant to program repair in source code and diagnostic feedback. We then apply a graph neural network on top to model the reasoning process. We present a self-supervised learning paradigm for program repair that leverages unlabeled programs available online.
arXiv Detail & Related papers (2020-05-20T07:24:28Z)

This list is automatically generated from the titles and abstracts of the papers on this site.