QCP: A Practical Separation Logic-based C Program Verification Tool
- URL: http://arxiv.org/abs/2505.12878v1
- Date: Mon, 19 May 2025 09:04:34 GMT
- Title: QCP: A Practical Separation Logic-based C Program Verification Tool
- Authors: Xiwei Wu, Yueyang Feng, Xiaoyang Lu, Tianchuan Lin, Kan Liu, Zhiyi Wang, Shushu Wu, Lihan Xie, Chengxi Yang, Hongyi Zhong, Naijun Zhan, Zhenjiang Hu, Qinxiang Cao
- Abstract summary: This paper introduces a novel verification tool, called Qualified C Programming Verifier (QCP). QCP incorporates a refined front-end assertion language to enhance user interaction.
- Score: 4.449417653464987
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: As software systems increase dramatically in size and complexity, ensuring their correctness, security, and reliability becomes an increasingly formidable challenge. Despite significant advancements in verification techniques and tools, substantial difficulties remain when applying them to complex, real-world scenarios. To address these difficulties, this paper introduces a novel verification tool, called Qualified C Programming Verifier (QCP). QCP incorporates a refined front-end assertion language to enhance user interaction. The proposed assertion language aims to lower the entry barrier for verification tools, improve proof efficiency through greater automation, and facilitate a deeper understanding of both the program and its verification results.
Related papers
- Position Paper: Programming Language Techniques for Bridging LLM Code Generation Semantic Gaps [3.61356888205659]
This paper argues that principled integration of Programming Language techniques is essential for bridging semantic gaps in large language models. PL techniques can elevate LLM-generated code from statistical pattern matching to truly reliable and trustworthy levels.
arXiv Detail & Related papers (2025-07-12T04:32:15Z) - What Challenges Do Developers Face When Using Verification-Aware Programming Languages? [45.44831696628473]
In software development, increasing software reliability often involves testing. For complex and critical systems, developers can use Design by Contract (DbC) methods to define precise specifications that software components must satisfy. Verification-Aware (VA) programming languages support DbC and formal verification at compile-time or run-time, offering stronger correctness guarantees than traditional testing.
arXiv Detail & Related papers (2025-06-30T10:17:39Z) - Enhancing Tool Learning in Large Language Models with Hierarchical Error Checklists [49.450160442348825]
We propose the Hierarchical Tool Error Checklist (HiTEC) framework to mitigate and diagnose tool-calling errors. HiTEC introduces a two-tiered approach: a global error checklist that identifies common, cross-tool issues, and a local error checklist that targets tool-specific and contextual failures. Our framework significantly improves parameter-filling accuracy and tool-calling success rates compared to baseline methods.
arXiv Detail & Related papers (2025-05-28T08:39:35Z) - Enhancing Large Language Models with Faster Code Preprocessing for Vulnerability Detection [0.0]
We build on the existing SCoPE framework and introduce SCoPE2, an enhanced version with improved performance. Our results show a 97.3% reduction in processing time with SCoPE2, along with an improved F1-score for LLM-based vulnerability detection.
arXiv Detail & Related papers (2025-05-08T19:00:11Z) - Acting Less is Reasoning More! Teaching Model to Act Efficiently [87.28134636548705]
Tool-integrated reasoning augments large language models with the ability to invoke external tools to solve tasks. Current approaches typically optimize only for final correctness without considering the efficiency or necessity of external tool use. We propose a framework that encourages models to produce accurate answers with minimal tool calls. Our approach reduces tool calls by up to 68.3% and improves tool productivity by up to 215.4%, while maintaining comparable answer accuracy.
arXiv Detail & Related papers (2025-04-21T05:40:05Z) - FACTS&EVIDENCE: An Interactive Tool for Transparent Fine-Grained Factual Verification of Machine-Generated Text [39.804889153032526]
Facts&Evidence is an interactive tool for user-driven verification of complex text. It visualizes the credibility of individual claims along with an explanation of model decisions and attribution to multiple, diverse evidence sources.
arXiv Detail & Related papers (2025-03-19T00:14:55Z) - Adaptive Tool Use in Large Language Models with Meta-Cognition Trigger [49.81945268343162]
We propose MeCo, an adaptive decision-making strategy for external tool use. MeCo quantifies metacognitive scores by capturing high-level cognitive signals in the representation space. MeCo is fine-tuning-free and incurs minimal cost.
arXiv Detail & Related papers (2025-02-18T15:45:01Z) - ToolCoder: A Systematic Code-Empowered Tool Learning Framework for Large Language Models [49.04652315815501]
Tool learning has emerged as a crucial capability for large language models (LLMs) to solve complex real-world tasks through interaction with external tools. We propose ToolCoder, a novel framework that reformulates tool learning as a code generation task.
arXiv Detail & Related papers (2025-02-17T03:42:28Z) - Evaluating Pre-Trained Models for Multi-Language Vulnerability Patching [3.220818227251765]
This paper investigates the potential of pre-trained language models, CodeBERT and CodeT5, for automated vulnerability patching. We evaluate these models on their accuracy, computational efficiency, and how the length of vulnerable code patches impacts performance.
arXiv Detail & Related papers (2025-01-13T13:51:05Z) - Development and Adoption of SATD Detection Tools: A State-of-practice Report [5.670597842524448]
Self-Admitted Technical Debt (SATD) refers to instances where developers knowingly introduce suboptimal solutions into code. This paper provides a comprehensive state-of-practice report on the development and adoption of SATD detection tools.
arXiv Detail & Related papers (2024-12-18T12:06:53Z) - Improving LLM Reasoning through Scaling Inference Computation with Collaborative Verification [52.095460362197336]
Large language models (LLMs) struggle with consistent and accurate reasoning.
LLMs are trained primarily on correct solutions, reducing their ability to detect and learn from errors.
We propose a novel collaborative method integrating Chain-of-Thought (CoT) and Program-of-Thought (PoT) solutions for verification.
arXiv Detail & Related papers (2024-10-05T05:21:48Z) - Agent-Driven Automatic Software Improvement [55.2480439325792]
This research proposal aims to explore innovative solutions by focusing on the deployment of agents powered by Large Language Models (LLMs).
The iterative nature of agents, which allows for continuous learning and adaptation, can help surpass common challenges in code generation.
We aim to use the iterative feedback in these systems to further fine-tune the LLMs underlying the agents, becoming better aligned to the task of automated software improvement.
arXiv Detail & Related papers (2024-06-24T15:45:22Z) - Lemur: Integrating Large Language Models in Automated Program Verification [10.221822902660458]
We propose a general methodology to combine the power of LLMs and automated reasoners for automated program verification.
We instantiate the calculus as a sound automated verification procedure and demonstrate practical improvements on a set of synthetic and competition benchmarks.
arXiv Detail & Related papers (2023-10-07T16:44:53Z) - Guess & Sketch: Language Model Guided Transpilation [59.02147255276078]
Learned transpilation offers an alternative to manual rewriting and engineering efforts.
Probabilistic neural language models (LMs) produce plausible outputs for every input, but do so at the cost of guaranteed correctness.
Guess & Sketch extracts alignment and confidence information from features of the LM and then passes it to a symbolic solver to resolve semantic equivalence.
arXiv Detail & Related papers (2023-09-25T15:42:18Z) - FacTool: Factuality Detection in Generative AI -- A Tool Augmented Framework for Multi-Task and Multi-Domain Scenarios [87.12753459582116]
A wider range of tasks now face an increasing risk of containing factual errors when handled by generative models.
We propose FacTool, a task and domain agnostic framework for detecting factual errors of texts generated by large language models.
arXiv Detail & Related papers (2023-07-25T14:20:51Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.