Related papers: SLICEMATE: Accurate and Scalable Static Program Slicing via LLM-Powered Agents

SLICEMATE: Accurate and Scalable Static Program Slicing via LLM-Powered Agents

URL: http://arxiv.org/abs/2507.18957v1
Date: Fri, 25 Jul 2025 04:51:47 GMT
Title: SLICEMATE: Accurate and Scalable Static Program Slicing via LLM-Powered Agents
Authors: Jianming Chang, Jieke Shi, Yunbo Lyu, Xin Zhou, Lulu Wang, Zhou Yang, Bixin Li, David Lo,
Abstract summary: SliceMate is a novel static program slicing solution powered by Large Language Model (LLM) agents.<n>It bypasses the need for explicit dependency graph construction and achieves superior slicing accuracy.<n>For rigorous evaluation, we construct a new and high-quality benchmark, SliceBench, comprising 2,200 manually annotated Java and Python programs.
Score: 11.069304685402642
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Static program slicing, which extracts the executable portions of a program that affect the values at a specific location, supports many software analysis tasks such as debugging and security auditing. However, traditional slicing tools rely on computationally expensive reachability analysis over dependency graphs, which struggle to scale to large programs and often fail to handle code with incomplete syntax. Recently emerged learning-based methods, while more robust to such cases, still fall short of achieving comparable performance to traditional methods on well-formed code. In this work, we propose SliceMate, a novel static program slicing solution powered by Large Language Model (LLM) agents. It bypasses the need for explicit dependency graph construction and achieving superior slicing accuracy. Concretely, SliceMate integrates three specialized agents: (1) a synthesis agent that produces candidate slices by incrementally expanding the scan scope across functions and files guided by LLM-inferred dependencies; (2) a verification agent that performs conciseness and completeness checks of the candidate slices, detecting missing or irrelevant statements; and (3) a refinement agent that repairs the slices with minimal edits in accordance with the verification results. These agents are orchestrated by a control module that ensures timely convergence and outputs high-quality slices without manual intervention. For rigorous evaluation, we construct a new and high-quality benchmark, SliceBench, comprising 2,200 manually annotated Java and Python programs, with program lengths ranging from 5 to 8,577 lines, significantly larger than those in existing slicing benchmarks. Experimental results show that SliceMate greatly outperforms both traditional and learning-based slicing tools.

Related papers

Training Language Models to Generate Quality Code with Program Analysis Feedback [66.0854002147103]
Code generation with large language models (LLMs) is increasingly adopted in production but fails to ensure code quality.<n>We propose REAL, a reinforcement learning framework that incentivizes LLMs to generate production-quality code.
arXiv Detail & Related papers (2025-05-28T17:57:47Z)
CompileAgent: Automated Real-World Repo-Level Compilation with Tool-Integrated LLM-based Agent System [52.048087777953064]
We propose CompileAgent, an agent framework dedicated to repo-level compilation.<n>CompileAgent integrates five tools and a flow-based agent strategy, enabling interaction with software artifacts for compilation instruction search and error resolution.<n>We show that our method significantly improves the compilation success rate, ranging from 10% to 71%.
arXiv Detail & Related papers (2025-05-07T08:59:14Z)
Acting Less is Reasoning More! Teaching Model to Act Efficiently [87.28134636548705]
Tool-integrated reasoning augments large language models with the ability to invoke external tools to solve tasks.<n>Current approaches typically optimize only for final correctness without considering the efficiency or necessity of external tool use.<n>We propose a framework that encourages models to produce accurate answers with minimal tool calls.<n>Our approach reduces tool calls by up to 68.3% and improves tool productivity by up to 215.4%, while maintaining comparable answer accuracy.
arXiv Detail & Related papers (2025-04-21T05:40:05Z)
Program Slicing in the Era of Large Language Models [7.990456190723922]
Program slicing is a critical technique in software engineering, enabling developers to isolate relevant portions of code. This study investigates the application of large language models (LLMs) to both static and dynamic program slicing.
arXiv Detail & Related papers (2024-09-19T00:07:56Z)
Scaling Symbolic Execution to Large Software Systems [0.0]
Symbolic execution is a popular static analysis technique used both in program verification and in bug detection software. We focus on an error finding framework called the Clang Static Analyzer, and the infrastructure built around it named CodeChecker.
arXiv Detail & Related papers (2024-08-04T02:54:58Z)
LLMDFA: Analyzing Dataflow in Code with Large Language Models [8.92611389987991]
This paper presents LLMDFA, a compilation-free and customizable dataflow analysis framework. We decompose the problem into several subtasks and introduce a series of novel strategies. On average, LLMDFA achieves 87.10% precision and 80.77% recall, surpassing existing techniques with F1 score improvements of up to 0.35.
arXiv Detail & Related papers (2024-02-16T15:21:35Z)
ControlLLM: Augment Language Models with Tools by Searching on Graphs [97.62758830255002]
We present ControlLLM, a novel framework that enables large language models (LLMs) to utilize multi-modal tools for solving real-world tasks. Our framework comprises three key components: (1) a textittask decomposer that breaks down a complex task into clear subtasks with well-defined inputs and outputs; (2) a textitThoughts-on-Graph (ToG) paradigm that searches the optimal solution path on a pre-built tool graph; and (3) an textitexecution engine with a rich toolbox that interprets the solution path and runs the
arXiv Detail & Related papers (2023-10-26T21:57:21Z)
Guess & Sketch: Language Model Guided Transpilation [59.02147255276078]
Learned transpilation offers an alternative to manual re-writing and engineering efforts. Probabilistic neural language models (LMs) produce plausible outputs for every input, but do so at the cost of guaranteed correctness. Guess & Sketch extracts alignment and confidence information from features of the LM then passes it to a symbolic solver to resolve semantic equivalence.
arXiv Detail & Related papers (2023-09-25T15:42:18Z)
ART: Automatic multi-step reasoning and tool-use for large language models [105.57550426609396]
Large language models (LLMs) can perform complex reasoning in few- and zero-shot settings. Each reasoning step can rely on external tools to support computation beyond the core LLM capabilities. We introduce Automatic Reasoning and Tool-use (ART), a framework that uses frozen LLMs to automatically generate intermediate reasoning steps as a program.
arXiv Detail & Related papers (2023-03-16T01:04:45Z)
Top-Down Synthesis for Library Learning [46.285220926554345]
corpus-guided top-down synthesis is a mechanism for synthesizing library functions that capture common functionality from a corpus of programs. We present an implementation of the approach in a tool called Stitch and evaluate it against the state-of-the-art deductive library learning algorithm from DreamCoder.
arXiv Detail & Related papers (2022-11-29T21:57:42Z)
Comparative Code Structure Analysis using Deep Learning for Performance Prediction [18.226950022938954]
This paper aims to assess the feasibility of using purely static information (e.g., abstract syntax tree or AST) of applications to predict performance change based on the change in code structure. Our evaluations of several deep embedding learning methods demonstrate that tree-based Long Short-Term Memory (LSTM) models can leverage the hierarchical structure of source-code to discover latent representations and achieve up to 84% (individual problem) and 73% (combined dataset with multiple of problems) accuracy in predicting the change in performance.
arXiv Detail & Related papers (2021-02-12T16:59:12Z)

This list is automatically generated from the titles and abstracts of the papers in this site.