Moving beyond Deletions: Program Simplification via Diverse Program
Transformations
- URL: http://arxiv.org/abs/2401.15234v1
- Date: Fri, 26 Jan 2024 22:59:43 GMT
- Title: Moving beyond Deletions: Program Simplification via Diverse Program
Transformations
- Authors: Haibo Wang, Zezhong Xing, Zheng Wang, Chengnian Sun, Shin Hwei Tan
- Abstract summary: Developers manually simplify programs (known as developer-induced program simplification in this paper) to reduce code size while preserving functionality.
To reduce manual effort, rule-based approaches (e.g., refactoring) and deletion-based approaches (e.g., delta debugging) could potentially be applied to automate developer-induced program simplification.
We propose SimpT5, a tool that can automatically produce simplified programs (semantically equivalent programs with fewer source lines of code).
Our evaluation shows that SimpT5 is more effective than prior approaches in automating developer-induced program simplification.
- Score: 11.038120567076772
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: To reduce the complexity of software, developers manually simplify programs
(known as developer-induced program simplification in this paper) to reduce
code size while preserving functionality, but manual simplification is
time-consuming and error-prone. To reduce manual effort, rule-based approaches
(e.g., refactoring) and deletion-based approaches (e.g., delta debugging) could
potentially be applied to automate developer-induced program simplification.
However, as there is little study on how developers simplify programs in
Open-source Software (OSS) projects, it is unclear whether these approaches can
be effectively used for developer-induced program simplification. Hence, we
present the first study of developer-induced program simplification in OSS
projects, focusing on the types of program transformations used, the
motivations behind simplifications, and the set of program transformations
covered by existing refactoring types. Our study of 382 pull requests from 296
projects reveals that there exist gaps in applying existing approaches for
automating developer-induced program simplification and outlines the criteria
for designing automatic program simplification techniques. Inspired by our
study and to reduce the manual effort in developer-induced program
simplification, we propose SimpT5, a tool that can automatically produce
simplified programs (semantically-equivalent programs with reduced source lines
of code). SimpT5 is trained on our collected dataset of 92,485 simplified
programs with two heuristics: (1) simplified line localization that encodes
lines changed in simplified programs, and (2) checkers that measure the quality
of generated programs. Our evaluation shows that SimpT5 is more effective than
prior approaches in automating developer-induced program simplification.
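
The two heuristics named in the abstract can be illustrated with a minimal Python sketch. This is not SimpT5's implementation: the `<chg>` tag, the function names `mark_changed_lines`, `sloc`, and `passes_size_checker`, and the choice to count non-blank lines as source lines are all assumptions made for illustration. The sketch marks which lines of an original program a simplification touches, and checks only the size criterion (fewer source lines); a semantic-equivalence check, such as running a test suite, is omitted.

```python
import difflib


def sloc(source: str) -> int:
    """Count non-blank lines (an assumed, simplified notion of source lines of code)."""
    return sum(1 for line in source.splitlines() if line.strip())


def mark_changed_lines(original: str, simplified: str, tag: str = "<chg>") -> str:
    """Prefix every original line that is replaced or deleted by the simplification.

    The tag is a hypothetical encoding, not the one used by SimpT5.
    """
    orig_lines = original.splitlines()
    simp_lines = simplified.splitlines()
    matcher = difflib.SequenceMatcher(a=orig_lines, b=simp_lines, autojunk=False)
    changed = set()
    for op, i1, i2, _j1, _j2 in matcher.get_opcodes():
        if op in ("replace", "delete"):
            changed.update(range(i1, i2))
    return "\n".join(
        f"{tag} {line}" if i in changed else line
        for i, line in enumerate(orig_lines)
    )


def passes_size_checker(original: str, simplified: str) -> bool:
    """Accept a candidate only if it has strictly fewer source lines than the original."""
    return sloc(simplified) < sloc(original)


# Toy example pair (hypothetical, not taken from the paper's dataset).
original = (
    "def f(xs):\n"
    "    total = 0\n"
    "    for x in xs:\n"
    "        total += x\n"
    "    return total\n"
)
simplified = (
    "def f(xs):\n"
    "    return sum(xs)\n"
)

print(mark_changed_lines(original, simplified))
print(passes_size_checker(original, simplified))
```

In a learned setting, the tagged program would serve as model input so that the changed lines are visible to the model, and the checker would filter generated candidates after decoding.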
Related papers
- Guided Sketch-Based Program Induction by Search Gradients [0.0]
We propose a framework for learning parameterized programs via search gradients using evolution strategies.
This formulation departs from traditional program induction as it allows the programmer to impart task-specific code to the program 'sketch'.
arXiv Detail & Related papers (2024-02-10T16:47:53Z)
- Refactoring Programs Using Large Language Models with Few-Shot Examples [20.48175387745551]
We demonstrate the use of a large language model (LLM), GPT-3.5, to suggest less complex versions of user-written Python programs.
We show that 95.68% of programs can be refactored by generating 10 candidates each, resulting in a 17.35% reduction in the average cyclomatic complexity.
arXiv Detail & Related papers (2023-11-20T11:43:45Z)
- Hierarchical Programmatic Reinforcement Learning via Learning to Compose Programs [58.94569213396991]
We propose a hierarchical programmatic reinforcement learning framework to produce program policies.
By learning to compose programs, our proposed framework can produce program policies that describe out-of-distributionally complex behaviors.
The experimental results in the Karel domain show that our proposed framework outperforms baselines.
arXiv Detail & Related papers (2023-01-30T14:50:46Z)
- NAPG: Non-Autoregressive Program Generation for Hybrid Tabular-Textual Question Answering [52.10214317661547]
Current numerical reasoning methods autoregressively decode program sequences.
The accuracy of program generation drops sharply as the decoding steps unfold due to error propagation.
In this paper, we propose a non-autoregressive program generation framework.
arXiv Detail & Related papers (2022-11-07T11:25:21Z)
- CodeRL: Mastering Code Generation through Pretrained Models and Deep Reinforcement Learning [92.36705236706678]
"CodeRL" is a new framework for program synthesis tasks through pretrained LMs and deep reinforcement learning.
During inference, we introduce a new generation procedure with a critical sampling strategy.
For the model backbones, we extended the encoder-decoder architecture of CodeT5 with enhanced learning objectives.
arXiv Detail & Related papers (2022-07-05T02:42:15Z)
- Learning from Self-Sampled Correct and Partially-Correct Programs [96.66452896657991]
We propose to let the model perform sampling during training and learn from both self-sampled fully-correct programs and partially-correct programs.
We show that our use of self-sampled correct and partially-correct programs can benefit learning and help guide the sampling process.
Our proposed method improves the pass@k performance by 3.1% to 12.3% compared to learning from a single reference program with MLE.
arXiv Detail & Related papers (2022-05-28T03:31:07Z)
- Programming with Neural Surrogates of Programs [17.259433118432757]
We study three surrogate-based design patterns, evaluating each in case studies on a large-scale CPU simulator.
With surrogate compilation, programmers develop a surrogate that mimics the behavior of a program to deploy to end-users.
With surrogate adaptation, programmers develop a surrogate of a program then retrain that surrogate on a different task.
With surrogate optimization, programmers develop a surrogate of a program, optimize input parameters of the surrogate, then plug the optimized input parameters back into the original program.
arXiv Detail & Related papers (2021-12-12T04:45:41Z)
- Procedures as Programs: Hierarchical Control of Situated Agents through Natural Language [81.73820295186727]
We propose a formalism of procedures as programs, a powerful yet intuitive method of representing hierarchical procedural knowledge for agent command and control.
We instantiate this framework on the IQA and ALFRED datasets for NL instruction following.
arXiv Detail & Related papers (2021-09-16T20:36:21Z)
- Incremental maintenance of overgrounded logic programs with tailored simplifications [0.966840768820136]
We introduce a new strategy for generating series of monotonically growing propositional programs.
With respect to earlier approaches, our tailored simplification technique reduces the size of instantiated programs.
arXiv Detail & Related papers (2020-08-06T21:50:11Z)
- Synthesize, Execute and Debug: Learning to Repair for Neural Program Synthesis [81.54148730967394]
We propose SED, a neural program generation framework that incorporates synthesis, execution, and debug stages.
SED first produces initial programs using the neural program synthesizer component, then utilizes a neural program debugger to iteratively repair the generated programs.
On Karel, a challenging input-output program synthesis benchmark, SED reduces the error rate of the neural program synthesizer itself by a considerable margin, and outperforms the standard beam search for decoding.
arXiv Detail & Related papers (2020-07-16T04:15:47Z)
This list is automatically generated from the titles and abstracts of the papers in this site.