PACE: A Program Analysis Framework for Continuous Performance Prediction
- URL: http://arxiv.org/abs/2312.00918v1
- Date: Fri, 1 Dec 2023 20:43:34 GMT
- Title: PACE: A Program Analysis Framework for Continuous Performance Prediction
- Authors: Chidera Biringa and Gokhan Kul
- Abstract summary: PACE is a program analysis framework that provides continuous feedback on the performance impact of pending code updates.
We design performance microbenchmarks by mapping the execution time of functional test cases given a code update.
Our experiments show strong accuracy in predicting code performance, outperforming the current state-of-the-art by 75% on neural-represented code stylometry features.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Software development teams establish elaborate continuous integration
pipelines containing automated test cases to accelerate software development.
Automated tests help to verify the correctness of code modifications,
decreasing the response time to changing requirements. However,
when the software teams do not track the performance impact of pending
modifications, they may need to spend considerable time refactoring existing
code. This paper presents PACE, a program analysis framework that provides
continuous feedback on the performance impact of pending code updates. We
design performance microbenchmarks by mapping the execution time of functional
test cases given a code update. We map microbenchmarks to code stylometry
features and feed them to predictors for performance predictions. Our
experiments show strong accuracy in predicting code performance, outperforming
the current state-of-the-art by 75% on neural-represented code stylometry
features.
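As a rough illustration of the pipeline the abstract describes, the sketch below maps a handful of code-stylometry-style features to a regressor that predicts test execution time. The feature set, the gradient-boosted regressor, and the toy data are assumptions made for illustration; they are not PACE's actual features, models, or benchmarks.
```python
# Minimal sketch: predict microbenchmark execution time from simple
# code-stylometry-style features. The features, model choice, and data
# below are illustrative assumptions, not PACE's actual pipeline.
import ast
from sklearn.ensemble import GradientBoostingRegressor

def stylometry_features(source: str) -> list:
    """Extract a few coarse stylometry-style features from Python source."""
    tree = ast.parse(source)
    nodes = list(ast.walk(tree))
    lines = source.splitlines() or [""]
    return [
        len(nodes),                                               # AST size
        sum(isinstance(n, ast.FunctionDef) for n in nodes),       # function count
        sum(isinstance(n, (ast.For, ast.While)) for n in nodes),  # loop count
        sum(isinstance(n, ast.Call) for n in nodes),              # call count
        sum(len(l) for l in lines) / len(lines),                  # avg line length
    ]

# One feature vector per code update, labeled with the execution time
# (seconds) of its functional test cases -- the "microbenchmark".
updates = [
    "def f(xs):\n    return sum(x * x for x in xs)\n",
    "def g(xs):\n    out = []\n    for x in xs:\n        out.append(x * x)\n    return out\n",
    "def h(xs):\n    return sorted(x * x for x in xs)\n",
]
test_times = [0.8, 1.3, 1.1]  # hypothetical measured test-suite times

model = GradientBoostingRegressor().fit(
    [stylometry_features(u) for u in updates], test_times)

pending = "def k(xs):\n    return [x * x for x in xs if x > 0]\n"
print(f"Predicted test time: {model.predict([stylometry_features(pending)])[0]:.2f}s")
```
In PACE's terms, the measured test-suite time plays the role of the microbenchmark label, and any off-the-shelf regressor could stand in for the predictors mentioned in the abstract.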
Related papers
- Tracing Optimization for Performance Modeling and Regression Detection [15.99435412859094]
A performance model analytically describes the relationship between the performance of a system and its runtime activities.
We propose statistical approaches to reduce tracing overhead by identifying and excluding performance-insensitive code regions.
Our approach is fully automated, making it ready for use in production environments with minimal human effort (a rough illustrative sketch follows below).
arXiv Detail & Related papers (2024-11-26T16:11:55Z)
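As a rough sketch of what statistically identifying performance-insensitive code regions could look like, the example below keeps tracing only for regions whose timings vary noticeably across profiled runs. The coefficient-of-variation rule, region names, numbers, and threshold are illustrative assumptions, not the paper's actual method.
```python
# Illustrative sketch: keep tracing only for code regions whose timings vary
# noticeably across runs, and drop statistically stable (performance-
# insensitive) regions. Region names, numbers, and threshold are made up.
import statistics

# Per-region execution times (ms) sampled from several profiled runs.
samples = {
    "parse_request":  [2.10, 2.00, 2.20, 2.10],
    "query_database": [40.5, 61.2, 38.9, 75.4],
    "format_header":  [0.30, 0.31, 0.30, 0.30],
}

def regions_to_trace(samples, cv_threshold=0.10):
    """Keep regions whose coefficient of variation exceeds the threshold."""
    keep = []
    for region, times in samples.items():
        cv = statistics.stdev(times) / statistics.mean(times)
        if cv >= cv_threshold:
            keep.append(region)
    return keep

print(regions_to_trace(samples))  # -> ['query_database']
```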
- Codev-Bench: How Do LLMs Understand Developer-Centric Code Completion? [60.84912551069379]
We present the Code-Development Benchmark (Codev-Bench), a fine-grained, real-world, repository-level, and developer-centric evaluation framework.
Codev-Agent is an agent-based system that automates repository crawling, constructs execution environments, extracts dynamic calling chains from existing unit tests, and generates new test samples to avoid data leakage.
arXiv Detail & Related papers (2024-10-02T09:11:10Z)
- Which Combination of Test Metrics Can Predict Success of a Software Project? A Case Study in a Year-Long Project Course [1.553083901660282]
Testing plays an important role in securing the success of a software development project.
We investigate whether we can quantify the effects various types of testing have on functional suitability.
arXiv Detail & Related papers (2024-08-22T04:23:51Z)
- AI-driven Java Performance Testing: Balancing Result Quality with Testing Time [0.40964539027092917]
We propose and study an AI-based framework to dynamically halt warm-up iterations at runtime.
Our framework significantly improves the accuracy of the warm-up estimates provided by state-of-practice and state-of-the-art methods.
Our study highlights that integrating AI to dynamically estimate the end of the warm-up phase can enhance the cost-effectiveness of Java performance testing (a simplified sketch of such a stopping rule follows below).
arXiv Detail & Related papers (2024-08-09T14:41:32Z)
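As a simplified illustration of dynamically halting warm-up, the sketch below stops the warm-up phase once a sliding window of iteration timings stabilizes and then measures steady-state performance. The coefficient-of-variation stopping rule, window size, and threshold are assumptions standing in for the paper's AI-based estimator.
```python
# Illustrative sketch: stop benchmark warm-up once recent iteration timings
# stabilize, then measure steady-state performance. The coefficient-of-
# variation rule is an assumed stand-in for the paper's AI-based estimator.
import statistics
import time

def run_with_dynamic_warmup(benchmark, window=10, cv_threshold=0.02,
                            max_warmup=500, measure_iters=30):
    """Warm `benchmark` up until the last `window` timings look stable,
    then return the mean of `measure_iters` steady-state timings."""
    warmup_times = []
    for _ in range(max_warmup):
        start = time.perf_counter()
        benchmark()
        warmup_times.append(time.perf_counter() - start)
        if len(warmup_times) >= window:
            recent = warmup_times[-window:]
            if statistics.stdev(recent) / statistics.mean(recent) < cv_threshold:
                break  # warm-up phase considered finished
    measured = []
    for _ in range(measure_iters):
        start = time.perf_counter()
        benchmark()
        measured.append(time.perf_counter() - start)
    return statistics.mean(measured)

# Toy workload standing in for a real (e.g. JMH-style) benchmark method.
print(f"{run_with_dynamic_warmup(lambda: sum(i * i for i in range(10_000))):.6f}s")
```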
- RepoMasterEval: Evaluating Code Completion via Real-World Repositories [12.176098357240095]
RepoMasterEval is a novel benchmark for evaluating code completion models, constructed from real-world Python and TypeScript repositories.
To improve the test accuracy of model-generated code, we employ mutation testing to measure the effectiveness of the test cases (a toy illustration of mutation testing follows below).
Our empirical evaluation on 6 state-of-the-art models shows that test augmentation is critical in improving the accuracy of the benchmark.
arXiv Detail & Related papers (2024-08-07T03:06:57Z)
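As a toy illustration of using mutation testing to gauge test effectiveness, the sketch below flips one arithmetic operator in the code under test and checks whether a tiny test suite kills the mutant. The function, the mutation operator, and the test are made-up examples and far simpler than the benchmark's actual tooling.
```python
# Toy illustration of mutation testing: if the test suite still passes after
# the code is mutated, the tests are too weak to catch that defect.
# The function, the mutation, and the test below are made-up examples.
ORIGINAL = "def discount(price, pct):\n    return price - price * pct\n"
MUTANT = ORIGINAL.replace("-", "+")  # mutate the arithmetic operator

def run_tests(source: str) -> bool:
    """Return True if the (tiny) test suite passes against `source`."""
    namespace = {}
    exec(source, namespace)          # load the implementation under test
    discount = namespace["discount"]
    try:
        assert discount(100, 0.2) == 80
        return True
    except AssertionError:
        return False

assert run_tests(ORIGINAL)           # the suite passes on the original code
killed = not run_tests(MUTANT)       # an effective suite should fail the mutant
print("mutant killed" if killed else "mutant survived: tests are too weak")
```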
- NExT: Teaching Large Language Models to Reason about Code Execution [50.93581376646064]
Large language models (LLMs) of code are typically trained on the surface textual form of programs.
We propose NExT, a method to teach LLMs to inspect the execution traces of programs and reason about their run-time behavior.
arXiv Detail & Related papers (2024-04-23T01:46:32Z)
- FuzzyFlow: Leveraging Dataflow To Find and Squash Program Optimization Bugs [92.47146416628965]
FuzzyFlow is a fault localization and test case extraction framework designed to test program optimizations.
We leverage dataflow program representations to capture a fully reproducible system state and area-of-effect for optimizations.
To reduce testing time, we design an algorithm for minimizing test inputs, trading off memory for recomputation.
arXiv Detail & Related papers (2023-06-28T13:00:17Z)
- Teaching Large Language Models to Self-Debug [62.424077000154945]
Large language models (LLMs) have achieved impressive performance on code generation.
We propose Self-Debugging, which teaches a large language model to debug its predicted program via few-shot demonstrations.
arXiv Detail & Related papers (2023-04-11T10:43:43Z)
- Performance Embeddings: A Similarity-based Approach to Automatic Performance Optimization [71.69092462147292]
Performance embeddings enable knowledge transfer of performance tuning between applications.
We demonstrate this transfer tuning approach on case studies in deep neural networks, dense and sparse linear algebra compositions, and numerical weather prediction stencils.
arXiv Detail & Related papers (2023-03-14T15:51:35Z)
- Learning Performance-Improving Code Edits [107.21538852090208]
We introduce a framework for adapting large language models (LLMs) to high-level program optimization.
First, we curate a dataset of over 77,000 pairs of competitive C++ programming submissions, capturing performance-improving edits made by human programmers.
For prompting, we propose retrieval-based few-shot prompting and chain-of-thought; for finetuning, we use performance-conditioned generation and synthetic data augmentation based on self-play.
arXiv Detail & Related papers (2023-02-15T18:59:21Z)
This list is automatically generated from the titles and abstracts of the papers on this site.