Detecting Continuous Integration Skip : A Reinforcement Learning-based Approach
- URL: http://arxiv.org/abs/2405.09657v1
- Date: Wed, 15 May 2024 18:48:57 GMT
- Title: Detecting Continuous Integration Skip : A Reinforcement Learning-based Approach
- Authors: Hajer Mhalla, Mohamed Aymen Saied
- Abstract summary: Continuous Integration (CI) practices facilitate the seamless integration of code changes by employing automated building and testing processes.
Some frameworks, such as Travis CI and GitHub Actions, have significantly contributed to simplifying and enhancing the CI process.
Developers continue to encounter difficulties in accurately flagging commits as either suitable for CI execution or as candidates for skipping.
- Score: 0.4297070083645049
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The software industry is experiencing a surge in the adoption of Continuous Integration (CI) practices, in both commercial and open-source environments. CI practices facilitate the seamless integration of code changes by employing automated building and testing processes. Some frameworks, such as Travis CI and GitHub Actions, have significantly contributed to simplifying and enhancing the CI process, rendering it more accessible and efficient for development teams. Despite the availability of these CI tools, developers continue to encounter difficulties in accurately flagging commits as either suitable for CI execution or as candidates for skipping, especially in large projects with many dependencies. Inaccurate flagging of commits can lead to resource-intensive test and build processes, as even minor commits may inadvertently trigger the Continuous Integration process. The problem of detecting CI-skip commits can be modeled as a binary classification task in which we decide either to build a commit or to skip it. This study proposes a novel solution that leverages Deep Reinforcement Learning techniques to construct an optimal Decision Tree classifier that addresses the imbalanced nature of the data. We evaluate our solution by running within-project and cross-project validation benchmarks on a diverse range of open-source projects hosted on GitHub; the results are superior to those of existing state-of-the-art methods.
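The binary framing in the abstract (build a commit or skip it, with skips as the minority class) can be illustrated with a very small sketch. This is not the paper's method: the authors use Deep Reinforcement Learning to construct the decision tree, whereas the code below is a plain greedy, class-weighted decision stump. The two features, the toy commits, and every function name are hypothetical and chosen only for illustration.

```python
from collections import Counter

# Hypothetical toy data, invented for illustration only.
# Features: (fraction of changed files that are documentation, source lines changed)
# Label: 1 = skip CI, 0 = build. Skips are the minority class.
COMMITS = [
    ((1.0, 0), 1),
    ((0.9, 2), 1),
    ((0.0, 40), 0),
    ((0.1, 15), 0),
    ((0.0, 120), 0),
    ((0.2, 8), 0),
    ((0.0, 60), 0),
    ((0.3, 25), 0),
]


def class_weights(labels):
    """Inverse-frequency weights, so each class carries equal total weight."""
    counts = Counter(labels)
    return {c: len(labels) / (len(counts) * k) for c, k in counts.items()}


def weighted_mass(labels, weights):
    """Total weight per class among the given labels."""
    mass = Counter()
    for y in labels:
        mass[y] += weights[y]
    return mass


def weighted_gini(labels, weights):
    """Gini impurity computed on class weights instead of raw counts."""
    mass = weighted_mass(labels, weights)
    total = sum(mass.values())
    if total == 0:
        return 0.0
    return 1.0 - sum((m / total) ** 2 for m in mass.values())


def heaviest(labels, weights):
    """Class with the largest total weight on one side of a split."""
    mass = weighted_mass(labels, weights)
    return max(mass, key=mass.get)


def fit_stump(data):
    """Greedy search for the single split with the lowest weighted impurity."""
    weights = class_weights([y for _, y in data])
    best = None
    for feature in range(len(data[0][0])):
        values = sorted({x[feature] for x, _ in data})
        # Candidate thresholds are midpoints between consecutive distinct values.
        for lo, hi in zip(values, values[1:]):
            threshold = (lo + hi) / 2
            left = [y for x, y in data if x[feature] <= threshold]
            right = [y for x, y in data if x[feature] > threshold]
            w_left = sum(weights[y] for y in left)
            w_right = sum(weights[y] for y in right)
            impurity = (
                w_left * weighted_gini(left, weights)
                + w_right * weighted_gini(right, weights)
            ) / (w_left + w_right)
            if best is None or impurity < best[0]:
                best = (impurity, feature, threshold, left, right)
    _, feature, threshold, left, right = best
    return feature, threshold, heaviest(left, weights), heaviest(right, weights)


def predict(stump, features):
    """Return 1 (skip CI) or 0 (build) for one commit's feature tuple."""
    feature, threshold, left_label, right_label = stump
    return left_label if features[feature] <= threshold else right_label


stump = fit_stump(COMMITS)
```

Calling `predict(stump, (0.95, 1))` on a mostly-documentation commit returns the skip label for this toy data. The class weighting is one simple way to keep the rare skip class from being ignored; the paper addresses the imbalance through its reinforcement learning formulation instead.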
Related papers
- CodeRAG-Bench: Can Retrieval Augment Code Generation? [78.37076502395699]
We conduct a systematic, large-scale analysis of code generation using retrieval-augmented generation.
We first curate a comprehensive evaluation benchmark, CodeRAG-Bench, encompassing three categories of code generation tasks.
We examine top-performing models on CodeRAG-Bench by providing contexts retrieved from one or multiple sources.
arXiv Detail & Related papers (2024-06-20T16:59:52Z)
- How to Understand Whole Software Repository? [64.19431011897515]
An excellent understanding of the whole repository will be the critical path to Automatic Software Engineering (ASE).
We develop a novel method named RepoUnderstander by guiding agents to comprehensively understand the whole repositories.
To better utilize the repository-level knowledge, we guide the agents to summarize, analyze, and plan.
arXiv Detail & Related papers (2024-06-03T15:20:06Z)
- AutoCodeRover: Autonomous Program Improvement [8.66280420062806]
We propose an automated approach for solving GitHub issues to autonomously achieve program improvement.
In our approach called AutoCodeRover, LLMs are combined with sophisticated code search capabilities, ultimately leading to a program modification or patch.
Experiments on SWE-bench-lite (300 real-life GitHub issues) show increased efficacy in solving GitHub issues (19% on SWE-bench-lite), which is higher than the efficacy of the recently reported SWE-agent.
arXiv Detail & Related papers (2024-04-08T11:55:09Z)
- Enhancing LLM-based Test Generation for Hard-to-Cover Branches via Program Analysis [8.31978033489419]
We propose TELPA, a novel technique to generate tests that can reach hard-to-cover branches.
Our experimental results on 27 open-source Python projects demonstrate that TELPA significantly outperforms the state-of-the-art SBST and LLM-based techniques.
arXiv Detail & Related papers (2024-04-07T14:08:28Z)
- RCAgent: Cloud Root Cause Analysis by Autonomous Agents with Tool-Augmented Large Language Models [52.352418867917194]
Large language model (LLM) applications in cloud root cause analysis (RCA) have been actively explored recently.
We present RCAgent, a tool-augmented LLM autonomous agent framework for practical and privacy-aware industrial RCA usage.
Running on an internally deployed model rather than GPT families, RCAgent is capable of free-form data collection and comprehensive analysis with tools.
arXiv Detail & Related papers (2023-10-25T03:53:31Z)
- CRAFT: Customizing LLMs by Creating and Retrieving from Specialized Toolsets [75.64181719386497]
We present CRAFT, a tool creation and retrieval framework for large language models (LLMs).
It creates toolsets specifically curated for the tasks and equips LLMs with a component that retrieves tools from these sets to enhance their capability to solve complex tasks.
Our method is designed to be flexible and offers a plug-and-play approach to adapt off-the-shelf LLMs to unseen domains and modalities, without any finetuning.
arXiv Detail & Related papers (2023-09-29T17:40:26Z)
- Toward Automatically Completing GitHub Workflows [16.302521048148748]
We present GH-WCOM (GitHub Workflow COMpletion), a Transformer-based approach supporting developers in writing a specific type of CI/CD pipeline, namely GitHub workflows.
Our empirical study shows that GH-WCOM provides up to 34.23% correct predictions.
arXiv Detail & Related papers (2023-08-31T14:53:00Z)
- The GitHub Development Workflow Automation Ecosystems [47.818229204130596]
Large-scale software development has become a highly collaborative endeavour.
This chapter explores the ecosystems of development bots and GitHub Actions.
It provides an extensive survey of the state-of-the-art in this domain.
arXiv Detail & Related papers (2023-05-08T15:24:23Z)
- MARLIN: Soft Actor-Critic based Reinforcement Learning for Congestion Control in Real Networks [63.24965775030673]
We propose a novel Reinforcement Learning (RL) approach to design generic Congestion Control (CC) algorithms.
Our solution, MARLIN, uses the Soft Actor-Critic algorithm to maximize both entropy and return.
We trained MARLIN on a real network with varying background traffic patterns to overcome the sim-to-real mismatch.
arXiv Detail & Related papers (2023-02-02T18:27:20Z)
- A Brief Survey of Current Software Engineering Practices in Continuous Integration and Automated Accessibility Testing [0.0]
This paper attempts to map the current scene of the software engineering effort spent on improving accessibility via continuous integration.
We also try to explore steps that agile teams may take to train members on how to implement accessibility testing.
arXiv Detail & Related papers (2021-02-27T01:13:43Z)
- Reinforcement Learning for Test Case Prioritization [0.24366811507669126]
This paper extends recent studies on applying Reinforcement Learning to optimize testing strategies.
We test its ability to adapt to new environments, by testing it on novel data extracted from a financial institution.
We also studied the impact of using Decision Tree (DT) Approximator as a model for memory representation.
arXiv Detail & Related papers (2020-12-18T11:08:20Z)