Related papers: "Good" and "Bad" Failures in Industrial CI/CD -- Balancing Cost and Quality Assurance

URL: http://arxiv.org/abs/2504.11839v1
Date: Wed, 16 Apr 2025 07:56:36 GMT
Title: "Good" and "Bad" Failures in Industrial CI/CD -- Balancing Cost and Quality Assurance
Authors: Simin Sun, David Friberg, Miroslaw Staron
Abstract summary: Continuous Integration and Continuous Deployment (CI/CD) pipeline automates software development to speed up and enhance the efficiency of engineering software. Our findings reveal that organizations can confuse the distinction between CI and CD, whereas code merge and product release serve as more effective milestones for process optimization and risk control.
Score: 1.8570591025615453
License: http://creativecommons.org/licenses/by-nc-nd/4.0/
Abstract: Continuous Integration and Continuous Deployment (CI/CD) pipeline automates software development to speed up and enhance the efficiency of engineering software. These workflows consist of various jobs, such as code validation and testing, which developers must wait to complete before receiving feedback. The jobs can fail, which leads to unnecessary delays in build times, decreasing productivity for developers, and increasing costs for companies. To explore how companies adopt CI/CD workflows and balance cost with quality assurance during optimization, we studied 4 companies, reporting industry experiences with CI/CD practices. Our findings reveal that organizations can confuse the distinction between CI and CD, whereas code merge and product release serve as more effective milestones for process optimization and risk control. While numerous tools and research efforts target the post-merge phase to enhance productivity, limited attention has been given to the pre-merge phase, where early failure prevention brings more impact and less risk.

Related papers

ORMind: A Cognitive-Inspired End-to-End Reasoning Framework for Operations Research [53.736407871322314]
We introduce ORMind, a cognitive-inspired framework that enhances optimization through counterfactual reasoning. Our approach emulates human cognition, implementing an end-to-end workflow that transforms requirements into mathematical models and executable code. It is currently being tested internally in Lenovo's AI Assistant, with plans to enhance optimization capabilities for both business and consumer customers.
arXiv Detail & Related papers (2025-06-02T05:11:21Z)
Training Language Models to Generate Quality Code with Program Analysis Feedback [66.0854002147103]
Code generation with large language models (LLMs) is increasingly adopted in production but fails to ensure code quality. We propose REAL, a reinforcement learning framework that incentivizes LLMs to generate production-quality code.
arXiv Detail & Related papers (2025-05-28T17:57:47Z)
Code Improvement Practices at Meta [11.3591598115242]
We investigate Meta's practices by collaborating with engineers on code quality. We analyze rich source code change history to reveal a range of practices used for continual improvement of the. Our analysis of the impact of reengineering activities revealed substantial improvements in quality and speed.
arXiv Detail & Related papers (2025-04-16T22:30:54Z)
Thinking Longer, Not Larger: Enhancing Software Engineering Agents via Scaling Test-Time Compute [61.00662702026523]
We propose a unified Test-Time Compute scaling framework that leverages increased inference-time instead of larger models. Our framework incorporates two complementary strategies: internal TTC and external TTC. We demonstrate our 32B model achieves a 46% issue resolution rate, surpassing significantly larger models such as DeepSeek R1 671B and OpenAI o1.
arXiv Detail & Related papers (2025-03-31T07:31:32Z)
The Role of DevOps in Enhancing Enterprise Software Delivery Success through R&D Efficiency and Source Code Management [0.4532517021515834]
This study focuses on enhancing R&D efficiency and source code management (SCM) for software delivery success. Using a qualitative methodology, data were collected from case studies of large-scale enterprises implementing DevOps.
arXiv Detail & Related papers (2024-11-04T16:01:43Z)
Codev-Bench: How Do LLMs Understand Developer-Centric Code Completion? [60.84912551069379]
We present the Code-Development Benchmark (Codev-Bench), a fine-grained, real-world, repository-level, and developer-centric evaluation framework. Codev-Agent is an agent-based system that automates repository crawling, constructs execution environments, extracts dynamic calling chains from existing unit tests, and generates new test samples to avoid data leakage.
arXiv Detail & Related papers (2024-10-02T09:11:10Z)
The Hidden Costs of Automation: An Empirical Study on GitHub Actions Workflow Maintenance [45.53834452021771]
GitHub Actions (GA) is an orchestration platform that streamlines the automatic execution of engineering tasks. Human intervention is necessary to correct defects, update dependencies, or existing workflow files.
arXiv Detail & Related papers (2024-09-04T01:33:16Z)
Agent-Driven Automatic Software Improvement [55.2480439325792]
This research proposal aims to explore innovative solutions by focusing on the deployment of agents powered by Large Language Models (LLMs). The iterative nature of agents, which allows for continuous learning and adaptation, can help surpass common challenges in code generation. We aim to use the iterative feedback in these systems to further fine-tune the LLMs underlying the agents, becoming better aligned to the task of automated software improvement.
arXiv Detail & Related papers (2024-06-24T15:45:22Z)
SOEN-101: Code Generation by Emulating Software Process Models Using Large Language Model Agents [50.82665351100067]
FlowGen is a code generation framework that emulates software process models based on multiple Large Language Model (LLM) agents. We evaluate FlowGenScrum on four benchmarks: HumanEval, HumanEval-ET, MBPP, and MBPP-ET.
arXiv Detail & Related papers (2024-03-23T14:04:48Z)
Embedded Software Development with Digital Twins: Specific Requirements for Small and Medium-Sized Enterprises [55.57032418885258]
Digital twins have the potential for cost-effective software development and maintenance strategies. We interviewed SMEs about their current development processes. First results show that real-time requirements prevent, to date, a Software-in-the-Loop development approach.
arXiv Detail & Related papers (2023-09-17T08:56:36Z)
Quality Engineering for Agile and DevOps on the Cloud and Edge [0.8521132000449767]
Software delivery has to be more agile now than ever before. This book addresses the need of effectively embedding quality engineering throughout the agile development cycle.
arXiv Detail & Related papers (2023-02-07T18:03:38Z)
SUPERNOVA: Automating Test Selection and Defect Prevention in AAA Video Games Using Risk Based Testing and Machine Learning [62.997667081978825]
Testing video games is an increasingly difficult task as traditional methods fail to scale with growing software systems. We present SUPERNOVA, a system responsible for test selection and defect prevention while also functioning as an automation hub. The direct impact of this has been observed to be a reduction of 55% or more in testing hours for an undisclosed sports game title.
arXiv Detail & Related papers (2022-03-10T00:47:46Z)
Reinforcement Learning for Test Case Prioritization [0.24366811507669126]
This paper extends recent studies on applying Reinforcement Learning to optimize testing strategies. We test its ability to adapt to new environments, by testing it on novel data extracted from a financial institution. We also studied the impact of using Decision Tree (DT) Approximator as a model for memory representation.
arXiv Detail & Related papers (2020-12-18T11:08:20Z)

This list is automatically generated from the titles and abstracts of the papers on this site.