Related papers: Practitioners' Challenges and Perceptions of CI Build Failure Predictions at Atlassian

Related papers

Lessons from a Big-Bang Integration: Challenges in Edge Computing and Machine Learning [52.86213078016168]
The project faced critical setbacks due to a big-bang integration approach. The study identifies technical and organisational barriers, including poor communication. It also considers psychological factors such as a bias toward fully developed components over mockups.
arXiv Detail & Related papers (2025-07-23T07:16:45Z)
ORMind: A Cognitive-Inspired End-to-End Reasoning Framework for Operations Research [53.736407871322314]
We introduce ORMind, a cognitive-inspired framework that enhances optimization through counterfactual reasoning. Our approach emulates human cognition, implementing an end-to-end workflow that transforms requirements into mathematical models and executable code. It is currently being tested internally in Lenovo's AI Assistant, with plans to enhance optimization capabilities for both business and consumer customers.
arXiv Detail & Related papers (2025-06-02T05:11:21Z)
CXXCrafter: An LLM-Based Agent for Automated C/C++ Open Source Software Building [14.687126587793028]
Building C/C++ projects often proves to be difficult in practice, hindering the progress of downstream applications. We develop an automated build system called CXXCrafter to address the challenges, such as dependency resolution. Our evaluation on open-source software demonstrates that CXXCrafter achieves a success rate of 78% in project building.
arXiv Detail & Related papers (2025-05-27T11:54:56Z)
Evaluating Large Language Models for Real-World Engineering Tasks [75.97299249823972]
This paper introduces a curated database comprising over 100 questions derived from authentic, production-oriented engineering scenarios. Using this dataset, we evaluate four state-of-the-art Large Language Models (LLMs). Our results show that LLMs demonstrate strengths in basic temporal and structural reasoning but struggle significantly with abstract reasoning, formal modeling, and context-sensitive engineering logic.
arXiv Detail & Related papers (2025-05-12T14:05:23Z)
Thinking Longer, Not Larger: Enhancing Software Engineering Agents via Scaling Test-Time Compute [61.00662702026523]
We propose a unified Test-Time Compute (TTC) scaling framework that leverages increased inference-time computation instead of larger models. Our framework incorporates two complementary strategies: internal TTC and external TTC. We demonstrate that our 32B model achieves a 46% issue resolution rate, surpassing significantly larger models such as DeepSeek R1 671B and OpenAI o1.
arXiv Detail & Related papers (2025-03-31T07:31:32Z)
AttackSeqBench: Benchmarking Large Language Models' Understanding of Sequential Patterns in Cyber Attacks [13.082370325093242]
We introduce AttackSeqBench, a benchmark to evaluate Large Language Models' (LLMs) capability to understand and reason about attack sequences in Cyber Threat Intelligence (CTI) reports. Our benchmark encompasses three distinct Question Answering (QA) tasks, each focusing on a different granularity of adversarial behavior. We conduct extensive experiments and analysis with both fast-thinking and slow-thinking LLMs, highlighting their strengths and limitations in analyzing the sequential patterns in cyber attacks.
arXiv Detail & Related papers (2025-03-05T04:25:21Z)
Understanding User Mental Models in AI-Driven Code Completion Tools: Insights from an Elicitation Study [5.534104886050636]
We conduct an elicitation study with 56 developers using focus groups to elicit their mental models when interacting with AI-powered code completion tools (CCTs). The study findings provide actionable insights for designing human-centered CCTs that align with user expectations, enhance satisfaction and productivity, and foster trust in AI-powered development tools. We also develop ATHENA, a proof-of-concept CCT that dynamically adapts to developers' coding preferences and environments, ensuring seamless integration into diverse environments.
arXiv Detail & Related papers (2025-02-04T10:20:49Z)
Build Optimization: A Systematic Literature Review [0.0]
Continuous Integration (CI) consists of an automated build process involving continuous compilation, testing, and packaging of the software system. To better understand the literature so as to help practitioners find solutions for their problems and guide future research, we conduct a systematic review of 97 studies on build optimization published between 2006 and 2024. The identified build optimization studies focus on two main challenges: (1) long build durations, and (2) build failures.
arXiv Detail & Related papers (2025-01-21T07:32:06Z)
Active Causal Structure Learning with Latent Variables: Towards Learning to Detour in Autonomous Robots [49.1574468325115]
Artificial General Intelligence (AGI) Agents and Robots must be able to cope with ever-changing environments and tasks. We claim that active causal structure learning with latent variables (ACSLWL) is a necessary component to build AGI agents and robots.
arXiv Detail & Related papers (2024-10-28T10:21:26Z)
Can We Further Elicit Reasoning in LLMs? Critic-Guided Planning with Retrieval-Augmentation for Solving Challenging Tasks [68.49251303172674]
State-of-the-art large language models (LLMs) exhibit impressive problem-solving capabilities but may struggle with complex reasoning and factual correctness. Existing methods harness the strengths of chain-of-thought and retrieval-augmented generation (RAG) to decompose a complex problem into simpler steps and apply retrieval to improve factual correctness. We introduce Critic-guided planning with Retrieval-augmentation, CR-Planner, a novel framework that leverages fine-tuned critic models to guide both reasoning and retrieval processes through planning.
arXiv Detail & Related papers (2024-10-02T11:26:02Z)
Data Analysis in the Era of Generative AI [56.44807642944589]
This paper explores the potential of AI-powered tools to reshape data analysis, focusing on design considerations and challenges. We explore how the emergence of large language and multimodal models offers new opportunities to enhance various stages of data analysis workflow. We then examine human-centered design principles that facilitate intuitive interactions, build user trust, and streamline the AI-assisted analysis workflow across multiple apps.
arXiv Detail & Related papers (2024-09-27T06:31:03Z)
Making sense of AI systems development [3.6141428739228894]
We describe challenges in modern AI-based systems development that emerged in projects carried out by IBM and client companies. Many issues bear upon the current-generation AI's inherent characteristics. Those characteristics increase the complexity of the projects and call for balanced mindfulness to avoid unexpected problems.
arXiv Detail & Related papers (2024-08-08T08:46:32Z)
Detecting Continuous Integration Skip: A Reinforcement Learning-based Approach [0.4297070083645049]
Continuous Integration (CI) practices facilitate the seamless integration of code changes by employing automated building and testing processes. Some frameworks, such as Travis CI and GitHub Actions, have significantly contributed to simplifying and enhancing the CI process. Developers continue to encounter difficulties in accurately flagging commits as either suitable for CI execution or as candidates for skipping.
arXiv Detail & Related papers (2024-05-15T18:48:57Z)
Characterization of Large Language Model Development in the Datacenter [55.9909258342639]
Large Language Models (LLMs) have presented impressive performance across several transformative tasks. However, it is non-trivial to efficiently utilize large-scale cluster resources to develop LLMs. We present an in-depth characterization study of a six-month LLM development workload trace collected from our GPU datacenter Acme.
arXiv Detail & Related papers (2024-03-12T13:31:14Z)
Continuous Integration and Software Quality: A Causal Explanatory Study [0.46040036610482665]
Continuous Integration (CI) is a software engineering practice that aims to reduce the cost and risk of code integration among teams. Recent empirical studies have confirmed associations between CI and software quality (SQ).
arXiv Detail & Related papers (2023-09-18T23:10:34Z)
The RoboDepth Challenge: Methods and Advancements Towards Robust Depth Estimation [97.63185634482552]
We summarize the winning solutions from the RoboDepth Challenge. The challenge was designed to facilitate and advance robust OoD depth estimation. We hope this challenge could lay a solid foundation for future research on robust and reliable depth estimation.
arXiv Detail & Related papers (2023-07-27T17:59:56Z)
The Impact of a Continuous Integration Service on the Delivery Time of Merged Pull Requests [8.108605385023939]
We study whether adopting a CI service (TravisCI) can quicken the time to deliver merged PRs. Our results reveal that adopting a CI service may not necessarily quicken the delivery of merged PRs. The automation provided by CI and the boost in developers' confidence are key advantages of adopting a CI service.
arXiv Detail & Related papers (2023-05-25T10:59:35Z)
Federated Learning with Unreliable Clients: Performance Analysis and Mechanism Design [76.29738151117583]
Federated Learning (FL) has become a promising tool for training effective machine learning models among distributed clients. However, low quality models could be uploaded to the aggregator server by unreliable clients, leading to a degradation or even a collapse of training. We model these unreliable behaviors of clients and propose a defensive mechanism to mitigate such a security risk.
arXiv Detail & Related papers (2021-05-10T08:02:27Z)
Transient Information Adaptation of Artificial Intelligence: Towards Sustainable Data Processes in Complex Projects [0.0]
Large-scale projects increasingly operate in complicated settings whilst drawing on an array of complex data-points. 90% of megaprojects globally fail to achieve their planned objectives. Renewed interest in the concept of Artificial Intelligence seeks to enhance project managers' cognitive capacity through the project lifecycle.
arXiv Detail & Related papers (2021-03-27T22:28:52Z)

This list is automatically generated from the titles and abstracts of the papers on this site.