What Drives Issue Resolution Speed? An Empirical Study of Scientific Workflow Systems on GitHub
- URL: http://arxiv.org/abs/2512.18852v1
- Date: Sun, 21 Dec 2025 18:41:27 GMT
- Title: What Drives Issue Resolution Speed? An Empirical Study of Scientific Workflow Systems on GitHub
- Authors: Khairul Alam, Banani Roy,
- Abstract summary: Scientific Systems (SWSs) play a vital role in enabling reproducible, scalable, and automated scientific analysis.<n>This paper presents an empirical study of issue management and resolution across a collection of GitHub-hosted SWS projects.
- Score: 4.902956965439
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Scientific Workflow Systems (SWSs) play a vital role in enabling reproducible, scalable, and automated scientific analysis. Like other open-source software, these systems depend on active maintenance and community engagement to remain reliable and sustainable. However, despite the importance of timely issue resolution for software quality and community trust, little is known about what drives issue resolution speed within SWSs. This paper presents an empirical study of issue management and resolution across a collection of GitHub-hosted SWS projects. We analyze 21,116 issues to investigate how project characteristics, issue metadata, and contributor interactions affect time-to-close. Specifically, we address two research questions: (1) how issues are managed and addressed in SWSs, and (2) how issue and contributor features relate to issue resolution speed. We find that 68.91% of issues are closed, with half of them resolved within 18.09 days. Our results show that although SWS projects follow structured issue management practices, the issue resolution speed varies considerably across systems. Factors such as labeling and assigning issues are associated with faster issue resolution. Based on our findings, we make recommendations for developers to better manage SWS repository issues and improve their quality.
Related papers
- Why Authors and Maintainers Link (or Don't Link) Their PyPI Libraries to Code Repositories and Donation Platforms [83.16077040470975]
Metadata of libraries on the Python Package Index (PyPI) plays a critical role in supporting the transparency, trust, and sustainability of open-source libraries.<n>This paper presents a large-scale empirical study combining two targeted surveys sent to 50,000 PyPI authors and maintainers.<n>We analyze more than 1,400 responses using large language model (LLM)-based topic modeling to uncover key motivations and barriers related to linking repositories and donation platforms.
arXiv Detail & Related papers (2026-01-21T16:13:57Z) - Smells Depend on the Context: An Interview Study of Issue Tracking Problems and Smells in Practice [5.280471121231262]
Little is known about the challenges Software Engineering teams encounter in ITSs.<n>We conducted an in-depth interview study with 26 experienced SE practitioners.<n>We identified 14 common problems including issue findability, zombie issues, workflow bloat, and lack of workflow enforcement.<n>Our results suggest that ITS problems and smells are highly dependent on context factors such as ITS configuration, workflow stage, and team size.
arXiv Detail & Related papers (2026-01-07T17:38:52Z) - Barbarians at the Gate: How AI is Upending Systems Research [58.95406995634148]
We argue that systems research, long focused on designing and evaluating new performance-oriented algorithms, is particularly well-suited for AI-driven solution discovery.<n>We term this approach as AI-Driven Research for Systems ( ADRS), which iteratively generates, evaluates, and refines solutions.<n>Our results highlight both the disruptive potential and the urgent need to adapt systems research practices in the age of AI.
arXiv Detail & Related papers (2025-10-07T17:49:24Z) - Issue Tracking Ecosystems: Context and Best Practices [0.0]
GitHub and Jira are popular tools that support Software Engineering (SE) organisations through the management of issues''<n>An Issue Tracking Ecosystem (ITE) is the aggregate of the central ITS and the related SE artefacts, stakeholders, and processes.<n>I undertake the challenge of understanding ITEs at a broader level, addressing these questions regarding complexity and diversity.
arXiv Detail & Related papers (2025-07-09T09:57:13Z) - Towards an Interpretable Analysis for Estimating the Resolution Time of Software Issues [1.4039240369201997]
We build an issue monitoring system that extracts the actual effort required to fix issues on a per-project basis.<n>Our approach employs topic modeling to capture issue semantics and leverages metadata for interpretable resolution time analysis.
arXiv Detail & Related papers (2025-05-02T08:38:59Z) - SWE-Fixer: Training Open-Source LLMs for Effective and Efficient GitHub Issue Resolution [56.9361004704428]
Large Language Models (LLMs) have demonstrated remarkable proficiency across a variety of complex tasks.<n>SWE-Fixer is a novel open-source framework designed to effectively and efficiently resolve GitHub issues.<n>We assess our approach on the SWE-Bench Lite and Verified benchmarks, achieving competitive performance among open-source models.
arXiv Detail & Related papers (2025-01-09T07:54:24Z) - An Empirical Investigation on the Challenges in Scientific Workflow Systems Development [6.920993703034064]
This study examines interactions between developers and researchers on Stack Overflow (SO) and GitHub.<n>By analyzing issues, we identified 13 topics (e.g., Errors and Bug Fixing, Documentation, Dependencies) and discovered that data structures and operations is the most difficult.<n>We also found common topics between SO and GitHub, such as data structures and operations, task management, and workflow scheduling.
arXiv Detail & Related papers (2024-11-16T21:14:11Z) - Lingma SWE-GPT: An Open Development-Process-Centric Language Model for Automated Software Improvement [62.94719119451089]
Lingma SWE-GPT series learns from and simulating real-world code submission activities.
Lingma SWE-GPT 72B resolves 30.20% of GitHub issues, marking a significant improvement in automatic issue resolution.
arXiv Detail & Related papers (2024-11-01T14:27:16Z) - SWE-bench Multimodal: Do AI Systems Generalize to Visual Software Domains? [64.34184587727334]
We propose SWE-bench Multimodal to evaluate systems on their ability to fix bugs in visual, user-facing JavaScript software.
SWE-bench M features 617 task instances collected from 17 JavaScript libraries used for web interface design, diagramming, data visualization, syntax highlighting, and interactive mapping.
Our analysis finds that top-performing SWE-bench systems struggle with SWE-bench M, revealing limitations in visual problem-solving and cross-language generalization.
arXiv Detail & Related papers (2024-10-04T18:48:58Z) - SUPER: Evaluating Agents on Setting Up and Executing Tasks from Research Repositories [55.161075901665946]
Super aims to capture the realistic challenges faced by researchers working with Machine Learning (ML) and Natural Language Processing (NLP) research repositories.
Our benchmark comprises three distinct problem sets: 45 end-to-end problems with annotated expert solutions, 152 sub problems derived from the expert set that focus on specific challenges, and 602 automatically generated problems for larger-scale development.
We show that state-of-the-art approaches struggle to solve these problems with the best model (GPT-4o) solving only 16.3% of the end-to-end set, and 46.1% of the scenarios.
arXiv Detail & Related papers (2024-09-11T17:37:48Z) - Alibaba LingmaAgent: Improving Automated Issue Resolution via Comprehensive Repository Exploration [64.19431011897515]
This paper presents Alibaba LingmaAgent, a novel Automated Software Engineering method designed to comprehensively understand and utilize whole software repositories for issue resolution.<n>Our approach introduces a top-down method to condense critical repository information into a knowledge graph, reducing complexity, and employs a Monte Carlo tree search based strategy.<n>In production deployment and evaluation at Alibaba Cloud, LingmaAgent automatically resolved 16.9% of in-house issues faced by development engineers, and solved 43.3% of problems after manual intervention.
arXiv Detail & Related papers (2024-06-03T15:20:06Z) - Software issues report for bug fixing process: An empirical study of
machine-learning libraries [0.0]
We investigated the effectiveness of issue resolution for bug-fixing processes in six machine-learning libraries.
The most common categories of issues that arise in machine-learning libraries are bugs, documentation, optimization, crashes, enhancement, new feature requests, build/CI, support, and performance.
This study concludes that efficient issue-tracking processes, effective communication, and collaboration are vital for effective resolution of issues and bug fixing processes in machine-learning libraries.
arXiv Detail & Related papers (2023-12-10T21:33:19Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.