Uncovering Scientific Software Sustainability through Community Engagement and Software Quality Metrics
- URL: http://arxiv.org/abs/2511.07851v1
- Date: Wed, 12 Nov 2025 01:23:59 GMT
- Title: Uncovering Scientific Software Sustainability through Community Engagement and Software Quality Metrics
- Authors: Sharif Ahmed, Addi Malviya Thakur, Gregory R. Watson, Nasir U. Eisty
- Abstract summary: This paper explores the sustainability of scientific open-source software (Sci-OSS) projects hosted on GitHub. We map sustainability to repository metrics from the literature and mined data from ten prominent Sci-OSS projects. Our visualization and analysis methods offer researchers, funders, and developers key insights into long-term software sustainability.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Scientific open-source software (Sci-OSS) projects are critical for advancing research, yet sustaining these projects long-term remains a major challenge. This paper explores the sustainability of Sci-OSS hosted on GitHub, focusing on two factors drawn from stewardship organizations: community engagement and software quality. We map sustainability to repository metrics from the literature and mined data from ten prominent Sci-OSS projects. A multimodal analysis of these projects led us to a novel visualization technique, providing a robust way to display both current and evolving software metrics over time, replacing multiple traditional visualizations with one. Additionally, our statistical analysis shows that even similar-domain projects sustain themselves differently. Natural language analysis supports claims from the literature, highlighting that project-specific feedback plays a key role in maintaining software quality. Our visualization and analysis methods offer researchers, funders, and developers key insights into long-term software sustainability.
Related papers
- Advances and Frontiers of LLM-based Issue Resolution in Software Engineering: A Comprehensive Survey [59.3507264893654]
Issue resolution is a complex Software Engineering task integral to real-world development. Benchmarks like SWE-bench revealed this task as profoundly difficult for large language models. This paper presents a systematic survey of this emerging domain.
arXiv Detail & Related papers (2026-01-15T18:55:03Z)
- SQuaD: The Software Quality Dataset [3.9861000060030993]
The Software Quality dataset (SQuaD) is a time-aware collection of software quality metrics extracted from 450 mature open-source projects across diverse ecosystems. By integrating nine state-of-the-art static analysis tools, SQuaD unifies over 700 unique metrics at method, class, file, and project levels.
arXiv Detail & Related papers (2025-11-14T12:57:22Z)
- FinSight: Towards Real-World Financial Deep Research [68.31086471310773]
FinSight is a novel framework for producing high-quality, multimodal financial reports. To ensure professional-grade visualization, we propose an Iterative Vision-Enhanced Mechanism. A two-stage Writing Framework expands concise Chain-of-Analysis segments into coherent, citation-aware, and multimodal reports.
arXiv Detail & Related papers (2025-10-19T14:05:35Z)
- BinMetric: A Comprehensive Binary Analysis Benchmark for Large Language Models [50.17907898478795]
We introduce BinMetric, a benchmark designed to evaluate the performance of large language models on binary analysis tasks. BinMetric comprises 1,000 questions derived from 20 real-world open-source projects across 6 practical binary analysis tasks. Our empirical study on this benchmark investigates the binary analysis capabilities of various state-of-the-art LLMs, revealing their strengths and limitations in this field.
arXiv Detail & Related papers (2025-05-12T08:54:07Z)
- Incubation and Beyond: A Comparative Analysis of ASF Projects Sustainability Impacts on Software Quality [2.7059033823627923]
Free and Open Source Software (FOSS) communities' sustainability, meaning to remain operational without signs of weakening or interruptions to its development, is fundamental for the resilience and continuity of society's digital infrastructure. This study seeks to understand how the different aspects of FOSS sustainability impact software quality from a life-cycle perspective.
arXiv Detail & Related papers (2025-04-13T07:51:40Z)
- An Overview and Catalogue of Dependency Challenges in Open Source Software Package Registries [52.23798016734889]
This article provides a catalogue of dependency-related challenges that come with relying on OSS packages or libraries.
The catalogue is based on the scientific literature on empirical research that has been conducted to understand, quantify and overcome these challenges.
arXiv Detail & Related papers (2024-09-27T16:20:20Z)
- Carbon-Efficient Software Design and Development: A Systematic Literature Review [1.6071754144962787]
We conduct a systematic literature review on state-of-the-art proposals for designing and developing carbon-efficient software. We identify and analyse 65 primary studies by classifying them through a taxonomy aimed at answering the 5W1H questions of carbon-efficient software design and development.
arXiv Detail & Related papers (2024-07-29T11:24:11Z)
- Estimating the Energy Footprint of Software Systems: a Primer [56.200335252600354]
Quantifying the energy footprint of a software system is one of the most basic activities.
This document aims to be a starting point for researchers who want to begin conducting work in this area.
arXiv Detail & Related papers (2024-07-16T11:21:30Z)
- Charting a Path to Efficient Onboarding: The Role of Software Visualization [49.1574468325115]
The present study aims to explore the familiarity of managers, leaders, and developers with software visualization tools.
This approach incorporated quantitative and qualitative analyses of data collected from practitioners using questionnaires and semi-structured interviews.
arXiv Detail & Related papers (2024-01-17T21:30:45Z)
- Code Ownership in Open-Source AI Software Security [18.779538756226298]
We use code ownership metrics to investigate the correlation with latent vulnerabilities across five prominent open-source AI software projects.
The findings suggest a positive relationship between high-level ownership (characterised by a limited number of minor contributors) and a decrease in vulnerabilities.
With these novel code ownership metrics, we have implemented a Python-based command-line application to aid project curators and quality assurance professionals in evaluating and benchmarking their on-site projects.
arXiv Detail & Related papers (2023-12-18T00:37:29Z)
- Individual context-free online community health indicators fail to identify open source software sustainability [3.192308005611312]
We monitored thirty-eight open source projects over the period of a year.
None of the projects were abandoned during this period, and only one project entered a planned shutdown.
Results were highly heterogeneous, showing little commonality across documentation, mean response times for issues and code contributions, and available funding/staffing resources.
arXiv Detail & Related papers (2023-09-21T14:41:41Z)
- "Project smells" -- Experiences in Analysing the Software Quality of ML Projects with mllint [6.0141405230309335]
We introduce the novel concept of project smells which consider deficits in project management as a more holistic perspective on software quality.
An open-source static analysis tool mllint was also implemented to help detect and mitigate these.
Our findings indicate a need for context-aware static analysis tools, that fit the needs of the project at its current stage of development.
arXiv Detail & Related papers (2022-01-20T15:52:24Z)
This list is automatically generated from the titles and abstracts of the papers in this site.