Ten Essential Guidelines for Building High-Quality Research Software
 - URL: http://arxiv.org/abs/2507.16166v1
 - Date: Tue, 22 Jul 2025 02:22:41 GMT
 - Title: Ten Essential Guidelines for Building High-Quality Research Software
 - Authors: Nasir U. Eisty, David E. Bernholdt, Alex Koufos, David J. Luet, Miranda Mundt, 
 - Abstract summary: This paper presents ten guidelines for producing high-quality research software. The guidelines cover every stage of the development lifecycle. They emphasize the importance of planning, writing clean and readable code, using version control, and implementing testing strategies.
 - Score: 0.3562485774739681
 - License: http://creativecommons.org/licenses/by/4.0/
 - Abstract:   High-quality research software is a cornerstone of modern scientific progress, enabling researchers to analyze complex data, simulate phenomena, and share reproducible results. However, creating such software requires adherence to best practices that ensure robustness, usability, and sustainability. This paper presents ten guidelines for producing high-quality research software, covering every stage of the development lifecycle. These guidelines emphasize the importance of planning, writing clean and readable code, using version control, and implementing thorough testing strategies. Additionally, they address key principles such as modular design, reproducibility, performance optimization, and long-term maintenance. The paper also highlights the role of documentation and community engagement in enhancing software usability and impact. By following these guidelines, researchers can create software that advances their scientific objectives and contributes to a broader ecosystem of reliable and reusable research tools. This work serves as a practical resource for researchers and developers aiming to elevate the quality and impact of their research software. 
 
       
      
        Related papers
        - ScienceBoard: Evaluating Multimodal Autonomous Agents in Realistic Scientific Workflows [82.07367406991678]
Large Language Models (LLMs) have extended their impact beyond Natural Language Processing. Among these, computer-using agents are capable of interacting with operating systems as humans do. We introduce ScienceBoard, which encompasses a realistic, multi-domain environment featuring dynamic and visually rich scientific software.
arXiv  Detail & Related papers  (2025-05-26T12:27:27Z) - A Dataset For Computational Reproducibility [2.147712260420443]
This article introduces a dataset of computational experiments covering a broad spectrum of scientific fields. It incorporates details about software dependencies, execution steps, and configurations necessary for accurate reproduction. It provides a universal benchmark by establishing a standardized dataset for objectively evaluating and comparing the effectiveness of tools.
arXiv  Detail & Related papers  (2025-04-11T16:45:10Z) - The Responsible Foundation Model Development Cheatsheet: A Review of Tools & Resources [100.23208165760114]
Foundation model development attracts a rapidly expanding body of contributors, scientists, and applications. To help shape responsible development practices, we introduce the Foundation Model Development Cheatsheet.
arXiv  Detail & Related papers  (2024-06-24T15:55:49Z) - Agent-Driven Automatic Software Improvement [55.2480439325792]
This research proposal aims to explore innovative solutions by focusing on the deployment of agents powered by Large Language Models (LLMs).
The iterative nature of agents, which allows for continuous learning and adaptation, can help surpass common challenges in code generation.
We aim to use the iterative feedback in these systems to further fine-tune the LLMs underlying the agents, so that they become better aligned with the task of automated software improvement.
arXiv  Detail & Related papers  (2024-06-24T15:45:22Z) - MASSW: A New Dataset and Benchmark Tasks for AI-Assisted Scientific Workflows [58.56005277371235]
We introduce MASSW, a comprehensive text dataset on Multi-Aspect Summarization of Scientific Workflows.
MASSW includes more than 152,000 peer-reviewed publications from 17 leading computer science conferences spanning the past 50 years.
We demonstrate the utility of MASSW through multiple novel machine-learning tasks that can be benchmarked using this new dataset.
arXiv  Detail & Related papers  (2024-06-10T15:19:09Z) - Ten simple rules for training scientists to make better software [0.0]
Developing high-quality research software requires scientists to develop a host of software development skills. There has been a growing importance placed on ensuring foundational and good development practices in computational research. Recent articles in the Ten Simple Rules collection have discussed the teaching of computer science and coding techniques to biology students. We advance this discussion by describing the specific steps for effectively teaching the necessary skills scientists need to develop sustainable software packages.
arXiv  Detail & Related papers  (2024-02-07T10:16:20Z) - The Efficiency Spectrum of Large Language Models: An Algorithmic Survey [54.19942426544731]
The rapid growth of Large Language Models (LLMs) has been a driving force in transforming various domains.
This paper examines the multi-faceted dimensions of efficiency essential for the end-to-end algorithmic development of LLMs.
arXiv  Detail & Related papers  (2023-12-01T16:00:25Z) - Managing Software Provenance to Enhance Reproducibility in Computational Research [1.1421942894219899]
Management of computation-based scientific studies is often left to individual researchers who design their experiments based on personal preferences and the nature of the study.
We believe that the quality, efficiency, and reproducibility of computation-based scientific research can be improved by explicitly creating an execution environment that allows researchers to provide a clear record of traceability.
arXiv  Detail & Related papers  (2023-08-29T21:13:18Z) - A Metadata-Based Ecosystem to Improve the FAIRness of Research Software [0.3185506103768896]
The reuse of research software is central to research efficiency and academic exchange.
The DataDesc ecosystem is presented, an approach to describing data models of software interfaces with detailed and machine-actionable metadata.
arXiv  Detail & Related papers  (2023-06-18T19:01:08Z) - Nine Best Practices for Research Software Registries and Repositories: A Concise Guide [63.52960372153386]
We present a set of nine best practices that can help managers define the scope, practices, and rules that govern individual registries and repositories.
These best practices were distilled from the experiences of the creators of existing resources, convened by a Task Force of the FORCE11 Software Citation Implementation Working Group during the years 2019 and 2020.
arXiv  Detail & Related papers  (2020-12-24T05:37:54Z) - Software must be recognised as an important output of scholarly research [7.776162183510522]
We argue that as well as being important from a methodological perspective, software should be recognised as an output of research.
The article discusses the different roles that software may play in research and highlights the relationship between software and research sustainability.
arXiv  Detail & Related papers  (2020-11-15T16:34:31Z) 