Continuous Analysis: Evolution of Software Engineering and Reproducibility for Science
- URL: http://arxiv.org/abs/2411.02283v1
- Date: Mon, 04 Nov 2024 17:11:08 GMT
- Title: Continuous Analysis: Evolution of Software Engineering and Reproducibility for Science
- Authors: Venkat S. Malladi, Maria Yazykova, Olesya Melnichenko, Yulia Dubinina,
- Abstract summary: This paper introduces the concept of Continuous Analysis (CA) to address the reproducibility challenges in scientific research.
By adopting CA, the scientific community can ensure the validity and generalizability of research outcomes.
- Abstract: Reproducibility in research remains hindered by complex systems involving data, models, tools, and algorithms. Studies highlight a reproducibility crisis due to a lack of standardized reporting, code and data sharing, and rigorous evaluation. This paper introduces the concept of Continuous Analysis to address the reproducibility challenges in scientific research, extending the DevOps lifecycle. Continuous Analysis proposes solutions through version control, analysis orchestration, and feedback mechanisms, enhancing the reliability of scientific results. By adopting CA, the scientific community can ensure the validity and generalizability of research outcomes, fostering transparency and collaboration and ultimately advancing the field.
Related papers
- Data Analysis in the Era of Generative AI [56.44807642944589]
This paper explores the potential of AI-powered tools to reshape data analysis, focusing on design considerations and challenges.
We explore how the emergence of large language and multimodal models offers new opportunities to enhance various stages of data analysis workflow.
We then examine human-centered design principles that facilitate intuitive interactions, build user trust, and streamline the AI-assisted analysis workflow across multiple apps.
arXiv Detail & Related papers (2024-09-27T06:31:03Z)
- Reproducibility and Geometric Intrinsic Dimensionality: An Investigation on Graph Neural Network Research [0.0]
Building on these efforts we turn towards another critical challenge in machine learning, namely the curse of dimensionality.
Using the closely linked concept of intrinsic dimension, we investigate to what extent the machine learning models used are influenced by the intrinsic dimension of the data sets they are trained on.
arXiv Detail & Related papers (2024-03-13T11:44:30Z)
- SciOps: Achieving Productivity and Reliability in Data-Intensive Research [0.8414742293641504]
Scientists are increasingly leveraging advances in instruments, automation, and collaborative tools to scale up their experiments and research goals.
Various scientific disciplines, including neuroscience, have adopted key technologies to enhance collaboration, reproducibility, and automation.
We introduce a five-level Capability Maturity Model describing the principles of rigorous scientific operations.
arXiv Detail & Related papers (2023-12-29T21:37:22Z)
- Repeatability, Reproducibility, Replicability, Reusability (4R) in Journals' Policies and Software/Data Management in Scientific Publications: A Survey, Discussion, and Perspectives [1.446375009535228]
We have found a large gap between citation-oriented practices, journal policies, recommendations, artifact Description/Evaluation guidelines, submission guides, and technological evolution.
The relationship between authors and scientific journals in their mutual efforts to jointly improve scientific results is analyzed.
We propose recommendations for the journal policies, as well as a unified and standardized Reproducibility Guide for the submission of scientific articles for authors.
arXiv Detail & Related papers (2023-12-18T09:02:28Z)
- AI Competitions and Benchmarks: The life cycle of challenges and benchmarks [0.49478969093606673]
We argue for the need to creatively leverage the scientific research and algorithm development community as an axis of robust innovation.
Coordinated community engagement in the analysis of highly complex and massive data has emerged as one approach to find robust methodologies.
arXiv Detail & Related papers (2023-12-08T18:44:10Z)
- A Metadata-Based Ecosystem to Improve the FAIRness of Research Software [0.3185506103768896]
The reuse of research software is central to research efficiency and academic exchange.
The DataDesc ecosystem is presented, an approach to describing data models of software interfaces with detailed and machine-actionable metadata.
arXiv Detail & Related papers (2023-06-18T19:01:08Z)
- A Diachronic Analysis of Paradigm Shifts in NLP Research: When, How, and Why? [84.46288849132634]
We propose a systematic framework for analyzing the evolution of research topics in a scientific field using causal discovery and inference techniques.
We define three variables to encompass diverse facets of the evolution of research topics within NLP.
We utilize a causal discovery algorithm to unveil the causal connections among these variables using observational data.
arXiv Detail & Related papers (2023-05-22T11:08:00Z)
- Distributed intelligence on the Edge-to-Cloud Continuum: A systematic literature review [62.997667081978825]
This review aims at providing a comprehensive vision of the main state-of-the-art libraries and frameworks for machine learning and data analytics available today.
The main simulation, emulation, deployment systems, and testbeds for experimental research on the Edge-to-Cloud Continuum available today are also surveyed.
arXiv Detail & Related papers (2022-04-29T08:06:05Z)
- Applications of physics-informed scientific machine learning in subsurface science: A survey [64.0476282000118]
Geosystems are geological formations altered by human activities such as fossil energy exploration, waste disposal, geologic carbon sequestration, and renewable energy generation.
The responsible use and exploration of geosystems are thus critical to geosystem governance, which in turn depends on efficient monitoring, risk assessment, and decision support tools for practical implementation.
Fast advances in machine learning algorithms and novel sensing technologies in recent years have presented new opportunities for the subsurface research community to improve the efficacy and transparency of geosystem governance.
arXiv Detail & Related papers (2021-04-10T13:40:22Z)
- Challenges in biomarker discovery and biorepository for Gulf-war-disease studies: a novel data platform solution [48.7576911714538]
We introduce a novel data platform, named ROSALIND, to overcome the challenges, foster healthy and vital collaborations and advance scientific inquiries.
We follow the principles etched in the platform name - ROSALIND stands for resource organisms with self-governed accessibility, linkability, integrability, neutrality, and dependability.
The deployment of ROSALIND in our GWI study over the past 12 months has accelerated the pace of data experimentation and analysis, removed numerous error sources, and increased research quality and productivity.
arXiv Detail & Related papers (2021-02-04T20:38:30Z)
- Using satellite imagery to understand and promote sustainable development [87.72561825617062]
We synthesize the growing literature that uses satellite imagery to understand sustainable development outcomes.
We quantify the paucity of ground data on key human-related outcomes and the growing abundance and resolution of satellite imagery.
We review recent machine learning approaches to model-building in the context of scarce and noisy training data.
arXiv Detail & Related papers (2020-09-23T05:20:00Z)
This list is automatically generated from the titles and abstracts of the papers on this site.