The Reproducible Research Platform establishes a unified open science environment bridging data and software lifecycles across disciplines, from proposal to publication
- URL: http://arxiv.org/abs/2512.06039v1
- Date: Thu, 04 Dec 2025 22:02:19 GMT
- Title: The Reproducible Research Platform establishes a unified open science environment bridging data and software lifecycles across disciplines, from proposal to publication
- Authors: Andreas P. Cuny, Henry Lütcke, Andrei-Valentin Plamadă, Antti Luomi, John Hennig, Matthew Baker, Fabian Rudolf, Bernd Rinn
- Abstract summary: We developed the open-source Reproducible Research Platform (RRP), which unifies research data management with version-controlled, containerized computational environments. RRP enables anyone to execute, reuse and publish fully documented, FAIR research without manual retrieval or platform-specific setup. We demonstrate RRP's impact by reproducing results from diverse published studies, including work over a decade old, showing sustained usability.
- Score: 0.0
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Many research groups aspire to make data and code FAIR and reproducible, yet struggle because the data and code life cycles are disconnected, executable environments are often missing from published work, and technical skill requirements hinder adoption. Existing approaches rarely enable researchers to keep using their preferred tools or support seamless execution across domains. To close this gap, we developed the open-source Reproducible Research Platform (RRP), which unifies research data management with version-controlled, containerized computational environments in modular, shareable projects. RRP enables anyone to execute, reuse, and publish fully documented, FAIR research workflows without manual retrieval or platform-specific setup. We demonstrate RRP's impact by reproducing results from diverse published studies, including work over a decade old, showing sustained reproducibility and usability. With a minimal graphical interface focused on core tasks, modular tool installation, and compatibility with institutional servers or local computers, RRP makes reproducible science broadly accessible across scientific domains.
Related papers
- AI Copilots for Reproducibility in Science: A Case Study [1.583709163934932]
Open science initiatives seek to make research outputs more transparent, accessible, and reusable, but ensuring that published findings can be independently reproduced remains a persistent challenge. This paper introduces OpenPub, an AI-powered platform that supports researchers, reviewers, and readers through a suite of modular copilots focused on key open science tasks.
arXiv Detail & Related papers (2025-06-25T04:56:28Z) - A Comprehensive Survey on Composed Image Retrieval [54.54527281731775]
Composed Image Retrieval (CIR) is an emerging yet challenging task that allows users to search for target images using a multimodal query. There is currently no comprehensive review of CIR to provide a timely overview of this field. We synthesize insights from over 120 publications in top conferences and journals, including ACM TOIS, SIGIR, and CVPR.
arXiv Detail & Related papers (2025-02-19T01:37:24Z) - CoIR: A Comprehensive Benchmark for Code Information Retrieval Models [52.61625841028781]
COIR (Code Information Retrieval Benchmark) is a robust and comprehensive benchmark designed to assess code retrieval capabilities. COIR comprises ten meticulously curated code datasets, spanning eight distinctive retrieval tasks across seven diverse domains. We evaluate nine widely used retrieval models using COIR, uncovering significant difficulties in performing code retrieval tasks even with state-of-the-art systems.
arXiv Detail & Related papers (2024-07-03T07:58:20Z) - DiscoveryBench: Towards Data-Driven Discovery with Large Language Models [50.36636396660163]
We present DiscoveryBench, the first comprehensive benchmark that formalizes the multi-step process of data-driven discovery.
Our benchmark contains 264 tasks collected across 6 diverse domains, such as sociology and engineering.
Our benchmark, thus, illustrates the challenges in autonomous data-driven discovery and serves as a valuable resource for the community to make progress.
arXiv Detail & Related papers (2024-07-01T18:58:22Z) - FlashRAG: A Modular Toolkit for Efficient Retrieval-Augmented Generation Research [70.6584488911715]
Retrieval-augmented generation (RAG) has attracted considerable research attention. Existing RAG toolkits are often heavy and inflexible, failing to meet the customization needs of researchers. Our toolkit has implemented 16 advanced RAG methods and gathered and organized 38 benchmark datasets.
arXiv Detail & Related papers (2024-05-22T12:12:40Z) - MLXP: A Framework for Conducting Replicable Experiments in Python [63.37350735954699]
We propose MLXP, an open-source, simple, and lightweight experiment management tool based on Python.
It streamlines the experimental process with minimal practitioner overhead while ensuring a high level of reproducibility.
arXiv Detail & Related papers (2024-02-21T14:22:20Z) - Automated Requirements Relation Extraction [4.110571395660999]
This chapter aims at providing a clear perspective on the theoretical and practical fundamentals in the field of natural language-based relation extraction. We first describe the fundamentals of requirements relations based on the most relevant literature in the field, including the most common requirements relation types. The core of the chapter is composed of two main sections: (i) natural language techniques for the identification and categorization of requirements relations (i.e., syntactic vs. semantic techniques) and (ii) information extraction methods for the task of relation extraction.
arXiv Detail & Related papers (2024-01-22T16:14:27Z) - A pragmatic workflow for research software engineering in computational science [0.0]
University research groups in Computational Science and Engineering (CSE) generally lack dedicated funding and personnel for Research Software Engineering (RSE).
This lack of RSE support shifts the focus away from sustainable research software development and reproducible results.
We propose an RSE workflow for CSE that addresses these challenges and improves the quality of research output in CSE.
arXiv Detail & Related papers (2023-10-02T08:04:12Z) - Integration of Domain Expert-Centric Ontology Design into the CRISP-DM for Cyber-Physical Production Systems [45.05372822216111]
Methods from Machine Learning (ML) and Data Mining (DM) have proven to be promising in extracting complex and hidden patterns from the data collected.
However, such data-driven projects, usually performed with the Cross-Industry Standard Process for Data Mining (CRISP-DM), often fail due to the disproportionate amount of time needed for understanding and preparing the data.
This contribution intends to present an integrated approach so that data scientists are able to more quickly and reliably gain insights into the CPPS challenges.
arXiv Detail & Related papers (2023-07-21T15:04:00Z) - EasyTPP: Towards Open Benchmarking Temporal Point Processes [36.759041669027745]
Temporal point processes (TPPs) have emerged as the most natural and competitive models.
EasyTPP is the first central repository of research assets (e.g., data, models, evaluation programs, documentation) in the area of event sequence modeling.
arXiv Detail & Related papers (2023-07-16T16:43:38Z) - A Backend Platform for Supporting the Reproducibility of Computational Experiments [2.1485350418225244]
It is challenging to recreate the same environment using the same frameworks, code, data sources, programming languages, dependencies, and so on.
In this work, we propose an Integrated Development Environment that allows the sharing, configuration, packaging, and execution of an experiment.
We have been able to successfully reproduce 20 (80%) of these experiments achieving the results reported in such works with minimum effort.
arXiv Detail & Related papers (2023-06-29T10:29:11Z) - A Metadata-Based Ecosystem to Improve the FAIRness of Research Software [0.3185506103768896]
The reuse of research software is central to research efficiency and academic exchange.
The DataDesc ecosystem is presented, an approach to describing data models of software interfaces with detailed and machine-actionable metadata.
arXiv Detail & Related papers (2023-06-18T19:01:08Z) - TSGM: A Flexible Framework for Generative Modeling of Synthetic Time Series [61.436361263605114]
Time series data are often scarce or highly sensitive, which precludes the sharing of data between researchers and industrial organizations.
We introduce Time Series Generative Modeling (TSGM), an open-source framework for the generative modeling of synthetic time series.
arXiv Detail & Related papers (2023-05-19T10:11:21Z)
This list is automatically generated from the titles and abstracts of the papers on this site.