Repeatability, Reproducibility, Replicability, Reusability (4R) in
Journals' Policies and Software/Data Management in Scientific Publications: A
Survey, Discussion, and Perspectives
- URL: http://arxiv.org/abs/2312.11028v1
- Date: Mon, 18 Dec 2023 09:02:28 GMT
- Authors: José Armando Hernández (CB), Miguel Colom (CB, CMLA)
- Abstract summary: We have found a large gap between reproducibility-oriented practices, journal policies, recommendations, artifact Description/Evaluation guidelines, submission guides, and technological evolution.
The relationship between authors and scientific journals in their mutual efforts to jointly improve the reproducibility of scientific results is analyzed.
We propose recommendations for the journal policies, as well as a unified and standardized Reproducibility Guide for the submission of scientific articles for authors.
- Score: 1.446375009535228
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: With the recognized crisis of credibility in scientific research, there is a
growth of reproducibility studies in computer science, and although existing
surveys have reviewed reproducibility from various perspectives, especially
very specific technological issues, they do not address the author-publisher
relationship in the publication of reproducible computational scientific
articles. This aspect requires significant attention because it is the basis
for reliable research. We have found a large gap between the
reproducibility-oriented practices, journal policies, recommendations,
publisher artifact Description/Evaluation guidelines, submission guides,
technological reproducibility evolution, and its effective adoption to
contribute to tackling the crisis. We conducted a narrative survey, a
comprehensive overview and discussion identifying the mutual efforts required
from Authors, Journals, and Technological actors to achieve reproducible
research. The relationship between authors and scientific journals in their
mutual efforts to jointly improve the reproducibility of scientific results is
analyzed. Finally, we propose recommendations for journal policies, as
well as a unified and standardized Reproducibility Guide for the submission of
scientific articles for authors. The main objective of this work is to analyze
the implementation and experiences of reproducibility policies, techniques and
technologies, standards, methodologies, software, and data management tools
required for reproducible scientific publications. We also examine the benefits
and drawbacks of such adoption, as well as open challenges and promising trends,
in order to propose possible strategies and efforts to mitigate the identified
gaps. To this end, we analyzed 200 scientific articles, surveyed 16 Computer Science
journals, and systematically classified them according to reproducibility
strategies, technologies, policies, code citation, and editorial business. We
conclude that a reproducibility gap remains in scientific publications, but
that there is also an opportunity to reduce it through the joint effort of
authors, publishers, and technology providers.
Related papers
- Continuous Analysis: Evolution of Software Engineering and Reproducibility for Science [0.0]
This paper introduces the concept of Continuous Analysis to address the challenges in scientific research.
By adopting CA, the scientific community can ensure the validity and generalizability of research outcomes.
arXiv Detail & Related papers (2024-11-04T17:11:08Z)
- MASSW: A New Dataset and Benchmark Tasks for AI-Assisted Scientific Workflows [58.56005277371235]
We introduce MASSW, a comprehensive text dataset on Multi-Aspect Summarization of Scientific Workflows.
MASSW includes more than 152,000 peer-reviewed publications from 17 leading computer science conferences spanning the past 50 years.
We demonstrate the utility of MASSW through multiple novel machine-learning tasks that can be benchmarked using this new dataset.
arXiv Detail & Related papers (2024-06-10T15:19:09Z)
- ResearchAgent: Iterative Research Idea Generation over Scientific Literature with Large Language Models [56.08917291606421]
ResearchAgent is a large language model-powered research idea writing agent.
It generates problems, methods, and experiment designs while iteratively refining them based on scientific literature.
We experimentally validate our ResearchAgent on scientific publications across multiple disciplines.
arXiv Detail & Related papers (2024-04-11T13:36:29Z)
- Scientific Opinion Summarization: Paper Meta-review Generation Dataset, Methods, and Evaluation [55.00687185394986]
We propose the task of scientific opinion summarization, where research paper reviews are synthesized into meta-reviews.
We introduce the ORSUM dataset covering 15,062 paper meta-reviews and 57,536 paper reviews from 47 conferences.
Our experiments show that (1) human-written summaries do not always satisfy all necessary criteria such as depth of discussion, and identifying consensus and controversy for the specific domain, and (2) the combination of task decomposition and iterative self-refinement shows strong potential for enhancing the opinions.
arXiv Detail & Related papers (2023-05-24T02:33:35Z)
- How Data Scientists Review the Scholarly Literature [4.406926847270567]
We examine the literature review practices of data scientists.
Data science represents a field seeing an exponential rise in papers.
No prior work has examined the specific practices and challenges faced by these scientists.
arXiv Detail & Related papers (2023-01-10T03:53:05Z)
- Artificial intelligence technologies to support research assessment: A review [10.203602318836444]
This literature review identifies indicators that associate with higher impact or higher quality research from article text.
It includes studies that used machine learning techniques to predict citation counts or quality scores for journal articles or conference papers.
arXiv Detail & Related papers (2022-12-11T06:58:39Z)
- Revise and Resubmit: An Intertextual Model of Text-based Collaboration in Peer Review [52.359007622096684]
Peer review is a key component of the publishing process in most fields of science.
Existing NLP studies focus on the analysis of individual texts, but editorial assistance often requires modeling interactions between pairs of texts.
arXiv Detail & Related papers (2022-04-22T16:39:38Z)
- Nanopublication-Based Semantic Publishing and Reviewing: A Field Study with Formalization Papers [0.5735035463793008]
We use the concept and technology of nanopublications for this endeavor.
We represent not just the submissions and final papers in this RDF-based format, but also the whole process in between.
We received 15 submissions from 18 authors, who then went through the whole publication process leading to the publication of their contributions in the special issue.
arXiv Detail & Related papers (2022-03-03T10:04:10Z)
- CitationIE: Leveraging the Citation Graph for Scientific Information Extraction [89.33938657493765]
We use the citation graph of referential links between citing and cited papers.
We observe a sizable improvement in end-to-end information extraction over the state-of-the-art.
arXiv Detail & Related papers (2021-06-03T03:00:12Z)
- What's New? Summarizing Contributions in Scientific Literature [85.95906677964815]
We introduce a new task of disentangled paper summarization, which seeks to generate separate summaries for the paper contributions and the context of the work.
We extend the S2ORC corpus of academic articles by adding disentangled "contribution" and "context" reference labels.
We propose a comprehensive automatic evaluation protocol which reports the relevance, novelty, and disentanglement of generated outputs.
arXiv Detail & Related papers (2020-11-06T02:23:01Z)
- Generating Knowledge Graphs by Employing Natural Language Processing and Machine Learning Techniques within the Scholarly Domain [1.9004296236396943]
We present a new architecture that takes advantage of Natural Language Processing and Machine Learning methods for extracting entities and relationships from research publications.
Within this research work, we tackle the challenge of knowledge extraction by employing several state-of-the-art Natural Language Processing and Text Mining tools.
We generated a scientific knowledge graph including 109,105 triples, extracted from 26,827 abstracts of papers within the Semantic Web domain.
arXiv Detail & Related papers (2020-10-28T08:31:40Z)