Related papers: How many preprints have actually been printed and why: a case study of computer science preprints on arXiv

URL: http://arxiv.org/abs/2308.01899v1
Date: Thu, 3 Aug 2023 17:56:16 GMT
Title: How many preprints have actually been printed and why: a case study of computer science preprints on arXiv
Authors: Jialiang Lin, Yao Yu, Yu Zhou, Zhiyang Zhou, Xiaodong Shi
Abstract summary: We quantify how many preprints have eventually been printed in peer-reviewed venues. Among those published manuscripts, some are published under different titles and without an update to their preprints on arXiv. In the field of computer science, published preprints feature adequate revisions, multiple authorship, detailed abstract and introduction, extensive and authoritative references and available source code.
Score: 9.783989953810725
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Preprints play an increasingly critical role in academic communities. There are many reasons driving researchers to post their manuscripts to preprint servers before formal submission to journals or conferences, but the use of preprints has also sparked considerable controversy, especially surrounding the claim of priority. In this paper, a case study of computer science preprints submitted to arXiv from 2008 to 2017 is conducted to quantify how many preprints have eventually been printed in peer-reviewed venues. Among those published manuscripts, some are published under different titles and without an update to their preprints on arXiv. In the case of these manuscripts, the traditional fuzzy matching method is incapable of mapping the preprint to the final published version. In view of this issue, we introduce a semantics-based mapping method with the employment of Bidirectional Encoder Representations from Transformers (BERT). With this new mapping method and a plurality of data sources, we find that 66% of all sampled preprints are published under unchanged titles and 11% are published under different titles and with other modifications. A further analysis was then performed to investigate why these preprints but not others were accepted for publication. Our comparison reveals that in the field of computer science, published preprints feature adequate revisions, multiple authorship, detailed abstract and introduction, extensive and authoritative references and available source code.
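The title-matching limitation mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' pipeline: it assumes a plain character-level similarity from Python's standard `difflib`, a hypothetical threshold of 0.9, and made-up example titles. It only shows why a string-based matcher finds a published version whose title is unchanged but misses one that was retitled, which is the gap a semantics-based method such as the BERT approach is meant to close.

```python
from difflib import SequenceMatcher


def normalize(title: str) -> str:
    """Lowercase a title and replace every non-alphanumeric character with a space."""
    cleaned = "".join(ch.lower() if ch.isalnum() else " " for ch in title)
    return " ".join(cleaned.split())


def title_similarity(a: str, b: str) -> float:
    """Character-level similarity in [0, 1] between two normalized titles."""
    return SequenceMatcher(None, normalize(a), normalize(b)).ratio()


def fuzzy_match(preprint_title, candidates, threshold=0.9):
    """Return the most similar candidate title, or None if it falls below the threshold.

    The threshold of 0.9 is an illustrative assumption, not a value from the paper.
    """
    best = max(
        candidates,
        key=lambda c: title_similarity(preprint_title, c),
        default=None,
    )
    if best is not None and title_similarity(preprint_title, best) >= threshold:
        return best
    return None


# Hypothetical titles, invented for this example.
preprint = "A Fast Method for Graph Clustering"
same_title = "a fast method for graph clustering."
retitled = "Scalable Community Detection in Large Networks"

matched = fuzzy_match(preprint, [same_title, retitled])  # finds the unchanged title
missed = fuzzy_match(preprint, [retitled])               # retitled version is not found
```

In this sketch a retitled paper simply yields no match, so any count of published preprints based on string similarity alone would undercount the preprints published under different titles.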

Related papers

Toward Reproducibility of Digital Twin Research: Exemplified with the PiCar-X [49.44419860570116]
Digital twins (DTs) are increasingly relevant in the Industrial Internet of Things and Industry 4.0. The concept of DTs lacks a unified definition and faces validation challenges. This paper presents a reproducible laboratory experiment that demonstrates various DT concepts.
arXiv Detail & Related papers (2024-08-25T15:34:00Z)
Mapping the Increasing Use of LLMs in Scientific Papers [99.67983375899719]
We conduct the first systematic, large-scale analysis across 950,965 papers published between January 2020 and February 2024 on the arXiv, bioRxiv, and Nature portfolio journals. Our findings reveal a steady increase in LLM usage, with the largest and fastest growth observed in Computer Science papers.
arXiv Detail & Related papers (2024-04-01T17:45:15Z)
CausalCite: A Causal Formulation of Paper Citations [80.82622421055734]
CausalCite is a new way to measure the significance of a paper by assessing the causal impact of the paper on its follow-up papers. It is based on a novel causal inference method, TextMatch, which adapts the traditional matching framework to high-dimensional text embeddings. We demonstrate the effectiveness of CausalCite on various criteria, such as high correlation with paper impact as reported by scientific experts.
arXiv Detail & Related papers (2023-11-05T23:09:39Z)
Estimating the Causal Effect of Early ArXiving on Paper Acceptance [56.538813945721685]
We estimate the effect of arXiving a paper before the reviewing period (early arXiving) on its acceptance to the conference. Our results suggest that early arXiving may have a small effect on a paper's chances of acceptance.
arXiv Detail & Related papers (2023-06-24T07:45:38Z)
Contrastive Attention Networks for Attribution of Early Modern Print [23.344655278038392]
We develop machine learning techniques to identify unknown printers in early modern (c. 1500-1800) English printed books. Specifically, we focus on matching uniquely damaged character type-imprints in anonymously printed books to works with known printers.
arXiv Detail & Related papers (2023-06-12T19:57:11Z)
Cracking Double-Blind Review: Authorship Attribution with Deep Learning [43.483063713471935]
We propose a transformer-based, neural-network architecture to attribute an anonymous manuscript to an author. We leverage all research papers publicly available on arXiv amounting to over 2 million manuscripts. Our method achieves an unprecedented authorship attribution accuracy, where up to 73% of papers are attributed correctly.
arXiv Detail & Related papers (2022-11-14T15:50:24Z)
Scientometric engineering: Exploring citation dynamics via arXiv eprints [0.0]
We investigate the citation data of more than 1.5 million eprints on arXiv. We find that the typical growth and obsolescence patterns vary across disciplines. We derive a model consistent with the observed quantitative and temporal characteristics of citation growth and obsolescence.
arXiv Detail & Related papers (2021-06-09T12:38:44Z)
Enhancing Scientific Papers Summarization with Citation Graph [78.65955304229863]
We redefine the task of scientific papers summarization by utilizing their citation graph. We construct a novel scientific papers summarization dataset Semantic Scholar Network (SSN) which contains 141K research papers in different domains. Our model can achieve competitive performance when compared with the pretrained models.
arXiv Detail & Related papers (2021-04-07T11:13:35Z)
Is preprint the future of science? A thirty year journey of online preprint services [7.063908865620109]
A preprint is a version of a scientific paper that is publicly distributed before formal peer review. Since the launch of arXiv in 1991, preprints have increasingly been distributed over the Internet rather than as paper copies.
arXiv Detail & Related papers (2021-02-17T23:08:01Z)
Preprints as accelerator of scholarly communication: An empirical analysis in Mathematics [9.899221738408581]
We measure two effects associated with preprint publishing: publication delay and impact. Articles with preprint versions are more likely to be mentioned on social media and have a shorter Altmetric attention delay.
arXiv Detail & Related papers (2020-11-24T07:32:35Z)

This list is automatically generated from the titles and abstracts of the papers on this site.
