Related papers: Hidden Citations Obscure True Impact in Science

URL: http://arxiv.org/abs/2310.16181v2
Date: Sat, 11 May 2024 19:51:45 GMT
Title: Hidden Citations Obscure True Impact in Science
Authors: Xiangyi Meng, Onur Varol, Albert-László Barabási
Abstract summary: When a discovery becomes common knowledge, citations suffer from obliteration by incorporation. Here, we rely on unsupervised interpretable machine learning applied to the full text of each paper to systematically identify hidden citations. We show that the prevalence of hidden citations is not driven by citation counts, but by the degree of the discourse on the topic within the text of the manuscripts.
Score: 1.5279567721070433
License: http://creativecommons.org/licenses/by/4.0/
Abstract: References, the mechanism scientists rely on to signal previous knowledge, lately have turned into widely used and misused measures of scientific impact. Yet, when a discovery becomes common knowledge, citations suffer from obliteration by incorporation. This leads to the concept of hidden citation, representing a clear textual credit to a discovery without a reference to the publication embodying it. Here, we rely on unsupervised interpretable machine learning applied to the full text of each paper to systematically identify hidden citations. We find that for influential discoveries hidden citations outnumber citation counts, emerging regardless of publishing venue and discipline. We show that the prevalence of hidden citations is not driven by citation counts, but rather by the degree of the discourse on the topic within the text of the manuscripts, indicating that the more discussed is a discovery, the less visible it is to standard bibliometric analysis. Hidden citations indicate that bibliometric measures offer a limited perspective on quantifying the true impact of a discovery, raising the need to extract knowledge from the full text of the scientific corpus.

Related papers

In-depth Research Impact Summarization through Fine-Grained Temporal Citation Analysis [52.42612945266194]
We propose a new task: generating nuanced, expressive, and time-aware impact summaries. We show that these summaries capture both praise (confirmation citations) and critique (correction citations) through the evolution of fine-grained citation intents.
arXiv Detail & Related papers (2025-05-20T19:11:06Z)
The Noisy Path from Source to Citation: Measuring How Scholars Engage with Past Research [20.649638393774048]
We introduce a computational pipeline to quantify citation fidelity at scale. Using full texts of papers, the pipeline identifies citations in citing papers and the corresponding claims in cited papers. Using a quasi-experiment, we establish the "telephone effect" - when citing papers have low fidelity to the original claim, future papers that cite the citing paper and the original have lower fidelity to the original.
arXiv Detail & Related papers (2025-02-27T22:47:03Z)
ALiiCE: Evaluating Positional Fine-grained Citation Generation [54.19617927314975]
We propose ALiiCE, the first automatic evaluation framework for fine-grained citation generation. Our framework first parses the sentence claim into atomic claims via dependency analysis and then calculates citation quality at the atomic claim level. We evaluate the positional fine-grained citation generation performance of several Large Language Models on two long-form QA datasets.
arXiv Detail & Related papers (2024-06-19T09:16:14Z)
Uncited articles and their effect on the concentration of citations [0.0]
Empirical evidence shows that citations received by scholarly publications follow a pattern of preferential attachment, resulting in a power-law distribution. Are citations becoming more concentrated in a small number of articles? Or have recent geopolitical and technical changes in science led to more decentralized distributions? This article explores how reference-based and citation-based approaches, uncited articles, citation inflation, the expansion of bibliometric databases, disciplinary differences, and self-citations affect the evolution of citation concentration.
arXiv Detail & Related papers (2023-06-16T15:38:12Z)
Detecting and analyzing missing citations to published scientific entities [5.811229506383401]
We design a special method Citation Recommendation for Published Scientific Entity (CRPSE) based on the cooccurrences between published scientific entities and in-text citations. We conduct a statistical analysis on missing citations among papers published in prestigious computer science conferences in 2020. On a median basis, the papers proposing these published scientific entities with missing citations were published 8 years ago.
arXiv Detail & Related papers (2022-10-18T18:08:20Z)
Deep Graph Learning for Anomalous Citation Detection [55.81334139806342]
We propose a novel deep graph learning model, namely GLAD (Graph Learning for Anomaly Detection), to identify anomalies in citation networks. Within the GLAD framework, we propose an algorithm called CPU (Citation PUrpose) to discover the purpose of citation based on citation texts.
arXiv Detail & Related papers (2022-02-23T09:05:28Z)
Towards generating citation sentences for multiple references with intent control [86.53829532976303]
We build a novel generation model with the Fusion-in-Decoder approach to cope with multiple long inputs. Experiments demonstrate that the proposed approaches provide much more comprehensive features for generating citation sentences.
arXiv Detail & Related papers (2021-12-02T15:32:24Z)
CitationIE: Leveraging the Citation Graph for Scientific Information Extraction [89.33938657493765]
We use the citation graph of referential links between citing and cited papers. We observe a sizable improvement in end-to-end information extraction over the state-of-the-art.
arXiv Detail & Related papers (2021-06-03T03:00:12Z)
Semantic Analysis for Automated Evaluation of the Potential Impact of Research Articles [62.997667081978825]
This paper presents a novel method for vector representation of text meaning based on information theory. We show how this informational semantics is used for text classification on the basis of the Leicester Scientific Corpus. We show that an informational approach to representing the meaning of a text has offered a way to effectively predict the scientific impact of research papers.
arXiv Detail & Related papers (2021-04-26T20:37:13Z)
Citations are not opinions: a corpus linguistics approach to understanding how citations are made [0.0]
A key issue in citation content analysis is identifying linguistic structures that characterize distinct classes of citations. In this study, we start with a large sample of a pre-classified citation corpus, 2 million citations from each class of the scite Smart Citation dataset. By generating comparison tables for each citation type, we present a number of interesting linguistic features that uniquely characterize each citation type.
arXiv Detail & Related papers (2021-04-16T12:52:27Z)
Enhancing Scientific Papers Summarization with Citation Graph [78.65955304229863]
We redefine the task of scientific papers summarization by utilizing their citation graph. We construct a novel scientific papers summarization dataset Semantic Scholar Network (SSN) which contains 141K research papers in different domains. Our model can achieve competitive performance when compared with the pretrained models.
arXiv Detail & Related papers (2021-04-07T11:13:35Z)

This list is automatically generated from the titles and abstracts of the papers on this site.