Detecting and analyzing missing citations to published scientific entities
- URL: http://arxiv.org/abs/2210.10073v1
- Date: Tue, 18 Oct 2022 18:08:20 GMT
- Title: Detecting and analyzing missing citations to published scientific entities
- Authors: Jialiang Lin, Yao Yu, Jiaxin Song, Xiaodong Shi
- Abstract summary: We design a method, Citation Recommendation for Published Scientific Entity (CRPSE), based on the co-occurrences between published scientific entities and in-text citations.
We conduct a statistical analysis on missing citations among papers published in prestigious computer science conferences in 2020.
The papers proposing these published scientific entities with missing citations were published a median of 8 years earlier.
- Score: 5.811229506383401
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Proper citation is of great importance in academic writing because it
enables knowledge accumulation and maintains academic integrity. However, citing
properly is not an easy task. For published scientific entities, the ever-growing
body of academic publications and over-familiarity with terms easily lead to
missing citations. To address this problem, we design a method, Citation
Recommendation for Published Scientific Entity (CRPSE), based on the
co-occurrences between published scientific entities and in-text citations in
the same sentences of prior publications. Experimental results show that our
method is effective in recommending the source papers for published scientific
entities. We further conduct a statistical analysis of missing citations among
papers published at prestigious computer science conferences in 2020. In the
12,278 papers collected, 475 published scientific entities from computer science
and mathematics are found to have missing citations. Many entities mentioned
without citations turn out to be well-accepted research results. The papers
proposing these entities were published a median of 8 years earlier, which can
be considered the time frame for a published scientific entity to develop into a
well-accepted concept. For published scientific entities, we call for accurate
and complete citation of their source papers, as required by academic standards.
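The paper's implementation details are not given here; as a minimal illustrative sketch of the co-occurrence idea the abstract describes (entity mentions and in-text citations counted within the same sentence, with source papers ranked by co-occurrence frequency), one might write the following. The function names, data layout, and toy corpus are all assumptions, not the authors' actual CRPSE pipeline.

```python
from collections import defaultdict, Counter

def build_cooccurrence_index(sentences):
    """Count how often each entity co-occurs with each cited paper
    in the same sentence. `sentences` is an iterable of
    (entities_mentioned, papers_cited) pairs, one per sentence."""
    index = defaultdict(Counter)
    for entities, citations in sentences:
        for entity in entities:
            for cited in citations:
                index[entity][cited] += 1
    return index

def recommend_sources(index, entity, k=3):
    """Return the k source papers most frequently co-cited with an entity."""
    return [paper for paper, _ in index[entity].most_common(k)]

# Toy corpus: (entities mentioned, papers cited) per sentence.
corpus = [
    ({"BERT"}, {"devlin2019"}),
    ({"BERT", "Transformer"}, {"devlin2019", "vaswani2017"}),
    ({"Transformer"}, {"vaswani2017"}),
]

index = build_cooccurrence_index(corpus)
print(recommend_sources(index, "BERT", k=1))  # ['devlin2019']
```

In this sketch, an entity mentioned without any citation would simply be looked up in the index to retrieve its most likely source paper, which mirrors how missing citations could be flagged and filled.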
Related papers
- Mapping the Increasing Use of LLMs in Scientific Papers [99.67983375899719]
We conduct the first systematic, large-scale analysis across 950,965 papers published between January 2020 and February 2024 on the arXiv, bioRxiv, and Nature portfolio journals.
Our findings reveal a steady increase in LLM usage, with the largest and fastest growth observed in Computer Science papers.
arXiv Detail & Related papers (2024-04-01T17:45:15Z)
- Hidden Citations Obscure True Impact in Science [1.5279567721070433]
When a discovery becomes common knowledge, citations suffer from obliteration by incorporation.
Here, we rely on unsupervised interpretable machine learning applied to the full text of each paper to systematically identify hidden citations.
We show that the prevalence of hidden citations is not driven by citation counts, but by the degree of the discourse on the topic within the text of the manuscripts.
arXiv Detail & Related papers (2023-10-24T20:58:07Z)
- ChatGPT cites the most-cited articles and journals, relying solely on Google Scholar's citation counts. As a result, AI may amplify the Matthew Effect in environmental science [0.0]
ChatGPT tends to cite highly cited publications in environmental science.
Google Scholar citation counts are a significant predictor of whether a study is mentioned in GPT-generated content.
arXiv Detail & Related papers (2023-04-13T19:29:49Z)
- Modeling Information Change in Science Communication with Semantically Matched Paraphrases [50.67030449927206]
SPICED is the first paraphrase dataset of scientific findings annotated for degree of information change.
SPICED contains 6,000 scientific finding pairs extracted from news stories, social media discussions, and full texts of original papers.
Models trained on SPICED improve downstream performance on evidence retrieval for fact checking of real-world scientific claims.
arXiv Detail & Related papers (2022-10-24T07:44:38Z)
- CitationIE: Leveraging the Citation Graph for Scientific Information Extraction [89.33938657493765]
We use the citation graph of referential links between citing and cited papers.
We observe a sizable improvement in end-to-end information extraction over the state-of-the-art.
arXiv Detail & Related papers (2021-06-03T03:00:12Z)
- A Measure of Research Taste [91.3755431537592]
We present a citation-based measure that rewards both productivity and taste.
The presented measure, CAP, balances the impact of publications and their quantity.
We analyze the characteristics of CAP for highly-cited researchers in biology, computer science, economics, and physics.
arXiv Detail & Related papers (2021-05-17T18:01:47Z)
- Semantic Analysis for Automated Evaluation of the Potential Impact of Research Articles [62.997667081978825]
This paper presents a novel method for vector representation of text meaning based on information theory.
We show how this informational semantics is used for text classification on the basis of the Leicester Scientific Corpus.
We show that this informational approach to representing the meaning of a text offers an effective way to predict the scientific impact of research papers.
arXiv Detail & Related papers (2021-04-26T20:37:13Z)
- Enhancing Scientific Papers Summarization with Citation Graph [78.65955304229863]
We redefine the task of scientific papers summarization by utilizing their citation graph.
We construct a novel scientific papers summarization dataset Semantic Scholar Network (SSN) which contains 141K research papers in different domains.
Our model can achieve competitive performance when compared with the pretrained models.
arXiv Detail & Related papers (2021-04-07T11:13:35Z)
- Utilizing Citation Network Structure to Predict Citation Counts: A Deep Learning Approach [0.0]
This paper proposes an end-to-end deep learning network, DeepCCP, which incorporates the effect of information cascades to address the citation count prediction problem.
In experiments on 6 real data sets, DeepCCP outperforms state-of-the-art methods in the accuracy of citation count prediction.
arXiv Detail & Related papers (2020-09-06T05:27:50Z)
- Attention: to Better Stand on the Shoulders of Giants [34.5017808610466]
This paper develops an attention mechanism for long-term scientific impact prediction.
It validates the method on a real large-scale citation data set.
arXiv Detail & Related papers (2020-05-27T00:25:51Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of this information and is not responsible for any consequences of its use.