ChatGPT cites the most-cited articles and journals, relying solely on
Google Scholar's citation counts. As a result, AI may amplify the Matthew
Effect in environmental science
- URL: http://arxiv.org/abs/2304.06794v1
- Date: Thu, 13 Apr 2023 19:29:49 GMT
- Title: ChatGPT cites the most-cited articles and journals, relying solely on
Google Scholar's citation counts. As a result, AI may amplify the Matthew
Effect in environmental science
- Authors: Eduard Petiska
- Abstract summary: ChatGPT tends to cite highly-cited publications in environmental science.
Google Scholar citations play a significant role as a predictor for mentioning a study in GPT-generated content.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: ChatGPT (GPT) has become one of the most talked-about innovations in recent
years, with over 100 million users worldwide. However, there is still limited
knowledge about the sources of information GPT utilizes. As a result, we
carried out a study focusing on the sources of information within the field of
environmental science. In our study, we asked GPT to identify the ten most
significant subdisciplines within the field of environmental science. We then
asked it to compose a scientific review article on each subdiscipline,
including 25 references. We proceeded to analyze these references, focusing on
factors such as the number of citations, publication date, and the journal in
which the work was published. Our findings indicate that GPT tends to cite
highly-cited publications in environmental science, with a median citation
count of 1184.5. It also exhibits a preference for older publications, with a
median publication year of 2010, and predominantly refers to well-respected
journals in the field, with Nature being the most cited journal by GPT.
Interestingly, our findings suggest that GPT seems to exclusively rely on
citation count data from Google Scholar for the works it cites, rather than
utilizing citation information from other scientific databases such as Web of
Science or Scopus. In conclusion, our study suggests that Google Scholar
citations play a significant role as a predictor for mentioning a study in
GPT-generated content. This finding reinforces the dominance of Google Scholar
among scientific databases and perpetuates the Matthew Effect in science, where
the rich get richer in terms of citations. With many scholars already utilizing
GPT for literature review purposes, we can anticipate further disparities and
an expanding gap between lesser-cited and highly-cited publications.
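The reference analysis described in the abstract (median citation count, median publication year, most frequently cited journal) can be illustrated with a minimal Python sketch. The records, field names, and the `summarize` helper below are hypothetical placeholders chosen for illustration. They are not data or code from the study, which does not describe its tooling.

```python
from collections import Counter
from statistics import median

# Hypothetical reference records (placeholder values, NOT data from the study).
references = [
    {"journal": "Journal A", "year": 2005, "scholar_citations": 1500},
    {"journal": "Journal B", "year": 2012, "scholar_citations": 800},
    {"journal": "Journal A", "year": 2010, "scholar_citations": 2400},
    {"journal": "Journal C", "year": 2015, "scholar_citations": 300},
]


def summarize(refs):
    """Return the median citation count, median year, and most cited journal."""
    citations = [r["scholar_citations"] for r in refs]
    years = [r["year"] for r in refs]
    # Count how often each journal appears among the references.
    journal_counts = Counter(r["journal"] for r in refs)
    top_journal, top_count = journal_counts.most_common(1)[0]
    return {
        "median_citations": median(citations),
        "median_year": median(years),
        "top_journal": top_journal,
        "top_journal_count": top_count,
    }


summary = summarize(references)
print(summary)
```

With an even number of records, `statistics.median` averages the two middle values, which is one way a fractional median citation count such as the 1184.5 reported above can arise.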
Related papers
- Mapping the Increasing Use of LLMs in Scientific Papers [99.67983375899719]
We conduct the first systematic, large-scale analysis across 950,965 papers published between January 2020 and February 2024 on the arXiv, bioRxiv, and Nature portfolio journals.
Our findings reveal a steady increase in LLM usage, with the largest and fastest growth observed in Computer Science papers.
(2024-04-01T17:45:15Z)
- Forgotten Knowledge: Examining the Citational Amnesia in NLP [63.13508571014673]
We ask how far back in time we tend to go to cite papers, how that has changed over time, and what factors correlate with this citational attention/amnesia.
We show that around 62% of cited papers are from the immediate five years prior to publication, whereas only about 17% are more than ten years old.
We show that the median age and age diversity of cited papers were steadily increasing from 1990 to 2014, but since then, the trend has reversed, and current NLP papers have an all-time low temporal citation diversity.
(2023-05-29T18:30:34Z)
- Geographic Citation Gaps in NLP Research [63.13508571014673]
This work asks a series of questions on the relationship between geographical location and publication success.
We first created a dataset of 70,000 papers from the ACL Anthology, extracted their meta-information, and generated their citation network.
We show that not only are there substantial geographical disparities in paper acceptance and citation but also that these disparities persist even when controlling for a number of variables such as venue of publication and sub-field of NLP.
(2022-10-26T02:25:23Z)
- Detecting and analyzing missing citations to published scientific entities [5.811229506383401]
We design a special method, Citation Recommendation for Published Scientific Entity (CRPSE), based on the co-occurrences between published scientific entities and in-text citations.
We conduct a statistical analysis on missing citations among papers published in prestigious computer science conferences in 2020.
On a median basis, the papers proposing these published scientific entities with missing citations were published 8 years ago.
(2022-10-18T18:08:20Z)
- CitationIE: Leveraging the Citation Graph for Scientific Information Extraction [89.33938657493765]
We use the citation graph of referential links between citing and cited papers.
We observe a sizable improvement in end-to-end information extraction over the state-of-the-art.
(2021-06-03T03:00:12Z)
- A Measure of Research Taste [91.3755431537592]
We present a citation-based measure that rewards both productivity and taste.
The presented measure, CAP, balances the impact of publications and their quantity.
We analyze the characteristics of CAP for highly-cited researchers in biology, computer science, economics, and physics.
(2021-05-17T18:01:47Z)
- A Graph Convolutional Neural Network based Framework for Estimating Future Citations Count of Research Articles [0.03937354192623676]
We propose a Graph Convolutional Network (GCN) based framework for estimating future research publication citations for both the short-term (1-year) and long-term (for 5-years and 10-years) duration.
We have tested our proposed approach over the AMiner dataset, specifically on research articles from the computer science domain, consisting of more than 0.8 million articles.
(2021-04-11T07:20:53Z)
- Enhancing Scientific Papers Summarization with Citation Graph [78.65955304229863]
We redefine the task of scientific papers summarization by utilizing their citation graph.
We construct a novel scientific papers summarization dataset, Semantic Scholar Network (SSN), which contains 141K research papers in different domains.
Our model can achieve competitive performance when compared with the pretrained models.
(2021-04-07T11:13:35Z)
- Utilizing Citation Network Structure to Predict Citation Counts: A Deep Learning Approach [0.0]
This paper proposes an end-to-end deep learning network, DeepCCP, which combines the effect of information cascade and looks at the citation counts prediction problem.
According to experiments on 6 real data sets, DeepCCP is superior to the state-of-the-art methods in terms of the accuracy of citation count prediction.
(2020-09-06T05:27:50Z)
- A Decade of In-text Citation Analysis based on Natural Language Processing and Machine Learning Techniques: An overview of empirical studies [3.474275085556876]
Information scientists have gone far beyond traditional bibliometrics by tapping into advancements in full-text data processing techniques.
This article aims to narratively review the studies on these developments.
Its primary focus is on publications that have used natural language processing and machine learning techniques to analyse citations.
(2020-08-29T17:27:08Z)
- Measuring prominence of scientific work in online news as a proxy for impact [15.772621977756058]
We present a new corpus of newspaper articles linked to the scientific papers that they describe.
We find that Impact Case studies submitted to the UK Research Excellence Framework (REF) 2014 that refer to scientific papers mentioned in newspaper articles were awarded a higher score.
This supports our hypothesis that linguistic prominence in news can be used to suggest the wider non-academic impact of scientific work.
(2020-07-28T19:52:21Z)
This list is automatically generated from the titles and abstracts of the papers in this site.