Is there really a Citation Age Bias in NLP?
- URL: http://arxiv.org/abs/2401.03545v1
- Date: Sun, 7 Jan 2024 17:12:08 GMT
- Title: Is there really a Citation Age Bias in NLP?
- Authors: Hoa Nguyen and Steffen Eger
- Abstract summary: There is a citation age bias in the Natural Language Processing (NLP) community.
All AI subfields have similar trends of citation amnesia.
Rather than diagnosing this as a citation age bias in the NLP community, we believe this pattern is an artefact of the dynamics of these research fields.
- Score: 25.867690917154885
- License: http://creativecommons.org/licenses/by-sa/4.0/
- Abstract: Citations are a key ingredient of scientific research to relate a paper to
others published in the community. Recently, it has been noted that there is a
citation age bias in the Natural Language Processing (NLP) community, one of
the currently fastest growing AI subfields, in that the mean age of the
bibliography of NLP papers has become ever younger in the last few years,
leading to 'citation amnesia' in which older knowledge is increasingly
forgotten. In this work, we put such claims into perspective by analyzing the
bibliographies of ~300k papers across 15 different scientific fields
submitted to the popular preprint server arXiv in the time period from 2013 to
2022. We find that all AI subfields (in particular: cs.AI, cs.CL, cs.CV, cs.LG)
have similar trends of citation amnesia, in which the age of the bibliography
has roughly halved in the last 10 years (from above 12 in 2013 to below 7 in
2022), on average. Rather than diagnosing this as a citation age bias in the
NLP community, we believe this pattern is an artefact of the dynamics of these
research fields, in which new knowledge is produced in ever shorter time
intervals.
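The headline trend can be illustrated with a minimal sketch of the underlying metric: the mean age of a paper's bibliography is the average difference between the citing paper's publication year and the years of the works it cites. The two example bibliographies below are hypothetical, chosen only to mirror the reported shift from a mean age above 12 in 2013 to below 7 in 2022.

```python
from statistics import mean

def mean_citation_age(paper_year, cited_years):
    """Mean bibliography age: for each cited work,
    age = citing paper's year minus cited work's year."""
    return mean(paper_year - y for y in cited_years)

# Hypothetical bibliographies mirroring the reported trend.
paper_2013 = (2013, [1995, 2000, 2001, 2003, 2006])
paper_2022 = (2022, [2015, 2017, 2019, 2020, 2021])

print(mean_citation_age(*paper_2013))  # 12.0
print(mean_citation_age(*paper_2022))  # 3.6
```

Averaging this quantity over all papers in a field and year yields the per-field curves the study compares.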
Related papers
- The Nature of NLP: Analyzing Contributions in NLP Papers [77.31665252336157]
We quantitatively investigate what constitutes NLP research by examining research papers.
Our findings reveal a rising involvement of machine learning in NLP since the early nineties.
Post-2020, there has been a resurgence of focus on language and people.
arXiv Detail & Related papers (2024-09-29T01:29:28Z)
- Mapping the Increasing Use of LLMs in Scientific Papers [99.67983375899719]
We conduct the first systematic, large-scale analysis across 950,965 papers published between January 2020 and February 2024 on the arXiv, bioRxiv, and Nature portfolio journals.
Our findings reveal a steady increase in LLM usage, with the largest and fastest growth observed in Computer Science papers.
arXiv Detail & Related papers (2024-04-01T17:45:15Z)
- Citation Amnesia: NLP and Other Academic Fields Are in a Citation Age Recession [32.77640515002326]
This study examines the tendency to cite older work across 20 fields of study over 43 years (1980–2023).
We term this decline a 'citation age recession', analogous to how economists define periods of reduced economic activity.
Our results suggest that citing more recent works is not directly driven by the growth in publication rates.
arXiv Detail & Related papers (2024-02-19T10:59:29Z)
- NLLG Quarterly arXiv Report 06/23: What are the most influential current AI Papers? [15.830129136642755]
The objective is to offer a quick guide to the most relevant and widely discussed research, aiding both newcomers and established researchers in staying abreast of current trends.
We observe the dominance of papers related to Large Language Models (LLMs) and specifically ChatGPT during the first half of 2023.
NLP-related papers are the most influential (around 60% of top papers), even though there are twice as many ML-related papers in our data.
arXiv Detail & Related papers (2023-07-31T11:53:52Z)
- Artificial intelligence adoption in the physical sciences, natural sciences, life sciences, social sciences and the arts and humanities: A bibliometric analysis of research publications from 1960-2021 [73.06361680847708]
In 1960 14% of 333 research fields were related to AI (many in computer science), but this increased to over half of all research fields by 1972, over 80% by 1986 and over 98% in current times.
We conclude that the context of the current surge appears different, and that interdisciplinary AI application is likely to be sustained.
arXiv Detail & Related papers (2023-06-15T14:08:07Z)
- Forgotten Knowledge: Examining the Citational Amnesia in NLP [63.13508571014673]
We ask how far back in time we tend to go to cite papers, how that has changed over time, and what factors correlate with this citational attention/amnesia.
We show that around 62% of cited papers are from the immediate five years prior to publication, whereas only about 17% are more than ten years old.
We show that the median age and age diversity of cited papers were steadily increasing from 1990 to 2014, but since then, the trend has reversed, and current NLP papers have an all-time low temporal citation diversity.
arXiv Detail & Related papers (2023-05-29T18:30:34Z)
- Geographic Citation Gaps in NLP Research [63.13508571014673]
This work asks a series of questions on the relationship between geographical location and publication success.
We first created a dataset of 70,000 papers from the ACL Anthology, extracted their meta-information, and generated their citation network.
We show that not only are there substantial geographical disparities in paper acceptance and citation but also that these disparities persist even when controlling for a number of variables such as venue of publication and sub-field of NLP.
arXiv Detail & Related papers (2022-10-26T02:25:23Z)
- Improving Wikipedia Verifiability with AI [116.69749668874493]
We develop a neural-network-based system, called Side, to identify Wikipedia citations that are unlikely to support their claims.
The system's first citation recommendation collects over 60% more preferences than existing Wikipedia citations for the same top 10% most likely unverifiable claims.
Our results indicate that an AI-based system could be used, in tandem with humans, to improve the verifiability of Wikipedia.
arXiv Detail & Related papers (2022-07-08T15:23:29Z)
- Examining Citations of Natural Language Processing Literature [31.87319293259599]
We show that only about 56% of the papers in the ACL Anthology (AA) are cited ten or more times.
CL Journal has the most cited papers, but its citation dominance has lessened in recent years.
Papers on sentiment classification, anaphora resolution, and entity recognition have the highest median citations.
arXiv Detail & Related papers (2020-05-02T20:01:59Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.