Citation Amnesia: On The Recency Bias of NLP and Other Academic Fields
- URL: http://arxiv.org/abs/2402.12046v2
- Date: Fri, 13 Dec 2024 12:50:18 GMT
- Title: Citation Amnesia: On The Recency Bias of NLP and Other Academic Fields
- Authors: Jan Philip Wahle, Terry Ruas, Mohamed Abdalla, Bela Gipp, Saif M. Mohammad
- Abstract summary: This study examines the tendency to cite older work across 20 fields of study over 43 years (1980-2023).
We term this decline a 'citation age recession', analogous to how economists define periods of reduced economic activity.
Our results suggest that citing more recent works is not directly driven by the growth in publication rates.
- Score: 30.550895983110806
- Abstract: This study examines the tendency to cite older work across 20 fields of study over 43 years (1980-2023). We put NLP's propensity to cite older work in the context of these 20 other fields to analyze whether NLP shows similar temporal citation patterns to these other fields over time or whether differences can be observed. Our analysis, based on a dataset of approximately 240 million papers, reveals a broader scientific trend: many fields have markedly declined in citing older works (e.g., psychology, computer science). We term this decline a 'citation age recession', analogous to how economists define periods of reduced economic activity. The trend is strongest in NLP and ML research (-12.8% and -5.5% in citation age from previous peaks). Our results suggest that citing more recent works is not directly driven by the growth in publication rates (-3.4% across fields; -5.2% in humanities; -5.5% in formal sciences), even when controlling for an increase in the volume of papers. Our findings raise questions about the scientific community's engagement with past literature, particularly for NLP, and the potential consequences of neglecting older but relevant research. The data and a demo showcasing our results are publicly available.
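The abstract reports changes "in citation age from previous peaks" without spelling out the computation on this page. The following is a minimal sketch of how such a figure could be derived, assuming (this is not confirmed by the source) that citation age is the citing paper's publication year minus the cited paper's publication year, and that the decline is the percent change of the latest yearly mean relative to the highest earlier yearly mean. The function names and the toy numbers are hypothetical and are not taken from the paper's data.

```python
from statistics import mean


def mean_citation_age(citing_year, cited_years):
    """Mean age of a paper's references (assumed definition:
    citing year minus cited year, in years)."""
    return mean(citing_year - year for year in cited_years)


def decline_from_peak(yearly_mean_age):
    """Percent change of the latest year's mean citation age relative
    to the highest mean citation age observed in any earlier year.

    yearly_mean_age: dict mapping year -> mean citation age.
    """
    years = sorted(yearly_mean_age)
    latest = yearly_mean_age[years[-1]]
    previous_peak = max(yearly_mean_age[y] for y in years[:-1])
    return 100.0 * (latest - previous_peak) / previous_peak


# Toy example (hypothetical values, for illustration only).
paper_age = mean_citation_age(2023, [2021, 2019, 2010, 2022])
toy_series = {2010: 8.0, 2014: 10.0, 2023: 9.0}
change = decline_from_peak(toy_series)
print(paper_age, change)
```

A negative value from `decline_from_peak` would correspond to what the abstract calls a citation age recession: the field's references are, on average, younger than at an earlier peak.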
Related papers
- The Nature of NLP: Analyzing Contributions in NLP Papers [77.31665252336157]
We quantitatively investigate what constitutes NLP research by examining research papers.
Our findings reveal a rising involvement of machine learning in NLP since the early nineties.
Since 2020, there has been a resurgence of focus on language and people.
arXiv Detail & Related papers (2024-09-29T01:29:28Z)
- Mapping the Increasing Use of LLMs in Scientific Papers [99.67983375899719]
We conduct the first systematic, large-scale analysis across 950,965 papers published between January 2020 and February 2024 on the arXiv, bioRxiv, and Nature portfolio journals.
Our findings reveal a steady increase in LLM usage, with the largest and fastest growth observed in Computer Science papers.
arXiv Detail & Related papers (2024-04-01T17:45:15Z)
- Is there really a Citation Age Bias in NLP? [25.867690917154885]
There is a citation age bias in the Natural Language Processing (NLP) community.
All AI subfields have similar trends of citation amnesia.
Rather than diagnosing this as a citation age bias in the NLP community, we believe this pattern is an artefact of the dynamics of these research fields.
arXiv Detail & Related papers (2024-01-07T17:12:08Z)
- NLLG Quarterly arXiv Report 09/23: What are the most influential current AI Papers? [21.68589129842815]
The US dominates among both top-40 and top-9k papers, followed by China.
Europe clearly lags behind and is hardly represented in the top-40 most cited papers.
US industry is largely overrepresented in the top-40 most influential papers.
arXiv Detail & Related papers (2023-12-09T21:42:20Z)
- We are Who We Cite: Bridges of Influence Between Natural Language Processing and Other Academic Fields [30.550895983110806]
Cross-field engagement of Natural Language Processing has declined.
Less than 8% of NLP citations are to linguistics.
Less than 3% of NLP citations are to math and psychology.
arXiv Detail & Related papers (2023-10-23T12:42:06Z)
- Forgotten Knowledge: Examining the Citational Amnesia in NLP [63.13508571014673]
We examine how far back in time we tend to go to cite papers, how that has changed over time, and what factors correlate with this citational attention/amnesia.
We show that around 62% of cited papers are from the immediate five years prior to publication, whereas only about 17% are more than ten years old.
We show that the median age and age diversity of cited papers were steadily increasing from 1990 to 2014, but since then, the trend has reversed, and current NLP papers have an all-time low temporal citation diversity.
arXiv Detail & Related papers (2023-05-29T18:30:34Z)
- The Elephant in the Room: Analyzing the Presence of Big Tech in Natural Language Processing Research [28.382353702576314]
We use a corpus with comprehensive metadata of 78,187 NLP publications and 701 resumes of NLP publication authors.
We find that industry presence among NLP authors was steady before a steep increase over the past five years.
A few companies account for most of the publications and provide funding to academic researchers through grants and internships.
arXiv Detail & Related papers (2023-05-04T12:57:18Z)
- Geographic Citation Gaps in NLP Research [63.13508571014673]
This work asks a series of questions on the relationship between geographical location and publication success.
We first created a dataset of 70,000 papers from the ACL Anthology, extracted their meta-information, and generated their citation network.
We show that not only are there substantial geographical disparities in paper acceptance and citation but also that these disparities persist even when controlling for a number of variables such as venue of publication and sub-field of NLP.
arXiv Detail & Related papers (2022-10-26T02:25:23Z)
- State-of-the-art generalisation research in NLP: A taxonomy and review [87.1541712509283]
We present a taxonomy for characterising and understanding generalisation research in NLP.
Our taxonomy is based on an extensive literature review of generalisation research.
We use our taxonomy to classify over 400 papers that test generalisation.
arXiv Detail & Related papers (2022-10-06T16:53:33Z)
This list is automatically generated from the titles and abstracts of the papers on this site.