HalluCitation Matters: Revealing the Impact of Hallucinated References with 300 Hallucinated Papers in ACL Conferences
- URL: http://arxiv.org/abs/2601.18724v1
- Date: Mon, 26 Jan 2026 17:48:23 GMT
- Title: HalluCitation Matters: Revealing the Impact of Hallucinated References with 300 Hallucinated Papers in ACL Conferences
- Authors: Yusuke Sakai, Hidetaka Kamigaito, Taro Watanabe
- Abstract summary: We analyze all papers published at ACL, NAACL, and EMNLP in 2024 and 2025. Half of these papers were identified at EMNLP 2025, indicating that this issue is rapidly increasing. More than 100 such papers were accepted as main conference and Findings papers at EMNLP 2025, affecting the credibility.
- Score: 58.87954687016989
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Recently, we have often observed hallucinated citations or references that do not correspond to any existing work in papers under review, preprints, or published papers. Such hallucinated citations pose a serious concern to scientific reliability. When they appear in accepted papers, they may also negatively affect the credibility of conferences. In this study, we refer to hallucinated citations as "HalluCitation" and systematically investigate their prevalence and impact. We analyze all papers published at ACL, NAACL, and EMNLP in 2024 and 2025, including main conference, Findings, and workshop papers. Our analysis reveals that nearly 300 papers contain at least one HalluCitation, most of which were published in 2025. Notably, half of these papers were identified at EMNLP 2025, the most recent conference, indicating that this issue is rapidly increasing. Moreover, more than 100 such papers were accepted as main conference and Findings papers at EMNLP 2025, affecting the credibility.
Related papers
- Compound Deception in Elite Peer Review: A Failure Mode Taxonomy of 100 Fabricated Citations at NeurIPS 2025 [0.0]
Large language models (LLMs) are increasingly used in academic writing, yet they frequently hallucinate by generating citations to sources that do not exist. This study analyzes 100 AI-generated hallucinated citations that appeared in papers accepted by the 2025 Conference on Neural Information Processing Systems. Despite review by 3-5 expert researchers per paper, these fabricated citations evaded detection, appearing in 53 published papers.
arXiv Detail & Related papers (2026-02-05T17:43:35Z)
- Mapping the Increasing Use of LLMs in Scientific Papers [99.67983375899719]
We conduct the first systematic, large-scale analysis across 950,965 papers published between January 2020 and February 2024 on the arXiv, bioRxiv, and Nature portfolio journals.
Our findings reveal a steady increase in LLM usage, with the largest and fastest growth observed in Computer Science papers.
arXiv Detail & Related papers (2024-04-01T17:45:15Z)
- Position: AI/ML Influencers Have a Place in the Academic Process [82.2069685579588]
We investigate the role of social media influencers in enhancing the visibility of machine learning research.
We have compiled a comprehensive dataset of over 8,000 papers, spanning tweets from December 2018 to October 2023.
Our statistical and causal inference analysis reveals a significant increase in citations for papers endorsed by these influencers.
arXiv Detail & Related papers (2024-01-24T20:05:49Z)
- "Here's Your Evidence": False Consensus in Public Twitter Discussions of COVID-19 Science [50.08057052734799]
We estimate scientific consensus based on samples of abstracts from preprint servers.
We find that anti-consensus posts and users, though overall less numerous than pro-consensus ones, are vastly over-represented on Twitter.
arXiv Detail & Related papers (2024-01-24T06:16:57Z)
- Estimating the Causal Effect of Early ArXiving on Paper Acceptance [56.538813945721685]
We estimate the effect of arXiving a paper before the reviewing period (early arXiving) on its acceptance to the conference.
Our results suggest that early arXiving may have a small effect on a paper's chances of acceptance.
arXiv Detail & Related papers (2023-06-24T07:45:38Z)
- Forgotten Knowledge: Examining the Citational Amnesia in NLP [63.13508571014673]
We examine how far back in time we tend to go to cite papers, how that has changed over time, and what factors correlate with this citational attention/amnesia.
We show that around 62% of cited papers are from the immediate five years prior to publication, whereas only about 17% are more than ten years old.
We show that the median age and age diversity of cited papers were steadily increasing from 1990 to 2014, but since then, the trend has reversed, and current NLP papers have an all-time low temporal citation diversity.
arXiv Detail & Related papers (2023-05-29T18:30:34Z)
- Community-Driven Comprehensive Scientific Paper Summarization: Insight from cvpaper.challenge [23.10314444860379]
We organized a group of non-native English speakers to write summaries of papers presented at a computer vision conference.
We summarized a total of 2,000 papers presented at the Conference on Computer Vision and Pattern Recognition.
arXiv Detail & Related papers (2022-03-17T06:31:17Z)
- Dynamics of Cross-Platform Attention to Retracted Papers [25.179837269945015]
Retracted papers circulate widely on social media, digital news and other websites before their official retraction.
We quantify the amount and type of attention 3,851 retracted papers received over time in different online platforms.
arXiv Detail & Related papers (2021-10-15T01:40:20Z)
- Does the Venue of Scientific Conferences Leverage their Impact? A Large Scale study on Computer Science Conferences [2.8388425545775386]
We conducted a large scale analysis on the data extracted from 3,838 Computer Science conference series and over 2.5 million papers spanning more than 30 years of research.
To quantify the "touristicity" of a venue we exploited some indicators such as the size of the Wikipedia page for the city hosting the venue and other indexes from reports of the World Economic Forum.
Moreover, the almost linear correlation with the Tourist Service Infrastructure index attests to the specific importance of tourist/accommodation facilities in a given country.
arXiv Detail & Related papers (2021-05-31T09:51:39Z)