Big Tech-Funded AI Papers Have Higher Citation Impact, Greater Insularity, and Larger Recency Bias
- URL: http://arxiv.org/abs/2512.05714v1
- Date: Fri, 05 Dec 2025 13:41:29 GMT
- Title: Big Tech-Funded AI Papers Have Higher Citation Impact, Greater Insularity, and Larger Recency Bias
- Authors: Max Martin Gnewuch, Jan Philip Wahle, Terry Ruas, Bela Gipp
- Abstract summary: We analyze about 49.8K papers, about 1.8M citations from AI papers to other papers, and about 2.3M citations from other papers to AI papers from 1998-2022 in Scopus. Our findings reveal that industry presence has grown markedly since 2015, from less than 2 percent to more than 11 percent in 2020. Industry-funded research is increasingly insular, citing predominantly other industry-funded papers while referencing fewer non-funded papers.
- Score: 11.267285650500737
- License: http://creativecommons.org/licenses/by-sa/4.0/
- Abstract: Over the past four decades, artificial intelligence (AI) research has flourished at the nexus of academia and industry. However, Big Tech companies have increasingly acquired the edge in computational resources, big data, and talent. So far, it has been largely unclear how many papers the industry funds, how their citation impact compares to non-funded papers, and what drives industry interest. This study fills that gap by quantifying the number of industry-funded papers at 10 top AI conferences (e.g., ICLR, CVPR, AAAI, ACL) and their citation influence. We analyze about 49.8K papers, about 1.8M citations from AI papers to other papers, and about 2.3M citations from other papers to AI papers from 1998-2022 in Scopus. Through seven research questions, we examine the volume and evolution of industry funding in AI research, the citation impact of funded papers, the diversity and temporal range of their citations, and the subfields in which industry predominantly acts. Our findings reveal that industry presence has grown markedly since 2015, from less than 2 percent to more than 11 percent in 2020. Between 2018 and 2022, 12 percent of industry-funded papers achieved high citation rates as measured by the h5-index, compared to 4 percent of non-industry-funded papers and 2 percent of non-funded papers. Top AI conferences engage more with industry-funded research than non-funded research, as measured by our newly proposed metric, the Citation Preference Ratio (CPR). We show that industry-funded research is increasingly insular, citing predominantly other industry-funded papers while referencing fewer non-funded papers. These findings reveal new trends in AI research funding, including a shift towards more industry-funded papers and their growing citation impact, greater insularity of industry-funded work than non-funded work, and a preference of industry-funded research to cite recent work.
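The abstract introduces the Citation Preference Ratio (CPR) without defining it here. A minimal sketch of what such a ratio could look like, assuming it compares per-paper citation rates toward industry-funded versus non-funded work; the function name and the normalization by group size are assumptions for illustration, not the paper's definition:

```python
def citation_preference_ratio(citations_to_funded, citations_to_nonfunded,
                              n_funded_papers, n_nonfunded_papers):
    """Hypothetical CPR: ratio of per-paper citation rates.

    A value > 1 would mean a venue engages more with industry-funded
    papers than with non-funded ones, after accounting for how many
    papers each group contributed to the citable pool.
    """
    rate_funded = citations_to_funded / n_funded_papers
    rate_nonfunded = citations_to_nonfunded / n_nonfunded_papers
    return rate_funded / rate_nonfunded

# Illustrative numbers: 5,000 citations to 500 funded papers vs.
# 12,000 citations to 4,500 non-funded papers.
print(citation_preference_ratio(5000, 12000, 500, 4500))  # ~3.75
```

Under these made-up counts, funded papers would be cited at 3.75 times the per-paper rate of non-funded papers, which is the kind of preference the abstract reports for top AI conferences.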
Related papers
- The Role of Computing Resources in Publishing Foundation Model Research [84.20094600030092]
We evaluate the relationship between these resources and the scientific advancement of foundation models (FMs). We reviewed 6517 FM papers published between 2022 and 2024 and surveyed 229 first authors about the impact of computing resources on scientific output. We find that increased computing is correlated with national funding allocations and citations, but we do not observe strong correlations with research environment.
arXiv Detail & Related papers (2025-10-15T14:50:45Z) - Conflicts of Interest in Published NLP Research 2000-2024 [0.3867363075280544]
Increasing entanglement of academic research and industry interests leads to conflicts of interest. Overall, 27.65% of the papers contained at least one industry-affiliated author. Top-tier venues (ACL, EMNLP) are the main drivers of this effect.
arXiv Detail & Related papers (2025-02-22T12:44:57Z) - Mapping the Increasing Use of LLMs in Scientific Papers [99.67983375899719]
We conduct the first systematic, large-scale analysis across 950,965 papers published between January 2020 and February 2024 on the arXiv, bioRxiv, and Nature portfolio journals.
Our findings reveal a steady increase in LLM usage, with the largest and fastest growth observed in Computer Science papers.
arXiv Detail & Related papers (2024-04-01T17:45:15Z) - Position: AI/ML Influencers Have a Place in the Academic Process [82.2069685579588]
We investigate the role of social media influencers in enhancing the visibility of machine learning research.
We have compiled a comprehensive dataset of over 8,000 papers, spanning tweets from December 2018 to October 2023.
Our statistical and causal inference analysis reveals a significant increase in citations for papers endorsed by these influencers.
arXiv Detail & Related papers (2024-01-24T20:05:49Z) - NLLG Quarterly arXiv Report 09/23: What are the most influential current AI Papers? [21.68589129842815]
The US dominates among both top-40 and top-9k papers, followed by China.
Europe clearly lags behind and is hardly represented in the top-40 most cited papers.
US industry is largely overrepresented in the top-40 most influential papers.
arXiv Detail & Related papers (2023-12-09T21:42:20Z) - Topics, Authors, and Institutions in Large Language Model Research: Trends from 17K arXiv Papers [1.5362868418787874]
Large language models (LLMs) are dramatically influencing AI research, spurring discussions on what has changed so far and how to shape the field's future.
To clarify such questions, we analyze a new dataset of 16,979 LLM-related arXiv papers, focusing on recent trends in 2023 vs. 2018-2022.
An influx of new authors -- half of all first authors in 2023 -- are entering from non-NLP fields of AI, driving disciplinary expansion.
Surprisingly, industry accounts for a smaller publication share in 2023, largely due to reduced output from Google and other Big Tech companies.
arXiv Detail & Related papers (2023-07-20T08:45:00Z) - Artificial intelligence adoption in the physical sciences, natural sciences, life sciences, social sciences and the arts and humanities: A bibliometric analysis of research publications from 1960-2021 [73.06361680847708]
In 1960 14% of 333 research fields were related to AI (many in computer science), but this increased to over half of all research fields by 1972, over 80% by 1986 and over 98% in current times.
We conclude that the context of the current surge appears different, and that interdisciplinary AI application is likely to be sustained.
arXiv Detail & Related papers (2023-06-15T14:08:07Z) - Forgotten Knowledge: Examining the Citational Amnesia in NLP [63.13508571014673]
We examine how far back in time we tend to go to cite papers, how that has changed over time, and what factors correlate with this citational attention/amnesia.
We show that around 62% of cited papers are from the immediate five years prior to publication, whereas only about 17% are more than ten years old.
We show that the median age and age diversity of cited papers were steadily increasing from 1990 to 2014, but since then, the trend has reversed, and current NLP papers have an all-time low temporal citation diversity.
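The age statistics above (share of citations within five years, share older than ten years, median age) can all be derived from (citing year, cited year) pairs. A minimal sketch; measuring "temporal citation diversity" as the Shannon entropy of the citation-age distribution is our assumed operationalization, not necessarily the paper's exact metric:

```python
import math
from statistics import median

def citation_age_stats(pairs):
    """pairs: list of (citing_year, cited_year) tuples."""
    ages = [citing - cited for citing, cited in pairs]
    within_5 = sum(a <= 5 for a in ages) / len(ages)
    over_10 = sum(a > 10 for a in ages) / len(ages)
    # Temporal diversity as Shannon entropy (bits) of the age
    # histogram: lower entropy = citations bunched in few ages.
    counts = {}
    for a in ages:
        counts[a] = counts.get(a, 0) + 1
    probs = [c / len(ages) for c in counts.values()]
    entropy = -sum(p * math.log2(p) for p in probs)
    return {"median_age": median(ages),
            "share_within_5y": within_5,
            "share_over_10y": over_10,
            "age_entropy_bits": entropy}

# Toy reference list of a 2022 paper citing five earlier papers.
pairs = [(2022, 2021), (2022, 2020), (2022, 2019),
         (2022, 2018), (2022, 2010)]
stats = citation_age_stats(pairs)
print(stats["median_age"])       # 3
print(stats["share_within_5y"])  # 0.8
```

With real data, a declining `age_entropy_bits` over publication years would correspond to the "all-time low temporal citation diversity" trend the abstract describes.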
arXiv Detail & Related papers (2023-05-29T18:30:34Z) - Breaking Out of the Ivory Tower: A Large-scale Analysis of Patent Citations to HCI Research [13.172300323407143]
We perform a large-scale measurement study primarily of 70,000 patent citations to premier HCI research venues.
We observe that 20.1% of papers from these venues are cited by patents -- far greater than premier venues in science overall.
The time lag between a patent and its paper citations is long (10.5 years) and getting longer, suggesting that HCI research and practice may not be efficiently connected.
arXiv Detail & Related papers (2023-01-31T05:56:59Z) - Industry and Academic Research in Computer Vision [5.634825161148484]
This work aims to study the dynamic between research in the industry and academia in computer vision.
The results are demonstrated on a set of top-5 vision conferences that are representative of the field.
arXiv Detail & Related papers (2021-07-10T20:09:52Z) - A Measure of Research Taste [91.3755431537592]
We present a citation-based measure that rewards both productivity and taste.
The presented measure, CAP, balances the impact of publications and their quantity.
We analyze the characteristics of CAP for highly-cited researchers in biology, computer science, economics, and physics.
arXiv Detail & Related papers (2021-05-17T18:01:47Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.