Quantifying the Knowledge Proximity Between Academic and Industry Research: An Entity and Semantic Perspective
- URL: http://arxiv.org/abs/2602.05211v1
- Date: Thu, 05 Feb 2026 02:12:47 GMT
- Title: Quantifying the Knowledge Proximity Between Academic and Industry Research: An Entity and Semantic Perspective
- Authors: Hongye Zhao, Yi Zhao, Chengzhi Zhang
- Abstract summary: Academia and industry are characterized by a reciprocal shaping and dynamic feedback mechanism. Existing studies on their knowledge proximity mainly rely on macro indicators such as the number of collaborative papers or patents. This study quantifies the trajectory of academia-industry co-evolution through fine-grained entities and semantic space.
- Score: 7.232456257799662
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Academia and industry are characterized by a reciprocal shaping and dynamic feedback mechanism. Despite distinct institutional logics, they have adapted closely in collaborative publishing and talent mobility, demonstrating tension between institutional divergence and intensive collaboration. Existing studies on their knowledge proximity mainly rely on macro indicators such as the number of collaborative papers or patents, and lack an analysis of knowledge units in the literature. This has led to an insufficient grasp of fine-grained knowledge proximity between industry and academia, potentially undermining collaboration frameworks and resource allocation efficiency. To remedy this limitation, this study quantifies the trajectory of academia-industry co-evolution through fine-grained entities and semantic space. In the entity measurement part, we extract fine-grained knowledge entities via pre-trained models, measure sequence overlaps using cosine similarity, and analyze topological features through complex network analysis. At the semantic level, we employ unsupervised contrastive learning to quantify convergence in semantic spaces by measuring cross-institutional textual similarities. Finally, we use citation distribution patterns to examine correlations between bidirectional knowledge flows and similarity. Analysis reveals that knowledge proximity between academia and industry rises, particularly following technological change. This provides textual evidence of bidirectional adaptation in co-evolution. Additionally, academia's knowledge dominance weakens during technological paradigm shifts. The dataset and code for this paper can be accessed at https://github.com/tinierZhao/Academic-Industrial-associations.
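The abstract mentions measuring entity overlap with cosine similarity. As a rough illustration only (not the authors' implementation, which is in the repository linked above), the sketch below compares two bags of extracted entity mentions by their frequency vectors. The function name `entity_cosine` and the toy entity lists are hypothetical and chosen purely for the example.

```python
from collections import Counter
from math import sqrt


def entity_cosine(entities_a, entities_b):
    """Cosine similarity between two bags of entity mentions (hypothetical helper)."""
    freq_a, freq_b = Counter(entities_a), Counter(entities_b)
    # Only entities present on both sides contribute to the dot product.
    dot = sum(freq_a[e] * freq_b[e] for e in freq_a.keys() & freq_b.keys())
    norm_a = sqrt(sum(c * c for c in freq_a.values()))
    norm_b = sqrt(sum(c * c for c in freq_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty side is treated as having no overlap
    return dot / (norm_a * norm_b)


# Toy, made-up entity mentions standing in for one year of papers per sector.
academia = ["BERT", "attention", "BLEU", "attention"]
industry = ["BERT", "attention", "latency"]
similarity = entity_cosine(academia, industry)
print(f"entity cosine similarity: {similarity:.4f}")
```

Computing this value per year for the two sectors would give one possible time series of entity-level proximity, under the assumption that entities are compared by raw mention counts.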
Related papers
- Towards a Science of Collective AI: LLM-based Multi-Agent Systems Need a Transition from Blind Trial-and-Error to Rigorous Science [70.3658845234978]
Large Language Models (LLMs) have greatly extended the capabilities of Multi-Agent Systems (MAS).
Despite this rapid progress, the field still relies heavily on empirical trial-and-error.
This bottleneck stems from the ambiguity of attribution.
We propose a factor attribution paradigm to systematically identify collaboration-driving factors.
arXiv Detail & Related papers (2026-02-05T04:19:52Z)
- Quantifying the Gap between Understanding and Generation within Unified Multimodal Models [66.07644743841007]
GapEval is a benchmark designed to quantify the gap between understanding and generation capabilities.
Experiments reveal a persistent gap between the two directions across a wide range of UMMs.
Our findings indicate that knowledge within UMMs often remains disjoint.
arXiv Detail & Related papers (2026-02-02T14:19:37Z)
- Computational Foundations for Strategic Coopetition: Formalizing Interdependence and Complementarity [0.33985917934283577]
This report develops computational foundations that formalize two critical dimensions of coopetition: interdependence and complementarity.
We ground interdependence in i* structural dependency analysis, translating depender-dependee-dependum relationships into quantitative interdependence coefficients through a structured translation framework.
We formalize complementarity following Brandenburger and Nalebuff's Added Value concept, modeling synergistic value creation with validated parameterization.
We integrate structural dependencies with bargaining power in value appropriation and introduce a game-theoretic formulation where Nash Equilibrium incorporates structural interdependence.
arXiv Detail & Related papers (2025-10-21T16:57:40Z)
- Trajectories of Change: Approaches for Tracking Knowledge Evolution [0.0]
We explore local vs. global evolution of knowledge systems through the framework of socio-epistemic networks (SEN).
We first use information-theoretic measures based on relative entropy to detect semantic shifts, assess their significance, and identify key driving features.
Second, variations in document embedding reveal changes in semantic neighbourhoods, tracking how the concentration of similar documents increases, remains stable, or disperses.
arXiv Detail & Related papers (2024-12-31T11:09:37Z)
- Discovering emergent connections in quantum physics research via dynamic word embeddings [0.562479170374811]
We introduce a novel approach based on dynamic word embeddings for concept combination prediction.
Unlike knowledge graphs, our method captures implicit relationships between concepts, can be learned in a fully unsupervised manner, and encodes a broader spectrum of information.
Our findings suggest that this representation offers a more flexible and informative way of modeling conceptual relationships in scientific literature.
arXiv Detail & Related papers (2024-11-10T19:45:59Z)
- Distributional Associations vs In-Context Reasoning: A Study of Feed-forward and Attention Layers [49.80959223722325]
We study the distinction between feed-forward and attention layers in large language models.
We find that feed-forward layers tend to learn simple distributional associations such as bigrams, while attention layers focus on in-context reasoning.
arXiv Detail & Related papers (2024-06-05T08:51:08Z)
- The Clever Hans Mirage: A Comprehensive Survey on Spurious Correlations in Machine Learning [78.13481522957552]
Machine learning models are sensitive to spurious correlations between non-essential features of the inputs and the corresponding labels.
This paper provides a comprehensive survey of this emerging issue, along with a fine-grained taxonomy of existing state-of-the-art methods for addressing spurious correlations in machine learning models.
arXiv Detail & Related papers (2024-02-20T04:49:34Z)
- Innovation and Word Usage Patterns in Machine Learning [1.3812010983144802]
We identify pivotal themes and fundamental concepts that have emerged within the realm of machine learning.
To quantify the novelty and divergence of research contributions, we employ the Kullback-Leibler Divergence metric.
arXiv Detail & Related papers (2023-11-07T00:41:15Z)
- Knowledge-Enhanced Hierarchical Information Correlation Learning for Multi-Modal Rumor Detection [82.94413676131545]
We propose a novel knowledge-enhanced hierarchical information correlation learning approach (KhiCL) for multi-modal rumor detection.
KhiCL exploits a cross-modal joint dictionary to transfer the heterogeneous unimodality features into the common feature space.
It extracts visual and textual entities from images and text, and designs a knowledge relevance reasoning strategy.
arXiv Detail & Related papers (2023-06-28T06:08:20Z)
- How Do Transformers Learn Topic Structure: Towards a Mechanistic Understanding [56.222097640468306]
We provide mechanistic understanding of how transformers learn "semantic structure".
We show, through a combination of mathematical analysis and experiments on Wikipedia data, that the embedding layer and the self-attention layer encode the topical structure.
arXiv Detail & Related papers (2023-03-07T21:42:17Z)
- Transformer-based Dual Relation Graph for Multi-label Image Recognition [56.12543717723385]
We propose a novel Transformer-based Dual Relation learning framework.
We explore two aspects of correlation, i.e., structural relation graph and semantic relation graph.
Our approach achieves new state-of-the-art on two popular multi-label recognition benchmarks.
arXiv Detail & Related papers (2021-10-10T07:14:52Z)
This list is automatically generated from the titles and abstracts of the papers on this site.