DeepInnovation AI: A Global Dataset Mapping the AI innovation from Academic Research to Industrial Patents
- URL: http://arxiv.org/abs/2503.09257v4
- Date: Fri, 28 Mar 2025 08:22:52 GMT
- Title: DeepInnovation AI: A Global Dataset Mapping the AI innovation from Academic Research to Industrial Patents
- Authors: Haixing Gong, Hui Zou, Xingzhou Liang, Shiyuan Meng, Pinlong Cai, Xingcheng Xu, Jingjing Qu,
- Abstract summary: DeepInnovationAI is a comprehensive global dataset containing three structured files. DeepInnovationAI enables researchers, policymakers, and industry leaders to anticipate trends and identify collaboration opportunities.
- Score: 2.8191246153416243
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In the rapidly evolving field of artificial intelligence (AI), mapping innovation patterns and understanding effective technology transfer from research to applications are essential for economic growth. However, existing data infrastructures suffer from fragmentation, incomplete coverage, and insufficient evaluative capacity. Here, we present DeepInnovationAI, a comprehensive global dataset containing three structured files. DeepPatentAI.csv contains 2,356,204 patent records with 8 field-specific attributes. DeepDiveAI.csv encompasses 3,511,929 academic publications with 13 metadata fields. These two datasets leverage large language models, multilingual text analysis and dual-layer BERT classifiers to accurately identify AI-related content, while utilizing hypergraph analysis to create robust innovation metrics. DeepCosineAI.csv applies semantic vector proximity analysis to produce approximately one hundred million calculated paper-patent similarity pairs, which enhance understanding of how theoretical advancements translate into commercial technologies. DeepInnovationAI enables researchers, policymakers, and industry leaders to anticipate trends and identify collaboration opportunities. With extensive temporal and geographical scope, it supports detailed analysis of technological development patterns and international competition dynamics, establishing a foundation for modeling AI innovation and technology transfer processes.
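The abstract describes paper-patent similarity pairs computed by semantic vector proximity. As a rough illustration only, the sketch below scores toy paper and patent vectors with cosine similarity and keeps pairs above a cutoff. The choice of cosine similarity is suggested by the file name DeepCosineAI.csv but is not confirmed here; the embedding vectors, the identifiers, the `similarity_pairs` helper, and the 0.5 threshold are all hypothetical and are not taken from the dataset.

```python
import math
from typing import Dict, List, Sequence, Tuple


def cosine_similarity(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def similarity_pairs(
    papers: Dict[str, Sequence[float]],
    patents: Dict[str, Sequence[float]],
    threshold: float = 0.5,  # assumed cutoff, not a value from the paper
) -> List[Tuple[str, str, float]]:
    """Score every paper-patent pair and keep those at or above the threshold."""
    pairs = []
    for paper_id, paper_vec in papers.items():
        for patent_id, patent_vec in patents.items():
            score = cosine_similarity(paper_vec, patent_vec)
            if score >= threshold:
                pairs.append((paper_id, patent_id, score))
    # Highest-scoring pairs first.
    return sorted(pairs, key=lambda item: item[2], reverse=True)


# Toy embeddings standing in for real text embeddings (hypothetical values).
papers = {"paper_A": [1.0, 0.0, 1.0], "paper_B": [0.0, 1.0, 0.0]}
patents = {"patent_X": [1.0, 0.0, 0.0], "patent_Y": [0.0, 2.0, 0.0]}

for paper_id, patent_id, score in similarity_pairs(papers, patents):
    print(f"{paper_id}\t{patent_id}\t{score:.4f}")
```

An all-pairs loop like this is quadratic in the number of records, so at the scale the abstract reports it would likely need batching or approximate nearest-neighbor search; the sketch only shows the scoring step.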
Related papers
- From ChatGPT to DeepSeek AI: A Comprehensive Analysis of Evolution, Deviation, and Future Implications in AI-Language Models [8.03446809073899]
The rapid advancement of artificial intelligence (AI) has reshaped the field of natural language processing (NLP), with models like OpenAI ChatGPT and DeepSeek AI.
This paper presents a detailed analysis of the evolution from ChatGPT to DeepSeek AI, highlighting their technical differences, practical applications, and broader implications for AI development.
arXiv Detail & Related papers (2025-04-04T07:08:29Z)
- An Empirical Study of Validating Synthetic Data for Text-Based Person Retrieval [51.10419281315848]
We conduct an empirical study to explore the potential of synthetic data for Text-Based Person Retrieval (TBPR) research.
We propose an inter-class image generation pipeline, in which an automatic prompt construction strategy is introduced.
We develop an intra-class image augmentation pipeline, in which the generative AI models are applied to further edit the images.
arXiv Detail & Related papers (2025-03-28T06:18:15Z)
- From large language models to multimodal AI: A scoping review on the potential of generative AI in medicine [40.23383597339471]
Multimodal AI is capable of integrating diverse data modalities, including imaging, text, and structured data, within a single model.
This scoping review explores the evolution of multimodal AI, highlighting its methods, applications, datasets, and evaluation in clinical settings.
Our findings underscore a shift from unimodal to multimodal approaches, driving innovations in diagnostic support, medical report generation, drug discovery, and conversational AI.
arXiv Detail & Related papers (2025-02-13T11:57:51Z)
- Exploring AI Text Generation, Retrieval-Augmented Generation, and Detection Technologies: a Comprehensive Overview [0.0]
Concerns surrounding AI-generated content, including issues of originality, bias, misinformation, and accountability, have become prominent. This paper offers a comprehensive overview of AI text generators (AITGs), focusing on their evolution, capabilities, and ethical implications. The paper explores future directions for improving detection accuracy, supporting ethical AI development, and increasing accessibility.
arXiv Detail & Related papers (2024-12-05T07:23:14Z)
- O1 Replication Journey: A Strategic Progress Report -- Part 1 [52.062216849476776]
This paper introduces a pioneering approach to artificial intelligence research, embodied in our O1 Replication Journey.
Our methodology addresses critical challenges in modern AI research, including the insularity of prolonged team-based projects.
We propose the journey learning paradigm, which encourages models to learn not just shortcuts, but the complete exploration process.
arXiv Detail & Related papers (2024-10-08T15:13:01Z)
- Converging Paradigms: The Synergy of Symbolic and Connectionist AI in LLM-Empowered Autonomous Agents [55.63497537202751]
This article explores the convergence of connectionist and symbolic artificial intelligence (AI).
Traditionally, connectionist AI focuses on neural networks, while symbolic AI emphasizes symbolic representation and logic.
Recent advancements in large language models (LLMs) highlight the potential of connectionist architectures in handling human language as a form of symbols.
arXiv Detail & Related papers (2024-07-11T14:00:53Z)
- Curating Grounded Synthetic Data with Global Perspectives for Equitable AI [0.5120567378386615]
We introduce a novel approach to creating synthetic datasets, grounded in real-world diversity and enriched through strategic diversification.
We synthesize data using a comprehensive collection of news articles spanning 12 languages and originating from 125 countries, to ensure a breadth of linguistic and cultural representations.
Preliminary results demonstrate substantial improvements in performance on traditional NER benchmarks, by up to 7.3%.
arXiv Detail & Related papers (2024-06-10T17:59:11Z)
- Generative Artificial Intelligence: A Systematic Review and Applications [7.729155237285151]
This paper documents the systematic review and analysis of recent advancements and techniques in Generative AI.
The major impact of generative AI to date has been in language generation, with the development of large language models.
The paper ends with a discussion of Responsible AI principles, and the necessary ethical considerations for the sustainability and growth of these generative models.
arXiv Detail & Related papers (2024-05-17T18:03:59Z)
- Psittacines of Innovation? Assessing the True Novelty of AI Creations [0.26107298043931204]
We task an AI with generating project titles for hypothetical crowdfunding campaigns.
We compare within AI-generated project titles, measuring repetition and complexity.
Results suggest that the AI generates unique content even under increasing task complexity.
arXiv Detail & Related papers (2024-03-17T13:08:11Z)
- Generative AI in the Construction Industry: A State-of-the-art Analysis [0.4241054493737716]
There is a gap in the literature on the current state, opportunities, and challenges of generative AI in the construction industry.
This study aims to review and categorize the existing and emerging generative AI opportunities and challenges in the construction industry.
It proposes a framework for construction firms to build customized generative AI solutions using their own data.
arXiv Detail & Related papers (2024-02-15T13:39:55Z)
- AI-Generated Images as Data Source: The Dawn of Synthetic Era [61.879821573066216]
Generative AI has unlocked the potential to create synthetic images that closely resemble real-world photographs.
This paper explores the innovative concept of harnessing these AI-generated images as new data sources.
In contrast to real data, AI-generated data exhibit remarkable advantages, including unmatched abundance and scalability.
arXiv Detail & Related papers (2023-10-03T06:55:19Z)
- The Future of Fundamental Science Led by Generative Closed-Loop Artificial Intelligence [67.70415658080121]
Recent advances in machine learning and AI are disrupting technological innovation, product development, and society as a whole.
AI has contributed less to fundamental science in part because large data sets of high-quality data for scientific practice and model discovery are more difficult to access.
Here we explore and investigate aspects of an AI-driven, automated, closed-loop approach to scientific discovery.
arXiv Detail & Related papers (2023-07-09T21:16:56Z)
- A Comprehensive Survey of AI-Generated Content (AIGC): A History of Generative AI from GAN to ChatGPT [63.58711128819828]
ChatGPT and other Generative AI (GAI) techniques belong to the category of Artificial Intelligence Generated Content (AIGC).
The goal of AIGC is to make the content creation process more efficient and accessible, allowing for the production of high-quality content at a faster pace.
arXiv Detail & Related papers (2023-03-07T20:36:13Z)
- Convergence of Artificial Intelligence and High Performance Computing on NSF-supported Cyberinfrastructure [3.4291439418246177]
Artificial Intelligence (AI) applications have powered transformational solutions for big data challenges in industry and technology.
As AI continues to evolve into a computing paradigm endowed with statistical and mathematical rigor, it has become apparent that single-GPU solutions for training, validation, and testing are no longer sufficient.
This realization has been driving the confluence of AI and high performance computing to reduce time-to-insight.
arXiv Detail & Related papers (2020-03-18T18:00:02Z)