DeepInnovation AI: A Global Dataset Mapping the AI innovation from Academic Research to Industrial Patents
- URL: http://arxiv.org/abs/2503.09257v4
- Date: Fri, 28 Mar 2025 08:22:52 GMT
- Title: DeepInnovation AI: A Global Dataset Mapping the AI innovation from Academic Research to Industrial Patents
- Authors: Haixing Gong, Hui Zou, Xingzhou Liang, Shiyuan Meng, Pinlong Cai, Xingcheng Xu, Jingjing Qu
- Abstract summary: DeepInnovationAI is a comprehensive global dataset containing three structured files. DeepInnovationAI enables researchers, policymakers, and industry leaders to anticipate trends and identify collaboration opportunities.
- Score: 2.8191246153416243
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In the rapidly evolving field of artificial intelligence (AI), mapping innovation patterns and understanding effective technology transfer from research to applications are essential for economic growth. However, existing data infrastructures suffer from fragmentation, incomplete coverage, and insufficient evaluative capacity. Here, we present DeepInnovationAI, a comprehensive global dataset containing three structured files. DeepPatentAI.csv: Contains 2,356,204 patent records with 8 field-specific attributes. DeepDiveAI.csv: Encompasses 3,511,929 academic publications with 13 metadata fields. These two datasets leverage large language models, multilingual text analysis and dual-layer BERT classifiers to accurately identify AI-related content, while utilizing hypergraph analysis to create robust innovation metrics. Additionally, DeepCosineAI.csv: By applying semantic vector proximity analysis, this file presents approximately one hundred million calculated paper-patent similarity pairs to enhance understanding of how theoretical advancements translate into commercial technologies. DeepInnovationAI enables researchers, policymakers, and industry leaders to anticipate trends and identify collaboration opportunities. With extensive temporal and geographical scope, it supports detailed analysis of technological development patterns and international competition dynamics, establishing a foundation for modeling AI innovation and technology transfer processes.
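The paper-patent similarity pairs described in the abstract can be illustrated with a minimal sketch of cosine similarity between embedding vectors. This is not the authors' pipeline: the toy vectors, the identifiers (`paper_A`, `patent_X`, and so on), the helper `similarity_pairs`, and the 0.5 threshold are all hypothetical, chosen only to show how semantic vector proximity yields scored pairs.

```python
import math
from typing import Dict, List, Tuple


def cosine_similarity(u: List[float], v: List[float]) -> float:
    """Return the cosine of the angle between two equal-length vectors."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # convention assumed here for zero vectors
    return dot / (norm_u * norm_v)


def similarity_pairs(
    papers: Dict[str, List[float]],
    patents: Dict[str, List[float]],
    threshold: float = 0.5,  # hypothetical cut-off, not from the paper
) -> List[Tuple[str, str, float]]:
    """Score every paper-patent pair and keep those at or above threshold."""
    pairs = []
    for paper_id, paper_vec in papers.items():
        for patent_id, patent_vec in patents.items():
            score = cosine_similarity(paper_vec, patent_vec)
            if score >= threshold:
                pairs.append((paper_id, patent_id, score))
    # Highest-scoring pairs first.
    pairs.sort(key=lambda item: item[2], reverse=True)
    return pairs


# Toy, made-up embeddings standing in for text embeddings of abstracts.
papers = {"paper_A": [1.0, 0.0, 1.0], "paper_B": [0.0, 1.0, 0.0]}
patents = {"patent_X": [1.0, 0.0, 0.9], "patent_Y": [0.0, 2.0, 0.1]}

pairs = similarity_pairs(papers, patents)
for paper_id, patent_id, score in pairs:
    print(f"{paper_id} <-> {patent_id}: {score:.4f}")
```

The brute-force double loop is only workable for toy inputs. At the scale the abstract reports (millions of papers and patents), an approximate nearest-neighbour index or batched matrix products would be needed instead.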
Related papers
- AI in Agriculture: A Survey of Deep Learning Techniques for Crops, Fisheries and Livestock [77.95897723270453]
Crops, fisheries and livestock form the backbone of global food production, essential to feed the ever-growing global population. Addressing these issues requires efficient, accurate, and scalable technological solutions, highlighting the importance of artificial intelligence (AI). This survey presents a systematic and thorough review of more than 200 research works covering conventional machine learning approaches, advanced deep learning techniques, and recent vision-language foundation models.
arXiv Detail & Related papers (2025-07-29T17:59:48Z)
- A Comprehensive Survey of Deep Research: Systems, Methodologies, and Applications [3.002468101812191]
We analyze more than 80 commercial and non-commercial implementations that have emerged since 2023. We propose a novel hierarchical taxonomy that categorizes systems according to four fundamental technical dimensions. Our analysis reveals both the significant capabilities of current implementations and the technical and ethical challenges they present.
arXiv Detail & Related papers (2025-06-14T18:19:05Z)
- AI-powered Contextual 3D Environment Generation: A Systematic Review [49.1574468325115]
This study performs a systematic review of existing generative AI techniques for 3D scene generation. By examining state-of-the-art approaches, it presents key challenges such as scene authenticity and the influence of textual inputs.
arXiv Detail & Related papers (2025-06-05T15:56:28Z)
- Data-Driven Breakthroughs and Future Directions in AI Infrastructure: A Comprehensive Review [0.0]
This paper presents a comprehensive synthesis of major breakthroughs in artificial intelligence (AI) over the past fifteen years. It identifies key inflection points in AI's evolution by tracing the convergence of computational resources, data access, and algorithmic innovation.
arXiv Detail & Related papers (2025-05-22T15:12:48Z)
- Materials Generation in the Era of Artificial Intelligence: A Comprehensive Survey [54.40267149907223]
Materials are the foundation of modern society, underpinning advancements in energy, electronics, healthcare, transportation, and infrastructure. The ability to discover and design new materials with tailored properties is critical to solving some of the most pressing global challenges. Data-driven generative models provide a powerful tool for materials design by directly creating novel materials that satisfy predefined property requirements.
arXiv Detail & Related papers (2025-05-22T08:33:21Z)
- From Text to Network: Constructing a Knowledge Graph of Taiwan-Based China Studies Using Generative AI [0.0]
Taiwanese China Studies (CS) has developed into a rich, interdisciplinary research field shaped by the unique geopolitical position and long-standing academic engagement with Mainland China. This study proposes an AI-assisted approach that transforms unstructured academic texts into structured, interactive knowledge representations.
arXiv Detail & Related papers (2025-05-15T08:51:53Z)
- TUM2TWIN: Introducing the Large-Scale Multimodal Urban Digital Twin Benchmark Dataset [90.97440987655084]
Urban Digital Twins (UDTs) have become essential for managing cities and integrating complex, heterogeneous data from diverse sources. To address these challenges, we introduce the first comprehensive multimodal Urban Digital Twin benchmark dataset: TUM2TWIN. This dataset includes georeferenced, semantically aligned 3D models and networks along with various terrestrial, mobile, aerial, and satellite observations boasting 32 data subsets over roughly 100,000 $m^2$ and currently 767 GB of data.
arXiv Detail & Related papers (2025-05-12T09:48:32Z)
- From ChatGPT to DeepSeek AI: A Comprehensive Analysis of Evolution, Deviation, and Future Implications in AI-Language Models [8.03446809073899]
The rapid advancement of artificial intelligence (AI) has reshaped the field of natural language processing (NLP), with models like OpenAI ChatGPT and DeepSeek AI.
This paper presents a detailed analysis of the evolution from ChatGPT to DeepSeek AI, highlighting their technical differences, practical applications, and broader implications for AI development.
arXiv Detail & Related papers (2025-04-04T07:08:29Z)
- An Empirical Study of Validating Synthetic Data for Text-Based Person Retrieval [51.10419281315848]
We conduct an empirical study to explore the potential of synthetic data for Text-Based Person Retrieval (TBPR) research.
We propose an inter-class image generation pipeline, in which an automatic prompt construction strategy is introduced.
We develop an intra-class image augmentation pipeline, in which the generative AI models are applied to further edit the images.
arXiv Detail & Related papers (2025-03-28T06:18:15Z)
- CS-PaperSum: A Large-Scale Dataset of AI-Generated Summaries for Scientific Papers [3.929864777332447]
CS-PaperSum is a large-scale dataset of 91,919 papers from 31 top-tier computer science conferences. Our dataset enables automated literature analysis, research trend forecasting, and AI-driven scientific discovery.
arXiv Detail & Related papers (2025-02-27T22:48:35Z)
- From large language models to multimodal AI: A scoping review on the potential of generative AI in medicine [40.23383597339471]
Multimodal AI is capable of integrating diverse data modalities, including imaging, text, and structured data, within a single model.
This scoping review explores the evolution of multimodal AI, highlighting its methods, applications, datasets, and evaluation in clinical settings.
Our findings underscore a shift from unimodal to multimodal approaches, driving innovations in diagnostic support, medical report generation, drug discovery, and conversational AI.
arXiv Detail & Related papers (2025-02-13T11:57:51Z)
- Exploring AI Text Generation, Retrieval-Augmented Generation, and Detection Technologies: a Comprehensive Overview [0.0]
Concerns surrounding AI-generated content, including issues of originality, bias, misinformation, and accountability, have become prominent. This paper offers a comprehensive overview of AI text generators (AITGs), focusing on their evolution, capabilities, and ethical implications. The paper explores future directions for improving detection accuracy, supporting ethical AI development, and increasing accessibility.
arXiv Detail & Related papers (2024-12-05T07:23:14Z)
- O1 Replication Journey: A Strategic Progress Report -- Part 1 [52.062216849476776]
This paper introduces a pioneering approach to artificial intelligence research, embodied in our O1 Replication Journey.
Our methodology addresses critical challenges in modern AI research, including the insularity of prolonged team-based projects.
We propose the journey learning paradigm, which encourages models to learn not just shortcuts, but the complete exploration process.
arXiv Detail & Related papers (2024-10-08T15:13:01Z)
- Converging Paradigms: The Synergy of Symbolic and Connectionist AI in LLM-Empowered Autonomous Agents [55.63497537202751]
This article explores the convergence of connectionist and symbolic artificial intelligence (AI).
Traditionally, connectionist AI focuses on neural networks, while symbolic AI emphasizes symbolic representation and logic.
Recent advancements in large language models (LLMs) highlight the potential of connectionist architectures in handling human language as a form of symbols.
arXiv Detail & Related papers (2024-07-11T14:00:53Z)
- Curating Grounded Synthetic Data with Global Perspectives for Equitable AI [0.5120567378386615]
We introduce a novel approach to creating synthetic datasets, grounded in real-world diversity and enriched through strategic diversification.
We synthesize data using a comprehensive collection of news articles spanning 12 languages and originating from 125 countries, to ensure a breadth of linguistic and cultural representations.
Preliminary results demonstrate substantial improvements in performance on traditional NER benchmarks, by up to 7.3%.
arXiv Detail & Related papers (2024-06-10T17:59:11Z)
- Generative Artificial Intelligence: A Systematic Review and Applications [7.729155237285151]
This paper documents the systematic review and analysis of recent advancements and techniques in Generative AI.
The major impact that generative AI has made to date has been in language generation with the development of large language models.
The paper ends with a discussion of Responsible AI principles, and the necessary ethical considerations for the sustainability and growth of these generative models.
arXiv Detail & Related papers (2024-05-17T18:03:59Z)
- Psittacines of Innovation? Assessing the True Novelty of AI Creations [0.26107298043931204]
We task an AI with generating project titles for hypothetical crowdfunding campaigns.
We compare within AI-generated project titles, measuring repetition and complexity.
Results suggest that the AI generates unique content even under increasing task complexity.
arXiv Detail & Related papers (2024-03-17T13:08:11Z)
- AceMap: Knowledge Discovery through Academic Graph [90.12694363549483]
AceMap is an academic system designed for knowledge discovery through academic graph.
We present advanced database construction techniques to build the comprehensive AceMap database.
AceMap provides advanced analysis capabilities, including tracing the evolution of academic ideas.
arXiv Detail & Related papers (2024-03-05T01:17:56Z)
- Generative AI in the Construction Industry: A State-of-the-art Analysis [0.4241054493737716]
There is a gap in the literature on the current state, opportunities, and challenges of generative AI in the construction industry.
This study aims to review and categorize the existing and emerging generative AI opportunities and challenges in the construction industry.
It proposes a framework for construction firms to build customized generative AI solutions using their own data.
arXiv Detail & Related papers (2024-02-15T13:39:55Z)
- AI-Generated Images as Data Source: The Dawn of Synthetic Era [61.879821573066216]
Generative AI has unlocked the potential to create synthetic images that closely resemble real-world photographs.
This paper explores the innovative concept of harnessing these AI-generated images as new data sources.
In contrast to real data, AI-generated data exhibit remarkable advantages, including unmatched abundance and scalability.
arXiv Detail & Related papers (2023-10-03T06:55:19Z)
- The Future of Fundamental Science Led by Generative Closed-Loop Artificial Intelligence [67.70415658080121]
Recent advances in machine learning and AI are disrupting technological innovation, product development, and society as a whole.
AI has contributed less to fundamental science in part because large data sets of high-quality data for scientific practice and model discovery are more difficult to access.
Here we explore and investigate aspects of an AI-driven, automated, closed-loop approach to scientific discovery.
arXiv Detail & Related papers (2023-07-09T21:16:56Z)
- A Comprehensive Survey of AI-Generated Content (AIGC): A History of Generative AI from GAN to ChatGPT [63.58711128819828]
ChatGPT and other Generative AI (GAI) techniques belong to the category of Artificial Intelligence Generated Content (AIGC).
The goal of AIGC is to make the content creation process more efficient and accessible, allowing for the production of high-quality content at a faster pace.
arXiv Detail & Related papers (2023-03-07T20:36:13Z)
- Convergence of Artificial Intelligence and High Performance Computing on NSF-supported Cyberinfrastructure [3.4291439418246177]
Artificial Intelligence (AI) applications have powered transformational solutions for big data challenges in industry and technology.
As AI continues to evolve into a computing paradigm endowed with statistical and mathematical rigor, it has become apparent that single-GPU solutions for training, validation, and testing are no longer sufficient.
This realization has been driving the confluence of AI and high performance computing to reduce time-to-insight.
arXiv Detail & Related papers (2020-03-18T18:00:02Z)