Related papers: AI Marker-based Large-scale AI Literature Mining

AI Marker-based Large-scale AI Literature Mining

URL: http://arxiv.org/abs/2011.00518v2
Date: Tue, 3 Nov 2020 04:13:16 GMT
Title: AI Marker-based Large-scale AI Literature Mining
Authors: Rujing Yao, Yingchun Ye, Ji Zhang, Shuxiao Li and Ou Wu
Abstract summary: Methods, datasets and metrics are used as AI markers for AI literature. The entity extraction model is used in this study to extract AI markers from large-scale AI literature. The evolution within method clusters and the influencing relationships amongst different research scene clusters are explored.
Score: 5.144684482990409
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: The knowledge contained in academic literature is interesting to mine. Inspired by the idea of molecular markers tracing in the field of biochemistry, three named entities, namely, methods, datasets and metrics are used as AI markers for AI literature. These entities can be used to trace the research process described in the bodies of papers, which opens up new perspectives for seeking and mining more valuable academic information. Firstly, the entity extraction model is used in this study to extract AI markers from large-scale AI literature. Secondly, original papers are traced for AI markers. Statistical and propagation analysis are performed based on tracing results. Finally, the co-occurrences of AI markers are used to achieve clustering. The evolution within method clusters and the influencing relationships amongst different research scene clusters are explored. The above-mentioned mining based on AI markers yields many meaningful discoveries. For example, the propagation of effective methods on the datasets is rapidly increasing with the development of time; effective methods proposed by China in recent years have increasing influence on other countries, whilst France is the opposite. Saliency detection, a classic computer vision research scene, is the least likely to be affected by other research scenes.

Related papers

The Budget AI Researcher and the Power of RAG Chains [4.797627592793464]
Current approaches to supporting research idea generation often rely on generic large language models (LLMs)<n>Our framework, The Budget AI Researcher, uses retrieval-augmented generation chains, vector databases, and topic-guided pairing to recombine concepts from hundreds of machine learning papers.<n>The system ingests papers from nine major AI conferences, which collectively span the vast subfields of machine learning, and organizes them into a hierarchical topic tree.
arXiv Detail & Related papers (2025-06-14T02:40:35Z)
Enhancing literature review with LLM and NLP methods. Algorithmic trading case [0.0]
This study utilizes machine learning algorithms to analyze and organize knowledge in the field of algorithmic trading. By filtering a dataset of 136 million research papers, we identified 14,342 relevant articles published between 1956 and Q1 2020.
arXiv Detail & Related papers (2024-10-23T13:37:27Z)
Automating Bibliometric Analysis with Sentence Transformers and Retrieval-Augmented Generation (RAG): A Pilot Study in Semantic and Contextual Search for Customized Literature Characterization for High-Impact Urban Research [2.1728621449144763]
Bibliometric analysis is essential for understanding research trends, scope, and impact in urban science. Traditional methods, relying on keyword searches, often fail to uncover valuable insights not explicitly stated in article titles or keywords. We leverage Generative AI models, specifically transformers and Retrieval-Augmented Generation (RAG), to automate and enhance bibliometric analysis.
arXiv Detail & Related papers (2024-10-08T05:13:27Z)
From Linguistic Giants to Sensory Maestros: A Survey on Cross-Modal Reasoning with Large Language Models [56.9134620424985]
Cross-modal reasoning (CMR) is increasingly recognized as a crucial capability in the progression toward more sophisticated artificial intelligence systems. The recent trend of deploying Large Language Models (LLMs) to tackle CMR tasks has marked a new mainstream of approaches for enhancing their effectiveness. This survey offers a nuanced exposition of current methodologies applied in CMR using LLMs, classifying these into a detailed three-tiered taxonomy.
arXiv Detail & Related papers (2024-09-19T02:51:54Z)
Masked Image Modeling: A Survey [73.21154550957898]
Masked image modeling emerged as a powerful self-supervised learning technique in computer vision. We construct a taxonomy and review the most prominent papers in recent years. We aggregate the performance results of various masked image modeling methods on the most popular datasets.
arXiv Detail & Related papers (2024-08-13T07:27:02Z)
Large Language Models and Knowledge Graphs for Astronomical Entity Disambiguation [0.0]
This paper focuses on using large language models (LLMs) and knowledge graph clustering to extract entities and relationships from astronomical text. The experiment showcases the potential of combining LLMs and knowledge graph clustering techniques for information extraction in astronomical research.
arXiv Detail & Related papers (2024-06-17T10:38:03Z)
A Survey on Data Selection for Language Models [148.300726396877]
Data selection methods aim to determine which data points to include in a training dataset. Deep learning is mostly driven by empirical evidence and experimentation on large-scale data is expensive. Few organizations have the resources for extensive data selection research.
arXiv Detail & Related papers (2024-02-26T18:54:35Z)
Data-Driven Information Extraction and Enrichment of Molecular Profiling Data for Cancer Cell Lines [1.1999555634662633]
This work presents the design, implementation and application of a novel data extraction and exploration system. We introduce a new public data exploration portal, which enables automatic linking of genomic copy number variants plots with ranked, related entities. Our system is publicly available on the web at https://cancercelllines.org.
arXiv Detail & Related papers (2023-07-03T11:15:42Z)
Neurosymbolic AI and its Taxonomy: a survey [48.7576911714538]
Neurosymbolic AI deals with models that combine symbolic processing, like classic AI, and neural networks. This survey investigates research papers in this area during recent years and brings classification and comparison between the presented models as well as applications.
arXiv Detail & Related papers (2023-05-12T19:51:13Z)
Citation Trajectory Prediction via Publication Influence Representation Using Temporal Knowledge Graph [52.07771598974385]
Existing approaches mainly rely on mining temporal and graph data from academic articles. Our framework is composed of three modules: difference-preserved graph embedding, fine-grained influence representation, and learning-based trajectory calculation. Experiments are conducted on both the APS academic dataset and our contributed AIPatent dataset.
arXiv Detail & Related papers (2022-10-02T07:43:26Z)
Research Trends and Applications of Data Augmentation Algorithms [77.34726150561087]
We identify the main areas of application of data augmentation algorithms, the types of algorithms used, significant research trends, their progression over time and research gaps in data augmentation literature. We expect readers to understand the potential of data augmentation, as well as identify future research directions and open questions within data augmentation research.
arXiv Detail & Related papers (2022-07-18T11:38:32Z)
Combining Feature and Instance Attribution to Detect Artifacts [62.63504976810927]
We propose methods to facilitate identification of training data artifacts. We show that this proposed training-feature attribution approach can be used to uncover artifacts in training data. We execute a small user study to evaluate whether these methods are useful to NLP researchers in practice.
arXiv Detail & Related papers (2021-07-01T09:26:13Z)

This list is automatically generated from the titles and abstracts of the papers in this site.