Extraction of Research Objectives, Machine Learning Model Names, and Dataset Names from Academic Papers and Analysis of Their Interrelationships Using LLM and Network Analysis
- URL: http://arxiv.org/abs/2408.12097v1
- Date: Thu, 22 Aug 2024 03:10:52 GMT
- Title: Extraction of Research Objectives, Machine Learning Model Names, and Dataset Names from Academic Papers and Analysis of Their Interrelationships Using LLM and Network Analysis
- Authors: S. Nishio, H. Nonaka, N. Tsuchiya, A. Migita, Y. Banno, T. Hayashi, H. Sakaji, T. Sakumoto, K. Watabe,
- Abstract summary: This study proposes a methodology extracting tasks, machine learning methods, and dataset names from scientific papers.
The proposed method's expression extraction performance, when using Llama3, achieves an F-score exceeding 0.8 across various categories.
Benchmarking results on financial domain papers have demonstrated the effectiveness of this method.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Machine learning is widely utilized across various industries. Identifying the appropriate machine learning models and datasets for specific tasks is crucial for the effective industrial application of machine learning. However, this requires expertise in both machine learning and the relevant domain, leading to a high learning cost. Therefore, research focused on extracting combinations of tasks, machine learning models, and datasets from academic papers is critically important, as it can facilitate the automatic recommendation of suitable methods. Conventional information extraction methods from academic papers have been limited to identifying machine learning models and other entities as named entities. To address this issue, this study proposes a methodology extracting tasks, machine learning methods, and dataset names from scientific papers and analyzing the relationships between these information by using LLM, embedding model, and network clustering. The proposed method's expression extraction performance, when using Llama3, achieves an F-score exceeding 0.8 across various categories, confirming its practical utility. Benchmarking results on financial domain papers have demonstrated the effectiveness of this method, providing insights into the use of the latest datasets, including those related to ESG (Environmental, Social, and Governance) data.
Related papers
- A Survey of Small Language Models [104.80308007044634]
Small Language Models (SLMs) have become increasingly important due to their efficiency and performance to perform various language tasks with minimal computational resources.
We present a comprehensive survey on SLMs, focusing on their architectures, training techniques, and model compression techniques.
arXiv Detail & Related papers (2024-10-25T23:52:28Z) - Deep Learning and Machine Learning -- Object Detection and Semantic Segmentation: From Theory to Applications [17.571124565519263]
Book covers state-of-the-art advancements in machine learning and deep learning.
Focuses on convolutional neural networks (CNNs), YOLO architectures, and transformer-based approaches like DETR.
Book also delves into the integration of artificial intelligence (AI) techniques and large language models for enhanced object detection.
arXiv Detail & Related papers (2024-10-21T02:10:49Z) - Structure Learning via Mutual Information [0.8702432681310399]
We propose a framework for learning and representing functional relationships in data using mutual information (MI) features.
Our method aims to capture the underlying structure of information in datasets, enabling more efficient and generalizable learning algorithms.
arXiv Detail & Related papers (2024-09-21T19:33:56Z) - Topological Methods in Machine Learning: A Tutorial for Practitioners [4.297070083645049]
Topological Machine Learning (TML) is an emerging field that leverages techniques from algebraic topology to analyze complex data structures.
This tutorial provides a comprehensive introduction to two key TML techniques, persistent homology and the Mapper algorithm.
To enhance accessibility, we adopt a data-centric approach, enabling readers to gain hands-on experience applying these techniques to relevant tasks.
arXiv Detail & Related papers (2024-09-04T17:44:52Z) - The Frontier of Data Erasure: Machine Unlearning for Large Language Models [56.26002631481726]
Large Language Models (LLMs) are foundational to AI advancements.
LLMs pose risks by potentially memorizing and disseminating sensitive, biased, or copyrighted information.
Machine unlearning emerges as a cutting-edge solution to mitigate these concerns.
arXiv Detail & Related papers (2024-03-23T09:26:15Z) - Language models are weak learners [71.33837923104808]
We show that prompt-based large language models can operate effectively as weak learners.
We incorporate these models into a boosting approach, which can leverage the knowledge within the model to outperform traditional tree-based boosting.
Results illustrate the potential for prompt-based LLMs to function not just as few-shot learners themselves, but as components of larger machine learning pipelines.
arXiv Detail & Related papers (2023-06-25T02:39:19Z) - A Cloud-based Machine Learning Pipeline for the Efficient Extraction of
Insights from Customer Reviews [0.0]
We present a cloud-based system that can extract insights from customer reviews using machine learning methods integrated into a pipeline.
For topic modeling, our composite model uses transformer-based neural networks designed for natural language processing.
Our system can achieve better results than this task's existing topic modeling and keyword extraction solutions.
arXiv Detail & Related papers (2023-06-13T14:07:52Z) - Interactive Machine Learning: A State of the Art Review [0.0]
We provide a comprehensive analysis of the state-of-the-art of interactive machine learning (iML)
Research works on adversarial black-box attacks and corresponding iML based defense system, exploratory machine learning, resource constrained learning, and iML performance evaluation are analyzed.
arXiv Detail & Related papers (2022-07-13T13:43:16Z) - Combining Feature and Instance Attribution to Detect Artifacts [62.63504976810927]
We propose methods to facilitate identification of training data artifacts.
We show that this proposed training-feature attribution approach can be used to uncover artifacts in training data.
We execute a small user study to evaluate whether these methods are useful to NLP researchers in practice.
arXiv Detail & Related papers (2021-07-01T09:26:13Z) - Few-Shot Named Entity Recognition: A Comprehensive Study [92.40991050806544]
We investigate three schemes to improve the model generalization ability for few-shot settings.
We perform empirical comparisons on 10 public NER datasets with various proportions of labeled data.
We create new state-of-the-art results on both few-shot and training-free settings.
arXiv Detail & Related papers (2020-12-29T23:43:16Z) - A Survey on Large-scale Machine Learning [67.6997613600942]
Machine learning can provide deep insights into data, allowing machines to make high-quality predictions.
Most sophisticated machine learning approaches suffer from huge time costs when operating on large-scale data.
Large-scale Machine Learning aims to learn patterns from big data with comparable performance efficiently.
arXiv Detail & Related papers (2020-08-10T06:07:52Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.