Related papers: Using Full-text Content of Academic Articles to Build a Methodology Taxonomy of Information Science in China

URL: http://arxiv.org/abs/2101.07924v1
Date: Wed, 20 Jan 2021 01:56:43 GMT
Title: Using Full-text Content of Academic Articles to Build a Methodology Taxonomy of Information Science in China
Authors: Heng Zhang, Chengzhi Zhang
Abstract summary: This study provides new concepts for constructing a methodology taxonomy of information science. The proposed methodology taxonomy is more detailed than conventional schemes and the speed of taxonomy renewal has been enhanced.
Score: 10.949304105928286
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Research on the construction of traditional information science methodology taxonomy is mostly conducted manually. From the limited corpus, researchers have attempted to summarize some of the research methodology entities into several abstract levels (generally three levels); however, they have been unable to provide a more granular hierarchy. Moreover, updating the methodology taxonomy is traditionally a slow process. In this study, we collected full-text academic papers related to information science. First, we constructed a basic methodology taxonomy with three levels by manual annotation. Then, the word vectors of the research methodology entities were trained using the full-text data. Accordingly, the research methodology entities were clustered and the basic methodology taxonomy was expanded using the clustering results to obtain a methodology taxonomy with more levels. This study provides new concepts for constructing a methodology taxonomy of information science. The proposed methodology taxonomy is semi-automated; it is more detailed than conventional schemes and the speed of taxonomy renewal has been enhanced.
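The expansion step described in the abstract (cluster the entity vectors, then use the clusters to add a level below the manually built taxonomy) can be sketched roughly as follows. This is only an illustrative sketch, not the authors' implementation: the abstract does not name the clustering algorithm, so average-linkage agglomerative clustering over cosine similarity is an assumption, and the toy vectors, the category names, the `threshold` value, and the helper names `cluster_entities` and `expand_leaf` are all hypothetical.

```python
import math
from itertools import combinations


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v)))


def cluster_entities(vectors, threshold):
    """Average-linkage agglomerative clustering (an assumed choice).

    Repeatedly merges the two most similar clusters until no pair of
    clusters has an average cosine similarity of at least `threshold`.
    """
    clusters = [[name] for name in sorted(vectors)]

    def similarity(c1, c2):
        pairs = [(a, b) for a in c1 for b in c2]
        return sum(cosine(vectors[a], vectors[b]) for a, b in pairs) / len(pairs)

    while len(clusters) > 1:
        (i, j), best = max(
            (((i, j), similarity(clusters[i], clusters[j]))
             for i, j in combinations(range(len(clusters)), 2)),
            key=lambda item: item[1],
        )
        if best < threshold:
            break
        merged = sorted(clusters[i] + clusters[j])
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)] + [merged]
    return sorted(clusters)


def expand_leaf(entities, vectors, threshold):
    """Replace a flat list of entities with clusters, adding one taxonomy level."""
    subset = {name: vectors[name] for name in entities}
    return cluster_entities(subset, threshold)


# Hypothetical entity vectors; in the study these would be word vectors
# trained on the full-text corpus.
toy_vectors = {
    "k-means": [0.9, 0.1, 0.0],
    "hierarchical clustering": [0.8, 0.2, 0.1],
    "svm": [0.1, 0.9, 0.1],
    "naive bayes": [0.2, 0.8, 0.0],
}

# Hypothetical three-level base taxonomy standing in for the manual annotation.
base_taxonomy = {"data analysis": {"machine learning": {"algorithms": list(toy_vectors)}}}

leaf = base_taxonomy["data analysis"]["machine learning"]["algorithms"]
expanded = expand_leaf(leaf, toy_vectors, threshold=0.8)
base_taxonomy["data analysis"]["machine learning"]["algorithms"] = expanded
print(expanded)
```

With these toy vectors the two clustering methods and the two classifiers fall into separate groups, which then sit one level below the manually annotated leaf. A lower threshold merges more aggressively and yields fewer, broader groups.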

Related papers

A Hybrid AI Methodology for Generating Ontologies of Research Topics from Scientific Paper Corpora [6.384357773998868]
This paper presents Sci-OG, a semi-automated methodology for generating research topic ontologies. We evaluate this approach against a range of alternative solutions using a dataset of 21,649 manually annotated semantic triples.
arXiv Detail & Related papers (2025-08-06T08:48:14Z)
TaxoAdapt: Aligning LLM-Based Multidimensional Taxonomy Construction to Evolving Research Corpora [34.103517830260365]
TaxoAdapt is a framework that adapts an LLM-generated taxonomy to a given corpus across multiple dimensions. We demonstrate its state-of-the-art performance across a diverse set of computer science conferences.
arXiv Detail & Related papers (2025-06-12T14:26:28Z)
Taxonomy Tree Generation from Citation Graph [15.188580557890942]
HiGTL is a novel end-to-end framework guided by human-provided instructions or preferred topics. We develop a novel taxonomy node verbalization strategy that iteratively generates central concepts for each cluster. Experiments demonstrate that HiGTL effectively produces coherent, high-quality concept.
arXiv Detail & Related papers (2024-10-02T13:02:03Z)
Automatic Bottom-Up Taxonomy Construction: A Software Application Domain Study [6.0158981171030685]
Previous research in software application domain classification has faced challenges due to the lack of a proper taxonomy. This study aims to develop a comprehensive software application domain taxonomy by integrating multiple data sources and leveraging ensemble methods.
arXiv Detail & Related papers (2024-09-24T08:55:07Z)
An Instance-based Plus Ensemble Learning Method for Classification of Scientific Papers [2.0794749869068005]
This paper introduces a novel approach that combines instance-based learning and ensemble learning techniques for classifying scientific papers. Experiments show that the proposed classification method is effective and efficient in categorizing papers into various research areas.
arXiv Detail & Related papers (2024-09-21T19:42:15Z)
Empirical and Experimental Perspectives on Big Data in Recommendation Systems: A Comprehensive Survey [2.6319554262325924]
This survey paper provides a comprehensive analysis of big data algorithms in recommendation systems. It proposes a two-pronged approach: a thorough analysis of current algorithms and a novel, hierarchical taxonomy for precise categorization.
arXiv Detail & Related papers (2024-02-01T23:51:29Z)
A Comprehensive Survey of Text Classification Techniques and Their Research Applications: Observational and Experimental Insights [2.1436706159840013]
This survey paper introduces a comprehensive taxonomy specifically designed for text classification based on research fields. The taxonomy is structured into hierarchical levels: research field-based category, research field-based sub-category, methodology-based technique, methodology sub-technique, and research field applications.
arXiv Detail & Related papers (2024-01-11T08:17:42Z)
Incremental hierarchical text clustering methods: a review [49.32130498861987]
This study aims to analyze various hierarchical and incremental clustering techniques. The main contribution of this research is the organization and comparison of the techniques used by studies published between 2010 and 2018 that addressed text document clustering.
arXiv Detail & Related papers (2023-12-12T22:27:29Z)
Taxonomy Enrichment with Text and Graph Vector Representations [61.814256012166794]
We address the problem of taxonomy enrichment which aims at adding new words to the existing taxonomy. We present a new method that allows achieving high results on this task with little effort. We achieve state-of-the-art results across different datasets and provide an in-depth error analysis of mistakes.
arXiv Detail & Related papers (2022-01-21T09:01:12Z)
TaxoCom: Topic Taxonomy Completion with Hierarchical Discovery of Novel Topic Clusters [57.59286394188025]
We propose a novel framework for topic taxonomy completion, named TaxoCom. TaxoCom discovers novel sub-topic clusters of terms and documents. Our comprehensive experiments on two real-world datasets demonstrate that TaxoCom generates a high-quality topic taxonomy in terms of term coherency and topic coverage.
arXiv Detail & Related papers (2022-01-18T07:07:38Z)
Can Taxonomy Help? Improving Semantic Question Matching using Question Taxonomy [37.57300969050908]
We propose a hybrid technique for semantic question matching. It uses our proposed two-layered taxonomy for English questions by augmenting state-of-the-art deep learning models with question classes obtained from a deep learning based question.
arXiv Detail & Related papers (2021-01-20T16:23:04Z)
Studying Taxonomy Enrichment on Diachronic WordNet Versions [70.27072729280528]
We explore the possibilities of taxonomy extension in a resource-poor setting and present methods which are applicable to a large number of languages. We create novel English and Russian datasets for training and evaluating taxonomy enrichment models and describe a technique of creating such datasets for other languages.
arXiv Detail & Related papers (2020-11-23T16:49:37Z)
A Survey on Text Classification: From Shallow to Deep Learning [83.47804123133719]
The last decade has seen a surge of research in this area due to the unprecedented success of deep learning. This paper fills the gap by reviewing the state-of-the-art approaches from 1961 to 2021. We create a taxonomy for text classification according to the text involved and the models used for feature extraction and classification.
arXiv Detail & Related papers (2020-08-02T00:09:03Z)
Octet: Online Catalog Taxonomy Enrichment with Self-Supervision [67.26804972901952]
We present a self-supervised end-to-end framework, Octet, for Online Catalog EnrichmenT. We propose to train a sequence labeling model for term extraction and employ graph neural networks (GNNs) to capture the taxonomy structure. In the open-world evaluation, Octet enriches an online catalog in production to twice its original size.
arXiv Detail & Related papers (2020-06-18T04:53:07Z)

This list is automatically generated from the titles and abstracts of the papers on this site.