Using Full-text Content of Academic Articles to Build a Methodology Taxonomy of Information Science in China
- URL: http://arxiv.org/abs/2101.07924v1
- Date: Wed, 20 Jan 2021 01:56:43 GMT
- Title: Using Full-text Content of Academic Articles to Build a Methodology Taxonomy of Information Science in China
- Authors: Heng Zhang, Chengzhi Zhang
- Abstract summary: This study provides new concepts for constructing a methodology taxonomy of information science.
The proposed methodology taxonomy is more detailed than conventional schemes and the speed of taxonomy renewal has been enhanced.
- Score: 10.949304105928286
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Methodology taxonomies in information science have traditionally
been constructed manually. Working from limited corpora, researchers have
grouped research methodology entities into a few abstract levels (generally
three) but have been unable to provide a more granular hierarchy. Moreover,
updating such a taxonomy is traditionally a slow process. In this study, we collected full-text
academic papers related to information science. First, we constructed a basic
methodology taxonomy with three levels by manual annotation. Then, the word
vectors of the research methodology entities were trained using the full-text
data. Accordingly, the research methodology entities were clustered and the
basic methodology taxonomy was expanded using the clustering results to obtain
a methodology taxonomy with more levels. This study provides new concepts for
constructing a methodology taxonomy of information science. The proposed
methodology taxonomy is built semi-automatically; it is more detailed than
conventional schemes and can be updated more quickly.
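The expansion step described in the abstract (cluster the method entities by their word vectors, then attach the clusters beneath the manually annotated categories) can be illustrated with a minimal sketch. Everything concrete here is an assumption for illustration: the toy vectors in `VECTORS`, the category in `BASE`, the similarity threshold, and the use of average-linkage agglomerative clustering with cosine similarity. The abstract does not say which clustering algorithm or vector model the authors used.

```python
import math
from itertools import combinations

# Hypothetical toy word vectors for method entities. In the study these
# would be trained on the full-text corpus; the values here are invented.
VECTORS = {
    "svm": [0.9, 0.1, 0.0],
    "naive bayes": [0.8, 0.2, 0.1],
    "lda": [0.1, 0.9, 0.1],
    "plsa": [0.2, 0.8, 0.0],
}

# Hypothetical slice of a manually annotated basic taxonomy:
# lowest-level category -> method entities assigned to it.
BASE = {"machine learning": ["svm", "naive bayes", "lda", "plsa"]}


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def average_link(c1, c2, vectors):
    """Mean pairwise cosine similarity between two clusters."""
    sims = [cosine(vectors[a], vectors[b]) for a in c1 for b in c2]
    return sum(sims) / len(sims)


def cluster(entities, vectors, threshold):
    """Agglomerative clustering: repeatedly merge the most similar pair of
    clusters until no pair reaches the similarity threshold."""
    clusters = [[e] for e in entities]
    while len(clusters) > 1:
        (i, j), best = max(
            (((i, j), average_link(clusters[i], clusters[j], vectors))
             for i, j in combinations(range(len(clusters)), 2)),
            key=lambda pair: pair[1],
        )
        if best < threshold:
            break
        merged = clusters[i] + clusters[j]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)
    return sorted(sorted(c) for c in clusters)


def expand(base, vectors, threshold=0.9):
    """Add one level under each category: its entities grouped into clusters."""
    return {cat: cluster(ents, vectors, threshold) for cat, ents in base.items()}


expanded = expand(BASE, VECTORS)
print(expanded)
```

With these toy vectors the single category splits into two sub-groups, one holding the classifier-like entities and one holding the topic-model-like entities. Raising the threshold yields finer groups, lowering it yields coarser ones, and naming each new sub-level would still require human judgment.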
Related papers
- Automatic Bottom-Up Taxonomy Construction: A Software Application Domain Study [6.0158981171030685]
Previous research in software application domain classification has faced challenges due to the lack of a proper taxonomy.
This study aims to develop a comprehensive software application domain taxonomy by integrating multiple data sources and leveraging ensemble methods.
arXiv Detail & Related papers (2024-09-24T08:55:07Z)
- An Instance-based Plus Ensemble Learning Method for Classification of Scientific Papers [2.0794749869068005]
This paper introduces a novel approach that combines instance-based learning and ensemble learning techniques for classifying scientific papers.
Experiments show that the proposed classification method is effective and efficient in categorizing papers into various research areas.
arXiv Detail & Related papers (2024-09-21T19:42:15Z)
- Empirical and Experimental Perspectives on Big Data in Recommendation Systems: A Comprehensive Survey [2.6319554262325924]
This survey paper provides a comprehensive analysis of big data algorithms in recommendation systems.
It proposes a two-pronged approach: a thorough analysis of current algorithms and a novel, hierarchical taxonomy for precise categorization.
arXiv Detail & Related papers (2024-02-01T23:51:29Z)
- Text Classification: A Review, Empirical, and Experimental Evaluation [2.341806147715478]
Existing survey papers categorize algorithms for text classification into broad classes.
We introduce a novel methodological taxonomy that classifies algorithms hierarchically into fine-grained classes and specific techniques.
Our study is the first survey to utilize this methodological taxonomy for classifying algorithms for text classification.
arXiv Detail & Related papers (2024-01-11T08:17:42Z)
- Incremental hierarchical text clustering methods: a review [49.32130498861987]
This study aims to analyze various hierarchical and incremental clustering techniques.
The main contribution of this research is the organization and comparison of the techniques used by studies published between 2010 and 2018 that addressed text document clustering.
arXiv Detail & Related papers (2023-12-12T22:27:29Z)
- Taxonomy Enrichment with Text and Graph Vector Representations [61.814256012166794]
We address the problem of taxonomy enrichment which aims at adding new words to the existing taxonomy.
We present a new method that allows achieving high results on this task with little effort.
We achieve state-of-the-art results across different datasets and provide an in-depth error analysis of mistakes.
arXiv Detail & Related papers (2022-01-21T09:01:12Z)
- TaxoCom: Topic Taxonomy Completion with Hierarchical Discovery of Novel Topic Clusters [57.59286394188025]
We propose a novel framework for topic taxonomy completion, named TaxoCom.
TaxoCom discovers novel sub-topic clusters of terms and documents.
Our comprehensive experiments on two real-world datasets demonstrate that TaxoCom generates a high-quality topic taxonomy in terms of term coherency and topic coverage.
arXiv Detail & Related papers (2022-01-18T07:07:38Z)
- Can Taxonomy Help? Improving Semantic Question Matching using Question Taxonomy [37.57300969050908]
We propose a hybrid technique for semantic question matching.
It uses our proposed two-layered taxonomy for English questions by augmenting state-of-the-art deep learning models with question classes obtained from a deep learning based question classifier.
arXiv Detail & Related papers (2021-01-20T16:23:04Z)
- Studying Taxonomy Enrichment on Diachronic WordNet Versions [70.27072729280528]
We explore the possibilities of taxonomy extension in a resource-poor setting and present methods which are applicable to a large number of languages.
We create novel English and Russian datasets for training and evaluating taxonomy enrichment models and describe a technique of creating such datasets for other languages.
arXiv Detail & Related papers (2020-11-23T16:49:37Z)
- A Survey on Text Classification: From Shallow to Deep Learning [83.47804123133719]
The last decade has seen a surge of research in this area due to the unprecedented success of deep learning.
This paper fills the gap by reviewing the state-of-the-art approaches from 1961 to 2021.
We create a taxonomy for text classification according to the text involved and the models used for feature extraction and classification.
arXiv Detail & Related papers (2020-08-02T00:09:03Z)
- Octet: Online Catalog Taxonomy Enrichment with Self-Supervision [67.26804972901952]
We present a self-supervised end-to-end framework, Octet, for Online Catalog Taxonomy EnrichmenT.
We propose to train a sequence labeling model for term extraction and employ graph neural networks (GNNs) to capture the taxonomy structure.
In the open-world evaluation, Octet enriches an online catalog in production to twice its original size.
arXiv Detail & Related papers (2020-06-18T04:53:07Z)