Academic Resource Text Level Multi-label Classification based on
Attention
- URL: http://arxiv.org/abs/2203.10743v1
- Date: Mon, 21 Mar 2022 05:32:35 GMT
- Title: Academic Resource Text Level Multi-label Classification based on
Attention
- Authors: Yue Wang, Yawen Li, Ang Li
- Abstract summary: Hierarchical multi-label academic text classification (HMTC) assigns academic texts to a hierarchically structured labeling system.
We propose an attention-based hierarchical multi-label classification algorithm of academic texts (AHMCA) by integrating features such as text, keywords, and hierarchical structure.
- Score: 16.71166207897885
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Hierarchical multi-label academic text classification (HMTC) assigns
academic texts to a hierarchically structured labeling system. We propose an
attention-based hierarchical multi-label classification algorithm of academic
texts (AHMCA) that integrates features such as text, keywords, and hierarchical
structure, so that academic documents are classified into the most relevant
categories. We utilize word2vec and BiLSTM to obtain embedding and latent
vector representations of text, keywords, and hierarchies. We use a hierarchical
attention mechanism to capture the associations between keywords, label
hierarchies, and text word vectors, generating hierarchy-specific document
embedding vectors that replace the original text embeddings in HMCN-F. The
experimental results on the academic text dataset demonstrate the effectiveness
of the AHMCA algorithm.
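The abstract does not give the attention equations, so the following is only a minimal sketch of the general idea of a hierarchy-specific document embedding, not the AHMCA implementation. It assumes plain dot-product attention, one hypothetical query vector per hierarchy level (`level_queries`), and toy word vectors standing in for the word2vec/BiLSTM outputs; all names are illustrative.

```python
import math
from typing import List

Vector = List[float]


def softmax(scores: List[float]) -> List[float]:
    """Numerically stable softmax over a list of scores."""
    highest = max(scores)
    exps = [math.exp(s - highest) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a: Vector, b: Vector) -> float:
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def level_document_embedding(word_vectors: List[Vector], level_query: Vector) -> Vector:
    """Attention-weighted sum of word vectors for one hierarchy level.

    Assumption: dot-product scoring between each word vector and a
    level-specific query; the paper may use a different scoring function.
    """
    weights = softmax([dot(w, level_query) for w in word_vectors])
    dim = len(word_vectors[0])
    return [
        sum(weight * w[d] for weight, w in zip(weights, word_vectors))
        for d in range(dim)
    ]


def hierarchical_document_embeddings(
    word_vectors: List[Vector], level_queries: List[Vector]
) -> List[Vector]:
    """One document embedding per hierarchy level."""
    return [level_document_embedding(word_vectors, q) for q in level_queries]


# Toy inputs (hypothetical): three 2-dimensional word vectors, two levels.
word_vectors = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
level_queries = [[2.0, 0.0], [0.0, 2.0]]
doc_embeddings = hierarchical_document_embeddings(word_vectors, level_queries)
```

In this toy setup each level's query emphasizes a different dimension, so the two level-specific embeddings of the same document differ. A hierarchical classifier such as HMCN-F would then consume one embedding per level instead of a single shared text embedding.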
Related papers
- Text2Tree: Aligning Text Representation to the Label Tree Hierarchy for
Imbalanced Medical Classification [9.391704905671476]
This paper aims to rethink the data challenges in medical texts and present a novel framework-agnostic algorithm called Text2Tree.
We embed the ICD code tree structure of labels into cascade attention modules for learning hierarchy-aware label representations.
Two new learning schemes, Similarity Surrogate Learning (SSL) and Dissimilarity Mixup Learning (DML), are devised to boost text classification by reusing and distinguishing samples of other labels.
arXiv Detail & Related papers (2023-11-28T10:02:08Z)
- Recent Advances in Hierarchical Multi-label Text Classification: A
Survey [11.709847202580505]
Hierarchical multi-label text classification aims to classify the input text into multiple labels, among which the labels are structured and hierarchical.
It is a vital task in many real-world applications, e.g. scientific literature archiving.
arXiv Detail & Related papers (2023-07-30T16:13:00Z)
- Hierarchical Multi-Label Classification of Scientific Documents [47.293189105900524]
We introduce a new dataset for hierarchical multi-label text classification of scientific papers called SciHTC.
This dataset contains 186,160 papers and 1,233 categories from the ACM CCS tree.
Our best model achieves a Macro-F1 score of 34.57%, which shows that this dataset provides significant research opportunities.
arXiv Detail & Related papers (2022-11-05T04:12:57Z)
- Many-Class Text Classification with Matching [65.74328417321738]
We formulate Text Classification as a Matching problem between the text and the labels, and propose a simple yet effective framework named TCM.
Compared with previous text classification approaches, TCM takes advantage of the fine-grained semantic information of the classification labels.
arXiv Detail & Related papers (2022-05-23T15:51:19Z)
- Incorporating Hierarchy into Text Encoder: a Contrastive Learning
Approach for Hierarchical Text Classification [23.719121637849806]
We propose Hierarchy-guided Contrastive Learning (HGCLR) to embed the label hierarchy into a text encoder.
During training, HGCLR constructs positive samples for input text under the guidance of the label hierarchy.
After training, the HGCLR-enhanced text encoder can dispense with the redundant hierarchy.
arXiv Detail & Related papers (2022-03-08T03:21:45Z)
- Minimally-Supervised Structure-Rich Text Categorization via Learning on
Text-Rich Networks [61.23408995934415]
We propose a novel framework for minimally supervised categorization by learning from the text-rich network.
Specifically, we jointly train two modules with different inductive biases -- a text analysis module for text understanding and a network learning module for class-discriminative, scalable network learning.
Our experiments show that given only three seed documents per category, our framework can achieve an accuracy of about 92%.
arXiv Detail & Related papers (2021-02-23T04:14:34Z)
- MATCH: Metadata-Aware Text Classification in A Large Hierarchy [60.59183151617578]
MATCH is an end-to-end framework that leverages both metadata and hierarchy information.
We propose different ways to regularize the parameters and output probability of each child label by its parents.
Experiments on two massive text datasets with large-scale label hierarchies demonstrate the effectiveness of MATCH.
arXiv Detail & Related papers (2021-02-15T05:23:08Z)
- Exploring the Hierarchy in Relation Labels for Scene Graph Generation [75.88758055269948]
Experiments show that the proposed simple yet effective method can improve several state-of-the-art baselines by a large margin (up to 33% relative gain) in terms of Recall@50.
arXiv Detail & Related papers (2020-09-12T17:36:53Z)
- Joint Embedding of Words and Category Labels for Hierarchical
Multi-label Text Classification [4.2750700546937335]
Hierarchical text classification (HTC) has received extensive attention and has broad application prospects.
We propose a joint embedding of text and parent category based on hierarchical fine-tuning ordered neurons LSTM (HFT-ONLSTM) for HTC.
arXiv Detail & Related papers (2020-04-06T11:06:08Z)
- Hierarchical Image Classification using Entailment Cone Embeddings [68.82490011036263]
We first inject label-hierarchy knowledge into an arbitrary CNN-based classifier.
We empirically show that availability of such external semantic information in conjunction with the visual semantics from images boosts overall performance.
arXiv Detail & Related papers (2020-04-02T10:22:02Z)