Joint Embedding of Words and Category Labels for Hierarchical
Multi-label Text Classification
- URL: http://arxiv.org/abs/2004.02555v3
- Date: Wed, 26 Aug 2020 03:20:30 GMT
- Title: Joint Embedding of Words and Category Labels for Hierarchical
Multi-label Text Classification
- Authors: Jingpeng Zhao and Yinglong Ma
- Abstract summary: Hierarchical text classification (HTC) has received extensive attention and has broad application prospects.
We propose a joint embedding of text and parent category based on hierarchical fine-tuning ordered neurons LSTM (HFT-ONLSTM) for HTC.
- Score: 4.2750700546937335
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Text classification has become increasingly challenging due to the continuous
refinement of classification label granularity and the expansion of
classification label scale. To address this, some research has focused
on strategies that exploit the hierarchical structure in problems with a
large number of categories. At present, hierarchical text classification (HTC)
has received extensive attention and has broad application prospects. Making
full use of the relationship between parent and child categories in text
classification tasks can greatly improve classification performance. In
this paper, we propose a joint embedding of text and parent category based on
hierarchical fine-tuning ordered neurons LSTM (HFT-ONLSTM) for HTC. Our method
makes full use of the connection between the upper-level and lower-level
labels. Experiments show that our model outperforms the state-of-the-art
hierarchical model at a lower computation cost.
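The abstract describes feeding the parent category together with the text when classifying at the next level. A minimal sketch of that idea follows, assuming the parent label's words are simply prepended to the document tokens so that both share one embedding layer. The names `build_joint_input`, `classify_level_by_level`, `ROOT`, and the keyword-rule classifiers are hypothetical stand-ins for illustration; they are not the authors' implementation and do not model the ON-LSTM or the fine-tuning step.

```python
from typing import Callable, List, Sequence

ROOT = "<root>"  # hypothetical placeholder used as the "parent" at level 1


def build_joint_input(text_tokens: Sequence[str], parent_label: str) -> List[str]:
    """Prepend the parent category's words to the text tokens so that both
    would pass through the same embedding layer (assumed joint-input scheme)."""
    return parent_label.lower().split() + list(text_tokens)


def classify_level_by_level(
    text_tokens: Sequence[str],
    level_classifiers: Sequence[Callable[[List[str]], str]],
) -> List[str]:
    """Predict one label per level, feeding each prediction to the next level."""
    path: List[str] = []
    parent = ROOT
    for classify in level_classifiers:
        joint = build_joint_input(text_tokens, parent)
        parent = classify(joint)
        path.append(parent)
    return path


# Toy stand-ins for trained per-level models (keyword rules, NOT an ON-LSTM).
def level1(tokens: List[str]) -> str:
    return "Science" if "quantum" in tokens else "Sports"


def level2(tokens: List[str]) -> str:
    # The parent label's token is visible here because of the joint input.
    if "science" in tokens:
        return "Physics" if "quantum" in tokens else "Biology"
    return "Football"


if __name__ == "__main__":
    doc = "new quantum sensor measures gravity".split()
    print(build_joint_input(doc, "Science"))
    print(classify_level_by_level(doc, [level1, level2]))
```

In a real model each per-level classifier would be a trained network, and the lower-level network could be initialized from the upper-level one; the sketch only shows how the parent prediction is threaded into the next level's input.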
Related papers
- HiGen: Hierarchy-Aware Sequence Generation for Hierarchical Text
Classification [19.12354692458442]
Hierarchical text classification (HTC) is a complex subtask under multi-label text classification.
We propose HiGen, a text-generation-based framework utilizing language models to encode dynamic text representations.
arXiv Detail & Related papers (2024-01-24T04:44:42Z)
- Hierarchical Multi-Label Classification of Scientific Documents [47.293189105900524]
We introduce a new dataset for hierarchical multi-label text classification of scientific papers called SciHTC.
This dataset contains 186,160 papers and 1,233 categories from the ACM CCS tree.
Our best model achieves a Macro-F1 score of 34.57% which shows that this dataset provides significant research opportunities.
arXiv Detail & Related papers (2022-11-05T04:12:57Z)
- Many-Class Text Classification with Matching [65.74328417321738]
We formulate Text Classification as a Matching problem between the text and the labels, and propose a simple yet effective framework named TCM.
Compared with previous text classification approaches, TCM takes advantage of the fine-grained semantic information of the classification labels.
arXiv Detail & Related papers (2022-05-23T15:51:19Z)
- HFT-ONLSTM: Hierarchical and Fine-Tuning Multi-label Text Classification [7.176984223240199]
Hierarchical multi-label text classification (HMTC) with higher accuracy over large sets of closely related categories has become a challenging problem.
We present a hierarchical and fine-tuning approach based on the ordered neurons LSTM (ON-LSTM) neural network, abbreviated as HFT-ONLSTM, for more accurate level-by-level HMTC.
arXiv Detail & Related papers (2022-04-18T00:57:46Z)
- Constrained Sequence-to-Tree Generation for Hierarchical Text
Classification [10.143177923523407]
Hierarchical Text Classification (HTC) is a challenging task where a document can be assigned to multiple hierarchically structured categories within a taxonomy.
In this paper, we formulate HTC as a sequence generation task and introduce a sequence-to-tree framework (Seq2Tree) for modeling the hierarchical label structure.
arXiv Detail & Related papers (2022-04-02T08:35:39Z)
- Label Hierarchy Transition: Delving into Class Hierarchies to Enhance
Deep Classifiers [40.993137740456014]
We propose a unified probabilistic framework based on deep learning to address the challenges of hierarchical classification.
The proposed framework can be readily adapted to any existing deep network with only minor modifications.
We extend our proposed LHT framework to the skin lesion diagnosis task and validate its great potential in computer-aided diagnosis.
arXiv Detail & Related papers (2021-12-04T14:58:36Z)
- Hierarchical Heterogeneous Graph Representation Learning for Short Text
Classification [60.233529926965836]
We propose a new method called SHINE, which is based on a graph neural network (GNN), for short text classification.
First, we model the short text dataset as a hierarchical heterogeneous graph consisting of word-level component graphs.
Then, we dynamically learn a short document graph that facilitates effective label propagation among similar short texts.
arXiv Detail & Related papers (2021-10-30T05:33:05Z)
- MATCH: Metadata-Aware Text Classification in A Large Hierarchy [60.59183151617578]
MATCH is an end-to-end framework that leverages both metadata and hierarchy information.
We propose different ways to regularize the parameters and output probability of each child label by its parents.
Experiments on two massive text datasets with large-scale label hierarchies demonstrate the effectiveness of MATCH.
arXiv Detail & Related papers (2021-02-15T05:23:08Z)
- Coherent Hierarchical Multi-Label Classification Networks [56.41950277906307]
C-HMCNN(h) is a novel approach for HMC problems, which exploits hierarchy information in order to produce predictions coherent with the constraint and improve performance.
We conduct an extensive experimental analysis showing the superior performance of C-HMCNN(h) when compared to state-of-the-art models.
arXiv Detail & Related papers (2020-10-20T09:37:02Z)
- Exploring the Hierarchy in Relation Labels for Scene Graph Generation [75.88758055269948]
Experiments show that the proposed simple yet effective method improves several state-of-the-art baselines by a large margin (up to 33% relative gain) in terms of Recall@50.
arXiv Detail & Related papers (2020-09-12T17:36:53Z)
- Description Based Text Classification with Reinforcement Learning [34.18824470728299]
We propose a new framework for text classification, in which each category label is associated with a category description.
We observe significant performance boosts over strong baselines on a wide range of text classification tasks.
arXiv Detail & Related papers (2020-02-08T02:14:28Z)