HiTIN: Hierarchy-aware Tree Isomorphism Network for Hierarchical Text Classification
- URL: http://arxiv.org/abs/2305.15182v2
- Date: Fri, 9 Jun 2023 08:53:14 GMT
- Title: HiTIN: Hierarchy-aware Tree Isomorphism Network for Hierarchical Text Classification
- Authors: He Zhu, Chong Zhang, Junjie Huang, Junran Wu, Ke Xu
- Abstract summary: We propose the Hierarchy-aware Tree Isomorphism Network (HiTIN) to enhance text representations with only syntactic information of the label hierarchy.
We conduct experiments on three commonly used datasets and the results demonstrate that HiTIN could achieve better test performance and less memory consumption.
- Score: 18.03202012033514
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Hierarchical text classification (HTC) is a challenging subtask of
multi-label classification as the labels form a complex hierarchical structure.
Existing dual-encoder methods in HTC achieve weak performance gains with huge
memory overheads and their structure encoders heavily rely on domain knowledge.
Motivated by this observation, we investigate the feasibility of a
memory-friendly model with strong generalization capability that could boost
the performance of HTC without prior statistics or label semantics. In this
paper, we propose Hierarchy-aware Tree Isomorphism Network (HiTIN) to enhance
the text representations with only syntactic information of the label
hierarchy. Specifically, we convert the label hierarchy into an unweighted tree
structure, termed coding tree, with the guidance of structural entropy. Then we
design a structure encoder to incorporate hierarchy-aware information in the
coding tree into text representations. Besides the text encoder, HiTIN only
contains a few multi-layer perceptrons and linear transformations, which
greatly saves memory. We conduct experiments on three commonly used datasets
and the results demonstrate that HiTIN could achieve better test performance
and less memory consumption than state-of-the-art (SOTA) methods.
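The structure encoder described above can be illustrated with a minimal sketch: bottom-up, GIN-style aggregation over an unweighted coding tree using only small MLPs. The tree shape, feature dimension, and node names below are illustrative assumptions, not the paper's actual configuration.

```python
import numpy as np

rng = np.random.default_rng(0)
dim = 8

def mlp(x, W, b):
    # One-layer perceptron with ReLU; HiTIN relies on only a few such MLPs.
    return np.maximum(W @ x + b, 0.0)

# Illustrative two-level coding tree over the label hierarchy: leaf nodes
# carry label-wise text features; internal nodes aggregate their children.
tree = {"root": ["a", "b"], "a": ["l1", "l2"], "b": ["l3"]}
feats = {leaf: rng.standard_normal(dim) for leaf in ["l1", "l2", "l3"]}
W, b = rng.standard_normal((dim, dim)), np.zeros(dim)

def encode(node):
    # Bottom-up aggregation: sum child representations, then apply an MLP,
    # in the spirit of tree-isomorphism (GIN-style) message passing.
    if node in feats:
        return feats[node]
    child_sum = sum(encode(c) for c in tree[node])
    return mlp(child_sum, W, b)

root_repr = encode("root")
print(root_repr.shape)
```

In the full model, the leaf features would come from the text encoder and the resulting hierarchy-aware vectors would be fused back into the document representation.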
Related papers
- HiLight: A Hierarchy-aware Light Global Model with Hierarchical Local ConTrastive Learning [3.889612454093451]
Hierarchical text classification (HTC) is a sub-task of multi-label classification (MLC).
We propose a new learning task to introduce hierarchical information, called Hierarchical Local Contrastive Learning (HiLCL).
arXiv Detail & Related papers (2024-08-11T14:26:58Z)
- HDT: Hierarchical Document Transformer [70.2271469410557]
HDT exploits document structure by introducing auxiliary anchor tokens and redesigning the attention mechanism into a sparse multi-level hierarchy.
We develop a novel sparse attention kernel that considers the hierarchical structure of documents.
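The sparse multi-level attention pattern HDT describes can be sketched as a boolean attention mask in which ordinary tokens attend only within their own section, while auxiliary anchor tokens attend globally. The token names and section layout below are illustrative assumptions.

```python
import numpy as np

# Toy document: anchor tokens ([DOC], [SEC]) plus word tokens, each tagged
# with its section id; -1 marks the document-level anchor.
tokens = ["[DOC]", "[SEC]", "w1", "w2", "[SEC]", "w3"]
section = [-1, 0, 0, 0, 1, 1]
is_anchor = [t.startswith("[") for t in tokens]

n = len(tokens)
mask = np.zeros((n, n), dtype=bool)
for i in range(n):
    for j in range(n):
        # Sparse multi-level pattern: attend within the same section,
        # or to/from any anchor token that summarizes a level.
        mask[i, j] = (section[i] == section[j]) or is_anchor[i] or is_anchor[j]

# w1 cannot attend to w3 directly; information flows through the anchors.
print(mask[2, 5], mask[2, 1])
```

The mask has far fewer attended pairs than dense attention, which is what makes a dedicated sparse kernel worthwhile at document length.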
arXiv Detail & Related papers (2024-07-11T09:28:04Z)
- HiGen: Hierarchy-Aware Sequence Generation for Hierarchical Text Classification [19.12354692458442]
Hierarchical text classification (HTC) is a complex subtask under multi-label text classification.
We propose HiGen, a text-generation-based framework utilizing language models to encode dynamic text representations.
arXiv Detail & Related papers (2024-01-24T04:44:42Z)
- Use All The Labels: A Hierarchical Multi-Label Contrastive Learning Framework [75.79736930414715]
We present a hierarchical multi-label representation learning framework that can leverage all available labels and preserve the hierarchical relationship between classes.
We introduce novel hierarchy preserving losses, which jointly apply a hierarchical penalty to the contrastive loss, and enforce the hierarchy constraint.
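One way such a hierarchy-preserving loss might look, as a hedged sketch: a supervised contrastive objective in which each positive pair is weighted by the deepest taxonomy level at which the two samples' label paths agree, so fine-grained agreement counts more. The weighting scheme and label paths below are illustrative assumptions, not the paper's exact losses.

```python
import numpy as np

def hier_sup_con(z, paths, level_weight, tau=0.1):
    # Supervised contrastive loss with a hierarchical penalty: a positive
    # pair's log-likelihood term is scaled by a weight tied to the deepest
    # level at which the two label paths match.
    z = z / np.linalg.norm(z, axis=1, keepdims=True)
    sim = np.exp(z @ z.T / tau)
    np.fill_diagonal(sim, 0.0)
    loss = 0.0
    for i in range(len(z)):
        denom = sim[i].sum()
        for j in range(len(z)):
            if i == j:
                continue
            # Deepest level at which the label paths of i and j agree.
            depth = 0
            while (depth < min(len(paths[i]), len(paths[j]))
                   and paths[i][depth] == paths[j][depth]):
                depth += 1
            if depth > 0:
                loss -= level_weight[depth - 1] * np.log(sim[i, j] / denom)
    return loss

rng = np.random.default_rng(1)
z = rng.standard_normal((3, 4))
paths = [("news", "sports"), ("news", "sports"), ("sci",)]
loss = hier_sup_con(z, paths, level_weight=[0.5, 1.0])
print(loss > 0)
```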
arXiv Detail & Related papers (2022-04-27T21:41:44Z)
- Constrained Sequence-to-Tree Generation for Hierarchical Text Classification [10.143177923523407]
Hierarchical Text Classification (HTC) is a challenging task where a document can be assigned to multiple hierarchically structured categories within a taxonomy.
In this paper, we formulate HTC as a sequence generation task and introduce a sequence-to-tree framework (Seq2Tree) for modeling the hierarchical label structure.
arXiv Detail & Related papers (2022-04-02T08:35:39Z)
- Deep Hierarchical Semantic Segmentation [76.40565872257709]
Hierarchical semantic segmentation (HSS) aims at structured, pixel-wise description of visual observations in terms of a class hierarchy.
The proposed HSSN casts HSS as a pixel-wise multi-label classification task, bringing only minimal architectural change to current segmentation models.
With hierarchy-induced margin constraints, HSSN reshapes the pixel embedding space, so as to generate well-structured pixel representations.
arXiv Detail & Related papers (2022-03-27T15:47:44Z)
- Incorporating Hierarchy into Text Encoder: a Contrastive Learning Approach for Hierarchical Text Classification [23.719121637849806]
We propose a hierarchy-guided Contrastive Learning (HGCLR) to embed the label hierarchy into a text encoder.
During training, HGCLR constructs positive samples for input text under the guidance of the label hierarchy.
After training, the HGCLR-enhanced text encoder can dispense with the redundant hierarchy.
arXiv Detail & Related papers (2022-03-08T03:21:45Z)
- Hierarchical Text Classification As Sub-Hierarchy Sequence Generation [8.062201442038957]
Hierarchical text classification (HTC) is essential for various real applications.
Recent HTC models have attempted to incorporate hierarchy information into a model structure.
We formulate HTC as a sub-hierarchy sequence generation to incorporate hierarchy information into a target label sequence.
HiDEC achieved state-of-the-art performance with significantly fewer model parameters than existing models on benchmark datasets.
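The idea of casting HTC as sub-hierarchy sequence generation can be sketched by linearizing the assigned portion of the taxonomy into a target token sequence via a depth-first walk. The taxonomy, traversal order, and branch-closing token below are illustrative assumptions, not HiDEC's actual target format.

```python
# Illustrative taxonomy: parent -> children.
taxonomy = {"root": ["sci", "sports"], "sci": ["physics", "cs"], "sports": []}

def to_sequence(node, assigned):
    # Depth-first walk that keeps only assigned labels, closing each
    # branch with a pop token ("/") so the tree can be reconstructed
    # unambiguously from the flat sequence.
    if node != "root" and node not in assigned:
        return []
    seq = [node]
    for child in taxonomy.get(node, []):
        seq += to_sequence(child, assigned)
    seq.append("/")
    return seq

seq = to_sequence("root", {"sci", "cs"})
print(seq)  # ['root', 'sci', 'cs', '/', '/', '/']
```

A decoder trained on such sequences emits only the relevant sub-hierarchy for each document instead of scoring every label in the taxonomy.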
arXiv Detail & Related papers (2021-11-22T10:50:39Z)
- HTCInfoMax: A Global Model for Hierarchical Text Classification via Information Maximization [75.45291796263103]
The current state-of-the-art model for hierarchical text classification, HiAGM, has two limitations.
It correlates each text sample with all labels in the dataset, which introduces irrelevant information.
We propose HTCInfoMax to address these issues by introducing information maximization, which comprises two modules.
arXiv Detail & Related papers (2021-04-12T06:04:20Z)
- MATCH: Metadata-Aware Text Classification in A Large Hierarchy [60.59183151617578]
MATCH is an end-to-end framework that leverages both metadata and hierarchy information.
We propose different ways to regularize the parameters and output probability of each child label by its parents.
Experiments on two massive text datasets with large-scale label hierarchies demonstrate the effectiveness of MATCH.
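The parent-child output regularization MATCH describes can be sketched as a penalty on hierarchy violations in the predicted probabilities: a child label's probability should not exceed its parent's. The taxonomy and probability values below are illustrative.

```python
import numpy as np

def hierarchy_violation_penalty(probs, parent):
    # Penalize every child label whose predicted probability exceeds
    # its parent's: sum of max(0, p_child - p_parent) over all edges.
    return sum(max(0.0, probs[c] - probs[p]) for c, p in parent.items())

# Illustrative taxonomy: label 0 is the root; 1 and 2 are its children,
# and 3 is a child of 1.
parent = {1: 0, 2: 0, 3: 1}
probs = np.array([0.9, 0.4, 0.95, 0.6])  # labels 2 and 3 violate the constraint
penalty = hierarchy_violation_penalty(probs, parent)
print(round(penalty, 2))  # 0.25
```

Added to the classification loss, such a term nudges the model toward predictions that respect the taxonomy without hard-constraining the output layer.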
arXiv Detail & Related papers (2021-02-15T05:23:08Z)
- Tree-structured Attention with Hierarchical Accumulation [103.47584968330325]
"Hierarchical Accumulation" encodes parse tree structures into self-attention at constant time complexity.
Our approach outperforms SOTA methods in four IWSLT translation tasks and the WMT'14 English-German translation task.
arXiv Detail & Related papers (2020-02-19T08:17:00Z)
This list is automatically generated from the titles and abstracts of the papers in this site.