HiLight: A Hierarchy-aware Light Global Model with Hierarchical Local ConTrastive Learning
- URL: http://arxiv.org/abs/2408.05786v1
- Date: Sun, 11 Aug 2024 14:26:58 GMT
- Title: HiLight: A Hierarchy-aware Light Global Model with Hierarchical Local ConTrastive Learning
- Authors: Zhijian Chen, Zhonghua Li, Jianxin Yang, Ye Qi
- Abstract summary: Hierarchical text classification (HTC) is a sub-task of multi-label classification (MLC).
We propose a new learning task, called Hierarchical Local Contrastive Learning (HiLCL), to introduce the hierarchical information.
- Score: 3.889612454093451
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Hierarchical text classification (HTC) is a special sub-task of multi-label classification (MLC) in which the taxonomy is constructed as a tree and each sample is assigned at least one path in the tree. The latest HTC models contain three modules: a text encoder, a structure encoder, and a multi-label classification head. Specifically, the structure encoder is designed to encode the hierarchy of the taxonomy. However, the structure encoder has a scaling problem: as the taxonomy size increases, the learnable parameters of recent HTC works grow rapidly. Recursive regularization is another widely used method to introduce hierarchical information, but it has a collapse problem and is generally relaxed by assigning it a small weight (i.e., 1e-6). In this paper, we propose a Hierarchy-aware Light Global model with Hierarchical local conTrastive learning (HiLight), a lightweight and efficient global model consisting only of a text encoder and a multi-label classification head. We propose a new learning task, called Hierarchical Local Contrastive Learning (HiLCL), to introduce the hierarchical information. Extensive experiments are conducted on two benchmark datasets to demonstrate the effectiveness of our model.
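The abstract's task setup (a tree taxonomy, path-shaped labels, and a contrastive objective over local label groups) can be illustrated with a minimal sketch. The toy taxonomy, the label names, and the InfoNCE-style loss below are all illustrative assumptions, not the authors' exact HiLCL formulation or data:

```python
import math

# Hypothetical toy taxonomy (not from the paper): child -> parent.
PARENT = {
    "science": None,
    "cs": "science", "physics": "science",
    "nlp": "cs", "vision": "cs",
}
LABELS = sorted(PARENT)  # fixed label order for the multi-hot target

def path_to_root(label):
    """Ancestor path (leaf .. root); in HTC each sample is assigned
    at least one such path in the taxonomy tree."""
    path = []
    while label is not None:
        path.append(label)
        label = PARENT[label]
    return path

def multi_hot(leaves):
    """Multi-label target: the union of all labels on the assigned paths,
    which is what a multi-label classification head would predict."""
    active = {l for leaf in leaves for l in path_to_root(leaf)}
    return [1 if l in active else 0 for l in LABELS]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def local_contrastive_loss(embs, parents, temperature=0.1):
    """Generic InfoNCE-style loss in which positives are embeddings whose
    labels share a parent -- an illustration of the 'local' grouping idea
    only, not the authors' exact HiLCL objective."""
    loss, count = 0.0, 0
    for i, e_i in enumerate(embs):
        denom = sum(math.exp(dot(e_i, e_j) / temperature)
                    for j, e_j in enumerate(embs) if j != i)
        for j, e_j in enumerate(embs):
            if j != i and parents[j] == parents[i]:
                loss -= math.log(math.exp(dot(e_i, e_j) / temperature) / denom)
                count += 1
    return loss / max(count, 1)
```

For example, a document labeled with the leaf "nlp" receives the multi-hot target covering "nlp", "cs", and "science", and the loss is lower when embeddings of sibling labels are closer together than embeddings of unrelated labels.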
Related papers
- TELEClass: Taxonomy Enrichment and LLM-Enhanced Hierarchical Text Classification with Minimal Supervision [41.05874642535256]
Hierarchical text classification aims to categorize each document into a set of classes in a label taxonomy.
Most earlier works focus on fully or semi-supervised methods that require a large amount of human annotated data.
We work on hierarchical text classification with the minimal amount of supervision: using only the class name of each node as supervision.
arXiv Detail & Related papers (2024-02-29T22:26:07Z)
- Utilizing Local Hierarchy with Adversarial Training for Hierarchical Text Classification [30.353876890557984]
Hierarchical text classification (HTC) is a challenging subtask due to its complex taxonomic structure.
We propose HiAdv, a framework that can fit nearly all HTC models and optimize them with the local hierarchy as auxiliary information.
arXiv Detail & Related papers (2024-02-29T03:20:45Z)
- HiTIN: Hierarchy-aware Tree Isomorphism Network for Hierarchical Text Classification [18.03202012033514]
We propose hierarchy-aware Tree Isomorphism Network (HiTIN) to enhance the text representations with only syntactic information of the label hierarchy.
We conduct experiments on three commonly used datasets, and the results demonstrate that HiTIN achieves better test performance with less memory consumption.
arXiv Detail & Related papers (2023-05-24T14:14:08Z)
- Implant Global and Local Hierarchy Information to Sequence based Code Representation Models [25.776540440893257]
We analyze how the complete hierarchical structure influences the tokens in code sequences and abstract this influence as a property of code tokens called hierarchical embedding.
We propose the Hierarchy Transformer (HiT), a simple but effective sequence model to incorporate the complete hierarchical embeddings of source code into a Transformer model.
arXiv Detail & Related papers (2023-03-14T12:01:39Z)
- Hierarchical Multi-Label Classification of Scientific Documents [47.293189105900524]
We introduce a new dataset for hierarchical multi-label text classification of scientific papers called SciHTC.
This dataset contains 186,160 papers and 1,233 categories from the ACM CCS tree.
Our best model achieves a Macro-F1 score of 34.57%, which shows that this dataset provides significant research opportunities.
arXiv Detail & Related papers (2022-11-05T04:12:57Z)
- Seeded Hierarchical Clustering for Expert-Crafted Taxonomies [48.10324642720299]
We propose HierSeed, a weakly supervised algorithm for fitting unlabeled corpora to expert-crafted taxonomies.
It is both data- and computation-efficient.
It outperforms both unsupervised and supervised baselines for the SHC task on three real-world datasets.
arXiv Detail & Related papers (2022-05-23T19:58:06Z)
- Deep Hierarchical Semantic Segmentation [76.40565872257709]
Hierarchical semantic segmentation (HSS) aims at a structured, pixel-wise description of visual observations in terms of a class hierarchy.
The proposed HSSN casts HSS as a pixel-wise multi-label classification task, bringing only minimal architecture changes to current segmentation models.
With hierarchy-induced margin constraints, HSSN reshapes the pixel embedding space, so as to generate well-structured pixel representations.
arXiv Detail & Related papers (2022-03-27T15:47:44Z)
- Hierarchical Text Classification As Sub-Hierarchy Sequence Generation [8.062201442038957]
Hierarchical text classification (HTC) is essential for various real applications.
Recent HTC models have attempted to incorporate hierarchy information into a model structure.
We formulate HTC as a sub-hierarchy sequence generation to incorporate hierarchy information into a target label sequence.
The resulting model, HiDEC, achieved state-of-the-art performance with significantly fewer model parameters than existing models on benchmark datasets.
arXiv Detail & Related papers (2021-11-22T10:50:39Z)
- HTCInfoMax: A Global Model for Hierarchical Text Classification via Information Maximization [75.45291796263103]
The current state-of-the-art model HiAGM for hierarchical text classification has two limitations.
It correlates each text sample with all labels in the dataset, which introduces irrelevant information.
We propose HTCInfoMax to address these issues by introducing information maximization, which includes two modules.
arXiv Detail & Related papers (2021-04-12T06:04:20Z)
- MATCH: Metadata-Aware Text Classification in A Large Hierarchy [60.59183151617578]
MATCH is an end-to-end framework that leverages both metadata and hierarchy information.
We propose different ways to regularize the parameters and output probability of each child label by its parents.
Experiments on two massive text datasets with large-scale label hierarchies demonstrate the effectiveness of MATCH.
arXiv Detail & Related papers (2021-02-15T05:23:08Z)
- Exploring the Hierarchy in Relation Labels for Scene Graph Generation [75.88758055269948]
Experiments show that the proposed simple yet effective method can improve several state-of-the-art baselines by a large margin (up to 33% relative gain) in terms of Recall@50.
arXiv Detail & Related papers (2020-09-12T17:36:53Z)
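The MATCH entry above mentions regularizing each child label's output probability by its parents. One generic way to realize this idea, sketched here as an assumption rather than MATCH's exact formulation, is a hinge penalty on any child whose predicted probability exceeds its parent's, since a child should not be more likely than its ancestor in a tree taxonomy:

```python
# Hypothetical example labels; PARENT maps child -> parent (None at the root).
PARENT = {"science": None, "cs": "science", "nlp": "cs"}

def parent_violation_penalty(probs, parent=PARENT):
    """Sum of squared hinge violations max(0, p_child - p_parent)^2
    over all child labels; zero when the hierarchy is respected."""
    penalty = 0.0
    for label, p in probs.items():
        pa = parent.get(label)
        if pa is not None:
            penalty += max(0.0, p - probs[pa]) ** 2
    return penalty
```

In a training loop, such a penalty would be added to the classification loss with a small weight; here, a prediction of 0.7 for "nlp" under a 0.5 "cs" violates the hierarchy by 0.2.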
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this list (including all information) and is not responsible for any consequences.