HPT: Hierarchy-aware Prompt Tuning for Hierarchical Text Classification
- URL: http://arxiv.org/abs/2204.13413v1
- Date: Thu, 28 Apr 2022 11:22:49 GMT
- Title: HPT: Hierarchy-aware Prompt Tuning for Hierarchical Text Classification
- Authors: Zihan Wang, Peiyi Wang, Tianyu Liu, Yunbo Cao, Zhifang Sui, Houfeng
Wang
- Abstract summary: We propose HPT, a Hierarchy-aware Prompt Tuning method to handle HTC from a multi-label perspective.
Specifically, we construct dynamic virtual templates and label words that take the form of soft prompts to fuse the label hierarchy knowledge.
Experiments show that HPT achieves state-of-the-art performance on three popular HTC datasets.
- Score: 45.314357107687286
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Hierarchical text classification (HTC) is a challenging subtask of
multi-label classification due to its complex label hierarchy. Recently,
pretrained language models (PLMs) have been widely adopted for HTC through a
fine-tuning paradigm. However, in this paradigm there is a large gap between
classification tasks with a sophisticated label hierarchy and the masked
language model (MLM) pretraining task of PLMs, so the potential of PLMs cannot
be fully tapped. To bridge this gap, we propose HPT, a Hierarchy-aware Prompt
Tuning method that handles HTC from a multi-label MLM perspective.
Specifically, we construct dynamic virtual templates and label words that take
the form of soft prompts to fuse the label hierarchy knowledge, and we
introduce a zero-bounded multi-label cross entropy loss to harmonize the
objectives of HTC and MLM. Extensive experiments show that HPT achieves
state-of-the-art performance on three popular HTC datasets and is adept at
handling imbalanced and low-resource situations.
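The zero-bounded multi-label cross entropy loss mentioned in the abstract presumably anchors the decision threshold at zero, pushing every positive label above a score of 0 and every negative label below it. The snippet below is a minimal PyTorch sketch of that idea under the common zero-thresholded log-sum-exp formulation; the function name, tensor layout, and exact form are illustrative assumptions rather than the authors' released implementation.

```python
import torch

def zero_bounded_multilabel_ce(logits, targets):
    """Sketch of a zero-bounded multi-label cross entropy (assumed form).

    logits:  (batch, num_labels) raw score for each label in the hierarchy
    targets: (batch, num_labels) multi-hot {0, 1} ground-truth label matrix
    """
    pos_mask = targets.bool()

    # Negative labels should score below 0: keep their scores, drop positives.
    neg_logits = logits.masked_fill(pos_mask, float("-inf"))
    # Positive labels should score above 0: negate their scores, drop negatives.
    pos_logits = (-logits).masked_fill(~pos_mask, float("-inf"))

    # Appending a zero column turns logsumexp into log(1 + sum(exp(.))),
    # which fixes the classification threshold at exactly 0.
    zero = torch.zeros_like(logits[:, :1])
    neg_term = torch.logsumexp(torch.cat([neg_logits, zero], dim=-1), dim=-1)
    pos_term = torch.logsumexp(torch.cat([pos_logits, zero], dim=-1), dim=-1)

    return (neg_term + pos_term).mean()

# Toy usage: 2 samples, 4 hierarchical labels.
scores = torch.tensor([[2.0, -1.0, 0.5, -3.0], [1.0, 1.0, -2.0, -2.0]])
labels = torch.tensor([[1, 0, 1, 0], [1, 1, 0, 0]])
print(zero_bounded_multilabel_ce(scores, labels))
```

Under this reading, labels whose scores exceed 0 would be predicted at inference time, which is consistent with scoring every label word independently in the multi-label MLM view.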
Related papers
- Zero-to-Strong Generalization: Eliciting Strong Capabilities of Large Language Models Iteratively without Gold Labels [75.77877889764073]
Large Language Models (LLMs) have demonstrated remarkable performance through supervised fine-tuning or in-context learning using gold labels.
This study explores whether solely utilizing unlabeled data can elicit strong model capabilities.
We propose a new paradigm termed zero-to-strong generalization.
arXiv Detail & Related papers (2024-09-19T02:59:44Z) - Domain-Hierarchy Adaptation via Chain of Iterative Reasoning for Few-shot Hierarchical Text Classification [13.320591504692574]
We study the HTC problem under a few-shot setting in order to adapt the unstructured knowledge in PLMs to the downstream label hierarchy.
We use a simple yet effective method named Hierarchical Conditional Iterative Random Field (HierICRF) to search for the most domain-challenging directions.
We show that prompting with HierICRF significantly boosts few-shot HTC performance, with average Micro-F1 gains ranging from 1.50% to 28.80% and Macro-F1 gains from 1.5% to 36.29%.
arXiv Detail & Related papers (2024-07-12T03:21:57Z) - Retrieval-style In-Context Learning for Few-shot Hierarchical Text Classification [34.06292178703825]
We introduce the first ICL-based framework with large language models (LLMs) for few-shot HTC.
We exploit a retrieval database to identify relevant demonstrations, and an iterative policy to manage multi-layer hierarchical labels.
We achieve state-of-the-art results in few-shot HTC.
arXiv Detail & Related papers (2024-06-25T13:19:41Z) - HiGen: Hierarchy-Aware Sequence Generation for Hierarchical Text
Classification [19.12354692458442]
Hierarchical text classification (HTC) is a complex subtask of multi-label text classification.
We propose HiGen, a text-generation-based framework utilizing language models to encode dynamic text representations.
arXiv Detail & Related papers (2024-01-24T04:44:42Z) - TAT-LLM: A Specialized Language Model for Discrete Reasoning over Tabular and Textual Data [73.29220562541204]
We consider harnessing the power of large language models (LLMs) to solve this task.
We develop a TAT-LLM language model by fine-tuning LLaMA 2 with the training data generated automatically from existing expert-annotated datasets.
arXiv Detail & Related papers (2024-01-24T04:28:50Z) - Hierarchical Verbalizer for Few-Shot Hierarchical Text Classification [10.578682558356473]
Hierarchical text classification (HTC) suffers from poor performance in low-resource or few-shot settings.
In this work, we propose the hierarchical verbalizer ("HierVerb"), a multi-verbalizer framework that treats HTC as a single- or multi-label classification problem.
In this manner, HierVerb fuses label hierarchy knowledge into verbalizers and remarkably outperforms methods that inject hierarchy through graph encoders.
arXiv Detail & Related papers (2023-05-26T12:41:49Z) - MADNet: Maximizing Addressee Deduction Expectation for Multi-Party
Conversation Generation [64.54727792762816]
We study the scarcity of addressee labels, which is a common issue in multi-party conversations (MPCs).
We propose MADNet that maximizes addressee deduction expectation in heterogeneous graph neural networks for MPC generation.
Experimental results on two Ubuntu IRC channel benchmarks show that MADNet outperforms various baseline models on the task of MPC generation.
arXiv Detail & Related papers (2023-05-22T05:50:11Z) - Guiding the PLMs with Semantic Anchors as Intermediate Supervision:
Towards Interpretable Semantic Parsing [57.11806632758607]
We propose to combine current pretrained language models with a hierarchical decoder network.
By taking the first-principle structures as the semantic anchors, we propose two novel intermediate supervision tasks.
We conduct intensive experiments on several semantic parsing benchmarks and demonstrate that our approach can consistently outperform the baselines.
arXiv Detail & Related papers (2022-10-04T07:27:29Z) - Constrained Sequence-to-Tree Generation for Hierarchical Text
Classification [10.143177923523407]
Hierarchical Text Classification (HTC) is a challenging task where a document can be assigned to multiple hierarchically structured categories within a taxonomy.
In this paper, we formulate HTC as a sequence generation task and introduce a sequence-to-tree framework (Seq2Tree) for modeling the hierarchical label structure; a toy sketch of such taxonomy-constrained decoding appears after this list.
arXiv Detail & Related papers (2022-04-02T08:35:39Z) - HTCInfoMax: A Global Model for Hierarchical Text Classification via
Information Maximization [75.45291796263103]
The current state-of-the-art model for hierarchical text classification, HiAGM, has two limitations.
It correlates each text sample with all labels in the dataset, which introduces irrelevant information.
We propose HTCInfoMax to address these issues by introducing information maximization, which includes two modules.
arXiv Detail & Related papers (2021-04-12T06:04:20Z)
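For intuition about the sequence-to-tree formulation referenced above (Seq2Tree), the following is a toy sketch in which each decoding step is constrained to the children of the previously generated label; the taxonomy, the greedy strategy, and the score_fn stub are hypothetical stand-ins rather than the paper's actual decoding procedure.

```python
# A hypothetical label taxonomy: each label maps to its child labels.
TAXONOMY = {
    "<root>": ["News", "Sports"],
    "News": ["Politics", "Economy"],
    "Sports": ["Football", "Tennis"],
    "Politics": [], "Economy": [], "Football": [], "Tennis": [],
}

def constrained_decode(score_fn, max_depth=3):
    """Greedily decode one root-to-leaf label path.

    At every step the candidate set is restricted to the children of the
    previously emitted label, so the output is guaranteed to be a valid
    path in the taxonomy. score_fn(path, candidates) returns a score per
    candidate; in a real system it would come from the generation model.
    """
    path, current = [], "<root>"
    for _ in range(max_depth):
        candidates = TAXONOMY.get(current, [])
        if not candidates:  # reached a leaf label, stop decoding
            break
        scores = score_fn(path, candidates)
        current = max(candidates, key=lambda c: scores[c])
        path.append(current)
    return path

# Toy usage with a stub scorer that prefers alphabetically earlier labels.
print(constrained_decode(lambda path, cands: {c: -ord(c[0]) for c in cands}))
```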