Hierarchical Verbalizer for Few-Shot Hierarchical Text Classification
- URL: http://arxiv.org/abs/2305.16885v1
- Date: Fri, 26 May 2023 12:41:49 GMT
- Title: Hierarchical Verbalizer for Few-Shot Hierarchical Text Classification
- Authors: Ke Ji and Yixin Lian and Jingsheng Gao and Baoyuan Wang
- Abstract summary: Hierarchical text classification (HTC) suffers from poor performance in low-resource or few-shot settings.
In this work, we propose the hierarchical verbalizer ("HierVerb"), a multi-verbalizer framework treating HTC as a single- or multi-label classification problem.
In this manner, HierVerb fuses label hierarchy knowledge into verbalizers and markedly outperforms methods that inject hierarchy through graph encoders.
- Score: 10.578682558356473
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Due to the complex label hierarchy and the intensive labeling cost in practice,
hierarchical text classification (HTC) suffers from poor performance,
especially when low-resource or few-shot settings are considered. Recently,
there has been a growing trend of applying prompts to pre-trained language models
(PLMs), which has exhibited effectiveness in the few-shot flat text
classification tasks. However, limited work has studied the paradigm of
prompt-based learning in the HTC problem when the training data is extremely
scarce. In this work, we define a path-based few-shot setting and establish a
strict path-based evaluation metric to further explore few-shot HTC tasks. To
address the issue, we propose the hierarchical verbalizer ("HierVerb"), a
multi-verbalizer framework treating HTC as a single- or multi-label
classification problem at multiple layers and learning vectors as verbalizers
constrained by hierarchical structure and hierarchical contrastive learning. In
this manner, HierVerb fuses label hierarchy knowledge into verbalizers and
remarkably outperforms methods that inject hierarchy through graph encoders,
maximizing the benefits of PLMs. Extensive experiments on three popular HTC
datasets under few-shot settings demonstrate that prompting with HierVerb
significantly boosts HTC performance, while indicating an elegant way
to bridge the gap between large pre-trained models and downstream
hierarchical classification tasks. Our code and few-shot dataset are publicly
available at https://github.com/1KE-JI/HierVerb.
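The strict path-based metric is defined precisely in the released code; as a rough illustration of the idea only, the sketch below counts a prediction as correct when every label on the gold root-to-leaf path is predicted. The helper names (gold_path, strict_path_accuracy) are hypothetical, and the exact definition used by HierVerb may differ.
```python
# Minimal sketch of a strict path-level accuracy for HTC, assuming a prediction
# counts as correct only when every label on the gold root-to-leaf path is
# predicted. Illustrative only; see https://github.com/1KE-JI/HierVerb for the
# authoritative metric.
from typing import Dict, List, Set


def gold_path(leaf: str, parent: Dict[str, str]) -> List[str]:
    """Walk from a leaf label up to the root using a child->parent map."""
    path = [leaf]
    while path[-1] in parent:
        path.append(parent[path[-1]])
    return path


def strict_path_accuracy(predictions: List[Set[str]],
                         gold_leaves: List[str],
                         parent: Dict[str, str]) -> float:
    """Fraction of examples whose full gold path is contained in the prediction."""
    correct = 0
    for pred, leaf in zip(predictions, gold_leaves):
        if all(label in pred for label in gold_path(leaf, parent)):
            correct += 1
    return correct / max(len(gold_leaves), 1)


if __name__ == "__main__":
    # Toy 2-level taxonomy: root -> {news, sports}, news -> {politics}
    parent = {"politics": "news"}
    preds = [{"news", "politics"}, {"sports"}]
    gold = ["politics", "politics"]
    print(strict_path_accuracy(preds, gold, parent))  # 0.5
```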
Related papers
- Revisiting Hierarchical Text Classification: Inference and Metrics [4.057349748970303]
Hierarchical text classification (HTC) is the task of assigning labels to a text within a structured space organized as a hierarchy.
Recent works treat HTC as a conventional multi-label classification problem and therefore evaluate it as such.
We propose to evaluate models with specifically designed hierarchical metrics and demonstrate the intricacy of the choice of metric and prediction inference method.
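As a hedged illustration of the kind of hierarchy-aware metric in question (not necessarily this paper's exact proposal), one common family extends both predicted and gold label sets with their ancestors before computing set precision/recall:
```python
# Ancestor-augmented hierarchical precision/recall/F1 sketch: both label sets
# are closed under the ancestor relation before overlap is measured, so a
# near-miss in a sibling branch still earns partial credit at shared ancestors.
from typing import Dict, Set


def with_ancestors(labels: Set[str], parent: Dict[str, str]) -> Set[str]:
    closed = set()
    for label in labels:
        while label is not None:
            closed.add(label)
            label = parent.get(label)
    return closed


def hierarchical_f1(pred: Set[str], gold: Set[str], parent: Dict[str, str]) -> float:
    p = with_ancestors(pred, parent)
    g = with_ancestors(gold, parent)
    overlap = len(p & g)
    precision = overlap / len(p) if p else 0.0
    recall = overlap / len(g) if g else 0.0
    return 2 * precision * recall / (precision + recall) if precision + recall else 0.0


parent = {"politics": "news", "football": "sports"}
print(hierarchical_f1({"football"}, {"politics"}, parent))  # 0.0 (different branches)
print(hierarchical_f1({"news"}, {"politics"}, parent))      # ~0.667 (correct ancestor)
```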
arXiv Detail & Related papers (2024-10-02T07:57:33Z)
- Domain-Hierarchy Adaptation via Chain of Iterative Reasoning for Few-shot Hierarchical Text Classification [13.320591504692574]
We study the HTC problem under a few-shot setting, adapting the unstructured knowledge in PLMs to the downstream label hierarchy.
We use a simple yet effective method named Hierarchical Conditional Iterative Random Field (HierICRF) to search for the most domain-challenging directions.
We show that prompting with HierICRF significantly boosts few-shot HTC performance, with average Micro-F1 gains ranging from 28.80% to 1.50% and Macro-F1 gains from 36.29% to 1.5%.
arXiv Detail & Related papers (2024-07-12T03:21:57Z)
- Retrieval-style In-Context Learning for Few-shot Hierarchical Text Classification [34.06292178703825]
We introduce the first ICL-based framework with large language models (LLMs) for few-shot HTC.
We exploit a retrieval database to identify relevant demonstrations, and an iterative policy to manage multi-layer hierarchical labels.
We can achieve state-of-the-art results in few-shot HTC.
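A minimal sketch of the demonstration-retrieval step, assuming relevant examples are selected by similarity to the query and spliced into the prompt; the TF-IDF index, the build_prompt helper, and the prompt layout are stand-ins, not the paper's actual retriever or template:
```python
# Retrieval-based demonstration selection for in-context learning: pick the k
# labeled examples most similar to the query and format them as demonstrations.
# TF-IDF + cosine similarity stands in for a dense retrieval index here.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

train_texts = ["the match ended in a draw", "parliament passed the new bill"]
train_labels = ["sports/football", "news/politics"]

vectorizer = TfidfVectorizer().fit(train_texts)
index = vectorizer.transform(train_texts)


def build_prompt(query: str, k: int = 2) -> str:
    sims = cosine_similarity(vectorizer.transform([query]), index)[0]
    top = sims.argsort()[::-1][:k]                    # k nearest demonstrations
    demos = "\n".join(f"Text: {train_texts[i]}\nLabels: {train_labels[i]}" for i in top)
    return f"{demos}\nText: {query}\nLabels:"


print(build_prompt("the senate debated the bill"))
```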
arXiv Detail & Related papers (2024-06-25T13:19:41Z) - Noise-Robust Dense Retrieval via Contrastive Alignment Post Training [89.29256833403167]
Contrastive Alignment POst Training (CAPOT) is a highly efficient finetuning method that improves model robustness without requiring index regeneration.
CAPOT enables robust retrieval by freezing the document encoder while the query encoder learns to align noisy queries with their unaltered root.
We evaluate CAPOT on noisy variants of MSMARCO, Natural Questions, and Trivia QA passage retrieval, finding that CAPOT has a similar impact to data augmentation with none of its overhead.
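A hedged sketch of a contrastive alignment objective in this spirit: the query encoder is trained so a corrupted query embeds close to its clean version, with other queries as in-batch negatives. The toy linear encoders, the temperature, and the choice of which (frozen) encoder embeds the clean queries are assumptions, not CAPOT's exact loss.
```python
# Contrastive alignment sketch: a trainable query encoder pulls noisy queries
# toward frozen embeddings of their clean originals (in-batch negatives).
import torch
import torch.nn.functional as F

dim = 32
query_encoder = torch.nn.Linear(dim, dim)              # trainable
frozen_reference = torch.nn.Linear(dim, dim)           # frozen encoder for clean queries
for p in frozen_reference.parameters():
    p.requires_grad_(False)

clean = torch.randn(8, dim)                             # clean query features
noisy = clean + 0.1 * torch.randn_like(clean)           # typo/noise-corrupted queries

anchors = F.normalize(query_encoder(noisy), dim=-1)
targets = F.normalize(frozen_reference(clean), dim=-1)
logits = anchors @ targets.T / 0.05                     # temperature-scaled similarities
loss = F.cross_entropy(logits, torch.arange(len(logits)))  # align i-th noisy with i-th clean
loss.backward()
print(float(loss))
```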
arXiv Detail & Related papers (2023-04-06T22:16:53Z)
- M-Tuning: Prompt Tuning with Mitigated Label Bias in Open-Set Scenarios [103.6153593636399]
We propose a vision-language prompt tuning method with mitigated label bias (M-Tuning).
It introduces open words from WordNet to extend the words forming the prompt texts beyond the closed set of label words, so that prompts are tuned in a simulated open-set scenario.
Our method achieves the best performance on datasets with various scales, and extensive ablation studies also validate its effectiveness.
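A small sketch of the WordNet-expansion step only, assuming synonyms and hyponym lemmas serve as additional open words; how M-Tuning actually selects, filters, and uses these words in prompt tuning is not shown, and the expand_label_words helper is hypothetical:
```python
# Expand a closed-set label word with related open-vocabulary words from WordNet.
import nltk
from nltk.corpus import wordnet as wn

nltk.download("wordnet", quiet=True)


def expand_label_words(label_word: str, limit: int = 5) -> list:
    related = []
    for synset in wn.synsets(label_word):
        related.extend(lemma.name() for lemma in synset.lemmas())          # synonyms
        related.extend(l.name() for h in synset.hyponyms() for l in h.lemmas())  # hyponyms
    seen, out = set(), []
    for word in related:                                                   # de-duplicate
        if word != label_word and word not in seen:
            seen.add(word)
            out.append(word)
    return out[:limit]


print(expand_label_words("dog"))
```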
arXiv Detail & Related papers (2023-03-09T09:05:47Z)
- HPT: Hierarchy-aware Prompt Tuning for Hierarchical Text Classification [45.314357107687286]
We propose HPT, a Hierarchy-aware Prompt Tuning method to handle HTC from a multi-label perspective.
Specifically, we construct dynamic virtual templates and label words that take the form of soft prompts to fuse label hierarchy knowledge.
Experiments show HPT achieves state-of-the-art performance on 3 popular HTC datasets.
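For orientation, a generic soft-prompt sketch follows: learnable "virtual token" embeddings are prepended to the input embeddings of a frozen backbone. HPT's dynamic virtual templates and hierarchy-injected label words go beyond this, so treat it only as the common starting point, with all sizes chosen arbitrarily.
```python
# Generic soft-prompt tuning sketch: trainable virtual tokens + frozen embeddings.
import torch

vocab_size, hidden, prompt_len = 1000, 64, 8
embedding = torch.nn.Embedding(vocab_size, hidden)        # frozen backbone embedding
embedding.weight.requires_grad_(False)
soft_prompt = torch.nn.Parameter(torch.randn(prompt_len, hidden) * 0.02)  # trainable

input_ids = torch.randint(0, vocab_size, (4, 16))          # toy batch of token ids
tokens = embedding(input_ids)                              # (4, 16, hidden)
prompts = soft_prompt.unsqueeze(0).expand(4, -1, -1)       # (4, prompt_len, hidden)
inputs_embeds = torch.cat([prompts, tokens], dim=1)        # would be fed to a frozen PLM
print(inputs_embeds.shape)                                 # torch.Size([4, 24, 64])
```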
arXiv Detail & Related papers (2022-04-28T11:22:49Z)
- Constrained Sequence-to-Tree Generation for Hierarchical Text Classification [10.143177923523407]
Hierarchical Text Classification (HTC) is a challenging task where a document can be assigned to multiple hierarchically structured categories within a taxonomy.
In this paper, we formulate HTC as a sequence generation task and introduce a sequence-to-tree framework (Seq2Tree) for modeling the hierarchical label structure.
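A hedged sketch of the constrained-decoding idea behind a sequence-to-tree formulation: at each step only the children of the previously emitted label are valid, so every decoded sequence is a legal root-to-leaf path. The scorer below is a stand-in for the trained decoder, and the taxonomy is a toy example.
```python
# Greedy label decoding constrained by a taxonomy (child map), illustrative only.
from typing import Dict, List

children: Dict[str, List[str]] = {
    "ROOT": ["news", "sports"],
    "news": ["politics", "economy"],
    "sports": ["football"],
}


def decode_path(score) -> List[str]:
    path, node = [], "ROOT"
    while node in children:                       # stop once a leaf is reached
        options = children[node]                  # only taxonomy-valid continuations
        node = max(options, key=lambda label: score(path, label))
        path.append(node)
    return path


# Stand-in scorer: prefer shorter label names (purely illustrative).
print(decode_path(lambda path, label: -len(label)))  # ['news', 'economy']
```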
arXiv Detail & Related papers (2022-04-02T08:35:39Z)
- Revisiting Self-Training for Few-Shot Learning of Language Model [61.173976954360334]
Unlabeled data carry rich task-relevant information and have proven useful for few-shot learning of language models.
In this work, we revisit the self-training technique for language model fine-tuning and present a state-of-the-art prompt-based few-shot learner, SFLM.
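For context, a toy version of the underlying self-training loop (train on the few labeled examples, pseudo-label confident unlabeled texts, retrain on the union); SFLM itself is prompt-based and adds weak/strong augmentation consistency, which this sketch does not capture.
```python
# Basic self-training loop with a confidence threshold for pseudo-labels.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression

labeled = ["great movie", "terrible film", "loved it", "awful acting"]
labels = [1, 0, 1, 0]
unlabeled = ["what a wonderful movie", "truly awful film"]

vec = TfidfVectorizer().fit(labeled + unlabeled)
clf = LogisticRegression().fit(vec.transform(labeled), labels)

probs = clf.predict_proba(vec.transform(unlabeled))
for text, p in zip(unlabeled, probs):
    if p.max() >= 0.6:                              # keep only confident pseudo-labels
        labeled.append(text)
        labels.append(int(p.argmax()))

clf = LogisticRegression().fit(vec.transform(labeled), labels)  # retrain on the union
print(len(labeled), labels)
```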
arXiv Detail & Related papers (2021-10-04T08:51:36Z)
- Multitask Learning for Class-Imbalanced Discourse Classification [74.41900374452472]
We show that a multitask approach can improve the Micro F1-score by 7% over current state-of-the-art benchmarks.
We also offer a comparative review of additional techniques proposed to address resource-poor problems in NLP.
arXiv Detail & Related papers (2021-01-02T07:13:41Z)
- Adaptive Self-training for Few-shot Neural Sequence Labeling [55.43109437200101]
We develop techniques to address the label scarcity challenge for neural sequence labeling models.
Self-training serves as an effective mechanism to learn from large amounts of unlabeled data.
Meta-learning helps with adaptive sample re-weighting to mitigate error propagation from noisy pseudo-labels.
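As a rough sketch of the re-weighting idea, the snippet below weights each pseudo-labeled example by the model's own confidence in a weighted loss; the paper instead learns the weights via meta-learning on clean data, which is substantially more involved.
```python
# Confidence-weighted loss over pseudo-labeled examples (toy logits).
import torch
import torch.nn.functional as F

logits = torch.randn(6, 4, requires_grad=True)        # student logits, 6 items / 4 classes
with torch.no_grad():
    probs = logits.softmax(dim=-1)
    pseudo = probs.argmax(dim=-1)                      # hard pseudo-labels
    weights = probs.max(dim=-1).values                 # confidence as a proxy weight

per_example = F.cross_entropy(logits, pseudo, reduction="none")
loss = (weights * per_example).sum() / weights.sum()   # down-weights low-confidence labels
loss.backward()
print(float(loss))
```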
arXiv Detail & Related papers (2020-10-07T22:29:05Z)
This list is automatically generated from the titles and abstracts of the papers on this site.