Retrieval-style In-Context Learning for Few-shot Hierarchical Text Classification
- URL: http://arxiv.org/abs/2406.17534v2
- Date: Sat, 29 Jun 2024 09:24:33 GMT
- Title: Retrieval-style In-Context Learning for Few-shot Hierarchical Text Classification
- Authors: Huiyao Chen, Yu Zhao, Zulong Chen, Mengjia Wang, Liangyue Li, Meishan Zhang, Min Zhang
- Abstract summary: We introduce the first ICL-based framework with large language models (LLMs) for few-shot HTC.
We exploit a retrieval database to identify relevant demonstrations, and an iterative policy to manage multi-layer hierarchical labels.
We can achieve state-of-the-art results in few-shot HTC.
- Score: 34.06292178703825
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Hierarchical text classification (HTC) is an important task with broad applications, and few-shot HTC has gained increasing interest recently. While in-context learning (ICL) with large language models (LLMs) has achieved significant success in few-shot learning, it is not as effective for HTC because of the expansive hierarchical label sets and extremely ambiguous labels. In this work, we introduce the first ICL-based framework with LLMs for few-shot HTC. We exploit a retrieval database to identify relevant demonstrations, and an iterative policy to manage multi-layer hierarchical labels. In particular, we equip the retrieval database with HTC label-aware representations for the input texts, which is achieved by continual training of a pretrained language model with masked language modeling (MLM), layer-wise classification (CLS, specifically for HTC), and a novel divergent contrastive learning (DCL, mainly for adjacent semantically similar labels) objective. Experimental results on three benchmark datasets demonstrate the superior performance of our method, and we achieve state-of-the-art results in few-shot HTC.
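The retrieve-then-iterate recipe the abstract describes can be sketched as follows. This is a minimal illustration, not the paper's implementation: the bag-of-words `embed()` stands in for the label-aware encoder (MLM + layer-wise CLS + DCL), and a majority vote over retrieved demonstrations stands in for the per-level LLM call; all names and the toy corpus are hypothetical.

```python
import numpy as np
from collections import Counter

def embed(texts):
    # Toy bag-of-words vectors over a shared vocabulary; a stand-in for the
    # continually trained, label-aware encoder from the paper.
    vocab = sorted({w for t in texts for w in t.lower().split()})
    vecs = np.zeros((len(texts), len(vocab)))
    for i, t in enumerate(texts):
        for w in t.lower().split():
            vecs[i, vocab.index(w)] += 1.0
    return vecs / np.maximum(np.linalg.norm(vecs, axis=1, keepdims=True), 1e-9)

def retrieve_demos(query, corpus, k=2):
    # Embed the query jointly with the corpus so all vectors share one
    # vocabulary, then return the k most similar labeled examples.
    vecs = embed([query] + [text for text, _ in corpus])
    sims = vecs[1:] @ vecs[0]
    return [corpus[i] for i in np.argsort(-sims)[:k]]

def classify_iteratively(query, corpus, depth):
    # One prediction per hierarchy level; a majority vote over the retrieved
    # demonstrations stands in for prompting an LLM at each level.
    path = []
    for level in range(depth):
        demos = retrieve_demos(query, corpus)
        votes = Counter(labels[level] for _, labels in demos)
        path.append(votes.most_common(1)[0][0])
    return path

corpus = [
    ("the striker scored a goal", ["sports", "soccer"]),
    ("the goalkeeper saved a penalty kick", ["sports", "soccer"]),
    ("parliament passed the budget", ["politics", "fiscal"]),
]
print(classify_iteratively("a late goal won the match", corpus, depth=2))
# → ['sports', 'soccer']
```

The key design point carried over from the abstract is that retrieval quality, not the downstream predictor, does most of the work: if the embeddings separate adjacent, semantically similar labels (the motivation for DCL), the demonstrations handed to the LLM at each level are already nearly decisive.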
Related papers
- Revisiting Hierarchical Text Classification: Inference and Metrics [4.057349748970303]
Hierarchical text classification (HTC) is the task of assigning labels to a text within a structured space organized as a hierarchy.
Recent works treat HTC as a conventional multi-label classification problem and evaluate it as such.
We propose to evaluate models based on specifically designed hierarchical metrics and we demonstrate the intricacy of metric choice and prediction inference method.
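One common family of hierarchy-aware metrics scores a prediction against the ancestor closure of the gold labels, so predicting a sibling leaf still earns partial credit for the shared ancestors. A minimal sketch in that style (set-based hierarchical P/R/F1; the helper names and toy taxonomy are illustrative, not necessarily the metrics this paper proposes):

```python
def hierarchical_f1(pred, gold, parent):
    # Augment each label set with all ancestors, then compare the closures.
    def closure(labels):
        out = set()
        for label in labels:
            while label is not None:
                out.add(label)
                label = parent.get(label)
        return out
    P, G = closure(pred), closure(gold)
    tp = len(P & G)
    prec = tp / len(P) if P else 0.0
    rec = tp / len(G) if G else 0.0
    f1 = 2 * prec * rec / (prec + rec) if prec + rec else 0.0
    return prec, rec, f1

# Toy two-level taxonomy: soccer and baseball both sit under sports.
parent = {"soccer": "sports", "baseball": "sports", "sports": None}
# Predicting the wrong leaf under the right parent gets partial credit:
print(hierarchical_f1({"baseball"}, {"soccer"}, parent))
# → (0.5, 0.5, 0.5)
```

Under flat multi-label evaluation the same prediction would score zero, which is exactly the kind of difference in metric choice the paper highlights.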
arXiv Detail & Related papers (2024-10-02T07:57:33Z) - Domain-Hierarchy Adaptation via Chain of Iterative Reasoning for Few-shot Hierarchical Text Classification [13.320591504692574]
We study the HTC problem under a few-shot setting to adapt knowledge in PLMs from an unstructured manner to the downstream hierarchy.
We use a simple yet effective method named Hierarchical Conditional Iterative Random Field (HierICRF) to search the most domain-challenging directions.
We show that prompting with HierICRF significantly boosts few-shot HTC performance, with average Micro-F1 improvements ranging from 1.50% to 28.80% and Macro-F1 improvements ranging from 1.5% to 36.29%.
arXiv Detail & Related papers (2024-07-12T03:21:57Z) - Text-Video Retrieval with Global-Local Semantic Consistent Learning [122.15339128463715]
We propose a simple yet effective method, Global-Local Semantic Consistent Learning (GLSCL).
GLSCL capitalizes on latent shared semantics across modalities for text-video retrieval.
Our method achieves performance comparable to the state of the art while being nearly 220 times faster in computational cost.
arXiv Detail & Related papers (2024-05-21T11:59:36Z) - Enhancing Visual Continual Learning with Language-Guided Supervision [76.38481740848434]
Continual learning aims to empower models to learn new tasks without forgetting previously acquired knowledge.
We argue that the scarce semantic information conveyed by the one-hot labels hampers the effective knowledge transfer across tasks.
Specifically, we use PLMs to generate semantic targets for each class, which are frozen and serve as supervision signals.
arXiv Detail & Related papers (2024-03-24T12:41:58Z) - RAR: Retrieving And Ranking Augmented MLLMs for Visual Recognition [78.97487780589574]
Multimodal Large Language Models (MLLMs) excel at classifying fine-grained categories.
This paper introduces a Retrieving And Ranking augmented method for MLLMs.
Our proposed approach not only addresses the inherent limitations in fine-grained recognition but also preserves the model's comprehensive knowledge base.
arXiv Detail & Related papers (2024-03-20T17:59:55Z) - HiGen: Hierarchy-Aware Sequence Generation for Hierarchical Text Classification [19.12354692458442]
Hierarchical text classification (HTC) is a complex subtask under multi-label text classification.
We propose HiGen, a text-generation-based framework utilizing language models to encode dynamic text representations.
arXiv Detail & Related papers (2024-01-24T04:44:42Z) - Hierarchical Verbalizer for Few-Shot Hierarchical Text Classification [10.578682558356473]
Hierarchical text classification (HTC) suffers from poor performance in low-resource or few-shot settings.
In this work, we propose the hierarchical verbalizer ("HierVerb"), a multi-verbalizer framework treating HTC as a single- or multi-label classification problem.
In this manner, HierVerb fuses label hierarchy knowledge into verbalizers and substantially outperforms methods that inject hierarchy through graph encoders.
arXiv Detail & Related papers (2023-05-26T12:41:49Z) - SLUE Phase-2: A Benchmark Suite of Diverse Spoken Language Understanding Tasks [88.4408774253634]
Spoken language understanding (SLU) tasks have been studied for many decades in the speech research community.
There are not nearly as many SLU task benchmarks, and many of the existing ones use data that is not freely available to all researchers.
Recent work has begun to introduce such benchmarks for several tasks.
arXiv Detail & Related papers (2022-12-20T18:39:59Z) - HPT: Hierarchy-aware Prompt Tuning for Hierarchical Text Classification [45.314357107687286]
We propose HPT, a Hierarchy-aware Prompt Tuning method to handle HTC from a multi-label perspective.
Specifically, we construct a dynamic virtual template and label words that take the form of soft prompts to fuse the label hierarchy knowledge.
Experiments show HPT achieves state-of-the-art performance on 3 popular HTC datasets.
arXiv Detail & Related papers (2022-04-28T11:22:49Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences of its use.