GUDN: A novel guide network for extreme multi-label text classification
- URL: http://arxiv.org/abs/2201.11582v1
- Date: Mon, 10 Jan 2022 07:33:36 GMT
- Title: GUDN: A novel guide network for extreme multi-label text classification
- Authors: Qing Wang, Hongji Shu, Jia Zhu
- Abstract summary: This paper constructs a novel guide network (GUDN) that helps fine-tune the pre-trained model to guide the subsequent classification.
We also use the raw label semantics to effectively explore the latent space between texts and labels, which can further improve prediction accuracy.
- Score: 12.975260278131078
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The problem of extreme multi-label text classification (XMTC) is to recall
the most relevant labels for a text from an extremely large label set. Although
methods based on deep pre-trained models have achieved significant results, the
pre-trained models are still not fully utilized. Label semantics has not
attracted much attention so far, and the latent space between texts and labels
has not been effectively explored. This paper constructs a novel guide network
(GUDN) that helps fine-tune the pre-trained model to guide the subsequent
classification. We also use the raw label semantics to effectively explore the
latent space between texts and labels, which can further improve prediction
accuracy. Experimental results demonstrate that GUDN outperforms
state-of-the-art methods on several popular datasets. Our source code is
released at https://github.com/wq2581/GUDN.
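As an illustration of the label-semantics idea, here is a minimal sketch that scores texts against the raw text of each label using a shared pre-trained encoder; the encoder choice (bert-base-uncased), the [CLS] pooling, and the cosine-similarity scoring are illustrative assumptions, not GUDN's actual architecture.

```python
# Minimal sketch (assumed setup, not the paper's exact GUDN architecture):
# encode texts and raw label names with one shared pre-trained encoder and
# score each text against every label by cosine similarity.
import torch
import torch.nn.functional as F
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
encoder = AutoModel.from_pretrained("bert-base-uncased")

def encode(texts):
    batch = tokenizer(texts, padding=True, truncation=True, return_tensors="pt")
    hidden = encoder(**batch).last_hidden_state   # (batch, seq_len, dim)
    return hidden[:, 0]                           # [CLS] vector per sequence

label_names = ["machine learning", "genomics", "databases"]  # toy label set
with torch.no_grad():
    label_emb = F.normalize(encode(label_names), dim=-1)

def label_scores(texts):
    text_emb = F.normalize(encode(texts), dim=-1)
    return text_emb @ label_emb.T                 # one similarity per label

with torch.no_grad():
    scores = label_scores(["a deep model for DNA sequence analysis"])
top2 = scores.topk(k=2, dim=-1).indices           # recall most relevant labels
```

In a real XMTC setting the label set has hundreds of thousands of entries, so label embeddings would be precomputed once and the top-k search done with an approximate nearest-neighbor index.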
Related papers
- Description-Enhanced Label Embedding Contrastive Learning for Text Classification [65.01077813330559]
This work introduces Self-Supervised Learning (SSL) into the model learning process and designs a novel self-supervised Relation of Relation (R2) classification task.
It proposes a Relation of Relation Learning Network (R2-Net) for text classification, in which text classification and R2 classification are treated as joint optimization targets.
It also exploits external knowledge from WordNet to obtain multi-aspect descriptions for label semantic learning.
arXiv Detail & Related papers (2023-06-15T02:19:34Z)
- Rank-Aware Negative Training for Semi-Supervised Text Classification [3.105629960108712]
Semi-supervised text classification (SSTC) paradigms typically employ the spirit of self-training.
This paper presents a Rank-aware Negative Training (RNT) framework that addresses SSTC in a learning-with-noisy-labels manner.
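A minimal sketch of the generic negative-training loss used in learning-with-noisy-labels setups: instead of pushing the probability of a possibly wrong label up, it pushes the probability of a complementary label (one the example is believed not to have) down. RNT's rank-aware component is omitted; the function name and the numerical stabilizer are illustrative assumptions.

```python
import torch

def negative_training_loss(logits, complementary_labels):
    # logits: (batch, num_classes)
    # complementary_labels: (batch,), a class each example is believed NOT to have
    probs = torch.softmax(logits, dim=-1)
    p_comp = probs.gather(1, complementary_labels.unsqueeze(1)).squeeze(1)
    return -torch.log(1.0 - p_comp + 1e-8).mean()  # -log(1 - p_complementary)
```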
arXiv Detail & Related papers (2023-06-13T08:41:36Z)
- Exploring Structured Semantic Prior for Multi Label Recognition with Incomplete Labels [60.675714333081466]
Multi-label recognition (MLR) with incomplete labels is very challenging.
Recent works strive to explore the image-to-label correspondence in a vision-language model, i.e., CLIP, to compensate for insufficient annotations.
We advocate remedying the deficiency of label supervision for MLR with incomplete labels by deriving a structured semantic prior.
arXiv Detail & Related papers (2023-03-23T12:39:20Z)
- Label Semantic Aware Pre-training for Few-shot Text Classification [53.80908620663974]
We propose Label Semantic Aware Pre-training (LSAP) to improve the generalization and data efficiency of text classification systems.
LSAP incorporates label semantics into pre-trained generative models (T5 in our case) by performing secondary pre-training on labeled sentences from a variety of domains.
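A minimal sketch of what such secondary pre-training data could look like, pairing each sentence with its label name as the generation target so the generative model learns label semantics directly; the prefix string and the example pairs are illustrative assumptions, not the paper's exact format.

```python
# Assumed data format for label-semantic secondary pre-training of a
# text-to-text model such as T5: the target is the label's *name*, not an id.
examples = [
    ("book a table for two at seven", "restaurant reservation"),
    ("will it rain tomorrow morning", "weather query"),
]

def to_t5_pair(sentence, label_name):
    return {"input": f"classify: {sentence}", "target": label_name}

pairs = [to_t5_pair(s, l) for s, l in examples]
```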
arXiv Detail & Related papers (2022-04-14T17:33:34Z)
- Trustable Co-label Learning from Multiple Noisy Annotators [68.59187658490804]
Supervised deep learning depends on massive accurately annotated examples.
A typical alternative is learning from multiple noisy annotators.
This paper proposes a data-efficient approach called Trustable Co-label Learning (TCL).
arXiv Detail & Related papers (2022-03-08T16:57:00Z)
- Weakly-supervised Text Classification Based on Keyword Graph [30.57722085686241]
We propose a novel framework called ClassKG that explores keyword-keyword correlations on a keyword graph with a GNN.
Our framework is an iterative process. In each iteration, we first construct a keyword graph, so the task of assigning pseudo labels is transformed into annotating keyword subgraphs.
With the pseudo labels generated by the subgraph annotator, we then train a text classifier to classify the unlabeled texts.
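A minimal sketch of the keyword-graph construction step, assuming whitespace tokenization and co-occurrence counts as edge weights; the GNN subgraph annotator that assigns the pseudo labels is omitted.

```python
# Build a keyword co-occurrence graph from unlabeled texts (assumed
# preprocessing; ClassKG's GNN subgraph annotator is not shown).
from collections import defaultdict
from itertools import combinations

def build_keyword_graph(texts, keywords):
    kw = set(keywords)
    edges = defaultdict(int)           # (keyword_a, keyword_b) -> weight
    for text in texts:
        present = sorted(kw.intersection(text.lower().split()))
        for a, b in combinations(present, 2):
            edges[(a, b)] += 1         # count co-occurrences within a text
    return edges
```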
arXiv Detail & Related papers (2021-10-06T08:58:02Z)
- Label Confusion Learning to Enhance Text Classification Models [3.0251266104313643]
The Label Confusion Model (LCM) learns label confusion to capture semantic overlap among labels.
LCM can generate a better label distribution to replace the original one-hot label vector.
Experiments on five text classification benchmark datasets demonstrate the effectiveness of LCM for several widely used deep learning classification models.
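A minimal sketch of that idea, assuming the label distribution is built by mixing the one-hot target with text-to-label-embedding similarities and training against it with KL divergence; the mixing weight and the exact network are illustrative assumptions.

```python
import torch
import torch.nn.functional as F

def simulated_label_distribution(text_repr, label_embs, gold, alpha=4.0):
    # text_repr: (batch, dim); label_embs: (num_labels, dim); gold: (batch,)
    confusion = torch.softmax(text_repr @ label_embs.T, dim=-1)
    one_hot = F.one_hot(gold, num_classes=label_embs.size(0)).float()
    mixed = alpha * one_hot + confusion            # keep the gold label dominant
    return mixed / mixed.sum(dim=-1, keepdim=True)

def lcm_loss(pred_logits, target_dist):
    log_pred = F.log_softmax(pred_logits, dim=-1)
    return F.kl_div(log_pred, target_dist, reduction="batchmean")
```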
arXiv Detail & Related papers (2020-12-09T11:34:35Z)
- PseudoSeg: Designing Pseudo Labels for Semantic Segmentation [78.35515004654553]
We present a re-design of pseudo-labeling to generate structured pseudo labels for training with unlabeled or weakly-labeled data.
We demonstrate the effectiveness of the proposed pseudo-labeling strategy in both low-data and high-data regimes.
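A minimal sketch of the underlying pseudo-labeling step, keeping only confidently predicted pixels as training targets; PseudoSeg's structured re-design (e.g. how it fuses multiple predictions) is not reproduced here, and the threshold is an illustrative assumption.

```python
import torch

def confident_pseudo_labels(logits, threshold=0.9):
    # logits: (batch, num_classes, H, W) from a segmentation model
    probs = torch.softmax(logits, dim=1)
    conf, labels = probs.max(dim=1)        # per-pixel confidence and class
    labels[conf < threshold] = -1          # ignore low-confidence pixels in the loss
    return labels
```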
arXiv Detail & Related papers (2020-10-19T17:59:30Z)
- MixText: Linguistically-Informed Interpolation of Hidden Space for Semi-Supervised Text Classification [68.15015032551214]
MixText is a semi-supervised learning method for text classification.
TMix creates a large number of augmented training samples by interpolating texts in hidden space.
We leverage recent advances in data augmentation to guess low-entropy labels for unlabeled data.
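A minimal sketch of hidden-space interpolation, assuming hidden states taken from some intermediate encoder layer and soft label vectors; the Beta parameter is an illustrative choice.

```python
import torch

def tmix(hidden_a, hidden_b, labels_a, labels_b, alpha=0.75):
    # hidden_*: (batch, seq_len, dim) states at a chosen encoder layer
    # labels_*: (batch, num_classes) soft label vectors
    lam = torch.distributions.Beta(alpha, alpha).sample().item()
    mixed_hidden = lam * hidden_a + (1 - lam) * hidden_b
    mixed_labels = lam * labels_a + (1 - lam) * labels_b
    return mixed_hidden, mixed_labels  # pass mixed_hidden through remaining layers
```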
arXiv Detail & Related papers (2020-04-25T21:37:36Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information it provides and is not responsible for any consequences arising from its use.