Metadata-Induced Contrastive Learning for Zero-Shot Multi-Label Text
Classification
- URL: http://arxiv.org/abs/2202.05932v1
- Date: Fri, 11 Feb 2022 23:22:17 GMT
- Title: Metadata-Induced Contrastive Learning for Zero-Shot Multi-Label Text
Classification
- Authors: Yu Zhang, Zhihong Shen, Chieh-Han Wu, Boya Xie, Junheng Hao, Ye-Yi
Wang, Kuansan Wang, Jiawei Han
- Abstract summary: We propose a novel metadata-induced contrastive learning (MICoL) method for large-scale multi-label text classification.
MICoL exploits document metadata, which are widely available on the Web, to derive similar document-document pairs.
We show that MICoL significantly outperforms strong zero-shot text classification and contrastive learning baselines.
- Score: 27.33039900612395
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Large-scale multi-label text classification (LMTC) aims to associate a
document with its relevant labels from a large candidate set. Most existing
LMTC approaches rely on massive human-annotated training data, which are often
costly to obtain and suffer from a long-tailed label distribution (i.e., many
labels occur only a few times in the training set). In this paper, we study
LMTC under the zero-shot setting, which does not require any annotated
documents with labels and only relies on label surface names and descriptions.
To train a classifier that calculates the similarity score between a document
and a label, we propose a novel metadata-induced contrastive learning (MICoL)
method. Different from previous text-based contrastive learning techniques,
MICoL exploits document metadata (e.g., authors, venues, and references of
research papers), which are widely available on the Web, to derive similar
document-document pairs. Experimental results on two large-scale datasets show
that: (1) MICoL significantly outperforms strong zero-shot text classification
and contrastive learning baselines; (2) MICoL is on par with the
state-of-the-art supervised metadata-aware LMTC method trained on 10K-200K
labeled documents; and (3) MICoL tends to predict more infrequent labels than
supervised methods, thus alleviating the degraded performance on long-tailed
labels.
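As a rough illustration of the metadata-induced contrastive idea, the sketch below mines positive document pairs from shared metadata fields (a venue, an author, a shared reference) and applies an InfoNCE-style loss with in-batch negatives. The encoder output, pair-mining rule, and temperature are illustrative assumptions, not the paper's exact design.

```python
# Minimal sketch: documents sharing metadata become positive pairs; other
# in-batch documents serve as negatives. Not the authors' implementation.
import torch
import torch.nn.functional as F

def mine_positive_pairs(metadata):
    """Pair up documents that share at least one metadata value.

    `metadata` is a list of sets, one per document, e.g. {"venue:KDD", "ref:42"}.
    Returns (anchor_idx, positive_idx) index lists.
    """
    anchors, positives = [], []
    for i in range(len(metadata)):
        for j in range(i + 1, len(metadata)):
            if metadata[i] & metadata[j]:      # shared venue/author/reference
                anchors.append(i)
                positives.append(j)
    return anchors, positives

def info_nce(doc_emb, anchors, positives, temperature=0.07):
    """InfoNCE over metadata-induced pairs; in-batch docs act as negatives."""
    z = F.normalize(doc_emb, dim=-1)
    logits = z[anchors] @ z.T / temperature            # (num_pairs, num_docs)
    self_mask = F.one_hot(torch.tensor(anchors), z.size(0)).bool()
    logits = logits.masked_fill(self_mask, float("-inf"))  # drop self-similarity
    return F.cross_entropy(logits, torch.tensor(positives))

# Toy usage: docs 0 & 2 share a venue, docs 1 & 3 share a reference.
meta = [{"venue:KDD"}, {"ref:42"}, {"venue:KDD"}, {"ref:42"}]
emb = torch.randn(4, 128, requires_grad=True)          # stand-in encoder output
loss = info_nce(emb, *mine_positive_pairs(meta))
loss.backward()
```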
Related papers
- Open-world Multi-label Text Classification with Extremely Weak Supervision [30.85235057480158] (arXiv, 2024-07-08)
We study open-world multi-label text classification under extremely weak supervision (XWS).
We first utilize the user description to prompt a large language model (LLM) for dominant keyphrases of a subset of raw documents, and then construct a label space via clustering.
We then apply a zero-shot multi-label classifier to locate the documents with small top predicted scores, so we can revisit their dominant keyphrases for more long-tail labels.
X-MLClass exhibits a remarkable increase in ground-truth label space coverage on various datasets.
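A sketch of the label-space construction step described above: keyphrases (assumed already extracted from documents via an LLM prompt) are embedded and clustered, and each cluster becomes one discovered label. The embeddings below are random stand-ins.

```python
# Illustrative sketch, not the paper's pipeline: cluster keyphrase embeddings
# to form a label space. Embeddings are random placeholders for a real encoder.
import numpy as np
from sklearn.cluster import KMeans

keyphrases = ["graph neural network", "message passing", "image segmentation",
              "semantic segmentation", "speech recognition", "acoustic model"]
embeddings = np.random.default_rng(0).normal(size=(len(keyphrases), 64))

kmeans = KMeans(n_clusters=3, n_init=10, random_state=0).fit(embeddings)
label_space = {}
for phrase, cluster_id in zip(keyphrases, kmeans.labels_):
    label_space.setdefault(cluster_id, []).append(phrase)
print(label_space)   # each cluster of keyphrases acts as one candidate label
```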
- Zero-Shot Learning Over Large Output Spaces: Utilizing Indirect Knowledge Extraction from Large Language Models [3.908992369351976] (arXiv, 2024-06-13)
Extreme Zero-shot XMC (EZ-XMC) is a special setting of extreme multi-label classification (XMC) wherein no supervision is provided.
Traditional state-of-the-art methods extract pseudo labels from the document title or segments.
We propose a framework to train a small bi-encoder model via feedback from a large language model (LLM).
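A minimal bi-encoder sketch for this setup: two small encoders map documents and labels into a shared space, and (document, label) pairs assumed to have been judged relevant by an LLM supply the training signal. The bag-of-words encoders and in-batch-negative loss are illustrative choices, not the paper's exact design.

```python
# Illustrative bi-encoder trained on LLM-approved (document, label) pairs.
import torch
import torch.nn as nn
import torch.nn.functional as F

class BiEncoder(nn.Module):
    def __init__(self, vocab=1000, dim=64):
        super().__init__()
        self.doc_enc = nn.EmbeddingBag(vocab, dim)    # bag-of-words doc encoder
        self.label_enc = nn.EmbeddingBag(vocab, dim)  # bag-of-words label encoder

    def score(self, doc_ids, label_ids):
        d = F.normalize(self.doc_enc(doc_ids), dim=-1)
        lab = F.normalize(self.label_enc(label_ids), dim=-1)
        return d @ lab.T                              # cosine similarity matrix

model = BiEncoder()
docs = torch.randint(0, 1000, (4, 20))    # 4 documents, 20 tokens each
labels = torch.randint(0, 1000, (4, 5))   # the LLM-approved label per document
opt = torch.optim.Adam(model.parameters(), lr=1e-3)

scores = model.score(docs, labels) / 0.05             # temperature-scaled
loss = F.cross_entropy(scores, torch.arange(4))       # match doc i to label i
loss.backward()
opt.step()
```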
- Multi-Instance Partial-Label Learning: Towards Exploiting Dual Inexact Supervision [53.530957567507365] (arXiv, 2022-12-18)
In some real-world tasks, each training sample is associated with a candidate label set that contains one ground-truth label and some false positive labels.
In this paper, we formalize such problems as multi-instance partial-label learning (MIPL).
Existing multi-instance learning algorithms and partial-label learning algorithms are suboptimal for solving MIPL problems.
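To make the dual inexact supervision concrete: each training sample is a bag of instances (multi-instance) paired with a candidate label set that hides one true label (partial-label). The max-pooling bag aggregation and candidate-mass loss below are common building blocks used only for illustration, not the paper's algorithm.

```python
# Illustrative MIPL-style loss: aggregate instance scores into a bag score,
# then push probability mass toward the candidate label set.
import torch
import torch.nn.functional as F

def mipl_loss(bag_logits_list, candidate_sets):
    losses = []
    for logits, candidates in zip(bag_logits_list, candidate_sets):
        bag_logits = logits.max(dim=0).values          # MIL: bag = best instance
        log_probs = F.log_softmax(bag_logits, dim=-1)
        # The unknown true label lies somewhere in the candidate set, so
        # maximize the total probability the bag assigns to its candidates.
        losses.append(-torch.logsumexp(log_probs[candidates], dim=0))
    return torch.stack(losses).mean()

# Two bags (3 and 5 instances), 5 classes, one true label hidden per set.
bags = [torch.randn(3, 5, requires_grad=True), torch.randn(5, 5, requires_grad=True)]
cands = [torch.tensor([0, 2]), torch.tensor([1, 3, 4])]
loss = mipl_loss(bags, cands)
loss.backward()
```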
- Pseudo-Labeling Based Practical Semi-Supervised Meta-Training for Few-Shot Learning [93.63638405586354] (arXiv, 2022-07-14)
We propose a simple and effective meta-training framework, called pseudo-labeling based meta-learning (PLML).
First, we train a classifier via common semi-supervised learning (SSL) and use it to obtain pseudo-labels for unlabeled data.
We build few-shot tasks from labeled and pseudo-labeled data and design a novel finetuning method with feature smoothing and noise suppression.
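A sketch of the pseudo-labeling step: a classifier trained on the labeled pool labels the unlabeled pool, and only confident predictions are kept to enlarge the data from which few-shot tasks are built. The classifier, data, and confidence threshold are stand-ins.

```python
# Illustrative pseudo-labeling loop with a confidence filter (assumed 0.5).
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
X_lab, y_lab = rng.normal(size=(40, 16)), rng.integers(0, 4, size=40)
X_unlab = rng.normal(size=(200, 16))

clf = LogisticRegression(max_iter=1000).fit(X_lab, y_lab)
pseudo = clf.predict(X_unlab)                      # pseudo-labels
conf = clf.predict_proba(X_unlab).max(axis=1)      # prediction confidence

keep = conf > 0.5                                  # confidence filter (assumed)
X_aug = np.vstack([X_lab, X_unlab[keep]])          # labeled + pseudo-labeled pool
y_aug = np.concatenate([y_lab, pseudo[keep]])
print(f"kept {keep.sum()} pseudo-labeled samples for building few-shot tasks")
```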
- Many-Class Text Classification with Matching [65.74328417321738] (arXiv, 2022-05-23)
We formulate Text Classification as a Matching problem between the text and the labels, and propose a simple yet effective framework named TCM.
Compared with previous text classification approaches, TCM takes advantage of the fine-grained semantic information of the classification labels.
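The matching formulation can be illustrated as follows: instead of a classifier head with one opaque weight vector per class, the label's own text is encoded and the prediction is a text-to-label-text similarity. The encoders are random stand-ins here.

```python
# Illustrative text-label matching: score a document against encoded labels.
import torch
import torch.nn.functional as F

def match_scores(text_emb, label_embs, temperature=0.1):
    """Cosine matching between one text and every label description."""
    t = F.normalize(text_emb, dim=-1)
    L = F.normalize(label_embs, dim=-1)
    return (L @ t) / temperature

text_emb = torch.randn(64)                 # encoded document (stand-in)
label_embs = torch.randn(100, 64)          # encoded label names/descriptions
probs = match_scores(text_emb, label_embs).softmax(dim=-1)
print(probs.topk(5).indices)               # top-5 matched labels
```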
- Trustable Co-label Learning from Multiple Noisy Annotators [68.59187658490804] (arXiv, 2022-03-08)
Supervised deep learning depends on massive accurately annotated examples.
A typical alternative is learning from multiple noisy annotators.
This paper proposes a data-efficient approach, called Trustable Co-label Learning (TCL).
- MATCH: Metadata-Aware Text Classification in A Large Hierarchy [60.59183151617578] (arXiv, 2021-02-15)
MATCH is an end-to-end framework that leverages both metadata and hierarchy information.
We propose different ways to regularize the parameters and output probability of each child label by its parents.
Experiments on two massive text datasets with large-scale label hierarchies demonstrate the effectiveness of MATCH.
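One plausible reading of the parent-child regularization, sketched below: a child label's weight vector is pulled toward its parents' mean weights, and a hinge penalty discourages a child's predicted probability from exceeding its parent's. Both terms are illustrative assumptions, not MATCH's exact regularizers.

```python
# Illustrative hierarchy regularizer: parameter smoothing + probability hinge.
import torch

num_labels, dim = 6, 32
W = torch.randn(num_labels, dim, requires_grad=True)   # one weight row per label
parent = {3: [0], 4: [0, 1], 5: [2]}                   # child -> parent label ids

def hierarchy_reg(W, probs, parent, lam=0.1):
    param_term, prob_term = 0.0, 0.0
    for child, parents in parent.items():
        parent_mean = W[parents].mean(dim=0)
        param_term += ((W[child] - parent_mean) ** 2).sum()
        for p in parents:                   # P(child) should not exceed P(parent)
            prob_term += torch.clamp(probs[child] - probs[p], min=0.0)
    return lam * (param_term + prob_term)

doc = torch.randn(dim)
probs = torch.sigmoid(W @ doc)                          # per-label probabilities
reg = hierarchy_reg(W, probs, parent)
reg.backward()
```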
- Multi-label Few/Zero-shot Learning with Knowledge Aggregated from Multiple Label Graphs [8.44680447457879] (arXiv, 2020-10-15)
We present a simple multi-graph aggregation model that fuses knowledge from multiple label graphs encoding different semantic label relationships.
We show that methods equipped with the multi-graph knowledge aggregation achieve significant performance improvement across almost all the measures on few/zero-shot labels.
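A sketch of the multi-graph aggregation idea: label embeddings are propagated over several label graphs encoding different semantics (e.g., a hierarchy and a co-occurrence graph), and the per-graph views are fused by a learned weighted sum. The one-step normalized propagation and softmax fusion weights are illustrative assumptions.

```python
# Illustrative fusion of label embeddings from multiple label graphs.
import torch
import torch.nn.functional as F

num_labels, dim = 8, 32
X = torch.randn(num_labels, dim)                       # initial label features
graphs = [torch.rand(num_labels, num_labels).round()   # two label graphs with
          for _ in range(2)]                           # different edge semantics
fusion_logits = torch.zeros(len(graphs), requires_grad=True)

views = []
for A in graphs:
    A_hat = A + torch.eye(num_labels)                  # add self-loops
    deg = A_hat.sum(dim=1, keepdim=True)
    views.append((A_hat / deg) @ X)                    # one propagation step

w = F.softmax(fusion_logits, dim=0)                    # learned per-graph weights
fused = sum(wi * v for wi, v in zip(w, views))         # aggregated embeddings
print(fused.shape)                                     # torch.Size([8, 32])
```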
- An Empirical Study on Large-Scale Multi-Label Text Classification Including Few and Zero-Shot Labels [49.036212158261215] (arXiv, 2020-10-04)
Large-scale Multi-label Text Classification (LMTC) has a wide range of Natural Language Processing (NLP) applications.
Current state-of-the-art LMTC models employ Label-Wise Attention Networks (LWANs).
We show that hierarchical methods based on Probabilistic Label Trees (PLTs) outperform LWANs.
We propose a new state-of-the-art method which combines BERT with LWANs.
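To make the LWAN component concrete: every label owns an attention query over the token representations, so each label attends to its own view of the document before scoring. The sketch below uses random tensors as a stand-in for BERT token outputs; the head design is the generic LWAN pattern, not the paper's exact model.

```python
# Illustrative Label-Wise Attention Network (LWAN) head.
import torch
import torch.nn as nn

class LabelWiseAttention(nn.Module):
    def __init__(self, num_labels, dim):
        super().__init__()
        self.queries = nn.Parameter(torch.randn(num_labels, dim))  # one query per label
        self.scorers = nn.Parameter(torch.randn(num_labels, dim))  # one scorer per label

    def forward(self, token_states):                   # (seq_len, dim)
        attn = torch.softmax(self.queries @ token_states.T, dim=-1)  # (L, seq)
        label_views = attn @ token_states              # label-specific doc vectors
        return (label_views * self.scorers).sum(-1)    # one logit per label

tokens = torch.randn(128, 768)                         # stand-in for BERT outputs
head = LabelWiseAttention(num_labels=5000, dim=768)
logits = head(tokens)
print(logits.shape)                                    # torch.Size([5000])
```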
- Minimally Supervised Categorization of Text with Metadata [40.13841133991089] (arXiv, 2020-05-01)
We propose MetaCat, a minimally supervised framework to categorize text with metadata.
We develop a generative process describing the relationships between words, documents, labels, and metadata.
Based on the same generative process, we synthesize training samples to address the bottleneck of label scarcity.
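A toy sketch of the generative idea: each label defines distributions over words and over metadata values, and synthetic labeled examples are sampled from them to relieve label scarcity. All distributions below are invented stand-ins, not MetaCat's actual model.

```python
# Toy label -> (words, metadata) generative process for synthesizing samples.
import numpy as np

rng = np.random.default_rng(0)
vocab = ["graph", "neural", "protein", "gene", "market", "trading"]
venues = ["KDD", "ISMB", "WWW"]
labels = {
    "ml":  {"word_p": [.4, .4, .05, .05, .05, .05], "venue_p": [.7, .1, .2]},
    "bio": {"word_p": [.05, .05, .4, .4, .05, .05], "venue_p": [.1, .8, .1]},
}

def synthesize(label, doc_len=10):
    """Sample one synthetic (words, venue, label) training example."""
    spec = labels[label]
    words = rng.choice(vocab, size=doc_len, p=spec["word_p"])
    venue = rng.choice(venues, p=spec["venue_p"])
    return list(words), venue, label

print(synthesize("bio"))
```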
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.