Open Vocabulary Extreme Classification Using Generative Models
- URL: http://arxiv.org/abs/2205.05812v1
- Date: Thu, 12 May 2022 00:33:49 GMT
- Title: Open Vocabulary Extreme Classification Using Generative Models
- Authors: Daniel Simig, Fabio Petroni, Pouya Yanki, Kashyap Popat, Christina Du,
Sebastian Riedel, Majid Yazdani
- Abstract summary: The extreme multi-label classification (XMC) task aims at tagging content with a subset of labels from an extremely large label set.
We propose GROOV, a fine-tuned seq2seq model for OXMC that generates the set of labels as a flat sequence and is trained using a novel loss independent of predicted label order.
We show the efficacy of the approach, experimenting with popular XMC datasets for which GROOV is able to predict meaningful labels outside the given vocabulary while performing on par with state-of-the-art solutions for known labels.
- Score: 24.17018785195843
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The extreme multi-label classification (XMC) task aims at tagging content
with a subset of labels from an extremely large label set. The label vocabulary
is typically defined in advance by domain experts and assumed to capture all
necessary tags. However in real world scenarios this label set, although large,
is often incomplete and experts frequently need to refine it. To develop
systems that simplify this process, we introduce the task of open vocabulary
XMC (OXMC): given a piece of content, predict a set of labels, some of which
may be outside of the known tag set. Hence, in addition to not having training
data for some labels - as is the case in zero-shot classification - models need
to invent some labels on-the-fly. We propose GROOV, a fine-tuned seq2seq model
for OXMC that generates the set of labels as a flat sequence and is trained
using a novel loss independent of predicted label order. We show the efficacy
of the approach, experimenting with popular XMC datasets for which GROOV is
able to predict meaningful labels outside the given vocabulary while performing
on par with state-of-the-art solutions for known labels.
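The key idea in the abstract is a loss that does not penalize the model for generating the correct label set in a different order. A minimal sketch of one way such an order-invariant objective can be realized is to take the minimum sequence loss over all orderings of the gold labels; the scorer below is a hypothetical stand-in for a real seq2seq model's per-position log-probabilities, and this is not necessarily the exact loss used by GROOV.

```python
import itertools
import math

def sequence_nll(label_seq, token_logprob):
    """Negative log-likelihood of generating the labels in this
    particular order, given a per-position log-probability function."""
    return -sum(token_logprob(pos, label) for pos, label in enumerate(label_seq))

def order_invariant_loss(labels, token_logprob):
    """Minimum NLL over all orderings of the gold label set, so the
    model is rewarded for producing a valid set regardless of order.
    Enumerating permutations is only feasible for small label sets."""
    return min(sequence_nll(p, token_logprob)
               for p in itertools.permutations(labels))

# Toy scorer: the (hypothetical) model prefers "cats" first, "pets" second.
probs = {(0, "cats"): 0.9, (1, "pets"): 0.8,
         (0, "pets"): 0.2, (1, "cats"): 0.1}
scorer = lambda pos, label: math.log(probs[(pos, label)])

loss = order_invariant_loss(["pets", "cats"], scorer)
```

Here the gold set is given in the "wrong" order, yet the loss equals that of the model's preferred ordering, which a plain left-to-right cross-entropy would not achieve.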
Related papers
- Open-world Multi-label Text Classification with Extremely Weak Supervision [30.85235057480158]
We study open-world multi-label text classification under extremely weak supervision (XWS).
We first utilize the user description to prompt a large language model (LLM) for dominant keyphrases of a subset of raw documents, and then construct a label space via clustering.
We then apply a zero-shot multi-label classifier to locate the documents with small top predicted scores, so we can revisit their dominant keyphrases for more long-tail labels.
X-MLClass exhibits a remarkable increase in ground-truth label space coverage on various datasets.
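The pipeline summarized above builds a label space by clustering LLM-extracted keyphrases. The following toy sketch illustrates that step with a deliberately simple heuristic, merging keyphrases that share a token via union-find; the document names and keyphrases are invented, and a real system like X-MLClass would cluster embedding vectors instead.

```python
from collections import defaultdict

# Hypothetical keyphrases an LLM might extract from a few raw documents.
doc_keyphrases = {
    "doc1": ["machine learning", "neural networks"],
    "doc2": ["deep neural networks", "machine learning"],
    "doc3": ["protein folding", "structural biology"],
}

def cluster_keyphrases(doc_keyphrases):
    """Toy stand-in for the clustering step: merge keyphrases that share
    a token into one candidate label cluster, using union-find."""
    parent = {}

    def find(x):
        while parent.setdefault(x, x) != x:
            parent[x] = parent[parent[x]]  # path compression
            x = parent[x]
        return x

    def union(a, b):
        parent[find(a)] = find(b)

    token_owner = {}
    for phrases in doc_keyphrases.values():
        for phrase in phrases:
            for tok in phrase.split():
                if tok in token_owner:
                    union(phrase, token_owner[tok])
                else:
                    token_owner[tok] = phrase

    clusters = defaultdict(set)
    for phrases in doc_keyphrases.values():
        for phrase in phrases:
            clusters[find(phrase)].add(phrase)
    return list(clusters.values())

label_space = cluster_keyphrases(doc_keyphrases)
```

On the toy input this yields four clusters, with "neural networks" and "deep neural networks" merged into a single candidate label.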
arXiv Detail & Related papers (2024-07-08T04:52:49Z)
- Zero-Shot Learning Over Large Output Spaces: Utilizing Indirect Knowledge Extraction from Large Language Models [3.908992369351976]
Extreme Zero-shot XMC (EZ-XMC) is a special setting of XMC wherein no supervision is provided.
Traditional state-of-the-art methods extract pseudo labels from the document title or segments.
We propose a framework to train a small bi-encoder model via feedback from a large language model (LLM).
arXiv Detail & Related papers (2024-06-13T16:26:37Z)
- Learning label-label correlations in Extreme Multi-label Classification via Label Features [44.00852282861121]
Extreme Multi-label Text Classification (XMC) involves learning a classifier that can assign an input with a subset of most relevant labels from millions of label choices.
Short-text XMC with label features has found numerous applications in areas such as query-to-ad-phrase matching in search ads, title-based product recommendation, and prediction of related searches.
We propose Gandalf, a novel approach which makes use of a label co-occurrence graph to leverage label features as additional data points to supplement the training distribution.
arXiv Detail & Related papers (2024-05-03T21:18:43Z)
- Imprecise Label Learning: A Unified Framework for Learning with Various Imprecise Label Configurations [91.67511167969934]
Imprecise label learning (ILL) is a framework for unifying learning with various imprecise label configurations.
We demonstrate that ILL can seamlessly adapt to partial label learning, semi-supervised learning, noisy label learning, and, more importantly, a mixture of these settings.
arXiv Detail & Related papers (2023-05-22T04:50:28Z)
- Exploring Structured Semantic Prior for Multi Label Recognition with Incomplete Labels [60.675714333081466]
Multi-label recognition (MLR) with incomplete labels is very challenging.
Recent works strive to explore the image-to-label correspondence in vision-language models, i.e., CLIP, to compensate for insufficient annotations.
We advocate remedying the deficiency of label supervision for the MLR with incomplete labels by deriving a structured semantic prior.
arXiv Detail & Related papers (2023-03-23T12:39:20Z)
- Structured Semantic Transfer for Multi-Label Recognition with Partial Labels [85.6967666661044]
We propose a structured semantic transfer (SST) framework that enables training multi-label recognition models with partial labels.
The framework consists of two complementary transfer modules that explore within-image and cross-image semantic correlations.
Experiments on the Microsoft COCO, Visual Genome and Pascal VOC datasets show that the proposed SST framework obtains superior performance over current state-of-the-art algorithms.
arXiv Detail & Related papers (2021-12-21T02:15:01Z)
- Extreme Zero-Shot Learning for Extreme Text Classification [80.95271050744624]
Extreme Zero-Shot XMC (EZ-XMC) and Few-Shot XMC (FS-XMC) are investigated.
We propose to pre-train Transformer-based encoders with self-supervised contrastive losses.
We develop a pre-training method MACLR, which thoroughly leverages the raw text with techniques including Multi-scale Adaptive Clustering, Label Regularization, and self-training with pseudo positive pairs.
arXiv Detail & Related papers (2021-12-16T06:06:42Z)
- Label Disentanglement in Partition-based Extreme Multilabel Classification [111.25321342479491]
We show that the label assignment problem in partition-based XMC can be formulated as an optimization problem.
We show that our method can successfully disentangle multi-modal labels, leading to state-of-the-art (SOTA) results on four XMC benchmarks.
arXiv Detail & Related papers (2021-06-24T03:24:18Z)
- A Study on the Autoregressive and non-Autoregressive Multi-label Learning [77.11075863067131]
We propose a self-attention based variational encoder model to extract the label-label and label-feature dependencies jointly.
Our model can therefore be used to predict all labels in parallel while still including both label-label and label-feature dependencies.
arXiv Detail & Related papers (2020-12-03T05:41:44Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences arising from its use.