ECLARE: Extreme Classification with Label Graph Correlations
- URL: http://arxiv.org/abs/2108.00261v1
- Date: Sat, 31 Jul 2021 15:13:13 GMT
- Title: ECLARE: Extreme Classification with Label Graph Correlations
- Authors: Anshul Mittal, Noveen Sachdeva, Sheshansh Agrawal, Sumeet Agarwal,
Purushottam Kar, Manik Varma
- Abstract summary: This paper presents ECLARE, a scalable deep learning architecture that incorporates not only label text, but also label correlations, to offer accurate real-time predictions within a few milliseconds.
ECLARE offers predictions that are 2 to 14% more accurate on both publicly available benchmark datasets as well as proprietary datasets for a related products recommendation task sourced from the Bing search engine.
- Score: 13.429436351837653
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Deep extreme classification (XC) seeks to train deep architectures that can
tag a data point with its most relevant subset of labels from an extremely
large label set. The core utility of XC comes from predicting labels that are
rarely seen during training. Such rare labels hold the key to personalized
recommendations that can delight and surprise a user. However, the large number
of rare labels and small amount of training data per rare label offer
significant statistical and computational challenges. State-of-the-art deep XC
methods attempt to remedy this by incorporating textual descriptions of labels
but do not adequately address the problem. This paper presents ECLARE, a
scalable deep learning architecture that incorporates not only label text, but
also label correlations, to offer accurate real-time predictions within a few
milliseconds. Core contributions of ECLARE include a frugal architecture and
scalable techniques to train deep models along with label correlation graphs at
the scale of millions of labels. In particular, ECLARE offers predictions that
are 2 to 14% more accurate on both publicly available benchmark datasets as
well as proprietary datasets for a related products recommendation task sourced
from the Bing search engine. Code for ECLARE is available at
https://github.com/Extreme-classification/ECLARE.
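As a rough illustration of what a label correlation graph can look like, here is a minimal sketch that builds one from label co-occurrence in the training data. The abstract does not say how ECLARE constructs its graph, so the co-occurrence counting, the row normalization, the function name `label_cooccurrence_graph`, and the `min_count` parameter are all assumptions made for illustration, not the authors' method.

```python
from collections import Counter
from itertools import combinations


def label_cooccurrence_graph(label_sets, min_count=1):
    """Build a row-normalized label co-occurrence graph (hypothetical helper).

    label_sets: iterable of label collections, one per training point.
    Returns {label: {neighbor: weight}}, where each label's weights sum to 1.
    """
    # Count how often each unordered pair of labels tags the same data point.
    pair_counts = Counter()
    for labels in label_sets:
        for a, b in combinations(sorted(set(labels)), 2):
            pair_counts[(a, b)] += 1

    # Keep sufficiently frequent pairs as undirected weighted edges.
    edges = {}
    for (a, b), count in pair_counts.items():
        if count < min_count:
            continue
        edges.setdefault(a, {})[b] = count
        edges.setdefault(b, {})[a] = count

    # Normalize each label's outgoing weights so they sum to 1.
    return {
        label: {n: w / sum(nbrs.values()) for n, w in nbrs.items()}
        for label, nbrs in edges.items()
    }


if __name__ == "__main__":
    data = [["a", "b"], ["a", "b", "c"], ["c", "d"]]
    print(label_cooccurrence_graph(data))
```

In a graph like this, a rare label such as `d` is connected to a better-represented neighbor, which is the kind of signal a correlation-aware model could draw on when training data for the rare label is scarce.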
Related papers
- Scribbles for All: Benchmarking Scribble Supervised Segmentation Across Datasets [51.74296438621836]
We introduce Scribbles for All, a label and training data generation algorithm for semantic segmentation trained on scribble labels.
The main limitation of scribbles as a source of weak supervision is the lack of challenging datasets for scribble segmentation.
Scribbles for All provides scribble labels for several popular segmentation datasets and provides an algorithm to automatically generate scribble labels for any dataset with dense annotations.
arXiv Detail & Related papers (2024-08-22T15:29:08Z)
- Zero-Shot Learning Over Large Output Spaces: Utilizing Indirect Knowledge Extraction from Large Language Models [3.908992369351976]
Extreme Zero-shot XMC (EZ-XMC) is a special setting of XMC wherein no supervision is provided.
Traditional state-of-the-art methods extract pseudo labels from the document title or segments.
We propose a framework to train a small bi-encoder model via feedback from a large language model (LLM).
arXiv Detail & Related papers (2024-06-13T16:26:37Z)
- Learning label-label correlations in Extreme Multi-label Classification via Label Features [44.00852282861121]
Extreme Multi-label Text Classification (XMC) involves learning a classifier that can assign to an input a subset of the most relevant labels from millions of label choices.
Short-text XMC with label features has found numerous applications in areas such as query-to-ad-phrase matching in search ads, title-based product recommendation, and prediction of related searches.
We propose Gandalf, a novel approach which makes use of a label co-occurrence graph to leverage label features as additional data points to supplement the training distribution.
arXiv Detail & Related papers (2024-05-03T21:18:43Z) - Imprecise Label Learning: A Unified Framework for Learning with Various Imprecise Label Configurations [91.67511167969934]
Imprecise label learning (ILL) is a framework for the unification of learning with various imprecise label configurations.
We demonstrate that ILL can seamlessly adapt to partial label learning, semi-supervised learning, noisy label learning, and, more importantly, a mixture of these settings.
arXiv Detail & Related papers (2023-05-22T04:50:28Z) - Open Vocabulary Extreme Classification Using Generative Models [24.17018785195843]
The extreme multi-label classification (XMC) task aims at tagging content with a subset of labels from an extremely large label set.
We propose GROOV, a fine-tuned seq2seq model for OXMC that generates the set of labels as a flat sequence and is trained using a novel loss independent of predicted label order.
We show the efficacy of the approach, experimenting with popular XMC datasets for which GROOV is able to predict meaningful labels outside the given vocabulary while performing on par with state-of-the-art solutions for known labels.
arXiv Detail & Related papers (2022-05-12T00:33:49Z) - Extreme Zero-Shot Learning for Extreme Text Classification [80.95271050744624]
Extreme Zero-Shot XMC (EZ-XMC) and Few-Shot XMC (FS-XMC) are investigated.
We propose to pre-train Transformer-based encoders with self-supervised contrastive losses.
We develop a pre-training method MACLR, which thoroughly leverages the raw text with techniques including Multi-scale Adaptive Clustering, Label Regularization, and self-training with pseudo positive pairs.
arXiv Detail & Related papers (2021-12-16T06:06:42Z) - DECAF: Deep Extreme Classification with Label Features [9.768907751312396]
Extreme multi-label classification (XML) involves tagging a data point with its most relevant subset of labels from an extremely large label set.
Leading XML algorithms scale to millions of labels, but they largely ignore label meta-data such as textual descriptions of the labels.
This paper develops the DECAF algorithm that addresses these challenges by learning models enriched by label metadata.
arXiv Detail & Related papers (2021-08-01T05:36:05Z) - GNN-XML: Graph Neural Networks for Extreme Multi-label Text
Classification [23.79498916023468]
Extreme multi-label text classification (XMTC) aims to tag a text instance with the most relevant subset of labels from an extremely large label set.
GNN-XML is a scalable graph neural network framework tailored for XMTC problems.
arXiv Detail & Related papers (2020-12-10T18:18:34Z) - A Study on the Autoregressive and non-Autoregressive Multi-label
Learning [77.11075863067131]
We propose a self-attention based variational encoder-model to extract the label-label and label-feature dependencies jointly.
Our model can therefore be used to predict all labels in parallel while still including both label-label and label-feature dependencies.
arXiv Detail & Related papers (2020-12-03T05:41:44Z) - Knowledge-Guided Multi-Label Few-Shot Learning for General Image
Recognition [75.44233392355711]
The KGGR framework exploits prior knowledge of statistical label correlations with deep neural networks.
It first builds a structured knowledge graph to correlate different labels based on statistical label co-occurrence.
Then, it introduces the label semantics to guide learning semantic-specific features.
It exploits a graph propagation network to explore graph node interactions.
arXiv Detail & Related papers (2020-09-20T15:05:29Z)
This list is automatically generated from the titles and abstracts of the papers in this site.