Prototypical Extreme Multi-label Classification with a Dynamic Margin Loss
- URL: http://arxiv.org/abs/2410.20401v1
- Date: Sun, 27 Oct 2024 10:24:23 GMT
- Title: Prototypical Extreme Multi-label Classification with a Dynamic Margin Loss
- Authors: Kunal Dahiya, Diego Ortego, David Jiménez,
- Abstract summary: Extreme Multi-label Classification (XMC) methods predict relevant labels for a given query in an extremely large label space.
Recent works in XMC address this problem using deep encoders that project text descriptions to an embedding space suitable for recovering the closest labels.
We propose PRIME, an XMC method that employs a novel prototypical contrastive learning technique to reconcile efficiency and performance, surpassing brute-force approaches.
- Score: 6.244642999033755
- Abstract: Extreme Multi-label Classification (XMC) methods predict relevant labels for a given query in an extremely large label space. Recent works in XMC address this problem using deep encoders that project text descriptions to an embedding space suitable for recovering the closest labels. However, learning deep models can be computationally expensive in large output spaces, resulting in a trade-off between high-performing brute-force approaches and efficient solutions. In this paper, we propose PRIME, an XMC method that employs a novel prototypical contrastive learning technique to reconcile efficiency and performance, surpassing brute-force approaches. We frame XMC as a data-to-prototype prediction task where label prototypes aggregate information from related queries. More precisely, we use a shallow transformer encoder, which we call the Label Prototype Network, that enriches label representations by aggregating text-based embeddings, label centroids and learnable free vectors. We jointly train a deep encoder and the Label Prototype Network using an adaptive triplet loss objective that better adapts to the high granularity and ambiguity of extreme label spaces. PRIME achieves state-of-the-art results in several public benchmarks of different sizes and domains, while keeping the model efficient.
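As a rough illustration of the data-to-prototype idea in the abstract, the sketch below builds a label prototype from a text embedding, a centroid of query embeddings and a free vector, then scores a triplet with a margin that depends on the labels involved. The abstract gives neither the aggregation nor the margin formula, so everything here is an assumption: plain averaging stands in for the Label Prototype Network (a shallow transformer encoder in the paper), and the margin rule in `dynamic_margin_triplet`, along with all function names and the `base_margin` value, is hypothetical rather than PRIME's actual loss.

```python
import math


def normalize(v):
    """Scale a vector to unit L2 norm (an all-zero vector is returned as-is)."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v] if norm > 0 else list(v)


def cosine(u, v):
    """Cosine similarity between two vectors of equal length."""
    u, v = normalize(u), normalize(v)
    return sum(a * b for a, b in zip(u, v))


def mean_vector(vectors):
    """Element-wise mean of a non-empty list of equal-length vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def label_prototype(text_emb, query_embs, free_vec):
    """Assumed stand-in for the Label Prototype Network: a plain average.

    text_emb   : embedding of the label's text
    query_embs : embeddings of queries tagged with this label (centroid source)
    free_vec   : free vector (learnable in a real model, fixed in this sketch)
    """
    centroid = mean_vector(query_embs)
    return normalize(mean_vector([text_emb, centroid, free_vec]))


def dynamic_margin_triplet(query, pos_proto, neg_proto, base_margin=0.3):
    """Triplet hinge loss on cosine similarity with a label-dependent margin.

    Hypothetical margin rule: the margin shrinks as the negative prototype
    gets closer to the positive one, so ambiguous label pairs are pushed
    apart less aggressively than clearly unrelated ones.
    """
    margin = base_margin * (1.0 - max(0.0, cosine(pos_proto, neg_proto)))
    loss = max(0.0, margin - cosine(query, pos_proto) + cosine(query, neg_proto))
    return loss, margin
```

With orthogonal positive and negative prototypes the full `base_margin` applies, while for near-identical prototypes the margin shrinks toward zero, so closely related labels contribute little or no loss.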
Related papers
- Zero-Shot Learning Over Large Output Spaces: Utilizing Indirect Knowledge Extraction from Large Language Models [3.908992369351976]
Extreme Zero-shot XMC (EZ-XMC) is a special setting of XMC wherein no supervision is provided.
Traditional state-of-the-art methods extract pseudo labels from the document title or segments.
We propose a framework to train a small bi-encoder model via feedback from a large language model (LLM).
arXiv Detail & Related papers (2024-06-13T16:26:37Z)
- UniDEC: Unified Dual Encoder and Classifier Training for Extreme Multi-Label Classification [42.36546066941635]
Extreme Multi-label Classification (XMC) involves predicting a subset of relevant labels from an extremely large label space.
This work proposes UniDEC, a novel end-to-end trainable framework which trains the dual encoder and classifier together.
arXiv Detail & Related papers (2024-05-04T17:27:51Z)
- Learning label-label correlations in Extreme Multi-label Classification via Label Features [44.00852282861121]
Extreme Multi-label Text Classification (XMC) involves learning a classifier that can assign to an input a subset of the most relevant labels from millions of label choices.
Short-text XMC with label features has found numerous applications in areas such as query-to-ad-phrase matching in search ads, title-based product recommendation, and prediction of related searches.
We propose Gandalf, a novel approach which makes use of a label co-occurrence graph to leverage label features as additional data points to supplement the training distribution.
arXiv Detail & Related papers (2024-05-03T21:18:43Z)
- PINA: Leveraging Side Information in eXtreme Multi-label Classification via Predicted Instance Neighborhood Aggregation [105.52660004082766]
The eXtreme Multi-label Classification (XMC) problem seeks to find relevant labels from an exceptionally large label space.
We propose Predicted Instance Neighborhood Aggregation (PINA), a data enhancement method for the general XMC problem.
Unlike most existing XMC frameworks that treat labels and input instances as featureless indicators and independent entries, PINA extracts information from the label metadata and the correlations among training instances.
arXiv Detail & Related papers (2023-05-21T05:00:40Z)
- Exploring Structured Semantic Prior for Multi Label Recognition with Incomplete Labels [60.675714333081466]
Multi-label recognition (MLR) with incomplete labels is very challenging.
Recent works strive to explore the image-to-label correspondence in the vision-language model, i.e., CLIP, to compensate for insufficient annotations.
We advocate remedying the deficiency of label supervision for the MLR with incomplete labels by deriving a structured semantic prior.
arXiv Detail & Related papers (2023-03-23T12:39:20Z)
- Ground Truth Inference for Weakly Supervised Entity Matching [76.6732856489872]
We propose a simple but powerful labeling model for weak supervision tasks.
We then tailor the labeling model specifically to the task of entity matching.
We show that our labeling model results in a 9% higher F1 score on average than the best existing method.
arXiv Detail & Related papers (2022-11-13T17:57:07Z)
- LESS: Label-Efficient Semantic Segmentation for LiDAR Point Clouds [62.49198183539889]
We propose a label-efficient semantic segmentation pipeline for outdoor scenes with LiDAR point clouds.
Our method co-designs an efficient labeling process with semi/weakly supervised learning.
Our proposed method is highly competitive even with the fully supervised counterpart trained with 100% of the labels.
arXiv Detail & Related papers (2022-10-14T19:13:36Z)
- Long-tailed Extreme Multi-label Text Classification with Generated Pseudo Label Descriptions [28.416742933744942]
This paper addresses the challenge of tail label prediction by proposing a novel approach.
It combines the effectiveness of a trained bag-of-words (BoW) classifier in generating informative label descriptions under severely data-scarce conditions.
The proposed approach achieves state-of-the-art performance on XMTC benchmark datasets and significantly outperforms the best methods so far in the tail label prediction.
arXiv Detail & Related papers (2022-04-02T23:42:32Z)
- Extreme Zero-Shot Learning for Extreme Text Classification [80.95271050744624]
Extreme Zero-Shot XMC (EZ-XMC) and Few-Shot XMC (FS-XMC) are investigated.
We propose to pre-train Transformer-based encoders with self-supervised contrastive losses.
We develop a pre-training method MACLR, which thoroughly leverages the raw text with techniques including Multi-scale Adaptive Clustering, Label Regularization, and self-training with pseudo positive pairs.
arXiv Detail & Related papers (2021-12-16T06:06:42Z)
- Label Disentanglement in Partition-based Extreme Multilabel Classification [111.25321342479491]
We show that the label assignment problem in partition-based XMC can be formulated as an optimization problem.
We show that our method can successfully disentangle multi-modal labels, leading to state-of-the-art (SOTA) results on four XMC benchmarks.
arXiv Detail & Related papers (2021-06-24T03:24:18Z)