LightXML: Transformer with Dynamic Negative Sampling for
High-Performance Extreme Multi-label Text Classification
- URL: http://arxiv.org/abs/2101.03305v1
- Date: Sat, 9 Jan 2021 07:04:18 GMT
- Title: LightXML: Transformer with Dynamic Negative Sampling for
High-Performance Extreme Multi-label Text Classification
- Authors: Ting Jiang, Deqing Wang, Leilei Sun, Huayi Yang, Zhengyang Zhao,
Fuzhen Zhuang
- Abstract summary: Extreme Multi-label text Classification (XMC) is the task of finding the most relevant labels from a large label set.
We propose LightXML, which adopts end-to-end training and dynamic negative label sampling.
In experiments, LightXML outperforms state-of-the-art methods on five extreme multi-label datasets.
- Score: 27.80266694835677
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Extreme Multi-label text Classification (XMC) is the task of finding the most
relevant labels from a large label set. Deep learning-based methods
have recently shown significant success in XMC. However, existing methods (e.g.,
AttentionXML and X-Transformer) still suffer from 1) combining several
models to train and predict for one dataset, and 2) sampling negative labels
statically while training the label ranking model, which reduces
both the efficiency and accuracy of the model. To address these problems,
we propose LightXML, which adopts end-to-end training and dynamic negative
label sampling. In LightXML, we use generative cooperative networks to recall
and rank labels: the label recalling part generates negative and positive
labels, and the label ranking part distinguishes positive labels from these labels.
Through these networks, negative labels are sampled dynamically during training
of the label ranking part, with both parts fed the same text representation. Extensive
experiments show that LightXML outperforms state-of-the-art methods on five
extreme multi-label datasets with much smaller model size and lower
computational complexity. In particular, on the Amazon dataset with 670K
labels, LightXML can reduce the model size by up to 72% compared to AttentionXML.
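The recall-then-rank idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the grouping of labels into clusters, the linear scoring, the binary cross-entropy loss, the rule that true labels are always kept as candidates, and every name below (`sample_candidates`, `ranking_loss`, `top_k`, `groups`) are assumptions made for illustration. The point it shows is that the negatives seen by the ranking step depend on the current recall scores, so they change as the recall part changes.

```python
import math


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def sample_candidates(text_vec, group_weights, groups, positive_labels, top_k):
    """Recall step (hypothetical): score label groups, keep the top_k groups.

    Candidates are all labels in the top-scoring groups plus the true labels
    (an assumption), so the negatives depend on the current group scores.
    """
    scores = [dot(text_vec, w) for w in group_weights]
    order = sorted(range(len(groups)), key=lambda g: scores[g], reverse=True)
    candidates = set(positive_labels)
    for g in order[:top_k]:
        candidates.update(groups[g])
    return sorted(candidates)


def ranking_loss(text_vec, label_weights, candidates, positive_labels):
    """Rank step (hypothetical): binary cross-entropy over candidates only."""
    positives = set(positive_labels)
    eps = 1e-12  # clamp probabilities to keep log() finite
    loss = 0.0
    for label in candidates:
        p = 1.0 / (1.0 + math.exp(-dot(text_vec, label_weights[label])))
        p = min(max(p, eps), 1.0 - eps)
        target = 1.0 if label in positives else 0.0
        loss -= target * math.log(p) + (1.0 - target) * math.log(1.0 - p)
    return loss / len(candidates)


# Toy setup: 8 labels in 4 groups, one shared 2-d text representation.
groups = [[0, 1], [2, 3], [4, 5], [6, 7]]
text_vec = [1.0, 0.0]
label_weights = [[0.1 * i, 0.0] for i in range(8)]
positive_labels = [5]

# Two different states of the recall part give two different negative sets.
early = sample_candidates(text_vec, [[3, 0], [2, 0], [1, 0], [0, 0]],
                          groups, positive_labels, top_k=2)
later = sample_candidates(text_vec, [[0, 0], [1, 0], [2, 0], [3, 0]],
                          groups, positive_labels, top_k=2)
loss = ranking_loss(text_vec, label_weights, later, positive_labels)
```

Here both steps read the same `text_vec`, mirroring the shared text representation mentioned in the abstract. A static scheme would instead fix the candidate list once, before the ranking step is trained.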
Related papers
- Learning label-label correlations in Extreme Multi-label Classification via Label Features [44.00852282861121]
Extreme Multi-label Text Classification (XMC) involves learning a classifier that can assign to an input a subset of the most relevant labels from millions of label choices.
Short-text XMC with label features has found numerous applications in areas such as query-to-ad-phrase matching in search ads, title-based product recommendation, and prediction of related searches.
We propose Gandalf, a novel approach which makes use of a label co-occurrence graph to leverage label features as additional data points to supplement the training distribution.
arXiv Detail & Related papers (2024-05-03T21:18:43Z)
- MatchXML: An Efficient Text-label Matching Framework for Extreme
Multi-label Text Classification [13.799733640048672]
eXtreme Multi-label text Classification (XMC) refers to training a classifier that assigns relevant labels from a large-scale label set to a text sample.
We propose MatchXML, an efficient text-label matching framework for XMC.
Experimental results demonstrate that MatchXML achieves state-of-the-art accuracy on five out of six datasets.
arXiv Detail & Related papers (2023-08-25T02:32:36Z)
- Light-weight Deep Extreme Multilabel Classification [12.29534534973133]
Extreme multi-label (XML) classification refers to the task of supervised multi-label learning that involves a large number of labels.
We develop a method called LightDXML which modifies the recently developed deep learning based XML framework by using label embeddings.
LightDXML also removes the requirement of a re-ranker module, thereby leading to further savings in time and memory requirements.
arXiv Detail & Related papers (2023-04-20T09:06:10Z)
- Bridging the Gap between Model Explanations in Partially Annotated
Multi-label Classification [85.76130799062379]
We study how false negative labels affect the model's explanation.
We propose to boost the attribution scores of the model trained with partial labels to make its explanation resemble that of the model trained with full labels.
arXiv Detail & Related papers (2023-04-04T14:00:59Z)
- Large Loss Matters in Weakly Supervised Multi-Label Classification [50.262533546999045]
We first regard unobserved labels as negative labels, casting the weakly supervised multi-label (WSML) task into noisy multi-label classification.
We propose novel methods for WSML which reject or correct the large loss samples to prevent the model from memorizing the noisy labels.
Our methodology works well in practice, validating that treating large loss properly matters in weakly supervised multi-label classification.
arXiv Detail & Related papers (2022-06-08T08:30:24Z)
- Acknowledging the Unknown for Multi-label Learning with Single Positive
Labels [65.5889334964149]
Traditionally, all unannotated labels are assumed to be negative labels in single positive multi-label learning (SPML).
We propose entropy-maximization (EM) loss to maximize the entropy of predicted probabilities for all unannotated labels.
Considering the positive-negative label imbalance of unannotated labels, we propose asymmetric pseudo-labeling (APL) with asymmetric-tolerance strategies and a self-paced procedure to provide more precise supervision.
arXiv Detail & Related papers (2022-03-30T11:43:59Z)
- Label Disentanglement in Partition-based Extreme Multilabel
Classification [111.25321342479491]
We show that the label assignment problem in partition-based XMC can be formulated as an optimization problem.
We show that our method can successfully disentangle multi-modal labels, leading to state-of-the-art (SOTA) results on four XMC benchmarks.
arXiv Detail & Related papers (2021-06-24T03:24:18Z)
- Group-aware Label Transfer for Domain Adaptive Person Re-identification [179.816105255584]
Unsupervised Domain Adaptive (UDA) person re-identification (ReID) aims at adapting the model trained on a labeled source-domain dataset to a target-domain dataset without any further annotations.
Most successful UDA-ReID approaches combine clustering-based pseudo-label prediction with representation learning and perform the two steps in an alternating fashion.
We propose a Group-aware Label Transfer (GLT) algorithm, which enables the online interaction and mutual promotion of pseudo-label prediction and representation learning.
arXiv Detail & Related papers (2021-03-23T07:57:39Z)
- GNN-XML: Graph Neural Networks for Extreme Multi-label Text
Classification [23.79498916023468]
Extreme multi-label text classification (XMTC) aims to tag a text instance with the most relevant subset of labels from an extremely large label set.
GNN-XML is a scalable graph neural network framework tailored for XMTC problems.
arXiv Detail & Related papers (2020-12-10T18:18:34Z)
- A Study on the Autoregressive and non-Autoregressive Multi-label
Learning [77.11075863067131]
We propose a self-attention based variational encoder-model to extract the label-label and label-feature dependencies jointly.
Our model can therefore be used to predict all labels in parallel while still including both label-label and label-feature dependencies.
arXiv Detail & Related papers (2020-12-03T05:41:44Z)
This list is automatically generated from the titles and abstracts of the papers on this site.