DeepXML: A Deep Extreme Multi-Label Learning Framework Applied to Short
Text Documents
- URL: http://arxiv.org/abs/2111.06685v1
- Date: Fri, 12 Nov 2021 12:25:23 GMT
- Authors: Kunal Dahiya, Deepak Saini, Anshul Mittal, Ankush Shaw, Kushal Dave,
Akshay Soni, Himanshu Jain, Sumeet Agarwal, Manik Varma
- Abstract summary: This paper develops the DeepXML framework, which addresses the challenges of scalability and accuracy by decomposing the deep extreme multi-label task into four simpler sub-tasks, each of which can be trained accurately and efficiently.
DeepXML yields the Astec algorithm that could be 2-12% more accurate and 5-30x faster to train than leading deep extreme classifiers on publicly available short text datasets.
Astec could also efficiently train on Bing short text datasets containing up to 62 million labels while making predictions for billions of users and data points per day on commodity hardware.
- Score: 10.573976360424473
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Scalability and accuracy are well-recognized challenges in deep extreme
multi-label learning where the objective is to train architectures for
automatically annotating a data point with the most relevant subset of labels
from an extremely large label set. This paper develops the DeepXML framework
that addresses these challenges by decomposing the deep extreme multi-label
task into four simpler sub-tasks, each of which can be trained accurately and
efficiently. Choosing different components for the four sub-tasks allows
DeepXML to generate a family of algorithms with varying trade-offs between
accuracy and scalability. In particular, DeepXML yields the Astec algorithm
that could be 2-12% more accurate and 5-30x faster to train than leading deep
extreme classifiers on publicly available short text datasets. Astec could
also efficiently train on Bing short text datasets containing up to 62 million
labels while making predictions for billions of users and data points per day
on commodity hardware. This allowed Astec to be deployed on the Bing search
engine for a number of short text applications ranging from matching user
queries to advertiser bid phrases to showing personalized ads, where it yielded
significant gains in click-through-rates, coverage, revenue and other online
metrics over state-of-the-art techniques currently in production. DeepXML's
code is available at https://github.com/Extreme-classification/deepxml
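To make the task in the abstract concrete, here is a minimal sketch of extreme multi-label prediction: score every label for a data point and keep the top k. This is a generic illustration and not Astec or any DeepXML component; the function `top_k_labels`, the per-label linear scorers, and the toy query and label names are all hypothetical.

```python
import heapq


def top_k_labels(features, label_weights, k=3):
    """Return the k highest-scoring label ids for one data point.

    features: dict mapping token -> value (a sparse bag-of-words vector).
    label_weights: dict mapping label id -> dict of token -> weight
        (one hypothetical linear scorer per label).
    """
    scores = {}
    for label, weights in label_weights.items():
        # Sparse dot product: only tokens present in the data point contribute.
        scores[label] = sum(
            value * weights.get(token, 0.0) for token, value in features.items()
        )
    # nlargest selects the top k without fully sorting all labels.
    return heapq.nlargest(k, scores, key=scores.get)


# Toy short-text query and three made-up labels (assumed data).
query = {"running": 1.0, "shoes": 1.0}
weights = {
    "buy running shoes": {"running": 0.9, "shoes": 0.8},
    "trail sneakers": {"running": 0.4, "shoes": 0.5},
    "laptop deals": {"laptop": 1.0},
}
predicted = top_k_labels(query, weights, k=2)
```

The loop visits every label, so its cost grows linearly with the label set. With millions of labels that exhaustive pass becomes the bottleneck, which is the scalability challenge the abstract refers to.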
Related papers
- Hierarchical Text Classification (HTC) vs. eXtreme Multilabel Classification (XML): Two Sides of the Same Medal [4.750005231187266]
Hierarchical Text Classification (HTC) focuses on datasets with smaller label pools of hundreds of entries, accompanied by a semantic label hierarchy.
eXtreme Multi-Label Text Classification (XML) considers very large label pools with up to millions of entries, in which the labels are not arranged in any particular manner.
Here, we investigate how state-of-the-art models from one domain perform when trained and tested on datasets from the other domain.
arXiv Detail & Related papers (2024-11-20T20:07:25Z)
- Multi-Label Knowledge Distillation [86.03990467785312]
We propose a novel multi-label knowledge distillation method.
On one hand, it exploits the informative semantic knowledge from the logits by dividing the multi-label learning problem into a set of binary classification problems.
On the other hand, it enhances the distinctiveness of the learned feature representations by leveraging the structural information of label-wise embeddings.
arXiv Detail & Related papers (2023-08-12T03:19:08Z)
- A Survey on Extreme Multi-label Learning [72.8751573611815]
Multi-label learning has attracted significant attention from both academia and industry in recent decades.
It is infeasible to directly adapt existing multi-label methods to an extremely large label space because of the compute and memory overhead.
eXtreme Multi-label Learning (XML) is becoming an important task, and many effective approaches have been proposed.
arXiv Detail & Related papers (2022-10-08T08:31:34Z)
- InceptionXML: A Lightweight Framework with Synchronized Negative Sampling for Short Text Extreme Classification [5.637543626451507]
InceptionXML is light-weight, yet powerful, and robust to the inherent lack of word-order in short-text queries.
We show that InceptionXML can outperform not only existing approaches on benchmark datasets but also the transformer baselines, while requiring only 2% of the FLOPs.
arXiv Detail & Related papers (2021-09-13T18:55:37Z)
- DECAF: Deep Extreme Classification with Label Features [9.768907751312396]
Extreme multi-label classification (XML) involves tagging a data point with its most relevant subset of labels from an extremely large label set.
Leading XML algorithms scale to millions of labels, but they largely ignore label meta-data such as textual descriptions of the labels.
This paper develops the DECAF algorithm that addresses these challenges by learning models enriched by label metadata.
arXiv Detail & Related papers (2021-08-01T05:36:05Z)
- ECLARE: Extreme Classification with Label Graph Correlations [13.429436351837653]
This paper presents ECLARE, a scalable deep learning architecture that incorporates not only label text, but also label correlations, to offer accurate real-time predictions within a few milliseconds.
ECLARE offers predictions that are 2 to 14% more accurate on both publicly available benchmark datasets and proprietary datasets for a related products recommendation task sourced from the Bing search engine.
arXiv Detail & Related papers (2021-07-31T15:13:13Z)
- HTCInfoMax: A Global Model for Hierarchical Text Classification via Information Maximization [75.45291796263103]
The current state-of-the-art model HiAGM for hierarchical text classification has two limitations.
It correlates each text sample with all labels in the dataset, which introduces irrelevant information.
We propose HTCInfoMax to address these issues by introducing information maximization, which includes two modules.
arXiv Detail & Related papers (2021-04-12T06:04:20Z)
- Minimally-Supervised Structure-Rich Text Categorization via Learning on Text-Rich Networks [61.23408995934415]
We propose a novel framework for minimally supervised categorization by learning from the text-rich network.
Specifically, we jointly train two modules with different inductive biases -- a text analysis module for text understanding and a network learning module for class-discriminative, scalable network learning.
Our experiments show that given only three seed documents per category, our framework can achieve an accuracy of about 92%.
arXiv Detail & Related papers (2021-02-23T04:14:34Z)
- MATCH: Metadata-Aware Text Classification in A Large Hierarchy [60.59183151617578]
MATCH is an end-to-end framework that leverages both metadata and hierarchy information.
We propose different ways to regularize the parameters and output probability of each child label by its parents.
Experiments on two massive text datasets with large-scale label hierarchies demonstrate the effectiveness of MATCH.
arXiv Detail & Related papers (2021-02-15T05:23:08Z)
- Adaptive Self-training for Few-shot Neural Sequence Labeling [55.43109437200101]
We develop techniques to address the label scarcity challenge for neural sequence labeling models.
Self-training serves as an effective mechanism to learn from large amounts of unlabeled data.
Meta-learning helps in adaptive sample re-weighting to mitigate error propagation from noisy pseudo-labels.
arXiv Detail & Related papers (2020-10-07T22:29:05Z)