Enabling Efficiency-Precision Trade-offs for Label Trees in Extreme
Classification
- URL: http://arxiv.org/abs/2106.00730v1
- Date: Tue, 1 Jun 2021 19:02:09 GMT
- Title: Enabling Efficiency-Precision Trade-offs for Label Trees in Extreme
Classification
- Authors: Tavor Z. Baharav, Daniel L. Jiang, Kedarnath Kolluri, Sujay Sanghavi,
Inderjit S. Dhillon
- Abstract summary: Extreme multi-label classification (XMC) aims to learn a model that can tag data points with a subset of relevant labels from an extremely large label set.
We propose an efficient, information-theory-inspired algorithm to construct intermediate operating points that trade off between statistical performance and latency.
Our method can reduce a proxy for expected latency by up to 28% while maintaining the same accuracy as Parabel.
- Score: 43.840626501982314
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Extreme multi-label classification (XMC) aims to learn a model that can tag
data points with a subset of relevant labels from an extremely large label set.
Real world e-commerce applications like personalized recommendations and
product advertising can be formulated as XMC problems, where the objective is
to predict for a user a small subset of items from a catalog of several million
products. For such applications, a common approach is to organize these labels
into a tree, enabling training and inference times that are logarithmic in the
number of labels. While training a model once a label tree is available is well
studied, designing the structure of the tree is a difficult task that is not
yet well understood, and can dramatically impact both model latency and
statistical performance. Existing approaches to tree construction sit at one of
two extremes, optimizing exclusively for either statistical performance or
latency. We propose an efficient, information-theory-inspired algorithm to
construct intermediate operating points that trade off between the two. Our
algorithm enables interpolation between these objectives, which was not
previously possible. We corroborate our theoretical analysis with numerical
not previously possible. We corroborate our theoretical analysis with numerical
results, showing that on the Wiki-500K benchmark dataset our method can reduce
a proxy for expected latency by up to 28% while maintaining the same accuracy
as Parabel. On several datasets derived from e-commerce customer logs, our
modified label tree is able to improve this expected latency metric by up to
20% while maintaining the same accuracy. Finally, we discuss challenges in
realizing these latency improvements in deployed models.
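To make the trade-off concrete, here is a minimal toy sketch (not the paper's algorithm; the tree builder, the scoring function, and all names are illustrative assumptions) of beam-search inference over a label tree, where the count of node evaluations serves as a simple proxy for expected latency. A shallower, wider tree needs fewer levels but more evaluations per level; a deeper, narrower tree does the opposite, which is the structural knob the paper studies.

```python
# Toy sketch: a balanced label tree with beam-search inference.
# Illustrative only -- not the paper's construction algorithm.
import math

class Node:
    def __init__(self, labels):
        self.labels = labels      # leaf labels reachable from this node
        self.children = []

def build_tree(labels, branching):
    """Recursively split labels into `branching` balanced groups."""
    root = Node(labels)
    if len(labels) <= branching:
        return root               # small enough to be a leaf
    size = math.ceil(len(labels) / branching)
    for i in range(0, len(labels), size):
        root.children.append(build_tree(labels[i:i + size], branching))
    return root

def beam_search(root, score, beam_width=2, top_k=5):
    """Descend the tree keeping the `beam_width` best nodes per level.

    Returns predicted labels and the number of child evaluations,
    a simple proxy for expected inference latency.
    """
    frontier, leaves, evals = [root], [], 0
    while frontier:
        scored = []
        for node in frontier:
            for child in node.children:
                evals += 1        # one classifier call per visited child
                scored.append((score(child), child))
        scored.sort(key=lambda t: t[0], reverse=True)
        frontier = []
        for _, node in scored[:beam_width]:
            (leaves if not node.children else frontier).append(node)
    preds = [l for leaf in leaves for l in leaf.labels][:top_k]
    return preds, evals

# Example: 64 labels, branching factor 4, a stand-in scorer that
# prefers subtrees containing low label ids.
tree = build_tree(list(range(64)), 4)
preds, evals = beam_search(tree, lambda n: -min(n.labels))
```

With 64 labels and branching factor 4, the beam visits 4 children at the first level and 8 at the second (12 evaluations total), versus 64 for a flat one-versus-all pass; changing the branching factor shifts this count, which is the kind of latency proxy the abstract refers to.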
Related papers
- Learning with Noisy Labels: Interconnection of Two
Expectation-Maximizations [41.65589788264123]
Labor-intensive labeling becomes a bottleneck in developing computer vision algorithms based on deep learning.
We address learning with noisy labels (LNL) problem, which is formalized as a task of finding a structured manifold in the midst of noisy data.
Our algorithm achieves state-of-the-art performance in multiple standard benchmarks with substantial margins under various types of label noise.
(arXiv 2024-01-09)
- Improving Text Matching in E-Commerce Search with A Rationalizable,
Intervenable and Fast Entity-Based Relevance Model [78.80174696043021]
We propose a novel model called the Entity-Based Relevance Model (EBRM).
The decomposition allows us to use a Cross-encoder QE relevance module for high accuracy.
We also show that pretraining the QE module with auto-generated QE data from user logs can further improve the overall performance.
(arXiv 2023-07-01)
- LESS: Label-Efficient Semantic Segmentation for LiDAR Point Clouds [62.49198183539889]
We propose a label-efficient semantic segmentation pipeline for outdoor scenes with LiDAR point clouds.
Our method co-designs an efficient labeling process with semi/weakly supervised learning.
Our proposed method is even highly competitive compared to the fully supervised counterpart with 100% labels.
(arXiv 2022-10-14)
- Effective Token Graph Modeling using a Novel Labeling Strategy for
Structured Sentiment Analysis [39.770652220521384]
The state-of-the-art model for structured sentiment analysis casts the task as a dependency parsing problem.
Label proportions for span prediction and span relation prediction are imbalanced.
Two nodes in a dependency graph cannot be connected by multiple arcs, so some overlapping sentiments cannot be recognized.
(arXiv 2022-03-21)
- Towards Good Practices for Efficiently Annotating Large-Scale Image
Classification Datasets [90.61266099147053]
We investigate efficient annotation strategies for collecting multi-class classification labels for a large collection of images.
We propose modifications and best practices aimed at minimizing human labeling effort.
Simulated experiments on a 125k-image subset of ImageNet100 show that it can be annotated to 80% top-1 accuracy with 0.35 annotations per image on average.
(arXiv 2021-04-26)
- Robust Optimal Classification Trees under Noisy Labels [1.5039745292757671]
We propose a novel methodology to construct Optimal Classification Trees that takes into account that noisy labels may occur in the training sample.
Our approach rests on two main elements: (1) the splitting rules for the classification trees are designed to maximize the separation margin between classes, applying the SVM paradigm; and (2) some labels of the training sample are allowed to change during tree construction, in an attempt to detect label noise.
(arXiv 2020-12-15)
- SLADE: A Self-Training Framework For Distance Metric Learning [75.54078592084217]
We present a self-training framework, SLADE, to improve retrieval performance by leveraging additional unlabeled data.
We first train a teacher model on the labeled data and use it to generate pseudo labels for the unlabeled data.
We then train a student model on both the real and pseudo labels to generate the final feature embeddings.
(arXiv 2020-11-20)
- Probabilistic Label Trees for Extreme Multi-label Classification [8.347190888362194]
Problems of extreme multi-label classification (XMLC) are efficiently handled by organizing labels as a tree.
Probabilistic label trees (PLTs) can be treated as a generalization of hierarchical softmax for multi-label problems.
We introduce the model and discuss training and inference procedures and their computational costs.
We prove a specific equivalence between the fully online algorithm and an algorithm with a tree structure given in advance.
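The PLT idea summarized above can be sketched in a few lines (an illustrative toy, not the paper's implementation): the probability of a label given an input is the product of conditional probabilities along its root-to-leaf path, which is what makes inference logarithmic in the label count.

```python
# Illustrative sketch of the PLT chain rule:
# P(label | x) = product over the root-to-leaf path of P(node | parent, x).
# The path probabilities below are toy values, not a trained model's output.

def plt_label_probability(path_probs):
    """Multiply the conditional probabilities along a root-to-leaf path."""
    p = 1.0
    for q in path_probs:
        p *= q
    return p

# Example: a label whose path has conditionals 0.9, 0.8, 0.5
p = plt_label_probability([0.9, 0.8, 0.5])
print(round(p, 2))  # 0.36
```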
(arXiv 2020-09-23)
- Improving Semantic Segmentation via Self-Training [75.07114899941095]
We show that we can obtain state-of-the-art results using a semi-supervised approach, specifically a self-training paradigm.
We first train a teacher model on labeled data, and then generate pseudo labels on a large set of unlabeled data.
Our robust training framework can digest human-annotated and pseudo labels jointly and achieve top performances on Cityscapes, CamVid and KITTI datasets.
(arXiv 2020-04-30)
- GraftNet: An Engineering Implementation of CNN for Fine-grained
Multi-label Task [17.885793498743723]
GraftNet is a customizable tree-like network with its trunk pretrained with a dynamic graph for generic feature extraction.
We show that it has good performance on our human attributes recognition task, which is fine-grained multi-label classification.
(arXiv 2020-04-27)
This list is automatically generated from the titles and abstracts of the papers in this site.