Enabling Efficiency-Precision Trade-offs for Label Trees in Extreme
Classification
- URL: http://arxiv.org/abs/2106.00730v1
- Date: Tue, 1 Jun 2021 19:02:09 GMT
- Title: Enabling Efficiency-Precision Trade-offs for Label Trees in Extreme
Classification
- Authors: Tavor Z. Baharav, Daniel L. Jiang, Kedarnath Kolluri, Sujay Sanghavi,
Inderjit S. Dhillon
- Abstract summary: Extreme multi-label classification (XMC) aims to learn a model that can tag data points with a subset of relevant labels from an extremely large label set.
We propose an efficient, information-theory-inspired algorithm to construct intermediate operating points that trade off between statistical performance and latency.
Our method can reduce a proxy for expected latency by up to 28% while maintaining the same accuracy as Parabel.
- Score: 43.840626501982314
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Extreme multi-label classification (XMC) aims to learn a model that can tag
data points with a subset of relevant labels from an extremely large label set.
Real world e-commerce applications like personalized recommendations and
product advertising can be formulated as XMC problems, where the objective is
to predict for a user a small subset of items from a catalog of several million
products. For such applications, a common approach is to organize these labels
into a tree, enabling training and inference times that are logarithmic in the
number of labels. While training a model once a label tree is available is well
studied, designing the structure of the tree is a difficult task that is not
yet well understood, and can dramatically impact both model latency and
statistical performance. Existing approaches to tree construction sit at one of
two extremes, optimizing exclusively for either statistical performance or
latency. We propose an efficient, information-theory-inspired algorithm to
construct intermediate operating points that trade off between the two. Our
algorithm enables interpolation between these objectives, which was not
previously possible. We corroborate our theoretical analysis with numerical
not previously possible. We corroborate our theoretical analysis with numerical
results, showing that on the Wiki-500K benchmark dataset our method can reduce
a proxy for expected latency by up to 28% while maintaining the same accuracy
as Parabel. On several datasets derived from e-commerce customer logs, our
modified label tree is able to improve this expected latency metric by up to
20% while maintaining the same accuracy. Finally, we discuss challenges in
realizing these latency improvements in deployed models.
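To make the trade-off concrete, here is a minimal toy sketch (not the paper's algorithm; the tree builder, the scoring function, and all names are illustrative assumptions) of beam-search inference over a label tree, where the count of node evaluations serves as a simple proxy for expected latency. A shallower, wider tree needs fewer levels but more evaluations per level; a deeper, narrower tree does the opposite, which is the structural knob the paper studies.

```python
# Toy sketch: a balanced label tree with beam-search inference.
# Illustrative only -- not the paper's construction algorithm.
import math

class Node:
    def __init__(self, labels):
        self.labels = labels      # leaf labels reachable from this node
        self.children = []

def build_tree(labels, branching):
    """Recursively split labels into `branching` balanced groups."""
    root = Node(labels)
    if len(labels) <= branching:
        return root               # small enough to be a leaf
    size = math.ceil(len(labels) / branching)
    for i in range(0, len(labels), size):
        root.children.append(build_tree(labels[i:i + size], branching))
    return root

def beam_search(root, score, beam_width=2, top_k=5):
    """Descend the tree keeping the `beam_width` best nodes per level.

    Returns predicted labels and the number of child evaluations,
    a simple proxy for expected inference latency.
    """
    frontier, leaves, evals = [root], [], 0
    while frontier:
        scored = []
        for node in frontier:
            for child in node.children:
                evals += 1        # one classifier call per visited child
                scored.append((score(child), child))
        scored.sort(key=lambda t: t[0], reverse=True)
        frontier = []
        for _, node in scored[:beam_width]:
            (leaves if not node.children else frontier).append(node)
    preds = [l for leaf in leaves for l in leaf.labels][:top_k]
    return preds, evals

# Example: 64 labels, branching factor 4, a stand-in scorer that
# prefers subtrees containing low label ids.
tree = build_tree(list(range(64)), 4)
preds, evals = beam_search(tree, lambda n: -min(n.labels))
```

With 64 labels and branching factor 4, the beam visits 4 children at the first level and 8 at the second (12 evaluations total), versus 64 for a flat one-versus-all pass; changing the branching factor shifts this count, which is the kind of latency proxy the abstract refers to.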
Related papers
- Learning with Noisy Labels: Interconnection of Two
Expectation-Maximizations [41.65589788264123]
Labor-intensive labeling becomes a bottleneck in developing computer vision algorithms based on deep learning.
We address learning with noisy labels (LNL) problem, which is formalized as a task of finding a structured manifold in the midst of noisy data.
Our algorithm achieves state-of-the-art performance in multiple standard benchmarks with substantial margins under various types of label noise.
(arXiv 2024-01-09)
- Improving Text Matching in E-Commerce Search with A Rationalizable,
Intervenable and Fast Entity-Based Relevance Model [78.80174696043021]
We propose a novel model called the Entity-Based Relevance Model (EBRM).
The decomposition allows us to use a Cross-encoder QE relevance module for high accuracy.
We also show that pretraining the QE module with auto-generated QE data from user logs can further improve the overall performance.
(arXiv 2023-07-01)
- LESS: Label-Efficient Semantic Segmentation for LiDAR Point Clouds [62.49198183539889]
We propose a label-efficient semantic segmentation pipeline for outdoor scenes with LiDAR point clouds.
Our method co-designs an efficient labeling process with semi/weakly supervised learning.
Our proposed method is even highly competitive compared to the fully supervised counterpart with 100% labels.
(arXiv 2022-10-14)
- Effective Token Graph Modeling using a Novel Labeling Strategy for
Structured Sentiment Analysis [39.770652220521384]
The state-of-the-art model for structured sentiment analysis casts the task as a dependency parsing problem.
Label proportions for span prediction and span relation prediction are imbalanced.
Two nodes in a dependency graph cannot be connected by multiple arcs, so some overlapping sentiments cannot be recognized.
(arXiv 2022-03-21)
- Towards Good Practices for Efficiently Annotating Large-Scale Image
Classification Datasets [90.61266099147053]
We investigate efficient annotation strategies for collecting multi-class classification labels for a large collection of images.
We propose modifications and best practices aimed at minimizing human labeling effort.
Simulated experiments on a 125k-image subset of ImageNet100 show that it can be annotated to 80% top-1 accuracy with 0.35 annotations per image on average.
(arXiv 2021-04-26)
- Robust Optimal Classification Trees under Noisy Labels [1.5039745292757671]
We propose a novel methodology to construct Optimal Classification Trees that takes into account that noisy labels may occur in the training sample.
Our approach rests on two main elements: (1) the splitting rules for the classification trees are designed to maximize the separation margin between classes, applying the SVM paradigm; and (2) some labels of the training sample are allowed to change during tree construction, in an attempt to detect label noise.
(arXiv 2020-12-15)
- SLADE: A Self-Training Framework For Distance Metric Learning [75.54078592084217]
We present a self-training framework, SLADE, to improve retrieval performance by leveraging additional unlabeled data.
We first train a teacher model on the labeled data and use it to generate pseudo labels for the unlabeled data.
We then train a student model on both the real and pseudo labels to generate the final feature embeddings.
(arXiv 2020-11-20)
- Probabilistic Label Trees for Extreme Multi-label Classification [8.347190888362194]
Problems of extreme multi-label classification (XMLC) are efficiently handled by organizing labels as a tree.
Probabilistic label trees (PLTs) can be treated as a generalization of hierarchical softmax for multi-label problems.
We introduce the model and discuss training and inference procedures and their computational costs.
We prove a specific equivalence between the fully online algorithm and an algorithm with a tree structure given in advance.
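The PLT idea summarized above can be sketched in a few lines (an illustrative toy, not the paper's implementation): the probability of a label given an input is the product of conditional probabilities along its root-to-leaf path, which is what makes inference logarithmic in the label count.

```python
# Illustrative sketch of the PLT chain rule:
# P(label | x) = product over the root-to-leaf path of P(node | parent, x).
# The path probabilities below are toy values, not a trained model's output.

def plt_label_probability(path_probs):
    """Multiply the conditional probabilities along a root-to-leaf path."""
    p = 1.0
    for q in path_probs:
        p *= q
    return p

# Example: a label whose path has conditionals 0.9, 0.8, 0.5
p = plt_label_probability([0.9, 0.8, 0.5])
print(round(p, 2))  # 0.36
```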
(arXiv 2020-09-23)
- Improving Semantic Segmentation via Self-Training [75.07114899941095]
We show that we can obtain state-of-the-art results using a semi-supervised approach, specifically a self-training paradigm.
We first train a teacher model on labeled data, and then generate pseudo labels on a large set of unlabeled data.
Our robust training framework can digest human-annotated and pseudo labels jointly and achieve top performances on Cityscapes, CamVid and KITTI datasets.
(arXiv 2020-04-30)
- GraftNet: An Engineering Implementation of CNN for Fine-grained
Multi-label Task [17.885793498743723]
GraftNet is a customizable tree-like network with its trunk pretrained with a dynamic graph for generic feature extraction.
We show that it has good performance on our human attributes recognition task, which is fine-grained multi-label classification.
(arXiv 2020-04-27)
This list is automatically generated from the titles and abstracts of the papers in this site.