A Survey on Extreme Multi-label Learning
- URL: http://arxiv.org/abs/2210.03968v1
- Date: Sat, 8 Oct 2022 08:31:34 GMT
- Title: A Survey on Extreme Multi-label Learning
- Authors: Tong Wei, Zhen Mao, Jiang-Xin Shi, Yu-Feng Li, Min-Ling Zhang
- Abstract summary: Multi-label learning has attracted significant attention from both academia and industry in recent decades.
It is infeasible to directly adapt existing multi-label learning algorithms to extremely large label spaces because of the compute and memory overhead.
eXtreme Multi-label Learning (XML) is becoming an important task, and many effective approaches have been proposed.
- Score: 72.8751573611815
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Multi-label learning has attracted significant attention from both academia
and industry in recent decades. Although existing multi-label learning
algorithms have achieved good performance in various tasks, they implicitly assume
that the size of the target label space is not huge, which can be restrictive for
real-world scenarios. Moreover, it is infeasible to directly adapt them to
extremely large label spaces because of the compute and memory overhead.
Therefore, eXtreme Multi-label Learning (XML) is becoming an important task, and
many effective approaches have been proposed. To fully understand XML, we conduct a
survey study in this paper. We first clarify a formal definition for XML from
the perspective of supervised learning. Then, based on different model
architectures and challenges of the problem, we provide a thorough discussion
of the advantages and disadvantages of each category of methods. For the
benefit of conducting empirical studies, we collect abundant resources
regarding XML, including code implementations and useful tools. Lastly, we
propose possible research directions in XML, such as new evaluation metrics,
the tail label problem, and weakly supervised XML.
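The memory overhead mentioned in the abstract can be made concrete with a rough back-of-the-envelope sketch. It assumes a dense one-vs-all linear model that stores one float32 weight vector per label; the label and feature counts below are hypothetical values chosen for illustration and do not come from the survey, and `dense_ova_memory_bytes` is a helper name introduced here.

```python
def dense_ova_memory_bytes(num_labels: int, num_features: int,
                           bytes_per_weight: int = 4) -> int:
    """Bytes needed to store one dense weight vector per label (one-vs-all)."""
    return num_labels * num_features * bytes_per_weight


# Hypothetical sizes, for illustration only (not taken from the survey).
small = dense_ova_memory_bytes(num_labels=100, num_features=100_000)
extreme = dense_ova_memory_bytes(num_labels=1_000_000, num_features=100_000)

print(f"100 labels:       {small / 1e9:.2f} GB")
print(f"1,000,000 labels: {extreme / 1e9:.2f} GB")
```

Under these assumptions, storage grows linearly with the number of labels, so a model that fits comfortably in memory for a hundred labels needs hundreds of gigabytes at a million labels.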
Related papers
- ICXML: An In-Context Learning Framework for Zero-Shot Extreme Multi-Label Classification [22.825115483590285]
This paper focuses on the task of Extreme Multi-Label Classification (XMC), whose goal is to predict multiple labels for each instance from an extremely large label space.
We introduce In-Context Extreme Multilabel Learning (ICXML), a two-stage framework that cuts down the search space by generating a set of candidate labels through in-context learning and then reranks them.
arXiv Detail & Related papers (2023-11-16T08:01:17Z)
- Multi-Label Knowledge Distillation [86.03990467785312]
We propose a novel multi-label knowledge distillation method.
On the one hand, it exploits the informative semantic knowledge from the logits by dividing the multi-label learning problem into a set of binary classification problems.
On the other hand, it enhances the distinctiveness of the learned feature representations by leveraging the structural information of label-wise embeddings.
arXiv Detail & Related papers (2023-08-12T03:19:08Z)
- A Survey of Label-Efficient Deep Learning for 3D Point Clouds [109.07889215814589]
This paper presents the first comprehensive survey of label-efficient learning of point clouds.
We propose a taxonomy that organizes label-efficient learning methods based on the data prerequisites provided by different types of labels.
For each approach, we outline the problem setup and provide an extensive literature review that showcases relevant progress and challenges.
arXiv Detail & Related papers (2023-05-31T12:54:51Z)
- Light-weight Deep Extreme Multilabel Classification [12.29534534973133]
Extreme multi-label (XML) classification refers to the task of supervised multi-label learning that involves a large number of labels.
We develop a method called LightDXML, which modifies the recently developed deep-learning-based XML framework by using label embeddings.
LightDXML also removes the requirement of a re-ranker module, thereby leading to further savings in time and memory requirements.
arXiv Detail & Related papers (2023-04-20T09:06:10Z)
- A Multi-label Continual Learning Framework to Scale Deep Learning Approaches for Packaging Equipment Monitoring [57.5099555438223]
We study multi-label classification in the continual scenario for the first time.
We propose an efficient approach that has a logarithmic complexity with regard to the number of tasks.
We validate our approach on a real-world multi-label forecasting problem from the packaging industry.
arXiv Detail & Related papers (2022-08-08T15:58:39Z)
- Large Loss Matters in Weakly Supervised Multi-Label Classification [50.262533546999045]
We first regard unobserved labels as negative labels, casting the weakly supervised multi-label task into noisy multi-label classification.
We propose novel methods for weakly supervised multi-label classification that reject or correct the large-loss samples to prevent the model from memorizing the noisy labels.
Our methodology works well in practice, validating that treating large losses properly matters in weakly supervised multi-label classification.
arXiv Detail & Related papers (2022-06-08T08:30:24Z)
- DeepXML: A Deep Extreme Multi-Label Learning Framework Applied to Short Text Documents [10.573976360424473]
This paper develops the DeepXML framework that addresses the challenges by decomposing the deep extreme multi-label task into four simpler sub-tasks, each of which can be trained accurately and efficiently.
DeepXML yields the Astec algorithm that could be 2-12% more accurate and 5-30x faster to train than leading deep extreme classifiers on publicly available short text datasets.
Astec could also efficiently train on Bing short text datasets containing up to 62 million labels while making predictions for billions of users and data points per day on commodity hardware.
arXiv Detail & Related papers (2021-11-12T12:25:23Z)
- Propensity-scored Probabilistic Label Trees [3.764094942736144]
We introduce an inference procedure, based on the $A^*$-search algorithm, that efficiently finds the optimal solution for XMLC problems.
We demonstrate the attractiveness of this approach in a wide empirical study on popular XMLC benchmark datasets.
arXiv Detail & Related papers (2021-10-20T22:10:20Z)
- The Emerging Trends of Multi-Label Learning [45.63795570392158]
Exabytes of data are generated daily by humans, leading to the growing need for new efforts in dealing with the grand challenges for multi-label learning brought by big data.
There is a lack of systemic studies that focus explicitly on analyzing the emerging trends and new challenges of multi-label learning in the era of big data.
It is imperative to call for a comprehensive survey to fulfill this mission and delineate future research directions and new applications.
arXiv Detail & Related papers (2020-11-23T03:36:00Z)
This list is automatically generated from the titles and abstracts of the papers on this site.