Adopting the Multi-answer Questioning Task with an Auxiliary Metric for
Extreme Multi-label Text Classification Utilizing the Label Hierarchy
- URL: http://arxiv.org/abs/2303.01064v1
- Date: Thu, 2 Mar 2023 08:40:31 GMT
- Title: Adopting the Multi-answer Questioning Task with an Auxiliary Metric for
Extreme Multi-label Text Classification Utilizing the Label Hierarchy
- Authors: Li Wang, Ying Wah Teh, Mohammed Ali Al-Garadi
- Abstract summary: This paper adopts the multi-answer questioning task for extreme multi-label classification.
This study applies the proposed method and the evaluation metric to the legal domain.
- Score: 10.87653109398961
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Extreme multi-label text classification utilizes the label hierarchy to
partition extreme labels into multiple label groups, turning the task into
simple multi-group multi-label classification tasks. Current research encodes
labels as a fixed-length vector, which requires establishing multiple classifiers
for different label groups. The problem is how to build only one classifier
without sacrificing the label relationships in the hierarchy. This paper adopts
the multi-answer questioning task for extreme multi-label classification and
also proposes an auxiliary classification evaluation metric. This study
applies the proposed method and the evaluation metric to the legal domain. The
utilization of legal BERT models and the study on task distribution are discussed.
The experimental results show that the proposed hierarchy and multi-answer
questioning task can perform extreme multi-label classification on the EURLEX dataset.
In minor/fine-tuning for the multi-label classification task, the domain-adapted
BERT models did not show apparent advantages in this experiment. The
method is also theoretically applicable to zero-shot learning.
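As a rough illustration of the idea in the abstract, the sketch below recasts multi-group multi-label classification as multi-answer questions, so that a single binary scorer over (text, question, candidate answer) triples can replace one classifier per label group. The hierarchy, label names, question template, and the `build_qa_examples` and `decode` helpers are all assumptions made for this example; the abstract does not specify the paper's actual input format or model.

```python
from typing import Dict, List, Set, Tuple

# Hypothetical two-level hierarchy: label group -> labels in that group.
HIERARCHY: Dict[str, List[str]] = {
    "agriculture": ["fisheries", "crop production"],
    "trade": ["tariffs", "import quotas"],
}

# (text, question, candidate answer, binary target)
Example = Tuple[str, str, str, int]


def build_qa_examples(text: str, gold: Set[str],
                      hierarchy: Dict[str, List[str]]) -> List[Example]:
    """Expand one document into one binary example per (group question, label)."""
    examples: List[Example] = []
    for group, labels in hierarchy.items():
        # Assumed question template; the group name carries the hierarchy information.
        question = f"Which {group} labels apply to this document?"
        for label in labels:
            examples.append((text, question, label, int(label in gold)))
    return examples


def decode(scores: Dict[Tuple[str, str], float], threshold: float = 0.5) -> Set[str]:
    """Keep every candidate answer whose score clears the threshold (multi-answer)."""
    return {answer for (_question, answer), score in scores.items()
            if score >= threshold}


examples = build_qa_examples(
    "Regulation on fishing quotas and customs duties.",
    {"fisheries", "tariffs"},
    HIERARCHY,
)
# Stand-in for a trained classifier: the gold targets are reused as scores.
scores = {(q, a): float(t) for _text, q, a, t in examples}
predicted = decode(scores)
```

Because every label is scored through the same question-answer interface, a label unseen during training could in principle be scored by phrasing it as a new candidate answer, which is one way to read the abstract's remark about zero-shot learning.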
Related papers
- Active Generalized Category Discovery [60.69060965936214]
Generalized Category Discovery (GCD) endeavors to cluster unlabeled samples from both novel and old classes.
We take the spirit of active learning and propose a new setting called Active Generalized Category Discovery (AGCD).
Our method achieves state-of-the-art performance on both generic and fine-grained datasets.
arXiv Detail & Related papers (2024-03-07T07:12:24Z)
- Multi-Label Knowledge Distillation [86.03990467785312]
We propose a novel multi-label knowledge distillation method.
On one hand, it exploits the informative semantic knowledge from the logits by dividing the multi-label learning problem into a set of binary classification problems.
On the other hand, it enhances the distinctiveness of the learned feature representations by leveraging the structural information of label-wise embeddings.
arXiv Detail & Related papers (2023-08-12T03:19:08Z)
- Towards Imbalanced Large Scale Multi-label Classification with Partially
Annotated Labels [8.977819892091]
Multi-label classification is a widely encountered problem in daily life, where an instance can be associated with multiple classes.
In this work, we address the issue of label imbalance and investigate how to train neural networks using partial labels.
arXiv Detail & Related papers (2023-07-31T21:50:48Z)
- Learning from Stochastic Labels [8.178975818137937]
Annotating multi-class instances is a crucial task in the field of machine learning.
In this paper, we propose a novel approach to learning from stochastic labels.
arXiv Detail & Related papers (2023-02-01T08:04:27Z)
- Multi-Instance Partial-Label Learning: Towards Exploiting Dual Inexact
Supervision [53.530957567507365]
In some real-world tasks, each training sample is associated with a candidate label set that contains one ground-truth label and some false positive labels.
In this paper, we formalize such problems as multi-instance partial-label learning (MIPL).
Existing multi-instance learning algorithms and partial-label learning algorithms are suboptimal for solving MIPL problems.
arXiv Detail & Related papers (2022-12-18T03:28:51Z)
- TagRec++: Hierarchical Label Aware Attention Network for Question
Categorization [0.3683202928838613]
Online learning systems organize content according to a well-defined hierarchical taxonomy.
The task of categorizing inputs to the hierarchical labels is usually cast as a flat multi-class classification problem.
We formulate the task as a dense retrieval problem to retrieve the appropriate hierarchical labels for each content.
arXiv Detail & Related papers (2022-08-10T05:08:37Z)
- Expert Knowledge-Guided Length-Variant Hierarchical Label Generation for
Proposal Classification [21.190465278587045]
Proposal classification aims to classify a proposal into a length-variant sequence of labels.
We develop a new deep proposal classification framework to jointly model the three features.
Our model can automatically identify the best label-sequence length at which to stop predicting the next label.
arXiv Detail & Related papers (2021-09-14T13:09:28Z)
- MATCH: Metadata-Aware Text Classification in A Large Hierarchy [60.59183151617578]
MATCH is an end-to-end framework that leverages both metadata and hierarchy information.
We propose different ways to regularize the parameters and output probability of each child label by its parents.
Experiments on two massive text datasets with large-scale label hierarchies demonstrate the effectiveness of MATCH.
arXiv Detail & Related papers (2021-02-15T05:23:08Z)
- Multilabel Classification by Hierarchical Partitioning and
Data-dependent Grouping [33.48217977134427]
We exploit the sparsity of label vectors and the hierarchical structure to embed them in low-dimensional space.
We present a novel data-dependent grouping approach, where we use a group construction based on a low-rank Nonnegative Matrix Factorization.
We then present a hierarchical partitioning approach that exploits the label hierarchy in large scale problems to divide up the large label space and create smaller sub-problems.
arXiv Detail & Related papers (2020-06-24T22:23:39Z)
- Interaction Matching for Long-Tail Multi-Label Classification [57.262792333593644]
We present an elegant and effective approach for addressing limitations in existing multi-label classification models.
By performing soft n-gram interaction matching, we match labels with natural language descriptions.
arXiv Detail & Related papers (2020-05-18T15:27:55Z)
- Unsupervised Person Re-identification via Multi-label Classification [55.65870468861157]
This paper formulates unsupervised person ReID as a multi-label classification task to progressively seek true labels.
Our method starts by assigning each person image a single-class label, then evolves to multi-label classification by leveraging the updated ReID model for label prediction.
To boost the ReID model training efficiency in multi-label classification, we propose the memory-based multi-label classification loss (MMCL).
arXiv Detail & Related papers (2020-04-20T12:13:43Z)
This list is automatically generated from the titles and abstracts of the papers on this site.