Investigating Class-level Difficulty Factors in Multi-label
Classification Problems
- URL: http://arxiv.org/abs/2005.00430v1
- Date: Fri, 1 May 2020 15:06:53 GMT
- Title: Investigating Class-level Difficulty Factors in Multi-label
Classification Problems
- Authors: Mark Marsden, Kevin McGuinness, Joseph Antony, Haolin Wei, Milan
Redzic, Jian Tang, Zhilan Hu, Alan Smeaton, Noel E O'Connor
- Abstract summary: This work investigates the use of class-level difficulty factors in multi-label classification problems for the first time.
Four difficulty factors are proposed: frequency, visual variation, semantic abstraction, and class co-occurrence.
These difficulty factors are shown to have several potential applications including the prediction of class-level performance across datasets.
- Score: 23.51529285126783
- License: http://creativecommons.org/publicdomain/zero/1.0/
- Abstract: This work investigates the use of class-level difficulty factors in
multi-label classification problems for the first time. Four class-level
difficulty factors are proposed: frequency, visual variation, semantic
abstraction, and class co-occurrence. Once computed for a given multi-label
classification dataset, these difficulty factors are shown to have several
potential applications including the prediction of class-level performance
across datasets and the improvement of predictive performance through
difficulty weighted optimisation. Significant improvements to mAP and AUC
performance are observed for two challenging multi-label datasets (WWW Crowd
and Visual Genome) with the inclusion of difficulty weighted optimisation. The
proposed technique adds no computational cost during training or inference and can be extended over time with the inclusion of other class-level difficulty factors.
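The frequency factor and the difficulty-weighted optimisation described in the abstract can be illustrated with a minimal sketch. The weighting scheme below (inverse-frequency weights normalised around 1, applied to a binary cross-entropy loss) is a hypothetical simplification for illustration, not the paper's exact formulation, and the function names are assumptions.

```python
import numpy as np

def difficulty_weights(labels, alpha=1.0):
    """Per-class weights from label frequency: rarer classes get larger weights.

    `labels` is an (N, C) binary label matrix. This inverse-frequency scheme is
    a hypothetical illustration, not the paper's exact difficulty factors.
    """
    freq = labels.mean(axis=0)                       # class frequency in [0, 1]
    weights = (1.0 / np.maximum(freq, 1e-6)) ** alpha
    return weights / weights.mean()                  # normalise around 1

def weighted_bce(probs, targets, weights):
    """Binary cross-entropy with per-class difficulty weights applied."""
    eps = 1e-7
    probs = np.clip(probs, eps, 1 - eps)
    loss = -(targets * np.log(probs) + (1 - targets) * np.log(1 - probs))
    return (loss * weights).mean()
```

Because the weights are computed once per dataset and folded into the loss, this kind of weighting adds no cost at inference time, consistent with the abstract's claim.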
Related papers
- Exploring Contrastive Learning for Long-Tailed Multi-Label Text Classification [48.81069245141415]
We introduce a novel contrastive loss function for multi-label text classification.
It attains Micro-F1 scores that either match or surpass those obtained with other frequently employed loss functions.
It demonstrates a significant improvement in Macro-F1 scores across three multi-label datasets.
arXiv Detail & Related papers (2024-04-12T11:12:16Z)
- Balanced Classification: A Unified Framework for Long-Tailed Object Detection [74.94216414011326]
Conventional detectors suffer from performance degradation when dealing with long-tailed data due to a classification bias towards the majority head categories.
We introduce a unified framework called BAlanced CLassification (BACL), which enables adaptive rectification of inequalities caused by disparities in category distribution.
BACL consistently achieves performance improvements across various datasets with different backbones and architectures.
arXiv Detail & Related papers (2023-08-04T09:11:07Z)
- Learning Prompt-Enhanced Context Features for Weakly-Supervised Video Anomaly Detection [37.99031842449251]
Video anomaly detection under weak supervision presents significant challenges.
We present a weakly supervised anomaly detection framework that focuses on efficient context modeling and enhanced semantic discriminability.
Our approach significantly improves the detection accuracy of certain anomaly sub-classes, underscoring its practical value and efficacy.
arXiv Detail & Related papers (2023-06-26T06:45:16Z)
- Difficulty-Net: Learning to Predict Difficulty for Long-Tailed Recognition [5.977483447975081]
We propose Difficulty-Net, which learns to predict the difficulty of classes using the model's performance in a meta-learning framework.
We introduce two key concepts, namely the relative difficulty and the driver loss.
Experiments on popular long-tailed datasets demonstrated the effectiveness of the proposed method.
arXiv Detail & Related papers (2022-09-07T07:04:08Z)
- PercentMatch: Percentile-based Dynamic Thresholding for Multi-Label Semi-Supervised Classification [64.39761523935613]
We propose a percentile-based threshold adjusting scheme that dynamically alters the score thresholds of positive and negative pseudo-labels for each class during training.
We achieve strong performance on Pascal VOC2007 and MS-COCO datasets when compared to recent SSL methods.
arXiv Detail & Related papers (2022-08-30T01:27:48Z)
- Let the Model Decide its Curriculum for Multitask Learning [22.043291547405545]
We propose two classes of techniques to arrange training instances into a learning curriculum based on difficulty scores computed via model-based approaches.
We show that instance-level and dataset-level techniques result in strong representations as they lead to an average performance improvement of 4.17% and 3.15% over their respective baselines.
arXiv Detail & Related papers (2022-05-19T23:34:22Z)
- Multitask Learning for Class-Imbalanced Discourse Classification [74.41900374452472]
We show that a multitask approach can improve the Micro F1-score by 7% over current state-of-the-art benchmarks.
We also offer a comparative review of additional techniques proposed to address resource-poor problems in NLP.
arXiv Detail & Related papers (2021-01-02T07:13:41Z)
- Theoretical Insights Into Multiclass Classification: A High-dimensional Asymptotic View [82.80085730891126]
We provide the first modern, precise analysis of linear multiclass classification.
Our analysis reveals that the classification accuracy is highly distribution-dependent.
The insights gained may pave the way for a precise understanding of other classification algorithms.
arXiv Detail & Related papers (2020-11-16T05:17:29Z)
- Coherent Hierarchical Multi-Label Classification Networks [56.41950277906307]
C-HMCNN(h) is a novel approach for hierarchical multi-label classification (HMC) problems, which exploits hierarchy information to produce predictions coherent with the hierarchy constraint and to improve performance.
We conduct an extensive experimental analysis showing the superior performance of C-HMCNN(h) when compared to state-of-the-art models.
arXiv Detail & Related papers (2020-10-20T09:37:02Z)
- Combined Cleaning and Resampling Algorithm for Multi-Class Imbalanced Data with Label Noise [11.868507571027626]
In this paper, we propose a novel oversampling technique, a Multi-Class Combined Cleaning and Resampling algorithm.
The proposed method uses an energy-based approach to model the regions suitable for oversampling, making it less affected by small disjuncts and outliers than SMOTE.
It combines this with a simultaneous cleaning operation that aims to reduce the effect of overlapping class distributions on the performance of the learning algorithms.
arXiv Detail & Related papers (2020-04-07T13:59:35Z)
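The percentile-based dynamic thresholding idea from the PercentMatch entry above can be sketched in a few lines. This is a simplified illustration under the assumption of per-class sigmoid scores on a batch of unlabelled data, not the PercentMatch implementation; the function names are hypothetical.

```python
import numpy as np

def percentile_thresholds(scores, positive_pct=90.0):
    """Per-class score thresholds at a given percentile of the batch scores.

    `scores` is an (N, C) array of sigmoid outputs on unlabelled data. Taking a
    percentile per class (one threshold per column) lets the threshold adapt to
    each class's score distribution instead of using a fixed global cutoff.
    """
    return np.percentile(scores, positive_pct, axis=0)

def pseudo_labels(scores, thresholds):
    """Assign positive pseudo-labels where scores exceed the class threshold."""
    return (scores > thresholds).astype(int)
```

Recomputing the thresholds every batch (or epoch) gives the "dynamic" behaviour: as the model's score distribution shifts during training, the per-class cutoffs move with it.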
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information listed and is not responsible for any consequences of its use.