Hierarchy-Consistent Learning and Adaptive Loss Balancing for Hierarchical Multi-Label Classification
- URL: http://arxiv.org/abs/2508.13452v1
- Date: Tue, 19 Aug 2025 02:15:41 GMT
- Title: Hierarchy-Consistent Learning and Adaptive Loss Balancing for Hierarchical Multi-Label Classification
- Authors: Ruobing Jiang, Mengzhe Liu, Haobing Liu, Yanwei Yu
- Abstract summary: HMC faces challenges in maintaining structural consistency and balancing loss weighting in Multi-Task Learning. We propose a classifier called HCAL based on MTL integrated with prototype contrastive learning and adaptive task-weighting mechanisms.
- Score: 8.889313669713918
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Hierarchical Multi-Label Classification (HMC) faces critical challenges in maintaining structural consistency and balancing loss weighting in Multi-Task Learning (MTL). To address these issues, we propose a classifier called HCAL based on MTL integrated with prototype contrastive learning and adaptive task-weighting mechanisms. The most significant advantage of our classifier is semantic consistency: prototypes explicitly model each label, and features are aggregated from child classes to parent classes. The other important advantage is an adaptive loss-weighting mechanism that dynamically allocates optimization resources by monitoring task-specific convergence rates, effectively resolving the "one-strong-many-weak" optimization bias inherent in traditional MTL approaches. To further enhance robustness, a prototype perturbation mechanism injects controlled noise into prototypes to expand decision boundaries. Additionally, we formalize a quantitative metric called the Hierarchical Violation Rate (HVR) to evaluate hierarchical consistency and generalization. Extensive experiments across three datasets demonstrate that the proposed classifier achieves both higher classification accuracy and a lower hierarchical violation rate than baseline models.
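The abstract names two concrete mechanisms that a short sketch can make tangible: convergence-rate-based loss weighting and the HVR metric. The following Python (PyTorch) sketch is illustrative only; the function names, the loss-ratio convergence estimate, and the reading of HVR as the fraction of positive child predictions whose parent is predicted negative are assumptions of this summary, not details taken from the paper.

```python
import torch

def hierarchical_violation_rate(preds, parent):
    """Assumed HVR: fraction of positive child predictions whose parent
    label is predicted negative. `preds` is a (batch, num_labels) 0/1
    tensor; `parent[i]` is the parent index of label i, or -1 for roots."""
    violations, total = 0, 0
    for child, par in enumerate(parent):
        if par < 0:
            continue                          # root labels have no parent
        child_pos = preds[:, child].bool()
        total += int(child_pos.sum())
        violations += int((child_pos & ~preds[:, par].bool()).sum())
    return violations / max(total, 1)

def adaptive_task_weights(prev_losses, curr_losses, eps=1e-8):
    """Toy convergence-rate weighting: a task whose loss ratio stays near 1
    is converging slowly and receives more weight, countering the
    'one-strong-many-weak' bias the abstract describes. Both arguments are
    per-task loss tensors from consecutive epochs."""
    rate = curr_losses / (prev_losses + eps)  # per-task convergence rate
    return rate / rate.sum()                  # normalize to task weights
```

A training loop built this way would combine per-level losses as `(weights * level_losses).sum()` and report HVR alongside accuracy at evaluation time; the actual HCAL weighting rule and HVR definition may differ from this sketch.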
Related papers
- Reasoning-Driven Multimodal LLM for Domain Generalization [72.00754603114187]
We study the role of reasoning in domain generalization using the DomainBed-Reasoning dataset. We propose RD-MLDG, a framework with two components: MTCT (Multi-Task Cross-Training) and SARR (Self-Aligned Reasoning Regularization). Experiments on standard DomainBed datasets demonstrate that RD-MLDG achieves complementary state-of-the-art performance.
arXiv Detail & Related papers (2026-02-27T08:10:06Z)
- Multiscale Aggregated Hierarchical Attention (MAHA): A Game Theoretic and Optimization Driven Approach to Efficient Contextual Modeling in Large Language Models [0.0]
Multiscale Aggregated Hierarchical Attention (MAHA) is a novel architectural framework that reformulates the attention mechanism through hierarchical decomposition and mathematically rigorous aggregation. MAHA dynamically partitions the input sequence into hierarchical scales via learnable downsampling operators. Experimental evaluations demonstrate that MAHA achieves superior scalability; empirical FLOPs analysis confirms an 81% reduction in computational cost at a sequence length of 4096 compared to standard attention.
arXiv Detail & Related papers (2025-12-16T21:27:21Z)
- An Integrated Fusion Framework for Ensemble Learning Leveraging Gradient Boosting and Fuzzy Rule-Based Models [59.13182819190547]
Fuzzy rule-based models excel in interpretability and have seen widespread application across diverse fields. They face challenges such as complex design specifications and scalability issues with large datasets. This paper proposes an Integrated Fusion Framework that merges the strengths of both paradigms to enhance model performance and interpretability.
arXiv Detail & Related papers (2025-11-11T10:28:23Z)
- NDCG-Consistent Softmax Approximation with Accelerated Convergence [67.10365329542365]
We propose novel loss formulations that align directly with ranking metrics. We integrate the proposed RG losses with the highly efficient Alternating Least Squares (ALS) optimization method. Empirical evaluations on real-world datasets demonstrate that our approach achieves comparable or superior ranking performance.
arXiv Detail & Related papers (2025-06-11T06:59:17Z)
- SpecRouter: Adaptive Routing for Multi-Level Speculative Decoding in Large Language Models [21.933379266533098]
Large Language Models (LLMs) present a critical trade-off between inference quality and computational cost. Existing serving strategies often employ fixed model scales or static two-stage speculative decoding. This paper introduces SpecRouter, a novel framework that reimagines LLM inference as an adaptive routing problem.
arXiv Detail & Related papers (2025-05-12T15:46:28Z)
- Unbiased Max-Min Embedding Classification for Transductive Few-Shot Learning: Clustering and Classification Are All You Need [83.10178754323955]
Few-shot learning enables models to generalize from only a few labeled examples. We propose the Unbiased Max-Min Embedding Classification (UMMEC) method, which addresses the key challenges in few-shot learning. Our method significantly improves classification performance with minimal labeled data, advancing the state-of-the-art in transductive few-shot learning.
arXiv Detail & Related papers (2025-03-28T07:23:07Z)
- Rethinking Multimodal Learning from the Perspective of Mitigating Classification Ability Disproportion [6.749782429802639]
Multimodal learning is significantly constrained by modality imbalance. We propose a novel approach to balance the classification ability of weak and strong modalities by incorporating the principle of boosting.
arXiv Detail & Related papers (2025-02-27T14:12:20Z)
- Salvaging the Overlooked: Leveraging Class-Aware Contrastive Learning for Multi-Class Anomaly Detection [18.797864512898787]
In anomaly detection, early approaches often train separate models for individual classes, yielding high performance but posing challenges in scalability and resource management. We investigate the performance degradation observed in reconstruction-based methods and identify the key issue: inter-class confusion. This confusion emerges when a model trained in multi-class scenarios incorrectly reconstructs samples from one class as those of another, thereby exacerbating reconstruction errors. By explicitly leveraging raw object category information (e.g., carpet or wood), we introduce local CL to refine multiscale dense features and global CL to obtain more compact feature representations of normal patterns, thereby effectively adapting the models to multi-class anomaly detection.
arXiv Detail & Related papers (2024-12-06T04:31:09Z)
- Deep Matrix Factorization with Adaptive Weights for Multi-View Clustering [0.6037276428689637]
We introduce a novel Deep Matrix Factorization with Adaptive Weights for Multi-View Clustering (DMFAW) method. Our method simultaneously incorporates feature selection and generates local partitions, enhancing clustering results. Experiments on benchmark datasets highlight that DMFAW outperforms state-of-the-art methods in terms of clustering performance.
arXiv Detail & Related papers (2024-12-03T09:08:27Z)
- Dynamic Correlation Learning and Regularization for Multi-Label Confidence Calibration [60.95748658638956]
This paper introduces the Multi-Label Confidence Calibration task, aiming to provide well-calibrated confidence scores in multi-label scenarios.
Existing single-label calibration methods fail to account for category correlations, which are crucial for addressing semantic confusion.
We propose the Dynamic Correlation Learning and Regularization algorithm, which leverages multi-grained semantic correlations to better model semantic confusion.
arXiv Detail & Related papers (2024-07-09T13:26:21Z)
- Toward Multi-class Anomaly Detection: Exploring Class-aware Unified Model against Inter-class Interference [67.36605226797887]
We introduce a Multi-class Implicit Neural representation Transformer for unified Anomaly Detection (MINT-AD).
By learning the multi-class distributions, the model generates class-aware query embeddings for the transformer decoder.
MINT-AD can project category and position information into a feature embedding space, further supervised by classification and prior probability loss functions.
arXiv Detail & Related papers (2024-03-21T08:08:31Z)
- A Robust Twin Parametric Margin Support Vector Machine for Multiclass Classification [0.0]
We introduce novel Twin Parametric Margin Support Vector Machine (TPMSVM) models designed to address multiclass classification tasks under feature uncertainty. To handle data perturbations, we construct a bounded-by-norm uncertainty set around each training observation and derive the robust counterparts of the deterministic models. We validate the effectiveness of the proposed robust multiclass TPMSVM methodology on real-world datasets.
arXiv Detail & Related papers (2023-06-09T19:27:24Z)
- Coherent Hierarchical Multi-Label Classification Networks [56.41950277906307]
C-HMCNN(h) is a novel approach for HMC problems, which exploits hierarchy information in order to produce predictions coherent with the hierarchy constraint and improve performance.
We conduct an extensive experimental analysis showing the superior performance of C-HMCNN(h) when compared to state-of-the-art models.
arXiv Detail & Related papers (2020-10-20T09:37:02Z)
- Revisiting LSTM Networks for Semi-Supervised Text Classification via Mixed Objective Function [106.69643619725652]
We develop a training strategy that allows even a simple BiLSTM model, when trained with cross-entropy loss, to achieve competitive results.
We report state-of-the-art results for the text classification task on several benchmark datasets.
arXiv Detail & Related papers (2020-09-08T21:55:22Z)