DeepChest: Dynamic Gradient-Free Task Weighting for Effective Multi-Task Learning in Chest X-ray Classification
- URL: http://arxiv.org/abs/2505.23595v1
- Date: Thu, 29 May 2025 16:08:26 GMT
- Title: DeepChest: Dynamic Gradient-Free Task Weighting for Effective Multi-Task Learning in Chest X-ray Classification
- Authors: Youssef Mohamed, Noran Mohamed, Khaled Abouhashad, Feilong Tang, Sara Atito, Shoaib Jameel, Imran Razzak, Ahmed B. Zaky,
- Abstract summary: DeepChest is a dynamic task-weighting framework specifically designed for multi-label chest X-ray (CXR) classification.<n>Given a network architecture (e.g., ResNet18), our model-agnostic approach adaptively adjusts task importance without requiring gradient access.<n>Experiments on a large-scale CXR dataset demonstrate that DeepChest outperforms state-of-the-art MTL methods by 7% in overall accuracy.
- Score: 18.13819296102142
- License: http://creativecommons.org/licenses/by-sa/4.0/
- Abstract: While Multi-Task Learning (MTL) offers inherent advantages in complex domains such as medical imaging by enabling shared representation learning, effectively balancing task contributions remains a significant challenge. This paper addresses this critical issue by introducing DeepChest, a novel, computationally efficient and effective dynamic task-weighting framework specifically designed for multi-label chest X-ray (CXR) classification. Unlike existing heuristic or gradient-based methods that often incur substantial overhead, DeepChest leverages a performance-driven weighting mechanism based on effective analysis of task-specific loss trends. Given a network architecture (e.g., ResNet18), our model-agnostic approach adaptively adjusts task importance without requiring gradient access, thereby significantly reducing memory usage and achieving a threefold increase in training speed. It can be easily applied to improve various state-of-the-art methods. Extensive experiments on a large-scale CXR dataset demonstrate that DeepChest not only outperforms state-of-the-art MTL methods by 7% in overall accuracy but also yields substantial reductions in individual task losses, indicating improved generalization and effective mitigation of negative transfer. The efficiency and performance gains of DeepChest pave the way for more practical and robust deployment of deep learning in critical medical diagnostic applications. The code is publicly available at https://github.com/youssefkhalil320/DeepChest-MTL
Related papers
- Multiscale Stochastic Gradient Descent: Efficiently Training Convolutional Neural Networks [6.805997961535213]
Multiscale Gradient Descent (Multiscale-SGD) is a novel optimization approach that exploits coarse-to-fine training strategies to estimate the gradient at a fraction of the cost.<n>We introduce a new class of learnable, scale-independent Mesh-Free Convolutions (MFCs) that ensure consistent gradient behavior across resolutions.<n>Our results establish a new paradigm for the efficient training of deep networks, enabling practical scalability in high-resolution and multiscale learning tasks.
arXiv Detail & Related papers (2025-01-22T09:13:47Z) - ProKAN: Progressive Stacking of Kolmogorov-Arnold Networks for Efficient Liver Segmentation [0.0]
proKAN is a progressive stacking methodology for Kolmogorov-Arnold Networks (KANs) designed to address these challenges.<n> proKAN dynamically adjusts its complexity by progressively adding KAN blocks during training, based on overfitting behavior.<n>Our proposed architecture achieves state-of-the-art performance in liver segmentation tasks, outperforming standard Multi-Layer Perceptrons (MLPs) and fixed KAN architectures.
arXiv Detail & Related papers (2024-12-27T16:14:06Z) - Fine-tuning -- a Transfer Learning approach [0.22344294014777952]
Missingness in Electronic Health Records (EHRs) is often hampered by the abundance of missing data in this valuable resource.
Existing deep imputation methods rely on end-to-end pipelines that incorporate both imputation and downstream analyses.
This paper explores the development of a modular, deep learning-based imputation and classification pipeline.
arXiv Detail & Related papers (2024-11-06T14:18:23Z) - MENTOR: Mixture-of-Experts Network with Task-Oriented Perturbation for Visual Reinforcement Learning [17.437573206368494]
Visual deep reinforcement learning (RL) enables robots to acquire skills from visual input for unstructured tasks.<n>We present MENTOR, a method that improves both the architecture and optimization of RL agents.<n>MenTOR outperforms state-of-the-art methods across three simulation benchmarks and achieves an average of 83% success rate on three challenging real-world robotic manipulation tasks.
arXiv Detail & Related papers (2024-10-19T04:31:54Z) - Robust Analysis of Multi-Task Learning Efficiency: New Benchmarks on Light-Weighed Backbones and Effective Measurement of Multi-Task Learning Challenges by Feature Disentanglement [69.51496713076253]
In this paper, we focus on the aforementioned efficiency aspects of existing MTL methods.
We first carry out large-scale experiments of the methods with smaller backbones and on a the MetaGraspNet dataset as a new test ground.
We also propose Feature Disentanglement measure as a novel and efficient identifier of the challenges in MTL.
arXiv Detail & Related papers (2024-02-05T22:15:55Z) - PREM: A Simple Yet Effective Approach for Node-Level Graph Anomaly
Detection [65.24854366973794]
Node-level graph anomaly detection (GAD) plays a critical role in identifying anomalous nodes from graph-structured data in domains such as medicine, social networks, and e-commerce.
We introduce a simple method termed PREprocessing and Matching (PREM for short) to improve the efficiency of GAD.
Our approach streamlines GAD, reducing time and memory consumption while maintaining powerful anomaly detection capabilities.
arXiv Detail & Related papers (2023-10-18T02:59:57Z) - A Multi-Head Ensemble Multi-Task Learning Approach for Dynamical
Computation Offloading [62.34538208323411]
We propose a multi-head ensemble multi-task learning (MEMTL) approach with a shared backbone and multiple prediction heads (PHs)
MEMTL outperforms benchmark methods in both the inference accuracy and mean square error without requiring additional training data.
arXiv Detail & Related papers (2023-09-02T11:01:16Z) - Multi-Objective Optimization for Sparse Deep Multi-Task Learning [0.0]
We present a Multi-Objective Optimization algorithm using a modified Weighted Chebyshev scalarization for training Deep Neural Networks (DNNs)
Our work aims to address the (economical and also ecological) sustainability issue of DNN models, with particular focus on Deep Multi-Task models.
arXiv Detail & Related papers (2023-08-23T16:42:27Z) - FAMO: Fast Adaptive Multitask Optimization [48.59232177073481]
We introduce Fast Adaptive Multitask Optimization FAMO, a dynamic weighting method that decreases task losses in a balanced way.
Our results indicate that FAMO achieves comparable or superior performance to state-of-the-art gradient manipulation techniques.
arXiv Detail & Related papers (2023-06-06T15:39:54Z) - Human-Inspired Framework to Accelerate Reinforcement Learning [1.6317061277457001]
Reinforcement learning (RL) is crucial for data science decision-making but suffers from sample inefficiency.
This paper introduces a novel human-inspired framework to enhance RL algorithm sample efficiency.
arXiv Detail & Related papers (2023-02-28T13:15:04Z) - Accelerated Policy Learning with Parallel Differentiable Simulation [59.665651562534755]
We present a differentiable simulator and a new policy learning algorithm (SHAC)
Our algorithm alleviates problems with local minima through a smooth critic function.
We show substantial improvements in sample efficiency and wall-clock time over state-of-the-art RL and differentiable simulation-based algorithms.
arXiv Detail & Related papers (2022-04-14T17:46:26Z) - Efficient Few-Shot Object Detection via Knowledge Inheritance [62.36414544915032]
Few-shot object detection (FSOD) aims at learning a generic detector that can adapt to unseen tasks with scarce training samples.
We present an efficient pretrain-transfer framework (PTF) baseline with no computational increment.
We also propose an adaptive length re-scaling (ALR) strategy to alleviate the vector length inconsistency between the predicted novel weights and the pretrained base weights.
arXiv Detail & Related papers (2022-03-23T06:24:31Z) - SLAW: Scaled Loss Approximate Weighting for Efficient Multi-Task
Learning [0.0]
Multi-task learning (MTL) is a subfield of machine learning with important applications.
The best MTL optimization methods require individually computing the gradient of each task's loss function.
We propose Scaled Loss Approximate Weighting (SLAW), a method for multi-task optimization that matches the performance of the best existing methods while being much more efficient.
arXiv Detail & Related papers (2021-09-16T20:58:40Z) - Gradient Surgery for Multi-Task Learning [119.675492088251]
Multi-task learning has emerged as a promising approach for sharing structure across multiple tasks.
The reasons why multi-task learning is so challenging compared to single-task learning are not fully understood.
We propose a form of gradient surgery that projects a task's gradient onto the normal plane of the gradient of any other task that has a conflicting gradient.
arXiv Detail & Related papers (2020-01-19T06:33:47Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.