In-Model Merging for Enhancing the Robustness of Medical Imaging Classification Models
- URL: http://arxiv.org/abs/2502.20516v1
- Date: Thu, 27 Feb 2025 20:52:55 GMT
- Title: In-Model Merging for Enhancing the Robustness of Medical Imaging Classification Models
- Authors: Hu Wang, Ibrahim Almakky, Congbo Ma, Numan Saeed, Mohammad Yaqub
- Abstract summary: We propose in-model merging (InMerge), a novel approach that enhances the model's robustness. We demonstrate the feasibility and effectiveness of this technique for different CNN architectures. The proposed InMerge-trained model surpasses the typically-trained model by a substantial margin.
- Score: 5.871732354387235
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Model merging is an effective strategy for combining multiple models to enhance performance, and it is more efficient than ensemble learning because it introduces no extra computation at inference. However, limited research explores whether the merging process can occur within a single model and enhance that model's robustness, which is particularly critical in the medical imaging domain. In this paper, we are the first to propose in-model merging (InMerge), a novel approach that enhances the model's robustness by selectively merging similar convolutional kernels in the deep layers of a single convolutional neural network (CNN) during the training process for classification. We also analytically reveal important characteristics that affect how in-model merging should be performed, serving as an insightful reference for the community. We demonstrate the feasibility and effectiveness of this technique for different CNN architectures on four prevalent datasets. The proposed InMerge-trained model surpasses the typically-trained model by a substantial margin. The code will be made public.
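The abstract describes merging similar convolutional kernels inside one network but does not state the similarity measure or the merge rule. The following is only a minimal sketch of the general idea, not the authors' method: it assumes cosine similarity between flattened kernels, a hypothetical `threshold` for deciding which kernels count as similar, and a hypothetical blending weight `alpha`. The function names are illustrative.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two flattened kernels (lists of floats)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat an all-zero kernel as similar to nothing
    return dot / (norm_a * norm_b)


def merge_similar_kernels(kernels, threshold=0.9, alpha=0.5):
    """Blend each kernel with its most similar peer in the same layer.

    Assumptions (not taken from the paper): similarity is cosine
    similarity, a kernel is merged only if its best match reaches
    `threshold`, and merging is a convex combination weighted by `alpha`.
    Returns a new list; the input kernels are left unchanged.
    """
    merged = [list(k) for k in kernels]
    for i, kernel in enumerate(kernels):
        best_j, best_sim = None, threshold
        for j, other in enumerate(kernels):
            if j == i:
                continue
            sim = cosine_similarity(kernel, other)
            if sim >= best_sim:
                best_j, best_sim = j, sim
        if best_j is not None:
            partner = kernels[best_j]
            merged[i] = [
                (1.0 - alpha) * x + alpha * y for x, y in zip(kernel, partner)
            ]
    return merged


# Toy layer with three flattened kernels: the first two point in nearly
# the same direction, the third is orthogonal to the first.
layer = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]]
print(merge_similar_kernels(layer))
```

In this toy layer the first two kernels are pulled toward each other while the third is left as it was. In an actual training loop, a step like this would presumably be applied to the deeper convolutional layers at some interval between optimizer updates; that schedule is likewise an assumption here.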
Related papers
- Training-free Heterogeneous Model Merging [40.681362819808136]
We propose an innovative model merging framework designed for heterogeneous models. We show that the merging of structurally heterogeneous models can achieve performance levels comparable to those of homogeneous merging. Our code is publicly available at https://github.com/zju-vipa/training_free_heterogeneous_model_merging.
arXiv Detail & Related papers (2024-12-29T04:49:11Z)
- An Efficient Framework for Enhancing Discriminative Models via Diffusion Techniques [12.470257882838126]
We propose the Diffusion-Based Discriminative Model Enhancement Framework (DBMEF). This framework seamlessly integrates discriminative and generative models in a training-free manner. DBMEF can effectively enhance the classification accuracy and capability of discriminative models in a plug-and-play manner.
arXiv Detail & Related papers (2024-12-12T08:46:22Z)
- A Collaborative Ensemble Framework for CTR Prediction [73.59868761656317]
We propose a novel framework, Collaborative Ensemble Training Network (CETNet), to leverage multiple distinct models.
Unlike naive model scaling, our approach emphasizes diversity and collaboration through collaborative learning.
We validate our framework on three public datasets and a large-scale industrial dataset from Meta.
arXiv Detail & Related papers (2024-11-20T20:38:56Z)
- Exploring Model Kinship for Merging Large Language Models [52.01652098827454]
We introduce model kinship, the degree of similarity or relatedness between Large Language Models.
We find that there is a certain relationship between model kinship and the performance gains after model merging.
We propose a new model merging strategy: Top-k Greedy Merging with Model Kinship, which can yield better performance on benchmark datasets.
arXiv Detail & Related papers (2024-10-16T14:29:29Z)
- What Matters for Model Merging at Scale? [94.26607564817786]
Model merging aims to combine multiple expert models into a more capable single model.
Previous studies have primarily focused on merging a few small models.
This study systematically evaluates the utility of model merging at scale.
arXiv Detail & Related papers (2024-10-04T17:17:19Z)
- HM3: Hierarchical Multi-Objective Model Merging for Pretrained Models [28.993221775758702]
Model merging is a technique that combines multiple large pretrained models into a single model with enhanced performance and broader task adaptability.
This paper marks a significant advance toward more flexible and comprehensive model merging techniques.
We train policy and value networks using offline sampling of weight vectors, which are then employed for the online optimization of merging strategies.
arXiv Detail & Related papers (2024-09-27T16:31:31Z)
- EMR-Merging: Tuning-Free High-Performance Model Merging [55.03509900949149]
We show that Elect, Mask & Rescale-Merging (EMR-Merging) achieves outstanding performance compared to existing merging methods.
EMR-Merging is tuning-free, requiring neither data nor additional training while still showing impressive performance.
arXiv Detail & Related papers (2024-05-23T05:25:45Z)
- Has Your Pretrained Model Improved? A Multi-head Posterior Based Approach [25.927323251675386]
We leverage the meta-features associated with each entity as a source of worldly knowledge and employ entity representations from the models.
We propose using the consistency between these representations and the meta-features as a metric for evaluating pre-trained models.
Our method's effectiveness is demonstrated across various domains, including models with relational datasets, large language models and image models.
arXiv Detail & Related papers (2024-01-02T17:08:26Z)
- An Empirical Study of Multimodal Model Merging [148.48412442848795]
Model merging is a technique that fuses multiple models trained on different tasks to generate a multi-task solution.
We conduct our study for a novel goal where we can merge vision, language, and cross-modal transformers of a modality-specific architecture.
We propose two metrics that assess the distance between weights to be merged and can serve as an indicator of the merging outcomes.
arXiv Detail & Related papers (2023-04-28T15:43:21Z)
- Dataless Knowledge Fusion by Merging Weights of Language Models [51.8162883997512]
Fine-tuning pre-trained language models has become the prevalent paradigm for building downstream NLP models.
This creates a barrier to fusing knowledge across individual models to yield a better single model.
We propose a dataless knowledge fusion method that merges models in their parameter space.
arXiv Detail & Related papers (2022-12-19T20:46:43Z)
- Model Fusion via Optimal Transport [64.13185244219353]
We present a layer-wise model fusion algorithm for neural networks.
We show that this can successfully yield "one-shot" knowledge transfer between neural networks trained on heterogeneous non-i.i.d. data.
arXiv Detail & Related papers (2019-10-12T22:07:15Z)
This list is automatically generated from the titles and abstracts of the papers in this site.