Single-Shared Network with Prior-Inspired Loss for Parameter-Efficient Multi-Modal Imaging Skin Lesion Classification
- URL: http://arxiv.org/abs/2403.19203v1
- Date: Thu, 28 Mar 2024 08:00:14 GMT
- Title: Single-Shared Network with Prior-Inspired Loss for Parameter-Efficient Multi-Modal Imaging Skin Lesion Classification
- Authors: Peng Tang, Tobias Lasser
- Abstract summary: We introduce a multi-modal approach that efficiently integrates multi-scale clinical and dermoscopy features within a single network.
Our method surpasses currently advanced methods in both accuracy and parameter efficiency.
- Score: 6.195015783344803
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In this study, we introduce a multi-modal approach that efficiently integrates multi-scale clinical and dermoscopy features within a single network, thereby substantially reducing model parameters. The proposed method includes three novel fusion schemes. Firstly, unlike current methods that usually employ two individual models for the clinical and dermoscopy modalities, we verified that multi-modal features can be learned by sharing the parameters of the encoder while keeping individual modal-specific classifiers. Secondly, a shared cross-attention module can replace the individual ones to efficiently mediate interaction between the two modalities at multiple layers. Thirdly, different from current methods that optimize the dermoscopy and clinical branches equally, and inspired by the prior knowledge that dermoscopy images play a more significant role than clinical images, we propose a novel biased loss. This loss guides the single-shared network to prioritize dermoscopy information over clinical information, implicitly learning a better joint feature representation for each modal-specific task. Extensive experiments on the well-recognized Seven-Point Checklist (SPC) dataset and a collected dataset demonstrate the effectiveness of our method on both CNN and Transformer structures. Furthermore, our method surpasses currently advanced methods in both accuracy and parameter efficiency.
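The abstract's three schemes lend themselves to a compact illustration. Below is a minimal PyTorch sketch, assuming a toy CNN encoder and a single cross-attention block; every name (SingleSharedNet, biased_loss, the weight w_derm=0.7) is a hypothetical stand-in for components the abstract only describes in prose, not the authors' actual code.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SingleSharedNet(nn.Module):
    """One encoder and one cross-attention block serve both modalities;
    only the classifier heads are modality-specific (illustrative sketch)."""
    def __init__(self, num_labels: int, dim: int = 256):
        super().__init__()
        # Shared encoder: the same parameters embed clinical and dermoscopy images.
        self.encoder = nn.Sequential(
            nn.Conv2d(3, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, dim, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
        )
        # One shared cross-attention module replaces per-modality ones.
        self.cross_attn = nn.MultiheadAttention(dim, num_heads=4, batch_first=True)
        # Modal-specific classifiers remain separate, as the abstract describes.
        self.clf_derm = nn.Linear(dim, num_labels)
        self.clf_clin = nn.Linear(dim, num_labels)

    def forward(self, derm, clin):
        f_d = self.encoder(derm).unsqueeze(1)  # (B, 1, dim)
        f_c = self.encoder(clin).unsqueeze(1)  # same weights for both modalities
        # Each modality attends to the other through the single shared module.
        a_d, _ = self.cross_attn(f_d, f_c, f_c)
        a_c, _ = self.cross_attn(f_c, f_d, f_d)
        return self.clf_derm(a_d.squeeze(1)), self.clf_clin(a_c.squeeze(1))

def biased_loss(logits_derm, logits_clin, targets, w_derm=0.7):
    """Weigh the dermoscopy branch above the clinical one; the exact form
    and the 0.7 weight are assumptions, not the paper's stated loss."""
    l_d = F.binary_cross_entropy_with_logits(logits_derm, targets)
    l_c = F.binary_cross_entropy_with_logits(logits_clin, targets)
    return w_derm * l_d + (1.0 - w_derm) * l_c

# Toy usage with seven SPC-style labels.
model = SingleSharedNet(num_labels=7)
derm, clin = torch.randn(2, 3, 224, 224), torch.randn(2, 3, 224, 224)
loss = biased_loss(*model(derm, clin), torch.randint(0, 2, (2, 7)).float())
```

The point of the sketch is that self.encoder and self.cross_attn are instantiated once and applied to both modalities, so only the two small classifier heads add modality-specific parameters, while biased_loss encodes the prior that dermoscopy carries more diagnostic signal than clinical photography.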
Related papers
- Pay Less On Clinical Images: Asymmetric Multi-Modal Fusion Method For Efficient Multi-Label Skin Lesion Classification [6.195015783344803]
Existing multi-modal approaches primarily focus on enhancing multi-label skin lesion classification performance through advanced fusion modules.
In this paper, we introduce a novel asymmetric multi-modal fusion method for efficient multi-label skin lesion classification.
arXiv Detail & Related papers (2024-07-13T20:46:04Z)
- CLIP-MUSED: CLIP-Guided Multi-Subject Visual Neural Information Semantic Decoding [14.484475792279671]
We propose a CLIP-guided Multi-sUbject visual neural information SEmantic Decoding (CLIP-MUSED) method.
Our method consists of a Transformer-based feature extractor to effectively model global neural representations.
It also incorporates learnable subject-specific tokens that facilitate the aggregation of multi-subject data.
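The subject-specific-token idea can be shown generically: one learnable token per subject is prepended to the input sequence so a single Transformer can aggregate multi-subject data. This is a minimal sketch under that assumption, not CLIP-MUSED's actual implementation; all names are illustrative.

```python
import torch
import torch.nn as nn

class SubjectTokenEncoder(nn.Module):
    """Generic sketch: a learnable per-subject token lets one shared
    Transformer pool data from many subjects (names are illustrative)."""
    def __init__(self, num_subjects: int, dim: int = 128):
        super().__init__()
        self.subject_tokens = nn.Parameter(torch.randn(num_subjects, 1, dim) * 0.02)
        layer = nn.TransformerEncoderLayer(d_model=dim, nhead=4, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=2)

    def forward(self, x, subject_ids):
        # x: (B, seq, dim) neural features; subject_ids: (B,) integer indices.
        tok = self.subject_tokens[subject_ids]        # (B, 1, dim)
        h = self.encoder(torch.cat([tok, x], dim=1))  # prepend the subject token
        return h[:, 0]                                # token position as summary
```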
arXiv Detail & Related papers (2024-02-14T07:41:48Z)
- Convolutional autoencoder-based multimodal one-class classification [80.52334952912808]
One-class classification refers to approaches of learning using data from a single class only.
We propose a deep learning one-class classification method suitable for multimodal data.
arXiv Detail & Related papers (2023-09-25T12:31:18Z)
- Multilayer Multiset Neuronal Networks -- MMNNs [55.2480439325792]
The present work describes multilayer multiset neuronal networks incorporating two or more layers of coincidence similarity neurons.
The work also explores the utilization of counter-prototype points, which are assigned to the image regions to be avoided.
arXiv Detail & Related papers (2023-08-28T12:55:13Z)
- Graph-Ensemble Learning Model for Multi-label Skin Lesion Classification using Dermoscopy and Clinical Images [7.159532626507458]
This study introduces a Graph Convolution Network (GCN) that injects the prior co-occurrence between categories, encoded as a correlation matrix, into the deep learning model for multi-label classification.
We propose a Graph-Ensemble Learning Model (GELN) that treats the GCN predictions as information complementary to the predictions of the fusion model.
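A common way to realize this, sketched below with a standard two-layer GCN over learnable label embeddings, is to use the normalized co-occurrence matrix as the graph adjacency; the class name and the symmetric normalization are assumptions for illustration, not GELN's exact design.

```python
import torch
import torch.nn as nn

class CoOccurrenceGCN(nn.Module):
    """Sketch: propagate label embeddings over a graph whose adjacency is the
    prior label co-occurrence matrix, then score image features against them."""
    def __init__(self, co_occurrence: torch.Tensor, dim: int = 256):
        super().__init__()
        num_labels = co_occurrence.size(0)
        # Add self-loops, then symmetrically normalize: D^-1/2 (A + I) D^-1/2.
        a = co_occurrence + torch.eye(num_labels)
        d = a.sum(1).rsqrt().diag()
        self.register_buffer("adj", d @ a @ d)
        self.label_emb = nn.Parameter(torch.randn(num_labels, dim) * 0.02)
        self.gcn1 = nn.Linear(dim, dim)
        self.gcn2 = nn.Linear(dim, dim)

    def forward(self, img_feat):
        # img_feat: (B, dim) pooled image features from the fusion backbone.
        h = torch.relu(self.gcn1(self.adj @ self.label_emb))
        w = self.gcn2(self.adj @ h)   # (num_labels, dim) label classifiers
        return img_feat @ w.t()       # (B, num_labels) multi-label logits
```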
arXiv Detail & Related papers (2023-07-04T13:19:57Z)
- A Transformer-based representation-learning model with unified processing of multimodal input for clinical diagnostics [63.106382317917344]
We report a Transformer-based representation-learning model as a clinical diagnostic aid that processes multimodal input in a unified manner.
The unified model outperformed an image-only model and non-unified multimodal diagnosis models in the identification of pulmonary diseases.
arXiv Detail & Related papers (2023-06-01T16:23:47Z)
- Ambiguous Medical Image Segmentation using Diffusion Models [60.378180265885945]
We introduce a single diffusion model-based approach that produces multiple plausible outputs by learning a distribution over group insights.
Our proposed model generates a distribution of segmentation masks by leveraging the inherent sampling process of diffusion.
Comprehensive results show that our proposed approach outperforms existing state-of-the-art ambiguous segmentation networks.
arXiv Detail & Related papers (2023-04-10T17:58:22Z)
- Rethinking Semi-Supervised Medical Image Segmentation: A Variance-Reduction Perspective [51.70661197256033]
We propose ARCO, a semi-supervised contrastive learning framework with stratified group theory for medical image segmentation.
We first propose building ARCO through the concept of variance-reduced estimation and show that certain variance-reduction techniques are particularly beneficial in pixel/voxel-level segmentation tasks.
We experimentally validate our approaches on eight benchmarks, i.e., five 2D/3D medical and three semantic segmentation datasets, with different label settings.
arXiv Detail & Related papers (2023-02-03T13:50:25Z)
- A Multi-modal Fusion Framework Based on Multi-task Correlation Learning for Cancer Prognosis Prediction [8.476394437053477]
We present a multi-modal fusion framework based on multi-task correlation learning (MultiCoFusion) for survival analysis and cancer grade classification.
We systematically evaluate our framework using glioma datasets from The Cancer Genome Atlas (TCGA).
arXiv Detail & Related papers (2022-01-22T15:16:24Z)
- Self-Supervised Multimodal Domino: in Search of Biomarkers for Alzheimer's Disease [19.86082635340699]
We propose a taxonomy of all reasonable ways to organize self-supervised representation-learning algorithms.
We first evaluate models on toy multimodal MNIST datasets and then apply them to a multimodal neuroimaging dataset with Alzheimer's disease patients.
Results show that the proposed approach outperforms previous self-supervised encoder-decoder methods.
arXiv Detail & Related papers (2020-12-25T20:28:13Z)
- Unpaired Multi-modal Segmentation via Knowledge Distillation [77.39798870702174]
We propose a novel learning scheme for unpaired cross-modality image segmentation.
In our method, we heavily reuse network parameters by sharing all convolutional kernels across CT and MRI.
We have extensively validated our approach on two multi-class segmentation problems.
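A minimal sketch of such parameter reuse, assuming modality-specific normalization layers next to fully shared convolutional kernels (the normalization detail is an assumption; the summary only states that kernels are shared):

```python
import torch
import torch.nn as nn

class SharedKernelBlock(nn.Module):
    """Sketch: one set of conv kernels serves both CT and MRI; only
    lightweight normalization statistics are kept per modality."""
    def __init__(self, in_ch: int, out_ch: int):
        super().__init__()
        self.conv = nn.Conv2d(in_ch, out_ch, 3, padding=1)  # shared across modalities
        self.norm = nn.ModuleDict({
            "ct": nn.BatchNorm2d(out_ch),   # modality-specific statistics
            "mri": nn.BatchNorm2d(out_ch),
        })

    def forward(self, x, modality: str):
        return torch.relu(self.norm[modality](self.conv(x)))

# Unpaired CT and MRI batches pass through the same shared kernels.
block = SharedKernelBlock(1, 16)
ct_out = block(torch.randn(2, 1, 64, 64), "ct")
mri_out = block(torch.randn(2, 1, 64, 64), "mri")
```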
arXiv Detail & Related papers (2020-01-06T20:03:17Z)