Hybrid Dual Mean-Teacher Network With Double-Uncertainty Guidance for
Semi-Supervised Segmentation of MRI Scans
- URL: http://arxiv.org/abs/2303.05126v1
- Date: Thu, 9 Mar 2023 09:16:39 GMT
- Title: Hybrid Dual Mean-Teacher Network With Double-Uncertainty Guidance for
Semi-Supervised Segmentation of MRI Scans
- Authors: Jiayi Zhu, Bart Bolsterlee, Brian V. Y. Chow, Yang Song, Erik
Meijering
- Abstract summary: We present a Hybrid Dual Mean-Teacher (HD-Teacher) model with hybrid, semi-supervised, and multi-task learning to achieve highly effective semi-supervised segmentation.
- Score: 11.762045723792266
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Semi-supervised learning has made significant progress in medical image
segmentation. However, existing methods primarily utilize information acquired
from a single dimensionality (2D/3D), resulting in sub-optimal performance on
challenging data, such as magnetic resonance imaging (MRI) scans with multiple
objects and highly anisotropic resolution. To address this issue, we present a
Hybrid Dual Mean-Teacher (HD-Teacher) model with hybrid, semi-supervised, and
multi-task learning to achieve highly effective semi-supervised segmentation.
HD-Teacher employs a 2D and a 3D mean-teacher network to produce segmentation
labels and signed distance fields from the hybrid information captured in both
dimensionalities. This hybrid learning mechanism allows HD-Teacher to combine
the 'best of both worlds', utilizing features extracted from either 2D, 3D, or
both dimensions to produce outputs as it sees fit. Outputs from 2D and 3D
teacher models are also dynamically combined, based on their individual
uncertainty scores, into a single hybrid prediction, where the hybrid
uncertainty is estimated. We then propose a hybrid regularization module to
encourage both student models to produce results close to the
uncertainty-weighted hybrid prediction. The hybrid uncertainty suppresses
unreliable knowledge in the hybrid prediction, leaving only useful information
to improve network performance further. Extensive experiments of binary and
multi-class segmentation conducted on three MRI datasets demonstrate the
effectiveness of the proposed framework. Code is available at
https://github.com/ThisGame42/Hybrid-Teacher.
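The uncertainty-weighted fusion described in the abstract can be sketched as follows. This is a minimal illustration, not the paper's exact implementation: it assumes entropy is used as the per-voxel uncertainty measure and that teacher outputs are fused by inverse-uncertainty weighting; the function names (`entropy_uncertainty`, `hybrid_prediction`) are hypothetical.

```python
import numpy as np

def entropy_uncertainty(prob, eps=1e-8):
    # Voxel-wise predictive entropy of a softmax output of shape (C, D, H, W).
    return -np.sum(prob * np.log(prob + eps), axis=0)

def hybrid_prediction(prob_2d, prob_3d, eps=1e-8):
    """Fuse 2D and 3D teacher softmax outputs, weighting each voxel by the
    inverse of that teacher's entropy-based uncertainty (an assumed scheme)."""
    w2d = 1.0 / (entropy_uncertainty(prob_2d) + eps)
    w3d = 1.0 / (entropy_uncertainty(prob_3d) + eps)
    hybrid = (w2d * prob_2d + w3d * prob_3d) / (w2d + w3d)
    # Hybrid uncertainty: entropy of the fused prediction, which could be used
    # to down-weight unreliable voxels in a consistency (regularization) loss.
    return hybrid, entropy_uncertainty(hybrid)

# Toy example: 2 classes on a 1x2x2 volume.
rng = np.random.default_rng(0)
softmax = lambda x: np.exp(x) / np.exp(x).sum(axis=0, keepdims=True)
prob_2d = softmax(rng.normal(size=(2, 1, 2, 2)))
prob_3d = softmax(rng.normal(size=(2, 1, 2, 2)))
pred, unc = hybrid_prediction(prob_2d, prob_3d)
assert np.allclose(pred.sum(axis=0), 1.0)  # fused output is still a valid distribution
```

Each voxel of the fused prediction is pulled toward whichever teacher is more confident there, and the resulting hybrid uncertainty map marks regions where neither teacher is reliable.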
Related papers
- HybridTM: Combining Transformer and Mamba for 3D Semantic Segmentation [7.663855540620183]
We propose HybridTM, the first hybrid architecture that integrates Transformer and Mamba for 3D semantic segmentation. In addition, we propose the Inner Layer Hybrid Strategy, which combines attention and Mamba at a finer granularity. Our HybridTM achieves state-of-the-art performance on ScanNet, ScanNet200, and nuScenes benchmarks.
arXiv Detail & Related papers (2025-07-24T16:48:50Z) - Robust Multimodal Segmentation with Representation Regularization and Hybrid Prototype Distillation [9.418241223504252]
We propose a two-stage framework called RobustSeg to enhance multi-modal robustness. In the first stage, RobustSeg pre-trains a multi-modal teacher model using complete modalities. In the second stage, a student model is trained with random modality dropout while learning from the teacher via HPDM and RRM.
arXiv Detail & Related papers (2025-05-19T08:46:03Z) - Leveraging Labelled Data Knowledge: A Cooperative Rectification Learning Network for Semi-supervised 3D Medical Image Segmentation [27.94353306813293]
Semi-supervised 3D medical image segmentation aims to achieve accurate segmentation using few labelled data and numerous unlabelled data.
The main challenge in designing semi-supervised learning methods is the effective use of unlabelled data during training.
We introduce a new methodology to produce high-quality pseudo-labels for a consistency learning strategy.
arXiv Detail & Related papers (2025-02-17T05:29:50Z) - LargeAD: Large-Scale Cross-Sensor Data Pretraining for Autonomous Driving [52.83707400688378]
LargeAD is a versatile and scalable framework designed for large-scale 3D pretraining across diverse real-world driving datasets.
Our framework leverages VFMs to extract semantically rich superpixels from 2D images, which are aligned with LiDAR point clouds to generate high-quality contrastive samples.
Our approach delivers significant performance improvements over state-of-the-art methods in both linear probing and fine-tuning tasks for both LiDAR-based segmentation and object detection.
arXiv Detail & Related papers (2025-01-07T18:59:59Z) - GEAL: Generalizable 3D Affordance Learning with Cross-Modal Consistency [50.11520458252128]
Existing 3D affordance learning methods struggle with generalization and robustness due to limited annotated data. We propose GEAL, a novel framework designed to enhance the generalization and robustness of 3D affordance learning by leveraging large-scale pre-trained 2D models. GEAL consistently outperforms existing methods across seen and novel object categories, as well as corrupted data.
arXiv Detail & Related papers (2024-12-12T17:59:03Z) - Reflective Teacher: Semi-Supervised Multimodal 3D Object Detection in Bird's-Eye-View via Uncertainty Measure [5.510678909146336]
We introduce a novel concept of Reflective Teacher where the student is trained by both labeled and pseudo labeled data.
We also propose Geometry Aware BEV Fusion (GA-BEV) for efficient alignment of multi-modal BEV features.
arXiv Detail & Related papers (2024-12-05T16:54:39Z) - A Lesson in Splats: Teacher-Guided Diffusion for 3D Gaussian Splats Generation with 2D Supervision [65.33043028101471]
We introduce a diffusion model for Gaussian Splats, SplatDiffusion, to enable generation of three-dimensional structures from single images.
Existing methods rely on deterministic, feed-forward predictions, which limit their ability to handle the inherent ambiguity of 3D inference from 2D data.
arXiv Detail & Related papers (2024-12-01T00:29:57Z) - Prototype Learning Guided Hybrid Network for Breast Tumor Segmentation in DCE-MRI [58.809276442508256]
We propose a hybrid network via the combination of convolution neural network (CNN) and transformer layers.
The experimental results on private and public DCE-MRI datasets demonstrate that the proposed hybrid network achieves superior performance compared to state-of-the-art methods.
arXiv Detail & Related papers (2024-08-11T15:46:00Z) - On Machine Learning Approaches for Protein-Ligand Binding Affinity Prediction [2.874893537471256]
This study evaluates the performance of classical tree-based models and advanced neural networks in protein-ligand binding affinity prediction.
We show that combining 2D and 3D model strengths improves active learning outcomes beyond current state-of-the-art approaches.
arXiv Detail & Related papers (2024-07-15T13:06:00Z) - Enhancing Single-Slice Segmentation with 3D-to-2D Unpaired Scan Distillation [21.69523493833432]
We propose a novel 3D-to-2D distillation framework, leveraging pre-trained 3D models to enhance 2D single-slice segmentation.
Unlike traditional knowledge distillation methods that require the same data input, our approach employs unpaired 3D CT scans with any contrast to guide the 2D student model.
arXiv Detail & Related papers (2024-06-18T04:06:02Z) - SDR-Former: A Siamese Dual-Resolution Transformer for Liver Lesion
Classification Using 3D Multi-Phase Imaging [59.78761085714715]
This study proposes a novel Siamese Dual-Resolution Transformer (SDR-Former) framework for liver lesion classification.
The proposed framework has been validated through comprehensive experiments on two clinical datasets.
To support the scientific community, we are releasing our extensive multi-phase MR dataset for liver lesion analysis to the public.
arXiv Detail & Related papers (2024-02-27T06:32:56Z) - Simultaneous Alignment and Surface Regression Using Hybrid 2D-3D
Networks for 3D Coherent Layer Segmentation of Retinal OCT Images with Full
and Sparse Annotations [32.69359482975795]
This work presents a novel framework based on hybrid 2D-3D convolutional neural networks (CNNs) to obtain continuous 3D retinal layer surfaces from OCT volumes.
Experiments on a synthetic dataset and three public clinical datasets show that our framework can effectively align the B-scans for potential motion correction.
arXiv Detail & Related papers (2023-12-04T08:32:31Z) - Interpretable 2D Vision Models for 3D Medical Images [47.75089895500738]
This study proposes a simple approach of adapting 2D networks with an intermediate feature representation for processing 3D images.
We show on all 3D MedMNIST datasets as benchmark and two real-world datasets consisting of several hundred high-resolution CT or MRI scans that our approach performs on par with existing methods.
arXiv Detail & Related papers (2023-07-13T08:27:09Z) - DetMatch: Two Teachers are Better Than One for Joint 2D and 3D
Semi-Supervised Object Detection [29.722784254501768]
DetMatch is a flexible framework for joint semi-supervised learning on 2D and 3D modalities.
By identifying objects detected in both sensors, our pipeline generates a cleaner, more robust set of pseudo-labels.
We leverage the richer semantics of RGB images to rectify incorrect 3D class predictions and improve localization of 3D boxes.
arXiv Detail & Related papers (2022-03-17T17:58:00Z) - Similarity-Aware Fusion Network for 3D Semantic Segmentation [87.51314162700315]
We propose a similarity-aware fusion network (SAFNet) to adaptively fuse 2D images and 3D point clouds for 3D semantic segmentation.
We employ a late fusion strategy where we first learn the geometric and contextual similarities between the input and back-projected (from 2D pixels) point clouds.
We show that SAFNet significantly outperforms existing state-of-the-art fusion-based approaches across various data integrity.
arXiv Detail & Related papers (2021-07-04T09:28:18Z) - HIVE-Net: Centerline-Aware HIerarchical View-Ensemble Convolutional
Network for Mitochondria Segmentation in EM Images [3.1498833540989413]
We introduce a novel hierarchical view-ensemble convolution (HVEC) to learn 3D spatial contexts using more efficient 2D convolutions.
The proposed method performs favorably against the state-of-the-art methods in accuracy and visual quality but with a greatly reduced model size.
arXiv Detail & Related papers (2021-01-08T06:56:40Z) - TSGCNet: Discriminative Geometric Feature Learning with Two-Stream
GraphConvolutional Network for 3D Dental Model Segmentation [141.2690520327948]
We propose a two-stream graph convolutional network (TSGCNet) to learn multi-view information from different geometric attributes.
We evaluate our proposed TSGCNet on a real-patient dataset of dental models acquired by 3D intraoral scanners.
arXiv Detail & Related papers (2020-12-26T08:02:56Z) - M2Net: Multi-modal Multi-channel Network for Overall Survival Time
Prediction of Brain Tumor Patients [151.4352001822956]
Early and accurate prediction of overall survival (OS) time can help to obtain better treatment planning for brain tumor patients.
Existing prediction methods rely on radiomic features at the local lesion area of a magnetic resonance (MR) volume.
We propose an end-to-end OS time prediction model, namely the Multi-modal Multi-channel Network (M2Net).
arXiv Detail & Related papers (2020-06-01T05:21:37Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.