HU-based Foreground Masking for 3D Medical Masked Image Modeling
- URL: http://arxiv.org/abs/2509.07534v1
- Date: Tue, 09 Sep 2025 09:11:38 GMT
- Title: HU-based Foreground Masking for 3D Medical Masked Image Modeling
- Authors: Jin Lee, Vu Dang, Gwang-Hyun Yu, Anh Le, Zahid Rahman, Jin-Ho Jang, Heonzoo Lee, Kun-Yung Kim, Jin-Sul Kim, Jin-Young Kim,
- Abstract summary: Masked Image Modeling (MIM) has revolutionized computer vision, but its adoption in 3D medical image computing has been limited.<n>We implement an HU-based Foreground Masking, which focuses on the intensity distribution of visceral organs and excludes non-tissue regions.<n>Experiments on five public 3D medical imaging datasets demonstrate that our masking consistently improves performance.
- Score: 3.1381929517927354
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: While Masked Image Modeling (MIM) has revolutionized fields of computer vision, its adoption in 3D medical image computing has been limited by the use of random masking, which overlooks the density of anatomical objects. To address this limitation, we enhance the pretext task with a simple yet effective masking strategy. Leveraging Hounsfield Unit (HU) measurements, we implement an HU-based Foreground Masking, which focuses on the intensity distribution of visceral organs and excludes non-tissue regions, such as air and fluid, that lack diagnostically meaningful features. Extensive experiments on five public 3D medical imaging datasets demonstrate that our masking consistently improves performance, both in quality of segmentation and Dice score (BTCV:~84.64\%, Flare22:~92.43\%, MM-WHS:~90.67\%, Amos22:~88.64\%, BraTS:~78.55\%). These results underscore the importance of domain-centric MIM and suggest a promising direction for representation learning in medical image segmentation. Implementation is available at github.com/AISeedHub/SubFore/.
Related papers
- Mask What Matters: Controllable Text-Guided Masking for Self-Supervised Medical Image Analysis [2.6554246520306624]
Mask What Matters is a controllable text-guided masking framework for self-supervised medical image analysis.<n>It consistently outperforms existing MIM methods, achieving gains of up to +3.1 percentage points in classification accuracy.<n>It achieves these improvements with substantially lower overall masking ratios.
arXiv Detail & Related papers (2025-09-27T02:26:56Z) - PathSegDiff: Pathology Segmentation using Diffusion model representations [63.20694440934692]
We propose PathSegDiff, a novel approach for histopathology image segmentation that leverages Latent Diffusion Models (LDMs) as pre-trained featured extractors.<n>Our method utilizes a pathology-specific LDM, guided by a self-supervised encoder, to extract rich semantic information from H&E stained histopathology images.<n>Our experiments demonstrate significant improvements over traditional methods on the BCSS and GlaS datasets.
arXiv Detail & Related papers (2025-04-09T14:58:21Z) - MRGen: Segmentation Data Engine for Underrepresented MRI Modalities [59.61465292965639]
Training medical image segmentation models for rare yet clinically important imaging modalities is challenging due to the scarcity of annotated data.<n>This paper investigates leveraging generative models to synthesize data, for training segmentation models for underrepresented modalities.<n>We present MRGen, a data engine for controllable medical image synthesis conditioned on text prompts and segmentation masks.
arXiv Detail & Related papers (2024-12-04T16:34:22Z) - Interactive Medical Image Segmentation: A Benchmark Dataset and Baseline [16.70515066552565]
The IMed-361M benchmark dataset is a significant advancement in general IMIS research.
We collect and standardize over 6.4 million medical images and their corresponding ground truth masks from multiple data sources.
We developed an IMIS baseline network on this dataset that supports high-quality mask generation through interactive inputs.
arXiv Detail & Related papers (2024-11-19T19:06:29Z) - AnatoMask: Enhancing Medical Image Segmentation with Reconstruction-guided Self-masking [5.844539603252746]
Masked image modeling (MIM) has shown effectiveness by reconstructing randomly masked images to learn detailed representations.
We propose AnatoMask, a novel MIM method that leverages reconstruction loss to dynamically identify and mask out anatomically significant regions.
arXiv Detail & Related papers (2024-07-09T00:15:52Z) - Discriminative Hamiltonian Variational Autoencoder for Accurate Tumor Segmentation in Data-Scarce Regimes [2.8498944632323755]
We propose an end-to-end hybrid architecture for medical image segmentation.
We use Hamiltonian Variational Autoencoders (HVAE) and a discriminative regularization to improve the quality of generated images.
Our architecture operates on a slice-by-slice basis to segment 3D volumes, capitilizing on the richly augmented dataset.
arXiv Detail & Related papers (2024-06-17T15:42:08Z) - MiM: Mask in Mask Self-Supervised Pre-Training for 3D Medical Image Analysis [9.472502717128556]
Masked AutoEncoder (MAE) for feature pre-training can unleash the potential of ViT on various medical vision tasks.<n>We propose a novel textitMask in Mask (MiM) pre-training framework for 3D medical images.
arXiv Detail & Related papers (2024-04-24T01:14:33Z) - Enhancing Weakly Supervised 3D Medical Image Segmentation through Probabilistic-aware Learning [47.700298779672366]
3D medical image segmentation is a challenging task with crucial implications for disease diagnosis and treatment planning.<n>Recent advances in deep learning have significantly enhanced fully supervised medical image segmentation.<n>We propose a novel probabilistic-aware weakly supervised learning pipeline, specifically designed for 3D medical imaging.
arXiv Detail & Related papers (2024-03-05T00:46:53Z) - MA-SAM: Modality-agnostic SAM Adaptation for 3D Medical Image
Segmentation [58.53672866662472]
We introduce a modality-agnostic SAM adaptation framework, named as MA-SAM.
Our method roots in the parameter-efficient fine-tuning strategy to update only a small portion of weight increments.
By injecting a series of 3D adapters into the transformer blocks of the image encoder, our method enables the pre-trained 2D backbone to extract third-dimensional information from input data.
arXiv Detail & Related papers (2023-09-16T02:41:53Z) - Disruptive Autoencoders: Leveraging Low-level features for 3D Medical
Image Pre-training [51.16994853817024]
This work focuses on designing an effective pre-training framework for 3D radiology images.
We introduce Disruptive Autoencoders, a pre-training framework that attempts to reconstruct the original image from disruptions created by a combination of local masking and low-level perturbations.
The proposed pre-training framework is tested across multiple downstream tasks and achieves state-of-the-art performance.
arXiv Detail & Related papers (2023-07-31T17:59:42Z) - LVM-Med: Learning Large-Scale Self-Supervised Vision Models for Medical
Imaging via Second-order Graph Matching [59.01894976615714]
We introduce LVM-Med, the first family of deep networks trained on large-scale medical datasets.
We have collected approximately 1.3 million medical images from 55 publicly available datasets.
LVM-Med empirically outperforms a number of state-of-the-art supervised, self-supervised, and foundation models.
arXiv Detail & Related papers (2023-06-20T22:21:34Z) - Semantic segmentation of surgical hyperspectral images under geometric
domain shifts [69.91792194237212]
We present the first analysis of state-of-the-art semantic segmentation networks in the presence of geometric out-of-distribution (OOD) data.
We also address generalizability with a dedicated augmentation technique termed "Organ Transplantation"
Our scheme improves on the SOA DSC by up to 67 % (RGB) and 90 % (HSI) and renders performance on par with in-distribution performance on real OOD test data.
arXiv Detail & Related papers (2023-03-20T09:50:07Z) - Self Pre-training with Masked Autoencoders for Medical Image
Classification and Segmentation [37.25161294917211]
Masked Autoencoder (MAE) has been shown to be effective in pre-training Vision Transformers (ViT) for natural image analysis.
We investigate a self pre-training paradigm with MAE for medical image analysis tasks.
arXiv Detail & Related papers (2022-03-10T16:22:38Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.