Unlocking Fine-Grained Details with Wavelet-based High-Frequency
Enhancement in Transformers
- URL: http://arxiv.org/abs/2308.13442v2
- Date: Tue, 12 Sep 2023 18:41:16 GMT
- Title: Unlocking Fine-Grained Details with Wavelet-based High-Frequency
Enhancement in Transformers
- Authors: Reza Azad, Amirhossein Kazerouni, Alaa Sulaiman, Afshin Bozorgpour,
Ehsan Khodapanah Aghdam, Abin Jose, Dorit Merhof
- Abstract summary: Medical image segmentation is a critical task that plays a vital role in diagnosis, treatment planning, and disease monitoring.
We address the local feature deficiency of the Transformer model by carefully re-designing the self-attention map.
We propose a multi-scale context enhancement block within skip connections to adaptively model inter-scale dependencies.
- Score: 4.208461204572879
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Medical image segmentation is a critical task that plays a vital role in
diagnosis, treatment planning, and disease monitoring. Accurate segmentation of
anatomical structures and abnormalities from medical images can aid in the
early detection and treatment of various diseases. In this paper, we address
the local feature deficiency of the Transformer model by carefully re-designing
the self-attention map to produce accurate dense prediction in medical images.
To this end, we first apply the wavelet transformation to decompose the input
feature map into low-frequency (LF) and high-frequency (HF) subbands. The LF
segment is associated with coarse-grained features while the HF components
preserve fine-grained features such as texture and edge information. Next, we
reformulate the self-attention operation using the efficient Transformer to
perform both spatial and context attention on top of the frequency
representation. Furthermore, to intensify the importance of the boundary
information, we impose an additional attention map by creating a Gaussian
pyramid on top of the HF components. Moreover, we propose a multi-scale context
enhancement block within skip connections to adaptively model inter-scale
dependencies to overcome the semantic gap among stages of the encoder and
decoder modules. Through comprehensive experiments, we demonstrate the
effectiveness of our strategy on multi-organ and skin lesion segmentation
benchmarks. The implementation code will be available upon acceptance on
GitHub: https://github.com/mindflow-institue/WaveFormer
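The decomposition step described in the abstract can be illustrated with a minimal sketch. It assumes a single-level orthonormal Haar wavelet and a single-channel 2D feature map; the abstract does not name the wavelet family, and the function name `haar_dwt2` and the subband variable names are hypothetical, not taken from the WaveFormer code.

```python
import numpy as np


def haar_dwt2(x):
    """Single-level 2D Haar transform of an (H, W) array (illustrative only).

    Returns the low-frequency (LF) subband and a tuple of three
    high-frequency (HF) subbands, each of shape (H/2, W/2).
    """
    x = np.asarray(x, dtype=float)
    if x.ndim != 2 or x.shape[0] % 2 or x.shape[1] % 2:
        raise ValueError("expected a 2D array with even height and width")

    # The four samples of every non-overlapping 2x2 block.
    a = x[0::2, 0::2]  # top-left
    b = x[0::2, 1::2]  # top-right
    c = x[1::2, 0::2]  # bottom-left
    d = x[1::2, 1::2]  # bottom-right

    # LF subband: scaled block sum, a coarse version of the input.
    lf = (a + b + c + d) / 2.0
    # HF subbands: signed differences inside each block.
    d_rows = (a + b - c - d) / 2.0  # top row vs. bottom row (horizontal edges)
    d_cols = (a - b + c - d) / 2.0  # left column vs. right column (vertical edges)
    d_diag = (a - b - c + d) / 2.0  # diagonal detail
    return lf, (d_rows, d_cols, d_diag)


if __name__ == "__main__":
    # A toy "feature map" with a vertical step edge between columns 0 and 1.
    feat = np.zeros((4, 4))
    feat[:, 1:] = 1.0
    lf, (d_rows, d_cols, d_diag) = haar_dwt2(feat)
    print("LF subband:\n", lf)
    print("HF subband (column differences):\n", d_cols)
```

In this toy input the step edge shows up only in the column-difference subband, while the LF subband keeps a half-resolution version of the map. This is the sense in which the HF components carry edge and texture information.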
Related papers
- ASSNet: Adaptive Semantic Segmentation Network for Microtumors and Multi-Organ Segmentation [32.74195208408193]
Medical image segmentation is a crucial task in computer vision, supporting clinicians in diagnosis, treatment planning, and disease monitoring.
We propose the Adaptive Semantic Network (ASSNet), a transformer architecture that effectively integrates local and global features for precise medical image segmentation.
Tests on diverse medical image segmentation tasks, including multi-organ, liver tumor, and bladder tumor segmentation, demonstrate that ASSNet achieves state-of-the-art results.
(2024-09-12T06:25:44Z)
- TransDAE: Dual Attention Mechanism in a Hierarchical Transformer for Efficient Medical Image Segmentation [7.013315283888431]
Medical image segmentation is crucial for accurate disease diagnosis and the development of effective treatment strategies.
We introduce TransDAE: a novel approach that reimagines the self-attention mechanism to include both spatial and channel-wise associations.
Remarkably, TransDAE outperforms existing state-of-the-art methods on the Synapse multi-organ dataset.
(2024-09-03T16:08:48Z)
- CathFlow: Self-Supervised Segmentation of Catheters in Interventional Ultrasound Using Optical Flow and Transformers [66.15847237150909]
We introduce a self-supervised deep learning architecture to segment catheters in longitudinal ultrasound images.
The network architecture builds upon AiAReSeg, a segmentation transformer built with the Attention in Attention mechanism.
We validated our model on a test dataset, consisting of unseen synthetic data and images collected from silicon aorta phantoms.
(2024-03-21T15:13:36Z)
- Dual-scale Enhanced and Cross-generative Consistency Learning for Semi-supervised Medical Image Segmentation [49.57907601086494]
Medical image segmentation plays a crucial role in computer-aided diagnosis.
We propose a novel Dual-scale Enhanced and Cross-generative consistency learning framework for semi-supervised medical image segmentation (DEC-Seg).
(2023-12-26T12:56:31Z)
- Disruptive Autoencoders: Leveraging Low-level features for 3D Medical
Image Pre-training [51.16994853817024]
This work focuses on designing an effective pre-training framework for 3D radiology images.
We introduce Disruptive Autoencoders, a pre-training framework that attempts to reconstruct the original image from disruptions created by a combination of local masking and low-level perturbations.
The proposed pre-training framework is tested across multiple downstream tasks and achieves state-of-the-art performance.
(2023-07-31T17:59:42Z)
- HST-MRF: Heterogeneous Swin Transformer with Multi-Receptive Field for
Medical Image Segmentation [5.51045524851432]
We propose a Heterogeneous Swin Transformer with Multi-Receptive Field (HST-MRF) model for medical image segmentation.
The main purpose is to address the loss of structural information caused by patch segmentation in Transformers.
Experimental results show that our proposed method outperforms state-of-the-art models.
(2023-04-10T14:30:03Z)
- Reliable Joint Segmentation of Retinal Edema Lesions in OCT Images [55.83984261827332]
In this paper, we propose a novel reliable multi-scale wavelet-enhanced transformer network.
We develop a novel segmentation backbone that integrates a wavelet-enhanced feature extractor network and a multi-scale transformer module.
Our proposed method achieves better segmentation accuracy with a high degree of reliability as compared to other state-of-the-art segmentation approaches.
(2022-12-01T07:32:56Z)
- Preservation of High Frequency Content for Deep Learning-Based Medical
Image Classification [74.84221280249876]
An efficient analysis of large amounts of chest radiographs can aid physicians and radiologists.
We propose a novel Discrete Wavelet Transform (DWT)-based method for the efficient identification and encoding of visual information.
(2022-05-08T15:29:54Z)
- TransAttUnet: Multi-level Attention-guided U-Net with Transformer for
Medical Image Segmentation [33.45471457058221]
This paper proposes a novel Transformer-based medical image semantic segmentation framework called TransAttUnet.
In particular, we establish additional multi-scale skip connections between decoder blocks to aggregate the different semantic-scale upsampling features.
Our method consistently outperforms the state-of-the-art baselines.
(2021-07-12T09:17:06Z)
- TransUNet: Transformers Make Strong Encoders for Medical Image
Segmentation [78.01570371790669]
Medical image segmentation is an essential prerequisite for developing healthcare systems.
On various medical image segmentation tasks, the u-shaped architecture, also known as U-Net, has become the de-facto standard.
We propose TransUNet, which combines the merits of both Transformers and U-Net, as a strong alternative for medical image segmentation.
(2021-02-08T16:10:50Z)