Progressive Frequency-Aware Network for Laparoscopic Image Desmoking
- URL: http://arxiv.org/abs/2312.12023v1
- Date: Tue, 19 Dec 2023 10:19:44 GMT
- Title: Progressive Frequency-Aware Network for Laparoscopic Image Desmoking
- Authors: Jiale Zhang, Wenfeng Huang, Xiangyun Liao, and Qiong Wang
- Abstract summary: We propose a lightweight GAN framework for laparoscopic image desmoking, combining the strengths of CNN and Transformer.
PFAN efficiently desmokes laparoscopic images even with limited training data.
Our method outperforms state-of-the-art approaches in PSNR, SSIM, CIEDE2000, and visual quality on the Cholec80 dataset.
- Score: 8.988060012957497
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Laparoscopic surgery offers minimally invasive procedures with better patient
outcomes, but smoke presence challenges visibility and safety. Existing
learning-based methods demand large datasets and high computational resources.
We propose the Progressive Frequency-Aware Network (PFAN), a lightweight GAN
framework for laparoscopic image desmoking, combining the strengths of CNN and
Transformer for progressive information extraction in the frequency domain.
PFAN features CNN-based Multi-scale Bottleneck-Inverting (MBI) Blocks for
capturing local high-frequency information and Locally-Enhanced Axial Attention
Transformers (LAT) for efficiently handling global low-frequency information.
PFAN efficiently desmokes laparoscopic images even with limited training data.
Our method outperforms state-of-the-art approaches in PSNR, SSIM, CIEDE2000,
and visual quality on the Cholec80 dataset and retains only 629K parameters.
Our code and models are made publicly available at:
https://github.com/jlzcode/PFAN.
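As a rough illustration of one of the metrics the abstract reports, PSNR can be computed from the mean squared error between a smoke-free reference image and a desmoked output. The sketch below is not taken from the PFAN repository; the function name `psnr`, the flat pixel-sequence input, and the 8-bit peak value of 255 are assumptions made for illustration only.

```python
import math


def psnr(reference, restored, max_value=255.0):
    """Peak signal-to-noise ratio (dB) between two equal-length pixel sequences.

    Hypothetical helper for illustration; assumes flat sequences of pixel
    intensities and an 8-bit peak value by default.
    """
    if len(reference) != len(restored) or len(reference) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    # Mean squared error over all pixels.
    mse = sum((r - d) ** 2 for r, d in zip(reference, restored)) / len(reference)
    if mse == 0:
        # Identical images: PSNR is unbounded.
        return math.inf
    return 10.0 * math.log10(max_value ** 2 / mse)


if __name__ == "__main__":
    clean = [10, 20, 30, 40]
    desmoked = [12, 18, 33, 40]
    print(f"PSNR: {psnr(clean, desmoked):.2f} dB")
```

Higher values indicate a restored image closer to the reference. SSIM and CIEDE2000 need more involved computations and are usually taken from an image-processing library rather than written by hand.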
Related papers
- Self-Supervised Learning via Flow-Guided Neural Operator on Time-Series Data [57.85958428020496]
Flow-Guided Neural Operator (FGNO) is a novel framework combining operator learning with flow matching for SSL training.
FGNO learns mappings in functional spaces by using Short-Time Fourier Transform to unify different time resolutions.
Unlike prior generative SSL methods that use noisy inputs during inference, we propose using clean inputs for representation extraction while learning representations with noise.
arXiv Detail & Related papers (2026-02-12T18:54:57Z) - Contour Refinement using Discrete Diffusion in Low Data Regime [0.15393457051344298]
We present a lightweight discrete diffusion contour refinement pipeline for robust boundary detection in the low data regime.
We use a Convolutional Neural Network (CNN) architecture with self-attention layers as the core of our pipeline, and condition on a segmentation mask, iteratively denoising a sparse contour representation.
Our method outperforms several SOTA baselines on the medical imaging dataset KVASIR, is competitive on HAM10K and our custom wildfire dataset, Smoke, while improving inference framerate by 3.5X.
arXiv Detail & Related papers (2026-02-05T16:55:08Z) - Frequency-enhanced Multi-granularity Context Network for Efficient Vertebrae Segmentation [33.99418884128739]
We introduce a Frequency-enhanced Multi-granularity Context Network (FMC-Net) to improve vertebrae segmentation accuracy.
For the high-frequency components, we apply a High-frequency Feature Refinement (HFR) to amplify the prominence of key features.
For the low-frequency components, we use a Multi-granularity State Space Model (MG-SSM) to aggregate feature representations with different receptive fields.
arXiv Detail & Related papers (2025-06-29T04:53:02Z) - FADPNet: Frequency-Aware Dual-Path Network for Face Super-Resolution [70.61549422952193]
Face super-resolution (FSR) under limited computational costs remains an open problem.
Existing approaches typically treat all facial pixels equally, resulting in suboptimal allocation of computational resources.
We propose FADPNet, a Frequency-Aware Dual-Path Network that decomposes facial features into low- and high-frequency components.
arXiv Detail & Related papers (2025-06-17T02:33:42Z) - A lightweight residual network for unsupervised deformable image registration [2.7309692684728617]
We propose a residual U-Net with embedded parallel dilated-convolutional blocks to enhance the receptive field.
The proposed method is evaluated on inter-patient and atlas-based datasets.
arXiv Detail & Related papers (2024-06-14T07:20:49Z) - CathFlow: Self-Supervised Segmentation of Catheters in Interventional Ultrasound Using Optical Flow and Transformers [66.15847237150909]
We introduce a self-supervised deep learning architecture to segment catheters in longitudinal ultrasound images.
The network architecture builds upon AiAReSeg, a segmentation transformer built with the Attention in Attention mechanism.
We validated our model on a test dataset, consisting of unseen synthetic data and images collected from silicon aorta phantoms.
arXiv Detail & Related papers (2024-03-21T15:13:36Z) - Frequency-Aware Deepfake Detection: Improving Generalizability through Frequency Space Learning [81.98675881423131]
This research addresses the challenge of developing a universal deepfake detector that can effectively identify unseen deepfake images.
Existing frequency-based paradigms have relied on frequency-level artifacts introduced during the up-sampling in GAN pipelines to detect forgeries.
We introduce a novel frequency-aware approach called FreqNet, centered around frequency domain learning, specifically designed to enhance the generalizability of deepfake detectors.
arXiv Detail & Related papers (2024-03-12T01:28:00Z) - WATUNet: A Deep Neural Network for Segmentation of Volumetric Sweep Imaging Ultrasound [1.2903292694072621]
Volume sweep imaging (VSI) is an innovative approach that enables untrained operators to capture quality ultrasound images.
We present a novel segmentation model known as Wavelet_Attention_UNet (WATUNet).
In this model, we incorporate wavelet gates (WGs) and attention gates (AGs) between the encoder and decoder instead of a simple connection to overcome the limitations mentioned.
arXiv Detail & Related papers (2023-11-17T20:32:37Z) - Self-Supervised Neuron Segmentation with Multi-Agent Reinforcement Learning [53.00683059396803]
Mask image model (MIM) has been widely used due to its simplicity and effectiveness in recovering original information from masked images.
We propose a decision-based MIM that utilizes reinforcement learning (RL) to automatically search for optimal image masking ratio and masking strategy.
Our approach has a significant advantage over alternative self-supervised methods on the task of neuron segmentation.
arXiv Detail & Related papers (2023-10-06T10:40:46Z) - A Complementary Global and Local Knowledge Network for Ultrasound denoising with Fine-grained Refinement [0.7424725048947504]
Ultrasound imaging serves as an effective and non-invasive diagnostic tool commonly employed in clinical examinations.
Existing methods for speckle noise reduction induce excessive image smoothing or fail to preserve detailed information adequately.
We propose a complementary global and local knowledge network for ultrasound denoising with fine-grained refinement.
arXiv Detail & Related papers (2023-10-05T09:12:34Z) - Deep Multi-Threshold Spiking-UNet for Image Processing [51.88730892920031]
This paper introduces the novel concept of Spiking-UNet for image processing, which combines the power of Spiking Neural Networks (SNNs) with the U-Net architecture.
To achieve an efficient Spiking-UNet, we face two primary challenges: ensuring high-fidelity information propagation through the network via spikes and formulating an effective training strategy.
Experimental results show that, on image segmentation and denoising, our Spiking-UNet achieves comparable performance to its non-spiking counterpart.
arXiv Detail & Related papers (2023-07-20T16:00:19Z) - Preservation of High Frequency Content for Deep Learning-Based Medical Image Classification [74.84221280249876]
An efficient analysis of large amounts of chest radiographs can aid physicians and radiologists.
We propose a novel Discrete Wavelet Transform (DWT)-based method for the efficient identification and encoding of visual information.
arXiv Detail & Related papers (2022-05-08T15:29:54Z) - Data Augmentation and CNN Classification For Automatic COVID-19 Diagnosis From CT-Scan Images On Small Dataset [0.0]
We present an automatic COVID-19 diagnosis framework from lung CT images.
We propose a unique and effective data augmentation method using multiple Hounsfield Unit (HU) normalization windows.
On the training/validation dataset, we achieve a patient classification accuracy of 93.39%.
arXiv Detail & Related papers (2021-08-16T15:23:00Z) - ADRN: Attention-based Deep Residual Network for Hyperspectral Image Denoising [52.01041506447195]
We propose an attention-based deep residual network to learn a mapping from noisy HSI to the clean one.
Experimental results demonstrate that our proposed ADRN scheme outperforms the state-of-the-art methods both in quantitative and visual evaluations.
arXiv Detail & Related papers (2020-03-04T08:36:27Z)