Towards Any-Quality Image Segmentation via Generative and Adaptive Latent Space Enhancement
- URL: http://arxiv.org/abs/2601.02018v1
- Date: Mon, 05 Jan 2026 11:28:58 GMT
- Title: Towards Any-Quality Image Segmentation via Generative and Adaptive Latent Space Enhancement
- Authors: Guangqian Guo, Aixi Ren, Yong Guo, Xuehui Yu, Jiacheng Tian, Wenli Li, Yaoxing Wang, Shan Gao,
- Abstract summary: Segment Anything Models (SAMs) are known for their exceptional zero-shot segmentation performance. However, their performance drops significantly on severely degraded, low-quality images, limiting their effectiveness in real-world scenarios. We propose GleSAM++, which utilizes Generative Latent space Enhancement to boost robustness on low-quality images.
- Score: 27.566673104431725
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Segment Anything Models (SAMs), known for their exceptional zero-shot segmentation performance, have garnered significant attention in the research community. Nevertheless, their performance drops significantly on severely degraded, low-quality images, limiting their effectiveness in real-world scenarios. To address this, we propose GleSAM++, which utilizes Generative Latent space Enhancement to boost robustness on low-quality images, thus enabling generalization across various image qualities. Additionally, to improve compatibility between the pre-trained diffusion model and the segmentation framework, we introduce two techniques, i.e., Feature Distribution Alignment (FDA) and Channel Replication and Expansion (CRE). However, the above components lack explicit guidance regarding the degree of degradation. The model is forced to implicitly fit a complex noise distribution that spans conditions from mild noise to severe artifacts, which substantially increases the learning burden and leads to suboptimal reconstructions. To address this issue, we further introduce a Degradation-aware Adaptive Enhancement (DAE) mechanism. The key principle of DAE is to decouple the reconstruction process for arbitrary-quality features into two stages: degradation-level prediction and degradation-aware reconstruction. Our method can be applied to pre-trained SAM and SAM2 with only minimal additional learnable parameters, allowing for efficient optimization. Extensive experiments demonstrate that GleSAM++ significantly improves segmentation robustness on complex degradations while maintaining generalization to clear images. Furthermore, GleSAM++ also performs well on unseen degradations, underscoring the versatility of our approach and dataset.
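The abstract's Degradation-aware Adaptive Enhancement (DAE) mechanism decouples reconstruction into two stages: degradation-level prediction followed by degradation-aware reconstruction. A minimal sketch of that control flow is below; the function names, the variance-based level predictor, and the toy enhancement operator are all illustrative assumptions, not the paper's actual modules:

```python
import numpy as np

def predict_degradation_level(feat: np.ndarray) -> float:
    # Stage 1 (assumed): map a latent feature to a scalar degradation level
    # in [0, 1]. Feature variance stands in here as a proxy for severity.
    v = float(np.var(feat))
    return float(np.clip(v / (v + 1.0), 0.0, 1.0))

def degradation_aware_reconstruct(feat: np.ndarray, level: float) -> np.ndarray:
    # Stage 2 (assumed): condition enhancement strength on the predicted
    # level, so mild degradations receive light correction and severe
    # ones receive stronger correction.
    return feat - level * (feat - feat.mean())  # toy enhancement operator

def dae_enhance(feat: np.ndarray):
    # Decoupled pipeline: predict the degradation level, then reconstruct
    # conditioned on it, instead of fitting one model to all noise levels.
    level = predict_degradation_level(feat)
    return degradation_aware_reconstruct(feat, level), level
```

The point of the decoupling, per the abstract, is that a single implicit model would have to fit one noise distribution spanning mild to severe artifacts; conditioning on a predicted level narrows what each reconstruction step must learn.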
Related papers
- ClusIR: Towards Cluster-Guided All-in-One Image Restoration [72.16989784735796]
ClusIR aims to recover high-quality images from diverse degradations within a unified framework. ClusIR comprises two key components: a Probabilistic Cluster-Guided Routing Mechanism (PCGRM) and a Degradation-Aware Frequency Modulation Module (DAFMM).
arXiv Detail & Related papers (2025-12-11T18:59:47Z) - Physics-Guided Null-Space Diffusion with Sparse Masking for Corrective Sparse-View CT Reconstruction [5.479463752172751]
Diffusion models have demonstrated remarkable generative capabilities in image processing tasks. We propose a Sparse condition Temporal Reweighted Integrated Distribution Estimation guided diffusion model (STRIDE) for sparse-view CT reconstruction. Experimental results on both public and real datasets demonstrate that the proposed method achieves the best improvement of 2.58 dB in PSNR, an increase of 2.37% in SSIM, and a reduction of 0.236 in MSE.
arXiv Detail & Related papers (2025-09-07T09:42:16Z) - RestoreVAR: Visual Autoregressive Generation for All-in-One Image Restoration [51.77917733024544]
Latent diffusion models (LDMs) have improved the perceptual quality of All-in-One image Restoration (AiOR) methods. However, LDMs suffer from slow inference due to their iterative denoising process, rendering them impractical for time-sensitive applications. Visual autoregressive modeling (VAR) performs scale-space autoregression and achieves comparable performance to that of state-of-the-art diffusion transformers.
arXiv Detail & Related papers (2025-05-23T15:52:26Z) - Beyond Degradation Redundancy: Contrastive Prompt Learning for All-in-One Image Restoration [109.38288333994407]
Contrastive Prompt Learning (CPL) is a novel framework that fundamentally enhances prompt-task alignment. Our framework establishes new state-of-the-art performance while maintaining parameter efficiency, offering a principled solution for unified image restoration.
arXiv Detail & Related papers (2025-04-14T08:24:57Z) - Enhanced Semantic Extraction and Guidance for UGC Image Super Resolution [18.058473238611725]
We propose a novel approach to image super-resolution by integrating semantic guidance into a diffusion framework. Our method addresses the inconsistency between degradations in wild and synthetic datasets. Our model won second place in the CVIRE 2025 Short-form Image Super-Resolution Challenge.
arXiv Detail & Related papers (2025-04-14T05:26:24Z) - Segment Any-Quality Images with Generative Latent Space Enhancement [23.05638803781018]
We propose GleSAM to boost robustness on low-quality images. We adapt the concept of latent diffusion to SAM-based segmentation frameworks. We also introduce two techniques to improve compatibility between the pre-trained diffusion model and the segmentation framework.
arXiv Detail & Related papers (2025-03-16T13:58:13Z) - Feature Alignment with Equivariant Convolutions for Burst Image Super-Resolution [52.55429225242423]
We propose a novel framework for Burst Image Super-Resolution (BISR), featuring an equivariant convolution-based alignment. This enables the alignment transformation to be learned via explicit supervision in the image domain and easily applied in the feature domain. Experiments on BISR benchmarks show the superior performance of our approach in both quantitative metrics and visual quality.
arXiv Detail & Related papers (2025-03-11T11:13:10Z) - IPSeg: Image Posterior Mitigates Semantic Drift in Class-Incremental Segmentation [77.06177202334398]
We identify two critical challenges in CISS that contribute to semantic drift and degrade performance. First, we highlight the issue of separate optimization, where different parts of the model are optimized in distinct incremental stages. Second, we identify noisy semantics arising from inappropriate pseudo-labeling, which leads to sub-optimal results.
arXiv Detail & Related papers (2025-02-07T12:19:37Z) - EchoIR: Advancing Image Restoration with Echo Upsampling and Bi-Level Optimization [0.0]
We introduce EchoIR, a UNet-like image restoration network with a bilateral learnable upsampling mechanism to bridge this gap. In pursuit of modeling a hierarchical model of image restoration and upsampling tasks, we propose the Approximated Sequential Bi-level Optimization (AS-BLO).
arXiv Detail & Related papers (2024-12-10T06:27:08Z) - Boosting Visual Recognition in Real-world Degradations via Unsupervised Feature Enhancement Module with Deep Channel Prior [22.323789227447755]
Fog, low-light, and motion blur degrade image quality and pose threats to the safety of autonomous driving.
This work proposes a novel Deep Channel Prior (DCP) for degraded visual recognition.
Based on this, a novel plug-and-play Unsupervised Feature Enhancement Module (UFEM) is proposed to achieve unsupervised feature correction.
arXiv Detail & Related papers (2024-04-02T07:16:56Z) - DGNet: Dynamic Gradient-Guided Network for Water-Related Optics Image Enhancement [77.0360085530701]
Underwater image enhancement (UIE) is a challenging task due to the complex degradation caused by underwater environments.
Previous methods often idealize the degradation process, and neglect the impact of medium noise and object motion on the distribution of image features.
Our approach utilizes predicted images to dynamically update pseudo-labels, adding a dynamic gradient to optimize the network's gradient space.
arXiv Detail & Related papers (2023-12-12T06:07:21Z)
This list is automatically generated from the titles and abstracts of the papers in this site.