Restoration Adaptation for Semantic Segmentation on Low Quality Images
- URL: http://arxiv.org/abs/2602.14042v1
- Date: Sun, 15 Feb 2026 08:13:23 GMT
- Title: Restoration Adaptation for Semantic Segmentation on Low Quality Images
- Authors: Kai Guan, Rongyuan Wu, Shuai Li, Wentao Zhu, Wenjun Zeng, Lei Zhang
- Abstract summary: In real-world scenarios, the performance of semantic segmentation often deteriorates when processing low-quality (LQ) images. We propose a Semantic-Constrained Restoration (SCR) model, which injects segmentation priors into the restoration model. Then, RASS transfers semantic restoration knowledge into segmentation through LoRA-based module merging and task-specific fine-tuning.
- Score: 29.60165376603045
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: In real-world scenarios, the performance of semantic segmentation often deteriorates when processing low-quality (LQ) images, which may lack clear semantic structures and high-frequency details. Although image restoration techniques offer a promising direction for enhancing degraded visual content, conventional real-world image restoration (Real-IR) models primarily focus on pixel-level fidelity and often fail to recover task-relevant semantic cues, limiting their effectiveness when directly applied to downstream vision tasks. Conversely, existing segmentation models trained on high-quality data lack robustness under real-world degradations. In this paper, we propose Restoration Adaptation for Semantic Segmentation (RASS), which effectively integrates semantic image restoration into the segmentation process, enabling high-quality semantic segmentation on the LQ images directly. Specifically, we first propose a Semantic-Constrained Restoration (SCR) model, which injects segmentation priors into the restoration model by aligning its cross-attention maps with segmentation masks, encouraging semantically faithful image reconstruction. Then, RASS transfers semantic restoration knowledge into segmentation through LoRA-based module merging and task-specific fine-tuning, thereby enhancing the model's robustness to LQ images. To validate the effectiveness of our framework, we construct a real-world LQ image segmentation dataset with high-quality annotations, and conduct extensive experiments on both synthetic and real-world LQ benchmarks. The results show that SCR and RASS significantly outperform state-of-the-art methods in segmentation and restoration tasks. Code, models, and datasets will be available at https://github.com/Ka1Guan/RASS.git.
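The abstract names two concrete mechanisms: aligning the restoration model's cross-attention maps with segmentation masks (SCR), and transferring that knowledge via LoRA-based module merging (RASS). As a rough illustration only — the paper's actual loss formulation and merge rule are not given here, so the shapes, normalization, and function names below are hypothetical — both ideas can be sketched in NumPy:

```python
import numpy as np

def attention_alignment_loss(attn_maps, seg_masks):
    """Hypothetical SCR-style constraint: penalize the mean squared
    difference between normalized cross-attention maps and the
    corresponding segmentation masks, one map per class token.

    attn_maps: (C, H, W) non-negative attention map per class token
    seg_masks: (C, H, W) binary ground-truth mask per class
    """
    # Normalize each attention map into a spatial distribution.
    attn = attn_maps / (attn_maps.sum(axis=(1, 2), keepdims=True) + 1e-8)
    # Normalize the masks the same way so the two are comparable.
    masks = seg_masks / (seg_masks.sum(axis=(1, 2), keepdims=True) + 1e-8)
    return float(np.mean((attn - masks) ** 2))

def merge_lora(base_weight, lora_A, lora_B, alpha=1.0):
    """Standard LoRA merge: W' = W + alpha * (B @ A).
    base_weight: (d_out, d_in); lora_A: (r, d_in); lora_B: (d_out, r).
    """
    return base_weight + alpha * (lora_B @ lora_A)
```

With this formulation, an attention map that exactly matches its mask yields zero loss, and `alpha=0.0` recovers the unmodified base weights, which is the usual sanity check for a LoRA merge.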
Related papers
- Towards Any-Quality Image Segmentation via Generative and Adaptive Latent Space Enhancement [27.566673104431725]
Segment Anything Models (SAMs) are known for their exceptional zero-shot segmentation performance. However, their performance drops significantly on severely degraded, low-quality images, limiting their effectiveness in real-world scenarios. We propose GleSAM++, which utilizes Generative Latent space Enhancement to boost robustness on low-quality images.
arXiv Detail & Related papers (2026-01-05T11:28:58Z)
- Pruning Overparameterized Multi-Task Networks for Degraded Web Image Restoration [1.9336815376402718]
We propose a strategy for compressing multi-task image restoration models. The proposed model, namely MIR-L, utilizes an iterative pruning strategy that removes low-magnitude weights. Tests show that MIR-L retains only 10% of the trainable parameters while maintaining high image restoration performance.
arXiv Detail & Related papers (2025-10-16T09:04:05Z)
- MoA-VR: A Mixture-of-Agents System Towards All-in-One Video Restoration [62.929029990341796]
Real-world videos often suffer from complex degradations, such as noise, compression artifacts, and low-light distortions. We propose MoA-VR, which mimics the reasoning and processing procedures of human professionals through three coordinated agents. Specifically, we construct a large-scale and high-resolution video degradation recognition benchmark and build a vision-language model (VLM) driven degradation identifier.
arXiv Detail & Related papers (2025-10-09T17:42:51Z)
- No time to train! Training-Free Reference-Based Instance Segmentation [15.061599989448867]
This work investigates the task of object segmentation when provided with only a small set of reference images. Our key insight is to leverage strong semantic priors, as learned by foundation models, to identify corresponding regions between a reference and a target image. We find that correspondences enable automatic generation of instance-level segmentation masks for downstream tasks, and we instantiate our ideas via a multi-stage, training-free method.
arXiv Detail & Related papers (2025-07-03T16:59:01Z)
- SSP-IR: Semantic and Structure Priors for Diffusion-based Realistic Image Restoration [20.873676111265656]
SSP-IR aims to fully exploit semantic and structure priors from low-quality images. Our method outperforms other state-of-the-art methods overall on both synthetic and real-world datasets.
arXiv Detail & Related papers (2024-07-04T04:55:14Z)
- CoSeR: Bridging Image and Language for Cognitive Super-Resolution [74.24752388179992]
We introduce the Cognitive Super-Resolution (CoSeR) framework, empowering SR models with the capacity to comprehend low-resolution images.
We achieve this by marrying image appearance and language understanding to generate a cognitive embedding.
To further improve image fidelity, we propose a novel condition injection scheme called "All-in-Attention".
arXiv Detail & Related papers (2023-11-27T16:33:29Z)
- MixReorg: Cross-Modal Mixed Patch Reorganization is a Good Mask Learner for Open-World Semantic Segmentation [110.09800389100599]
We propose MixReorg, a novel and straightforward pre-training paradigm for semantic segmentation.
Our approach generates fine-grained patch-text pair data by mixing image patches while preserving the correspondence between patches and text.
With MixReorg as a mask learner, conventional text-supervised semantic segmentation models can achieve highly generalizable pixel-semantic alignment ability.
arXiv Detail & Related papers (2023-08-09T09:35:16Z)
- Hierarchical Similarity Learning for Aliasing Suppression Image Super-Resolution [64.15915577164894]
A hierarchical image super-resolution network (HSRNet) is proposed to suppress the influence of aliasing.
HSRNet achieves better quantitative and visual performance than other works, and suppresses aliasing more effectively.
arXiv Detail & Related papers (2022-06-07T14:55:32Z)
- A Simple Baseline for Zero-shot Semantic Segmentation with Pre-trained Vision-language Model [61.58071099082296]
It is unclear how to make zero-shot recognition work well on broader vision problems, such as object detection and semantic segmentation.
In this paper, we target zero-shot semantic segmentation, building on an off-the-shelf pre-trained vision-language model, i.e., CLIP.
Our experimental results show that this simple framework surpasses previous state-of-the-art methods by a large margin.
arXiv Detail & Related papers (2021-12-29T18:56:18Z)
- Spatially-Adaptive Image Restoration using Distortion-Guided Networks [51.89245800461537]
We present a learning-based solution for restoring images suffering from spatially-varying degradations.
We propose SPAIR, a network design that harnesses distortion-localization information and dynamically adjusts to difficult regions in the image.
arXiv Detail & Related papers (2021-08-19T11:02:25Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.