UniSER: A Foundation Model for Unified Soft Effects Removal
- URL: http://arxiv.org/abs/2511.14183v1
- Date: Tue, 18 Nov 2025 06:39:39 GMT
- Title: UniSER: A Foundation Model for Unified Soft Effects Removal
- Authors: Jingdong Zhang, Lingzhi Zhang, Qing Liu, Mang Tik Chiu, Connelly Barnes, Yizhou Wang, Haoran You, Xiaoyang Liu, Yuqian Zhou, Zhe Lin, Eli Shechtman, Sohrab Amirghodsi, Xin Li, Wenping Wang, Xiaohang Zhan,
- Abstract summary: We introduce UniSER, capable of addressing diverse degradations caused by soft effects within a single framework. Our methodology centers on curating a massive 3.8M-pair dataset to ensure robustness and generalization. This synergistic approach allows UniSER to significantly outperform both specialist and generalist models.
- Score: 72.60782767314713
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Digital images are often degraded by soft effects such as lens flare, haze, shadows, and reflections, which reduce aesthetics even though the underlying pixels remain partially visible. Prevailing works address these degradations in isolation, developing highly specialized models that lack scalability and fail to exploit the shared underlying essence of these restoration problems. While specialist models are limited in this way, recent large-scale pretrained generalist models offer powerful, text-driven image editing capabilities; however, these general-purpose systems (e.g., GPT-4o, Flux Kontext, Nano Banana) require detailed prompts and often fail to achieve robust removal on these fine-grained tasks or to preserve the identity of the scene. Leveraging the common essence of soft effects, i.e., semi-transparent occlusions, we introduce a versatile foundation model, UniSER, capable of addressing diverse degradations caused by soft effects within a single framework. Our methodology centers on curating a massive 3.8M-pair dataset to ensure robustness and generalization, which includes novel, physically plausible data to fill critical gaps in public benchmarks, and a tailored training pipeline that fine-tunes a Diffusion Transformer to learn robust restoration priors from this diverse data, integrating fine-grained mask and strength controls. This synergistic approach allows UniSER to significantly outperform both specialist and generalist models, achieving robust, high-fidelity restoration in the wild.
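The mask and strength controls described in the abstract can be sketched as extra conditioning channels fed alongside the degraded image. The function name and 5-channel layout below are illustrative assumptions, not the paper's actual interface:

```python
import numpy as np

def build_conditioning(degraded, mask, strength):
    """Assemble a conditioning tensor for a mask- and strength-controlled
    restoration model (hypothetical layout, not UniSER's exact pipeline).

    degraded: (H, W, 3) float image in [0, 1]
    mask:     (H, W) float map in [0, 1], 1 where the soft effect lies
    strength: scalar in [0, 1], desired removal intensity
    """
    h, w, _ = degraded.shape
    mask_ch = mask[..., None]                   # (H, W, 1)
    strength_ch = np.full((h, w, 1), strength)  # broadcast scalar to a plane
    # Channel-wise concat: 3 image + 1 mask + 1 strength = 5 channels
    return np.concatenate([degraded, mask_ch, strength_ch], axis=-1)

# Toy example: 4x4 image, soft effect in the top-left quadrant, half-strength removal
img = np.random.rand(4, 4, 3)
m = np.zeros((4, 4))
m[:2, :2] = 1.0
cond = build_conditioning(img, m, 0.5)
print(cond.shape)  # (4, 4, 5)
```

A diffusion transformer would consume this tensor (patchified) in place of the plain 3-channel input, letting the mask localize the effect and the strength plane modulate how aggressively it is removed.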
Related papers
- Learning to Restore Multi-Degraded Images via Ingredient Decoupling and Task-Aware Path Adaptation [51.10017611491389]
Real-world images often suffer from multiple degradations, such as rain, noise, and haze coexisting in a single image. We propose an adaptive multi-degradation image restoration network that reconstructs images by leveraging decoupled representations of degradation ingredients. The resulting tightly integrated architecture, termed IMDNet, is extensively validated through experiments.
arXiv Detail & Related papers (2025-11-07T01:50:36Z) - UniLDiff: Unlocking the Power of Diffusion Priors for All-in-One Image Restoration [16.493990086330985]
UniLDiff is a unified framework enhanced with degradation- and detail-aware mechanisms. We introduce a Degradation-Aware Feature Fusion (DAFF) module to dynamically inject low-quality features into each denoising step. We also design a Detail-Aware Expert Module (DAEM) in the decoder to enhance texture and fine-structure recovery.
arXiv Detail & Related papers (2025-07-31T16:02:00Z) - UniRes: Universal Image Restoration for Complex Degradations [53.74404005987783]
Real-world image restoration is hampered by diverse degradations stemming from varying capture conditions, capture devices, and post-processing pipelines. A simple yet flexible diffusion-based framework, named UniRes, is proposed to address such degradations in an end-to-end manner. Our proposed method is evaluated on both complex-degradation and single-degradation image restoration datasets.
arXiv Detail & Related papers (2025-06-05T21:25:39Z) - DIPLI: Deep Image Prior Lucky Imaging for Blind Astronomical Image Restoration [4.378167136812483]
This work proposes DIPLI, a framework that shifts from single-frame to multi-frame training using the Back Projection technique. A comprehensive evaluation compares the method against Lucky Imaging, DIP, the transformer-based model RVRT, and the diffusion-based model DiffIR2VR-Zero.
arXiv Detail & Related papers (2025-03-20T09:33:16Z) - FoundIR: Unleashing Million-scale Training Data to Advance Foundation Models for Image Restoration [66.61201445650323]
Existing methods suffer from a generalization bottleneck in real-world scenarios. We contribute a million-scale dataset with two notable advantages over existing training data. We propose a robust model, FoundIR, to better address a broader range of restoration tasks in real-world scenarios.
arXiv Detail & Related papers (2024-12-02T12:08:40Z) - Adaptive Blind All-in-One Image Restoration [15.726917603679716]
Blind all-in-one image restoration models aim to recover a high-quality image from an input degraded with unknown distortions. We introduce ABAIR, a simple yet effective adaptive blind all-in-one restoration model that handles multiple degradations and generalizes well to unseen distortions. Our model not only surpasses state-of-the-art performance on five- and three-task IR setups but also demonstrates superior generalization to unseen degradations and composite distortions.
arXiv Detail & Related papers (2024-11-27T14:58:08Z) - Mixed Degradation Image Restoration via Local Dynamic Optimization and Conditional Embedding [67.57487747508179]
Multiple-in-one image restoration (IR) has made significant progress, aiming to handle all types of single degraded image restoration with a single model.
In this paper, we propose a novel multiple-in-one IR model that can effectively restore images with both single and mixed degradations.
arXiv Detail & Related papers (2024-11-25T09:26:34Z) - Preserving Multi-Modal Capabilities of Pre-trained VLMs for Improving Vision-Linguistic Compositionality [69.76121008898677]
Fine-grained Selective Calibrated CLIP integrates local hard negative loss and selective calibrated regularization.
Our evaluations show that FSC-CLIP not only achieves compositionality on par with state-of-the-art models but also retains strong multi-modal capabilities.
arXiv Detail & Related papers (2024-10-07T17:16:20Z) - Diff-Restorer: Unleashing Visual Prompts for Diffusion-based Universal Image Restoration [19.87693298262894]
We propose Diff-Restorer, a universal image restoration method based on the diffusion model.
We utilize the pre-trained visual language model to extract visual prompts from degraded images.
We also design a Degradation-aware Decoder to perform structural correction and convert the latent code to the pixel domain.
arXiv Detail & Related papers (2024-07-04T05:01:10Z) - SSP-IR: Semantic and Structure Priors for Diffusion-based Realistic Image Restoration [20.873676111265656]
SSP-IR aims to fully exploit semantic and structure priors from low-quality images. Our method outperforms other state-of-the-art methods overall on both synthetic and real-world datasets.
arXiv Detail & Related papers (2024-07-04T04:55:14Z)
This list is automatically generated from the titles and abstracts of the papers in this site.