Reflection Removal through Efficient Adaptation of Diffusion Transformers
- URL: http://arxiv.org/abs/2512.05000v1
- Date: Thu, 04 Dec 2025 17:12:39 GMT
- Title: Reflection Removal through Efficient Adaptation of Diffusion Transformers
- Authors: Daniyar Zakarin, Thiemo Wandel, Anton Obukhov, Dengxin Dai
- Abstract summary: We introduce a diffusion-transformer (DiT) framework for single-image reflection removal. We analyze existing reflection removal data sources for diversity, scalability, and photorealism. We construct a physically based rendering pipeline in Blender to synthesize realistic glass materials and reflection effects.
- Score: 30.68558779968187
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: We introduce a diffusion-transformer (DiT) framework for single-image reflection removal that leverages the generalization strengths of foundation diffusion models in the restoration setting. Rather than relying on task-specific architectures, we repurpose a pre-trained DiT-based foundation model by conditioning it on reflection-contaminated inputs and guiding it toward clean transmission layers. We systematically analyze existing reflection removal data sources for diversity, scalability, and photorealism. To address the shortage of suitable data, we construct a physically based rendering (PBR) pipeline in Blender, built around the Principled BSDF, to synthesize realistic glass materials and reflection effects. Efficient LoRA-based adaptation of the foundation model, combined with the proposed synthetic data, achieves state-of-the-art performance on in-domain and zero-shot benchmarks. These results demonstrate that pretrained diffusion transformers, when paired with physically grounded data synthesis and efficient adaptation, offer a scalable and high-fidelity solution for reflection removal. Project page: https://hf.co/spaces/huawei-bayerlab/windowseat-reflection-removal-web
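The LoRA-based adaptation mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the class `LoRALinear`, the helper `matvec`, and the parameters `rank`, `alpha`, and `seed` are hypothetical names chosen for illustration, and plain Python lists stand in for tensors. The sketch shows the general LoRA idea of keeping a pretrained weight frozen and learning only a low-rank update.

```python
import random


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector (list of floats)."""
    return [sum(m * x for m, x in zip(row, vector)) for row in matrix]


class LoRALinear:
    """Illustrative linear layer y = W x + (alpha / rank) * B (A x).

    W is the frozen pretrained weight; only A and B would be trained.
    All names here are assumptions for this sketch, not the paper's code.
    """

    def __init__(self, weight, rank, alpha=1.0, seed=0):
        rng = random.Random(seed)
        out_dim, in_dim = len(weight), len(weight[0])
        self.weight = weight          # frozen pretrained weight (out_dim x in_dim)
        self.scale = alpha / rank     # standard LoRA scaling factor
        # A starts with small random values, B starts at zero, so the
        # adapted layer initially reproduces the pretrained layer exactly.
        self.A = [[rng.gauss(0.0, 0.01) for _ in range(in_dim)]
                  for _ in range(rank)]
        self.B = [[0.0] * rank for _ in range(out_dim)]

    def forward(self, x):
        base = matvec(self.weight, x)                # frozen path: W x
        update = matvec(self.B, matvec(self.A, x))   # low-rank path: B (A x)
        return [b + self.scale * u for b, u in zip(base, update)]

    def trainable_params(self):
        """Count the entries of A and B, the only trainable parameters."""
        rank = len(self.A)
        return rank * (len(self.A[0]) + len(self.B))


layer = LoRALinear([[1.0, 2.0], [3.0, 4.0]], rank=1)
print(layer.forward([1.0, 1.0]))   # matches the frozen layer at initialization
print(layer.trainable_params())    # rank * (in_dim + out_dim)
```

Because `B` is initialized to zero, the adapted layer starts out identical to the pretrained one, and the number of trainable values grows with `rank * (in_dim + out_dim)` rather than `in_dim * out_dim`, which is what makes this style of adaptation cheap for large models.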
Related papers
- SIRR-LMM: Single-image Reflection Removal via Large Multimodal Model [9.069411665770266]
Existing datasets suffer from limited physical realism in synthetic data or insufficient scale in real captures. We introduce a synthetic dataset generation framework that path-traces 3D glass models over real background imagery to create physically accurate reflection scenarios.
arXiv Detail & Related papers (2026-01-12T05:03:12Z)
- Enhancing Diffusion-based Restoration Models via Difficulty-Adaptive Reinforcement Learning with IQA Reward [93.04811239892852]
Reinforcement Learning (RL) has recently been incorporated into diffusion models. In this paper, we investigate how to effectively integrate RL into diffusion-based restoration models.
arXiv Detail & Related papers (2025-11-03T14:57:57Z)
- ResPF: Residual Poisson Flow for Efficient and Physically Consistent Sparse-View CT Reconstruction [7.644299873269135]
Sparse-view computed tomography (CT) is a practical solution to reduce radiation dose, but the resulting inverse problem poses significant challenges for accurate image reconstruction. Recent advances in generative modeling, particularly Poisson Flow Generative Models (PFGM), enable high-fidelity image synthesis. We propose Residual Poisson Flow (ResPF) Generative Models for efficient and accurate sparse-view CT reconstruction.
arXiv Detail & Related papers (2025-06-06T01:43:35Z)
- DiffusionRenderer: Neural Inverse and Forward Rendering with Video Diffusion Models [83.28670336340608]
We introduce DiffusionRenderer, a neural approach that addresses the dual problem of inverse and forward rendering. Our model enables practical applications from a single video input, including relighting, material editing, and realistic object insertion.
arXiv Detail & Related papers (2025-01-30T18:59:11Z)
- NeRF as a Non-Distant Environment Emitter in Physics-based Inverse Rendering [15.876404576998372]
We introduce NeRF as a non-distant environment emitter into the inverse rendering pipeline.
Our results demonstrate that our NeRF-based emitter offers a more precise representation of scene lighting, thereby improving the accuracy of inverse rendering.
arXiv Detail & Related papers (2024-02-07T13:25:16Z)
- Frequency Compensated Diffusion Model for Real-scene Dehazing [6.105813272271171]
We consider a dehazing framework based on conditional diffusion models for improved generalization to real haze.
The proposed dehazing diffusion model significantly outperforms state-of-the-art methods on real-world images.
arXiv Detail & Related papers (2023-08-21T06:50:44Z)
- Ref-DVGO: Reflection-Aware Direct Voxel Grid Optimization for an Improved Quality-Efficiency Trade-Off in Reflective Scene Reconstruction [40.90266517194767]
We propose an implicit-explicit approach to enhance the reconstruction quality and accelerate the training and rendering processes.
Our proposed reflection-aware approach achieves a competitive quality-efficiency trade-off compared to competing methods.
arXiv Detail & Related papers (2023-08-16T17:40:18Z)
- Physics-Driven Turbulence Image Restoration with Stochastic Refinement [80.79900297089176]
Image distortion by atmospheric turbulence is a critical problem in long-range optical imaging systems.
Fast and physics-grounded simulation tools have been introduced to help the deep-learning models adapt to real-world turbulence conditions.
This paper proposes the Physics-integrated Restoration Network (PiRN) to help the network disentangle the stochasticity from the degradation and the underlying image.
arXiv Detail & Related papers (2023-07-20T05:49:21Z)
- Relightify: Relightable 3D Faces from a Single Image via Diffusion Models [86.3927548091627]
We present the first approach to use diffusion models as a prior for highly accurate 3D facial BRDF reconstruction from a single image.
In contrast to existing methods, we directly acquire the observed texture from the input image, resulting in more faithful and consistent estimation.
arXiv Detail & Related papers (2023-05-10T11:57:49Z)
- DIB-R++: Learning to Predict Lighting and Material with a Hybrid Differentiable Renderer [78.91753256634453]
We consider the challenging problem of predicting intrinsic object properties from a single image by exploiting differentiable renderers.
In this work, we propose DIB-R++, a hybrid differentiable renderer which supports these effects by combining rasterization and ray-tracing.
Compared to more advanced physics-based differentiable renderers, DIB-R++ is highly performant due to its compact and expressive model.
arXiv Detail & Related papers (2021-10-30T01:59:39Z)
- Neural BRDF Representation and Importance Sampling [79.84316447473873]
We present a compact neural network-based representation of reflectance BRDF data.
We encode BRDFs as lightweight networks, and propose a training scheme with adaptive angular sampling.
We evaluate encoding results on isotropic and anisotropic BRDFs from multiple real-world datasets.
arXiv Detail & Related papers (2021-02-11T12:00:24Z)