Related papers: SIRR-LMM: Single-image Reflection Removal via Large Multimodal Model

URL: http://arxiv.org/abs/2601.07209v1
Date: Mon, 12 Jan 2026 05:03:12 GMT
Title: SIRR-LMM: Single-image Reflection Removal via Large Multimodal Model
Authors: Yu Guo, Zhiqiang Lao, Xiyun Song, Yubin Zhou, Heather Yu
Abstract summary: Existing datasets suffer from limited physical realism in synthetic data or insufficient scale in real captures. We introduce a synthetic dataset generation framework that path-traces 3D glass models over real background imagery to create physically accurate reflection scenarios.
Score: 9.069411665770266
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Glass surfaces create complex interactions of reflected and transmitted light, making single-image reflection removal (SIRR) challenging. Existing datasets suffer from limited physical realism in synthetic data or insufficient scale in real captures. We introduce a synthetic dataset generation framework that path-traces 3D glass models over real background imagery to create physically accurate reflection scenarios with varied glass properties, camera settings, and post-processing effects. To leverage the capabilities of a Large Multimodal Model (LMM), we concatenate the image layers into a single composite input, apply joint captioning, and fine-tune the model using task-specific LoRA rather than full-parameter training. This enables our approach to achieve improved reflection removal and separation performance compared to state-of-the-art methods.
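Illustrative sketch (not from the paper): the two mechanisms named in the abstract, concatenating image layers into one composite input and adapting a frozen weight with LoRA, can be shown in a few lines of NumPy. The names `compose_layers` and `LoRALinear`, the side-by-side layout, and the `rank` and `alpha` values are all hypothetical choices made for this example; the abstract does not specify the layout, the layers used, or the LoRA hyperparameters.

```python
import numpy as np


def compose_layers(layers):
    """Concatenate equally sized H x W x C layers side by side.

    The horizontal layout is an assumption made for this sketch.
    """
    return np.concatenate(layers, axis=1)


class LoRALinear:
    """Frozen linear weight plus a trainable low-rank update (LoRA).

    Effective weight: W + (alpha / rank) * B @ A, where only A and B
    would be trained. Hyperparameters here are illustrative.
    """

    def __init__(self, weight, rank=4, alpha=8.0, seed=0):
        rng = np.random.default_rng(seed)
        out_dim, in_dim = weight.shape
        self.weight = weight                      # frozen base weight
        self.A = rng.normal(0.0, 0.01, size=(rank, in_dim))
        self.B = np.zeros((out_dim, rank))        # zero init: no change at start
        self.scale = alpha / rank

    def __call__(self, x):
        base = x @ self.weight.T
        update = (x @ self.A.T) @ self.B.T
        return base + self.scale * update

    def trainable_params(self):
        return self.A.size + self.B.size


# Three dummy 4 x 6 RGB layers standing in for the image layers.
layers = [np.zeros((4, 6, 3)), np.ones((4, 6, 3)), np.full((4, 6, 3), 0.5)]
composite = compose_layers(layers)

W = np.random.default_rng(1).normal(size=(16, 16))
layer = LoRALinear(W, rank=4)
x = np.random.default_rng(2).normal(size=(2, 16))
y = layer(x)
```

Because `B` starts at zero, the adapted layer initially reproduces the frozen layer exactly, and only `A` and `B` (128 values here, versus 256 in `W`) would be updated during fine-tuning.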

Related papers

Reflection Removal through Efficient Adaptation of Diffusion Transformers [30.68558779968187]
We introduce a diffusion-transformer (DiT) framework for single-image reflection removal. We analyze existing reflection removal data sources for diversity, scalability, and photorealism. We construct a physically based rendering pipeline in Blender to synthesize realistic glass materials and reflection effects.
arXiv Detail & Related papers (2025-12-04T17:12:39Z)
Reflect3r: Single-View 3D Stereo Reconstruction Aided by Mirror Reflections [55.248092751290834]
Mirror reflections are common in everyday environments and can provide stereo information within a single capture. We exploit this property by treating the reflection as an auxiliary view and designing a transformation that constructs a physically valid virtual camera. This enables a multi-view stereo setup from a single image, simplifying the imaging process.
arXiv Detail & Related papers (2025-09-24T23:00:22Z)
Utilizing Multi-step Loss for Single Image Reflection Removal [0.9208007322096532]
Distorted images can negatively impact tasks like object detection and image segmentation. We present a novel approach for image reflection removal using a single image.
arXiv Detail & Related papers (2024-12-11T17:57:25Z)
Mixed Degradation Image Restoration via Local Dynamic Optimization and Conditional Embedding [67.57487747508179]
Multiple-in-one image restoration (IR) has made significant progress, aiming to handle all types of single degraded image restoration with a single model. In this paper, we propose a novel multiple-in-one IR model that can effectively restore images with both single and mixed degradations.
arXiv Detail & Related papers (2024-11-25T09:26:34Z)
Reflecting Reality: Enabling Diffusion Models to Produce Faithful Mirror Reflections [26.02117310176884]
We tackle the problem of generating highly realistic and plausible mirror reflections using diffusion-based generative models. We propose a novel depth-conditioned inpainting method called MirrorFusion, which generates high-quality, realistic, shape and appearance-aware reflections of real-world objects. MirrorFusion outperforms state-of-the-art methods on SynMirror, as demonstrated by extensive quantitative and qualitative analysis.
arXiv Detail & Related papers (2024-09-23T02:59:07Z)
Relightify: Relightable 3D Faces from a Single Image via Diffusion Models [86.3927548091627]
We present the first approach to use diffusion models as a prior for highly accurate 3D facial BRDF reconstruction from a single image. In contrast to existing methods, we directly acquire the observed texture from the input image, resulting in more faithful and consistent estimation.
arXiv Detail & Related papers (2023-05-10T11:57:49Z)
DIB-R++: Learning to Predict Lighting and Material with a Hybrid Differentiable Renderer [78.91753256634453]
We consider the challenging problem of predicting intrinsic object properties from a single image by exploiting differentiable renderers. In this work, we propose DIB-R++, a hybrid differentiable renderer which supports these effects by combining rasterization and ray-tracing. Compared to more advanced physics-based differentiable renderers, DIB-R++ is highly performant due to its compact and expressive model.
arXiv Detail & Related papers (2021-10-30T01:59:39Z)
A Categorized Reflection Removal Dataset with Diverse Real-world Scenes [54.662456878340215]
We construct a new reflection removal dataset that is categorized, diverse, and real-world (CDR). The dataset is constructed using diverse glass types under various environments to ensure diversity. We show that state-of-the-art reflection removal methods generally perform well on blurry reflection but fail to obtain satisfactory performance on other types of real-world reflection.
arXiv Detail & Related papers (2021-08-07T06:56:57Z)
ReflectNet -- A Generative Adversarial Method for Single Image Reflection Suppression [0.6980076213134382]
We propose a single image reflection removal method based on context understanding modules and adversarial training. Our proposed reflection removal method outperforms state-of-the-art methods in terms of PSNR and SSIM on the SIR benchmark dataset.
arXiv Detail & Related papers (2021-05-11T17:33:40Z)
SIR: Self-supervised Image Rectification via Seeing the Same Scene from Multiple Different Lenses [82.56853587380168]
We propose a novel self-supervised image rectification (SIR) method based on an important insight that the rectified results of distorted images of the same scene from different lenses should be the same. We leverage a differentiable warping module to generate the rectified images and re-distorted images from the distortion parameters. Our method achieves comparable or even better performance than the supervised baseline method and representative state-of-the-art methods.
arXiv Detail & Related papers (2020-11-30T08:23:25Z)

This list is automatically generated from the titles and abstracts of the papers on this site.