ReflexSplit: Single Image Reflection Separation via Layer Fusion-Separation
- URL: http://arxiv.org/abs/2601.17468v1
- Date: Sat, 24 Jan 2026 13:52:21 GMT
- Title: ReflexSplit: Single Image Reflection Separation via Layer Fusion-Separation
- Authors: Chia-Ming Lee, Yu-Fan Lin, Jing-Hui Jung, Yu-Jou Hsiao, Chih-Chung Hsu, Yu-Lun Liu
- Abstract summary: Single Image Reflection Separation (SIRS) disentangles mixed images into transmission and reflection layers. Existing methods suffer from transmission-reflection confusion under nonlinear mixing. We propose ReflexSplit, a dual-stream framework with three key innovations.
- Score: 13.290464696196366
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Single Image Reflection Separation (SIRS) disentangles mixed images into transmission and reflection layers. Existing methods suffer from transmission-reflection confusion under nonlinear mixing, particularly in deep decoder layers, due to implicit fusion mechanisms and inadequate multi-scale coordination. We propose ReflexSplit, a dual-stream framework with three key innovations. (1) Cross-scale Gated Fusion (CrGF) adaptively aggregates semantic priors, texture details, and decoder context across hierarchical depths, stabilizing gradient flow and maintaining feature consistency. (2) Layer Fusion-Separation Blocks (LFSB) alternate between fusion for shared structure extraction and differential separation for layer-specific disentanglement. Inspired by Differential Transformer, we extend attention cancellation to dual-stream separation via cross-stream subtraction. (3) Curriculum training progressively strengthens differential separation through depth-dependent initialization and epoch-wise warmup. Extensive experiments on synthetic and real-world benchmarks demonstrate state-of-the-art performance with superior perceptual quality and robust generalization. Our code is available at https://github.com/wuw2135/ReflexSplit.
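The cross-stream subtraction and curriculum warmup described in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the abstract gives no formulas, so the function names (`cross_stream_differential`, `lambda_schedule`), the use of identity projections instead of learned query/key/value weights, the linear depth rule, and the constants `lam_max`, `warmup_epochs`, and `num_depths` are all assumptions made for illustration. The idea shown is that each stream subtracts a scaled copy of the other stream's attention map, and that the scale grows with decoder depth and with training epoch.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def attention_map(q, k):
    """Scaled dot-product attention weights for (tokens, dim) inputs."""
    d = q.shape[-1]
    return softmax(q @ k.T / np.sqrt(d))


def cross_stream_differential(x_t, x_r, lam):
    """Hypothetical differential separation step.

    x_t, x_r: (tokens, dim) features of the transmission and reflection streams.
    lam: strength of the cross-stream subtraction (0 disables it).
    Identity projections stand in for learned Q/K/V weights (an assumption).
    """
    a_t = attention_map(x_t, x_t)
    a_r = attention_map(x_r, x_r)
    # Each stream cancels attention patterns it shares with the other stream.
    out_t = (a_t - lam * a_r) @ x_t
    out_r = (a_r - lam * a_t) @ x_r
    return out_t, out_r


def lambda_schedule(epoch, depth, warmup_epochs=10, lam_max=0.8, num_depths=4):
    """Hypothetical curriculum: depth-dependent start value, linear epoch warmup."""
    depth_init = lam_max * (depth + 1) / num_depths   # deeper blocks separate more
    ramp = min(1.0, (epoch + 1) / warmup_epochs)      # epoch-wise warmup in (0, 1]
    return depth_init * ramp


# Usage: random features for two streams, separation strength for block 2 at epoch 4.
rng = np.random.default_rng(0)
x_t = rng.normal(size=(16, 8))
x_r = rng.normal(size=(16, 8))
lam = lambda_schedule(epoch=4, depth=2)
out_t, out_r = cross_stream_differential(x_t, x_r, lam)
print(out_t.shape, out_r.shape, lam)
```

With `lam = 0` each stream reduces to plain self-attention, and with identical inputs and `lam = 1` the two maps cancel exactly, which gives two easy sanity checks for a sketch of this kind.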
Related papers
- Coupled Degradation Modeling and Fusion: A VLM-Guided Degradation-Coupled Network for Degradation-Aware Infrared and Visible Image Fusion [9.915632806109555]
We propose a novel VLM-Guided Degradation-Coupled Fusion network (VGDCFusion).
Our VGDCFusion significantly outperforms existing state-of-the-art fusion approaches under various degraded image scenarios.
Date: 2025-10-13T14:26:33Z
- Excavate the potential of Single-Scale Features: A Decomposition Network for Water-Related Optical Image Enhancement [22.353926184394002]
Single-scale feature extraction can match or surpass the performance of multi-scale methods.
SSD-Net combines CNN's local feature extraction capabilities with Transformer's global modeling strengths.
Date: 2025-08-06T06:41:58Z
- AFUNet: Cross-Iterative Alignment-Fusion Synergy for HDR Reconstruction via Deep Unfolding Paradigm [41.09028235123695]
Existing learning-based methods effectively reconstruct HDR images from multi-exposure LDR inputs with extended dynamic range and improved detail.
We propose the cross-iterative Alignment and Fusion deep Unfolding Network (AFUNet) to address these limitations.
Our method formulates multi-exposure HDR reconstruction from a Maximum A Posteriori (MAP) estimation perspective.
Date: 2025-06-30T06:03:34Z
- Manifold-aware Representation Learning for Degradation-agnostic Image Restoration [135.90908995927194]
Image Restoration (IR) aims to recover high-quality images from degraded inputs affected by various corruptions such as noise, blur, haze, rain, and low-light conditions.
We present MIRAGE, a unified framework for all-in-one IR that explicitly decomposes the input feature space into three semantically aligned parallel branches.
This modular decomposition significantly improves generalization and efficiency across diverse degradations.
Date: 2025-05-24T12:52:10Z
- Single Image Reflection Removal via inter-layer Complementarity [63.37693451363996]
We introduce a novel inter-layer complementarity model and an efficient inter-layer complementarity attention mechanism for dual-stream architectures.
Our method achieves state-of-the-art separation quality on multiple public datasets while significantly reducing both computational cost and model complexity.
Date: 2025-05-19T02:50:15Z
- FUSE: Label-Free Image-Event Joint Monocular Depth Estimation via Frequency-Decoupled Alignment and Degradation-Robust Fusion [92.4205087439928]
Image-event joint depth estimation methods leverage complementary modalities for robust perception, yet face challenges in generalizability.
We propose the Self-supervised Transfer (PST) and the Frequency-Decoupled Fusion module (FreDF).
PST establishes cross-modal knowledge transfer through latent space alignment with image foundation models, effectively mitigating data scarcity.
FreDF explicitly decouples high-frequency edge features from low-frequency structural components, resolving modality-specific frequency mismatches.
This combined approach enables FUSE to construct a universal image-event encoder that only requires lightweight decoder adaptation for target datasets.
Date: 2025-03-25T15:04:53Z
- A Hybrid Transformer-Mamba Network for Single Image Deraining [70.64069487982916]
Existing deraining Transformers employ self-attention mechanisms with fixed-range windows or along channel dimensions.
We introduce a novel dual-branch hybrid Transformer-Mamba network, denoted as TransMamba, aimed at effectively capturing long-range rain-related dependencies.
Date: 2024-08-31T10:03:19Z
- Mutual Information-driven Triple Interaction Network for Efficient Image Dehazing [54.168567276280505]
We propose a novel Mutual Information-driven Triple interaction Network (MITNet) for image dehazing.
The first stage, named amplitude-guided haze removal, aims to recover the amplitude spectrum of the hazy images for haze removal.
The second stage, named phase-guided structure refinement, is devoted to learning the transformation and refinement of the phase spectrum.
Date: 2023-08-14T08:23:58Z
- Layer-wise Representation Fusion for Compositional Generalization [26.771056871444692]
A key reason for failure on compositional generalization is that the syntactic and semantic representations of sequences in both the uppermost layer of the encoder and decoder are entangled.
We explain why it exists by analyzing the representation evolving mechanism from the bottom to the top of the Transformer layers.
Inspired by this, we propose LRF, a novel Layer-wise Representation Fusion framework for CG, which learns to fuse previous layers' information back into the encoding and decoding process.
Date: 2023-07-20T12:01:40Z
- Self-Supervised Generative-Contrastive Learning of Multi-Modal Euclidean Input for 3D Shape Latent Representations: A Dynamic Switching Approach [53.376029341079054]
We propose a combined generative and contrastive neural architecture for learning latent representations of 3D shapes.
The architecture uses two encoder branches for voxel grids and multi-view images from the same underlying shape.
Date: 2023-01-11T18:14:24Z
- CDDFuse: Correlation-Driven Dual-Branch Feature Decomposition for Multi-Modality Image Fusion [138.40422469153145]
We propose a novel Correlation-Driven feature Decomposition Fusion (CDDFuse) network.
We show that CDDFuse achieves promising results in multiple fusion tasks, including infrared-visible image fusion and medical image fusion.
Date: 2022-11-26T02:40:28Z
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.