RCNet: Deep Recurrent Collaborative Network for Multi-View Low-Light Image Enhancement
- URL: http://arxiv.org/abs/2409.04363v1
- Date: Fri, 6 Sep 2024 15:49:49 GMT
- Title: RCNet: Deep Recurrent Collaborative Network for Multi-View Low-Light Image Enhancement
- Authors: Hao Luo, Baoliang Chen, Lingyu Zhu, Peilin Chen, Shiqi Wang
- Abstract summary: We make the first attempt to investigate multi-view low-light image enhancement.
We propose a deep multi-view enhancement framework based on the Recurrent Collaborative Network (RCNet).
Experimental results demonstrate that our RCNet significantly outperforms other state-of-the-art methods.
- Score: 19.751696790765635
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Observing a scene from multiple perspectives provides a more comprehensive visual experience. However, when multiple views are acquired in the dark, views that should be highly correlated become severely inconsistent, making it challenging to improve scene understanding with auxiliary views. Recent single-image enhancement methods may fail to deliver consistently desirable restoration across all views because they ignore the potential feature correspondences among different views. To alleviate this issue, we make the first attempt to investigate multi-view low-light image enhancement. First, we construct a new dataset called Multi-View Low-light Triplets (MVLT), including 1,860 pairs of image triplets with large illumination ranges and wide noise distributions; each triplet captures the same scene from three different viewpoints. Second, we propose a deep multi-view enhancement framework based on the Recurrent Collaborative Network (RCNet). Specifically, to benefit from similar texture correspondences across views, we design a recurrent feature enhancement, alignment, and fusion (ReEAF) module, in which intra-view feature enhancement (Intra-view EN) followed by inter-view feature alignment and fusion (Inter-view AF) models intra-view and inter-view feature propagation sequentially via multi-view collaboration. In addition, two modules, enhancement-to-alignment (E2A) and alignment-to-enhancement (A2E), enable interactions between Intra-view EN and Inter-view AF, explicitly utilizing attentive feature weighting and sampling for enhancement and alignment, respectively. Experimental results demonstrate that our RCNet significantly outperforms other state-of-the-art methods. Our dataset, code, and model will be available at https://github.com/hluo29/RCNet.
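To make the ReEAF description concrete, here is a minimal PyTorch sketch of one recurrent step: per-view enhancement (Intra-view EN), E2A-style attentive weighting, inter-view fusion (Inter-view AF), and A2E-style feedback. The module shapes, gating choices, and fixed three-view fusion are our own illustrative assumptions, not the authors' released implementation.

```python
import torch
import torch.nn as nn

class ReEAFStep(nn.Module):
    """Illustrative sketch of one ReEAF-style recurrent step, not the paper's code:
    intra-view enhancement, E2A weighting, inter-view fusion, A2E feedback."""

    def __init__(self, channels: int = 32):
        super().__init__()
        self.intra_en = nn.Sequential(  # Intra-view EN: per-view feature enhancement
            nn.Conv2d(channels, channels, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(channels, channels, 3, padding=1),
        )
        self.e2a = nn.Conv2d(channels, channels, 1)       # enhancement-to-alignment weighting
        self.a2e = nn.Conv2d(channels, channels, 1)       # alignment-to-enhancement feedback
        self.fuse = nn.Conv2d(3 * channels, channels, 1)  # Inter-view AF over three views

    def forward(self, views: torch.Tensor) -> torch.Tensor:
        # views: (B, V=3, C, H, W) features, one slice per viewpoint
        B, V, C, H, W = views.shape
        enhanced = self.intra_en(views.flatten(0, 1)).view(B, V, C, H, W)
        # E2A: attentive weights derived from the enhanced features guide alignment
        weights = torch.sigmoid(self.e2a(enhanced.flatten(0, 1))).view(B, V, C, H, W)
        aligned = enhanced * weights
        # Inter-view AF: fuse all views into a shared representation
        fused = self.fuse(aligned.reshape(B, V * C, H, W))
        # A2E: feed the fused result back to refine each view's features
        feedback = torch.sigmoid(self.a2e(fused)).unsqueeze(1)
        return enhanced + aligned * feedback  # (B, V, C, H, W)

x = torch.randn(2, 3, 32, 64, 64)
print(ReEAFStep(32)(x).shape)  # torch.Size([2, 3, 32, 64, 64])
```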
Related papers
- SDI-Net: Toward Sufficient Dual-View Interaction for Low-light Stereo Image Enhancement [38.66838623890922]
Most low-light image enhancement methods only consider information from a single view.
We propose a model called Toward Sufficient Dual-View Interaction for Low-light Stereo Image Enhancement (SDI-Net).
We design a module named Cross-View Sufficient Interaction Module (CSIM) aiming to fully exploit the correlations between the binocular views via the attention mechanism.
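As a rough sketch of what attention-based dual-view interaction like CSIM could look like, the snippet below lets one view query the other view's features; the single-head design and 1x1 projections are assumptions for illustration, not the SDI-Net architecture.

```python
import torch
import torch.nn as nn

class CrossViewAttention(nn.Module):
    """Minimal cross-view attention sketch: each view queries the other's features."""

    def __init__(self, channels: int = 32):
        super().__init__()
        self.q = nn.Conv2d(channels, channels, 1)
        self.k = nn.Conv2d(channels, channels, 1)
        self.v = nn.Conv2d(channels, channels, 1)

    def forward(self, feat_a: torch.Tensor, feat_b: torch.Tensor) -> torch.Tensor:
        # feat_a attends to feat_b; both are (B, C, H, W)
        B, C, H, W = feat_a.shape
        q = self.q(feat_a).flatten(2).transpose(1, 2)         # (B, HW, C)
        k = self.k(feat_b).flatten(2)                         # (B, C, HW)
        v = self.v(feat_b).flatten(2).transpose(1, 2)         # (B, HW, C)
        attn = torch.softmax(q @ k / C ** 0.5, dim=-1)        # (B, HW, HW)
        out = (attn @ v).transpose(1, 2).reshape(B, C, H, W)  # B's content gathered for A
        return feat_a + out  # residual interaction

left, right = torch.randn(1, 32, 16, 16), torch.randn(1, 32, 16, 16)
enhanced_left = CrossViewAttention(32)(left, right)
```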
arXiv Detail & Related papers (2024-08-20T15:17:11Z)
- Multi-view Aggregation Network for Dichotomous Image Segmentation [76.75904424539543]
Dichotomous Image Segmentation (DIS) has recently emerged, targeting high-precision object segmentation from high-resolution natural images.
Existing methods rely on tedious multiple encoder-decoder streams and stages to gradually complete the global localization and local refinement.
Motivated by this, we model DIS as a multi-view object perception problem and propose a parsimonious multi-view aggregation network (MVANet).
Experiments on the popular DIS-5K dataset show that our MVANet significantly outperforms state-of-the-art methods in both accuracy and speed.
arXiv Detail & Related papers (2024-04-11T03:00:00Z)
- Consolidating Attention Features for Multi-view Image Editing [126.19731971010475]
We focus on spatial control-based geometric manipulations and introduce a method to consolidate the editing process across various views.
We introduce QNeRF, a neural radiance field trained on the internal query features of the edited images.
We refine the process through a progressive, iterative method that better consolidates queries across the diffusion timesteps.
arXiv Detail & Related papers (2024-02-22T18:50:18Z)
- MuVieCAST: Multi-View Consistent Artistic Style Transfer [6.767885381740952]
We introduce MuVieCAST, a modular multi-view consistent style transfer network architecture.
MuVieCAST supports both sparse and dense views, making it versatile enough to handle a wide range of multi-view image datasets.
arXiv Detail & Related papers (2023-12-08T14:01:03Z)
- M$^3$Net: Multi-view Encoding, Matching, and Fusion for Few-shot Fine-grained Action Recognition [80.21796574234287]
M$^3$Net is a matching-based framework for few-shot fine-grained (FS-FG) action recognition.
It incorporates multi-view encoding, multi-view matching, and multi-view fusion to facilitate embedding encoding, similarity matching, and decision making.
Explainable visualizations and experimental results demonstrate the superiority of M$^3$Net in capturing fine-grained action details.
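A toy rendering of this encode-match-fuse recipe: per-view embeddings are matched by cosine similarity and the per-view scores are fused by averaging. The tensor layout and mean-fusion rule are illustrative assumptions, not M$^3$Net's actual matching functions.

```python
import torch
import torch.nn.functional as F

def multi_view_match(query_views: torch.Tensor,
                     support_views: torch.Tensor) -> torch.Tensor:
    """query_views:   (V, D)    one embedding per view of the query clip
    support_views: (N, V, D) embeddings for N support classes
    returns:       (N,)      fused similarity score per class
    """
    q = F.normalize(query_views, dim=-1)     # (V, D) unit-norm embeddings
    s = F.normalize(support_views, dim=-1)   # (N, V, D)
    per_view = (s * q.unsqueeze(0)).sum(-1)  # (N, V) per-view cosine matches
    return per_view.mean(dim=-1)             # fuse views into one decision score

scores = multi_view_match(torch.randn(4, 128), torch.randn(5, 4, 128))
pred = scores.argmax()  # nearest-class decision
```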
arXiv Detail & Related papers (2023-08-06T09:15:14Z)
- Multi-Spectral Image Stitching via Spatial Graph Reasoning [52.27796682972484]
We propose a spatial graph reasoning based multi-spectral image stitching method.
We embed multi-scale complementary features from the same view position into a set of nodes.
By introducing long-range coherence along spatial and channel dimensions, the complementarity of pixel relations and channel interdependencies aids in the reconstruction of aligned multi-view features.
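The sketch below illustrates the general idea of graph reasoning over pooled multi-scale feature nodes with a similarity-based adjacency; the pooling scales and single propagation step are assumptions for illustration, not the paper's stitching pipeline.

```python
import torch
import torch.nn as nn

class GraphReasoning(nn.Module):
    """One message-passing step over feature nodes: a similarity-based
    adjacency provides long-range coherence across all nodes."""

    def __init__(self, dim: int = 64):
        super().__init__()
        self.proj = nn.Linear(dim, dim)

    def forward(self, nodes: torch.Tensor) -> torch.Tensor:
        # nodes: (B, N, D) pooled multi-scale features from one view position
        adj = torch.softmax(
            nodes @ nodes.transpose(1, 2) / nodes.shape[-1] ** 0.5, dim=-1)
        return nodes + torch.relu(self.proj(adj @ nodes))

# Build nodes by pooling a feature map at several scales (multi-scale embedding).
feat = torch.randn(1, 64, 32, 32)
nodes = torch.cat([
    torch.nn.functional.adaptive_avg_pool2d(feat, s).flatten(2).transpose(1, 2)
    for s in (1, 2, 4)
], dim=1)                      # (1, 1 + 4 + 16, 64) node vectors
out = GraphReasoning(64)(nodes)
```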
arXiv Detail & Related papers (2023-07-31T15:04:52Z)
- 3M3D: Multi-view, Multi-path, Multi-representation for 3D Object Detection [0.5156484100374059]
We propose 3M3D: a Multi-view, Multi-path, Multi-representation approach for 3D Object Detection.
We update both multi-view features and query features to enhance the representation of the scene in both fine panoramic view and coarse global view.
We show performance improvements on the nuScenes benchmark on top of our baselines.
arXiv Detail & Related papers (2023-02-16T11:28:30Z)
- A Simple Baseline for Multi-Camera 3D Object Detection [94.63944826540491]
3D object detection with surrounding cameras has been a promising direction for autonomous driving.
We present SimMOD, a Simple baseline for Multi-camera Object Detection.
We conduct extensive experiments on the 3D object detection benchmark of nuScenes to demonstrate the effectiveness of SimMOD.
arXiv Detail & Related papers (2022-08-22T03:38:01Z)
- VPFusion: Joint 3D Volume and Pixel-Aligned Feature Fusion for Single and Multi-view 3D Reconstruction [23.21446438011893]
Existing approaches use RNN, feature pooling, or attention computed independently in each view for multi-view fusion.
VPFusion attains high-quality reconstruction using both a 3D feature volume, to capture 3D-structure-aware context, and pixel-aligned image features, to capture fine local detail.
We show improved multi-view feature fusion by establishing transformer-based pairwise view association.
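A minimal stand-in for such transformer-based pairwise view association, using standard multi-head attention so that one view's features query another's before fusion; the token layout and residual fusion are illustrative assumptions, not VPFusion's architecture.

```python
import torch
import torch.nn as nn

# Pairwise view association: view A attends to view B (a full model would also
# do the reverse), so fusion is informed by cross-view correspondence rather
# than being computed independently per view.
attn = nn.MultiheadAttention(embed_dim=64, num_heads=4, batch_first=True)

view_a = torch.randn(1, 100, 64)  # (B, tokens, D) features from view A
view_b = torch.randn(1, 100, 64)  # features from view B

associated, _ = attn(query=view_a, key=view_b, value=view_b)
fused = view_a + associated  # residual fusion of the associated features
```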
arXiv Detail & Related papers (2022-03-14T23:30:58Z)
- Single-View View Synthesis with Multiplane Images [64.46556656209769]
Recent view-synthesis methods apply deep learning to generate multiplane images given two or more input images at known viewpoints.
Our method learns to predict a multiplane image directly from a single image input.
It additionally generates reasonable depth maps and fills in content behind the edges of foreground objects in background layers.
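For context, a multiplane image is rendered by standard back-to-front "over" compositing of its fronto-parallel RGBA planes; a single-view method of this kind predicts the per-plane color and alpha with a network. The sketch below shows only the standard compositing step, with illustrative tensor shapes.

```python
import torch

def composite_mpi(rgb: torch.Tensor, alpha: torch.Tensor) -> torch.Tensor:
    """Back-to-front 'over' compositing of a multiplane image (MPI).

    rgb:   (D, 3, H, W) color for D fronto-parallel planes, ordered back to front
    alpha: (D, 1, H, W) opacity per plane
    """
    out = torch.zeros_like(rgb[0])
    for d in range(rgb.shape[0]):  # iterate from the farthest plane forward
        out = rgb[d] * alpha[d] + out * (1.0 - alpha[d])
    return out

image = composite_mpi(torch.rand(32, 3, 64, 64), torch.rand(32, 1, 64, 64))
```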
arXiv Detail & Related papers (2020-04-23T17:59:19Z)