StereoINR: Cross-View Geometry Consistent Stereo Super Resolution with Implicit Neural Representation
- URL: http://arxiv.org/abs/2505.05509v2
- Date: Sat, 05 Jul 2025 05:53:04 GMT
- Title: StereoINR: Cross-View Geometry Consistent Stereo Super Resolution with Implicit Neural Representation
- Authors: Yi Liu, Xinyi Liu, Yi Wan, Panwang Xia, Qiong Wu, Yongjun Zhang
- Abstract summary: Stereo image super-resolution (SSR) aims to enhance high-resolution details by leveraging information from stereo image pairs. Previous upsampling methods use convolution to independently process deep features of different views, lacking cross-view and non-local information perception. We propose Stereo Implicit Neural Representation (StereoINR), which innovatively models stereo image pairs as continuous implicit representations. This continuous representation breaks through the scale limitations, providing a unified solution for arbitrary-scale stereo super-resolution reconstruction of left-right views.
- Score: 15.167871410210353
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Stereo image super-resolution (SSR) aims to enhance high-resolution details by leveraging information from stereo image pairs. However, existing SSR upsampling methods (e.g., pixel shuffle) often overlook cross-view geometric consistency and are limited to fixed-scale upsampling. The key issue is that previous upsampling methods use convolution to process the deep features of each view independently; they lack cross-view and non-local information perception, which makes it difficult to adaptively select beneficial information from multi-view scenes. In this work, we propose Stereo Implicit Neural Representation (StereoINR), which innovatively models stereo image pairs as continuous implicit representations. This continuous representation breaks through the scale limitations, providing a unified solution for arbitrary-scale stereo super-resolution reconstruction of left-right views. Furthermore, by incorporating spatial warping and cross-attention mechanisms, StereoINR enables effective cross-view information fusion and achieves significant improvements in pixel-level geometric consistency. Extensive experiments across multiple datasets show that StereoINR outperforms existing methods at upsampling scales outside the training distribution and matches state-of-the-art SSR methods at scales within the training distribution.
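The arbitrary-scale idea in the abstract can be illustrated with a minimal sketch: once an image is treated as a continuous function of position, any output resolution is just a different set of query coordinates. The code below is not the StereoINR model. It assumes a single grayscale view stored as nested lists, and it uses plain bilinear interpolation as a stand-in for the learned decoder; the function names `query_coords`, `sample_bilinear`, and `upsample` are hypothetical. Cross-view warping and cross-attention are omitted.

```python
import math


def query_coords(lr_size, scale):
    """Centers of the output pixels, expressed in low-resolution index units."""
    hr_size = max(1, round(lr_size * scale))
    # Output pixel i has its center at (i + 0.5) / hr_size of the image extent;
    # rescale to LR pixels and shift so that LR pixel centers sit at integers.
    return [(i + 0.5) * lr_size / hr_size - 0.5 for i in range(hr_size)]


def sample_bilinear(img, y, x):
    """Evaluate the image at a continuous location (stand-in for a learned decoder)."""
    h, w = len(img), len(img[0])
    y = min(max(y, 0.0), h - 1.0)  # clamp queries that fall outside the grid
    x = min(max(x, 0.0), w - 1.0)
    y0, x0 = int(math.floor(y)), int(math.floor(x))
    y1, x1 = min(y0 + 1, h - 1), min(x0 + 1, w - 1)
    wy, wx = y - y0, x - x0
    top = img[y0][x0] * (1 - wx) + img[y0][x1] * wx
    bottom = img[y1][x0] * (1 - wx) + img[y1][x1] * wx
    return top * (1 - wy) + bottom * wy


def upsample(img, scale):
    """Query the continuous representation on a grid for any real-valued scale."""
    ys = query_coords(len(img), scale)
    xs = query_coords(len(img[0]), scale)
    return [[sample_bilinear(img, y, x) for x in xs] for y in ys]


if __name__ == "__main__":
    lr = [[0.0, 1.0], [2.0, 3.0]]
    for s in (2, 2.5, 3.7):  # integer and non-integer scales use the same code path
        out = upsample(lr, s)
        print(s, len(out), len(out[0]))
```

Because the scale only changes the query grid, the same function serves integer and non-integer factors, which is the property a fixed-scale upsampler such as pixel shuffle does not have.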
Related papers
- Rotation Equivariant Arbitrary-scale Image Super-Resolution [62.41329042683779]
Arbitrary-scale image super-resolution (ASISR) aims to achieve arbitrary-scale high-resolution recoveries from a low-resolution input image. In this study, we work toward constructing a rotation-equivariant ASISR method.
arXiv Detail & Related papers (2025-08-07T08:51:03Z)
- A Global Depth-Range-Free Multi-View Stereo Transformer Network with Pose Embedding [76.44979557843367]
We propose a novel multi-view stereo (MVS) framework that gets rid of the depth range prior. We introduce a Multi-view Disparity Attention (MDA) module to aggregate long-range context information. We explicitly estimate the quality of the current pixel corresponding to sampled points on the epipolar line of the source image.
arXiv Detail & Related papers (2024-11-04T08:50:16Z)
- QMambaBSR: Burst Image Super-Resolution with Query State Space Model [55.56075874424194]
Burst super-resolution aims to reconstruct high-resolution images with higher quality and richer details by fusing the sub-pixel information from multiple burst low-resolution frames. In BurstSR, the key challenge lies in extracting the sub-pixel details that complement the base frame's content while simultaneously suppressing high-frequency noise disturbance. We introduce a novel Query Mamba Burst Super-Resolution (QMambaBSR) network, which incorporates a Query State Space Model (QSSM) and an Adaptive Up-sampling module (AdaUp).
arXiv Detail & Related papers (2024-08-16T11:15:29Z)
- Low-light Stereo Image Enhancement and De-noising in the Low-frequency Information Enhanced Image Space [5.1569866461097185]
Methods are proposed to perform enhancement and de-noising simultaneously.
A low-frequency information enhanced module (IEM) is proposed to suppress noise and produce a new image space.
A cross-channel and spatial context information mining module (CSM) is proposed to encode long-range spatial dependencies.
An encoder-decoder structure is constructed, incorporating cross-view and cross-scale feature interactions.
arXiv Detail & Related papers (2024-01-15T15:03:32Z)
- Toward Real World Stereo Image Super-Resolution via Hybrid Degradation Model and Discriminator for Implied Stereo Image Information [10.957275128743529]
Real-world stereo image super-resolution has a significant influence on enhancing the performance of computer vision systems.
Existing methods for single-image super-resolution can be applied to improve stereo images.
This paper proposes a novel approach that integrates an implicit stereo information discriminator and a hybrid degradation model.
arXiv Detail & Related papers (2023-12-13T07:24:50Z)
- Cross-View Hierarchy Network for Stereo Image Super-Resolution [14.574538513341277]
Stereo image super-resolution aims to improve the quality of high-resolution stereo image pairs by exploiting complementary information across views.
We propose a novel method, named Cross-View-Hierarchy Network for Stereo Image Super-Resolution (CVHSSR).
CVHSSR achieves better stereo image super-resolution performance than other state-of-the-art methods while using fewer parameters.
arXiv Detail & Related papers (2023-04-13T03:11:30Z)
- Rank-Enhanced Low-Dimensional Convolution Set for Hyperspectral Image Denoising [50.039949798156826]
This paper tackles the challenging problem of hyperspectral (HS) image denoising.
We propose a rank-enhanced low-dimensional convolution set (Re-ConvSet).
We then incorporate Re-ConvSet into the widely-used U-Net architecture to construct an HS image denoising method.
arXiv Detail & Related papers (2022-07-09T13:35:12Z)
- RIAV-MVS: Recurrent-Indexing an Asymmetric Volume for Multi-View Stereo [20.470182157606818]
The "learning-to-optimize" paradigm iteratively indexes a plane-sweeping cost volume and regresses the depth map via a convolutional Gated Recurrent Unit (GRU).
We conduct extensive experiments on real-world MVS datasets and show that our method achieves state-of-the-art performance in terms of both within-dataset evaluation and cross-dataset generalization.
arXiv Detail & Related papers (2022-05-28T03:32:56Z)
- A New Dataset and Transformer for Stereoscopic Video Super-Resolution [4.332879001008757]
Stereo video super-resolution (SVSR) aims to enhance the resolution of a low-resolution stereo video by reconstructing its high-resolution counterpart.
Key challenges in SVSR are preserving stereo consistency and temporal consistency, without which viewers may experience 3D fatigue.
In this paper, we propose a novel Transformer-based model for SVSR, namely Trans-SVSR.
arXiv Detail & Related papers (2022-04-21T11:49:29Z)
- Cross-MPI: Cross-scale Stereo for Image Super-Resolution using Multiplane Images [44.85260985973405]
Cross-MPI is an end-to-end RefSR network composed of a novel plane-aware MPI mechanism, a multiscale guided upsampling module and a super-resolution synthesis and fusion module.
Experimental results on both digitally synthesized and optical zoom cross-scale data show that the Cross-MPI framework can achieve superior performance against the existing RefSR methods.
arXiv Detail & Related papers (2020-11-30T09:14:07Z)
- A Parallel Down-Up Fusion Network for Salient Object Detection in Optical Remote Sensing Images [82.87122287748791]
We propose a novel Parallel Down-up Fusion network (PDF-Net) for salient object detection in optical remote sensing images (RSIs).
It takes full advantage of the in-path low- and high-level features and cross-path multi-resolution features to distinguish diversely scaled salient objects and suppress the cluttered backgrounds.
Experiments on the ORSSD dataset demonstrate that the proposed network is superior to the state-of-the-art approaches both qualitatively and quantitatively.
arXiv Detail & Related papers (2020-10-02T05:27:57Z)
- Learning Enriched Features for Real Image Restoration and Enhancement [166.17296369600774]
Convolutional neural networks (CNNs) have achieved dramatic improvements over conventional approaches for image restoration tasks.
We present a novel architecture with the collective goal of maintaining spatially-precise high-resolution representations through the entire network.
Our approach learns an enriched set of features that combines contextual information from multiple scales, while simultaneously preserving the high-resolution spatial details.
arXiv Detail & Related papers (2020-03-15T11:04:30Z)
This list is automatically generated from the titles and abstracts of the papers in this site.