SIR-DIFF: Sparse Image Sets Restoration with Multi-View Diffusion Model
- URL: http://arxiv.org/abs/2503.14463v1
- Date: Tue, 18 Mar 2025 17:42:34 GMT
- Title: SIR-DIFF: Sparse Image Sets Restoration with Multi-View Diffusion Model
- Authors: Yucheng Mao, Boyang Wang, Nilesh Kulkarni, Jeong Joon Park
- Abstract summary: We tackle image restoration from a different perspective by jointly denoising multiple photographs of the same scene. Our core hypothesis is that degraded images capturing a shared scene contain complementary information that, when combined, better constrains the restoration problem. Our model is trained to output 3D consistent images, making it a promising tool for applications requiring robust multi-view integration.
- Score: 9.821302965981515
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The computer vision community has developed numerous techniques for digitally restoring true scene information from single-view degraded photographs, an important yet extremely ill-posed task. In this work, we tackle image restoration from a different perspective by jointly denoising multiple photographs of the same scene. Our core hypothesis is that degraded images capturing a shared scene contain complementary information that, when combined, better constrains the restoration problem. To this end, we implement a powerful multi-view diffusion model that jointly generates uncorrupted views by extracting rich information from multi-view relationships. Our experiments show that our multi-view approach outperforms existing single-view image and even video-based methods on image deblurring and super-resolution tasks. Critically, our model is trained to output 3D consistent images, making it a promising tool for applications requiring robust multi-view integration, such as 3D reconstruction or pose estimation.
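The abstract describes jointly generating uncorrupted views by extracting information from multi-view relationships, without detailing the architecture. Below is a minimal, hypothetical sketch (not the SIR-DIFF model; the class and function names, shapes, and patch embedding are illustrative assumptions) of how degraded views of one scene could exchange information through cross-view attention during a denoising step.

```python
# Minimal sketch of joint multi-view denoising with cross-view attention.
# NOT the SIR-DIFF architecture: CrossViewBlock, joint_denoise_step, and all
# shapes are hypothetical, chosen only to illustrate the idea of letting
# degraded views of the same scene exchange information while denoising.
import torch
import torch.nn as nn


class CrossViewBlock(nn.Module):
    """Self-attention over the tokens of all views at once, so every
    view can borrow evidence from every other view of the scene."""

    def __init__(self, dim: int, heads: int = 8):
        super().__init__()
        self.norm = nn.LayerNorm(dim)
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, views, tokens, dim) -> flatten views and tokens together
        b, v, t, d = x.shape
        tokens = x.reshape(b, v * t, d)
        h = self.norm(tokens)
        attn_out, _ = self.attn(h, h, h)          # cross-view attention
        return (tokens + attn_out).reshape(b, v, t, d)


def joint_denoise_step(block: CrossViewBlock,
                       noisy_views: torch.Tensor) -> torch.Tensor:
    """One illustrative step: embed each view, mix information across
    views, and return the mixed tokens (timestep conditioning and the
    decoder back to per-view noise predictions are omitted for brevity)."""
    b, v, c, h, w = noisy_views.shape
    dim = block.norm.normalized_shape[0]
    # Toy patch embedding, created inline only to keep the sketch short:
    # 8x8 patches projected to `dim` channels.
    embed = nn.Conv2d(c, dim, kernel_size=8, stride=8)
    feats = embed(noisy_views.flatten(0, 1))       # (b*v, dim, h/8, w/8)
    tokens = feats.flatten(2).transpose(1, 2)      # (b*v, tokens, dim)
    tokens = tokens.reshape(b, v, -1, dim)
    return block(tokens)                           # cross-view mixing


views = torch.randn(1, 4, 3, 64, 64)   # 4 degraded views of one scene
out = joint_denoise_step(CrossViewBlock(dim=128), views)
print(out.shape)                        # torch.Size([1, 4, 64, 128])
```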
Related papers
- UniCoRN: Latent Diffusion-based Unified Controllable Image Restoration Network across Multiple Degradations [4.892790389883125]
We propose UniCoRN, a unified image restoration approach capable of handling multiple degradation types simultaneously.
Specifically, we uncover the potential of low-level visual cues extracted from images in guiding a controllable diffusion model.
We also introduce MetaRestore, a metalens imaging benchmark containing images with multiple degradations and artifacts.
arXiv Detail & Related papers (2025-03-20T05:42:13Z) - MEAT: Multiview Diffusion Model for Human Generation on Megapixels with Mesh Attention [83.56588173102594]
We introduce a solution called mesh attention to enable training at 1024x1024 resolution.
This approach significantly reduces the complexity of multiview attention while maintaining cross-view consistency.
Building on this foundation, we devise a mesh attention block and combine it with keypoint conditioning to create our human-specific multiview diffusion model, MEAT.
arXiv Detail & Related papers (2025-03-11T17:50:59Z) - Era3D: High-Resolution Multiview Diffusion using Efficient Row-wise Attention [87.02613021058484]
We introduce Era3D, a novel multiview diffusion method that generates high-resolution multiview images from a single-view image. Era3D generates high-quality multiview images at up to 512x512 resolution while reducing computational complexity by 12x.
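The abstract does not spell out the attention mechanism; as a rough illustration of the general row-wise idea (names and shapes here are hypothetical, not the Era3D implementation), each feature row attends only to the matching row of the other views instead of to every token of every view, which is what drives the complexity reduction.

```python
# Sketch of row-wise cross-view attention (illustrative, not Era3D's exact
# formulation). When multiview images share aligned camera settings,
# corresponding content lies on the same image row, so each row only needs
# to attend across views within that row.
import torch
import torch.nn as nn


def rowwise_multiview_attention(x: torch.Tensor,
                                attn: nn.MultiheadAttention) -> torch.Tensor:
    # x: (views, height, width, dim)
    v, h, w, d = x.shape
    # Group tokens by row: each attention call sees v*w tokens (one row from
    # every view) instead of v*h*w tokens, cutting the quadratic attention
    # cost by roughly a factor of the image height.
    rows = x.permute(1, 0, 2, 3).reshape(h, v * w, d)    # (h, v*w, d)
    out, _ = attn(rows, rows, rows)
    return out.reshape(h, v, w, d).permute(1, 0, 2, 3)


x = torch.randn(6, 32, 32, 64)                 # 6 views, 32x32 feature maps
attn = nn.MultiheadAttention(64, 4, batch_first=True)
print(rowwise_multiview_attention(x, attn).shape)   # torch.Size([6, 32, 32, 64])
```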
arXiv Detail & Related papers (2024-05-19T17:13:16Z) - Many-to-many Image Generation with Auto-regressive Diffusion Models [59.5041405824704]
This paper introduces a domain-general framework for many-to-many image generation, capable of producing interrelated image series from a given set of images.
We present MIS, a novel large-scale multi-image dataset, containing 12M synthetic multi-image samples, each with 25 interconnected images.
We learn M2M, an autoregressive model for many-to-many generation, where each image is modeled within a diffusion framework.
arXiv Detail & Related papers (2024-04-03T23:20:40Z) - VideoMV: Consistent Multi-View Generation Based on Large Video Generative Model [34.35449902855767]
Two fundamental questions are what data to use for training and how to ensure multi-view consistency.
We propose a dense consistent multi-view generation model that is fine-tuned from off-the-shelf video generative models.
Our approach can generate 24 dense views and converges much faster in training than state-of-the-art approaches.
arXiv Detail & Related papers (2024-03-18T17:48:15Z) - Learning Enriched Features for Fast Image Restoration and Enhancement [166.17296369600774]
This paper pursues the holistic goal of maintaining spatially precise, high-resolution representations throughout the entire network.
We learn an enriched set of features that combines contextual information from multiple scales, while simultaneously preserving the high-resolution spatial details.
Our approach achieves state-of-the-art results for a variety of image processing tasks, including defocus deblurring, image denoising, super-resolution, and image enhancement.
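As a rough sketch of the stated idea, combining contextual information from multiple scales while keeping a full-resolution stream, the following toy module (hypothetical names; not the paper's actual architecture) fuses a downsampled context branch back into the full-resolution branch.

```python
# Minimal sketch of multi-scale feature fusion with a preserved
# full-resolution stream (illustrative only, not the paper's architecture).
import torch
import torch.nn as nn
import torch.nn.functional as F


class MultiScaleFusion(nn.Module):
    def __init__(self, channels: int = 32):
        super().__init__()
        self.high = nn.Conv2d(channels, channels, 3, padding=1)  # full resolution
        self.low = nn.Conv2d(channels, channels, 3, padding=1)   # half resolution
        self.fuse = nn.Conv2d(2 * channels, channels, 1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        hi = self.high(x)                      # spatial detail at full resolution
        lo = self.low(F.avg_pool2d(x, 2))      # broader context at half resolution
        lo_up = F.interpolate(lo, size=hi.shape[-2:], mode="bilinear",
                              align_corners=False)
        # Fuse the upsampled context back into the high-resolution stream.
        return x + self.fuse(torch.cat([hi, lo_up], dim=1))


feat = torch.randn(1, 32, 128, 128)
print(MultiScaleFusion()(feat).shape)   # torch.Size([1, 32, 128, 128])
```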
arXiv Detail & Related papers (2022-04-19T17:59:45Z) - DeepMultiCap: Performance Capture of Multiple Characters Using Sparse Multiview Cameras [63.186486240525554]
DeepMultiCap is a novel method for multi-person performance capture using sparse multi-view cameras.
Our method can capture time-varying surface details without the need for pre-scanned template models.
arXiv Detail & Related papers (2021-05-01T14:32:13Z) - Learning Enriched Features for Real Image Restoration and Enhancement [166.17296369600774]
Convolutional neural networks (CNNs) have achieved dramatic improvements over conventional approaches for the image restoration task.
We present a novel architecture with the collective goal of maintaining spatially precise, high-resolution representations throughout the entire network.
Our approach learns an enriched set of features that combines contextual information from multiple scales, while simultaneously preserving the high-resolution spatial details.
arXiv Detail & Related papers (2020-03-15T11:04:30Z)