Event-guided Multi-patch Network with Self-supervision for Non-uniform
Motion Deblurring
- URL: http://arxiv.org/abs/2302.07689v1
- Date: Tue, 14 Feb 2023 15:58:00 GMT
- Title: Event-guided Multi-patch Network with Self-supervision for Non-uniform
Motion Deblurring
- Authors: Hongguang Zhang, Limeng Zhang, Yuchao Dai, Hongdong Li, Piotr Koniusz
- Abstract summary: We present a novel self-supervised event-guided deep hierarchical Multi-patch Network to deal with blurry images and videos.
We also propose an event-guided architecture to exploit motion cues contained in videos to tackle complex blur in videos.
Our MPN achieves the state of the art on the GoPro and VideoDeblurring datasets with a 40x faster runtime compared to current multi-scale methods.
- Score: 113.96237446327795
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Contemporary deep learning multi-scale deblurring models suffer from many
issues: 1) They perform poorly on non-uniformly blurred images/videos; 2)
Simply increasing the model depth with finer-scale levels cannot improve
deblurring; 3) Individual RGB frames contain limited motion information for
deblurring; 4) Previous models have limited robustness to spatial
transformations and noise. Below, we extend the DMPHN model by several
mechanisms to address the above issues: I) We present a novel self-supervised
event-guided deep hierarchical Multi-patch Network (MPN) to deal with blurry
images and videos via fine-to-coarse hierarchical localized representations;
II) We propose a novel stacked pipeline, StackMPN, to improve the deblurring
performance under the increased network depth; III) We propose an event-guided
architecture to exploit motion cues contained in videos to tackle complex blur
in videos; IV) We propose a novel self-supervised step to expose the model to
random transformations (rotations, scale changes) and make it robust to
Gaussian noise. Our MPN achieves the state of the art on the GoPro and
VideoDeblurring datasets with a 40x faster runtime compared to current multi-scale
methods. With 30ms to process an image at 1280x720 resolution, it is the first
real-time deep motion deblurring model for 720p images at 30fps. For StackMPN,
we obtain significant improvements of over 1.2dB on the GoPro dataset by
increasing the network depth. Utilizing the event information and
self-supervision further boosts results to 33.83dB.
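For intuition, the fine-to-coarse multi-patch idea can be sketched in a few lines of PyTorch: the finest level encodes small patches, and its stitched residual conditions the coarser level that sees the whole image. This is a minimal two-level sketch with illustrative names and layer sizes, not the authors' implementation:

```python
# Minimal two-level multi-patch sketch (illustrative, not the authors' code).
import torch
import torch.nn as nn

class ConvBlock(nn.Module):
    """Tiny stand-in for the per-level encoder/decoder CNNs."""
    def __init__(self, in_ch, out_ch):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(in_ch, 32, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(32, out_ch, 3, padding=1),
        )

    def forward(self, x):
        return self.net(x)

class TwoLevelMPN(nn.Module):
    """Level 2 sees two half-patches; level 1 sees the full image."""
    def __init__(self):
        super().__init__()
        self.enc2, self.dec2 = ConvBlock(3, 32), ConvBlock(32, 3)
        self.enc1, self.dec1 = ConvBlock(6, 32), ConvBlock(32, 3)

    def forward(self, blurry):
        H = blurry.shape[2]
        top, bottom = blurry[:, :, :H // 2], blurry[:, :, H // 2:]
        # Finest level: encode each patch independently, stitch features back.
        feat2 = torch.cat([self.enc2(top), self.enc2(bottom)], dim=2)
        res2 = self.dec2(feat2)  # fine-level residual estimate
        # Coarse level: full image conditioned on the finer level's output.
        feat1 = self.enc1(torch.cat([blurry, res2], dim=1))
        return blurry + self.dec1(feat1)  # sharp estimate

x = torch.randn(1, 3, 128, 128)
print(TwoLevelMPN()(x).shape)  # torch.Size([1, 3, 128, 128])
```

The self-supervised step (item IV) can then be realized by passing a randomly rotated, rescaled, or Gaussian-noise-corrupted copy of the input through the same network and penalizing disagreement between the two deblurred outputs.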
Related papers
- Pixel-Aligned Multi-View Generation with Depth Guided Decoder [86.1813201212539]
We propose a novel method for pixel-level image-to-multi-view generation.
Unlike prior work, we incorporate attention layers across multi-view images in the VAE decoder of a latent video diffusion model.
Our model enables better pixel alignment across multi-view images.
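As a rough illustration of attention across multi-view images, all views can be flattened into one token sequence so that each spatial location attends to every other view. The sketch below is a generic cross-view self-attention layer with made-up shapes, not the paper's VAE decoder:

```python
# Generic cross-view attention sketch (illustrative, not the paper's code).
import torch
import torch.nn as nn

def cross_view_attention(feats, attn):
    """feats: (V, C, H, W) decoder features for V views. Flattening all
    views into one sequence lets every token attend across views."""
    V, C, H, W = feats.shape
    tokens = feats.permute(0, 2, 3, 1).reshape(1, V * H * W, C)
    out, _ = attn(tokens, tokens, tokens)  # self-attention over all views
    return out.reshape(V, H, W, C).permute(0, 3, 1, 2)

attn = nn.MultiheadAttention(embed_dim=64, num_heads=4, batch_first=True)
print(cross_view_attention(torch.randn(4, 64, 8, 8), attn).shape)  # (4, 64, 8, 8)
```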
arXiv Detail & Related papers (2024-08-26T04:56:41Z)
- Era3D: High-Resolution Multiview Diffusion using Efficient Row-wise Attention [87.02613021058484]
We introduce Era3D, a novel multiview diffusion method that generates high-resolution multiview images from a single-view image.
Era3D generates high-quality multiview images at up to 512x512 resolution while reducing computational complexity by 12x.
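One plausible reading of row-wise attention is that tokens attend only within a single image row across all views, cutting the sequence length per attention call from V*H*W to V*W. A hedged sketch under that assumption (illustrative names, not Era3D's code):

```python
# Row-wise multiview attention sketch (an assumed reading, not Era3D's code).
import torch
import torch.nn as nn

def row_wise_attention(feats, attn):
    """feats: (V, C, H, W). Each row h attends only across the V*W tokens
    of that row over all views, instead of over all V*H*W tokens."""
    V, C, H, W = feats.shape
    rows = feats.permute(2, 0, 3, 1).reshape(H, V * W, C)  # a batch of H rows
    out, _ = attn(rows, rows, rows)
    return out.reshape(H, V, W, C).permute(1, 3, 0, 2)

attn = nn.MultiheadAttention(embed_dim=32, num_heads=4, batch_first=True)
print(row_wise_attention(torch.randn(4, 32, 16, 16), attn).shape)  # (4, 32, 16, 16)
```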
arXiv Detail & Related papers (2024-05-19T17:13:16Z)
- Shakes on a Plane: Unsupervised Depth Estimation from Unstabilized Photography [54.36608424943729]
We show that in a "long-burst", forty-two 12-megapixel RAW frames captured in a two-second sequence, there is enough parallax information from natural hand tremor alone to recover high-quality scene depth.
We devise a test-time optimization approach that fits a neural RGB-D representation to long-burst data and simultaneously estimates scene depth and camera motion.
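Test-time optimization here means fitting parameters directly to the burst rather than training on a dataset. The sketch below is a heavily simplified stand-in, using a per-pixel inverse-depth grid, per-frame 2D shifts, and a small-baseline parallax model in place of the paper's neural RGB-D representation and full camera model:

```python
# Simplified test-time optimization sketch (illustrative, not the paper's code).
import torch

def fit_depth_to_burst(frames, iters=500, lr=1e-2):
    """frames: (N, H, W) grayscale burst. Jointly optimizes a shared
    inverse-depth map and per-frame translations by minimizing photometric
    error against frame 0."""
    N, H, W = frames.shape
    inv_depth = torch.zeros(H, W, requires_grad=True)  # shared scene geometry
    shifts = torch.zeros(N, 2, requires_grad=True)     # per-frame hand tremor
    opt = torch.optim.Adam([inv_depth, shifts], lr=lr)
    ys, xs = torch.meshgrid(torch.linspace(-1, 1, H),
                            torch.linspace(-1, 1, W), indexing="ij")
    base = torch.stack([xs, ys], dim=-1)               # sampling grid (H, W, 2)
    for _ in range(iters):
        opt.zero_grad()
        loss = 0.0
        for i in range(1, N):
            # Parallax model: pixel motion = camera shift * inverse depth.
            flow = shifts[i].view(1, 1, 2) * inv_depth.unsqueeze(-1)
            warped = torch.nn.functional.grid_sample(
                frames[i][None, None], (base + flow)[None], align_corners=True)
            loss = loss + (warped[0, 0] - frames[0]).abs().mean()
        loss.backward()
        opt.step()
    return inv_depth.detach(), shifts.detach()
```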
arXiv Detail & Related papers (2022-12-22T18:54:34Z)
- Learning to Deblur and Rotate Motion-Blurred Faces [43.673660541417995]
We train a neural network to reconstruct a 3D video representation from a single image and the corresponding face gaze.
We then provide a camera viewpoint relative to the estimated gaze and the blurry image as input to an encoder-decoder network to generate a video of sharp frames with a novel camera viewpoint.
arXiv Detail & Related papers (2021-12-14T17:51:19Z)
- MEFNet: Multi-scale Event Fusion Network for Motion Deblurring [62.60878284671317]
Traditional frame-based cameras inevitably suffer from motion blur due to long exposure times.
As a bio-inspired sensor, the event camera records intensity changes asynchronously with high temporal resolution.
In this paper, we rethink the event-based image deblurring problem and unfold it into an end-to-end two-stage image restoration network.
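Before a CNN can consume them, asynchronous events are commonly rasterized into a voxel grid of signed counts over time bins. A minimal sketch of that standard conversion (illustrative; it does not reproduce the paper's multi-scale fusion modules):

```python
# Standard event-to-voxel-grid conversion (illustrative preprocessing step).
import torch

def events_to_voxel_grid(events, num_bins, H, W):
    """events: (N, 4) tensor of (t, x, y, polarity in {-1, +1}).
    Accumulates signed events into a (num_bins, H, W) grid, a common
    dense input representation for event-based deblurring CNNs."""
    t, x, y, p = events[:, 0], events[:, 1].long(), events[:, 2].long(), events[:, 3]
    t_norm = (t - t.min()) / (t.max() - t.min() + 1e-9)      # map times to [0, 1]
    b = (t_norm * num_bins).clamp(max=num_bins - 1).long()   # time-bin index
    grid = torch.zeros(num_bins, H, W)
    grid.index_put_((b, y, x), p, accumulate=True)           # signed counts
    return grid

events = torch.rand(1000, 4)
events[:, 1] *= 239; events[:, 2] *= 179                     # x < 240, y < 180
events[:, 3] = torch.where(events[:, 3] > 0.5, 1.0, -1.0)
print(events_to_voxel_grid(events, num_bins=5, H=180, W=240).shape)  # (5, 180, 240)
```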
arXiv Detail & Related papers (2021-11-30T23:18:35Z)
- DeepMultiCap: Performance Capture of Multiple Characters Using Sparse Multiview Cameras [63.186486240525554]
DeepMultiCap is a novel method for multi-person performance capture using sparse multi-view cameras.
Our method can capture time-varying surface details without the need for pre-scanned template models.
arXiv Detail & Related papers (2021-05-01T14:32:13Z)
- Memory-Efficient Network for Large-scale Video Compressive Sensing [21.040260603729227]
Video snapshot compressive imaging (SCI) captures a sequence of video frames in a single shot using a 2D detector.
In this paper, we develop a memory-efficient network for large-scale video SCI based on multi-group reversible 3D convolutional neural networks.
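Reversibility is what enables the memory savings: a block's inputs can be recomputed exactly from its outputs, so a memory-efficient implementation can recompute intermediate activations during the backward pass instead of storing them. A minimal RevNet-style 3D convolutional block illustrating the coupling (not the paper's exact architecture; a custom autograd function would be needed to realize the savings):

```python
# RevNet-style reversible 3D conv block (illustrative coupling only).
import torch
import torch.nn as nn

class Rev3DBlock(nn.Module):
    """Splits channels in two: y1 = x1 + F(x2), y2 = x2 + G(y1).
    Inputs are exactly recoverable from outputs via inverse()."""
    def __init__(self, ch):
        super().__init__()
        self.F = nn.Conv3d(ch // 2, ch // 2, 3, padding=1)
        self.G = nn.Conv3d(ch // 2, ch // 2, 3, padding=1)

    def forward(self, x):
        x1, x2 = x.chunk(2, dim=1)
        y1 = x1 + self.F(x2)
        y2 = x2 + self.G(y1)
        return torch.cat([y1, y2], dim=1)

    def inverse(self, y):
        y1, y2 = y.chunk(2, dim=1)
        x2 = y2 - self.G(y1)
        x1 = y1 - self.F(x2)
        return torch.cat([x1, x2], dim=1)

block = Rev3DBlock(16)
x = torch.randn(1, 16, 8, 32, 32)  # (batch, channels, frames, H, W)
with torch.no_grad():
    assert torch.allclose(block.inverse(block(x)), x, atol=1e-5)
```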
arXiv Detail & Related papers (2021-03-04T15:14:58Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.