Memory-Efficient Network for Large-scale Video Compressive Sensing
- URL: http://arxiv.org/abs/2103.03089v2
- Date: Fri, 5 Mar 2021 08:52:14 GMT
- Title: Memory-Efficient Network for Large-scale Video Compressive Sensing
- Authors: Ziheng Cheng, Bo Chen, Guanliang Liu, Hao Zhang, Ruiying Lu, Zhengjue
Wang, Xin Yuan
- Abstract summary: Video snapshot compressive imaging (SCI) captures a sequence of video frames in a single shot using a 2D detector.
In this paper, we develop a memory-efficient network for large-scale video SCI based on multi-group reversible 3D convolutional neural networks.
- Score: 21.040260603729227
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Video snapshot compressive imaging (SCI) captures a sequence of video frames
in a single shot using a 2D detector. The underlying principle is that during
one exposure time, different masks are imposed on the high-speed scene to form
a compressed measurement. With the knowledge of masks, optimization algorithms
or deep learning methods are employed to reconstruct the desired high-speed
video frames from this snapshot measurement. Unfortunately, though these
methods can achieve decent results, the long running time of optimization
algorithms or huge training memory occupation of deep networks still preclude
them in practical applications. In this paper, we develop a memory-efficient
network for large-scale video SCI based on multi-group reversible 3D
convolutional neural networks. In addition to the basic model for the grayscale
SCI system, we take one step further to combine demosaicing and SCI
reconstruction to directly recover color video from Bayer measurements.
Extensive results on both simulation and real data captured by SCI cameras
demonstrate that our proposed model outperforms previous state-of-the-art with
less memory and thus can be used in large-scale problems. The code is at
https://github.com/BoChenGroup/RevSCI-net.
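The two ideas in the abstract can be illustrated with a minimal sketch. It is not taken from the RevSCI-net code, and all function names are hypothetical. `sci_measure` shows the measurement principle: each frame is multiplied element-wise by its mask and the results are summed into one 2D snapshot. `rev_forward` and `rev_inverse` show a generic two-group additive coupling of the kind used in reversible networks; the paper's model uses multiple groups and 3D convolutions, so the toy functions `f` and `g` below are only stand-ins. Because the inputs of a reversible block can be recomputed from its outputs, intermediate activations do not have to be kept in memory during training.

```python
def sci_measure(frames, masks):
    """Snapshot Y[i][j] = sum over t of masks[t][i][j] * frames[t][i][j]."""
    height, width = len(frames[0]), len(frames[0][0])
    return [
        [sum(m[i][j] * x[i][j] for x, m in zip(frames, masks))
         for j in range(width)]
        for i in range(height)
    ]


# Stand-ins for learned sub-networks (assumption: any function works here,
# and neither has to be invertible for the coupling to be reversible).
def f(v):
    return [a * a for a in v]


def g(v):
    return [abs(a) + 1 for a in v]


def rev_forward(x1, x2):
    """Additive coupling: y1 = x1 + f(x2), y2 = x2 + g(y1)."""
    y1 = [a + b for a, b in zip(x1, f(x2))]
    y2 = [a + b for a, b in zip(x2, g(y1))]
    return y1, y2


def rev_inverse(y1, y2):
    """Recover the inputs from the outputs: x2 = y2 - g(y1), x1 = y1 - f(x2)."""
    x2 = [a - b for a, b in zip(y2, g(y1))]
    x1 = [a - b for a, b in zip(y1, f(x2))]
    return x1, x2


# Two 2x2 frames compressed into one 2x2 snapshot with binary masks.
frames = [[[1, 2], [3, 4]], [[5, 6], [7, 8]]]
masks = [[[1, 0], [1, 0]], [[0, 1], [1, 1]]]
snapshot = sci_measure(frames, masks)

# Round trip through the reversible block (integers keep it exact).
y1, y2 = rev_forward([1, -2, 3], [4, 0, -1])
x1, x2 = rev_inverse(y1, y2)
```

Here `snapshot` is a single 2D array no matter how many frames are encoded, and `x1`, `x2` equal the original inputs. With floating-point features the round trip holds only up to rounding error.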
Related papers
- SIGMA: Sinkhorn-Guided Masked Video Modeling [69.31715194419091]
Sinkhorn-guided Masked Video Modelling (SIGMA) is a novel video pretraining method.
We distribute features of space-time tubes evenly across a limited number of learnable clusters.
Experimental results on ten datasets validate the effectiveness of SIGMA in learning more performant, temporally-aware, and robust video representations.
arXiv Detail & Related papers (2024-07-22T08:04:09Z)
- Deep Optics for Video Snapshot Compressive Imaging [10.830072985735175]
Video snapshot compressive imaging (SCI) aims to capture a sequence of video frames with only a single shot of a 2D detector.
This paper presents a framework to jointly optimize masks and a reconstruction network.
We believe this is a milestone for real-world video SCI.
arXiv Detail & Related papers (2024-04-08T08:04:44Z)
- Splatter Image: Ultra-Fast Single-View 3D Reconstruction [67.96212093828179]
Splatter Image is based on Gaussian Splatting, which allows fast and high-quality reconstruction of 3D scenes from multiple images.
We learn a neural network that, at test time, performs reconstruction in a feed-forward manner, at 38 FPS.
On several synthetic, real, multi-category and large-scale benchmark datasets, we achieve better results in terms of PSNR, LPIPS, and other metrics while training and evaluating much faster than prior works.
arXiv Detail & Related papers (2023-12-20T16:14:58Z)
- A Simple Recipe for Contrastively Pre-training Video-First Encoders Beyond 16 Frames [54.90226700939778]
We build on the common paradigm of transferring large-scale image-text models to video via shallow temporal fusion.
We expose two limitations to the approach: (1) decreased spatial capabilities, likely due to poor video-language alignment in standard video datasets, and (2) higher memory consumption, bottlenecking the number of frames that can be processed.
arXiv Detail & Related papers (2023-12-12T16:10:19Z)
- EfficientSCI: Densely Connected Network with Space-time Factorization for Large-scale Video Snapshot Compressive Imaging [6.8372546605486555]
We show that a UHD color video with high compression ratio can be reconstructed from a snapshot 2D measurement using a single end-to-end deep learning model with PSNR above 32 dB.
Our method significantly outperforms all previous SOTA algorithms with better real-time performance.
arXiv Detail & Related papers (2023-05-17T07:28:46Z)
- GLEAM: Greedy Learning for Large-Scale Accelerated MRI Reconstruction [50.248694764703714]
Unrolled neural networks have recently achieved state-of-the-art accelerated MRI reconstruction.
These networks unroll iterative optimization algorithms by alternating between physics-based consistency and neural-network based regularization.
We propose Greedy LEarning for Accelerated MRI reconstruction, an efficient training strategy for high-dimensional imaging settings.
arXiv Detail & Related papers (2022-07-18T06:01:29Z)
- Dual-view Snapshot Compressive Imaging via Optical Flow Aided Recurrent Neural Network [14.796204921975733]
Dual-view snapshot compressive imaging (SCI) aims to capture videos from two fields of view (FoVs) in a single snapshot.
It is challenging for existing model-based decoding algorithms to reconstruct each individual scene.
We propose an optical flow-aided recurrent neural network for dual video SCI systems, which provides high-quality decoding in seconds.
arXiv Detail & Related papers (2021-09-11T14:24:44Z)
- 10-mega pixel snapshot compressive imaging with a hybrid coded aperture [48.95666098332693]
High resolution images are widely used in our daily life, whereas high-speed video capture is challenging due to the low frame rate of cameras working at the high resolution mode.
Snapshot compressive imaging (SCI) was proposed as a solution to the low throughput of existing imaging systems.
arXiv Detail & Related papers (2021-06-30T01:09:24Z)
- MetaSCI: Scalable and Adaptive Reconstruction for Video Compressive Sensing [21.243762976995544]
Video snapshot compressive imaging (SCI) is a promising system, where the video frames are coded by different masks and then compressed to a snapshot measurement.
We develop a Meta Modulated Convolutional Network for SCI reconstruction, dubbed MetaSCI.
arXiv Detail & Related papers (2021-03-02T14:53:00Z)
- Plug-and-Play Algorithms for Video Snapshot Compressive Imaging [41.818167109996885]
We consider the reconstruction problem of video snapshot compressive imaging (SCI) using a low-speed 2D sensor (detector).
The underlying principle of SCI is to modulate frames with different masks; the encoded frames are then integrated into a snapshot on the sensor.
Applying SCI to large-scale problems (HD or UHD videos) in our daily life is still challenging, and one bottleneck lies in the reconstruction algorithm.
arXiv Detail & Related papers (2021-01-13T00:51:49Z)
- A Real-time Action Representation with Temporal Encoding and Deep Compression [115.3739774920845]
We propose a new real-time convolutional architecture, called Temporal Convolutional 3D Network (T-C3D), for action representation.
T-C3D learns video action representations in a hierarchical multi-granularity manner while obtaining a high process speed.
On the UCF101 action recognition benchmark, our method improves on state-of-the-art real-time methods by 5.4% in accuracy and is 2 times faster at inference, with a model that requires less than 5 MB of storage.
arXiv Detail & Related papers (2020-06-17T06:30:43Z)