Hierarchical Interactive Reconstruction Network For Video Compressive
Sensing
- URL: http://arxiv.org/abs/2304.07473v1
- Date: Sat, 15 Apr 2023 04:57:57 GMT
- Title: Hierarchical Interactive Reconstruction Network For Video Compressive
Sensing
- Authors: Tong Zhang, Wenxue Cui, Chen Hui, Feng Jiang
- Abstract summary: We propose a novel Hierarchical InTeractive Video CS Reconstruction Network (HIT-VCSNet), which can cooperatively exploit deep priors in both the spatial and temporal domains.
In the temporal domain, a novel hierarchical interaction mechanism is proposed, which can cooperatively learn correlations among different frames in the multiscale space.
- Score: 17.27398750515051
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Deep network-based image and video Compressive Sensing (CS) has attracted
increasing attention in recent years. However, existing deep network-based CS methods
usually adopt a simple stacked convolutional network, which not only weakens the
perception of rich contextual prior knowledge but also limits the exploration of
temporal correlations between video frames. In this paper, we propose a novel
Hierarchical InTeractive Video CS Reconstruction Network (HIT-VCSNet), which
cooperatively exploits deep priors in both the spatial and temporal domains to improve
reconstruction quality. Specifically, in the spatial domain, a novel hierarchical
structure is designed to hierarchically extract deep features from keyframes and
non-keyframes. In the temporal domain, a novel hierarchical interaction mechanism is
proposed to cooperatively learn the correlations among different frames in a multiscale
space. Extensive experiments demonstrate that the proposed HIT-VCSNet outperforms
existing state-of-the-art video and image CS methods by a large margin.
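To make the two components above concrete, the following is a minimal, hypothetical sketch of (i) a hierarchical per-frame feature encoder and (ii) a per-scale interaction step between keyframe and non-keyframe features. All module names, channel widths, and the concatenate-and-fuse rule are illustrative assumptions and are not taken from the paper.

```python
# Illustrative sketch only: per-frame hierarchical (multi-scale) feature
# extraction plus a cross-frame interaction at every scale. Channel sizes
# and the fusion rule are assumptions, not HIT-VCSNet's actual design.
import torch
import torch.nn as nn


class HierarchicalEncoder(nn.Module):
    """Extracts a pyramid of features from a single frame."""

    def __init__(self, in_ch=1, base_ch=32, levels=3):
        super().__init__()
        self.stages = nn.ModuleList()
        ch = in_ch
        for i in range(levels):
            out_ch = base_ch * (2 ** i)
            self.stages.append(nn.Sequential(
                nn.Conv2d(ch, out_ch, 3, stride=2, padding=1),
                nn.ReLU(inplace=True)))
            ch = out_ch

    def forward(self, x):
        feats = []
        for stage in self.stages:
            x = stage(x)
            feats.append(x)
        return feats                      # one tensor per scale


class CrossFrameInteraction(nn.Module):
    """Fuses keyframe and non-keyframe features at one scale."""

    def __init__(self, ch):
        super().__init__()
        self.fuse = nn.Conv2d(2 * ch, ch, 1)

    def forward(self, key_feat, nonkey_feat):
        return self.fuse(torch.cat([key_feat, nonkey_feat], dim=1))


if __name__ == "__main__":
    enc = HierarchicalEncoder()
    key, nonkey = torch.randn(1, 1, 64, 64), torch.randn(1, 1, 64, 64)
    key_pyr, nonkey_pyr = enc(key), enc(nonkey)
    # Modules created on the fly here purely as a shape sanity check.
    fused = [CrossFrameInteraction(f.shape[1])(f, g)
             for f, g in zip(key_pyr, nonkey_pyr)]
    print([t.shape for t in fused])
```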
Related papers
- Eigenspace Restructuring: a Principle of Space and Frequency in Neural
Networks [11.480563447698172]
We show that the eigenstructure of infinite-width multilayer perceptrons (MLPs) depends solely on the concept frequency.
We show that the topologies from deep convolutional networks (CNNs) restructure the associated eigenspaces into finer subspaces.
The resulting fine-grained eigenstructure dramatically improves the network's learnability.
arXiv Detail & Related papers (2021-12-10T15:44:14Z)
- Image Compressed Sensing Using Non-local Neural Network [43.51101614942895]
In this paper, a novel image CS framework using a non-local neural network (NL-CSNet) is proposed.
In the proposed NL-CSNet, two non-local subnetworks are constructed to exploit non-local self-similarity priors.
In the multi-scale feature-domain subnetwork, the affinities between dense feature representations are explored.
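For context, the non-local self-similarity prior referred to above is commonly realized with a non-local block in the style of Wang et al.; the sketch below is a generic embedded-Gaussian variant and is only an assumption about the flavor of module NL-CSNet builds on, not its exact subnetwork design.

```python
# Generic embedded-Gaussian non-local block (hedged sketch): every spatial
# position attends to every other position via a softmax affinity matrix.
import torch
import torch.nn as nn


class NonLocalBlock(nn.Module):
    def __init__(self, ch, inner_ch=None):
        super().__init__()
        inner_ch = inner_ch or ch // 2
        self.theta = nn.Conv2d(ch, inner_ch, 1)
        self.phi = nn.Conv2d(ch, inner_ch, 1)
        self.g = nn.Conv2d(ch, inner_ch, 1)
        self.out = nn.Conv2d(inner_ch, ch, 1)

    def forward(self, x):
        b, c, h, w = x.shape
        q = self.theta(x).flatten(2).transpose(1, 2)   # (B, HW, C')
        k = self.phi(x).flatten(2)                     # (B, C', HW)
        v = self.g(x).flatten(2).transpose(1, 2)       # (B, HW, C')
        affinity = torch.softmax(q @ k, dim=-1)        # pairwise similarities
        y = (affinity @ v).transpose(1, 2).reshape(b, -1, h, w)
        return x + self.out(y)                         # residual connection
```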
arXiv Detail & Related papers (2021-12-07T14:06:12Z)
- Event and Activity Recognition in Video Surveillance for Cyber-Physical
Systems [0.0]
We show that the long-term motion patterns alone play a pivotal role in the task of recognizing an event.
Only the temporal features are exploited using a hybrid Convolutional Neural Network (CNN) + Recurrent Neural Network (RNN) architecture.
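A minimal, hypothetical sketch of such a CNN + RNN pipeline is shown below, with per-frame convolutional features aggregated by a GRU for clip-level classification; the backbone, recurrent cell, and dimensions are illustrative assumptions rather than the cited architecture.

```python
# Hedged sketch: a small per-frame CNN feeds a GRU that aggregates temporal
# features for clip-level event classification. Sizes are illustrative.
import torch
import torch.nn as nn


class CNNRNNClassifier(nn.Module):
    def __init__(self, num_classes=10, feat_dim=128, hidden=256):
        super().__init__()
        self.cnn = nn.Sequential(
            nn.Conv2d(3, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(64, feat_dim))
        self.rnn = nn.GRU(feat_dim, hidden, batch_first=True)
        self.head = nn.Linear(hidden, num_classes)

    def forward(self, clip):                 # clip: (B, T, 3, H, W)
        b, t = clip.shape[:2]
        frames = clip.flatten(0, 1)          # (B*T, 3, H, W)
        feats = self.cnn(frames).view(b, t, -1)
        _, h_n = self.rnn(feats)             # h_n: (1, B, hidden)
        return self.head(h_n[-1])            # clip-level logits
```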
arXiv Detail & Related papers (2021-11-03T08:30:38Z)
- Relational Self-Attention: What's Missing in Attention for Video
Understanding [52.38780998425556]
We introduce a relational feature transform, dubbed relational self-attention (RSA).
Our experiments and ablation studies show that the RSA network substantially outperforms convolution and self-attention counterparts.
arXiv Detail & Related papers (2021-11-02T15:36:11Z)
- Spatial Dependency Networks: Neural Layers for Improved Generative Image
Modeling [79.15521784128102]
We introduce a novel neural network for building image generators (decoders) and apply it to variational autoencoders (VAEs).
In our spatial dependency networks (SDNs), feature maps at each level of a deep neural net are computed in a spatially coherent way.
We show that augmenting the decoder of a hierarchical VAE by spatial dependency layers considerably improves density estimation.
arXiv Detail & Related papers (2021-03-16T07:01:08Z)
- A Deep-Unfolded Reference-Based RPCA Network For Video
Foreground-Background Separation [86.35434065681925]
This paper proposes a new deep-unfolding-based network design for the problem of Robust Principal Component Analysis (RPCA).
Unlike existing designs, our approach focuses on modeling the temporal correlation between the sparse representations of consecutive video frames.
Experimentation using the moving MNIST dataset shows that the proposed network outperforms a recently proposed state-of-the-art RPCA network in the task of video foreground-background separation.
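As a rough illustration of deep unfolding for RPCA-style separation, the sketch below unrolls a generic alternating update of a low-rank background and a soft-thresholded sparse foreground with a learnable threshold; it does not reproduce the reference-based, temporally correlated design of the cited paper.

```python
# Hedged sketch of one "unfolded" RPCA-style stage: split a pixels-by-frames
# data matrix D into a low-rank background L and a sparse foreground S.
import torch
import torch.nn as nn


def soft_threshold(x, tau):
    """Elementwise shrinkage operator used for the sparse component."""
    return torch.sign(x) * torch.clamp(x.abs() - tau, min=0.0)


class UnfoldedRPCAStage(nn.Module):
    def __init__(self, rank=5):
        super().__init__()
        self.rank = rank
        self.tau = nn.Parameter(torch.tensor(0.1))   # learned threshold

    def forward(self, d, s):
        # Background update: best rank-r approximation of D - S.
        u, sigma, vh = torch.linalg.svd(d - s, full_matrices=False)
        l = (u[:, :self.rank] * sigma[:self.rank]) @ vh[:self.rank]
        # Foreground update: shrink the residual toward sparsity.
        s = soft_threshold(d - l, self.tau)
        return l, s


if __name__ == "__main__":
    d = torch.randn(4096, 16)        # 64x64 pixels flattened, 16 frames
    s = torch.zeros_like(d)
    stage = UnfoldedRPCAStage()      # one stage reused here for brevity
    for _ in range(3):               # a few unrolled iterations
        l, s = stage(d, s)
```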
arXiv Detail & Related papers (2020-10-02T11:40:09Z)
- Co-Saliency Spatio-Temporal Interaction Network for Person
Re-Identification in Videos [85.6430597108455]
We propose a novel Co-Saliency Spatio-Temporal Interaction Network (CSTNet) for person re-identification in videos.
It captures the common salient foreground regions among video frames and explores the spatial-temporal long-range context interdependency from such regions.
Multiple spatial-temporal interaction modules are proposed within CSTNet, which exploit the spatial and temporal long-range context interdependencies of such features and the correlation of spatial-temporal information.
arXiv Detail & Related papers (2020-04-10T10:23:58Z)
- Dynamic Hierarchical Mimicking Towards Consistent Optimization
Objectives [73.15276998621582]
We propose a generic feature learning mechanism to advance CNN training with enhanced generalization ability.
Partially inspired by DSN, we fork delicately designed side branches from the intermediate layers of a given neural network.
Experiments on both category and instance recognition tasks demonstrate the substantial improvements of our proposed method.
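A minimal sketch of the side-branch idea is given below, with an auxiliary head forked from an intermediate layer and trained on the labels plus a KL term that mimics the main head; the branch placement, loss weights, and pairing scheme are illustrative assumptions, not the paper's exact design.

```python
# Hedged sketch: an auxiliary side branch supervised by the labels and
# encouraged to mimic the (detached) main prediction distribution.
import torch.nn as nn
import torch.nn.functional as F


class BranchedNet(nn.Module):
    def __init__(self, num_classes=10):
        super().__init__()
        self.block1 = nn.Sequential(nn.Conv2d(3, 32, 3, padding=1), nn.ReLU())
        self.block2 = nn.Sequential(nn.Conv2d(32, 64, 3, padding=1), nn.ReLU())
        self.side_head = nn.Sequential(nn.AdaptiveAvgPool2d(1), nn.Flatten(),
                                       nn.Linear(32, num_classes))
        self.main_head = nn.Sequential(nn.AdaptiveAvgPool2d(1), nn.Flatten(),
                                       nn.Linear(64, num_classes))

    def forward(self, x):
        h1 = self.block1(x)
        h2 = self.block2(h1)
        return self.side_head(h1), self.main_head(h2)


def branched_loss(side_logits, main_logits, target, alpha=0.5):
    ce = (F.cross_entropy(main_logits, target)
          + F.cross_entropy(side_logits, target))
    kl = F.kl_div(F.log_softmax(side_logits, dim=1),
                  F.softmax(main_logits.detach(), dim=1),
                  reduction="batchmean")
    return ce + alpha * kl
```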
arXiv Detail & Related papers (2020-03-24T09:56:13Z)
- See More, Know More: Unsupervised Video Object Segmentation with
Co-Attention Siamese Networks [184.4379622593225]
We introduce a novel network, called CO-attention Siamese Network (COSNet), to address the unsupervised video object segmentation task.
We emphasize the importance of inherent correlation among video frames and incorporate a global co-attention mechanism.
We propose a unified and end-to-end trainable framework where different co-attention variants can be derived for mining the rich context within videos.
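A minimal sketch of a symmetric co-attention step between the feature maps of two frames is shown below; the bilinear affinity and residual fusion are generic choices assumed for illustration, not COSNet's exact formulation.

```python
# Hedged sketch: an affinity matrix over all spatial-position pairs lets
# each frame's features attend to the other frame's features.
import torch
import torch.nn as nn


class CoAttention(nn.Module):
    def __init__(self, ch):
        super().__init__()
        self.w = nn.Linear(ch, ch, bias=False)    # bilinear affinity weight

    def forward(self, fa, fb):                    # each (B, C, H, W)
        b, c, h, w = fa.shape
        a = fa.flatten(2).transpose(1, 2)         # (B, HW, C)
        bf = fb.flatten(2).transpose(1, 2)        # (B, HW, C)
        affinity = self.w(a) @ bf.transpose(1, 2)       # (B, HW, HW)
        attn_a = torch.softmax(affinity, dim=2) @ bf    # a reads from b
        attn_b = torch.softmax(affinity, dim=1).transpose(1, 2) @ a
        fa_out = fa + attn_a.transpose(1, 2).reshape(b, c, h, w)
        fb_out = fb + attn_b.transpose(1, 2).reshape(b, c, h, w)
        return fa_out, fb_out
```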
arXiv Detail & Related papers (2020-01-19T11:10:39Z)
- Shift Aggregate Extract Networks [3.3263205689999453]
We introduce an architecture based on deep hierarchical decompositions to learn effective representations of large graphs.
Our framework extends classic R-decompositions used in kernel methods, enabling nested part-of-part relations.
We show empirically that our approach is able to outperform current state-of-the-art graph classification methods on large social network datasets.
arXiv Detail & Related papers (2017-03-16T09:52:48Z)