Speeding Up Action Recognition Using Dynamic Accumulation of Residuals
in Compressed Domain
- URL: http://arxiv.org/abs/2209.14757v1
- Date: Thu, 29 Sep 2022 13:08:49 GMT
- Title: Speeding Up Action Recognition Using Dynamic Accumulation of Residuals
in Compressed Domain
- Authors: Ali Abdari, Pouria Amirjan, Azadeh Mansouri
- Abstract summary: Temporal redundancy and the sheer size of raw videos are the two most common problems for video processing algorithms.
This paper presents an approach that directly uses the residual data available in compressed videos, which can be obtained by a lightweight partial decoding procedure.
Applying neural networks exclusively to accumulated residuals in the compressed domain speeds up recognition, while the classification results are highly competitive with raw video approaches.
- Score: 2.062593640149623
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: With the widespread use of installed cameras, video-based monitoring
approaches have attracted considerable attention for purposes such as
assisted living. Temporal redundancy and the sheer size of raw videos are the
two most common problems for video processing algorithms. Most
existing methods focus mainly on increasing accuracy by exploring
consecutive frames, which is costly and unsuitable for real-time
applications. Since videos are mostly stored and transmitted in compressed
format, these kinds of videos are available on many devices. Compressed videos
contain a multitude of beneficial information, such as motion vectors and
quantized coefficients. Proper use of this information can greatly
improve the performance of video understanding methods. This paper presents an
approach that directly uses the residual data available in compressed videos,
which can be obtained by a lightweight partial decoding procedure. In addition, a
method for accumulating similar residuals is proposed, which dramatically
reduces the number of processed frames for action recognition. Applying neural
networks exclusively to accumulated residuals in the compressed domain
speeds up recognition, while the classification results are highly
competitive with raw video approaches.
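The abstract does not spell out how similar residuals are accumulated, so the following is only a minimal sketch of the general idea under stated assumptions: each residual frame is a flattened list of numbers, similarity is the mean absolute difference to the previous frame, and `threshold` is an arbitrary cutoff. The names `mean_abs_diff`, `accumulate_similar`, and `threshold` are hypothetical and are not taken from the paper.

```python
from typing import List, Sequence

Residual = Sequence[float]  # one flattened residual frame (assumed representation)


def mean_abs_diff(a: Residual, b: Residual) -> float:
    """Average per-element absolute difference between two residual frames."""
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)


def accumulate_similar(residuals: List[Residual], threshold: float) -> List[List[float]]:
    """Merge runs of consecutive, similar residual frames by element-wise summation.

    A new group starts whenever a frame differs from the previous frame by more
    than `threshold`. This similarity rule is an illustrative assumption, not
    the criterion used in the paper.
    """
    groups: List[List[float]] = []
    previous = None
    for frame in residuals:
        if previous is not None and mean_abs_diff(frame, previous) <= threshold:
            # Similar to the previous frame: add it into the current accumulator.
            groups[-1] = [acc + x for acc, x in zip(groups[-1], frame)]
        else:
            # First frame, or a dissimilar one: open a new accumulated frame.
            groups.append(list(frame))
        previous = frame
    return groups


# Four toy residual frames: two similar pairs collapse into two accumulated frames.
frames = [[1, 0, 0, 1], [1, 0, 1, 1], [9, 9, 9, 9], [9, 8, 9, 9]]
accumulated = accumulate_similar(frames, threshold=1.0)
print(len(frames), "->", len(accumulated))
```

In this toy setting, a classifier would then see only the accumulated frames rather than every residual, which is the kind of frame-count reduction the abstract describes.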
Related papers
- High-Efficiency Neural Video Compression via Hierarchical Predictive Learning [27.41398149573729]
Enhanced Deep Hierarchical Video Compression (DHVC 2.0) offers superior compression performance and impressive complexity efficiency.
Uses hierarchical predictive coding to transform each video frame into multiscale representations.
Supports transmission-friendly progressive decoding, making it particularly advantageous for networked video applications in the presence of packet loss.
arXiv Detail & Related papers (2024-10-03T15:40:58Z)
- Blurry Video Compression: A Trade-off between Visual Enhancement and Data Compression [65.8148169700705]
Existing video compression (VC) methods primarily aim to reduce the spatial and temporal redundancies between consecutive frames in a video.
Previous works have achieved remarkable results on videos acquired under specific settings such as instant (known) exposure time and shutter speed.
In this work, we tackle the VC problem in a general scenario where a given video can be blurry due to predefined camera settings or dynamics in the scene.
arXiv Detail & Related papers (2023-11-08T02:17:54Z)
- Differentiable Resolution Compression and Alignment for Efficient Video Classification and Retrieval [16.497758750494537]
We propose an efficient video representation network with Differentiable Resolution Compression and Alignment mechanism.
We leverage a Differentiable Context-aware Compression Module to encode the saliency and non-saliency frame features.
We introduce a new Resolution-Align Transformer Layer to capture global temporal correlations among frame features with different resolutions.
arXiv Detail & Related papers (2023-09-15T05:31:53Z)
- Compressed Vision for Efficient Video Understanding [83.97689018324732]
We propose a framework enabling research on hour-long videos with the same hardware that can now process second-long videos.
We replace standard video compression, e.g. JPEG, with neural compression and show that we can directly feed compressed videos as inputs to regular video networks.
arXiv Detail & Related papers (2022-10-06T15:35:49Z)
- A Detection Method of Temporally Operated Videos Using Robust Hashing [12.27887776401573]
Most conventional methods for detecting tampered videos/images are not robust enough against such operations.
We propose a novel method with a robust hashing algorithm for detecting temporally operated videos even when resizing and compression are applied to the videos.
arXiv Detail & Related papers (2022-08-10T07:36:07Z)
- Exploring Long- and Short-Range Temporal Information for Learned Video Compression [54.91301930491466]
We focus on exploiting the unique characteristics of video content and exploring temporal information to enhance compression performance.
For long-range temporal information exploitation, we propose a temporal prior that can be updated continuously within the group of pictures (GOP) during inference.
In that case, the temporal prior contains valuable temporal information from all decoded images within the current GOP.
In detail, we design a hierarchical structure to achieve multi-scale compensation.
arXiv Detail & Related papers (2022-08-07T15:57:18Z)
- Self-Conditioned Probabilistic Learning of Video Rescaling [70.10092286301997]
We propose a self-conditioned probabilistic framework for video rescaling to learn the paired downscaling and upscaling procedures simultaneously.
We decrease the entropy of the information lost in the downscaling by maximizing its conditioned probability on the strong spatial-temporal prior information.
We extend the framework to a lossy video compression system, in which a gradient estimator for non-differential industrial lossy codecs is proposed.
arXiv Detail & Related papers (2021-07-24T15:57:15Z)
- COMISR: Compression-Informed Video Super-Resolution [76.94152284740858]
Most videos on the web or mobile devices are compressed, and the compression can be severe when the bandwidth is limited.
We propose a new compression-informed video super-resolution model to restore high-resolution content without introducing artifacts caused by compression.
arXiv Detail & Related papers (2021-05-04T01:24:44Z)
- Faster and Accurate Compressed Video Action Recognition Straight from the Frequency Domain [1.9214041945441434]
Deep learning has been successfully used to learn powerful and interpretable features for recognizing human actions in videos.
Most of the existing deep learning approaches have been designed for processing video information as RGB image sequences.
We propose a deep neural network capable of learning straight from compressed video.
arXiv Detail & Related papers (2020-12-26T12:43:53Z)
- Content Adaptive and Error Propagation Aware Deep Video Compression [110.31693187153084]
We propose a content adaptive and error propagation aware video compression system.
Our method employs a joint training strategy by considering the compression performance of multiple consecutive frames instead of a single frame.
Instead of using the hand-crafted coding modes in the traditional compression systems, we design an online encoder updating scheme in our system.
arXiv Detail & Related papers (2020-03-25T09:04:24Z)
This list is automatically generated from the titles and abstracts of the papers on this site.