AccDecoder: Accelerated Decoding for Neural-enhanced Video Analytics
- URL: http://arxiv.org/abs/2301.08664v2
- Date: Tue, 24 Jan 2023 11:04:47 GMT
- Title: AccDecoder: Accelerated Decoding for Neural-enhanced Video Analytics
- Authors: Tingting Yuan, Liang Mi, Weijun Wang, Haipeng Dai, Xiaoming Fu
- Abstract summary: Low-quality video is collected by existing surveillance systems because of poor quality cameras or over-compressed/pruned video streaming protocols.
We present AccDecoder, a novel accelerated decoder for real-time and neural network-based video analytics.
- Score: 26.012783785622073
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The quality of the video stream is key to neural network-based video
analytics. However, low-quality video is inevitably collected by existing
surveillance systems because of poor quality cameras or over-compressed/pruned
video streaming protocols, e.g., as a result of upstream bandwidth limit. To
address this issue, existing studies use quality enhancers (e.g., neural
super-resolution) to improve the quality of videos (e.g., resolution) and
eventually ensure inference accuracy. Nevertheless, directly applying quality
enhancers does not work in practice because it will introduce unacceptable
latency. In this paper, we present AccDecoder, a novel accelerated decoder for
real-time and neural-enhanced video analytics. AccDecoder can select a few
frames adaptively via Deep Reinforcement Learning (DRL) to enhance the quality
by neural super-resolution and then up-scale the unselected frames that
reference them, which leads to 6-21% accuracy improvement. AccDecoder provides
efficient inference capability via filtering important frames using DRL for
DNN-based inference and reusing the results for the other frames via extracting
the reference relationship among frames and blocks, which results in a latency
reduction of 20-80% compared with baselines.
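The frame-selection idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the DRL policy is replaced here by a simple accumulated-change threshold, and the function name `plan_decoding`, the per-frame `change_scores`, and both thresholds are hypothetical names introduced only for illustration. The sketch only shows the control flow: a few frames are chosen for super-resolution or DNN inference, and the remaining frames reuse the most recent result.

```python
from typing import List, Tuple


def plan_decoding(change_scores: List[float],
                  sr_threshold: float,
                  infer_threshold: float) -> List[Tuple[str, str]]:
    """Decide, per frame, whether to run the expensive step or reuse a result.

    Assumption: change_scores[i] is a hypothetical measure of how much frame i
    differs from frame i-1. A threshold on the accumulated change stands in for
    the DRL-based selection described in the abstract.
    """
    plan = []
    sr_acc = 0.0      # change accumulated since the last super-resolved frame
    infer_acc = 0.0   # change accumulated since the last DNN inference
    for i, score in enumerate(change_scores):
        sr_acc += score
        infer_acc += score

        # The first frame is always processed: nothing exists to reuse yet.
        if i == 0 or sr_acc >= sr_threshold:
            enhance = "super_resolve"
            sr_acc = 0.0
        else:
            enhance = "reuse_anchor"

        if i == 0 or infer_acc >= infer_threshold:
            infer = "run_dnn"
            infer_acc = 0.0
        else:
            infer = "reuse_result"

        plan.append((enhance, infer))
    return plan


if __name__ == "__main__":
    scores = [0.0, 0.25, 0.25, 1.0, 0.25, 0.25]
    for idx, step in enumerate(plan_decoding(scores, 1.0, 0.5)):
        print(idx, step)
```

Separate thresholds are used because the abstract describes enhancement and inference as two distinct selection decisions; a learned policy would replace both threshold checks.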
Related papers
- Standard compliant video coding using low complexity, switchable neural wrappers [8.149130379436759]
We propose a new framework featuring standard compatibility, high performance, and low decoding complexity.
We employ a set of jointly optimized neural pre- and post-processors, wrapping a standard video codec, to encode videos at different resolutions.
We design a low complexity neural post-processor architecture that can handle different upsampling ratios.
arXiv Detail & Related papers (2024-07-10T06:36:45Z)
- Prediction and Reference Quality Adaptation for Learned Video Compression [54.58691829087094]
We propose a confidence-based prediction quality adaptation (PQA) module to provide explicit discrimination for the spatial and channel-wise prediction quality difference.
We also propose a reference quality adaptation (RQA) module and an associated repeat-long training strategy to provide dynamic spatially variant filters for diverse reference qualities.
arXiv Detail & Related papers (2024-06-20T09:03:26Z)
- Boosting Neural Representations for Videos with a Conditional Decoder [28.073607937396552]
Implicit neural representations (INRs) have emerged as a promising approach for video storage and processing.
This paper introduces a universal boosting framework for current implicit video representation approaches.
arXiv Detail & Related papers (2024-02-28T08:32:19Z)
- ILCAS: Imitation Learning-Based Configuration-Adaptive Streaming for Live Video Analytics with Cross-Camera Collaboration [53.29046841099947]
This paper proposes the first imitation learning (IL) based configuration-adaptive live video analytics (VA) streaming system.
ILCAS trains the agent with demonstrations collected from an expert, which is designed as an offline optimal policy.
Experiments confirm the superiority of ILCAS over state-of-the-art solutions, with a 2-20.9% improvement in mean accuracy and a 19.9-85.3% reduction in chunk upload lag.
arXiv Detail & Related papers (2023-08-19T16:20:59Z)
- CaDM: Codec-aware Diffusion Modeling for Neural-enhanced Video Streaming [15.115975994657514]
We present Codec-aware Diffusion Modeling (CaDM), a novel Neural-enhanced Video Streaming (NVS) paradigm.
First, CaDM improves the encoder's compression efficiency by simultaneously reducing the resolution and color bit-depth of video frames.
arXiv Detail & Related papers (2022-11-15T05:14:48Z)
- Rethinking Resolution in the Context of Efficient Video Recognition [49.957690643214576]
Cross-resolution KD (ResKD) is a simple but effective method to boost recognition accuracy on low-resolution frames.
We extensively demonstrate its effectiveness over state-of-the-art architectures, i.e., 3D-CNNs and Video Transformers.
arXiv Detail & Related papers (2022-09-26T15:50:44Z)
- Graph Neural Networks for Channel Decoding [71.15576353630667]
We showcase competitive decoding performance for various coding schemes, such as low-density parity-check (LDPC) and BCH codes.
The idea is to let a neural network (NN) learn a generalized message passing algorithm over a given graph.
We benchmark our proposed decoder against the state of the art in conventional channel decoding as well as against recent deep learning-based results.
arXiv Detail & Related papers (2022-07-29T15:29:18Z)
- NSNet: Non-saliency Suppression Sampler for Efficient Video Recognition [89.84188594758588]
A novel Non-saliency Suppression Network (NSNet) is proposed to suppress the responses of non-salient frames.
NSNet achieves the state-of-the-art accuracy-efficiency trade-off and presents a significantly faster (2.4-4.3x) practical inference speed than state-of-the-art methods.
arXiv Detail & Related papers (2022-07-21T09:41:22Z)
- STIP: A SpatioTemporal Information-Preserving and Perception-Augmented Model for High-Resolution Video Prediction [78.129039340528]
We propose a SpatioTemporal Information-Preserving and Perception-Augmented Model (STIP) to solve the above two problems.
The proposed model aims to preserve the spatiotemporal information of videos during feature extraction and state transitions.
Experimental results show that the proposed STIP can predict videos with more satisfactory visual quality compared with a variety of state-of-the-art methods.
arXiv Detail & Related papers (2022-06-09T09:49:04Z)
- Perceptually-inspired super-resolution of compressed videos [18.72040343193715]
Spatial resolution adaptation is a technique that has often been employed in video compression to enhance coding efficiency.
Recent work has employed advanced super-resolution methods based on convolutional neural networks (CNNs) to further improve reconstruction quality.
In this paper, a perceptually-inspired super-resolution approach (M-SRGAN) is proposed for spatial upsampling of compressed video using a modified CNN model.
arXiv Detail & Related papers (2021-06-15T13:50:24Z)
- Video compression with low complexity CNN-based spatial resolution adaptation [15.431248645312309]
Spatial resolution adaptation can be integrated within video compression to improve overall coding performance.
A novel framework is proposed which supports the flexible allocation of complexity between the encoder and decoder.
arXiv Detail & Related papers (2020-07-29T10:20:36Z)