A Reinforcement-Learning-Based Energy-Efficient Framework for Multi-Task
Video Analytics Pipeline
- URL: http://arxiv.org/abs/2104.04443v1
- Date: Fri, 9 Apr 2021 15:44:06 GMT
- Title: A Reinforcement-Learning-Based Energy-Efficient Framework for Multi-Task
Video Analytics Pipeline
- Authors: Yingying Zhao, Mingzhi Dong, Yujiang Wang, Da Feng, Qin Lv, Robert
Dick, Dongsheng Li, Tun Lu, Ning Gu, Li Shang
- Abstract summary: Video analytics pipelines are energy-intensive due to high data rates and reliance on complex inference algorithms.
We propose an adaptive-resolution optimization framework to minimize the energy use of multi-task video analytics pipelines.
Our framework significantly surpasses all baseline methods of similar accuracy in energy efficiency on the YouTube-VIS dataset.
- Score: 16.72264118199915
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Deep-learning-based video processing has yielded transformative results in
recent years. However, the video analytics pipeline is energy-intensive due to
high data rates and reliance on complex inference algorithms, which limits its
adoption in energy-constrained applications. Motivated by the observation of
high and variable spatial redundancy and temporal dynamics in video data
streams, we design and evaluate an adaptive-resolution optimization framework
to minimize the energy use of multi-task video analytics pipelines. Instead of
heuristically tuning the input data resolution of individual tasks, our
framework utilizes deep reinforcement learning to dynamically govern the input
resolution and computation of the entire video analytics pipeline. By
monitoring the impact of varying resolution on the quality of high-dimensional
video analytics features, and hence on the accuracy of video analytics results,
the proposed end-to-end optimization framework learns the best non-myopic
policy for dynamically controlling the resolution of input video streams to
achieve globally optimized energy efficiency. Governed by reinforcement learning,
optical flow is incorporated into the framework to minimize unnecessary
spatio-temporal redundancy that leads to re-computation, while preserving
accuracy. The proposed framework is applied to video instance segmentation,
one of the most challenging machine vision tasks, and its energy efficiency
significantly surpasses that of all baseline methods of similar accuracy on
the YouTube-VIS dataset.
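The control loop described in the abstract can be sketched, very loosely, as a reinforcement-learning agent that picks a per-frame input resolution to balance analytics accuracy against energy cost. The sketch below uses tabular Q-learning on a two-state toy problem; all states, accuracy figures, and energy costs are illustrative assumptions, not the paper's actual formulation.

```python
import random

# Toy sketch (assumed, simplified): a Q-learning agent picks an input
# resolution per frame, trading analytics accuracy against energy use.
# All numbers below are illustrative, not from the paper.

RESOLUTIONS = [240, 480, 720]          # candidate input resolutions
STATES = {0: "complex scene", 1: "redundant scene"}

# Hypothetical accuracy of the downstream task at each resolution.
ACCURACY = {0: [0.50, 0.80, 0.90],     # complex scenes need detail
            1: [0.85, 0.90, 0.92]}     # redundant scenes tolerate downscaling
ENERGY = [0.10, 0.30, 0.60]            # normalized energy cost per resolution
BETA = 0.5                             # weight of energy in the reward

def reward(state, action):
    return ACCURACY[state][action] - BETA * ENERGY[action]

def train(steps=20000, alpha=0.1, gamma=0.9, eps=0.2, seed=0):
    rng = random.Random(seed)
    q = {s: [0.0] * len(RESOLUTIONS) for s in STATES}
    state = 0
    for _ in range(steps):
        # Epsilon-greedy action selection over candidate resolutions.
        if rng.random() < eps:
            a = rng.randrange(len(RESOLUTIONS))
        else:
            a = max(range(len(RESOLUTIONS)), key=lambda i: q[state][i])
        r = reward(state, a)
        next_state = rng.randrange(2)  # scene dynamics, independent of action
        q[state][a] += alpha * (r + gamma * max(q[next_state]) - q[state][a])
        state = next_state
    return q

q = train()
policy = {STATES[s]: RESOLUTIONS[max(range(3), key=lambda i: q[s][i])]
          for s in STATES}
print(policy)  # the redundant scene should get a lower resolution
```

The learned policy downscales redundant scenes and keeps detail for complex ones, which is the qualitative behavior the abstract describes; the actual framework works over high-dimensional video features rather than a hand-coded scene label.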
Related papers
- Edge Computing Enabled Real-Time Video Analysis via Adaptive
Spatial-Temporal Semantic Filtering [18.55091203660391]
This paper proposes a novel edge computing enabled real-time video analysis system for intelligent visual devices.
The proposed system consists of a tracking-assisted object detection module (TAODM) and a region of interest module (ROIM).
TAODM adaptively decides whether to process each video frame locally with a tracking algorithm or to offload it to the edge server for inference by an object detection model.
arXiv Detail & Related papers (2024-02-29T07:42:03Z) - Video Dynamics Prior: An Internal Learning Approach for Robust Video
Enhancements [83.5820690348833]
We present a framework for low-level vision tasks that does not require any external training data corpus.
Our approach learns neural modules by optimizing over a corrupted video sequence, leveraging its spatio-temporal coherence and internal statistics.
arXiv Detail & Related papers (2023-12-13T01:57:11Z) - Differentiable Resolution Compression and Alignment for Efficient Video
Classification and Retrieval [16.497758750494537]
We propose an efficient video representation network with Differentiable Resolution Compression and Alignment mechanism.
We leverage a Differentiable Context-aware Compression Module to encode the saliency and non-saliency frame features.
We introduce a new Resolution-Align Transformer Layer to capture global temporal correlations among frame features with different resolutions.
arXiv Detail & Related papers (2023-09-15T05:31:53Z) - Task-Oriented Communication for Edge Video Analytics [11.03999024164301]
This paper proposes a task-oriented communication framework for edge video analytics.
Multiple devices collect visual sensory data and transmit the informative features to an edge server for processing.
We show that the proposed framework effectively encodes task-relevant information of video data and achieves a better rate-performance tradeoff than existing methods.
arXiv Detail & Related papers (2022-11-25T12:09:12Z) - Dynamic Network Quantization for Efficient Video Inference [60.109250720206425]
We propose a dynamic network quantization framework, that selects optimal precision for each frame conditioned on the input for efficient video recognition.
We train both networks effectively using standard backpropagation with a joint loss that targets both competitive performance and resource efficiency.
arXiv Detail & Related papers (2021-08-23T20:23:57Z) - A Survey of Performance Optimization in Neural Network-Based Video
Analytics Systems [0.9558392439655014]
Video analytics systems automatically recognize events, movements, and actions in a video.
We provide a review of the techniques that focus on optimizing the performance of Neural Network-Based Video Analytics Systems.
arXiv Detail & Related papers (2021-05-10T17:06:44Z) - An Efficient Recurrent Adversarial Framework for Unsupervised Real-Time
Video Enhancement [132.60976158877608]
We propose an efficient adversarial video enhancement framework that learns directly from unpaired video examples.
In particular, our framework introduces new recurrent cells that consist of interleaved local and global modules for implicit integration of spatial and temporal information.
The proposed design allows our recurrent cells to efficiently propagate spatio-temporal information across frames and reduces the need for high-complexity networks.
arXiv Detail & Related papers (2020-12-24T00:03:29Z) - Coherent Loss: A Generic Framework for Stable Video Segmentation [103.78087255807482]
We investigate how a jittering artifact degrades the visual quality of video segmentation results.
We propose a Coherent Loss with a generic framework to enhance the performance of a neural network against jittering artifacts.
arXiv Detail & Related papers (2020-10-25T10:48:28Z) - Scene-Adaptive Video Frame Interpolation via Meta-Learning [54.87696619177496]
We propose to adapt the model to each video by making use of additional information that is readily available at test time.
We obtain significant performance gains with only a single gradient update without any additional parameters.
arXiv Detail & Related papers (2020-04-02T02:46:44Z) - Accelerating Deep Reinforcement Learning With the Aid of Partial Model:
Energy-Efficient Predictive Video Streaming [97.75330397207742]
Predictive power allocation is conceived for energy-efficient video streaming over mobile networks using deep reinforcement learning.
To handle the continuous state and action spaces, we resort to the deep deterministic policy gradient (DDPG) algorithm.
Our simulation results show that the proposed policies converge to the optimal policy that is derived based on perfect large-scale channel prediction.
arXiv Detail & Related papers (2020-03-21T17:36:53Z)
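The last entry in the list relies on DDPG to handle continuous state and action spaces. A minimal sketch of the deterministic-policy-gradient idea underlying DDPG (not DDPG itself: no neural networks, replay buffer, or target networks), on a toy power-allocation objective where every quantity is an illustrative assumption, not taken from the paper:

```python
import random

# Toy deterministic-policy-gradient sketch (assumed, illustrative).
# A linear policy a = theta * s maps channel state s to transmit power a.
# Reward trades streaming quality against energy: r = -(a - s)^2 - LAM * a^2,
# whose optimal policy parameter is theta* = 1 / (1 + LAM).

LAM = 0.5  # energy penalty weight (illustrative)

def objective(theta, states):
    # Average reward over a fixed batch of sampled channel states
    # (common random numbers keep the finite difference stable).
    return sum(-(theta * s - s) ** 2 - LAM * (theta * s) ** 2
               for s in states) / len(states)

def train(steps=100, lr=0.5, eps=1e-3, seed=0):
    rng = random.Random(seed)
    states = [rng.uniform(0.5, 1.5) for _ in range(200)]
    theta = 0.0
    for _ in range(steps):
        # Central finite-difference estimate of the policy gradient.
        grad = (objective(theta + eps, states)
                - objective(theta - eps, states)) / (2 * eps)
        theta += lr * grad  # gradient ascent on expected reward
    return theta

theta = train()
print(round(theta, 2))  # converges toward 1 / (1 + LAM) ≈ 0.67
```

DDPG replaces the finite-difference gradient with a learned critic that backpropagates through a neural actor, which is what makes continuous action spaces tractable at scale.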
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed content (including all information) and is not responsible for any consequences.