An End-to-End Integrated Computation and Communication Architecture for
Goal-oriented Networking: A Perspective on Live Surveillance Video
- URL: http://arxiv.org/abs/2204.01987v1
- Date: Tue, 5 Apr 2022 04:59:54 GMT
- Title: An End-to-End Integrated Computation and Communication Architecture for
Goal-oriented Networking: A Perspective on Live Surveillance Video
- Authors: Suvadip Batabyal, Ozgur Ercetin
- Abstract summary: We propose situation-aware streaming, for real-time identification of important events from live-feeds at the source.
We show that the proposed scheme is able to reduce the required power consumption of the transmitter by 38.5% for 2160p (UHD) video.
- Score: 0.0
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Real-time video surveillance has become a crucial technology for smart
cities, made possible through the large-scale deployment of mobile and fixed
video cameras. In this paper, we propose situation-aware streaming for
real-time identification of important events from live feeds at the source,
rather than via cloud-based analysis. For this, we first identify the frames
containing a specific situation and assign them a high scale-of-importance
(SI). The identification is made at the source using a tiny neural network
(having a small number of hidden layers), which requires little computational
power, albeit at some cost in accuracy. The frames with a high SI value are
then streamed at a Signal-to-Noise Ratio (SNR) high enough to retain the
frame quality, while the remaining ones are transmitted at a lower SNR. The
received frames are then analyzed using a deep neural network (with many hidden
layers) to extract the situation accurately. We show that the proposed scheme
is able to reduce the required power consumption of the transmitter by 38.5%
for 2160p (UHD) video, while achieving a classification accuracy of 97.5%, for
the given situation.
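The per-frame logic described in the abstract (tiny network scores a frame, the score maps to an SI value, and the SI value selects the transmit SNR and hence power) can be sketched as follows. This is a minimal illustration, not the paper's implementation: the threshold, the two SNR targets, and the unit-noise power model are all assumptions for demonstration.

```python
# Sketch of situation-aware streaming: per-frame SI assignment and
# SNR-dependent transmit power. All numeric values are illustrative
# assumptions, not taken from the paper.

HIGH_SNR_DB = 20.0  # assumed SNR target for important (high-SI) frames
LOW_SNR_DB = 5.0    # assumed SNR target for the remaining frames


def scale_of_importance(frame_score: float, threshold: float = 0.5) -> int:
    """Map a tiny-network confidence score to a binary scale-of-importance."""
    return 1 if frame_score >= threshold else 0


def snr_target_db(si: int) -> float:
    """Select the per-frame SNR target from its scale-of-importance."""
    return HIGH_SNR_DB if si == 1 else LOW_SNR_DB


def relative_tx_power(snr_db: float) -> float:
    """Linear transmit power needed to hit an SNR target (noise power = 1)."""
    return 10 ** (snr_db / 10)


def stream_power(frame_scores, threshold: float = 0.5) -> float:
    """Total relative transmit power for a sequence of frame scores."""
    return sum(
        relative_tx_power(snr_target_db(scale_of_importance(s, threshold)))
        for s in frame_scores
    )
```

Because most surveillance frames carry no important event, dropping their SNR target dominates the total power budget, which is the intuition behind the reported 38.5% transmitter power saving.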
Related papers
- Temporal-Spatial Processing of Event Camera Data via Delay-Loop Reservoir Neural Network [0.11309478649967238]
We study a conjecture motivated by our previous study of video processing with a delay-loop reservoir neural network.
In this paper, we exploit this new finding to guide our design of a delay-loop reservoir neural network for event camera classification.
arXiv Detail & Related papers (2024-02-12T16:24:13Z)
- DroneAttention: Sparse Weighted Temporal Attention for Drone-Camera Based Activity Recognition [2.705905918316948]
Human activity recognition (HAR) using drone-mounted cameras has attracted considerable interest from the computer vision research community in recent years.
We propose a novel Sparse Weighted Temporal Attention (SWTA) module to utilize sparsely sampled video frames for obtaining global weighted temporal attention.
The proposed model achieves accuracies of 72.76%, 92.56%, and 78.86% on the respective datasets.
arXiv Detail & Related papers (2022-12-07T00:33:40Z)
- Scalable Neural Video Representations with Learnable Positional Features [73.51591757726493]
We show how to train neural representations with learnable positional features (NVP) that effectively amortize a video as latent codes.
We demonstrate the superiority of NVP on the popular UVG benchmark; compared with prior art, NVP not only trains 2 times faster (less than 5 minutes) but also exceeds their encoding quality: $34.07 \rightarrow 34.57$ (measured with the PSNR metric).
arXiv Detail & Related papers (2022-10-13T08:15:08Z)
- Neighbourhood Representative Sampling for Efficient End-to-end Video Quality Assessment [60.57703721744873]
The increased resolution of real-world videos presents a dilemma between efficiency and accuracy for deep Video Quality Assessment (VQA).
In this work, we propose a unified scheme, spatial-temporal grid mini-cube sampling (St-GMS), to obtain a novel type of sample, named fragments.
With fragments and FANet, the proposed efficient end-to-end FAST-VQA and FasterVQA achieve significantly better performance than existing approaches on all VQA benchmarks.
arXiv Detail & Related papers (2022-10-11T11:38:07Z)
- STIP: A SpatioTemporal Information-Preserving and Perception-Augmented Model for High-Resolution Video Prediction [78.129039340528]
We propose a SpatioTemporal Information-Preserving and Perception-Augmented Model (STIP) to solve the above two problems.
The proposed model aims to preserve the spatiotemporal information for videos during feature extraction and state transitions.
Experimental results show that the proposed STIP can predict videos with more satisfactory visual quality compared with a variety of state-of-the-art methods.
arXiv Detail & Related papers (2022-06-09T09:49:04Z)
- A Study of Designing Compact Audio-Visual Wake Word Spotting System Based on Iterative Fine-Tuning in Neural Network Pruning [57.28467469709369]
We investigate designing a compact audio-visual wake word spotting (WWS) system by utilizing visual information.
We introduce a neural network pruning strategy via the lottery ticket hypothesis in an iterative fine-tuning manner (LTH-IF).
The proposed audio-visual system achieves significant performance improvements over the single-modality (audio-only or video-only) system under different noisy conditions.
arXiv Detail & Related papers (2022-02-17T08:26:25Z)
- CANS: Communication Limited Camera Network Self-Configuration for Intelligent Industrial Surveillance [8.360870648463653]
Real-time and intelligent video surveillance via camera networks involves computation-intensive vision detection tasks with massive video data.
Multiple video streams compete for limited communication resources on the link between edge devices and camera networks.
An adaptive camera network self-configuration method (CANS) of video surveillance is proposed to cope with multiple video streams of heterogeneous quality of service.
arXiv Detail & Related papers (2021-09-13T01:54:33Z)
- Fast Motion Understanding with Spatiotemporal Neural Networks and Dynamic Vision Sensors [99.94079901071163]
This paper presents a Dynamic Vision Sensor (DVS) based system for reasoning about high speed motion.
We consider the case of a robot at rest reacting to a small, fast-approaching object at speeds higher than 15 m/s.
We highlight the results of our system on a toy dart moving at 23.4 m/s, with a $24.73^\circ$ error in $\theta$, an 18.4 mm average discretized radius prediction error, and a 25.03% median time-to-collision prediction error.
arXiv Detail & Related papers (2020-11-18T17:55:07Z)
- MS-RANAS: Multi-Scale Resource-Aware Neural Architecture Search [94.80212602202518]
We propose Multi-Scale Resource-Aware Neural Architecture Search (MS-RANAS).
We employ a one-shot architecture search approach in order to obtain a reduced search cost.
We achieve state-of-the-art results in terms of accuracy-speed trade-off.
arXiv Detail & Related papers (2020-09-29T11:56:01Z)
- HyNNA: Improved Performance for Neuromorphic Vision Sensor based Surveillance using Hybrid Neural Network Architecture [7.293414498855147]
We improve on a recently proposed hybrid event-frame approach by using morphological image processing algorithms for region proposal.
We also address the low-power requirement for object detection and classification by exploring various convolutional neural network (CNN) architectures.
Specifically, we compare the results of our object detection framework against the state-of-the-art low-power NVS surveillance system and show an accuracy improvement from 63.1% to 82.16%.
arXiv Detail & Related papers (2020-03-19T07:18:33Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the accuracy of the information above and is not responsible for any consequences of its use.