DMVC: Multi-Camera Video Compression Network aimed at Improving Deep Learning Accuracy
- URL: http://arxiv.org/abs/2410.18400v1
- Date: Thu, 24 Oct 2024 03:29:57 GMT
- Title: DMVC: Multi-Camera Video Compression Network aimed at Improving Deep Learning Accuracy
- Authors: Huan Cui, Qing Li, Hanling Wang, Yong Jiang
- Abstract summary: We introduce a cutting-edge video compression framework tailored for the age of ubiquitous video data.
Unlike traditional compression methods that prioritize human visual perception, our innovative approach focuses on preserving semantic information critical for deep learning accuracy.
Built on purpose-designed deep learning algorithms, it separates essential information from redundancy, ensuring that machine learning tasks are fed with data of the highest relevance.
- Score: 22.871591373774802
- Abstract: We introduce a cutting-edge video compression framework tailored for the age of ubiquitous video data, uniquely designed to serve machine learning applications. Unlike traditional compression methods that prioritize human visual perception, our innovative approach focuses on preserving semantic information critical for deep learning accuracy, while efficiently reducing data size. The framework operates on a batch basis, capable of handling multiple video streams simultaneously, thereby enhancing scalability and processing efficiency. It features a dual reconstruction mode: lightweight for real-time applications requiring swift responses, and high-precision for scenarios where accuracy is crucial. Built on purpose-designed deep learning algorithms, it separates essential information from redundancy, ensuring that machine learning tasks are fed with data of the highest relevance. Our experimental results, derived from diverse datasets including urban surveillance and autonomous vehicle navigation, showcase DMVC's superiority in maintaining or improving machine learning task accuracy, while achieving significant data compression. This breakthrough paves the way for smarter, scalable video analysis systems, promising immense potential across various applications from smart city infrastructure to autonomous systems, establishing a new benchmark for integrating video compression with machine learning.
Related papers
- Random resistive memory-based deep extreme point learning machine for unified visual processing [67.51600474104171]
We propose a novel hardware-software co-design, random resistive memory-based deep extreme point learning machine (DEPLM).
Our co-design system achieves huge energy efficiency improvements and training cost reduction when compared to conventional systems.
(2023-12-14T09:46:16Z)
- Spatiotemporal Attention-based Semantic Compression for Real-time Video Recognition [117.98023585449808]
We propose a spatiotemporal attention-based autoencoder (STAE) architecture to evaluate the importance of frames and pixels in each frame.
We develop a lightweight decoder that leverages a combined 3D-2D CNN to reconstruct missing information.
Experimental results show that ViT_STAE can compress the video dataset HMDB51 by 104x with only 5% accuracy loss.
(2023-05-22T07:47:27Z)
- Treasure What You Have: Exploiting Similarity in Deep Neural Networks for Efficient Video Processing [1.5749416770494706]
This paper proposes a similarity-aware training methodology that exploits data redundancy in video frames for efficient processing.
We validate our methodology on two critical real-time applications, lane detection and scene parsing.
(2023-05-10T23:18:47Z)
- A Survey of Task-Based Machine Learning Content Extraction Services for VIDINT [0.0]
Video intelligence (VIDINT) data has become a critical intelligence source in the past decade.
The need for AI-based analytics and automation tools to extract and structure content from video has quickly become a priority for organizations.
This paper reviews and compares products, software resources and video analytics capabilities based on tasks relevant to extracting information from video with machine learning techniques.
(2022-07-09T00:02:08Z)
- Hybrid Contrastive Quantization for Efficient Cross-View Video Retrieval [55.088635195893325]
We propose the first quantized representation learning method for cross-view video retrieval, namely Hybrid Contrastive Quantization (HCQ).
HCQ learns both coarse-grained and fine-grained quantizations with transformers, which provide complementary understandings for texts and videos.
Experiments on three Web video benchmark datasets demonstrate that HCQ achieves competitive performance with state-of-the-art non-compressed retrieval methods.
(2022-02-07T18:04:10Z)
- A Coding Framework and Benchmark towards Low-Bitrate Video Understanding [63.05385140193666]
We propose a traditional-neural mixed coding framework that takes advantage of both traditional codecs and neural networks (NNs).
The framework is optimized by ensuring that a transportation-efficient semantic representation of the video is preserved.
We build a low-bitrate video understanding benchmark with three downstream tasks on eight datasets, demonstrating the notable superiority of our approach.
(2022-02-06T16:29:15Z)
- A Deeper Look into DeepCap [96.67706102518238]
We propose a novel deep learning approach for monocular dense human performance capture.
Our method is trained in a weakly supervised manner based on multi-view supervision.
Our approach outperforms the state of the art in terms of quality and robustness.
(2021-11-20T11:34:33Z)
- Video Coding for Machine: Compact Visual Representation Compression for Intelligent Collaborative Analytics [101.35754364753409]
Video Coding for Machines (VCM) aims to bridge the largely separate research tracks of video/image compression and feature compression.
This paper summarizes VCM methodology and philosophy based on existing academic and industrial efforts.
(2021-10-18T12:42:13Z)
- Towards Transparent Application of Machine Learning in Video Processing [3.491870689686827]
Machine learning techniques for more efficient video compression and video enhancement have been developed thanks to breakthroughs in deep learning.
New techniques typically come in the form of resource-hungry black boxes (overly complex, with little transparency regarding their inner workings).
The aim of this work is to understand and optimise learned models in video processing applications so systems that incorporate them can be used in a more trustworthy manner.
(2021-05-26T17:24:23Z)
- Faster and Accurate Compressed Video Action Recognition Straight from the Frequency Domain [1.9214041945441434]
Deep learning has been successfully used to learn powerful and interpretable features for recognizing human actions in videos.
Most of the existing deep learning approaches have been designed for processing video information as RGB image sequences.
We propose a deep neural network capable of learning straight from compressed video.
(2020-12-26T12:43:53Z)
- Detecting Deepfakes with Metric Learning [9.94524884861004]
We analyze several deep learning approaches to deepfake classification in a high-compression scenario.
We demonstrate that a proposed approach based on metric learning can be very effective in performing such a classification.
Our approach is especially helpful on social media platforms where data compression is inevitable.
(2020-03-19T09:44:23Z)
This list is automatically generated from the titles and abstracts of the papers in this site.