VVC Extension Scheme for Object Detection Using Contrast Reduction
- URL: http://arxiv.org/abs/2305.18782v1
- Date: Tue, 30 May 2023 06:29:04 GMT
- Title: VVC Extension Scheme for Object Detection Using Contrast Reduction
- Authors: Takahiro Shindo, Taiju Watanabe, Kein Yamada, Hiroshi Watanabe
- Abstract summary: We propose an extention scheme of video coding for object detection using Versatile Video Coding (VVC)
In our proposed scheme, the original image is reduced in size and contrast, then coded with VVC encoder to achieve high compression performance.
Experimental results show that the proposed video coding scheme achieves better coding performance than regular VVC in terms of object detection accuracy.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: In recent years, video analysis using Artificial Intelligence (AI) has been
widely used, due to the remarkable development of image recognition technology
using deep learning. In 2019, the Moving Picture Experts Group (MPEG) has
started standardization of Video Coding for Machines (VCM) as a video coding
technology for image recognition. In the framework of VCM, both higher image
recognition accuracy and video compression performance are required. In this
paper, we propose an extention scheme of video coding for object detection
using Versatile Video Coding (VVC). Unlike video for human vision, video used
for object detection does not require a large image size or high contrast.
Since downsampling of the image can reduce the amount of information to be
transmitted. Due to the decrease in image contrast, entropy of the image
becomes smaller. Therefore, in our proposed scheme, the original image is
reduced in size and contrast, then coded with VVC encoder to achieve high
compression performance. Then, the output image from the VVC decoder is
restored to its original image size using the bicubic method. Experimental
results show that the proposed video coding scheme achieves better coding
performance than regular VVC in terms of object detection accuracy.
Related papers
- When Video Coding Meets Multimodal Large Language Models: A Unified Paradigm for Video Coding [112.44822009714461]
Cross-Modality Video Coding (CMVC) is a pioneering approach to explore multimodality representation and video generative models in video coding.
During decoding, previously encoded components and video generation models are leveraged to create multiple encoding-decoding modes.
Experiments indicate that TT2V achieves effective semantic reconstruction, while IT2V exhibits competitive perceptual consistency.
arXiv Detail & Related papers (2024-08-15T11:36:18Z) - NN-VVC: Versatile Video Coding boosted by self-supervisedly learned
image coding for machines [19.183883119933558]
This paper proposes a hybrid for machines called NN-VVC, which combines the advantages of an E2E-learned image and a CVC to achieve high performance in both image and video coding for machines.
Our experiments show that the proposed system achieved up to -43.20% and -26.8% Bjontegaard Delta rate reduction over VVC for image and video data, respectively.
arXiv Detail & Related papers (2024-01-19T15:33:46Z) - End-to-End Learnable Multi-Scale Feature Compression for VCM [8.037759667748768]
We propose a novel multi-scale feature compression method that enables the end-to-end optimization on the extracted features and the design of lightweight encoders.
Our model outperforms previous approaches by at least 52% BD-rate reduction and has $times5$ to $times27$ times less encoding time for object detection.
arXiv Detail & Related papers (2023-06-29T04:05:13Z) - VNVC: A Versatile Neural Video Coding Framework for Efficient
Human-Machine Vision [59.632286735304156]
It is more efficient to enhance/analyze the coded representations directly without decoding them into pixels.
We propose a versatile neural video coding (VNVC) framework, which targets learning compact representations to support both reconstruction and direct enhancement/analysis.
arXiv Detail & Related papers (2023-06-19T03:04:57Z) - Accuracy Improvement of Object Detection in VVC Coded Video Using
YOLO-v7 Features [0.0]
In general, when the image quality deteriorates due to image encoding, the image recognition accuracy also falls.
We propose a neural-network-based approach to improve image recognition accuracy by applying post-processing to the encoded video.
We show that the combination of the proposed method and VVC achieves better coding performance than regular VVC in object detection accuracy.
arXiv Detail & Related papers (2023-04-03T02:38:54Z) - Saliency-Driven Versatile Video Coding for Neural Object Detection [7.367608892486084]
We propose a saliency-driven coding framework for the video coding for machines task using the latest video coding standard Versatile Video Coding (VVC)
To determine the salient regions before encoding, we employ the real-time-capable object detection network You Only Look Once(YOLO) in combination with a novel decision criterion.
We find that, compared to the reference VVC with a constant quality, up to 29 % of accuracy can be saved with the same detection at the decoder side by applying the proposed saliency-driven framework.
arXiv Detail & Related papers (2022-03-11T14:27:43Z) - A New Image Codec Paradigm for Human and Machine Uses [53.48873918537017]
A new scalable image paradigm for both human and machine uses is proposed in this work.
The high-level instance segmentation map and the low-level signal features are extracted with neural networks.
An image is designed and trained to achieve the general-quality image reconstruction with the 16-bit gray-scale profile and signal features.
arXiv Detail & Related papers (2021-12-19T06:17:38Z) - Multitask Learning for VVC Quality Enhancement and Super-Resolution [11.446576112498596]
We propose a learning-based solution as a post-processing step to enhance the decoded VVC video quality.
Our method relies on multitask learning to perform both quality enhancement and super-resolution using a single shared network optimized for multiple levels.
arXiv Detail & Related papers (2021-04-16T19:05:26Z) - Content Adaptive and Error Propagation Aware Deep Video Compression [110.31693187153084]
We propose a content adaptive and error propagation aware video compression system.
Our method employs a joint training strategy by considering the compression performance of multiple consecutive frames instead of a single frame.
Instead of using the hand-crafted coding modes in the traditional compression systems, we design an online encoder updating scheme in our system.
arXiv Detail & Related papers (2020-03-25T09:04:24Z) - Video Coding for Machines: A Paradigm of Collaborative Compression and
Intelligent Analytics [127.65410486227007]
Video coding, which targets to compress and reconstruct the whole frame, and feature compression, which only preserves and transmits the most critical information, stand at two ends of the scale.
Recent endeavors in imminent trends of video compression, e.g. deep learning based coding tools and end-to-end image/video coding, and MPEG-7 compact feature descriptor standards, promote the sustainable and fast development in their own directions.
In this paper, thanks to booming AI technology, e.g. prediction and generation models, we carry out exploration in the new area, Video Coding for Machines (VCM), arising from the emerging MPEG
arXiv Detail & Related papers (2020-01-10T17:24:13Z) - An Emerging Coding Paradigm VCM: A Scalable Coding Approach Beyond
Feature and Signal [99.49099501559652]
Video Coding for Machine (VCM) aims to bridge the gap between visual feature compression and classical video coding.
We employ a conditional deep generation network to reconstruct video frames with the guidance of learned motion pattern.
By learning to extract sparse motion pattern via a predictive model, the network elegantly leverages the feature representation to generate the appearance of to-be-coded frames.
arXiv Detail & Related papers (2020-01-09T14:18:18Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.