Competitive Learning for Achieving Content-specific Filters in Video Coding for Machines
- URL: http://arxiv.org/abs/2406.12367v1
- Date: Tue, 18 Jun 2024 07:45:57 GMT
- Title: Competitive Learning for Achieving Content-specific Filters in Video Coding for Machines
- Authors: Honglei Zhang, Jukka I. Ahonen, Nam Le, Ruiying Yang, Francesco Cricri,
- Abstract summary: This paper investigates the efficacy of jointly optimizing content-specific post-processing filters to adapt a human-oriented video/image codec into a codec suitable for machine vision tasks.
We propose a novel training strategy based on competitive learning principles.
Experiments on the OpenImages dataset show an improvement in the BD-rate reduction from -41.3% and -44.6% to -42.3% and -44.7% for object detection and instance segmentation, respectively.
- Score: 5.155405463139862
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: This paper investigates the efficacy of jointly optimizing content-specific post-processing filters to adapt a human oriented video/image codec into a codec suitable for machine vision tasks. By observing that artifacts produced by video/image codecs are content-dependent, we propose a novel training strategy based on competitive learning principles. This strategy assigns training samples to filters dynamically, in a fuzzy manner, which further optimizes the winning filter on the given sample. Inspired by simulated annealing optimization techniques, we employ a softmax function with a temperature variable as the weight allocation function to mitigate the effects of random initialization. Our evaluation, conducted on a system utilizing multiple post-processing filters within a Versatile Video Coding (VVC) codec framework, demonstrates the superiority of content-specific filters trained with our proposed strategies, specifically, when images are processed in blocks. Using VVC reference software VTM 12.0 as the anchor, experiments on the OpenImages dataset show an improvement in the BD-rate reduction from -41.3% and -44.6% to -42.3% and -44.7% for object detection and instance segmentation tasks, respectively, compared to independently trained filters. The statistics of the filter usage align with our hypothesis and underscore the importance of jointly optimizing filters for both content and reconstruction quality. Our findings pave the way for further improving the performance of video/image codecs.
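The weight allocation described in the abstract (a softmax with a temperature variable that assigns each training sample to the filters in a fuzzy manner) can be sketched as follows. This is a minimal illustration, not the authors' implementation. It assumes that each filter is scored by its negative loss on the sample, so the filter with the lowest loss receives the largest weight. The function names, the exponential decay schedule, and the example loss values are all hypothetical.

```python
import math


def allocation_weights(losses, temperature):
    """Softmax over negative per-filter losses (assumed scoring rule).

    A high temperature gives nearly uniform weights; a low temperature
    concentrates the weight on the filter with the lowest loss.
    """
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scores = [-loss / temperature for loss in losses]
    peak = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def annealed_temperature(step, start=1.0, decay=0.9, floor=0.01):
    """Hypothetical exponential cooling schedule, clipped at a floor."""
    return max(floor, start * decay ** step)


def weighted_loss(losses, temperature):
    """Training objective for one sample: weight-averaged filter losses."""
    weights = allocation_weights(losses, temperature)
    return sum(w * l for w, l in zip(weights, losses))


if __name__ == "__main__":
    # Made-up losses of three filters on a single training sample.
    sample_losses = [0.8, 0.5, 0.9]
    for step in (0, 20, 40):
        t = annealed_temperature(step)
        weights = allocation_weights(sample_losses, t)
        print(step, round(t, 4), [round(w, 3) for w in weights])
```

Early in training the high temperature spreads gradient across all filters, which is one way to reduce the sensitivity to random initialization that the abstract mentions. As the temperature falls, the assignment approaches a winner-takes-all choice.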
Related papers
- RL-RC-DoT: A Block-level RL agent for Task-Aware Video Compression [68.31184784672227]
In modern applications such as autonomous driving, an overwhelming majority of videos serve as input for AI systems performing tasks.
It is therefore useful to optimize the encoder for a downstream task instead of for image quality.
Here, we address this challenge by controlling the Quantization Parameters (QPs) at the macro-block level to optimize the downstream task.
arXiv Detail & Related papers (2025-01-21T15:36:08Z)
- Video Decomposition Prior: A Methodology to Decompose Videos into Layers [74.36790196133505]
This paper introduces a novel video decomposition prior (VDP) framework which derives inspiration from professional video editing practices.
The VDP framework decomposes a video sequence into a set of multiple RGB layers and associated opacity levels.
We address tasks such as video object segmentation, dehazing, and relighting.
arXiv Detail & Related papers (2024-12-06T10:35:45Z)
- In-Loop Filtering via Trained Look-Up Tables [45.6756570330982]
In-loop filtering (ILF) is a key technology for removing the artifacts in image/video coding standards.
We propose an efficient and practical in-loop filtering scheme by adopting the Look-up Table (LUT).
Experimental results show that the ultrafast, very fast, and fast modes of the proposed method achieve on average 0.13%/0.34%/0.51%, and 0.10%/0.27%/0.39% BD-rate reduction.
arXiv Detail & Related papers (2024-07-15T17:25:42Z)
- CLIPVQA:Video Quality Assessment via CLIP [56.94085651315878]
We propose an efficient CLIP-based Transformer method for the VQA problem (CLIPVQA).
The proposed CLIPVQA achieves new state-of-the-art VQA performance and up to 37% better generalizability than existing benchmark VQA methods.
arXiv Detail & Related papers (2024-07-06T02:32:28Z)
- Adapting Learned Image Codecs to Screen Content via Adjustable Transformations [1.9249287163937978]
We propose to introduce parameterized and invertible linear transformations into the coding pipeline without changing the underlying baseline's operation flow.
Our end-to-end trained solution achieves up to 10% savings on SC compression compared to the baseline LICs.
arXiv Detail & Related papers (2024-02-27T14:34:14Z)
- Filter Pruning for Efficient CNNs via Knowledge-driven Differential Filter Sampler [103.97487121678276]
Filter pruning simultaneously accelerates the computation and reduces the memory overhead of CNNs.
We propose a novel Knowledge-driven Differential Filter Sampler (KDFS) with Masked Filter Modeling (MFM) framework for filter pruning.
arXiv Detail & Related papers (2023-07-01T02:28:41Z)
- End-to-End Rate-Distortion Optimized Learned Hierarchical Bi-Directional Video Compression [10.885590093103344]
Learned VC allows end-to-end rate-distortion (R-D) optimized training of nonlinear transform, motion and entropy model simultaneously.
This paper proposes a learned hierarchical bi-directional video codec (LHBDC) that combines the benefits of hierarchical motion-sampling and end-to-end optimization.
arXiv Detail & Related papers (2021-12-17T14:30:22Z)
- ELF-VC: Efficient Learned Flexible-Rate Video Coding [61.10102916737163]
We propose several novel ideas for learned video compression which allow for improved performance for the low-latency mode.
We benchmark our method, which we call ELF-VC, on popular video test sets UVG and MCL-JCV.
Our approach runs at least 5x faster and has fewer parameters than all ML codecs which report these figures.
arXiv Detail & Related papers (2021-04-29T17:50:35Z)
- Multi-Density Attention Network for Loop Filtering in Video Compression [9.322800480045336]
We propose an on-line scaling based multi-density attention network for loop filtering in video compression.
Experimental results show that a 10.18% bit-rate reduction at the same video quality can be achieved over the latest Versatile Video Coding (VVC) standard.
arXiv Detail & Related papers (2021-04-08T05:46:38Z)
This list is automatically generated from the titles and abstracts of the papers in this site.