In-Loop Filtering via Trained Look-Up Tables
- URL: http://arxiv.org/abs/2407.10926v2
- Date: Wed, 11 Sep 2024 08:12:55 GMT
- Title: In-Loop Filtering via Trained Look-Up Tables
- Authors: Zhuoyuan Li, Jiacheng Li, Yao Li, Li Li, Dong Liu, Feng Wu
- Abstract summary: In-loop filtering (ILF) is a key technology for removing artifacts in image/video coding standards.
We propose an efficient and practical in-loop filtering scheme by adopting the Look-up Table (LUT).
Experimental results show that the ultrafast, very fast, and fast modes of the proposed method achieve on average 0.13%/0.34%/0.51% and 0.10%/0.27%/0.39% BD-rate reduction.
- Score: 45.6756570330982
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In-loop filtering (ILF) is a key technology for removing artifacts in image/video coding standards. Recently, neural network-based in-loop filtering methods have achieved remarkable coding gains beyond the capability of advanced video coding standards, making them powerful coding tool candidates for future standards. However, deep neural networks bring heavy time and computational complexity and demand high-performance hardware, which makes them challenging to apply in general coding scenarios. To address this limitation, inspired by explorations in image restoration, we propose an efficient and practical in-loop filtering scheme by adopting the Look-up Table (LUT). We train the in-loop filtering DNN within a fixed filtering reference range and cache the output values of the DNN into a LUT by traversing all possible inputs. At test time in the coding process, the filtered pixel is generated by locating the input pixels (the to-be-filtered pixel together with its reference pixels) and interpolating the cached filtered pixel values. To further enable a large filtering reference range under the limited storage cost of the LUT, we introduce an enhanced indexing mechanism in the filtering process and a clipping/finetuning mechanism in training. The proposed method is implemented in the Versatile Video Coding (VVC) reference software, VTM-11.0. Experimental results show that the ultrafast, very fast, and fast modes of the proposed method achieve on average 0.13%/0.34%/0.51% and 0.10%/0.27%/0.39% BD-rate reduction under the all intra (AI) and random access (RA) configurations, respectively. Notably, our method has modest time and computational complexity, with only a 101%/102%-104%/108% time increase at 0.13-0.93 kMACs/pixel, and only 164-1148 KB storage cost for a single model. Our solution may shed light on the evolution of practical neural network-based coding tools.
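The cache-and-interpolate procedure described in the abstract can be illustrated with a minimal sketch. Everything here is an illustrative assumption rather than the paper's actual design: `toy_filter` stands in for the trained DNN, the reference range is reduced to a single reference pixel, and a uniform 16-level sampling step plays the role of the paper's indexing mechanism.

```python
import numpy as np

# Toy "filter": stands in for the trained DNN's mapping from
# (to-be-filtered pixel, reference pixel) to a filtered value.
def toy_filter(p, r):
    return 0.75 * p + 0.25 * r  # simple blend, illustration only

STEP = 16                            # subsample the 0..255 range every 16 levels
grid = np.arange(0, 256 + 1, STEP)   # 17 sampled levels, endpoints included

# Cache phase: traverse all sampled (pixel, reference) pairs once,
# storing the filter outputs in a small 2-D LUT.
lut = np.array([[toy_filter(p, r) for r in grid] for p in grid])

def lut_filter(p, r):
    """Filter pixel p with reference r by bilinear interpolation in the LUT."""
    i = min(p // STEP, len(grid) - 2)   # locate the enclosing LUT cell
    j = min(r // STEP, len(grid) - 2)
    fp = (p - grid[i]) / STEP           # fractional positions inside the cell
    fr = (r - grid[j]) / STEP
    return ((1 - fp) * (1 - fr) * lut[i, j]
            + fp * (1 - fr) * lut[i + 1, j]
            + (1 - fp) * fr * lut[i, j + 1]
            + fp * fr * lut[i + 1, j + 1])
```

Because `toy_filter` happens to be linear, the interpolated LUT output reproduces it exactly; with a real DNN the interpolation would introduce a small approximation error, which the paper's clipping/finetuning mechanism is designed to control.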
Related papers
- Advanced Learning-Based Inter Prediction for Future Video Coding [46.4999280984859]
The paper proposes a low complexity learning-based inter prediction (LLIP) method to replace the traditional INTERPF.
LLIP enhances the filtering process by leveraging a lightweight neural network model, where parameters can be exported for efficient inference.
Ultimately, we replace the traditional handcraft filtering parameters in INTERPF with the learned optimal filtering parameters.
arXiv Detail & Related papers (2024-11-24T08:47:00Z) - Competitive Learning for Achieving Content-specific Filters in Video Coding for Machines [5.155405463139862]
This paper investigates the efficacy of jointly optimizing content-specific post-processing filters to adapt human-oriented video/images to a machine vision task.
We propose a novel training strategy based on competitive learning principles.
Experiments on the OpenImages dataset show an improvement in the BD-rate reduction from -41.3% to -44.6%.
arXiv Detail & Related papers (2024-06-18T07:45:57Z) - Filter Pruning for Efficient CNNs via Knowledge-driven Differential Filter Sampler [103.97487121678276]
Filter pruning simultaneously accelerates the computation and reduces the memory overhead of CNNs.
We propose a novel Knowledge-driven Differential Filter Sampler(KDFS) with Masked Filter Modeling(MFM) framework for filter pruning.
arXiv Detail & Related papers (2023-07-01T02:28:41Z) - Complexity Reduction of Learned In-Loop Filtering in Video Coding [12.06039429078762]
In video coding, in-loop filters are applied on reconstructed video frames to enhance their perceptual quality, before storing the frames for output.
The proposed method uses a novel combination of sparsity and structured pruning for complexity reduction of learned in-loop filters.
arXiv Detail & Related papers (2022-03-16T14:34:41Z) - A Global Appearance and Local Coding Distortion based Fusion Framework for CNN based Filtering in Video Coding [15.778380865885842]
In-loop filtering is used in video coding to process the reconstructed frame in order to remove blocking artifacts.
In this paper, we address the filtering problem from two aspects, global appearance restoration for disrupted texture and local coding distortion restoration caused by fixed pipeline of coding.
A three-stream global appearance and local coding distortion based fusion network is developed with a high-level global feature stream, a high-level local feature stream and a low-level local feature stream.
arXiv Detail & Related papers (2021-06-24T03:08:44Z) - Quantized Neural Networks via {-1, +1} Encoding Decomposition and Acceleration [83.84684675841167]
We propose a novel encoding scheme using {-1, +1} to decompose quantized neural networks (QNNs) into multi-branch binary networks.
We validate the effectiveness of our method on large-scale image classification, object detection, and semantic segmentation tasks.
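The decomposition claimed in the entry above can be sketched for uniformly quantized k-bit weights; the bit-plane identity used here (mapping each {0, 1} bit-plane to a {-1, +1} branch via b = (s + 1)/2) is a standard construction assumed for illustration, not necessarily the paper's exact scheme.

```python
import numpy as np

rng = np.random.default_rng(0)
k = 3                                    # bits per quantized weight
W = rng.integers(0, 2**k, size=(4, 5))   # toy k-bit quantized weight matrix
x = rng.standard_normal(5)

# Each bit-plane b_i in {0, 1} becomes a {-1, +1} matrix S_i via b = (s + 1)/2,
# so W = sum_i 2^i * b_i = offset + sum_i 2^(i-1) * S_i, with a constant offset
# (2^k - 1)/2 collected from the "+1" terms.
S = [2 * ((W >> i) & 1) - 1 for i in range(k)]  # multi-branch binary matrices
offset = (2**k - 1) / 2

# The quantized matrix-vector product becomes a sum of binary-branch products.
y_branches = offset * x.sum() + sum((2**(i - 1)) * (S[i] @ x) for i in range(k))
assert np.allclose(W @ x, y_branches)
```

Each branch `S[i] @ x` involves only {-1, +1} weights, which is what enables the hardware acceleration the paper targets.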
arXiv Detail & Related papers (2021-06-18T03:11:15Z) - Low-Fidelity End-to-End Video Encoder Pre-training for Temporal Action Localization [96.73647162960842]
Temporal action localization (TAL) is a fundamental yet challenging task in video understanding.
Existing TAL methods rely on pre-training a video encoder through action classification supervision.
We introduce a novel low-fidelity end-to-end (LoFi) video encoder pre-training method.
arXiv Detail & Related papers (2021-03-28T22:18:14Z) - Computational optimization of convolutional neural networks using separated filters architecture [69.73393478582027]
Use of convolutional neural networks (CNNs) is the standard approach to image recognition despite the fact that they can be too computationally demanding.
We consider a convolutional neural network transformation that reduces computational complexity and thus speeds up neural network processing.
arXiv Detail & Related papers (2020-02-18T17:42:13Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.