Related papers: The Practice of Averaging Rate-Distortion Curves over Testsets to Compare Learned Video Codecs Can Cause Misleading Conclusions

URL: http://arxiv.org/abs/2409.08772v2
Date: Tue, 24 Dec 2024 08:18:25 GMT
Title: The Practice of Averaging Rate-Distortion Curves over Testsets to Compare Learned Video Codecs Can Cause Misleading Conclusions
Authors: M. Akin Yilmaz, Onur Keleş, A. Murat Tekalp
Abstract summary: We show how averaged rate-distortion curves can mislead comparative evaluation of different codecs. A single video with distinct RD characteristics from the rest of the test set can disproportionately influence the average curve. We argue that the learned video compression community should also report per-sequence RD curves and performance metrics for a test set.
Score: 7.714092783675679
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: This paper demonstrates how the prevalent practice in the learned video compression community of averaging rate-distortion (RD) curves across a test video set can lead to misleading conclusions when evaluating codec performance. Through analysis of a simple case and experimental results with two recent learned video codecs, we show how averaged RD curves can mislead comparative evaluation of different codecs, particularly when videos in a dataset have varying characteristics and operating ranges. We illustrate how a single video with distinct RD characteristics from the rest of the test set can disproportionately influence the average RD curve, potentially overshadowing a codec's superior performance across most individual sequences. Using two recent learned video codecs on the UVG dataset as a case study, we demonstrate that computing performance metrics, such as the BD rate, from the average RD curve suggests conclusions that contradict those reached by averaging per-sequence metrics. Hence, we argue that the learned video compression community should also report per-sequence RD curves, and that performance metrics for a test set should be computed as the average of per-sequence metrics, similar to the established practice in traditional video coding, to ensure fair and accurate codec comparisons.
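
The effect described in the abstract can be reproduced on synthetic numbers. The sketch below is not the authors' code or data: the four sequences, their rates and PSNR values, and the helper name bd_rate are all hypothetical, and the BD rate is computed with the usual cubic fit of log-rate against PSNR. Three "typical" sequences are set up so that the test codec saves 10% of the bits, while one high-rate outlier costs it 20% more bits.

```python
import numpy as np

def bd_rate(rate_anchor, psnr_anchor, rate_test, psnr_test):
    """Bjontegaard delta rate in percent; negative means the test codec saves bits."""
    pa = np.asarray(psnr_anchor, dtype=float)
    pt = np.asarray(psnr_test, dtype=float)
    # Fit log-rate as a cubic polynomial of PSNR for each codec.
    fit_a = np.polyfit(pa, np.log(np.asarray(rate_anchor, dtype=float)), 3)
    fit_t = np.polyfit(pt, np.log(np.asarray(rate_test, dtype=float)), 3)
    # Integrate both fits over the PSNR interval covered by both curves.
    lo, hi = max(pa.min(), pt.min()), min(pa.max(), pt.max())
    int_a, int_t = np.polyint(fit_a), np.polyint(fit_t)
    area_a = np.polyval(int_a, hi) - np.polyval(int_a, lo)
    area_t = np.polyval(int_t, hi) - np.polyval(int_t, lo)
    # Average log-rate difference, converted back to a percentage.
    return (np.exp((area_t - area_a) / (hi - lo)) - 1.0) * 100.0

# Hypothetical test set: (anchor rate, test rate, PSNR) at four quality levels.
base_rate = np.array([0.01, 0.02, 0.04, 0.08])  # bits per pixel, made up
sequences = []
for offset in (0.0, 0.5, 1.0):
    psnr = np.array([32.0, 34.0, 36.0, 38.0]) + offset
    sequences.append((base_rate, 0.9 * base_rate, psnr))  # test codec: 10% fewer bits
# One outlier operating at 10x the bitrate, where the test codec needs 20% more bits.
outlier_rate = 10.0 * base_rate
sequences.append((outlier_rate, 1.2 * outlier_rate, np.array([30.0, 32.0, 34.0, 36.0])))

# Option 1: BD rate per sequence, then average the metric.
per_seq = [bd_rate(ra, p, rt, p) for ra, rt, p in sequences]
mean_per_seq = float(np.mean(per_seq))

# Option 2: average the RD points per quality level, then one BD rate.
avg_rate_anchor = np.mean([s[0] for s in sequences], axis=0)
avg_rate_test = np.mean([s[1] for s in sequences], axis=0)
avg_psnr = np.mean([s[2] for s in sequences], axis=0)
bd_of_avg = float(bd_rate(avg_rate_anchor, avg_psnr, avg_rate_test, avg_psnr))

print(f"mean of per-sequence BD rates: {mean_per_seq:+.2f}%")
print(f"BD rate of the averaged curve: {bd_of_avg:+.2f}%")
```

On these made-up numbers the mean of the per-sequence BD rates is about -2.5% (the test codec looks better), while the BD rate of the averaged curve is about +13.1% (the test codec looks worse), because the outlier's much larger bitrates dominate the averaged rate axis.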

Related papers

Learning from Streaming Video with Orthogonal Gradients [62.51504086522027]
We address the challenge of representation learning from a continuous stream of video as input, in a self-supervised manner. This differs from the standard approaches to video learning where videos are chopped and shuffled during training in order to create a non-redundant batch. We demonstrate the drop in performance when moving from shuffled to sequential learning on three tasks.
arXiv Detail & Related papers (2025-04-02T17:59:57Z)
Benchmarking Conventional and Learned Video Codecs with a Low-Delay Configuration [11.016119119250765]
This paper conducts a comparative study of state-of-the-art conventional and learned video coding methods based on a low delay configuration. To allow a fair and meaningful comparison, the evaluation was performed on test sequences defined in the AOM and MPEG common test conditions in the YCbCr 4:2:0 color space. The evaluation results show that the JVET ECM codecs offer the best overall coding performance among all codecs tested.
arXiv Detail & Related papers (2024-08-09T12:55:23Z)
Not All Pairs are Equal: Hierarchical Learning for Average-Precision-Oriented Video Retrieval [80.09819072780193]
Average Precision (AP) assesses the overall rankings of relevant videos at the top list. Recent video retrieval methods utilize pair-wise losses that treat all sample pairs equally.
arXiv Detail & Related papers (2024-07-22T11:52:04Z)
Hierarchical B-frame Video Coding for Long Group of Pictures [42.229439873835254]
We present an end-to-end learned video codec for random access that combines training on long sequences of frames, rate allocation and content adaptation at inference. Under common test conditions, it achieves results comparable to VTM in terms of YUV-PSNR BD-Rate on some classes of videos. On average it surpasses open LD and RA end-to-end solutions in terms of VMAF and YUV BD-Rates.
arXiv Detail & Related papers (2024-06-24T11:29:52Z)
Uncertainty-Aware Deep Video Compression with Ensembles [24.245365441718654]
We propose an uncertainty-aware video compression model that can effectively capture predictive uncertainty with deep ensembles. Our model can effectively save bits by more than 20% compared to 1080p sequences.
arXiv Detail & Related papers (2024-03-28T05:44:48Z)
Towards Debiasing Frame Length Bias in Text-Video Retrieval via Causal Intervention [72.12974259966592]
We present a unique and systematic study of a temporal bias due to frame length discrepancy between training and test sets of trimmed video clips. We propose a causal debiasing approach and perform extensive experiments and ablation studies on the Epic-Kitchens-100, YouCook2, and MSR-VTT datasets.
arXiv Detail & Related papers (2023-09-17T15:58:27Z)
Video Compression with Arbitrary Rescaling Network [8.489428003916622]
We propose a rate-guided arbitrary rescaling network (RARN) for video resizing before encoding. The lightweight RARN structure can process FHD (1080p) content at real-time speed (91 FPS) and obtain a considerable rate reduction.
arXiv Detail & Related papers (2023-06-07T07:15:18Z)
Video compression dataset and benchmark of learning-based video-quality metrics [55.41644538483948]
We present a new benchmark for video-quality metrics that evaluates video compression. It is based on a new dataset consisting of about 2,500 streams encoded using different standards. Subjective scores were collected using crowdsourced pairwise comparisons.
arXiv Detail & Related papers (2022-11-22T09:22:28Z)
Transfer of Representations to Video Label Propagation: Implementation Factors Matter [31.030799003595522]
We study the impact of important implementation factors in feature extraction and label propagation. We show that augmenting video-based correspondence cues with still-image-based ones can further improve performance. We hope that this study will help to improve evaluation practices and better inform future research directions in temporal correspondence.
arXiv Detail & Related papers (2022-03-10T18:58:22Z)
End-to-End Rate-Distortion Optimized Learned Hierarchical Bi-Directional Video Compression [10.885590093103344]
Learned VC allows end-to-end rate-distortion (R-D) optimized training of nonlinear transform, motion and entropy models simultaneously. This paper proposes a learned hierarchical bi-directional video codec (LHBDC) that combines the benefits of hierarchical motion-sampling and end-to-end optimization.
arXiv Detail & Related papers (2021-12-17T14:30:22Z)
Perceptual Learned Video Compression with Recurrent Conditional GAN [158.0726042755]
We propose a Perceptual Learned Video Compression (PLVC) approach with recurrent conditional generative adversarial network. PLVC learns to compress video towards good perceptual quality at low bit-rate. The user study further validates the outstanding perceptual performance of PLVC in comparison with the latest learned video compression approaches.
arXiv Detail & Related papers (2021-09-07T13:36:57Z)
Objective video quality metrics application to video codecs comparisons: choosing the best for subjective quality estimation [101.18253437732933]
Quality assessment plays a key role in creating and comparing video compression algorithms. For comparison, we used a set of videos encoded with video codecs of different standards, and visual quality scores collected for the resulting set of streams from 2018 to 2021.
arXiv Detail & Related papers (2021-07-21T17:18:11Z)
Content Adaptive and Error Propagation Aware Deep Video Compression [110.31693187153084]
We propose a content adaptive and error propagation aware video compression system. Our method employs a joint training strategy by considering the compression performance of multiple consecutive frames instead of a single frame. Instead of using the hand-crafted coding modes in the traditional compression systems, we design an online encoder updating scheme in our system.
arXiv Detail & Related papers (2020-03-25T09:04:24Z)

This list is automatically generated from the titles and abstracts of the papers on this site.