Benchmarking Video Frame Interpolation
- URL: http://arxiv.org/abs/2403.17128v1
- Date: Mon, 25 Mar 2024 19:13:12 GMT
- Title: Benchmarking Video Frame Interpolation
- Authors: Simon Kiefhaber, Simon Niklaus, Feng Liu, Simone Schaub-Meyer
- Abstract summary: We present a benchmark which establishes consistent error metrics by utilizing a submission website that computes them.
We also present a test set adhering to the assumption of linearity by utilizing synthetic data, and evaluate the computational efficiency in a coherent manner.
- Score: 11.918489436283748
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Video frame interpolation, the task of synthesizing new frames in between two or more given ones, is becoming an increasingly popular research target. However, the current evaluation of frame interpolation techniques is not ideal. Due to the plethora of test datasets available and inconsistent computation of error metrics, a coherent and fair comparison across papers is very challenging. Furthermore, new test sets have been proposed as part of method papers so they are unable to provide the in-depth evaluation of a dedicated benchmarking paper. Another severe downside is that these test sets violate the assumption of linearity when given two input frames, making it impossible to solve without an oracle. We hence strongly believe that the community would greatly benefit from a benchmarking paper, which is what we propose. Specifically, we present a benchmark which establishes consistent error metrics by utilizing a submission website that computes them, provides insights by analyzing the interpolation quality with respect to various per-pixel attributes such as the motion magnitude, contains a carefully designed test set adhering to the assumption of linearity by utilizing synthetic data, and evaluates the computational efficiency in a coherent manner.
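To make the abstract's measurement ideas concrete, the following is a minimal sketch, assuming numpy, (H, W, 3) frames, and made-up bin edges, of a standard error metric (PSNR) and of breaking interpolation error down by per-pixel motion magnitude. It is illustrative only; the benchmark itself computes metrics on its submission website. Note that under the linearity assumption a pixel moving from p0 in the first input frame to p1 in the second is expected at (p0 + p1) / 2 in the middle frame, and that the synthetic test set is what provides the ground-truth flow field such an analysis requires.

```python
# Illustrative sketch only (not the benchmark's submission-site code):
# PSNR plus a motion-magnitude breakdown of per-pixel interpolation error.
import numpy as np

def psnr(pred: np.ndarray, gt: np.ndarray, max_val: float = 255.0) -> float:
    """Peak signal-to-noise ratio between predicted and ground-truth frames."""
    mse = np.mean((pred.astype(np.float64) - gt.astype(np.float64)) ** 2)
    return float("inf") if mse == 0 else 10.0 * np.log10(max_val**2 / mse)

def error_by_motion_magnitude(pred, gt, flow, bins=(0, 5, 10, 20, 40, np.inf)):
    """Mean absolute error per motion-magnitude bin (bin edges are made up).

    pred/gt are (H, W, 3) frames; flow is an (H, W, 2) ground-truth flow
    field, which synthetic data makes available.
    """
    err = np.abs(pred.astype(np.float64) - gt.astype(np.float64)).mean(axis=-1)
    mag = np.linalg.norm(flow, axis=-1)
    return {f"[{lo}, {hi})": float(err[(mag >= lo) & (mag < hi)].mean())
            for lo, hi in zip(bins[:-1], bins[1:])
            if np.any((mag >= lo) & (mag < hi))}
```

Reporting error per bin rather than as a single scalar is what lets a benchmark attribute failures to, e.g., large motion rather than to the test set as a whole.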
Related papers
- Pareto Set Identification With Posterior Sampling [14.121842087273167]
This paper studies Pareto set identification (PSI) in the transductive linear setting with potentially correlated objectives.
We propose the PSIPS algorithm that deals simultaneously with structure and correlation without paying the computational cost of existing oracle-based algorithms.
arXiv Detail & Related papers (2024-11-07T18:15:38Z)
- Using Similarity to Evaluate Factual Consistency in Summaries [2.7595794227140056]
Abstractive summarisers generate fluent summaries, but the factuality of the generated text is not guaranteed.
We propose a new zero-shot factuality evaluation metric, Sentence-BERTScore (SBERTScore), which compares sentences between the summary and the source document.
Our experiments indicate that each technique has different strengths, with SBERTScore particularly effective in identifying correct summaries.
arXiv Detail & Related papers (2024-09-23T15:02:38Z)
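As a loose illustration of the SBERTScore entry above: embed sentences with Sentence-BERT, then score each summary sentence by its best-matching source sentence. This is a sketch of the metric as described, not the authors' reference implementation; the model name and the max-then-mean aggregation are assumptions.

```python
# Hedged sketch of an SBERTScore-style check, assuming the sentence-transformers
# package; scores each summary sentence by its most similar source sentence.
import numpy as np
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("all-MiniLM-L6-v2")  # model choice is an assumption

def sbertscore(summary_sents: list[str], source_sents: list[str]) -> float:
    summ = model.encode(summary_sents, normalize_embeddings=True)
    src = model.encode(source_sents, normalize_embeddings=True)
    sims = np.asarray(summ) @ np.asarray(src).T  # cosine, embeddings normalized
    return float(sims.max(axis=1).mean())        # best source match, averaged
```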
- SparseCL: Sparse Contrastive Learning for Contradiction Retrieval [87.02936971689817]
Contradiction retrieval refers to identifying and extracting documents that explicitly disagree with or refute the content of a query.
Existing methods such as similarity search and cross-encoder models exhibit significant limitations.
We introduce SparseCL, which leverages specially trained sentence embeddings designed to preserve subtle, contradictory nuances between sentences.
arXiv Detail & Related papers (2024-06-15T21:57:03Z)
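A rough sketch of the retrieval intuition in the SparseCL entry above: a contradicting document should stay semantically close to the query while differing in a few concentrated embedding dimensions. The score below, cosine similarity plus the Hoyer sparsity of the embedding difference, is an illustrative reading, not the paper's training objective; the weight alpha is made up.

```python
# Illustrative contradiction scoring over precomputed sentence embeddings;
# the combination and weighting here are assumptions, not SparseCL's objective.
import numpy as np

def hoyer_sparsity(x: np.ndarray, eps: float = 1e-12) -> float:
    """Hoyer sparsity in [0, 1]; 1 means energy concentrated in one dimension."""
    n = x.size
    return float((np.sqrt(n) - np.abs(x).sum() / (np.linalg.norm(x) + eps))
                 / (np.sqrt(n) - 1))

def contradiction_score(q: np.ndarray, d: np.ndarray, alpha: float = 1.0) -> float:
    """Favor documents that are on-topic (high cosine) yet differ sparsely."""
    cos = float(q @ d / (np.linalg.norm(q) * np.linalg.norm(d)))
    return cos + alpha * hoyer_sparsity(q - d)
```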
- How to Evaluate Entity Resolution Systems: An Entity-Centric Framework with Application to Inventor Name Disambiguation [1.7812428873698403]
We propose an entity-centric data labeling methodology that integrates with a unified framework for monitoring summary statistics.
These benchmark data sets can then be used for model training and a variety of evaluation tasks.
arXiv Detail & Related papers (2024-04-08T15:53:29Z)
- Revisiting Evaluation Metrics for Semantic Segmentation: Optimization and Evaluation of Fine-grained Intersection over Union [113.20223082664681]
We propose the use of fine-grained mIoUs along with corresponding worst-case metrics.
These fine-grained metrics offer less bias towards large objects, richer statistical information, and valuable insights into model and dataset auditing.
Our benchmark study highlights the necessity of not basing evaluations on a single metric and confirms that fine-grained mIoUs reduce the bias towards large objects.
arXiv Detail & Related papers (2023-10-30T03:45:15Z)
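To clarify what the fine-grained metrics in the entry above change relative to classic mIoU, the sketch below contrasts dataset-pooled mIoU (where large objects dominate the pooled unions) with a per-image variant; the paper's exact fine-grained definitions and worst-case metrics differ, so treat this as the general shape of the computation only.

```python
# Sketch: classic dataset-pooled mIoU vs. a finer-grained per-image mIoU.
import numpy as np

def iou_per_class(pred, gt, num_classes):
    """Per-class IoU for one image; NaN where a class is absent."""
    ious = np.full(num_classes, np.nan)
    for c in range(num_classes):
        inter = np.logical_and(pred == c, gt == c).sum()
        union = np.logical_or(pred == c, gt == c).sum()
        if union > 0:
            ious[c] = inter / union
    return ious

def dataset_miou(preds, gts, num_classes):
    """Classic mIoU: pool intersections and unions over the whole dataset."""
    inter, union = np.zeros(num_classes), np.zeros(num_classes)
    for p, g in zip(preds, gts):
        for c in range(num_classes):
            inter[c] += np.logical_and(p == c, g == c).sum()
            union[c] += np.logical_or(p == c, g == c).sum()
    return float(np.nanmean(np.where(union > 0,
                                     inter / np.maximum(union, 1), np.nan)))

def per_image_miou(preds, gts, num_classes):
    """Finer-grained: average each image's own mIoU, reducing large-object bias."""
    return float(np.nanmean([np.nanmean(iou_per_class(p, g, num_classes))
                             for p, g in zip(preds, gts)]))
```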
- A Critical Re-evaluation of Benchmark Datasets for (Deep) Learning-Based Matching Algorithms [11.264467955516706]
We propose four approaches to assessing the difficulty and appropriateness of 13 established datasets.
We show that most of the popular datasets pose rather easy classification tasks.
We also propose a new methodology for generating benchmark datasets.
arXiv Detail & Related papers (2023-07-03T07:54:54Z)
- Evaluating Graph Neural Networks for Link Prediction: Current Pitfalls and New Benchmarking [66.83273589348758]
Link prediction attempts to predict whether an unseen edge exists based on only a portion of edges of a graph.
A flurry of methods attempting to use graph neural networks (GNNs) for this task has been introduced in recent years.
New and diverse datasets have also been created to better evaluate the effectiveness of these new models.
arXiv Detail & Related papers (2023-06-18T01:58:59Z)
- A Closer Look at Debiased Temporal Sentence Grounding in Videos: Dataset, Metric, and Approach [53.727460222955266]
Temporal Sentence Grounding in Videos (TSGV) aims to ground a natural language sentence in an untrimmed video.
Recent studies have found that current benchmark datasets may have obvious moment annotation biases.
We introduce a new evaluation metric, "dR@n,IoU@m", which discounts the basic recall scores to alleviate the inflated evaluation caused by biased datasets.
arXiv Detail & Related papers (2022-03-10T08:58:18Z)
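The metric in the entry above can be sketched as follows: standard R@n,IoU@m counts a query as a hit if any of its top-n predicted moments reaches temporal IoU m with the ground truth, and the discounted variant scales each hit by how close the predicted boundaries are. The discount form below, one minus the duration-normalized start/end offsets, is an illustrative reading of the metric, not a verified reproduction.

```python
# Hedged sketch of a discounted "dR@n,IoU@m"-style recall; the exact discount
# in the paper may differ from the boundary-offset form used here.
import numpy as np

def t_iou(a, b):
    """Temporal IoU of two (start, end) moments in seconds."""
    inter = max(0.0, min(a[1], b[1]) - max(a[0], b[0]))
    union = max(a[1], b[1]) - min(a[0], b[0])
    return inter / union if union > 0 else 0.0

def discounted_recall(preds_per_query, gts, durations, n=1, m=0.5):
    """Average discounted hit; preds_per_query holds ranked (start, end) lists."""
    scores = []
    for preds, gt, dur in zip(preds_per_query, gts, durations):
        best = 0.0
        for p in preds[:n]:
            if t_iou(p, gt) >= m:
                a_s = 1.0 - abs(p[0] - gt[0]) / dur  # start-offset discount
                a_e = 1.0 - abs(p[1] - gt[1]) / dur  # end-offset discount
                best = max(best, a_s * a_e)
        scores.append(best)
    return float(np.mean(scores))
```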
- Are Missing Links Predictable? An Inferential Benchmark for Knowledge Graph Completion [79.07695173192472]
InferWiki improves upon existing benchmarks in inferential ability, assumptions, and patterns.
Each testing sample is predictable with supportive data in the training set.
In experiments, we curate two settings of InferWiki varying in sizes and structures, and apply the construction process on CoDEx as comparative datasets.
arXiv Detail & Related papers (2021-08-03T09:51:15Z)
- Comprehensive Studies for Arbitrary-shape Scene Text Detection [78.50639779134944]
We propose a unified framework for the bottom-up based scene text detection methods.
Under the unified framework, we ensure the consistent settings for non-core modules.
Through comprehensive investigations and detailed analyses, we reveal the advantages and disadvantages of previous models.
arXiv Detail & Related papers (2021-07-25T13:18:55Z)