A Closer Look at Temporal Sentence Grounding in Videos: Datasets and
Metrics
- URL: http://arxiv.org/abs/2101.09028v2
- Date: Wed, 27 Jan 2021 07:19:07 GMT
- Title: A Closer Look at Temporal Sentence Grounding in Videos: Datasets and
Metrics
- Authors: Yitian Yuan, Xiaohan Lan, Long Chen, Wei Liu, Xin Wang, Wenwu Zhu
- Abstract summary: We re-organize two widely-used TSGV datasets (Charades-STA and ActivityNet Captions) so that the moment annotation distribution of the test split differs from that of the training split.
We introduce a new evaluation metric "dR@$n$,IoU@$m$" to calibrate the basic IoU scores.
All the results demonstrate that the re-organized datasets and new metric can better monitor the progress in TSGV.
- Score: 70.45937234489044
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Although Temporal Sentence Grounding in Videos (TSGV) has realized impressive
progress over the last few years, current TSGV models tend to capture
moment annotation biases and fail to take full advantage of multi-modal inputs.
Surprisingly, some extremely simple TSGV baselines, even without any training,
can achieve state-of-the-art performance. In this paper, we first take a
closer look at the existing evaluation protocol, and argue that both the
prevailing datasets and metrics are the culprits behind this unreliable
benchmarking. To this end, we propose to re-organize two widely-used TSGV
datasets (Charades-STA and ActivityNet Captions), and deliberately
\textbf{C}hange the moment annotation \textbf{D}istribution of the test split
to make it different from the training split, dubbed Charades-CD and
ActivityNet-CD, respectively. Meanwhile, we further introduce a new evaluation
metric "dR@$n$,IoU@$m$" that calibrates the basic IoU scores by penalizing
over-long moment predictions more heavily, reducing the inflated performance
caused by the moment annotation biases (a minimal sketch of this metric
follows the abstract). Under this new evaluation protocol, we conduct
extensive experiments and ablation studies on eight state-of-the-art TSGV
models. All the results demonstrate that the re-organized datasets and new
metric can better monitor the progress in TSGV, which is still far from
satisfactory. The repository of this work is at
\url{https://github.com/yytzsy/grounding_changing_distribution}.
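To make the proposed metric concrete, here is a minimal Python sketch of a discounted "dR@n,IoU@m"-style recall. The duration-normalized discount factors `alpha_s` and `alpha_e` are an assumed form chosen to penalize over-long predictions; the exact definition is given in the paper and its repository.

```python
def temporal_iou(pred, gt):
    """IoU between two temporal segments given as (start, end) in seconds."""
    inter = max(0.0, min(pred[1], gt[1]) - max(pred[0], gt[0]))
    union = max(pred[1], gt[1]) - min(pred[0], gt[0])
    return inter / union if union > 0 else 0.0

def dr_at_n_iou_at_m(predictions, ground_truths, durations, n=1, m=0.5):
    """Fraction of queries whose top-n predicted moments reach a
    *discounted* IoU of at least m with the ground truth.

    alpha_s / alpha_e are an assumed duration-normalized form: they shrink
    toward 0 as the predicted start/end drifts away from the annotated
    boundary, so over-long predictions that merely cover the ground truth
    are penalized.
    """
    hits = 0
    for preds, gt, dur in zip(predictions, ground_truths, durations):
        for s, e in preds[:n]:
            alpha_s = 1.0 - abs(s - gt[0]) / dur  # start-boundary proximity
            alpha_e = 1.0 - abs(e - gt[1]) / dur  # end-boundary proximity
            if temporal_iou((s, e), gt) * alpha_s * alpha_e >= m:
                hits += 1
                break
    return hits / len(predictions)
```

Under this assumed form, a prediction (2.0, 9.0) against a ground truth (3.0, 8.0) in a 30-second video has raw IoU 5/7 ≈ 0.714 but a discounted score of 0.714 × (29/30) × (29/30) ≈ 0.667.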
Related papers
- Temporal Graph Benchmark for Machine Learning on Temporal Graphs [54.52243310226456]
Temporal Graph Benchmark (TGB) is a collection of challenging and diverse benchmark datasets.
We benchmark each dataset and find that the performance of common models can vary drastically across datasets.
TGB provides an automated machine learning pipeline for reproducible and accessible temporal graph research.
arXiv Detail & Related papers (2023-07-03T13:58:20Z)
- Transform-Equivariant Consistency Learning for Temporal Sentence Grounding [66.10949751429781]
We introduce a novel Equivariant Consistency Regulation Learning framework to learn more discriminative representations for each video.
Our motivation is that the temporal boundary of the query-guided activity should be predicted consistently across transformed versions of the video.
In particular, we devise a self-supervised consistency loss module to enhance the completeness and smoothness of the augmented video; a rough sketch of such a loss follows.
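This is not the paper's actual module, but a consistency loss of this flavor can be sketched as follows, assuming the augmented view is a temporal crop of the original video so that per-position boundary scores on the two views can be aligned by slicing:

```python
import numpy as np

def consistency_loss(pred_orig, pred_aug, crop_start):
    """Hypothetical consistency regularizer: boundary scores predicted on
    a temporally cropped view should agree with the scores predicted on
    the corresponding span of the original video."""
    aligned = pred_orig[crop_start:crop_start + len(pred_aug)]
    return float(np.mean((aligned - pred_aug) ** 2))  # MSE on the overlap
```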
arXiv Detail & Related papers (2023-05-06T19:29:28Z)
- BERT on a Data Diet: Finding Important Examples by Gradient-Based Pruning [20.404705741136777]
We introduce GraNd and its estimated version, EL2N, as scoring metrics for finding important examples in a dataset.
We show that by pruning a small portion of the examples with the highest GraNd/EL2N scores, we can not only preserve the test accuracy, but also surpass it.
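The EL2N score itself is simple to state: the L2 norm of the difference between the model's softmax output and the one-hot label. A minimal sketch (the pruning fraction here is illustrative):

```python
import numpy as np

def el2n_scores(logits, labels):
    """EL2N score per example: ||softmax(logits) - one_hot(label)||_2.
    Higher scores flag harder / noisier examples."""
    shifted = np.exp(logits - logits.max(axis=1, keepdims=True))
    probs = shifted / shifted.sum(axis=1, keepdims=True)
    one_hot = np.eye(logits.shape[1])[labels]
    return np.linalg.norm(probs - one_hot, axis=1)

def keep_after_pruning(scores, frac=0.02):
    """Indices kept after dropping the `frac` of examples with the
    highest scores."""
    n_drop = int(len(scores) * frac)
    return np.argsort(scores)[: len(scores) - n_drop]
```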
arXiv Detail & Related papers (2022-11-10T14:37:23Z)
- From Spectral Graph Convolutions to Large Scale Graph Convolutional Networks [0.0]
Graph Convolutional Networks (GCNs) have been shown to be a powerful concept that has been successfully applied to a large variety of tasks.
We study the theory that paved the way to the definition of GCN, including related parts of classical graph theory.
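For reference, the first-order propagation rule that this line of theory culminates in (the standard Kipf & Welling formulation, not code from this paper) fits in a few lines:

```python
import numpy as np

def gcn_layer(adj, features, weights):
    """One GCN layer: H' = ReLU(D^{-1/2} (A + I) D^{-1/2} H W)."""
    a_hat = adj + np.eye(adj.shape[0])             # add self-loops
    d_inv_sqrt = np.diag(1.0 / np.sqrt(a_hat.sum(axis=1)))
    norm_adj = d_inv_sqrt @ a_hat @ d_inv_sqrt     # symmetric normalization
    return np.maximum(norm_adj @ features @ weights, 0.0)  # ReLU
```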
arXiv Detail & Related papers (2022-07-12T16:57:08Z)
- A Closer Look at Debiased Temporal Sentence Grounding in Videos: Dataset, Metric, and Approach [53.727460222955266]
Temporal Sentence Grounding in Videos (TSGV) aims to ground a natural language sentence in an untrimmed video.
Recent studies have found that current benchmark datasets may have obvious moment annotation biases.
We introduce a new evaluation metric "dR@n,IoU@m" that discounts the basic recall scores to alleviate the inflated evaluation caused by biased datasets.
arXiv Detail & Related papers (2022-03-10T08:58:18Z)
- Networked Time Series Prediction with Incomplete Data [59.45358694862176]
We propose NETS-ImpGAN, a novel deep learning framework that can be trained on incomplete data with missing values in both history and future.
We conduct extensive experiments on three real-world datasets under different missing patterns and missing rates.
arXiv Detail & Related papers (2021-10-05T18:20:42Z)
- Time-Series Representation Learning via Temporal and Contextual Contrasting [14.688033556422337]
We propose an unsupervised Time-Series representation learning framework via Temporal and Contextual Contrasting (TS-TCC).
First, the raw time-series data are transformed into two different yet correlated views by using weak and strong augmentations (sketched below).
Second, we propose a novel temporal contrasting module to learn robust temporal representations by designing a tough cross-view prediction task.
Third, to further learn discriminative representations, we propose a contextual contrasting module built upon the contexts from the temporal contrasting module.
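As a rough illustration of the first step only: weak and strong views for time series are often implemented as jitter-and-scale and segment-permutation-plus-jitter, respectively. A minimal sketch for a 1-D series, with illustrative parameter values:

```python
import numpy as np

rng = np.random.default_rng(0)

def weak_augment(x, scale=0.1, sigma=0.05):
    """Weak view: random scaling plus small additive jitter."""
    return x * (1.0 + rng.normal(0.0, scale)) + rng.normal(0.0, sigma, x.shape)

def strong_augment(x, n_segments=5, sigma=0.05):
    """Strong view: shuffle temporal segments, then add jitter."""
    segments = np.array_split(x, n_segments)
    rng.shuffle(segments)
    return np.concatenate(segments) + rng.normal(0.0, sigma, x.shape)
```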
arXiv Detail & Related papers (2021-06-26T23:56:31Z)
- Improving Calibration for Long-Tailed Recognition [68.32848696795519]
We propose two methods to improve calibration and performance in such scenarios.
For dataset bias due to different samplers, we propose shifted batch normalization.
Our proposed methods set new records on multiple popular long-tailed recognition benchmark datasets.
arXiv Detail & Related papers (2021-04-01T13:55:21Z)
This list is automatically generated from the titles and abstracts of the papers on this site.