Embracing Uncertainty: Decoupling and De-bias for Robust Temporal
Grounding
- URL: http://arxiv.org/abs/2103.16848v1
- Date: Wed, 31 Mar 2021 07:00:56 GMT
- Title: Embracing Uncertainty: Decoupling and De-bias for Robust Temporal
Grounding
- Authors: Hao Zhou, Chongyang Zhang, Yan Luo, Yanjun Chen, Chuanping Hu
- Abstract summary: Temporal grounding aims to localize temporal boundaries within untrimmed videos by language queries.
It faces the challenge of two types of inevitable human uncertainties: query uncertainty and label uncertainty.
We propose a novel DeNet (Decoupling and De-bias) to embrace human uncertainty.
- Score: 23.571580627202405
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Temporal grounding aims to localize temporal boundaries within untrimmed
videos by language queries, but it faces the challenge of two types of
inevitable human uncertainties: query uncertainty and label uncertainty. The
two uncertainties stem from human subjectivity and lead to the limited
generalization ability of temporal grounding models. In this work, we propose a novel
DeNet (Decoupling and De-bias) to embrace human uncertainty: Decoupling - We
explicitly disentangle each query into a relation feature and a modified
feature. The relation feature, which is mainly based on skeleton-like words
(including nouns and verbs), aims to extract basic and consistent information
in the presence of query uncertainty. Meanwhile, the modified feature, assigned with
style-like words (including adjectives, adverbs, etc.), represents the subjective
information and thus brings personalized predictions; De-bias - We propose a
de-bias mechanism to generate diverse predictions, aiming to alleviate the bias
caused by single-style annotations in the presence of label uncertainty.
Moreover, we put forward new multi-label metrics to diversify the performance
evaluation. Extensive experiments show that our approach is more effective and
robust than state-of-the-art methods on the Charades-STA and ActivityNet Captions
datasets.
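
To make the decoupling step concrete, here is a minimal sketch (not the authors' implementation) that separates a query into skeleton-like words (nouns, verbs) and style-like words (adjectives, adverbs) with an off-the-shelf part-of-speech tagger; the NLTK tag sets below are assumptions about which tags count as skeleton vs. style.

```python
# Minimal sketch of the query-decoupling idea (not the authors' code):
# split a language query into skeleton-like words (nouns, verbs), feeding
# the relation feature, and style-like words (adjectives, adverbs),
# feeding the modified feature.
import nltk

# Resource names can vary across NLTK versions.
nltk.download("punkt", quiet=True)
nltk.download("averaged_perceptron_tagger", quiet=True)

SKELETON_TAGS = {"NN", "NNS", "NNP", "NNPS",               # nouns
                 "VB", "VBD", "VBG", "VBN", "VBP", "VBZ"}  # verbs
STYLE_TAGS = {"JJ", "JJR", "JJS",                          # adjectives
              "RB", "RBR", "RBS"}                          # adverbs

def decouple_query(query: str):
    """Return (skeleton_words, style_words) for a natural-language query."""
    tagged = nltk.pos_tag(nltk.word_tokenize(query))
    skeleton = [w for w, t in tagged if t in SKELETON_TAGS]
    style = [w for w, t in tagged if t in STYLE_TAGS]
    return skeleton, style

# Example: skeleton ['person', 'opens', 'door'], style ['quickly', 'wooden']
# (exact output depends on the tagger version).
print(decouple_query("a person quickly opens the wooden door"))
```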
Related papers
- Certainly Uncertain: A Benchmark and Metric for Multimodal Epistemic and Aleatoric Awareness [106.52630978891054]
We present a taxonomy of uncertainty specific to vision-language AI systems.
We also introduce a new metric, confidence-weighted accuracy, which is well correlated with both accuracy and calibration error.
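
The metric is only named in the summary; as a hypothetical illustration, one way a confidence-weighted accuracy could be computed is to weight each prediction's correctness by the model's reported confidence (the paper's exact definition may differ).

```python
import numpy as np

def confidence_weighted_accuracy(y_true, y_pred, confidence):
    """Hypothetical sketch, not the paper's exact definition: accuracy
    with each sample weighted by the model's confidence in its own
    prediction."""
    y_true, y_pred = np.asarray(y_true), np.asarray(y_pred)
    confidence = np.asarray(confidence, dtype=float)
    correct = (y_true == y_pred).astype(float)
    # Confident correct answers raise the score; confident errors
    # drag it down relative to hesitant errors.
    return float(np.sum(confidence * correct) / np.sum(confidence))

print(confidence_weighted_accuracy([1, 0, 1], [1, 0, 0], [0.9, 0.8, 0.6]))
# ~0.739: the confident mistakes are penalized by their weight.
```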
arXiv Detail & Related papers (2024-07-02T04:23:54Z)
- Spurious Feature Eraser: Stabilizing Test-Time Adaptation for Vision-Language Foundation Model [86.9619638550683]
Vision-language foundation models have exhibited remarkable success across a multitude of downstream tasks due to their scalability on extensive image-text paired data.
However, these models display significant limitations when applied to downstream tasks, such as fine-grained image classification, as a result of "decision shortcuts".
arXiv Detail & Related papers (2024-03-01T09:01:53Z)
- CUE: An Uncertainty Interpretation Framework for Text Classifiers Built on Pre-Trained Language Models [28.750894873827068]
We propose a novel framework, called CUE, which aims to interpret uncertainties inherent in the predictions of PLM-based models.
By comparing the difference in predictive uncertainty between the perturbed and the original text representations, we are able to identify the latent dimensions responsible for uncertainty.
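
The summary describes perturbing latent dimensions of the text representation and comparing predictive uncertainty before and after; a minimal sketch of that comparison might look like the following (the function names and the entropy-based uncertainty proxy are assumptions, not the paper's API).

```python
import torch

def predictive_entropy(logits: torch.Tensor) -> torch.Tensor:
    """Uncertainty proxy: entropy of the softmax distribution."""
    probs = torch.softmax(logits, dim=-1)
    return -(probs * torch.log(probs + 1e-12)).sum(dim=-1)

def uncertainty_shift(classifier, text_repr: torch.Tensor, dim: int,
                      eps: float = 0.1) -> torch.Tensor:
    """Assumed illustration of the CUE idea: perturb one latent dimension
    of the text representation and measure how predictive uncertainty
    changes. Dimensions with large shifts are candidate sources of the
    model's uncertainty."""
    perturbed = text_repr.clone()
    perturbed[..., dim] += eps
    return (predictive_entropy(classifier(perturbed))
            - predictive_entropy(classifier(text_repr)))
```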
arXiv Detail & Related papers (2023-06-06T11:37:46Z)
- Ambiguity Meets Uncertainty: Investigating Uncertainty Estimation for Word Sense Disambiguation [5.55197751179213]
Existing supervised methods treat WSD as a classification task and have achieved remarkable performance.
This paper extensively studies uncertainty estimation (UE) on the benchmark designed for WSD.
We examine the model's capability of capturing data and model uncertainties with the selected UE score on well-designed test scenarios, and discover that the model reflects data uncertainty satisfactorily but underestimates model uncertainty.
arXiv Detail & Related papers (2023-05-22T15:18:15Z)
- Human-Guided Fair Classification for Natural Language Processing [9.652938946631735]
We show how to leverage unsupervised style transfer and GPT-3's zero-shot capabilities to generate semantically similar sentences that differ along sensitive attributes.
We validate the generated pairs via an extensive crowdsourcing study, which confirms that many of these pairs align with human intuition about fairness in the context of toxicity classification.
arXiv Detail & Related papers (2022-12-20T10:46:40Z)
- Measuring Fairness of Text Classifiers via Prediction Sensitivity [63.56554964580627]
ACCUMULATED PREDICTION SENSITIVITY measures fairness in machine learning models based on the model's prediction sensitivity to perturbations in input features.
We show that the metric can be theoretically linked with a specific notion of group fairness (statistical parity) and individual fairness.
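
As a rough illustration of prediction sensitivity (not the paper's exact ACCUMULATED PREDICTION SENSITIVITY formula), one can measure how strongly the model's output moves under small perturbations of the input features via input gradients:

```python
import torch

def prediction_sensitivity(model, x: torch.Tensor) -> torch.Tensor:
    """Rough illustration, not the paper's exact metric: magnitude of the
    gradient of the model output w.r.t. the input, accumulated over
    features. Large values mean predictions are sensitive to small
    changes in the input."""
    x = x.clone().requires_grad_(True)
    out = model(x).sum()
    (grad,) = torch.autograd.grad(out, x)
    return grad.abs().sum(dim=-1)  # per-sample accumulated sensitivity
```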
arXiv Detail & Related papers (2022-03-16T15:00:33Z)
- Dealing with Disagreements: Looking Beyond the Majority Vote in Subjective Annotations [6.546195629698355]
We investigate the efficacy of multi-annotator models for subjective tasks.
We show that this approach yields the same or better performance than aggregating labels in the data prior to training.
Our approach also provides a way to estimate uncertainty in predictions, which we demonstrate correlates better with annotation disagreements than traditional methods.
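
One common way to realize a multi-annotator model is a shared encoder with one prediction head per annotator, with disagreement across heads serving as the uncertainty estimate; the sketch below is an assumed architecture, not necessarily the authors' design.

```python
import torch
import torch.nn as nn

class MultiAnnotatorModel(nn.Module):
    """Assumed sketch: shared encoder plus one head per annotator,
    trained on each annotator's raw labels instead of a majority vote."""
    def __init__(self, in_dim: int, hidden: int, n_classes: int,
                 n_annotators: int):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(in_dim, hidden), nn.ReLU())
        self.heads = nn.ModuleList(
            nn.Linear(hidden, n_classes) for _ in range(n_annotators))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        h = self.encoder(x)
        # One logit vector per annotator: shape (batch, annotators, classes).
        return torch.stack([head(h) for head in self.heads], dim=1)

    def uncertainty(self, x: torch.Tensor) -> torch.Tensor:
        # Disagreement across annotator heads, here the variance of
        # per-head class probabilities, as a proxy for uncertainty.
        probs = torch.softmax(self.forward(x), dim=-1)
        return probs.var(dim=1).sum(dim=-1)
```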
arXiv Detail & Related papers (2021-10-12T03:12:34Z)
- CertainNet: Sampling-free Uncertainty Estimation for Object Detection [65.28989536741658]
Estimating the uncertainty of a neural network plays a fundamental role in safety-critical settings.
In this work, we propose a novel sampling-free uncertainty estimation method for object detection.
We call it CertainNet, and it is the first to provide separate uncertainties for each output signal: objectness, class, location and size.
arXiv Detail & Related papers (2021-10-04T17:59:31Z)
- Dive into Ambiguity: Latent Distribution Mining and Pairwise Uncertainty Estimation for Facial Expression Recognition [59.52434325897716]
We propose a solution, named DMUE, to address the problem of annotation ambiguity from two perspectives.
For the former, an auxiliary multi-branch learning framework is introduced to better mine and describe the latent distribution in the label space.
For the latter, the pairwise relationships of semantic features between instances are fully exploited to estimate the ambiguity extent in the instance space.
arXiv Detail & Related papers (2021-04-01T03:21:57Z)
- I Beg to Differ: A study of constructive disagreement in online conversations [15.581515781839656]
We construct a corpus of 7,425 Wikipedia Talk page conversations that contain content disputes.
We define the task of predicting whether disagreements will be escalated to mediation by a moderator.
We develop a variety of neural models and show that taking into account the structure of the conversation improves predictive accuracy.
arXiv Detail & Related papers (2021-01-26T16:36:43Z)
- Differentially Private and Fair Deep Learning: A Lagrangian Dual Approach [54.32266555843765]
This paper studies a model that protects the privacy of individuals' sensitive information while also learning non-discriminatory predictors.
The method relies on the notion of differential privacy and the use of Lagrangian duality to design neural networks that can accommodate fairness constraints.
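
The Lagrangian-duality idea can be sketched as adding a multiplier-weighted fairness-violation term to the training loss and updating the multiplier by gradient ascent; this is a generic sketch under a statistical-parity constraint, and the paper's differential-privacy noise is omitted here.

```python
import torch

# Generic sketch of a Lagrangian dual approach to fairness-constrained
# training (statistical parity); the differential-privacy mechanism of
# the paper is omitted.
lam = torch.zeros(1)   # Lagrange multiplier for the fairness constraint
ETA_DUAL = 0.01        # dual step size

def fairness_violation(probs: torch.Tensor, group: torch.Tensor) -> torch.Tensor:
    """Statistical-parity gap: |E[pred | group 0] - E[pred | group 1]|."""
    return (probs[group == 0].mean() - probs[group == 1].mean()).abs()

def lagrangian_loss(task_loss: torch.Tensor, probs: torch.Tensor,
                    group: torch.Tensor) -> torch.Tensor:
    # Primal step: minimize task loss plus multiplier-weighted violation.
    return task_loss + lam.item() * fairness_violation(probs, group)

def dual_step(probs: torch.Tensor, group: torch.Tensor) -> None:
    # Dual step: gradient ascent on the multiplier, kept non-negative.
    with torch.no_grad():
        lam.add_(ETA_DUAL * fairness_violation(probs, group)).clamp_(min=0.0)
```

Alternating the primal minimization and the dual ascent drives the multiplier up while the constraint is violated, tightening the fairness penalty over training.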
arXiv Detail & Related papers (2020-09-26T10:50:33Z)