Towards Streaming Egocentric Action Anticipation
- URL: http://arxiv.org/abs/2110.05386v1
- Date: Mon, 11 Oct 2021 16:22:56 GMT
- Title: Towards Streaming Egocentric Action Anticipation
- Authors: Antonino Furnari and Giovanni Maria Farinella
- Abstract summary: Egocentric action anticipation is the task of predicting the future actions a camera wearer will likely perform based on past video observations.
Current evaluation schemes assume that predictions can be made offline, and hence that computational resources are not limited.
We propose a ``streaming'' egocentric action anticipation evaluation protocol which explicitly considers model runtime for performance assessment.
- Score: 23.9991007631236
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Egocentric action anticipation is the task of predicting the future actions a
camera wearer will likely perform based on past video observations. While in a
real-world system it is fundamental to output such predictions before the
action begins, past works have not generally paid attention to model runtime
during evaluation. Indeed, current evaluation schemes assume that predictions
can be made offline, and hence that computational resources are not limited. In
contrast, in this paper, we propose a ``streaming'' egocentric action
anticipation evaluation protocol which explicitly considers model runtime for
performance assessment, assuming that predictions will be available only after
the current video segment is processed, which depends on the processing time of
a method. Following the proposed evaluation scheme, we benchmark different
state-of-the-art approaches for egocentric action anticipation on two popular
datasets. Our analysis shows that models with a smaller runtime tend to
outperform heavier models in the considered streaming scenario, thus changing
the rankings generally observed in standard offline evaluations. Based on this
observation, we propose a lightweight action anticipation model consisting of a
simple feed-forward 3D CNN, which we propose to optimize using knowledge
distillation techniques and a custom loss. The results show that the proposed
approach outperforms prior art in the streaming scenario, also in combination
with other lightweight models.
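To make the streaming idea concrete, the sketch below shows how model runtime can push back the most recent observation a prediction is based on. It is an illustration under stated assumptions, not the authors' protocol: it assumes the model runs inferences back to back, each taking a fixed `runtime` seconds, and that a prediction is usable only once its inference has finished. The function names and the scheduling model are hypothetical.

```python
import math
from typing import Optional, Tuple


def latest_available_prediction(
    eval_time: float, runtime: float, start_time: float = 0.0
) -> Optional[Tuple[float, float]]:
    """Return (observed_until, ready_at) for the newest prediction usable at eval_time.

    Assumption (not from the paper): inference k starts at start_time + k * runtime,
    sees video up to that instant, and finishes at start_time + (k + 1) * runtime.
    """
    if runtime <= 0:
        raise ValueError("runtime must be positive")
    # Number of inferences that have fully completed by eval_time.
    completed = math.floor((eval_time - start_time) / runtime)
    if completed < 1:
        return None  # no prediction has been produced yet
    observed_until = start_time + (completed - 1) * runtime
    ready_at = start_time + completed * runtime
    return observed_until, ready_at


def effective_anticipation_time(
    action_start: float, anticipation_time: float, runtime: float
) -> Optional[float]:
    """Gap between the last observed frame and the action start.

    An offline evaluation would report `anticipation_time`; under the
    streaming assumption the gap grows with the model runtime.
    """
    eval_time = action_start - anticipation_time
    result = latest_available_prediction(eval_time, runtime)
    if result is None:
        return None
    observed_until, _ = result
    return action_start - observed_until


if __name__ == "__main__":
    # Action starts at t = 10 s; the prediction is needed 1 s beforehand.
    for runtime in (0.25, 0.5, 2.0):
        gap = effective_anticipation_time(10.0, 1.0, runtime)
        print(f"runtime={runtime:>4} s -> effective anticipation gap={gap} s")
```

Under these assumptions, a slower model has to rely on older observations for the same deadline, which is one way to read the abstract's finding that lighter models tend to do better in the streaming scenario.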
Related papers
- Forecasting with Deep Learning: Beyond Average of Average of Average Performance [0.393259574660092]
Current practices for evaluating and comparing forecasting models focus on summarising performance into a single score.
We propose a novel framework for evaluating models from multiple perspectives.
We show the advantages of this framework by comparing a state-of-the-art deep learning approach with classical forecasting techniques.
arXiv Detail & Related papers (2024-06-24T12:28:22Z)
- Predictive Churn with the Set of Good Models [64.05949860750235]
We study the effect of conflicting predictions over the set of near-optimal machine learning models.
We present theoretical results on the expected churn between models within the Rashomon set.
We show how our approach can be used to better anticipate, reduce, and avoid churn in consumer-facing applications.
arXiv Detail & Related papers (2024-02-12T16:15:25Z)
- Fine-grained Forecasting Models Via Gaussian Process Blurring Effect [6.472434306724611]
Time series forecasting is a challenging task due to the existence of complex and dynamic temporal dependencies.
Using more training data is one way to improve the accuracy, but this source is often limited.
We are building on successful denoising approaches for image generation by advocating for an end-to-end forecasting and denoising paradigm.
arXiv Detail & Related papers (2023-12-21T20:25:16Z)
- Streaming egocentric action anticipation: An evaluation scheme and approach [27.391434284586985]
Egocentric action anticipation aims to predict the future actions the camera wearer will perform from the observation of the past.
Current evaluation schemes assume that predictions are available right after the input video is observed.
We propose a streaming egocentric action evaluation scheme which assumes that predictions are performed online and made available only after the model has processed the current input segment.
arXiv Detail & Related papers (2023-06-29T04:53:29Z)
- A positive feedback method based on F-measure value for Salient Object Detection [1.9249287163937976]
This paper proposes a positive feedback method based on F-measure value for salient object detection (SOD).
Our proposed method takes an image to be detected and inputs it into several existing models to obtain their respective prediction maps.
Experimental results on five publicly available datasets show that our proposed positive feedback method outperforms the latest 12 methods in five evaluation metrics for saliency map prediction.
arXiv Detail & Related papers (2023-04-28T04:05:13Z)
- A Control-Centric Benchmark for Video Prediction [69.22614362800692]
We propose a benchmark for action-conditioned video prediction in the form of a control benchmark.
Our benchmark includes simulated environments with 11 task categories and 310 task instance definitions.
We then leverage our benchmark to study the effects of scaling model size, quantity of training data, and model ensembling.
arXiv Detail & Related papers (2023-04-26T17:59:45Z)
- You Mostly Walk Alone: Analyzing Feature Attribution in Trajectory Prediction [52.442129609979794]
Recent deep learning approaches for trajectory prediction show promising performance.
It remains unclear which features such black-box models actually learn to use for making predictions.
This paper proposes a procedure that quantifies the contributions of different cues to model performance.
arXiv Detail & Related papers (2021-10-11T14:24:15Z)
- Models, Pixels, and Rewards: Evaluating Design Trade-offs in Visual Model-Based Reinforcement Learning [109.74041512359476]
We study a number of design decisions for the predictive model in visual MBRL algorithms.
We find that a range of design decisions that are often considered crucial, such as the use of latent spaces, have little effect on task performance.
We show how this phenomenon is related to exploration and how some of the lower-scoring models on standard benchmarks will perform the same as the best-performing models when trained on the same training data.
arXiv Detail & Related papers (2020-12-08T18:03:21Z)
- Modeling Online Behavior in Recommender Systems: The Importance of Temporal Context [30.894950420437926]
We show how omitting temporal context when evaluating recommender system performance leads to false confidence.
We propose a training procedure to further embed the temporal context in existing models.
Results show that including our temporal objective can improve recall@20 by up to 20%.
arXiv Detail & Related papers (2020-09-19T19:36:43Z)
- Video Prediction via Example Guidance [156.08546987158616]
In video prediction tasks, one major challenge is to capture the multi-modal nature of future contents and dynamics.
In this work, we propose a simple yet effective framework that can efficiently predict plausible future states.
arXiv Detail & Related papers (2020-07-03T14:57:24Z)
- Counterfactual Predictions under Runtime Confounding [74.90756694584839]
We study the counterfactual prediction task in the setting where all relevant factors are captured in the historical data.
We propose a doubly-robust procedure for learning counterfactual prediction models in this setting.
arXiv Detail & Related papers (2020-06-30T15:49:05Z)
This list is automatically generated from the titles and abstracts of the papers in this site.