Benchmarking Data Efficiency and Computational Efficiency of Temporal
Action Localization Models
- URL: http://arxiv.org/abs/2308.13082v1
- Date: Thu, 24 Aug 2023 20:59:55 GMT
- Title: Benchmarking Data Efficiency and Computational Efficiency of Temporal
Action Localization Models
- Authors: Jan Warchocki, Teodor Oprescu, Yunhan Wang, Alexandru Damacus, Paul
Misterka, Robert-Jan Bruintjes, Attila Lengyel, Ombretta Strafforello, Jan
van Gemert
- Abstract summary: In temporal action localization, given an input video, the goal is to predict which actions it contains, where they begin, and where they end.
This work explores and measures how current deep temporal action localization models perform in settings constrained by the amount of data or computational power.
- Score: 42.06124795143787
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: In temporal action localization, given an input video, the goal is to predict
which actions it contains, where they begin, and where they end. Training and
testing current state-of-the-art deep learning models requires access to large
amounts of data and computational power. However, gathering such data is
challenging and computational resources might be limited. This work explores
and measures how current deep temporal action localization models perform in
settings constrained by the amount of data or computational power. We measure
data efficiency by training each model on a subset of the training set. We find
that TemporalMaxer outperforms other models in data-limited settings.
Furthermore, we recommend TriDet when training time is limited. To test the
efficiency of the models during inference, we pass videos of different lengths
through each model. We find that TemporalMaxer requires the least computational
resources, likely due to its simple architecture.
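The two protocols are simple to outline in code. Below is a minimal, hypothetical PyTorch sketch of both measurements: training on fixed fractions of the training set (data efficiency) and timing the forward pass on synthetic videos of growing temporal length (inference efficiency). ToyLocalizer, the feature dimension, and all sizes are illustrative placeholders, not the benchmarked models (TriDet, TemporalMaxer, and related architectures).

```python
import time
import torch
import torch.nn as nn
from torch.utils.data import DataLoader, Subset, TensorDataset

# Toy stand-in for a temporal action localization model: the real models
# consume per-frame feature sequences of shape (batch, time, dim).
class ToyLocalizer(nn.Module):
    def __init__(self, dim=256):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(dim, dim), nn.ReLU(), nn.Linear(dim, 3))

    def forward(self, x):  # x: (B, T, D) -> per-frame (start, end, class) logits
        return self.net(x)

def subset_loader(dataset, fraction, batch_size=4):
    """Data-efficiency protocol: expose only a fraction of the training set."""
    n = int(len(dataset) * fraction)
    return DataLoader(Subset(dataset, range(n)), batch_size=batch_size)

@torch.no_grad()
def time_inference(model, lengths=(256, 512, 1024, 2048), dim=256):
    """Inference-efficiency protocol: forward synthetic videos of growing length."""
    model.eval()
    timings = {}
    for t in lengths:
        x = torch.randn(1, t, dim)           # one video: t frames of features
        start = time.perf_counter()
        model(x)
        timings[t] = time.perf_counter() - start
    return timings

if __name__ == "__main__":
    data = TensorDataset(torch.randn(100, 128, 256))  # 100 dummy feature clips
    print(len(subset_loader(data, 0.25).dataset))     # 25 clips at a 25% budget
    print(time_inference(ToyLocalizer()))             # seconds per video length
```

Plotting accuracy against the training fraction, and wall-clock time against video length, gives the two axes of comparison the abstract describes.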
Related papers
- Data-Centric Machine Learning for Earth Observation: Necessary and Sufficient Features [5.143097874851516]
We leverage model explanation methods to identify the features crucial for the model to reach optimal performance.
Some datasets can reach their optimal accuracy with less than 20% of the temporal instances, while in other datasets, the time series of a single band from a single modality is sufficient.
arXiv Detail & Related papers (2024-08-21T07:26:43Z)
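The summary does not name the explanation method; permutation importance is one standard choice for ranking input features such as temporal instances or spectral bands. A toy sketch, with an illustrative scoring function standing in for a trained model:

```python
import numpy as np

def permutation_importance(score, X, y, rng=np.random.default_rng(0)):
    """Drop in score after shuffling each feature column; bigger drop = more crucial."""
    base = score(X, y)
    drops = []
    for j in range(X.shape[1]):
        Xp = X.copy()
        rng.shuffle(Xp[:, j])             # destroy feature j's signal
        drops.append(base - score(Xp, y))
    return np.array(drops)

X = np.random.default_rng(1).normal(size=(200, 4))
y = (X[:, 0] > 0).astype(int)             # only feature 0 carries signal
toy_score = lambda X, y: ((X[:, 0] > 0).astype(int) == y).mean()
print(permutation_importance(toy_score, X, y))  # large drop for feature 0 only
```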
- Data Shapley in One Training Run [88.59484417202454]
Data Shapley provides a principled framework for attributing data's contribution within machine learning contexts.
Existing approaches require re-training models on different data subsets, which is computationally intensive.
This paper introduces In-Run Data Shapley, which addresses these limitations by offering scalable data attribution for a target model of interest.
arXiv Detail & Related papers (2024-06-16T17:09:24Z)
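For context, the retraining-based estimator that In-Run Data Shapley is designed to avoid can be sketched as a Monte Carlo average of marginal contributions over random permutations; `utility` is the expensive retrain-and-evaluate step, and the toy utility here is purely illustrative:

```python
import random
from statistics import mean

def shapley_estimates(points, utility, rounds=200, seed=0):
    """Classic Monte Carlo Data Shapley: each call to utility(subset)
    would retrain and evaluate a model, which is the costly part."""
    rng = random.Random(seed)
    contrib = {p: [] for p in points}
    for _ in range(rounds):
        order = points[:]
        rng.shuffle(order)
        prefix, prev = [], utility([])
        for p in order:
            prefix.append(p)
            cur = utility(prefix)
            contrib[p].append(cur - prev)  # marginal contribution of p
            prev = cur
    return {p: mean(v) for p, v in contrib.items()}

clean = {1, 2, 3}  # toy utility: a subset is worth its number of "clean" points
print(shapley_estimates([1, 2, 3, 4], lambda s: len(set(s) & clean)))
```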
- Combating Missing Modalities in Egocentric Videos at Test Time [92.38662956154256]
Real-world applications often face challenges with incomplete modalities due to privacy concerns, efficiency needs, or hardware issues.
We propose a novel approach to address this issue at test time without requiring retraining.
MiDl represents the first self-supervised, online solution for handling missing modalities exclusively at test time.
arXiv Detail & Related papers (2024-04-23T16:01:33Z)
- Certain and Approximately Certain Models for Statistical Learning [4.318959672085627]
We show that it is possible to learn accurate models directly from data with missing values for certain training data and target models.
We build efficient algorithms with theoretical guarantees that check whether imputation is necessary and return accurate models when it is not.
arXiv Detail & Related papers (2024-02-27T22:49:33Z)
- Strategies and impact of learning curve estimation for CNN-based image classification [0.2678472239880052]
Learning curves measure how the performance of a machine learning model improves with the volume of training data.
Across a wide variety of applications and models, learning curves have been observed to largely follow a power-law behavior.
By estimating a model's learning curve from training on small subsets of the data, only the most promising models need to be trained on the full dataset.
arXiv Detail & Related papers (2023-10-12T16:28:25Z)
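To make the power-law claim concrete, here is a minimal fit of the common ansatz err(n) = a + b * n^(-c) to illustrative subset results, extrapolated to a larger dataset size; all numbers are invented:

```python
import numpy as np
from scipy.optimize import curve_fit

def power_law(n, a, b, c):
    return a + b * n ** (-c)   # error as a function of training-set size n

sizes = np.array([100, 200, 500, 1000, 2000])      # subset sizes (illustrative)
errors = np.array([0.42, 0.35, 0.27, 0.23, 0.20])  # measured test errors

params, _ = curve_fit(power_law, sizes, errors, p0=(0.1, 1.0, 0.5), maxfev=10000)
print(f"predicted error at n=50000: {power_law(50_000, *params):.3f}")
```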
- Pushing the Limits of Pre-training for Time Series Forecasting in the CloudOps Domain [54.67888148566323]
We introduce three large-scale time series forecasting datasets from the cloud operations domain.
We show that the resulting pre-trained model is a strong zero-shot baseline and benefits from further scaling in both model and dataset size.
Accompanying these datasets is a suite of comprehensive benchmark results comparing classical and deep learning baselines to our pre-trained method.
arXiv Detail & Related papers (2023-10-08T08:09:51Z)
- Building a Performance Model for Deep Learning Recommendation Model Training on GPUs [6.05245376098191]
We devise a performance model for GPU training of Deep Learning Recommendation Models (DLRM).
We show that both the device active time (the sum of kernel runtimes) and the device idle time are important components of the overall device time.
We propose a critical-path-based algorithm to predict the per-batch training time of DLRM by traversing its execution graph.
arXiv Detail & Related papers (2022-01-19T19:05:42Z)
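A critical-path traversal of the kind the summary describes can be sketched on a toy execution graph; the node names and runtimes below are invented, and the real performance model also folds in device idle time between kernels:

```python
from graphlib import TopologicalSorter

# Toy DLRM-like execution graph: node -> (runtime in ms, dependencies).
graph = {
    "embedding": (1.2, []),
    "bottom_mlp": (0.8, []),
    "interaction": (0.5, ["embedding", "bottom_mlp"]),
    "top_mlp": (1.0, ["interaction"]),
}

def critical_path_ms(graph):
    """Longest dependency chain = lower bound on per-batch device time."""
    order = TopologicalSorter({k: set(d) for k, (_, d) in graph.items()}).static_order()
    finish = {}
    for node in order:
        runtime, deps = graph[node]
        finish[node] = runtime + max((finish[d] for d in deps), default=0.0)
    return max(finish.values())

print(critical_path_ms(graph))  # 1.2 + 0.5 + 1.0 = 2.7 ms
```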
- Efficient Nearest Neighbor Language Models [114.40866461741795]
Non-parametric neural language models (NLMs) learn predictive distributions over text using an external datastore.
We show how to achieve up to a 6x inference speed-up while retaining comparable performance.
arXiv Detail & Related papers (2021-09-09T12:32:28Z)
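The kNN-LM construction that such models accelerate interpolates the base LM's next-token distribution with a distribution induced by neighbours retrieved from the datastore. A minimal sketch with illustrative arrays and hyperparameters:

```python
import numpy as np

def knn_lm(p_lm, neighbour_tokens, neighbour_dists, vocab_size, lam=0.25, temp=1.0):
    """p(y|x) = lam * p_knn(y|x) + (1 - lam) * p_lm(y|x)."""
    weights = np.exp(-np.asarray(neighbour_dists) / temp)  # closer -> heavier
    weights /= weights.sum()
    p_knn = np.zeros(vocab_size)
    for tok, w in zip(neighbour_tokens, weights):
        p_knn[tok] += w                                    # aggregate per token
    return lam * p_knn + (1.0 - lam) * p_lm

p_lm = np.full(5, 0.2)                 # uniform base LM over a 5-token vocabulary
print(knn_lm(p_lm, [2, 2, 4], [0.1, 0.2, 0.9], vocab_size=5))
```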
- GeoStat Representations of Time Series for Fast Classification [30.987852463546698]
We introduce GeoStat representations for time series.
GeoStat representations are based on a generalization of recent methods for trajectory classification.
We show that this methodology achieves good performance on a challenging dataset involving the classification of fishing vessels.
arXiv Detail & Related papers (2020-07-13T20:48:03Z)
- Convolutional Tensor-Train LSTM for Spatio-temporal Learning [116.24172387469994]
We propose a higher-order LSTM model that can efficiently learn long-term correlations in the video sequence.
This is accomplished through a novel tensor train module that performs prediction by combining convolutional features across time.
Our results achieve state-of-the-art performance across a wide range of applications and datasets.
arXiv Detail & Related papers (2020-02-21T05:00:01Z)
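The tensor-train format behind the module can be shown in a few lines; this is the generic decomposition with arbitrary ranks, not the paper's convolutional variant:

```python
import numpy as np

# An order-3 tensor stored as three TT cores of shape (r_prev, n_k, r_next);
# any entry is recovered by multiplying one matrix slice per core.
cores = [np.random.rand(1, 4, 3), np.random.rand(3, 4, 3), np.random.rand(3, 4, 1)]

def tt_entry(cores, idx):
    out = np.eye(1)
    for G, i in zip(cores, idx):
        out = out @ G[:, i, :]        # contract along the TT ranks
    return out.item()

print(tt_entry(cores, (0, 2, 1)))     # one entry of the implied 4x4x4 tensor
```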
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.