Context-aware and Scale-insensitive Temporal Repetition Counting
- URL: http://arxiv.org/abs/2005.08465v1
- Date: Mon, 18 May 2020 05:49:48 GMT
- Title: Context-aware and Scale-insensitive Temporal Repetition Counting
- Authors: Huaidong Zhang, Xuemiao Xu, Guoqiang Han, and Shengfeng He
- Abstract summary: Temporal repetition counting aims to estimate the number of cycles of a given repetitive action.
Existing deep learning methods assume repetitive actions are performed at a fixed time-scale, which does not hold for the complex repetitive actions of real life.
We propose a context-aware and scale-insensitive framework to tackle the challenges in repetition counting caused by unknown and diverse cycle lengths.
- Score: 60.40438811580856
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Temporal repetition counting aims to estimate the number of cycles of a given
repetitive action. Existing deep learning methods assume repetitive actions are
performed at a fixed time-scale, which does not hold for the complex repetitive
actions of real life. In this paper, we tailor a context-aware and
scale-insensitive framework to tackle the challenges in repetition counting
caused by unknown and diverse cycle lengths. Our approach combines two key
insights: (1) Cycle lengths of different actions are unpredictable and require
searching over a large range, but once a coarse cycle length is determined, the
variation between repetitions can be handled by regression. (2) Determining the
cycle length cannot rely only on a short video fragment; it requires contextual
understanding. The first point is implemented by a coarse-to-fine cycle
refinement method. It avoids the heavy computation of exhaustively searching
all the cycle lengths in the video, and, instead, it propagates the coarse
prediction for further refinement in a hierarchical manner. Second, we propose
a bidirectional cycle length estimation method for context-aware prediction.
It is a regression network that takes two consecutive coarse cycles as input,
and predicts the locations of the previous and next repetitive cycles. To
support the training and evaluation of temporal repetition counting, we
construct a new benchmark, the largest to date, containing 526 videos with
diverse repetitive actions. Extensive experiments show that the proposed network
trained on a single dataset outperforms state-of-the-art methods on several
benchmarks, indicating that the proposed framework is general enough to capture
repetition patterns across domains.
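As a rough illustration of the coarse-to-fine idea, the sketch below scans a coarse grid of candidate cycle lengths and repeatedly re-grids around the best one. The scoring function, grid sizes, and refinement schedule are invented stand-ins: in the paper, scoring and boundary refinement are performed by learned networks, including the bidirectional regressor.

```python
import numpy as np

def score_cycle(features, length):
    # Hypothetical score: frames one cycle apart should look alike.
    # A small penalty biases the search toward the fundamental period
    # rather than its multiples.
    if length >= len(features):
        return -np.inf
    mismatch = np.linalg.norm(features[:-length] - features[length:], axis=1).mean()
    return -mismatch - 1e-3 * length

def coarse_to_fine_cycle_length(features, min_len=4, max_len=200, levels=4):
    # Hierarchical search: score a coarse grid of candidate lengths,
    # then re-grid around the current best with a smaller step,
    # instead of exhaustively scoring every possible length.
    step = max((max_len - min_len) // 7, 1)
    candidates = list(range(min_len, max_len + 1, step))
    best = candidates[0]
    for _ in range(levels):
        best = max(candidates, key=lambda L: score_cycle(features, L))
        step = max(step // 4, 1)
        candidates = [L for L in range(best - 3 * step, best + 3 * step + 1, step)
                      if min_len <= L <= max_len]
    return best

# Toy check: a 2-D feature track repeating every 32 frames.
t = np.arange(512)
features = np.stack([np.sin(2 * np.pi * t / 32),
                     np.cos(2 * np.pi * t / 32)], axis=1)
print(coarse_to_fine_cycle_length(features))  # -> 32
```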
Related papers
- Recycled Attention: Efficient inference for long-context language models [54.00118604124301]
We propose Recycled Attention, an inference-time method which alternates between full context attention and attention over a subset of input tokens.
When performing partial attention, we recycle the attention pattern of a previous token that has performed full attention and attend only to the top K most attended tokens.
Compared to previously proposed inference-time acceleration methods, which attend only to the local context or to tokens with high accumulated attention scores, our approach flexibly chooses tokens that are relevant to the current decoding step.
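A toy single-head NumPy sketch of the alternation described above; the stride, K, and the static key/value cache are illustrative simplifications (in real decoding the cache grows each step).

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def decode_with_recycled_attention(queries, keys, values, k_top=4, stride=4):
    # Every `stride` steps: full attention over the whole cache, and the
    # indices of the K most-attended keys are remembered ("recycled").
    # In between: attention only over that recycled subset.
    outputs, recycled = [], None
    for step, q in enumerate(queries):
        if step % stride == 0 or recycled is None:
            scores = keys @ q                       # full attention
            recycled = np.argsort(scores)[-k_top:]  # top-K attended keys
            outputs.append(softmax(scores) @ values)
        else:
            sub = keys[recycled] @ q                # partial attention
            outputs.append(softmax(sub) @ values[recycled])
    return np.stack(outputs)

rng = np.random.default_rng(0)
keys = rng.normal(size=(64, 16))     # long cached context (static here)
values = rng.normal(size=(64, 16))
queries = rng.normal(size=(8, 16))   # eight decoding steps
print(decode_with_recycled_attention(queries, keys, values).shape)  # (8, 16)
```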
arXiv Detail & Related papers (2024-11-08T18:57:07Z)
- IVAC-P2L: Leveraging Irregular Repetition Priors for Improving Video Action Counting [24.596979713593765]
Video Action Counting (VAC) is crucial in analyzing repetitive actions in videos.
Traditional methods have overlooked the complexity of action repetitions, such as interruptions and the variability in cycle duration.
We introduce Irregular Video Action Counting (IVAC), which prioritizes modeling irregular repetition patterns in videos.
arXiv Detail & Related papers (2024-03-18T16:56:47Z)
- Efficient Action Counting with Dynamic Queries [31.833468477101604]
We introduce a novel approach that employs an action query representation to localize repeated action cycles with linear computational complexity.
Unlike static action queries, this approach dynamically embeds video features into action queries, offering a more flexible and generalizable representation.
Our method significantly outperforms previous works, particularly on long video sequences, unseen actions, and actions at various speeds.
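A minimal sketch of what "dynamically embeds video features into action queries" could mean, assuming a DETR-style cross-attention update; the shapes and the single update step are illustrative, not the paper's architecture.

```python
import numpy as np

def softmax_rows(x):
    e = np.exp(x - x.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def update_action_queries(queries, frame_feats):
    # One cross-attention step: each query gathers evidence from all
    # frames. Cost is O(num_queries * num_frames) -- linear in video
    # length for a fixed query budget.
    attn = softmax_rows(queries @ frame_feats.T)  # (Q, T)
    return queries + attn @ frame_feats           # residual update

rng = np.random.default_rng(0)
frames = rng.normal(size=(300, 32))   # 300 frames of video features
queries = rng.normal(size=(10, 32))   # 10 learnable action queries
print(update_action_queries(queries, frames).shape)  # (10, 32)
```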
arXiv Detail & Related papers (2024-03-03T15:43:11Z)
- Curricular and Cyclical Loss for Time Series Learning Strategy [17.725840333187577]
We propose a novel Curricular and CyclicaL loss (CRUCIAL), the first of its kind for time series learning.
CRUCIAL has two characteristics: It can arrange an easy-to-hard learning order and achieve an adaptive cycle.
We prove that, compared with a monotonic size, a cyclical size can reduce the expected error.
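A speculative reading of "easy-to-hard order" plus "cyclical size" as a sampling schedule; the cosine cycle and the fractions below are invented for illustration.

```python
import numpy as np

def crucial_schedule(losses, epoch, period=10, min_frac=0.3):
    # Rank samples easy-to-hard by current loss, and keep a fraction of
    # them that cycles over epochs rather than growing monotonically.
    phase = 0.5 * (1 - np.cos(2 * np.pi * epoch / period))  # 0 -> 1 -> 0
    frac = min_frac + (1 - min_frac) * phase
    keep = max(1, int(frac * len(losses)))
    order = np.argsort(losses)  # easiest (lowest loss) first
    return order[:keep]         # indices to train on this epoch

rng = np.random.default_rng(0)
sample_losses = rng.random(100)
for epoch in (0, 5, 10):
    print(epoch, len(crucial_schedule(sample_losses, epoch)))  # 30, 100, 30
```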
arXiv Detail & Related papers (2023-12-26T02:40:05Z)
- Mitigating the Learning Bias towards Repetition by Self-Contrastive Training for Open-Ended Generation [92.42032403795879]
We show that pretrained language models (LMs) such as GPT-2 still tend to generate repetitive text.
We attribute their overestimation of token-level repetition probabilities to the learning bias.
We find that LMs use longer-range dependencies to predict repetitive tokens than non-repetitive ones, which may be the cause of sentence-level repetition loops.
arXiv Detail & Related papers (2023-07-04T07:53:55Z)
- Full Resolution Repetition Counting [19.676724611655914]
Given an untrimmed video, repetitive action counting aims to estimate the number of repetitions of class-agnostic actions.
Down-sampling is commonly used in recent state-of-the-art methods, causing some repetitive samples to be missed.
In this paper, we attempt to understand repetitive actions at full temporal resolution by combining offline feature extraction and temporal convolution networks.
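A toy sketch of the full-resolution pipeline: offline per-frame features pass through temporal convolutions with no down-sampling, and a per-frame density is summed into a count. The layer shapes and the density head are assumptions, not the paper's network.

```python
import numpy as np

def temporal_conv(x, w):
    # 'Same'-padded 1-D convolution over time; x is (T, C_in),
    # w is (C_out, C_in, k).
    k = w.shape[-1]
    xp = np.pad(x, ((k // 2, k // 2), (0, 0)))
    out = np.empty((x.shape[0], w.shape[0]))
    for t in range(x.shape[0]):
        out[t] = np.tensordot(w, xp[t:t + k].T, axes=([1, 2], [0, 1]))
    return out

def count_repetitions(frame_feats, w1, w2):
    # Temporal convolutions over every frame of precomputed (offline)
    # features -- no down-sampling -- yield a per-frame repetition
    # density whose sum is the predicted count.
    h = np.maximum(temporal_conv(frame_feats, w1), 0)    # ReLU
    density = np.maximum(temporal_conv(h, w2)[:, 0], 0)
    return density.sum()

rng = np.random.default_rng(0)
feats = rng.normal(size=(240, 8))       # 240 frames of offline features
w1 = rng.normal(size=(16, 8, 3)) * 0.1  # untrained weights: shapes only
w2 = rng.normal(size=(1, 16, 3)) * 0.1
print(count_repetitions(feats, w1, w2))  # an (untrained, meaningless) count
```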
arXiv Detail & Related papers (2023-05-23T07:45:56Z)
- ReFIT: Relevance Feedback from a Reranker during Inference [109.33278799999582]
Retrieve-and-rerank is a prevalent framework in neural information retrieval.
We propose to leverage the reranker to improve recall by making it provide relevance feedback to the retriever at inference time.
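A sketch of inference-time relevance feedback using a Rocchio-style query update as a stand-in for the paper's actual mechanism (which distills reranker scores into the query representation); `alpha` and `top` are illustrative.

```python
import numpy as np

def refit_query_update(query_vec, doc_vecs, reranker_scores, alpha=0.5, top=3):
    # Nudge the retriever's query vector toward the documents the
    # stronger reranker scored highest, then retrieve again.
    best = np.argsort(reranker_scores)[-top:]
    feedback = doc_vecs[best].mean(axis=0)
    return (1 - alpha) * query_vec + alpha * feedback

rng = np.random.default_rng(0)
docs = rng.normal(size=(100, 64))                    # document embeddings
q = rng.normal(size=64)                              # first-pass query
scores = docs @ q + rng.normal(scale=0.1, size=100)  # stand-in reranker
q2 = refit_query_update(q, docs, scores)
print(np.argsort(docs @ q2)[-10:])  # second pass with the updated query
```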
arXiv Detail & Related papers (2023-05-19T15:30:33Z)
- Counting Out Time: Class Agnostic Video Repetition Counting in the Wild [82.26003709476848]
We present an approach for estimating the period with which an action is repeated in a video.
The crux of the approach lies in constraining the period prediction module to use temporal self-similarity.
We train this model, called RepNet, with a synthetic dataset that is generated from a large unlabeled video collection.
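A minimal NumPy sketch of a temporal self-similarity matrix of the kind RepNet constrains its period predictor to use; the embeddings and the row-wise softmax normalization here are simplified stand-ins for the learned components.

```python
import numpy as np

def temporal_self_similarity(frame_embs):
    # Pairwise (negative squared-distance) similarity of per-frame
    # embeddings; repeated actions appear as periodic stripes that a
    # period predictor can read off. Rows are softmax-normalized.
    d2 = ((frame_embs[:, None, :] - frame_embs[None, :, :]) ** 2).sum(-1)
    s = -d2
    s = s - s.max(axis=1, keepdims=True)
    e = np.exp(s)
    return e / e.sum(axis=1, keepdims=True)

t = np.arange(64)
embs = np.stack([np.sin(2 * np.pi * t / 16),
                 np.cos(2 * np.pi * t / 16)], axis=1)
print(temporal_self_similarity(embs).shape)  # (64, 64), period-16 structure
```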
arXiv Detail & Related papers (2020-06-27T18:00:42Z)
- Energy-based Periodicity Mining with Deep Features for Action Repetition Counting in Unconstrained Videos [17.00863997561408]
Action repetition counting estimates how many times a repetitive motion occurs within an action.
We propose a new method superior to traditional approaches in two respects: it requires no preprocessing and it applies to actions with arbitrary periodicity.
arXiv Detail & Related papers (2020-03-15T14:21:18Z)
- Consistency of a Recurrent Language Model With Respect to Incomplete Decoding [67.54760086239514]
We study the issue of a recurrent language model producing infinite-length sequences under incomplete decoding.
We propose two remedies which address inconsistency: consistent variants of top-k and nucleus sampling, and a self-terminating recurrent language model.
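A toy reading of consistent top-k sampling: the candidate set always includes EOS, so termination has nonzero probability at every step. The renormalization details below are illustrative.

```python
import numpy as np

def consistent_top_k_sample(logits, eos_id, k=10, rng=None):
    # Keep the k most likely tokens but always add EOS to the candidate
    # set, so decoding can terminate with nonzero probability at every
    # step -- the "consistency" fix.
    rng = rng or np.random.default_rng()
    keep = set(np.argsort(logits)[-k:].tolist())
    keep.add(eos_id)
    idx = np.array(sorted(keep))
    p = np.exp(logits[idx] - logits[idx].max())
    p /= p.sum()
    return int(rng.choice(idx, p=p))

rng = np.random.default_rng(0)
logits = rng.normal(size=50)  # stand-in next-token logits
print(consistent_top_k_sample(logits, eos_id=0, k=5, rng=rng))
```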
arXiv Detail & Related papers (2020-02-06T19:56:15Z)