Online Action Representation using Change Detection and Symbolic Programming
- URL: http://arxiv.org/abs/2405.11511v1
- Date: Sun, 19 May 2024 10:31:59 GMT
- Title: Online Action Representation using Change Detection and Symbolic Programming
- Authors: Vishnu S Nair, Sneha Sree, Jayaraj Joseph, Mohanasankar Sivaprakasam
- Abstract summary: The proposed method employs a change detection algorithm to automatically segment action sequences.
We show the effectiveness of this representation in the downstream task of class-agnostic repetition detection.
The results of the experiments demonstrate that, despite operating online, the proposed method performs better or on par with the existing method.
- Score: 0.3937354192623676
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: This paper addresses the critical need for online action representation, which is essential for applications such as rehabilitation and surveillance. The task is defined as representing actions as soon as they happen in a streaming video, without access to future video frames. Most existing methods use predefined window sizes for video segments, which is a restrictive assumption on the dynamics. The proposed method employs a change detection algorithm to automatically segment action sequences, which form meaningful sub-actions, and subsequently fits symbolic generative motion programs to the clipped segments. We determine the start and end times of segments using change detection followed by a piece-wise linear fit algorithm on joint angle and bone length sequences. Domain-specific symbolic primitives are fit to pose keypoint trajectories of those extracted segments in order to obtain a higher-level semantic representation. Since this representation is part-based, it is complementary to the compositional nature of human actions, i.e., a complex activity can be broken down into elementary sub-actions. We show the effectiveness of this representation in the downstream task of class-agnostic repetition detection. We propose a repetition counting algorithm based on consecutive similarity matching of primitives, which can count repetitions online. We also compare the results with a similar but offline repetition counting algorithm. The results of the experiments demonstrate that, despite operating online, the proposed method performs better than or on par with the existing method.
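The segment-then-match pipeline described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the paper's implementation: it uses a greedy piece-wise linear fit on a single synthetic joint-angle sequence as the change detector, and the fitted slope of each segment as a stand-in for the paper's richer symbolic motion primitives. All function names and thresholds (`tol`, `min_len`, `slope_tol`) are invented for this sketch.

```python
import numpy as np

def fit_line(y):
    """Least-squares line fit; returns (slope, max absolute residual)."""
    x = np.arange(len(y))
    slope, intercept = np.polyfit(x, y, 1)
    return slope, float(np.max(np.abs(y - (slope * x + intercept))))

def online_segment(stream, tol=0.5, min_len=3):
    """Greedy online change detection: grow the current segment while a
    single line still fits the buffered samples within `tol`; when the
    fit breaks, emit the segment and start a new one at the change point."""
    segments, buf = [], []
    for sample in stream:
        buf.append(float(sample))
        if len(buf) < min_len:
            continue
        _, err = fit_line(np.array(buf))
        if err > tol:
            segments.append(np.array(buf[:-1]))  # close the segment before the change
            buf = [float(sample)]                # the new sample opens the next segment
    if len(buf) >= min_len:                      # flush the trailing segment
        segments.append(np.array(buf))
    return segments

def count_repetitions(segments, slope_tol=0.3):
    """Online-style counting: match each consecutive pair of segment
    primitives (here just the fitted slope) against the first pair."""
    slopes = [fit_line(s)[0] for s in segments]
    if len(slopes) < 2:
        return 0
    template = (slopes[0], slopes[1])
    reps = 0
    for i in range(0, len(slopes) - 1, 2):
        if (abs(slopes[i] - template[0]) < slope_tol
                and abs(slopes[i + 1] - template[1]) < slope_tol):
            reps += 1
    return reps
```

On a synthetic triangle-wave angle sequence with three up-down cycles, this yields six segments and a count of three; a real pose stream would involve many joint-angle and bone-length sequences and the paper's symbolic generative motion programs rather than raw slopes.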
Related papers
- Skim then Focus: Integrating Contextual and Fine-grained Views for Repetitive Action Counting [87.11995635760108]
Key to action counting is accurately locating each video's repetitive actions.
We propose a dual-branch network, i.e., SkimFocusNet, working in a two-step manner.
arXiv Detail & Related papers (2024-06-13T05:15:52Z)
- Efficient Action Counting with Dynamic Queries [31.833468477101604]
We introduce a novel approach that employs an action query representation to localize repeated action cycles with linear computational complexity.
Unlike static action queries, this approach dynamically embeds video features into action queries, offering a more flexible and generalizable representation.
Our method significantly outperforms previous works, particularly in terms of long video sequences, unseen actions, and actions at various speeds.
arXiv Detail & Related papers (2024-03-03T15:43:11Z)
- Activity Grammars for Temporal Action Segmentation [71.03141719666972]
Temporal action segmentation aims to translate an untrimmed activity video into a sequence of action segments.
This paper introduces an effective activity grammar to guide neural predictions for temporal action segmentation.
Experimental results demonstrate that our method significantly improves temporal action segmentation in terms of both performance and interpretability.
arXiv Detail & Related papers (2023-12-07T12:45:33Z)
- Temporal Segment Transformer for Action Segmentation [54.25103250496069]
We propose an attention-based approach, which we call the temporal segment transformer, for joint segment relation modeling and denoising.
The main idea is to denoise segment representations using attention between segment and frame representations, and also use inter-segment attention to capture temporal correlations between segments.
We show that this novel architecture achieves state-of-the-art accuracy on the popular 50Salads, GTEA and Breakfast benchmarks.
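The segment-to-frame attention idea can be illustrated with a small NumPy sketch. This is generic scaled dot-product cross-attention assumed for illustration, not code from the paper; the shapes and names are hypothetical.

```python
import numpy as np

def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def denoise_segments(segments, frames):
    """Scaled dot-product cross-attention: segment tokens act as queries,
    frame features as keys and values, so each segment representation is
    rewritten as a convex combination of frame features."""
    d = segments.shape[-1]
    attn = softmax(segments @ frames.T / np.sqrt(d))  # (n_segments, n_frames)
    return attn @ frames
```

Calling `denoise_segments(seg, seg)` gives plain self-attention among segment tokens, i.e. the inter-segment attention mentioned above; the actual architecture would add learned projections, multiple heads, and residual connections on top of this skeleton.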
arXiv Detail & Related papers (2023-02-25T13:05:57Z)
- A Closer Look at Temporal Ordering in the Segmentation of Instructional Videos [17.712793578388126]
We take a closer look at Procedure Segmentation and Summarization (PSS) and propose three fundamental improvements over current methods.
We propose a new segmentation metric based on dynamic programming that takes into account the order of segments.
We propose a matching algorithm that constrains the temporal order of segment mapping, and is also differentiable.
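An order-aware matching of predicted segments to ground-truth segments can be sketched as a classic dynamic program. The LCS-style alignment below is an illustrative guess at the general idea, not the paper's metric, and the normalization is a simplifying assumption.

```python
def ordered_match_score(pred, gt):
    """Longest order-preserving matching between two segment label
    sequences (LCS-style DP), normalized by the longer sequence, so
    out-of-order segments cannot all be matched at once."""
    n, m = len(pred), len(gt)
    dp = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            match = dp[i - 1][j - 1] + (1 if pred[i - 1] == gt[j - 1] else 0)
            dp[i][j] = max(dp[i - 1][j], dp[i][j - 1], match)
    return dp[n][m] / max(n, m) if max(n, m) else 1.0
```

A swapped pair such as `["pour", "stir", "bake"]` vs. `["pour", "bake", "stir"]` scores 2/3 here, whereas an order-blind matching would score 1. The paper's differentiable matching would replace this hard DP with a smooth relaxation.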
arXiv Detail & Related papers (2022-09-30T14:44:19Z)
- Action parsing using context features [0.0]
We argue that context information, particularly the temporal information about other actions in the video sequence, is valuable for action segmentation.
The proposed parsing algorithm temporally segments the video sequence into action segments.
arXiv Detail & Related papers (2022-05-20T07:54:04Z)
- SVIP: Sequence VerIfication for Procedures in Videos [68.07865790764237]
We propose a novel sequence verification task that aims to distinguish positive video pairs performing the same action sequence from negative ones with step-level transformations.
Such a challenging task resides in an open-set setting without prior action detection or segmentation.
We collect a scripted video dataset enumerating all kinds of step-level transformations in chemical experiments.
arXiv Detail & Related papers (2021-12-13T07:03:36Z)
- Few-Shot Action Recognition with Compromised Metric via Optimal Transport [31.834843714684343]
Few-shot action recognition is still not mature despite extensive research on few-shot image classification.
One main obstacle to applying these algorithms in action recognition is the complex structure of videos.
We propose Compromised Metric via Optimal Transport (CMOT) to combine the advantages of these two solutions.
arXiv Detail & Related papers (2021-04-08T12:42:05Z)
- Unsupervised Learning of Video Representations via Dense Trajectory Clustering [86.45054867170795]
This paper addresses the task of unsupervised learning of representations for action recognition in videos.
We first propose to adapt two top-performing objectives in this class: instance recognition and local aggregation.
We observe promising performance, but qualitative analysis shows that the learned representations fail to capture motion patterns.
arXiv Detail & Related papers (2020-06-28T22:23:03Z)
- Gabriella: An Online System for Real-Time Activity Detection in Untrimmed Security Videos [72.50607929306058]
We propose a real-time online system to perform activity detection on untrimmed security videos.
The proposed method consists of three stages: tubelet extraction, activity classification and online tubelet merging.
We demonstrate the effectiveness of the proposed approach in terms of speed (100 fps) and performance, achieving state-of-the-art results.
arXiv Detail & Related papers (2020-04-23T22:20:10Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this content (including all information) and is not responsible for any consequences.