Action parsing using context features
- URL: http://arxiv.org/abs/2205.10008v1
- Date: Fri, 20 May 2022 07:54:04 GMT
- Title: Action parsing using context features
- Authors: Nagita Mehrseresht
- Abstract summary: We argue that context information, particularly the temporal information about other actions in the video sequence, is valuable for action segmentation.
The proposed parsing algorithm temporally segments the video sequence into action segments.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We propose an action parsing algorithm to parse a video sequence containing
an unknown number of actions into its action segments. We argue that context
information, particularly the temporal information about other actions in the
video sequence, is valuable for action segmentation. The proposed parsing
algorithm temporally segments the video sequence into action segments. The
optimal temporal segmentation is found using a dynamic programming search
algorithm that optimizes the overall classification confidence score. The
classification score of each segment is determined using local features
calculated from that segment as well as context features calculated from other
candidate action segments of the sequence. Experimental results on the
Breakfast activity dataset showed improved segmentation accuracy compared to
existing state-of-the-art parsing techniques.
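The dynamic-programming search described in the abstract can be sketched as follows. This is a minimal illustration rather than the paper's implementation: it assumes the per-segment classification confidences (which in the paper combine local and context features) have already been computed and are supplied as a lookup table keyed by frame intervals, and the function name and toy scores are hypothetical.

```python
def best_segmentation(seg_score, n):
    """Find the temporal segmentation of n frames that maximizes the
    summed classification confidence, via dynamic programming.

    seg_score: dict mapping (s, t) -> confidence of treating frames
    [s, t) as a single action segment. In the paper these scores combine
    local and context features; here they are given directly.
    """
    best = [float("-inf")] * (n + 1)   # best[t]: max total score for frames [0, t)
    best[0] = 0.0
    back = [0] * (n + 1)               # back-pointer to the previous cut
    for t in range(1, n + 1):
        for s in range(t):
            sc = seg_score.get((s, t))
            if sc is not None and best[s] + sc > best[t]:
                best[t] = best[s] + sc
                back[t] = s
    # Walk the back-pointers to recover the segment boundaries.
    segments, t = [], n
    while t > 0:
        segments.append((back[t], t))
        t = back[t]
    return best[n], segments[::-1]
```

With toy scores such as `{(0, 2): 1.0, (2, 4): 1.0, (0, 4): 1.5}`, the search prefers the two-segment parse (total confidence 2.0) over the single four-frame segment (1.5), which is the kind of global trade-off an exhaustive greedy cut cannot make.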
Related papers
- Efficient Temporal Action Segmentation via Boundary-aware Query Voting [51.92693641176378]
BaFormer is a boundary-aware Transformer network that tokenizes each video segment as an instance token.
BaFormer significantly reduces the computational costs, utilizing only 6% of the running time.
arXiv Detail & Related papers (2024-05-25T00:44:13Z)
- Online Action Representation using Change Detection and Symbolic Programming [0.3937354192623676]
The proposed method employs a change detection algorithm to automatically segment action sequences.
We show the effectiveness of this representation in the downstream task of class repetition detection.
The results of the experiments demonstrate that, despite operating online, the proposed method performs better or on par with the existing method.
arXiv Detail & Related papers (2024-05-19T10:31:59Z)
- Activity Grammars for Temporal Action Segmentation [71.03141719666972]
Temporal action segmentation aims at translating an untrimmed activity video into a sequence of action segments.
This paper introduces an effective activity grammar to guide neural predictions for temporal action segmentation.
Experimental results demonstrate that our method significantly improves temporal action segmentation in terms of both performance and interpretability.
arXiv Detail & Related papers (2023-12-07T12:45:33Z)
- TAEC: Unsupervised Action Segmentation with Temporal-Aware Embedding and Clustering [27.52568444236988]
We propose an unsupervised approach for learning action classes from untrimmed video sequences.
In particular, we propose a temporal embedding network that combines relative time prediction, feature reconstruction, and sequence-to-sequence learning.
Based on the identified clusters, we decode the video into coherent temporal segments that correspond to semantically meaningful action classes.
arXiv Detail & Related papers (2023-03-09T10:46:23Z)
- Temporal Segment Transformer for Action Segmentation [54.25103250496069]
We propose an attention-based approach, which we call the temporal segment transformer, for joint segment relation modeling and denoising.
The main idea is to denoise segment representations using attention between segment and frame representations, and also use inter-segment attention to capture temporal correlations between segments.
We show that this novel architecture achieves state-of-the-art accuracy on the popular 50Salads, GTEA and Breakfast benchmarks.
arXiv Detail & Related papers (2023-02-25T13:05:57Z)
- A Closer Look at Temporal Ordering in the Segmentation of Instructional Videos [17.712793578388126]
We take a closer look at procedure segmentation and summarization (PSS) and propose three fundamental improvements over current methods.
We propose a new segmentation metric based on dynamic programming that takes into account the order of segments.
We propose a matching algorithm that constrains the temporal order of segment mapping, and is also differentiable.
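The order-constrained matching mentioned above can be illustrated by its discrete counterpart: a classic alignment-style dynamic program that matches predicted segments to reference segments without ever crossing their temporal order. Note that the paper's contribution is a differentiable formulation; this hard-assignment sketch, including the function name `monotone_match`, is only an illustration of the ordering constraint itself.

```python
def monotone_match(sim):
    """Order-preserving matching between predicted and reference segments.

    sim[i][j]: similarity of predicted segment i to reference segment j.
    A match (i, j) may be followed only by (i', j') with i' > i and
    j' > j, so the temporal order of segments is preserved. This is the
    hard, non-differentiable variant of such a matching.
    """
    n, m = len(sim), len(sim[0])
    # dp[i][j]: best total similarity using predictions i.. and references j..
    dp = [[0.0] * (m + 1) for _ in range(n + 1)]
    for i in range(n - 1, -1, -1):
        for j in range(m - 1, -1, -1):
            dp[i][j] = max(dp[i + 1][j],                    # skip prediction i
                           dp[i][j + 1],                    # skip reference j
                           sim[i][j] + dp[i + 1][j + 1])    # match i with j
    return dp[0][0]
```

For example, with two segments whose best similarities lie on the anti-diagonal, the dynamic program accepts only one of the two matches, since taking both would reverse the temporal order.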
arXiv Detail & Related papers (2022-09-30T14:44:19Z)
- Unsupervised Action Segmentation with Self-supervised Feature Learning and Co-occurrence Parsing [32.66011849112014]
Temporal action segmentation is the task of classifying each frame in a video with an action label.
In this work we explore a self-supervised method that operates on a corpus of unlabeled videos and predicts a likely set of temporal segments across the videos.
We develop CAP, a novel co-occurrence action parsing algorithm that can not only capture the correlation among sub-actions underlying the structure of activities, but also estimate the temporal trajectory of the sub-actions in an accurate and general way.
arXiv Detail & Related papers (2021-05-29T00:29:40Z)
- Target-Aware Object Discovery and Association for Unsupervised Video Multi-Object Segmentation [79.6596425920849]
This paper addresses the task of unsupervised video multi-object segmentation.
We introduce a novel approach for more accurate and efficient unseen-temporal segmentation.
We evaluate the proposed approach on DAVIS17 and YouTube-VIS, and the results demonstrate that it outperforms state-of-the-art methods in both segmentation accuracy and inference speed.
arXiv Detail & Related papers (2021-04-10T14:39:44Z)
- Temporally-Weighted Hierarchical Clustering for Unsupervised Action Segmentation [96.67525775629444]
Action segmentation refers to inferring boundaries of semantically consistent visual concepts in videos.
We present a fully automatic and unsupervised approach for segmenting actions in a video that does not require any training.
Our proposal is an effective temporally-weighted hierarchical clustering algorithm that can group semantically consistent frames of the video.
arXiv Detail & Related papers (2021-03-20T23:30:01Z)
- Motion-supervised Co-Part Segmentation [88.40393225577088]
We propose a self-supervised deep learning method for co-part segmentation.
Our approach develops the idea that motion information inferred from videos can be leveraged to discover meaningful object parts.
arXiv Detail & Related papers (2020-04-07T09:56:45Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.