Temporal Action Segmentation from Timestamp Supervision
- URL: http://arxiv.org/abs/2103.06669v2
- Date: Mon, 15 Mar 2021 09:50:40 GMT
- Title: Temporal Action Segmentation from Timestamp Supervision
- Authors: Zhe Li, Yazan Abu Farha, Juergen Gall
- Abstract summary: We introduce timestamp supervision for the temporal action segmentation task.
Timestamps require a comparable annotation effort to weakly supervised approaches.
Our approach uses the model output and the annotated timestamps to generate frame-wise labels.
- Score: 25.49797678477498
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Temporal action segmentation approaches have been very successful recently.
However, annotating videos with frame-wise labels to train such models is very
expensive and time consuming. While weakly supervised methods trained using
only ordered action lists require much less annotation effort, the performance
is still much worse than fully supervised approaches. In this paper, we
introduce timestamp supervision for the temporal action segmentation task.
Timestamps require a comparable annotation effort to weakly supervised
approaches, yet provide a stronger supervisory signal. To demonstrate the
effectiveness of timestamp supervision, we propose an approach to train a
segmentation model using only timestamp annotations. Our approach uses the
model output and the annotated timestamps to generate frame-wise labels by
detecting the action changes. We further introduce a confidence loss that
forces the predicted probabilities to monotonically decrease as the distance to
the timestamps increases. This ensures that all frames of an action, not only
the most distinctive ones, are learned during training. The evaluation on
four datasets shows that models trained with timestamp annotations achieve
comparable performance to the fully supervised approaches.
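The pseudo-labeling step described in the abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' exact algorithm: it assumes per-frame class probabilities from the segmentation model and places the action change between two consecutive timestamps at the boundary that best agrees with those probabilities. The function and variable names (`generate_frame_labels`, `probs`, `timestamps`) are hypothetical.

```python
import numpy as np

def generate_frame_labels(probs, timestamps, classes):
    """Pseudo-label every frame from sparse timestamp annotations.

    probs      : (T, C) array of per-frame class probabilities from the model.
    timestamps : sorted frame indices, one per action instance.
    classes    : action class annotated at each timestamp.
    """
    T = probs.shape[0]
    labels = np.empty(T, dtype=int)
    prev = 0  # first frame not yet labeled
    for i in range(len(timestamps) - 1):
        t1, t2 = timestamps[i], timestamps[i + 1]
        c1, c2 = classes[i], classes[i + 1]
        # Place the action change at the boundary b that maximizes the total
        # probability of c1 before b and of c2 from b onwards.
        best_b, best_score = t1 + 1, -np.inf
        for b in range(t1 + 1, t2 + 1):
            score = probs[t1:b, c1].sum() + probs[b:t2 + 1, c2].sum()
            if score > best_score:
                best_b, best_score = b, score
        labels[prev:best_b] = c1
        prev = best_b
    labels[prev:] = classes[-1]  # frames after the last detected boundary
    return labels
```

With a toy two-class video whose probabilities switch halfway, the detected boundary falls at the switch and all frames receive a label, even though only two frames were annotated.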
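The confidence loss can likewise be sketched with a simple hinge on consecutive frames. This is a hedged approximation of the idea, not the paper's exact formulation: any increase in the annotated class probability as we move away from the timestamp is penalized, which pushes the predicted confidence to decrease monotonically with distance.

```python
import numpy as np

def confidence_loss(probs, timestamp, action_class):
    """Penalize increases in the annotated class probability as the distance
    to the timestamp grows, encouraging monotonically decreasing confidence.

    probs        : (T, C) array of per-frame class probabilities.
    timestamp    : annotated frame index of the action instance.
    action_class : class label annotated at that timestamp.
    """
    p = probs[:, action_class]
    loss = 0.0
    # Moving forward in time from the timestamp, probability must not increase.
    for t in range(timestamp, len(p) - 1):
        loss += max(0.0, p[t + 1] - p[t])
    # Moving backward in time, the same monotonicity constraint applies.
    for t in range(timestamp, 0, -1):
        loss += max(0.0, p[t - 1] - p[t])
    return loss
```

A prediction that peaks at the timestamp and decays on both sides incurs zero loss; any bump in confidence away from the timestamp contributes its height to the loss.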
Related papers
- Distill and Collect for Semi-Supervised Temporal Action Segmentation [0.0]
We propose an approach for the temporal action segmentation task that can simultaneously leverage knowledge from annotated and unannotated video sequences.
Our approach uses multi-stream distillation that repeatedly refines and finally combines the frame predictions of the individual streams.
Our model also predicts the action order, which is later used as a temporal constraint while estimating frame labels to counter the lack of supervision for unannotated videos.
arXiv Detail & Related papers (2022-11-02T17:34:04Z) - Robust Action Segmentation from Timestamp Supervision [18.671808549019833]
Action segmentation is the task of predicting an action label for each frame of an untrimmed video.
Timestamp supervision is a promising type of weak supervision as obtaining one timestamp per action is less expensive than annotating all frames.
We show that our approach is more robust to missing annotations compared to other approaches and various baselines.
arXiv Detail & Related papers (2022-10-12T18:01:14Z) - A Generalized & Robust Framework For Timestamp Supervision in Temporal Action Segmentation [79.436224998992]
In temporal action segmentation, timestamp supervision requires only a handful of labelled frames per video sequence.
We propose a novel Expectation-Maximization based approach that leverages the label uncertainty of unlabelled frames.
Our proposed method produces SOTA results and even exceeds the fully-supervised setup in several metrics and datasets.
arXiv Detail & Related papers (2022-07-20T18:30:48Z) - Zero-Shot Temporal Action Detection via Vision-Language Prompting [134.26292288193298]
We propose a novel zero-Shot Temporal Action detection model via Vision-LanguagE prompting (STALE).
Our model significantly outperforms state-of-the-art alternatives.
Our model also yields superior results on supervised TAD over recent strong competitors.
arXiv Detail & Related papers (2022-07-17T13:59:46Z) - Turning to a Teacher for Timestamp Supervised Temporal Action Segmentation [27.735478880660164]
We propose a new framework for timestamp supervised temporal action segmentation.
We introduce a teacher model parallel to the segmentation model to help stabilize the process of model optimization.
Our method outperforms the state-of-the-art method and performs comparably against the fully-supervised methods at a much lower annotation cost.
arXiv Detail & Related papers (2022-07-02T02:00:55Z) - Video Moment Retrieval from Text Queries via Single Frame Annotation [65.92224946075693]
Video moment retrieval aims at finding the start and end timestamps of a moment described by a given natural language query.
Fully supervised methods need complete temporal boundary annotations to achieve promising results.
We propose a new paradigm called "glance annotation".
arXiv Detail & Related papers (2022-04-20T11:59:17Z) - Weakly Supervised Video Salient Object Detection [79.51227350937721]
We present the first weakly supervised video salient object detection model based on relabeled "fixation guided scribble annotations".
An "Appearance-motion fusion module" and bidirectional ConvLSTM based framework are proposed to achieve effective multi-modal learning and long-term temporal context modeling.
arXiv Detail & Related papers (2021-04-06T09:48:38Z) - A Closer Look at Temporal Sentence Grounding in Videos: Datasets and Metrics [70.45937234489044]
We re-organize two widely-used TSGV datasets (Charades-STA and ActivityNet Captions) so that the test split differs from the training split.
We introduce a new evaluation metric "dR@$n$,IoU@$m$" to calibrate the basic IoU scores.
All the results demonstrate that the re-organized datasets and new metric can better monitor the progress in TSGV.
arXiv Detail & Related papers (2021-01-22T09:59:30Z) - Weakly Supervised Temporal Action Localization with Segment-Level Labels [140.68096218667162]
Temporal action localization presents a trade-off between test performance and annotation-time cost.
We introduce a new segment-level supervision setting: segments are labeled when annotators observe an action happening within them.
We devise a partial segment loss, regarded as a form of loss sampling, to learn integral action parts from labeled segments.
arXiv Detail & Related papers (2020-07-03T10:32:19Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.