Turning to a Teacher for Timestamp Supervised Temporal Action
Segmentation
- URL: http://arxiv.org/abs/2207.00712v1
- Date: Sat, 2 Jul 2022 02:00:55 GMT
- Title: Turning to a Teacher for Timestamp Supervised Temporal Action
Segmentation
- Authors: Yang Zhao and Yan Song
- Abstract summary: We propose a new framework for timestamp supervised temporal action segmentation.
We introduce a teacher model parallel to the segmentation model to help stabilize the process of model optimization.
Our method outperforms the state-of-the-art method and performs comparably to fully-supervised methods at a much lower annotation cost.
- Score: 27.735478880660164
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Temporal action segmentation in videos has drawn much attention recently.
Timestamp supervision is a cost-effective form of annotation for this task. To obtain
more information for optimizing the model, the existing method iteratively generates
pseudo frame-wise labels based on the output of a segmentation model and the
timestamp annotations. However, this practice may introduce noise and oscillation
during training and lead to performance degradation. To address this problem, we
propose a new framework for timestamp-supervised temporal action segmentation that
introduces a teacher model parallel to the segmentation model to help stabilize
model optimization. The teacher model can be seen as an ensemble of the segmentation
model, which helps to suppress noise and improve the stability of the pseudo labels.
We further introduce a segmentally smoothing loss, which is more focused and
cohesive, to enforce smooth transitions of the predicted probabilities within action
instances. Experiments on three datasets show that our method outperforms the
state-of-the-art method and performs comparably to fully-supervised methods at a
much lower annotation cost.
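The abstract does not give implementation details, but a teacher that acts as "an ensemble of the segmentation model" is commonly realized as an exponential moving average (EMA) of the student's weights, and a smoothing term over per-frame probabilities is typically a truncated penalty on adjacent-frame differences. The PyTorch-style sketch below illustrates one such reading; the function names, the EMA decay, the clamping threshold, and the restriction of the penalty to frames within the same pseudo-segment are assumptions for illustration, not the paper's exact formulation.

```python
import torch

@torch.no_grad()
def update_teacher(teacher, student, ema_decay=0.999):
    """EMA update: the teacher tracks an exponential moving average of the
    student (segmentation model) weights, one common way to realize an
    'ensemble of the segmentation model'. The decay value is an assumption."""
    for t_param, s_param in zip(teacher.parameters(), student.parameters()):
        t_param.mul_(ema_decay).add_(s_param, alpha=1.0 - ema_decay)

def segmentally_smoothing_loss(log_probs, segment_ids, clamp=4.0):
    """Penalize large frame-to-frame changes of the predicted log-probabilities,
    but only where both frames fall inside the same pseudo action segment
    (a hypothetical reading of 'segmentally smoothing').

    log_probs:   (T, C) frame-wise log class probabilities from the student
    segment_ids: (T,)   integer id of the pseudo segment each frame belongs to
    """
    diff = log_probs[1:] - log_probs[:-1].detach()        # (T-1, C)
    same_segment = segment_ids[1:] == segment_ids[:-1]    # (T-1,) bool mask
    sq = torch.clamp(diff ** 2, max=clamp ** 2)           # truncated MSE
    masked = sq * same_segment.unsqueeze(-1).float()
    denom = same_segment.float().sum().clamp(min=1.0) * log_probs.size(1)
    return masked.sum() / denom
```

Under this reading, pseudo frame-wise labels would be generated from the teacher's more stable predictions together with the annotated timestamps, while the student is trained on those labels plus the smoothing term.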
Related papers
- Efficient Temporal Action Segmentation via Boundary-aware Query Voting [51.92693641176378] (arXiv 2024-05-25)
BaFormer is a boundary-aware Transformer network that tokenizes each video segment as an instance token.
BaFormer significantly reduces computational cost, using only 6% of the running time.
- STAT: Towards Generalizable Temporal Action Localization [56.634561073746056] (arXiv 2024-04-20)
Weakly-supervised temporal action localization (WTAL) aims to recognize and localize action instances with only video-level labels.
Existing methods suffer severe performance degradation when transferred to different distributions.
We propose GTAL, which focuses on improving the generalizability of action localization methods.
- Timestamp-supervised Wearable-based Activity Segmentation and Recognition with Contrastive Learning and Order-Preserving Optimal Transport [11.837401473598288] (arXiv 2023-10-13)
We propose a novel method for joint activity segmentation and recognition with timestamp supervision.
Prototypes are estimated from class-activation maps to form a sample-to-prototype contrast module.
Comprehensive experiments on four public HAR datasets demonstrate that our model trained with timestamp supervision is superior to state-of-the-art weakly-supervised methods.
- Diffusion Action Segmentation [63.061058214427085] (arXiv 2023-03-31)
We propose a novel framework via denoising diffusion models, which shares the same inherent spirit of iterative refinement.
In this framework, action predictions are iteratively generated from random noise, with input video features as conditions.
- FineDiving: A Fine-grained Dataset for Procedure-aware Action Quality Assessment [93.09267863425492] (arXiv 2022-04-07)
We argue that understanding both the high-level semantics and the internal temporal structure of actions in competitive sports videos is the key to accurate and interpretable predictions.
We construct a new fine-grained dataset, called FineDiving, built on diverse diving events with detailed annotations of action procedures.
- Dense Unsupervised Learning for Video Segmentation [49.46930315961636] (arXiv 2021-11-11)
We present a novel approach to unsupervised learning for video object segmentation (VOS).
Unlike previous work, our formulation allows learning dense feature representations directly in a fully convolutional regime.
Our approach exceeds the segmentation accuracy of previous work despite using significantly less training data and compute.
- A Positive/Unlabeled Approach for the Segmentation of Medical Sequences using Point-Wise Supervision [3.883460584034766] (arXiv 2021-07-18)
We propose a new method to efficiently segment medical imaging volumes or videos using only point-wise annotations.
Our approach trains a deep learning model with an appropriate Positive/Unlabeled objective function on point-wise annotations.
We show experimentally that our approach outperforms state-of-the-art methods tailored to the same problem.
- Temporal Action Segmentation from Timestamp Supervision [25.49797678477498] (arXiv 2021-03-11)
We introduce timestamp supervision for the temporal action segmentation task.
Timestamps require an annotation effort comparable to that of weakly supervised approaches.
Our approach uses the model output and the annotated timestamps to generate frame-wise labels.
- Weakly Supervised Temporal Action Localization with Segment-Level Labels [140.68096218667162] (arXiv 2020-07-03)
Temporal action localization presents a trade-off between test performance and annotation-time cost.
We introduce a new segment-level supervision setting: segments are labeled when annotators observe an action happening within them.
We devise a partial segment loss, which can be regarded as a form of loss sampling, to learn integral action parts from labeled segments.
This list is automatically generated from the titles and abstracts of the papers on this site.