Dense Unsupervised Learning for Video Segmentation
- URL: http://arxiv.org/abs/2111.06265v1
- Date: Thu, 11 Nov 2021 15:15:11 GMT
- Title: Dense Unsupervised Learning for Video Segmentation
- Authors: Nikita Araslanov, Simone Schaub-Meyer and Stefan Roth
- Abstract summary: We present a novel approach to unsupervised learning for video object segmentation (VOS).
Unlike previous work, our formulation allows dense feature representations to be learned directly in a fully convolutional regime.
Our approach exceeds the segmentation accuracy of previous work despite using significantly less training data and compute power.
- Score: 49.46930315961636
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We present a novel approach to unsupervised learning for video object
segmentation (VOS). Unlike previous work, our formulation allows dense feature
representations to be learned directly in a fully convolutional regime. We rely on
uniform grid sampling to extract a set of anchors and train our model to
disambiguate between them on both inter- and intra-video levels. However, a
naive scheme to train such a model results in a degenerate solution. We propose
to prevent this with a simple regularisation scheme, accommodating the
equivariance property of the segmentation task to similarity transformations.
Our training objective admits efficient implementation and exhibits fast
training convergence. On established VOS benchmarks, our approach exceeds the
segmentation accuracy of previous work despite using significantly less
training data and compute power.
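The abstract's two main ingredients, uniform grid sampling of anchors and a consistency check under a spatial transformation, can be illustrated with a small NumPy sketch. This is not the authors' implementation: the function names (`sample_grid_anchors`, `soft_assign`, `flip_consistency`), the cosine-similarity softmax, the temperature value, and the use of a horizontal flip as a stand-in transformation are all assumptions made for illustration only.

```python
import numpy as np


def sample_grid_anchors(features, stride):
    """Pick anchor embeddings on a uniform grid over a (C, H, W) feature map.

    Hypothetical helper: returns the anchors as a (K, C) array together
    with their (row, col) coordinates as a (K, 2) array.
    """
    _, h, w = features.shape
    ys = np.arange(stride // 2, h, stride)
    xs = np.arange(stride // 2, w, stride)
    grid_y, grid_x = np.meshgrid(ys, xs, indexing="ij")
    coords = np.stack([grid_y.ravel(), grid_x.ravel()], axis=1)
    anchors = features[:, coords[:, 0], coords[:, 1]].T
    return anchors, coords


def soft_assign(features, anchors, temperature=0.1):
    """Softmax over cosine similarities between every pixel and every anchor.

    Returns an (H, W, K) array whose last axis sums to 1.
    """
    c, h, w = features.shape
    pixels = features.reshape(c, -1).T
    pixels = pixels / (np.linalg.norm(pixels, axis=1, keepdims=True) + 1e-8)
    anchors = anchors / (np.linalg.norm(anchors, axis=1, keepdims=True) + 1e-8)
    logits = pixels @ anchors.T / temperature
    logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    probs = np.exp(logits)
    probs = probs / probs.sum(axis=1, keepdims=True)
    return probs.reshape(h, w, -1)


def flip_consistency(probs_of_flipped, probs):
    """Mean absolute gap between the assignments of a flipped input and
    the flipped assignments of the original input (0 means equivariant)."""
    return float(np.abs(probs_of_flipped - probs[:, ::-1]).mean())


# Toy usage with random features standing in for a network's output.
rng = np.random.default_rng(0)
feats = rng.normal(size=(8, 16, 16))
anchors, coords = sample_grid_anchors(feats, stride=4)
probs = soft_assign(feats, anchors)
probs_flipped = soft_assign(feats[:, :, ::-1], anchors)
gap = flip_consistency(probs_flipped, probs)
```

In this toy setting the gap is zero up to floating-point error, because the assignment is computed per pixel from fixed features. With a learned network applied to a transformed frame, such a gap would generally be non-zero and could serve as a penalty term.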
Related papers
- Unsupervised Representation Learning by Balanced Self Attention Matching [2.3020018305241337]
We present a self-supervised method for embedding image features called BAM.
We obtain rich representations and avoid feature collapse by minimizing a loss that matches these distributions to their globally balanced and entropy regularized version.
We show competitive performance with leading methods on both semi-supervised and transfer-learning benchmarks.
arXiv Detail & Related papers (2024-08-04T12:52:44Z)
- Self-Supervised Dual Contouring [30.9409064656302]
We propose a self-supervised training scheme for the Neural Dual Contouring meshing framework.
We use two novel self-supervised loss functions that encourage consistency between distances to the generated mesh.
We demonstrate that our self-supervised losses improve meshing performance in the single-view reconstruction task.
arXiv Detail & Related papers (2024-05-28T12:44:28Z)
- Temporally Consistent Unbalanced Optimal Transport for Unsupervised Action Segmentation [31.622109513774635]
We propose a novel approach to the action segmentation task for long, untrimmed videos.
By encoding a temporal consistency prior into a Gromov-Wasserstein problem, we are able to decode a temporally consistent segmentation.
Our method does not require knowing the action order for a video to attain temporal consistency.
arXiv Detail & Related papers (2024-04-01T22:53:47Z)
- Unsupervised Video Summarization via Iterative Training and Simplified GAN [12.32122301626006]
This paper introduces a new, unsupervised method for automatic video summarization using ideas from generative adversarial networks.
An iterative training strategy is also applied by alternately training the reconstructor and the frame selector for multiple iterations.
arXiv Detail & Related papers (2023-11-07T06:01:56Z)
- Transform-Equivariant Consistency Learning for Temporal Sentence Grounding [66.10949751429781]
We introduce a novel Equivariant Consistency Regulation Learning framework to learn more discriminative representations for each video.
Our motivation is that the temporal boundary of the query-guided activity should be predicted consistently.
In particular, we devise a self-supervised consistency loss module to enhance the completeness and smoothness of the augmented video.
arXiv Detail & Related papers (2023-05-06T19:29:28Z)
- Parameter Decoupling Strategy for Semi-supervised 3D Left Atrium Segmentation [0.0]
We present a novel semi-supervised segmentation model based on a parameter decoupling strategy to encourage consistent predictions from diverse views.
Our method achieves a competitive result against state-of-the-art semi-supervised methods on the Atrial Challenge dataset.
arXiv Detail & Related papers (2021-09-20T14:51:42Z)
- Self-supervised Augmentation Consistency for Adapting Semantic Segmentation [56.91850268635183]
We propose an approach to domain adaptation for semantic segmentation that is both practical and highly accurate.
We employ standard data augmentation techniques (photometric noise, flipping and scaling) and ensure consistency of the semantic predictions.
We achieve significant improvements of the state-of-the-art segmentation accuracy after adaptation, consistent both across different choices of the backbone architecture and adaptation scenarios.
arXiv Detail & Related papers (2021-04-30T21:32:40Z)
- Unsupervised Learning of Video Representations via Dense Trajectory Clustering [86.45054867170795]
This paper addresses the task of unsupervised learning of representations for action recognition in videos.
We first propose to adapt two top performing objectives in this class - instance recognition and local aggregation.
We observe promising performance, but qualitative analysis shows that the learned representations fail to capture motion patterns.
arXiv Detail & Related papers (2020-06-28T22:23:03Z)
- Self-supervised Video Object Segmentation [76.83567326586162]
The objective of this paper is self-supervised representation learning, with the goal of solving semi-supervised video object segmentation (a.k.a. dense tracking).
We make the following contributions: (i) we propose to improve the existing self-supervised approach, with a simple, yet more effective memory mechanism for long-term correspondence matching; (ii) by augmenting the self-supervised approach with an online adaptation module, our method successfully alleviates tracker drifts caused by spatial-temporal discontinuity; (iv) we demonstrate state-of-the-art results among the self-supervised approaches on DAVIS-2017 and YouTube
arXiv Detail & Related papers (2020-06-22T17:55:59Z)
- Learning Diverse Representations for Fast Adaptation to Distribution Shift [78.83747601814669]
We present a method for learning multiple models, incorporating an objective that pressures each to learn a distinct way to solve the task.
We demonstrate our framework's ability to facilitate rapid adaptation to distribution shift.
arXiv Detail & Related papers (2020-06-12T12:23:50Z)
This list is automatically generated from the titles and abstracts of the papers on this site.