Focalized Contrastive View-invariant Learning for Self-supervised
Skeleton-based Action Recognition
- URL: http://arxiv.org/abs/2304.00858v1
- Date: Mon, 3 Apr 2023 10:12:30 GMT
- Title: Focalized Contrastive View-invariant Learning for Self-supervised
Skeleton-based Action Recognition
- Authors: Qianhui Men, Edmond S. L. Ho, Hubert P. H. Shum, Howard Leung
- Abstract summary: We propose a self-supervised framework called Focalized Contrastive View-invariant Learning (FoCoViL).
FoCoViL significantly suppresses the view-specific information on the representation space where the viewpoints are coarsely aligned.
It associates actions with common view-invariant properties and simultaneously separates the dissimilar ones.
- Score: 16.412306012741354
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Learning view-invariant representation is a key to improving feature
discrimination power for skeleton-based action recognition. Existing approaches
cannot effectively remove the impact of viewpoint because their representations
are implicitly view-dependent. In this work, we propose a self-supervised
framework called Focalized Contrastive View-invariant Learning (FoCoViL), which
significantly suppresses the view-specific information on the representation
space where the viewpoints are coarsely aligned. By maximizing mutual
information with an effective contrastive loss between multi-view sample pairs,
FoCoViL associates actions with common view-invariant properties and
simultaneously separates the dissimilar ones. We further propose an adaptive
focalization method based on pairwise similarity to enhance contrastive
learning for a clearer cluster boundary in the learned space. Different from
many existing self-supervised representation learning works that rely heavily on
supervised classifiers, FoCoViL achieves superior recognition performance with
both unsupervised and supervised classifiers. Extensive
experiments also show that the proposed contrastive-based focalization
generates a more discriminative latent representation.
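The abstract describes two ingredients: a contrastive loss that maximizes mutual information between multi-view pairs of the same action, and an adaptive focalization that reweights pairs by similarity to sharpen cluster boundaries. A minimal NumPy sketch of that combination is shown below — an InfoNCE-style cross-view loss with a focal weight on hard pairs. The function name, the `(1 - p)^gamma` weighting, and all hyperparameters are illustrative assumptions, not the authors' exact formulation.

```python
import numpy as np

def focalized_contrastive_loss(z1, z2, temperature=0.5, gamma=2.0):
    """Illustrative multi-view contrastive loss with similarity-based
    focal weighting (a sketch, not the paper's implementation).

    z1, z2: (N, D) embeddings of the same N actions captured from two
    different viewpoints; row i of z1 and row i of z2 are a positive pair.
    """
    # L2-normalize so dot products become cosine similarities.
    z1 = z1 / np.linalg.norm(z1, axis=1, keepdims=True)
    z2 = z2 / np.linalg.norm(z2, axis=1, keepdims=True)

    sim = z1 @ z2.T / temperature                   # (N, N) cross-view similarities
    logits = sim - sim.max(axis=1, keepdims=True)   # subtract row max for stability
    exp = np.exp(logits)
    prob_pos = np.diag(exp) / exp.sum(axis=1)       # softmax prob. of the positive pair

    # Focalization sketch: down-weight pairs that are already well aligned
    # (high positive probability) so the loss concentrates on hard pairs.
    weight = (1.0 - prob_pos) ** gamma
    return -(weight * np.log(prob_pos)).mean()
```

When the two views embed identically, the positive probabilities are high, so both the log term and the focal weight shrink; mismatched views keep a large loss. This is the mechanism by which focalization can tighten cluster boundaries without changing the underlying InfoNCE objective.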
Related papers
- Discriminative Anchor Learning for Efficient Multi-view Clustering [59.11406089896875]
We propose discriminative anchor learning for multi-view clustering (DALMC).
We learn discriminative view-specific feature representations according to the original dataset.
We build anchors from different views based on these representations, which increase the quality of the shared anchor graph.
arXiv Detail & Related papers (2024-09-25T13:11:17Z)
- An Information Compensation Framework for Zero-Shot Skeleton-based Action Recognition [49.45660055499103]
Zero-shot human skeleton-based action recognition aims to construct a model that can recognize actions outside the categories seen during training.
Previous research has focused on aligning sequences' visual and semantic spatial distributions.
We introduce a new loss function sampling method to obtain a tight and robust representation.
arXiv Detail & Related papers (2024-06-02T06:53:01Z)
- Constrained Multiview Representation for Self-supervised Contrastive Learning [4.817827522417457]
We introduce a novel approach predicated on representation distance-based mutual information (MI) for measuring the significance of different views.
We harness multi-view representations extracted from the frequency domain, re-evaluating their significance based on mutual information.
arXiv Detail & Related papers (2024-02-05T19:09:33Z)
- A Probabilistic Model Behind Self-Supervised Learning [53.64989127914936]
In self-supervised learning (SSL), representations are learned via an auxiliary task without annotated labels.
We present a generative latent variable model for self-supervised learning.
We show that several families of discriminative SSL, including contrastive methods, induce a comparable distribution over representations.
arXiv Detail & Related papers (2024-02-02T13:31:17Z)
- Cluster-aware Contrastive Learning for Unsupervised Out-of-distribution Detection [0.0]
Unsupervised out-of-distribution (OOD) Detection aims to separate the samples falling outside the distribution of training data without label information.
We propose Cluster-aware Contrastive Learning (CCL) framework for unsupervised OOD detection, which considers both instance-level and semantic-level information.
arXiv Detail & Related papers (2023-02-06T07:21:03Z)
- Semantics-Consistent Feature Search for Self-Supervised Visual Representation Learning [15.242064747740116]
It is unavoidable to construct undesirable views containing different semantic concepts during the augmentation procedure.
It would damage the semantic consistency of representation to pull these augmentations closer in the feature space indiscriminately.
In this study, we introduce feature-level augmentation and propose a novel semantics-consistent feature search (SCFS) method to mitigate this negative effect.
arXiv Detail & Related papers (2022-12-13T11:13:59Z)
- Matching Multiple Perspectives for Efficient Representation Learning [0.0]
We present an approach that combines self-supervised learning with a multi-perspective matching technique.
We show that the availability of multiple views of the same object combined with a variety of self-supervised pretraining algorithms can lead to improved object classification performance.
arXiv Detail & Related papers (2022-08-16T10:33:13Z)
- Visual Perturbation-aware Collaborative Learning for Overcoming the Language Prior Problem [60.0878532426877]
We propose a novel collaborative learning scheme from the viewpoint of visual perturbation calibration.
Specifically, we devise a visual controller to construct two sorts of curated images with different perturbation extents.
The experimental results on two diagnostic VQA-CP benchmark datasets evidently demonstrate its effectiveness.
arXiv Detail & Related papers (2022-07-24T23:50:52Z)
- Deep Clustering by Semantic Contrastive Learning [67.28140787010447]
We introduce a novel variant called Semantic Contrastive Learning (SCL).
It explores the characteristics of both conventional contrastive learning and deep clustering.
It can amplify the strengths of contrastive learning and deep clustering in a unified approach.
arXiv Detail & Related papers (2021-03-03T20:20:48Z)
- Heterogeneous Contrastive Learning: Encoding Spatial Information for Compact Visual Representations [183.03278932562438]
This paper presents an effective approach that adds spatial information to the encoding stage to alleviate the learning inconsistency between the contrastive objective and strong data augmentation operations.
We show that our approach achieves higher efficiency in visual representations and thus delivers a key message to inspire the future research of self-supervised visual representation learning.
arXiv Detail & Related papers (2020-11-19T16:26:25Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information it contains and is not responsible for any consequences of its use.