Boosting Semi-Supervised 2D Human Pose Estimation by Revisiting Data
Augmentation and Consistency Training
- URL: http://arxiv.org/abs/2402.11566v2
- Date: Fri, 8 Mar 2024 02:46:23 GMT
- Authors: Huayi Zhou, Mukun Luo, Fei Jiang, Yue Ding, Hongtao Lu
- Abstract summary: We find that semi-supervised human pose estimation (SSHPE) can be improved along two core aspects: advanced data augmentations and concise consistency training.
We propose to repeatedly augment unlabeled images with diverse hard augmentations, and generate multi-path predictions sequentially.
Compared to SOTA approaches, our method brings substantial improvements on public datasets.
- Score: 25.02026393037821
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: 2D human pose estimation (HPE) is a fundamental vision problem. However,
supervised learning for it requires massive keypoint labels, which are
labor-intensive to collect. Thus, we aim to boost a pose estimator by exploiting
extra unlabeled data with semi-supervised learning (SSL). Most previous
semi-supervised HPE (SSHPE) methods are consistency-based and strive to maintain
consistent outputs for differently augmented inputs. Within this family, we find
that SSHPE can be improved along two core aspects: advanced data augmentations
and concise consistency training.
Specifically, for the first core, we discover the synergistic effects of
existing augmentations, and reveal novel paradigms for conveniently producing
new superior HPE-oriented augmentations which can more effectively add noise on
unlabeled samples. We can therefore establish paired easy-hard augmentations
with larger difficulty gaps. For the second core, we propose to repeatedly
augment unlabeled images with diverse hard augmentations, and generate
multi-path predictions sequentially for optimizing multi-losses in a single
network. This simple and compact design is interpretable, and easily benefits
from newly found augmentations. Compared with SOTA approaches, our method brings
substantial improvements on public datasets. Code is available at
https://github.com/hnuzhy/MultiAugs
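The two ideas in the abstract (composing existing augmentations into harder ones, and optimizing several hard-augmentation paths against one easy-augmentation target in a single network) can be illustrated with a minimal sketch. This is not the authors' implementation: the names `easy_aug`, `mask_aug`, `noise_aug`, `compose`, `toy_model`, and `multi_path_consistency_loss` are hypothetical, images and heatmaps are flat lists of floats, and all augmentations are assumed to be non-geometric so that predictions can be compared without re-aligning them.

```python
import random


def mse(a, b):
    """Mean squared error between two equally sized flat heatmaps."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def easy_aug(image, rng):
    """Hypothetical easy augmentation: very small additive jitter."""
    return [p + rng.uniform(-0.01, 0.01) for p in image]


def mask_aug(image, rng, ratio=0.3):
    """Hypothetical hard augmentation: zero out one contiguous run of pixels."""
    n = len(image)
    span = max(1, int(n * ratio))
    start = rng.randrange(0, n - span + 1)
    return [0.0 if start <= i < start + span else p for i, p in enumerate(image)]


def noise_aug(image, rng, scale=0.2):
    """Hypothetical hard augmentation: strong Gaussian noise."""
    return [p + rng.gauss(0.0, scale) for p in image]


def compose(*augs):
    """Chain existing augmentations sequentially to obtain a harder one."""
    def composed(image, rng):
        for aug in augs:
            image = aug(image, rng)
        return image
    return composed


def toy_model(image):
    """Placeholder for a pose network that maps an image to a heatmap."""
    return [0.5 * p for p in image]


def multi_path_consistency_loss(model, image, easy, hard_augs, rng):
    """Compare each hard-augmented prediction with the easy-augmented one.

    The easy-path output acts as a pseudo-target; in a real training
    framework it would be treated as a constant (no gradient).
    Returns the summed loss and the per-path losses.
    """
    target = model(easy(image, rng))
    losses = [mse(model(hard(image, rng)), target) for hard in hard_augs]
    return sum(losses), losses


if __name__ == "__main__":
    rng = random.Random(0)
    image = [rng.random() for _ in range(64)]
    # Three hard paths: two base augmentations plus their composition.
    hard_augs = [mask_aug, noise_aug, compose(mask_aug, noise_aug)]
    total, per_path = multi_path_consistency_loss(
        toy_model, image, easy_aug, hard_augs, rng
    )
    print("per-path losses:", per_path)
    print("total loss:", total)
```

Because every path reuses the same `model`, adding a newly found augmentation only means appending it to `hard_augs`, which mirrors the abstract's remark that the design easily benefits from new augmentations.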
Related papers
- Gen-SIS: Generative Self-augmentation Improves Self-supervised Learning [52.170253590364545]
Gen-SIS is a diffusion-based augmentation technique trained exclusively on unlabeled image data.
We show that these 'self-augmentations', i.e., generative augmentations based on the vanilla SSL encoder embeddings, facilitate the training of a stronger SSL encoder.
arXiv Detail & Related papers (2024-12-02T16:20:59Z)
- GeNIe: Generative Hard Negative Images Through Diffusion [16.619150568764262]
Recent advances in generative AI have enabled more sophisticated augmentation techniques that produce data resembling natural images.
We introduce GeNIe, a novel augmentation method which leverages a latent diffusion model conditioned on a text prompt to generate challenging augmentations.
Our experiments demonstrate the effectiveness of our novel augmentation method and its superior performance over the prior art.
arXiv Detail & Related papers (2023-12-05T07:34:30Z)
- Contrastive Transformer Learning with Proximity Data Generation for Text-Based Person Search [60.626459715780605]
Given a descriptive text query, text-based person search aims to retrieve the best-matched target person from an image gallery.
Such a cross-modal retrieval task is quite challenging due to the significant modality gap, fine-grained differences, and the insufficiency of annotated data.
In this paper, we propose a simple yet effective dual Transformer model for text-based person search.
arXiv Detail & Related papers (2023-11-15T16:26:49Z)
- GraphLearner: Graph Node Clustering with Fully Learnable Augmentation [76.63963385662426]
Contrastive deep graph clustering (CDGC) leverages the power of contrastive learning to group nodes into different clusters.
We propose Graph Node Clustering with Fully Learnable Augmentation, termed GraphLearner.
It introduces learnable augmentors to generate high-quality and task-specific augmented samples for CDGC.
arXiv Detail & Related papers (2022-12-07T10:19:39Z)
- MSR: Making Self-supervised learning Robust to Aggressive Augmentations [98.6457801252358]
We propose a new SSL paradigm, which counteracts the impact of semantic shift by balancing the role of weak and aggressively augmented pairs.
We show that our model achieves 73.1% top-1 accuracy on ImageNet-1K with ResNet-50 for 200 epochs, which is a 2.5% improvement over BYOL.
arXiv Detail & Related papers (2022-06-04T14:27:29Z)
- Augmentation Pathways Network for Visual Recognition [61.33084317147437]
This paper introduces Augmentation Pathways (AP) to stabilize training on a much wider range of augmentation policies.
AP tames heavy data augmentations and stably boosts performance without a careful selection among augmentation policies.
Experimental results on ImageNet benchmarks demonstrate the compatibility and effectiveness on a much wider range of augmentations.
arXiv Detail & Related papers (2021-07-26T06:54:53Z)
- Few-shot learning via tensor hallucination [17.381648488344222]
Few-shot classification addresses the challenge of classifying examples given only limited labeled data.
We show that using a simple loss function is more than enough for training a feature generator in the few-shot setting.
Our method sets a new state of the art, outperforming more sophisticated few-shot data augmentation methods.
arXiv Detail & Related papers (2021-04-19T17:30:33Z)
This list is automatically generated from the titles and abstracts of the papers on this site.