Boosting Semi-Supervised 2D Human Pose Estimation by Revisiting Data Augmentation and Consistency Training
- URL: http://arxiv.org/abs/2402.11566v3
- Date: Thu, 13 Feb 2025 03:15:37 GMT
- Title: Boosting Semi-Supervised 2D Human Pose Estimation by Revisiting Data Augmentation and Consistency Training
- Authors: Huayi Zhou, Mukun Luo, Fei Jiang, Yue Ding, Hongtao Lu, Kui Jia
- Abstract summary: We find that SSHPE can be boosted from two cores: advanced data augmentations and concise consistency training schemes.
This simple and compact design is interpretable and easily benefits from newly found augmentations.
We extensively validate the superiority and versatility of our approach on conventional human body images, overhead fisheye images, and human hand images.
- Score: 54.074020740827855
- Abstract: 2D human pose estimation (HPE) is a fundamental vision problem. However, supervised learning for it requires massive keypoint labels, which are labor-intensive to collect. We therefore aim to boost a pose estimator by exploiting extra unlabeled data with semi-supervised learning (SSL). Most previous semi-supervised HPE (SSHPE) methods are consistency-based and strive to maintain consistent outputs for differently augmented inputs. Within this family, we find that SSHPE can be boosted from two cores: advanced data augmentations and concise consistency training schemes. Specifically, for the first core, we discover the synergistic effects of existing augmentations and reveal novel paradigms for conveniently producing new, superior HPE-oriented augmentations that more effectively add noise to unlabeled samples. We can therefore establish paired easy-hard augmentations with larger difficulty gaps. For the second core, we propose to repeatedly augment unlabeled images with diverse hard augmentations and to generate multi-path predictions sequentially for optimizing multiple losses in a single network. This simple and compact design is interpretable and easily benefits from newly found augmentations. Compared to state-of-the-art SSL approaches, our method brings substantial improvements on public datasets. We also extensively validate the superiority and versatility of our approach on conventional human body images, overhead fisheye images, and human hand images. The code is released at https://github.com/hnuzhy/MultiAugs.
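The multi-path consistency idea described in the abstract can be illustrated with a small sketch. This is not the authors' implementation (see their repository for that); it is a toy NumPy example under stated assumptions. The function names (`easy_augment`, `hard_augment_cutout`, `hard_augment_noise`, `toy_model`, `multi_path_consistency_loss`), the specific augmentations, the mean-squared-error loss, and the linear stand-in "network" are all hypothetical choices made for illustration. The augmentations here are photometric or occlusion-based only, so no geometric re-alignment of heatmaps is needed; a real pipeline with rotations or scaling would have to map predictions back to a common frame before comparing them.

```python
import numpy as np


def easy_augment(image, rng):
    """Hypothetical easy augmentation: very light additive noise."""
    return image + rng.normal(0.0, 0.01, image.shape)


def hard_augment_cutout(image, rng):
    """Hypothetical hard augmentation: zero out a random square patch."""
    out = image.copy()
    h, w = out.shape
    size = min(h, w) // 2
    y = rng.integers(0, h - size + 1)  # upper bound is exclusive
    x = rng.integers(0, w - size + 1)
    out[y:y + size, x:x + size] = 0.0
    return out


def hard_augment_noise(image, rng):
    """Hypothetical hard augmentation: strong additive noise."""
    return image + rng.normal(0.0, 0.5, image.shape)


def toy_model(image, weights):
    """Stand-in for a pose network: maps an (H, W) image to K heatmaps (K, H, W)."""
    return weights[:, None, None] * image[None, :, :]


def multi_path_consistency_loss(image, weights, hard_augs, rng):
    """Sum one consistency loss per hard-augmented path, using a single model.

    The easy-path prediction serves as the pseudo-target. In an autodiff
    framework it would be treated as a constant (no gradient through it).
    """
    target = toy_model(easy_augment(image, rng), weights)
    per_path = []
    for aug in hard_augs:  # paths are evaluated sequentially through the same model
        pred = toy_model(aug(image, rng), weights)
        per_path.append(float(np.mean((pred - target) ** 2)))
    return sum(per_path), per_path


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    image = rng.random((8, 8))           # toy unlabeled "image"
    weights = np.array([0.5, 1.0, 2.0])  # toy parameters, K = 3 keypoints
    total, per_path = multi_path_consistency_loss(
        image, weights, [hard_augment_cutout, hard_augment_noise], rng
    )
    print("per-path losses:", per_path)
    print("total loss:", total)
```

Adding another hard augmentation to the list adds one more loss term without changing the model, which is one plausible reading of why the abstract says the design "easily benefits from newly found augmentations".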
Related papers
- Gen-SIS: Generative Self-augmentation Improves Self-supervised Learning [52.170253590364545]
Gen-SIS is a diffusion-based augmentation technique trained exclusively on unlabeled image data.
We show that these "self-augmentations", i.e., generative augmentations based on the vanilla SSL encoder embeddings, facilitate the training of a stronger SSL encoder.
arXiv Detail & Related papers (2024-12-02T16:20:59Z)
- GeNIe: Generative Hard Negative Images Through Diffusion [16.619150568764262]
Recent advances in generative AI have enabled more sophisticated augmentation techniques that produce data resembling natural images.
We introduce GeNIe, a novel augmentation method which leverages a latent diffusion model conditioned on a text prompt to generate challenging augmentations.
Our experiments demonstrate the effectiveness of our novel augmentation method and its superior performance over the prior art.
arXiv Detail & Related papers (2023-12-05T07:34:30Z)
- Contrastive Transformer Learning with Proximity Data Generation for Text-Based Person Search [60.626459715780605]
Given a descriptive text query, text-based person search aims to retrieve the best-matched target person from an image gallery.
Such a cross-modal retrieval task is quite challenging due to the significant modality gap, fine-grained differences, and insufficient annotated data.
In this paper, we propose a simple yet effective dual Transformer model for text-based person search.
arXiv Detail & Related papers (2023-11-15T16:26:49Z)
- GraphLearner: Graph Node Clustering with Fully Learnable Augmentation [76.63963385662426]
Contrastive deep graph clustering (CDGC) leverages the power of contrastive learning to group nodes into different clusters.
We propose Graph Node Clustering with Fully Learnable Augmentation, termed GraphLearner.
It introduces learnable augmentors to generate high-quality and task-specific augmented samples for CDGC.
arXiv Detail & Related papers (2022-12-07T10:19:39Z)
- MSR: Making Self-supervised learning Robust to Aggressive Augmentations [98.6457801252358]
We propose a new SSL paradigm, which counteracts the impact of semantic shift by balancing the role of weak and aggressively augmented pairs.
We show that our model achieves 73.1% top-1 accuracy on ImageNet-1K with ResNet-50 for 200 epochs, which is a 2.5% improvement over BYOL.
arXiv Detail & Related papers (2022-06-04T14:27:29Z)
- Augmentation Pathways Network for Visual Recognition [61.33084317147437]
This paper introduces Augmentation Pathways (AP) to stabilize training on a much wider range of augmentation policies.
AP tames heavy data augmentations and stably boosts performance without a careful selection among augmentation policies.
Experimental results on ImageNet benchmarks demonstrate the compatibility and effectiveness on a much wider range of augmentations.
arXiv Detail & Related papers (2021-07-26T06:54:53Z)
- Few-shot learning via tensor hallucination [17.381648488344222]
Few-shot classification addresses the challenge of classifying examples given only limited labeled data.
We show that using a simple loss function is more than enough for training a feature generator in the few-shot setting.
Our method sets a new state of the art, outperforming more sophisticated few-shot data augmentation methods.
arXiv Detail & Related papers (2021-04-19T17:30:33Z)