Related papers: Boosting Semi-Supervised 2D Human Pose Estimation by Revisiting Data Augmentation and Consistency Training

Boosting Semi-Supervised 2D Human Pose Estimation by Revisiting Data Augmentation and Consistency Training

URL: http://arxiv.org/abs/2402.11566v3
Date: Thu, 13 Feb 2025 03:15:37 GMT
Title: Boosting Semi-Supervised 2D Human Pose Estimation by Revisiting Data Augmentation and Consistency Training
Authors: Huayi Zhou, Mukun Luo, Fei Jiang, Yue Ding, Hongtao Lu, Kui Jia,
Abstract summary: We find that SSHPE can be boosted from two cores: advanced data augmentations and concise consistency training ways.<n>This simple and compact design is interpretable, and easily benefits from newly found augmentations.<n>We extensively validate the superiority and versatility of our approach on conventional human body images, overhead fisheye images, and human hand images.
Score: 54.074020740827855
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: The 2D human pose estimation (HPE) is a basic visual problem. However, its supervised learning requires massive keypoint labels, which is labor-intensive to collect. Thus, we aim at boosting a pose estimator by excavating extra unlabeled data with semi-supervised learning (SSL). Most previous SSHPE methods are consistency-based and strive to maintain consistent outputs for differently augmented inputs. Under this genre, we find that SSHPE can be boosted from two cores: advanced data augmentations and concise consistency training ways. Specifically, for the first core, we discover the synergistic effects of existing augmentations, and reveal novel paradigms for conveniently producing new superior HPE-oriented augmentations which can more effectively add noise on unlabeled samples. We can therefore establish paired easy-hard augmentations with larger difficulty gaps. For the second core, we propose to repeatedly augment unlabeled images with diverse hard augmentations, and generate multi-path predictions sequentially for optimizing multi-losses in a single network. This simple and compact design is interpretable, and easily benefits from newly found augmentations. Comparing to state-of-the-art SSL approaches, our method brings substantial improvements on public datasets. And we extensively validate the superiority and versatility of our approach on conventional human body images, overhead fisheye images, and human hand images. The code is released in https://github.com/hnuzhy/MultiAugs.

Related papers

Gen-SIS: Generative Self-augmentation Improves Self-supervised Learning [52.170253590364545]
Gen-SIS is a diffusion-based augmentation technique trained exclusively on unlabeled image data. We show that these self-augmentations', i.e. generative augmentations based on the vanilla SSL encoder embeddings, facilitate the training of a stronger SSL encoder.
arXiv Detail & Related papers (2024-12-02T16:20:59Z)
GeNIe: Generative Hard Negative Images Through Diffusion [16.619150568764262]
Recent advances in generative AI have enabled more sophisticated augmentation techniques that produce data resembling natural images. We introduce GeNIe, a novel augmentation method which leverages a latent diffusion model conditioned on a text prompt to generate challenging augmentations. Our experiments demonstrate the effectiveness of our novel augmentation method and its superior performance over the prior art.
arXiv Detail & Related papers (2023-12-05T07:34:30Z)
Contrastive Transformer Learning with Proximity Data Generation for Text-Based Person Search [60.626459715780605]
Given a descriptive text query, text-based person search aims to retrieve the best-matched target person from an image gallery. Such a cross-modal retrieval task is quite challenging due to significant modality gap, fine-grained differences and insufficiency of annotated data. In this paper, we propose a simple yet effective dual Transformer model for text-based person search.
arXiv Detail & Related papers (2023-11-15T16:26:49Z)
DualAug: Exploiting Additional Heavy Augmentation with OOD Data Rejection [77.6648187359111]
We propose a novel data augmentation method, named textbfDualAug, to keep the augmentation in distribution as much as possible at a reasonable time and computational cost. Experiments on supervised image classification benchmarks show that DualAug improve various automated data augmentation method.
arXiv Detail & Related papers (2023-10-12T08:55:10Z)
GraphLearner: Graph Node Clustering with Fully Learnable Augmentation [76.63963385662426]
Contrastive deep graph clustering (CDGC) leverages the power of contrastive learning to group nodes into different clusters. We propose a Graph Node Clustering with Fully Learnable Augmentation, termed GraphLearner. It introduces learnable augmentors to generate high-quality and task-specific augmented samples for CDGC.
arXiv Detail & Related papers (2022-12-07T10:19:39Z)
ScoreMix: A Scalable Augmentation Strategy for Training GANs with Limited Data [93.06336507035486]
Generative Adversarial Networks (GANs) typically suffer from overfitting when limited training data is available. We present ScoreMix, a novel and scalable data augmentation approach for various image synthesis tasks.
arXiv Detail & Related papers (2022-10-27T02:55:15Z)
Data-Efficient Augmentation for Training Neural Networks [15.870155099135538]
We propose a rigorous technique to select subsets of data points that when augmented, closely capture the training dynamics of full data augmentation. Our method achieves 6.3x speedup on CIFAR10 and 2.2x speedup on SVHN, and outperforms the baselines by up to 10% across various subset sizes.
arXiv Detail & Related papers (2022-10-15T19:32:20Z)
MSR: Making Self-supervised learning Robust to Aggressive Augmentations [98.6457801252358]
We propose a new SSL paradigm, which counteracts the impact of semantic shift by balancing the role of weak and aggressively augmented pairs. We show that our model achieves 73.1% top-1 accuracy on ImageNet-1K with ResNet-50 for 200 epochs, which is a 2.5% improvement over BYOL.
arXiv Detail & Related papers (2022-06-04T14:27:29Z)
Improving Contrastive Learning with Model Augmentation [123.05700988581806]
The sequential recommendation aims at predicting the next items in user behaviors, which can be solved by characterizing item relationships in sequences. Due to the data sparsity and noise issues in sequences, a new self-supervised learning (SSL) paradigm is proposed to improve the performance.
arXiv Detail & Related papers (2022-03-25T06:12:58Z)
Augmentation Pathways Network for Visual Recognition [61.33084317147437]
This paper introduces Augmentation Pathways (AP) to stabilize training on a much wider range of augmentation policies. AP tames heavy data augmentations and stably boosts performance without a careful selection among augmentation policies. Experimental results on ImageNet benchmarks demonstrate the compatibility and effectiveness on a much wider range of augmentations.
arXiv Detail & Related papers (2021-07-26T06:54:53Z)
Few-shot learning via tensor hallucination [17.381648488344222]
Few-shot classification addresses the challenge of classifying examples given only limited labeled data. We show that using a simple loss function is more than enough for training a feature generator in the few-shot setting. Our method sets a new state of the art, outperforming more sophisticated few-shot data augmentation methods.
arXiv Detail & Related papers (2021-04-19T17:30:33Z)

This list is automatically generated from the titles and abstracts of the papers in this site.