Related papers: Pose-Diversified Augmentation with Diffusion Model for Person Re-Identification

Pose-Diversified Augmentation with Diffusion Model for Person Re-Identification

URL: http://arxiv.org/abs/2406.16042v1
Date: Sun, 23 Jun 2024 07:48:21 GMT
Title: Pose-Diversified Augmentation with Diffusion Model for Person Re-Identification
Authors: Inès Hyeonsu Kim, JoungBin Lee, Soowon Son, Woojeong Jin, Kyusun Cho, Junyoung Seo, Min-Seop Kwak, Seokju Cho, JeongYeol Baek, Byeongwon Lee, Seungryong Kim,
Abstract summary: Person re-identification (Re-ID) often faces challenges due to variations in human poses and camera viewpoints. Previous methods have attempted to address these issues through data augmentation. We propose Diff-ID, a novel data augmentation approach that incorporates sparse and underrepresented human pose and camera viewpoint examples.
Score: 28.794827024749658
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Person re-identification (Re-ID) often faces challenges due to variations in human poses and camera viewpoints, which significantly affect the appearance of individuals across images. Existing datasets frequently lack diversity and scalability in these aspects, hindering the generalization of Re-ID models to new camera systems. Previous methods have attempted to address these issues through data augmentation; however, they rely on human poses already present in the training dataset, failing to effectively reduce the human pose bias in the dataset. We propose Diff-ID, a novel data augmentation approach that incorporates sparse and underrepresented human pose and camera viewpoint examples into the training data, addressing the limited diversity in the original training data distribution. Our objective is to augment a training dataset that enables existing Re-ID models to learn features unbiased by human pose and camera viewpoint variations. To achieve this, we leverage the knowledge of pre-trained large-scale diffusion models. Using the SMPL model, we simultaneously capture both the desired human poses and camera viewpoints, enabling realistic human rendering. The depth information provided by the SMPL model indirectly conveys the camera viewpoints. By conditioning the diffusion model on both the human pose and camera viewpoint concurrently through the SMPL model, we generate realistic images with diverse human poses and camera viewpoints. Qualitative results demonstrate the effectiveness of our method in addressing human pose bias and enhancing the generalizability of Re-ID models compared to other data augmentation-based Re-ID approaches. The performance gains achieved by training Re-ID models on our offline augmented dataset highlight the potential of our proposed framework in improving the scalability and generalizability of person Re-ID models.

Related papers

DPoser-X: Diffusion Model as Robust 3D Whole-body Human Pose Prior [82.9526308672547]
We present DPoser-X, a diffusion-based prior model for 3D whole-body human poses.<n>Our approach unifies various pose-centric tasks as inverse problems, solving them through variational diffusion sampling.<n>Our model consistently outperforms state-of-the-art alternatives, establishing a new benchmark for whole-body human pose prior modeling.
arXiv Detail & Related papers (2025-08-01T12:56:39Z)
Test-Time Augmentation for Pose-invariant Face Recognition [14.515296731166721]
Pose-TTA is a novel approach that aligns faces at inference time without additional training.<n>To achieve this, we employ a portrait animator that transfers the source image identity into the pose of a driving image.<n>We propose a weighted feature aggregation strategy to address any distortions or biases arising from the synthetic data.
arXiv Detail & Related papers (2025-05-14T10:11:35Z)
SapiensID: Foundation for Human Recognition [15.65725865703615]
SapiensID is a unified model for face and body analysis, achieving robust performance across diverse settings. To facilitate training, we introduce WebBody4M, a large-scale dataset capturing diverse poses and scale variations. Experiments demonstrate that SapiensID achieves state-of-the-art results on various body ReID benchmarks, outperforming specialized models in both short-term and long-term scenarios.
arXiv Detail & Related papers (2025-04-07T03:38:07Z)
Human-Object Interaction Detection Collaborated with Large Relation-driven Diffusion Models [65.82564074712836]
We introduce DIFfusionHOI, a new HOI detector shedding light on text-to-image diffusion models. We first devise an inversion-based strategy to learn the expression of relation patterns between humans and objects in embedding space. These learned relation embeddings then serve as textual prompts, to steer diffusion models generate images that depict specific interactions.
arXiv Detail & Related papers (2024-10-26T12:00:33Z)
Synthesizing Efficient Data with Diffusion Models for Person Re-Identification Pre-Training [51.87027943520492]
We present a novel paradigm Diffusion-ReID to efficiently augment and generate diverse images based on known identities. Benefiting from our proposed paradigm, we first create a new large-scale person Re-ID dataset Diff-Person, which consists of over 777K images from 5,183 identities.
arXiv Detail & Related papers (2024-06-10T06:26:03Z)
Data Augmentation in Human-Centric Vision [54.97327269866757]
This survey presents a comprehensive analysis of data augmentation techniques in human-centric vision tasks. It delves into a wide range of research areas including person ReID, human parsing, human pose estimation, and pedestrian detection. Our work categorizes data augmentation methods into two main types: data generation and data perturbation.
arXiv Detail & Related papers (2024-03-13T16:05:18Z)
Extracting Training Data from Diffusion Models [77.11719063152027]
We show that diffusion models memorize individual images from their training data and emit them at generation time. With a generate-and-filter pipeline, we extract over a thousand training examples from state-of-the-art models. We train hundreds of diffusion models in various settings to analyze how different modeling and data decisions affect privacy.
arXiv Detail & Related papers (2023-01-30T18:53:09Z)
Pose Invariant Person Re-Identification using Robust Pose-transformation GAN [11.338815177557645]
Person re-identification (re-ID) aims to retrieve a person's images from an image gallery, given a single instance of the person of interest. Despite several advancements, learning discriminative identity-sensitive and viewpoint invariant features for robust Person Re-identification is a major challenge owing to large pose variation of humans. This paper proposes a re-ID pipeline that utilizes the image generation capability of Generative Adversarial Networks combined with pose regression and feature fusion to achieve pose invariant feature learning.
arXiv Detail & Related papers (2021-04-11T15:47:03Z)
Self-Supervised 3D Human Pose Estimation via Part Guided Novel Image Synthesis [72.34794624243281]
We propose a self-supervised learning framework to disentangle variations from unlabeled video frames. Our differentiable formalization, bridging the representation gap between the 3D pose and spatial part maps, allows us to operate on videos with diverse camera movements.
arXiv Detail & Related papers (2020-04-09T07:55:01Z)
MirrorNet: A Deep Bayesian Approach to Reflective 2D Pose Estimation from Human Images [42.27703025887059]
The main problems with the standard supervised approach are that it often yields anatomically implausible poses. We propose a semi-supervised method that can make effective use of images with and without pose annotations. The results of experiments show that the proposed reflective architecture makes estimated poses anatomically plausible.
arXiv Detail & Related papers (2020-04-08T05:02:48Z)
Cross-Resolution Adversarial Dual Network for Person Re-Identification and Beyond [59.149653740463435]
Person re-identification (re-ID) aims at matching images of the same person across camera views. Due to varying distances between cameras and persons of interest, resolution mismatch can be expected. We propose a novel generative adversarial network to address cross-resolution person re-ID.
arXiv Detail & Related papers (2020-02-19T07:21:38Z)
A Robust Pose Transformational GAN for Pose Guided Person Image Synthesis [9.570395744724461]
We propose a simple yet effective pose transformation GAN by utilizing the Residual Learning method without any additional feature learning to generate a given human image in any arbitrary pose. Using effective data augmentation techniques and cleverly tuning the model, we achieve robustness in terms of illumination, occlusion, distortion and scale. We present a detailed study, both qualitative and quantitative, to demonstrate the superiority of our model over the existing methods on two large datasets.
arXiv Detail & Related papers (2020-01-05T15:32:35Z)

This list is automatically generated from the titles and abstracts of the papers in this site.