Camera-Conditioned Stable Feature Generation for Isolated Camera
Supervised Person Re-IDentification
- URL: http://arxiv.org/abs/2203.15210v1
- Date: Tue, 29 Mar 2022 03:10:24 GMT
- Title: Camera-Conditioned Stable Feature Generation for Isolated Camera
Supervised Person Re-IDentification
- Authors: Chao Wu, Wenhang Ge, Ancong Wu, Xiaobin Chang
- Abstract summary: Cross-camera images could be unavailable under the ISolated Camera Supervised setting, e.g., a surveillance system deployed across distant scenes.
A new pipeline is introduced by synthesizing the cross-camera samples in the feature space for model training.
Experiments on two ISCS person Re-ID datasets demonstrate the superiority of our CCSFG to the competitors.
- Score: 24.63519986072777
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: To learn camera-view invariant features for person Re-IDentification (Re-ID),
the cross-camera image pairs of each person play an important role. However,
such cross-view training samples could be unavailable under the ISolated Camera
Supervised (ISCS) setting, e.g., a surveillance system deployed across distant
scenes. To handle this challenging problem, a new pipeline is introduced by
synthesizing the cross-camera samples in the feature space for model training.
Specifically, the feature encoder and generator are end-to-end optimized under
a novel method, Camera-Conditioned Stable Feature Generation (CCSFG). Its joint
learning procedure raises concerns about the stability of generative model
training. Therefore, a new feature generator, $\sigma$-Regularized Conditional
Variational Autoencoder ($\sigma$-Reg.~CVAE), is proposed with theoretical and
experimental analysis on its robustness. Extensive experiments on two ISCS
person Re-ID datasets demonstrate the superiority of our CCSFG over its
competitors.
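The core idea above, synthesizing cross-camera samples in feature space with a camera-conditioned VAE whose variance is regularized for stable joint training, can be sketched roughly as follows. This is a minimal illustrative sketch, not the paper's implementation: the network shapes, the linear encoder/decoder, and the exact form of the variance regularizer (here, penalizing posterior variances that drift from 1) are all assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

def one_hot(cam_id, num_cams):
    v = np.zeros(num_cams)
    v[cam_id] = 1.0
    return v

class CondVAE:
    """Toy conditional VAE over person features, conditioned on camera ID.

    Linear encoder/decoder stand in for the real networks; shapes are
    illustrative assumptions, not the paper's architecture.
    """
    def __init__(self, feat_dim=8, latent_dim=4, num_cams=3):
        d_in = feat_dim + num_cams
        self.W_mu = rng.normal(0, 0.1, (latent_dim, d_in))
        self.W_logvar = rng.normal(0, 0.1, (latent_dim, d_in))
        self.W_dec = rng.normal(0, 0.1, (feat_dim, latent_dim + num_cams))
        self.num_cams = num_cams

    def encode(self, x, cam):
        h = np.concatenate([x, one_hot(cam, self.num_cams)])
        return self.W_mu @ h, self.W_logvar @ h

    def decode(self, z, cam):
        return self.W_dec @ np.concatenate([z, one_hot(cam, self.num_cams)])

    def synthesize(self, x, src_cam, tgt_cam):
        # Encode a feature under its source camera, then decode under a
        # *different* camera condition to synthesize a cross-camera sample
        # in feature space (no cross-camera image pairs needed).
        mu, logvar = self.encode(x, src_cam)
        z = mu + np.exp(0.5 * logvar) * rng.normal(size=mu.shape)  # reparameterize
        return self.decode(z, tgt_cam)

def sigma_reg(logvar, lam=0.1):
    # Assumed stand-in for the sigma-regularizer: keep posterior variances
    # near 1 so the generator stays stable when the feature encoder and
    # generator are optimized jointly end-to-end.
    return lam * np.mean((np.exp(logvar) - 1.0) ** 2)

model = CondVAE()
x = rng.normal(size=8)            # a person feature from camera 0
fake = model.synthesize(x, 0, 1)  # synthetic feature as if seen by camera 1
```

The synthesized `fake` feature can then serve as a cross-camera positive for the same identity during metric learning, which is the role such generated samples play in the pipeline described above.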
Related papers
- SD-ReID: View-aware Stable Diffusion for Aerial-Ground Person Re-Identification [61.753607285860944]
We propose a novel two-stage feature learning framework named SD-ReID for AG-ReID.
In the first stage, we train a simple ViT-based model to extract coarse-grained representations and controllable conditions.
In the second stage, we fine-tune the SD model to learn complementary representations guided by the controllable conditions.
arXiv Detail & Related papers (2025-04-13T12:44:50Z) - OminiControl: Minimal and Universal Control for Diffusion Transformer [68.3243031301164]
OminiControl is a framework that integrates image conditions into pre-trained Diffusion Transformer (DiT) models.
At its core, OminiControl leverages a parameter reuse mechanism, enabling the DiT to encode image conditions using itself as a powerful backbone.
OminiControl addresses a wide range of image conditioning tasks in a unified manner, including subject-driven generation and spatially-aligned conditions.
arXiv Detail & Related papers (2024-11-22T17:55:15Z) - Exploring Stronger Transformer Representation Learning for Occluded Person Re-Identification [2.552131151698595]
We propose a novel transformer-based person re-identification framework, SSSC-TransReID, that combines self-supervised and supervised learning.
We designed a self-supervised contrastive learning branch, which can enhance the feature representation for person re-identification without negative samples or additional pre-training.
Our proposed model consistently obtains superior Re-ID performance and outperforms state-of-the-art ReID methods by large margins on mean average precision (mAP) and Rank-1 accuracy.
arXiv Detail & Related papers (2024-10-21T03:17:25Z) - RCDN: Towards Robust Camera-Insensitivity Collaborative Perception via Dynamic Feature-based 3D Neural Modeling [13.980022113881697]
We introduce a new robust camera-insensitivity problem: how to overcome the issues caused by failed camera perspectives?
We propose RCDN, a Robust Camera-insensitivity collaborative perception framework with a novel Dynamic feature-based 3D Neural modeling mechanism.
arXiv Detail & Related papers (2024-05-27T06:35:55Z) - Robust Ensemble Person Re-Identification via Orthogonal Fusion with Occlusion Handling [4.431087385310259]
Occlusion remains one of the major challenges in person re-identification (ReID).
We propose a deep ensemble model that harnesses both CNN and Transformer architectures to generate robust feature representations.
arXiv Detail & Related papers (2024-03-29T18:38:59Z) - OCR is All you need: Importing Multi-Modality into Image-based Defect Detection System [7.1083241462091165]
We introduce an external modality-guided data mining framework, primarily rooted in optical character recognition (OCR), to extract statistical features from images.
A key aspect of our approach is the alignment of external modality features, extracted using a single modality-aware model, with image features encoded by a convolutional neural network.
Our methodology considerably boosts the recall rate of the defect detection model and maintains high robustness even in challenging scenarios.
arXiv Detail & Related papers (2024-03-18T07:41:39Z) - A Transformer Model for Boundary Detection in Continuous Sign Language [55.05986614979846]
The Transformer model is employed for both Isolated Sign Language Recognition and Continuous Sign Language Recognition.
The training process involves using isolated sign videos, where hand keypoint features extracted from the input video are enriched.
The trained model, coupled with a post-processing method, is then applied to detect isolated sign boundaries within continuous sign videos.
arXiv Detail & Related papers (2024-02-22T17:25:01Z) - Complexity Matters: Rethinking the Latent Space for Generative Modeling [65.64763873078114]
In generative modeling, numerous successful approaches leverage a low-dimensional latent space, e.g., Stable Diffusion.
In this study, we aim to shed light on this under-explored topic by rethinking the latent space from the perspective of model complexity.
arXiv Detail & Related papers (2023-07-17T07:12:29Z) - Domain-adaptive Person Re-identification without Cross-camera Paired Samples [12.041823465553875]
Cross-camera pedestrian samples collected from long-distance scenes often have no positive samples.
It is extremely challenging to use cross-camera negative samples to achieve cross-region pedestrian identity matching.
A novel domain-adaptive person re-ID method that focuses on cross-camera consistent discriminative feature learning is proposed.
arXiv Detail & Related papers (2023-07-13T02:42:28Z) - Cross-Camera Feature Prediction for Intra-Camera Supervised Person Re-identification across Distant Scenes [70.30052164401178]
Person re-identification (Re-ID) aims to match person images across non-overlapping camera views.
ICS-DS Re-ID uses cross-camera unpaired data with intra-camera identity labels for training.
A cross-camera feature prediction method is proposed to mine cross-camera self-supervision information.
Joint learning of global-level and local-level features forms a global-local cross-camera feature prediction scheme.
arXiv Detail & Related papers (2021-07-29T11:27:50Z) - Joint Noise-Tolerant Learning and Meta Camera Shift Adaptation for Unsupervised Person Re-Identification [60.36551512902312]
Unsupervised person re-identification (re-ID) aims to learn discriminative models with unlabeled data.
One popular method is to obtain pseudo-labels by clustering and use them to optimize the model.
In this paper, we propose a unified framework to solve both problems.
arXiv Detail & Related papers (2021-03-08T09:13:06Z) - Towards Precise Intra-camera Supervised Person Re-identification [54.86892428155225]
Intra-camera supervision (ICS) for person re-identification (Re-ID) assumes that identity labels are independently annotated within each camera view.
Lack of inter-camera labels makes the ICS Re-ID problem much more challenging than the fully supervised counterpart.
Our approach performs even comparable to state-of-the-art fully supervised methods in two of the datasets.
arXiv Detail & Related papers (2020-02-12T11:56:30Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
The site does not guarantee the quality of this information and is not responsible for any consequences of its use.