Devil's in the Details: Aligning Visual Clues for Conditional Embedding
in Person Re-Identification
- URL: http://arxiv.org/abs/2009.05250v2
- Date: Mon, 7 Dec 2020 11:07:51 GMT
- Title: Devil's in the Details: Aligning Visual Clues for Conditional Embedding
in Person Re-Identification
- Authors: Fufu Yu, Xinyang Jiang, Yifei Gong, Shizhen Zhao, Xiaowei Guo, Wei-Shi
Zheng, Feng Zheng, Xing Sun
- Abstract summary: We propose two key recognition patterns to better utilize the detail information of pedestrian images.
CACE-Net achieves state-of-the-art performance on three public datasets.
- Score: 94.77172127405846
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Although Person Re-Identification has made impressive progress, difficult
cases like occlusion, change of viewpoint, and similar clothing still pose
great challenges. Besides overall visual features, matching and comparing
detailed information is also essential for tackling these challenges. This
paper proposes two key recognition patterns, which most existing methods are
unable to satisfy, to better utilize the detailed information in pedestrian
images. Firstly, Visual Clue Alignment requires the model to select and
align decisive region pairs from two images for pair-wise comparison, while
existing methods only align regions with predefined rules like high feature
similarity or same semantic labels. Secondly, the Conditional Feature Embedding
requires the overall feature of a query image to be dynamically adjusted based
on the gallery image it matches, while most of the existing methods ignore the
reference images. By introducing novel techniques including correspondence
attention module and discrepancy-based GCN, we propose an end-to-end ReID
method that integrates both patterns into a unified framework, called
CACE-Net ((C)lue (A)lignment and (C)onditional (E)mbedding). Experiments show
that CACE-Net achieves state-of-the-art performance on three public datasets.
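The abstract's second pattern, Conditional Feature Embedding, means the query descriptor is recomputed for every gallery image it is compared against. A minimal sketch of that idea is below, assuming region-level features and a simple correspondence-attention weighting; the function and weighting scheme are illustrative, not the paper's actual architecture (which uses a correspondence attention module and a discrepancy-based GCN):

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def conditional_embedding(query_regions, gallery_regions):
    """Pool query region features into one vector, weighted by how well
    each region corresponds to some region of the gallery image."""
    # Cosine similarity between every (query, gallery) region pair.
    q = query_regions / np.linalg.norm(query_regions, axis=1, keepdims=True)
    g = gallery_regions / np.linalg.norm(gallery_regions, axis=1, keepdims=True)
    sim = q @ g.T                        # (Nq, Ng) region-pair similarities
    # A query region matters more if it aligns well with any gallery region.
    weights = softmax(sim.max(axis=1))   # (Nq,)
    # The resulting embedding is conditioned on the gallery image.
    return weights @ query_regions       # (D,)

rng = np.random.default_rng(0)
q_regions = rng.normal(size=(6, 8))      # 6 query regions, 8-dim features
g_regions = rng.normal(size=(6, 8))      # 6 gallery regions
emb = conditional_embedding(q_regions, g_regions)
print(emb.shape)                         # (8,)
```

Note that the same query produces a different embedding for each gallery image, which is exactly what distinguishes this pattern from the fixed, reference-agnostic embeddings the abstract says most existing methods use.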
Related papers
- VSFormer: Visual-Spatial Fusion Transformer for Correspondence Pruning [22.0082111649259]
Correspondence pruning aims to find correct matches (inliers) from an initial set of putative correspondences.
We propose a Visual-Spatial Fusion Transformer (VSFormer) to identify inliers and recover camera poses accurately.
arXiv Detail & Related papers (2023-12-14T09:50:09Z) - Noisy-Correspondence Learning for Text-to-Image Person Re-identification [50.07634676709067]
We propose a novel Robust Dual Embedding method (RDE) to learn robust visual-semantic associations even with noisy correspondences.
Our method achieves state-of-the-art results both with and without synthetic noisy correspondences on three datasets.
arXiv Detail & Related papers (2023-08-19T05:34:13Z) - Occ$^2$Net: Robust Image Matching Based on 3D Occupancy Estimation for
Occluded Regions [14.217367037250296]
Occ$^2$Net is an image matching method that models occlusion relations using 3D occupancy and infers matching points in occluded regions.
We evaluate our method on both real-world and simulated datasets and demonstrate its superior performance over state-of-the-art methods on several metrics.
arXiv Detail & Related papers (2023-08-14T13:09:41Z) - Single Stage Virtual Try-on via Deformable Attention Flows [51.70606454288168]
Virtual try-on aims to generate a photo-realistic fitting result given an in-shop garment and a reference person image.
We develop a novel Deformable Attention Flow (DAFlow) which applies the deformable attention scheme to multi-flow estimation.
Our proposed method achieves state-of-the-art performance both qualitatively and quantitatively.
arXiv Detail & Related papers (2022-07-19T10:01:31Z) - Share With Thy Neighbors: Single-View Reconstruction by Cross-Instance
Consistency [59.427074701985795]
Single-view reconstruction typically relies on viewpoint annotations, silhouettes, the absence of background, multiple views of the same instance, a template shape, or symmetry.
We avoid all of these supervisions and hypotheses by leveraging explicitly the consistency between images of different object instances.
Our main contributions are two approaches to leverage cross-instance consistency: (i) progressive conditioning, a training strategy to gradually specialize the model from category to instances in a curriculum learning fashion; (ii) swap reconstruction, a loss enforcing consistency between instances having similar shape or texture.
arXiv Detail & Related papers (2022-04-21T17:47:35Z) - Co-Attention for Conditioned Image Matching [91.43244337264454]
We propose a new approach to determine correspondences between image pairs in the wild under large changes in illumination, viewpoint, context, and material.
While other approaches find correspondences between pairs of images by treating the images independently, we instead condition on both images to implicitly take account of the differences between them.
arXiv Detail & Related papers (2020-07-16T17:32:00Z) - High-Order Information Matters: Learning Relation and Topology for
Occluded Person Re-Identification [84.43394420267794]
We propose a novel framework by learning high-order relation and topology information for discriminative features and robust alignment.
Our framework significantly outperforms state-of-the-art by 6.5% mAP scores on Occluded-Duke dataset.
arXiv Detail & Related papers (2020-03-18T12:18:35Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences.