MagnifierNet: Towards Semantic Adversary and Fusion for Person
Re-identification
- URL: http://arxiv.org/abs/2002.10979v4
- Date: Tue, 5 May 2020 02:22:42 GMT
- Title: MagnifierNet: Towards Semantic Adversary and Fusion for Person
Re-identification
- Authors: Yushi Lan, Yuan Liu, Maoqing Tian, Xinchi Zhou, Xuesen Zhang, Shuai
Yi, Hongsheng Li
- Abstract summary: MagnifierNet is a triple-branch network which accurately mines details from whole to parts.
"Semantic Fusion Branch" filters out irrelevant noises by selectively fusing semantic region information sequentially.
"Semantic Diversity Loss" removes redundant overlaps across learned semantic representations.
- Score: 38.13515165097505
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Although person re-identification (ReID) has achieved significant improvement
recently by enforcing part alignment, it is still a challenging task when it
comes to distinguishing visually similar identities or identifying the occluded
person. In these scenarios, magnifying details in each part's features and
selectively fusing them together may provide a feasible solution. In this work,
we propose MagnifierNet, a triple-branch network which accurately mines details
from whole to parts. Firstly, the holistic salient features are encoded by a
global branch. Secondly, to enhance detailed representation for each semantic
region, the "Semantic Adversarial Branch" is designed to learn from dynamically
generated semantic-occluded samples during training. Meanwhile, we introduce
"Semantic Fusion Branch" to filter out irrelevant noises by selectively fusing
semantic region information sequentially. To further improve feature diversity,
we introduce a novel loss function "Semantic Diversity Loss" to remove
redundant overlaps across learned semantic representations. State-of-the-art
performance has been achieved on three benchmarks by large margins.
Specifically, the mAP score is improved by 6% and 5% on the most challenging
CUHK03-L and CUHK03-D benchmarks, respectively.
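The abstract does not specify the exact form of the "Semantic Diversity Loss"; the sketch below is one plausible formulation, penalizing pairwise cosine-similarity overlap between the learned semantic region representations. The squared-cosine penalty and all function names are assumptions for illustration, not the paper's actual definition:

```python
import numpy as np

def semantic_diversity_loss(part_features: np.ndarray) -> float:
    """Penalize redundant overlap between semantic region representations.

    part_features: array of shape (K, D), one D-dimensional feature per
    semantic region. The loss sums squared cosine similarities over all
    distinct region pairs, so it is 0 when the region features are
    mutually orthogonal (no redundant overlap) and grows as they align.
    """
    # L2-normalize each region feature vector.
    norms = np.linalg.norm(part_features, axis=1, keepdims=True)
    normed = part_features / np.clip(norms, 1e-12, None)
    # Pairwise cosine similarities between all regions.
    sim = normed @ normed.T
    # Keep only the off-diagonal pairs (i < j) and penalize their overlap.
    iu = np.triu_indices(sim.shape[0], k=1)
    return float(np.sum(sim[iu] ** 2))
```

With three mutually orthogonal region features the loss is 0; with three identical features it is 3 (one unit of penalty per pair), so minimizing it pushes the semantic representations apart.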
Related papers
- GroupContrast: Semantic-aware Self-supervised Representation Learning for 3D Understanding [66.5538429726564]
Self-supervised 3D representation learning aims to learn effective representations from large-scale unlabeled point clouds.
We propose GroupContrast, a novel approach that combines segment grouping and semantic-aware contrastive learning.
arXiv Detail & Related papers (2024-03-14T17:59:59Z)
- Collaborative Group: Composed Image Retrieval via Consensus Learning from Noisy Annotations [67.92679668612858]
We propose the Consensus Network (Css-Net), inspired by the psychological concept that groups outperform individuals.
Css-Net comprises two core components: (1) a consensus module with four diverse compositors, each generating distinct image-text embeddings; and (2) a Kullback-Leibler divergence loss that encourages learning of inter-compositor interactions.
On benchmark datasets, particularly FashionIQ, Css-Net demonstrates marked improvements. Notably, it achieves significant recall gains, with a 2.77% increase in R@10 and a 6.67% boost in R@50, underscoring its effectiveness.
arXiv Detail & Related papers (2023-06-03T11:50:44Z)
- Object Segmentation by Mining Cross-Modal Semantics [68.88086621181628]
We propose a novel approach by mining the Cross-Modal Semantics to guide the fusion and decoding of multimodal features.
Specifically, we propose a novel network, termed XMSNet, consisting of (1) all-round attentive fusion (AF), (2) coarse-to-fine decoder (CFD), and (3) cross-layer self-supervision.
arXiv Detail & Related papers (2023-05-17T14:30:11Z)
- Leveraging Hidden Positives for Unsupervised Semantic Segmentation [5.937673383513695]
We leverage contrastive learning by excavating hidden positives to learn rich semantic relationships.
We introduce a gradient propagation strategy to learn semantic consistency between adjacent patches.
Our proposed method achieves new state-of-the-art (SOTA) results on the COCO-Stuff, Cityscapes, and Potsdam-3 datasets.
arXiv Detail & Related papers (2023-03-27T08:57:28Z)
- Feature Completion Transformer for Occluded Person Re-identification [25.159974510754992]
Occluded person re-identification (Re-ID) is a challenging problem because occluders destroy parts of the target's appearance.
We propose a Feature Completion Transformer (FCFormer) to implicitly complement the semantic information of occluded parts in the feature space.
FCFormer achieves superior performance and outperforms the state-of-the-art methods by significant margins on occluded datasets.
arXiv Detail & Related papers (2023-03-03T01:12:57Z)
- Semantic Feature Integration network for Fine-grained Visual Classification [5.182627302449368]
We propose the Semantic Feature Integration network (SFI-Net) to address the above difficulties.
By eliminating unnecessary features and reconstructing the semantic relations among discriminative features, our SFI-Net achieves satisfactory performance.
arXiv Detail & Related papers (2023-02-13T07:32:25Z)
- Part-guided Relational Transformers for Fine-grained Visual Recognition [59.20531172172135]
We propose a framework to learn the discriminative part features and explore correlations with a feature transformation module.
Our proposed approach does not rely on additional part branches and reaches state-of-the-art performance on fine-grained object recognition.
arXiv Detail & Related papers (2022-12-28T03:45:56Z)
- Regional Semantic Contrast and Aggregation for Weakly Supervised Semantic Segmentation [25.231470587575238]
We propose regional semantic contrast and aggregation (RCA) for learning semantic segmentation.
RCA is equipped with a regional memory bank to store massive, diverse object patterns appearing in training data.
RCA gains a strong capability for fine-grained semantic understanding and ultimately establishes new state-of-the-art results on two popular benchmarks.
arXiv Detail & Related papers (2022-03-17T23:29:03Z)
- Domain Generalization via Shuffled Style Assembly for Face Anti-Spoofing [69.80851569594924]
Generalizable face anti-spoofing (FAS) has drawn growing attention.
In this work, we separate the complete representation into content and style ones.
A novel Shuffled Style Assembly Network (SSAN) is proposed to extract and reassemble different content and style features.
arXiv Detail & Related papers (2022-03-10T12:44:05Z)
- Deep Miner: A Deep and Multi-branch Network which Mines Rich and Diverse Features for Person Re-identification [7.068680287596106]
Deep Miner is a method that allows CNNs to "mine" richer and more diverse features about people.
It produces a model that significantly outperforms state-of-the-art (SOTA) re-identification methods.
arXiv Detail & Related papers (2021-02-18T13:30:23Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information it contains and is not responsible for any consequences of its use.