Unsupervised Person Re-Identification with Wireless Positioning under
Weak Scene Labeling
- URL: http://arxiv.org/abs/2110.15610v2
- Date: Wed, 5 Apr 2023 11:07:23 GMT
- Title: Unsupervised Person Re-Identification with Wireless Positioning under
Weak Scene Labeling
- Authors: Yiheng Liu, Wengang Zhou, Qiaokang Xie, Houqiang Li
- Abstract summary: We propose to explore unsupervised person re-identification with both visual data and wireless positioning trajectories under weak scene labeling.
Specifically, we propose a novel unsupervised multimodal training framework (UMTF), which models the complementarity of visual data and wireless information.
Our UMTF contains a multimodal data association strategy (MMDA) and a multimodal graph neural network (MMGN).
- Score: 131.18390399368997
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Existing unsupervised person re-identification methods rely only on
visual clues to match pedestrians across different cameras. Since visual data is
inherently susceptible to occlusion, blur, clothing changes, etc., a promising
solution is to introduce heterogeneous data to compensate for these weaknesses
of visual data. Some works based on full-scene labeling introduce wireless
positioning to assist cross-domain person re-identification, but their GPS
labeling of entire monitoring scenes is laborious. To this end, we propose to
explore unsupervised person re-identification with both visual data and
wireless positioning trajectories under weak scene labeling, in which only the
locations of the cameras need to be known. Specifically, we propose a novel
unsupervised multimodal training framework (UMTF), which models the
complementarity of visual data and wireless information. Our UMTF contains a
multimodal data association strategy (MMDA) and a multimodal graph neural
network (MMGN). MMDA explores potential data associations in unlabeled
multimodal data, while MMGN propagates multimodal messages through the video
graph based on an adjacency matrix learned from histogram statistics of the
wireless data. Thanks to the robustness of the wireless data to visual noise
and the collaboration of the various modules, UMTF is able to learn a model
without any human labels on the data. Extensive experimental results on two
challenging datasets, i.e., WP-ReID and DukeMTMC-VideoReID, demonstrate the
effectiveness of the proposed method.
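The abstract describes an adjacency matrix derived from histogram statistics of wireless data, which then drives message passing over a video graph. A minimal sketch of that idea follows; the histogram-intersection affinity, the function names, and all parameters are illustrative assumptions, not the authors' actual MMGN implementation:

```python
import numpy as np

def histogram_affinity(dist_samples_i, dist_samples_j, bins=10, value_range=(0.0, 50.0)):
    """Affinity between two video tracklets, computed from histograms of their
    distance samples to wireless positioning trajectories (illustrative)."""
    h_i, _ = np.histogram(dist_samples_i, bins=bins, range=value_range)
    h_j, _ = np.histogram(dist_samples_j, bins=bins, range=value_range)
    h_i = h_i / max(h_i.sum(), 1)   # normalize counts to a distribution
    h_j = h_j / max(h_j.sum(), 1)
    # Histogram intersection: 1.0 when the two distance distributions match.
    return float(np.minimum(h_i, h_j).sum())

def propagate(features, affinity):
    """One round of message passing over the video graph: add self-loops,
    row-normalize the adjacency, then mix node features."""
    adj = affinity + np.eye(affinity.shape[0])
    adj = adj / adj.sum(axis=1, keepdims=True)
    return adj @ features

# Toy example: 3 tracklets with 4-dim visual features and 100 distance samples each.
rng = np.random.default_rng(0)
feats = rng.normal(size=(3, 4))
dists = [rng.uniform(0, 50, size=100) for _ in range(3)]
aff = np.array([[histogram_affinity(dists[i], dists[j]) for j in range(3)]
                for i in range(3)])
out = propagate(feats, aff)
```

Messages flow most strongly between tracklets whose wireless distance statistics agree, which is how wireless information can correct purely visual mismatches.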
Related papers
- Few-shot Message-Enhanced Contrastive Learning for Graph Anomaly Detection [15.757864894708364]
Graph anomaly detection plays a crucial role in identifying exceptional instances in graph data that deviate significantly from the majority.
We propose a novel few-shot Graph Anomaly Detection model called FMGAD.
We show that FMGAD achieves better performance than other state-of-the-art methods on both artificially injected and domain-organic anomalies.
arXiv Detail & Related papers (2023-11-17T07:49:20Z)
- ADAMM: Anomaly Detection of Attributed Multi-graphs with Metadata: A Unified Neural Network Approach [39.211176955683285]
We propose ADAMM, a novel graph neural network model that handles directed multi-graphs.
ADAMM fuses metadata and graph-level representation learning through an unsupervised anomaly detection objective.
arXiv Detail & Related papers (2023-11-13T14:19:36Z)
- MMRDN: Consistent Representation for Multi-View Manipulation Relationship Detection in Object-Stacked Scenes [62.20046129613934]
We propose a novel multi-view fusion framework, namely the multi-view MRD network (MMRDN).
We project the 2D data from different views into a common hidden space and fit the embeddings with a set of von Mises-Fisher distributions.
We select a set of $K$ Maximum Vertical Neighbors (KMVN) points from the point cloud of each object pair, which encode the relative position of the two objects.
arXiv Detail & Related papers (2023-04-25T05:55:29Z)
- Unified Visual Relationship Detection with Vision and Language Models [89.77838890788638]
This work focuses on training a single visual relationship detector predicting over the union of label spaces from multiple datasets.
We propose UniVRD, a novel bottom-up method for Unified Visual Relationship Detection by leveraging vision and language models.
Empirical results on both human-object interaction detection and scene-graph generation demonstrate the competitive performance of our model.
arXiv Detail & Related papers (2023-03-16T00:06:28Z)
- ViFiCon: Vision and Wireless Association Via Self-Supervised Contrastive Learning [5.5232283752707785]
ViFiCon is a self-supervised contrastive learning scheme which uses synchronized information across vision and wireless modalities to perform cross-modal association.
We show that ViFiCon achieves high-performance vision-to-wireless association, finding which bounding box corresponds to which smartphone device.
arXiv Detail & Related papers (2022-10-11T15:04:05Z)
- From Unsupervised to Few-shot Graph Anomaly Detection: A Multi-scale Contrastive Learning Approach [26.973056364587766]
Anomaly detection from graph data is an important data mining task in many applications such as social networks, finance, and e-commerce.
We propose a novel graph ANomaly dEtection framework with Multi-scale cONtrastive lEarning (ANEMONE for short).
By using a graph neural network as a backbone to encode the information from multiple graph scales (views), we learn better representation for nodes in a graph.
arXiv Detail & Related papers (2022-02-11T09:45:11Z)
- Unsupervised Domain Adaptive Learning via Synthetic Data for Person Re-identification [101.1886788396803]
Person re-identification (re-ID) has attracted increasing attention due to its widespread applications in video surveillance.
Unfortunately, the mainstream deep learning methods still need a large quantity of labeled data to train models.
In this paper, we develop a data collector to automatically generate synthetic re-ID samples in a computer game, and construct a data labeler to simultaneously annotate them.
arXiv Detail & Related papers (2021-09-12T15:51:41Z)
- Visual Distant Supervision for Scene Graph Generation [66.10579690929623]
Scene graph models usually require supervised learning on large quantities of labeled data with intensive human annotation.
We propose visual distant supervision, a novel paradigm of visual relation learning, which can train scene graph models without any human-labeled data.
Comprehensive experimental results show that our distantly supervised model outperforms strong weakly supervised and semi-supervised baselines.
arXiv Detail & Related papers (2021-03-29T06:35:24Z)
- Vision Meets Wireless Positioning: Effective Person Re-identification with Recurrent Context Propagation [120.18969251405485]
Existing person re-identification methods rely on visual sensors to capture pedestrians.
A mobile phone can be sensed by WiFi and cellular networks in the form of a wireless positioning signal.
We propose a novel recurrent context propagation module that enables information to propagate between visual data and wireless positioning data.
arXiv Detail & Related papers (2020-08-10T14:19:15Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.