Graph Neural Networks for Cross-Camera Data Association
- URL: http://arxiv.org/abs/2201.06311v1
- Date: Mon, 17 Jan 2022 09:52:39 GMT
- Title: Graph Neural Networks for Cross-Camera Data Association
- Authors: Elena Luna, Juan C. SanMiguel, José M. Martínez, and Pablo Carballeira
- Abstract summary: Cross-camera image data association is essential for many multi-camera computer vision tasks.
This paper proposes an efficient approach to cross-camera data association that focuses on a global solution.
- Score: 3.490148531239259
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Cross-camera image data association is essential for many multi-camera
computer vision tasks, such as multi-camera pedestrian detection, multi-camera
multi-target tracking, 3D pose estimation, etc. This association task is
typically stated as a bipartite graph matching problem and often solved by
applying minimum-cost flow techniques, which may be computationally inefficient
with large data. Furthermore, cameras are usually processed in pairs, yielding
local solutions rather than a single global solution. Another key issue
is the affinity measurement: non-learnable, pre-defined distances, such as
the Euclidean and cosine distances, are in widespread use. This paper
proposes an efficient approach to cross-camera data association focused on a
global solution, instead of processing cameras in pairs. To avoid the usage of
fixed distances, we leverage the connectivity of Graph Neural Networks,
previously unused in this scope, using a Message Passing Network to jointly
learn features and similarity. We validate the proposal for pedestrian
multi-view association, showing results over the EPFL multi-camera pedestrian
dataset. Our approach considerably outperforms the data association
techniques in the literature, without needing to be trained in the same
scenario in which it is tested. Our code is available at
http://www-vpu.eps.uam.es/publications/gnn_cca.
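For orientation, the pairwise baseline that the abstract argues against can be sketched as follows. This is a minimal illustration, not the paper's method: it matches detections from two cameras by minimizing a fixed cosine distance, and it uses brute-force enumeration in place of a minimum-cost flow solver. The function names and the toy feature vectors are hypothetical.

```python
import math
from itertools import permutations


def cosine_distance(u, v):
    """Fixed, non-learnable affinity: 1 - cosine similarity."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / (norm_u * norm_v)


def match_pair(feats_a, feats_b):
    """Minimum-cost one-to-one matching between two cameras.

    Brute force over all assignments, so only suitable for tiny inputs.
    Assumes len(feats_a) <= len(feats_b) and non-zero feature vectors.
    """
    cost = [[cosine_distance(u, v) for v in feats_b] for u in feats_a]
    best_cost, best_perm = math.inf, None
    for perm in permutations(range(len(feats_b)), len(feats_a)):
        total = sum(cost[i][j] for i, j in enumerate(perm))
        if total < best_cost:
            best_cost, best_perm = total, perm
    return list(enumerate(best_perm)), best_cost


# Toy appearance features (hypothetical) for two detections per camera.
cam_a = [[1.0, 0.0], [0.0, 1.0]]
cam_b = [[0.1, 0.9], [0.9, 0.1]]
matches, total_cost = match_pair(cam_a, cam_b)
print(matches)  # pairs of (index in cam_a, index in cam_b)
```

With more than two cameras, this procedure has to be repeated for every camera pair, and the resulting pairwise matches are not guaranteed to agree with each other, which is the local-solution issue the abstract describes. For realistic input sizes, a polynomial-time solver such as `scipy.optimize.linear_sum_assignment` would replace the enumeration.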
Related papers
- Cross-Camera Data Association via GNN for Supervised Graph Clustering [0.0]
Cross-camera data association is one of the cornerstones of the multi-camera computer vision field.
We propose supervised clustering of the affinity graph, where nodes are instances captured by all cameras.
We leverage the advantages of GNN (Graph Neural Network) architecture to examine nodes' relations and generate representative edge embeddings.
Our proposed method, named SGC-CCA, outperformed the state-of-the-art method named GNN-CCA across all clustering metrics.
arXiv Detail & Related papers (2024-10-01T12:52:54Z)
- Multi-Camera Multi-Person Association using Transformer-Based Dense Pixel Correspondence Estimation and Detection-Based Masking [1.0937094979510213]
Multi-camera Association (MCA) is the task of identifying objects and individuals across camera views.
We investigate a novel multi-camera multi-target association algorithm based on dense pixel correspondence estimation.
Our results show that the algorithm performs exceptionally well when associating pedestrians across camera pairs that are positioned close to each other.
arXiv Detail & Related papers (2024-08-17T20:58:16Z)
- Multi-View Person Matching and 3D Pose Estimation with Arbitrary Uncalibrated Camera Networks [36.49915280876899]
Cross-view person matching and 3D human pose estimation in multi-camera networks are difficult when the cameras are extrinsically uncalibrated.
Existing efforts require large amounts of 3D data for training neural networks or known camera poses for geometric constraints to solve the problem.
We present a method, PME, that solves the two tasks without requiring either information.
arXiv Detail & Related papers (2023-12-04T01:28:38Z)
- Enhancing Multi-Camera People Tracking with Anchor-Guided Clustering and Spatio-Temporal Consistency ID Re-Assignment [22.531044994763487]
We propose a novel multi-camera multi-person tracking method that uses anchor-guided clustering for cross-camera ID re-assignment.
Our approach aims to improve tracking accuracy by identifying key features that are unique to each individual.
The method has demonstrated robustness and effectiveness in handling both synthetic and real-world data.
arXiv Detail & Related papers (2023-04-19T07:38:15Z)
- Robust Multi-Object Tracking by Marginal Inference [92.48078680697311]
Multi-object tracking in videos requires solving a fundamental problem of one-to-one assignment between objects in adjacent frames.
We present an efficient approach to compute a marginal probability for each pair of objects in real time.
It achieves competitive results on MOT17 and MOT20 benchmarks.
arXiv Detail & Related papers (2022-08-07T14:04:45Z)
- Cross-Camera Trajectories Help Person Retrieval in a Camera Network [124.65912458467643]
Existing methods often rely on purely visual matching or consider temporal constraints but ignore the spatial information of the camera network.
We propose a pedestrian retrieval framework based on cross-camera generation, which integrates both temporal and spatial information.
To verify the effectiveness of our method, we construct the first cross-camera pedestrian trajectory dataset.
arXiv Detail & Related papers (2022-04-27T13:10:48Z)
- Cross-Camera Feature Prediction for Intra-Camera Supervised Person Re-identification across Distant Scenes [70.30052164401178]
Person re-identification (Re-ID) aims to match person images across non-overlapping camera views.
ICS-DS Re-ID uses cross-camera unpaired data with intra-camera identity labels for training.
A cross-camera feature prediction method mines cross-camera self-supervision information.
Joint learning of global-level and local-level features forms a global-local cross-camera feature prediction scheme.
arXiv Detail & Related papers (2021-07-29T11:27:50Z)
- DeepI2P: Image-to-Point Cloud Registration via Deep Classification [71.3121124994105]
DeepI2P is a novel approach for cross-modality registration between an image and a point cloud.
Our method estimates the relative rigid transformation between the coordinate frames of the camera and Lidar.
We circumvent the difficulty by converting the registration problem into a classification and inverse camera projection optimization problem.
arXiv Detail & Related papers (2021-04-08T04:27:32Z)
- Self-supervised Human Detection and Segmentation via Multi-view Consensus [116.92405645348185]
We propose a multi-camera framework in which geometric constraints are embedded in the form of multi-view consistency during training.
We show that our approach outperforms state-of-the-art self-supervised person detection and segmentation techniques on images that visually depart from those of standard benchmarks.
arXiv Detail & Related papers (2020-12-09T15:47:21Z)
- Anchor-free Small-scale Multispectral Pedestrian Detection [88.7497134369344]
We propose a method for effective and efficient multispectral fusion of the two modalities in an adapted single-stage anchor-free base architecture.
We aim at learning pedestrian representations based on object center and scale rather than direct bounding box predictions.
Results show our method's effectiveness in detecting small-scaled pedestrians.
arXiv Detail & Related papers (2020-08-19T13:13:01Z)
This list is automatically generated from the titles and abstracts of the papers in this site.