Graph Neural Networks for Cross-Camera Data Association
- URL: http://arxiv.org/abs/2201.06311v1
- Date: Mon, 17 Jan 2022 09:52:39 GMT
- Title: Graph Neural Networks for Cross-Camera Data Association
- Authors: Elena Luna, Juan C. SanMiguel, José M. Martínez, and Pablo
Carballeira
- Abstract summary: Cross-camera image data association is essential for many multi-camera computer vision tasks.
This paper proposes an efficient approach for cross-camera data association focused on a global solution.
- Score: 3.490148531239259
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Cross-camera image data association is essential for many multi-camera
computer vision tasks, such as multi-camera pedestrian detection, multi-camera
multi-target tracking, 3D pose estimation, etc. This association task is
typically stated as a bipartite graph matching problem and often solved by
applying minimum-cost flow techniques, which may be computationally inefficient
with large data. Furthermore, cameras are usually treated in pairs, obtaining
local solutions, rather than finding a global solution at once. Another key issue
is that of the affinity measurement: the widespread usage of non-learnable
pre-defined distances, such as the Euclidean and cosine ones. This paper
proposes an efficient approach for cross-camera data association focused on a
global solution, instead of processing cameras in pairs. To avoid the usage of
fixed distances, we leverage the connectivity of Graph Neural Networks,
previously unused in this scope, using a Message Passing Network to jointly
learn features and similarity. We validate the proposal for pedestrian
multi-view association, showing results over the EPFL multi-camera pedestrian
dataset. Our approach considerably outperforms data association techniques
from the literature, without needing to be trained in the same scenario in
which it is tested. Our code is available at
http://www-vpu.eps.uam.es/publications/gnn_cca.
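As a point of reference for the pairwise baseline that the abstract argues against, the following is a minimal sketch of two-camera association with a fixed cosine distance and a one-to-one matching. It is not the paper's method: the function names, the toy 2-D appearance vectors, and the exhaustive search over assignments (used here in place of a minimum-cost flow solver, and only practical for a handful of detections) are all assumptions made for illustration.

```python
import math
from itertools import permutations


def cosine_distance(u, v):
    """Return 1 - cos(u, v) for two non-zero vectors of equal length."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / (norm_u * norm_v)


def associate_pair(feats_a, feats_b):
    """Lowest-cost one-to-one matching between two cameras.

    Hypothetical helper: assumes len(feats_a) <= len(feats_b) and tries
    every assignment, so it only suits very small inputs.
    Returns (pairs, total_cost), where pairs holds (index_a, index_b).
    """
    # Fixed, non-learnable affinity: pairwise cosine distances.
    cost = [[cosine_distance(a, b) for b in feats_b] for a in feats_a]
    best_pairs, best_cost = None, math.inf
    for perm in permutations(range(len(feats_b)), len(feats_a)):
        total = sum(cost[i][j] for i, j in enumerate(perm))
        if total < best_cost:
            best_pairs, best_cost = list(enumerate(perm)), total
    return best_pairs, best_cost


# Toy appearance vectors (made up): two detections in camera A,
# three detections in camera B.
cam_a = [[1.0, 0.0], [0.0, 1.0]]
cam_b = [[0.1, 0.9], [0.9, 0.1], [0.7, 0.7]]
pairs, total = associate_pair(cam_a, cam_b)
print(pairs, round(total, 4))
```

With more than two cameras, this procedure would have to be repeated for every camera pair and the local results reconciled afterwards, which is the limitation the abstract contrasts with a single global solution and a learned similarity.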
Related papers
- Multi-View Person Matching and 3D Pose Estimation with Arbitrary
Uncalibrated Camera Networks [36.49915280876899]
Cross-view person matching and 3D human pose estimation in multi-camera networks are difficult when the cameras are extrinsically uncalibrated.
Existing efforts require large amounts of 3D data for training neural networks or known camera poses for geometric constraints to solve the problem.
We present a method, PME, that solves the two tasks without requiring either information.
arXiv Detail & Related papers (2023-12-04T01:28:38Z)
- Enhancing Multi-Camera People Tracking with Anchor-Guided Clustering and
Spatio-Temporal Consistency ID Re-Assignment [22.531044994763487]
We propose a novel multi-camera multiple people tracking method that uses anchor-guided clustering for cross-camera ID re-assignment.
Our approach aims to improve accuracy of tracking by identifying key features that are unique to every individual.
The method has demonstrated robustness and effectiveness in handling both synthetic and real-world data.
arXiv Detail & Related papers (2023-04-19T07:38:15Z)
- Robust Multi-Object Tracking by Marginal Inference [92.48078680697311]
Multi-object tracking in videos requires solving a fundamental problem of one-to-one assignment between objects in adjacent frames.
We present an efficient approach to compute a marginal probability for each pair of objects in real time.
It achieves competitive results on MOT17 and MOT20 benchmarks.
arXiv Detail & Related papers (2022-08-07T14:04:45Z)
- Cross-Camera Trajectories Help Person Retrieval in a Camera Network [124.65912458467643]
Existing methods often rely on purely visual matching or consider temporal constraints but ignore the spatial information of the camera network.
We propose a pedestrian retrieval framework based on cross-camera generation, which integrates both temporal and spatial information.
To verify the effectiveness of our method, we construct the first cross-camera pedestrian trajectory dataset.
arXiv Detail & Related papers (2022-04-27T13:10:48Z)
- Cross-Camera Feature Prediction for Intra-Camera Supervised Person
Re-identification across Distant Scenes [70.30052164401178]
Person re-identification (Re-ID) aims to match person images across non-overlapping camera views.
ICS-DS Re-ID uses cross-camera unpaired data with intra-camera identity labels for training.
A cross-camera feature prediction method is used to mine cross-camera self-supervision information.
Joint learning of global-level and local-level features forms a global-local cross-camera feature prediction scheme.
arXiv Detail & Related papers (2021-07-29T11:27:50Z)
- DeepI2P: Image-to-Point Cloud Registration via Deep Classification [71.3121124994105]
DeepI2P is a novel approach for cross-modality registration between an image and a point cloud.
Our method estimates the relative rigid transformation between the coordinate frames of the camera and Lidar.
We circumvent the difficulty by converting the registration problem into a classification and inverse camera projection optimization problem.
arXiv Detail & Related papers (2021-04-08T04:27:32Z)
- COTR: Correspondence Transformer for Matching Across Images [31.995943755283786]
We propose a novel framework for finding correspondences in images based on a deep neural network.
By doing so, one has the option to query only the points of interest and retrieve sparse correspondences, or to query all points in an image and obtain dense mappings.
arXiv Detail & Related papers (2021-03-25T22:47:02Z)
- Self-supervised Human Detection and Segmentation via Multi-view
Consensus [116.92405645348185]
We propose a multi-camera framework in which geometric constraints are embedded in the form of multi-view consistency during training.
We show that our approach outperforms state-of-the-art self-supervised person detection and segmentation techniques on images that visually depart from those of standard benchmarks.
arXiv Detail & Related papers (2020-12-09T15:47:21Z)
- Anchor-free Small-scale Multispectral Pedestrian Detection [88.7497134369344]
We propose a method for effective and efficient multispectral fusion of the two modalities in an adapted single-stage anchor-free base architecture.
We aim at learning pedestrian representations based on object center and scale rather than direct bounding box predictions.
Results show our method's effectiveness in detecting small-scale pedestrians.
arXiv Detail & Related papers (2020-08-19T13:13:01Z)
- Intelligent Querying for Target Tracking in Camera Networks using Deep
Q-Learning with n-Step Bootstrapping [11.221084462863894]
We formulate the target tracking problem in a camera network as an MDP and learn a reinforcement learning based policy that selects a camera for making a re-identification query.
The proposed approach to camera selection does not assume the knowledge of the camera network topology but the resulting policy implicitly learns it.
arXiv Detail & Related papers (2020-04-20T20:49:52Z)
This list is automatically generated from the titles and abstracts of the papers on this site.