Single-Frame based Deep View Synchronization for Unsynchronized
Multi-Camera Surveillance
- URL: http://arxiv.org/abs/2007.03891v3
- Date: Mon, 2 May 2022 16:45:46 GMT
- Title: Single-Frame based Deep View Synchronization for Unsynchronized
Multi-Camera Surveillance
- Authors: Qi Zhang and Antoni B. Chan
- Abstract summary: Multi-camera surveillance has been an active research topic for understanding and modeling scenes.
It is usually assumed that the cameras are all temporally synchronized when designing models for these multi-camera based tasks.
Our view synchronization models are applied to different DNN-based multi-camera vision tasks under the unsynchronized setting.
- Score: 56.964614522968226
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Multi-camera surveillance has been an active research topic for understanding
and modeling scenes. Compared to a single camera, multiple cameras provide a
larger field of view and more object cues, supporting applications such as
multi-view counting, multi-view tracking, 3D pose estimation, and 3D
reconstruction. Models for these multi-camera tasks usually assume that all
cameras are temporally synchronized. However, this assumption is not always
valid, especially for multi-camera systems with network transmission delays and
low frame rates due to limited network bandwidth, resulting in
desynchronization of the captured frames across cameras. To handle
unsynchronized multi-camera setups, in this paper we propose a synchronization
model that works in conjunction with existing DNN-based multi-view models, thus
avoiding a redesign of the whole model. Under the low-fps regime, we assume
that only a single relevant frame is available from each view, and
synchronization is achieved by matching image content across views guided by
epipolar geometry. We consider two variants of the model, depending on where in
the pipeline the synchronization occurs: scene-level synchronization and
camera-level synchronization. The view synchronization step and the
task-specific view fusion and prediction step are unified in the same framework
and trained in an end-to-end fashion. Our view synchronization models are
applied to different DNN-based multi-camera vision tasks under the
unsynchronized setting, including multi-view counting and 3D pose estimation,
and achieve good performance compared to baselines.
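As a concrete illustration of the epipolar-guided matching described above, here is a minimal sketch (a reconstruction for illustration only, not the authors' actual modules): assuming a known fundamental matrix between two views, the feature at a reference-view pixel is softly matched against candidate features in the other view, with appearance similarity down-weighted by distance from the corresponding epipolar line. The function names and the Gaussian weighting are assumptions.

```python
# Minimal sketch of epipolar-guided feature matching between two
# unsynchronized views (illustrative only; the paper's actual modules,
# layer shapes, and fusion heads are not reproduced here).
import numpy as np

def epipolar_line(F, x1):
    """Epipolar line in view 2 for homogeneous pixel x1 in view 1: l2 = F @ x1."""
    l = F @ x1
    return l / np.linalg.norm(l[:2])  # normalize so point-line distance is in pixels

def epipolar_guided_match(F, x1, feat1, feats2, pts2, sigma=2.0):
    """Attention-style matching of a view-1 feature against view-2 features,
    softly restricted to the epipolar line of x1.

    F      : (3, 3) fundamental matrix (view 1 -> view 2), assumed known
    x1     : (3,)   homogeneous pixel in view 1
    feat1  : (C,)   feature vector at x1
    feats2 : (N, C) candidate feature vectors in view 2
    pts2   : (N, 3) homogeneous pixel locations of the candidates
    """
    l2 = epipolar_line(F, x1)
    dist = np.abs(pts2 @ l2)                    # point-to-line distances (pixels)
    geom = np.exp(-dist**2 / (2 * sigma**2))    # epipolar weight
    sim = feats2 @ feat1                        # appearance similarity (dot product)
    logits = sim + np.log(geom + 1e-8)          # combine appearance + geometry
    attn = np.exp(logits - logits.max())
    attn /= attn.sum()
    return attn @ feats2                        # view-2 feature matched to x1

# Example with random data (F would come from calibration in practice):
rng = np.random.default_rng(0)
fused = epipolar_guided_match(np.eye(3), np.array([10.0, 20.0, 1.0]),
                              rng.normal(size=64),
                              rng.normal(size=(50, 64)),
                              np.hstack([rng.uniform(0, 100, (50, 2)),
                                         np.ones((50, 1))]))
```

The matched features can then be fused with the reference view's features for the downstream task head, which is the role of the scene-level or camera-level synchronization placement described in the abstract.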
Related papers
- Synchformer: Efficient Synchronization from Sparse Cues [100.89656994681934]
Our contributions include a novel audio-visual synchronization model and a training scheme that decouples feature extraction from synchronization modelling.
This approach achieves state-of-the-art performance in both dense and sparse settings.
We also extend synchronization model training to AudioSet, a million-scale 'in-the-wild' dataset, investigate evidence attribution techniques for interpretability, and explore a new capability for synchronization models: audio-visual synchronizability.
arXiv Detail & Related papers (2024-01-29T18:59:55Z)
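A hedged sketch of the decoupled training recipe mentioned above (an assumed setup, not Synchformer's published architecture): segment-level audio and visual features come from frozen extractors trained in a first stage, and only a lightweight transformer that classifies the temporal offset between the streams is trained for synchronization. All class and parameter names here are illustrative.

```python
# Hedged sketch of a "decoupled" two-stage synchronizer: stage 1 produces
# per-segment audio/visual features with frozen extractors; stage 2 trains
# a small transformer to classify the temporal offset between the streams.
import torch
import torch.nn as nn

class OffsetSynchronizer(nn.Module):
    def __init__(self, dim=512, n_offsets=21, n_layers=3):
        super().__init__()
        self.cls = nn.Parameter(torch.zeros(1, 1, dim))     # aggregation token
        enc_layer = nn.TransformerEncoderLayer(dim, nhead=8, batch_first=True)
        self.encoder = nn.TransformerEncoder(enc_layer, n_layers)
        self.head = nn.Linear(dim, n_offsets)               # discrete offset classes

    def forward(self, audio_feats, visual_feats):
        # audio_feats: (B, Ta, dim), visual_feats: (B, Tv, dim),
        # both precomputed by frozen extractors (the "decoupling").
        B = audio_feats.shape[0]
        tokens = torch.cat([self.cls.expand(B, -1, -1),
                            audio_feats, visual_feats], dim=1)
        out = self.encoder(tokens)
        return self.head(out[:, 0])  # offset logits read from the cls token

# Only the synchronizer's parameters are trained; extractor outputs are inputs.
logits = OffsetSynchronizer()(torch.randn(2, 16, 512), torch.randn(2, 16, 512))
```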
- Enabling Cross-Camera Collaboration for Video Analytics on Distributed Smart Cameras [7.609628915907225]
We present Argus, a distributed video analytics system with cross-camera collaboration on smart cameras.
We identify multi-camera, multi-target tracking as the primary task of multi-camera video analytics and develop a novel technique that avoids redundant, processing-heavy tasks.
Argus reduces the number of object identifications and end-to-end latency by up to 7.13x and 2.19x compared to the state-of-the-art.
arXiv Detail & Related papers (2024-01-25T12:27:03Z)
- SyncDreamer: Generating Multiview-consistent Images from a Single-view Image [59.75474518708409]
A novel diffusion model called SyncDreamer generates multiview-consistent images from a single-view image.
Experiments show that SyncDreamer generates images with high consistency across different views.
arXiv Detail & Related papers (2023-09-07T02:28:04Z)
- Sparse in Space and Time: Audio-visual Synchronisation with Trainable Selectors [103.21152156339484]
The objective of this paper is audio-visual synchronisation of general videos 'in the wild'.
We make four contributions: (i) in order to handle the longer temporal sequences required for sparse synchronisation signals, we design a multi-modal transformer model that employs 'selectors'.
We also identify artefacts introduced by the compression codecs used for audio and video, which audio-visual models can exploit during training to artificially solve the synchronisation task.
arXiv Detail & Related papers (2022-10-13T14:25:37Z)
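A hedged sketch of the 'selectors' named above (an assumed mechanism for illustration; the paper's exact design may differ): a small set of learnable query tokens cross-attends to a long feature sequence, compressing it to a few tokens so the downstream synchronization model stays tractable on long videos.

```python
# Hedged sketch of a trainable "selector": learnable queries cross-attend
# to a long multi-modal sequence and compress it into a few tokens before
# synchronization is predicted.
import torch
import torch.nn as nn

class Selector(nn.Module):
    def __init__(self, dim=256, n_selected=8, n_heads=4):
        super().__init__()
        self.queries = nn.Parameter(torch.randn(1, n_selected, dim) * 0.02)
        self.attn = nn.MultiheadAttention(dim, n_heads, batch_first=True)

    def forward(self, seq):
        # seq: (B, T, dim) with large T; returns (B, n_selected, dim)
        q = self.queries.expand(seq.shape[0], -1, -1)
        selected, _ = self.attn(q, seq, seq)
        return selected

# Long audio/visual streams are reduced to a handful of tokens each,
# keeping the downstream synchronization transformer cheap.
tokens = Selector()(torch.randn(2, 1000, 256))  # -> (2, 8, 256)
```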
- MultiBodySync: Multi-Body Segmentation and Motion Estimation via 3D Scan Synchronization [61.015704878681795]
We present a novel, end-to-end trainable multi-body motion segmentation and rigid registration framework for 3D point clouds.
The two non-trivial challenges posed by this multi-scan, multi-body setting are (i) guaranteeing correspondence and segmentation consistency across multiple input point clouds and (ii) obtaining robust motion-based rigid body segmentation applicable to novel object categories.
arXiv Detail & Related papers (2021-01-17T06:36:28Z)
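The rigid-registration step in such a multi-body pipeline can be illustrated with the standard Kabsch/Procrustes fit below (a generic building block, not MultiBodySync's full synchronization formulation): once correspondences are consistent across scans and points are segmented into rigid bodies, each body's motion reduces to one least-squares fit.

```python
# Standard Kabsch/Procrustes fit for the rigid-registration step
# (generic building block; MultiBodySync's synchronization pipeline
# is not reproduced here).
import numpy as np

def fit_rigid(P, Q):
    """Least-squares rigid transform (R, t) aligning P onto Q.
    P, Q: (N, 3) corresponding points belonging to one rigid body."""
    cP, cQ = P.mean(0), Q.mean(0)
    H = (P - cP).T @ (Q - cQ)                 # 3x3 cross-covariance
    U, _, Vt = np.linalg.svd(H)
    D = np.diag([1.0, 1.0, np.sign(np.linalg.det(Vt.T @ U.T))])  # fix reflection
    R = Vt.T @ D @ U.T
    t = cQ - R @ cP
    return R, t

# Per-segment usage: each rigid body's motion is recovered by one such fit.
R, t = fit_rigid(np.random.rand(100, 3), np.random.rand(100, 3))
```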
- Asynchronous Multi-View SLAM [78.49842639404413]
Existing multi-camera SLAM systems assume synchronized shutters for all cameras, which is often not the case in practice.
Our framework integrates a continuous-time motion model to relate information across asynchronous multi-frames during tracking, local mapping, and loop closing.
arXiv Detail & Related papers (2021-01-17T00:50:01Z)
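A minimal sketch of the continuous-time idea (assumed formulation for illustration; the paper's actual motion model may differ): the trajectory is anchored at a few knot times, and the pose at any camera's true capture timestamp is interpolated, so asynchronous frames from all cameras can be related to a single trajectory. SciPy's Slerp handles the rotation interpolation.

```python
# Hedged sketch of a continuous-time motion model: poses are anchored at
# a few timestamps, and the pose for any camera's actual capture time is
# interpolated, relating asynchronous frames to one trajectory.
import numpy as np
from scipy.spatial.transform import Rotation, Slerp

anchor_t = np.array([0.0, 0.1, 0.2])                          # knot times (s)
anchor_R = Rotation.from_euler("z", [0, 5, 12], degrees=True)  # knot rotations
anchor_p = np.array([[0, 0, 0], [0.1, 0, 0], [0.25, 0, 0]])    # knot positions (m)

slerp = Slerp(anchor_t, anchor_R)

def pose_at(t):
    """Interpolated (R, p) at time t: slerp for rotation, lerp for position."""
    R = slerp([t])[0]
    p = np.array([np.interp(t, anchor_t, anchor_p[:, i]) for i in range(3)])
    return R, p

# Each camera's frame, captured at its own timestamp, gets a consistent pose:
R_cam, p_cam = pose_at(0.137)
```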