Synchronized Smartphone Video Recording System of Depth and RGB Image Frames with Sub-millisecond Precision
- URL: http://arxiv.org/abs/2111.03552v1
- Date: Fri, 5 Nov 2021 15:16:54 GMT
- Title: Synchronized Smartphone Video Recording System of Depth and RGB Image Frames with Sub-millisecond Precision
- Authors: Marsel Faizullin, Anastasiia Kornilova, Azat Akhmetyanov, Konstantin Pakulev, Andrey Sadkov and Gonzalo Ferrer
- Abstract summary: We propose a recording system with high time synchronization (sync) precision.
It consists of heterogeneous sensors such as a smartphone, a depth camera, and an IMU.
- Score: 2.1286051580524523
- License: http://creativecommons.org/licenses/by-sa/4.0/
- Abstract: In this paper, we propose a recording system with high time synchronization
(sync) precision, consisting of heterogeneous sensors such as a smartphone, a depth
camera, and an IMU. Given the general interest in and mass adoption of smartphones,
we include at least one such device in our system. This heterogeneous system requires
hybrid synchronization between two different time authorities, the smartphone and an
MCU: we combine a hardware wired trigger sync with a software sync. We evaluate our
sync results on a custom and novel system that mixes an active infrared depth camera
with an RGB camera. Our system achieves sub-millisecond time-sync precision. Moreover,
our system exposes every RGB-depth image pair at the same time with this precision.
We showcase one particular configuration, but the general principles behind our system
can be replicated by other projects.
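To make the hybrid scheme concrete, the sketch below is a hypothetical illustration rather than the authors' released code: it assumes each hardware trigger pulse is time-stamped independently by the MCU and by the smartphone, and fits a linear clock model (drift and offset) between the two time authorities by least squares, so MCU-stamped frames can be placed on the smartphone's timeline. The function names, variable names, and synthetic numbers are invented for illustration.

```python
# Hypothetical sketch (not the paper's implementation): estimate the clock
# offset and drift between an MCU time authority and a smartphone time
# authority from pairs of timestamps recorded for the same trigger pulses.
import numpy as np

def fit_clock_model(mcu_ts, phone_ts):
    """Fit phone_time ~= drift * mcu_time + offset by ordinary least squares.

    mcu_ts, phone_ts: 1-D arrays of timestamps (seconds) for the same
    hardware trigger pulses, one pair per pulse.
    """
    mcu_ts = np.asarray(mcu_ts, dtype=float)
    phone_ts = np.asarray(phone_ts, dtype=float)
    A = np.column_stack([mcu_ts, np.ones_like(mcu_ts)])
    (drift, offset), *_ = np.linalg.lstsq(A, phone_ts, rcond=None)
    return drift, offset

def mcu_to_phone(t_mcu, drift, offset):
    """Map an MCU timestamp into the smartphone time base."""
    return drift * t_mcu + offset

if __name__ == "__main__":
    # Synthetic example: triggers every 33 ms; the phone clock runs with a
    # small drift and offset plus sub-millisecond timestamping noise.
    rng = np.random.default_rng(0)
    mcu = np.arange(0.0, 10.0, 0.033)
    phone = 1.00002 * mcu + 12.345 + rng.normal(0.0, 2e-4, mcu.size)
    drift, offset = fit_clock_model(mcu, phone)
    residual = phone - mcu_to_phone(mcu, drift, offset)
    print(f"drift={drift:.6f}, offset={offset:.3f} s, "
          f"residual std={residual.std() * 1e3:.3f} ms")
```

Under these assumptions, the residuals of such a fit are what ultimately bound the achievable sync precision between the two clocks.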
Related papers
- RocSync: Millisecond-Accurate Temporal Synchronization for Heterogeneous Camera Systems [38.099313678683224]
We present a low-cost, general-purpose synchronization method that achieves millisecond-level temporal alignment across diverse camera systems.
The proposed solution employs a custom-built LED Clock that encodes time through red and infrared light, allowing visual decoding of the exposure window.
We validate the system in large-scale surgical recordings involving over 25 heterogeneous cameras spanning both IR and RGB modalities.
arXiv Detail & Related papers (2025-11-18T22:13:06Z)
- When Every Millisecond Counts: Real-Time Anomaly Detection via the Multimodal Asynchronous Hybrid Network [42.72133852384352]
We introduce real-time anomaly detection for autonomous driving, prioritizing both minimal response time and high accuracy.
We propose a novel multimodal asynchronous hybrid network that combines event streams from event cameras with image data from RGB cameras.
Our approach outperforms existing methods in both accuracy and response time, achieving millisecond-level real-time performance.
arXiv Detail & Related papers (2025-06-20T19:58:38Z)
- JavisDiT: Joint Audio-Video Diffusion Transformer with Hierarchical Spatio-Temporal Prior Synchronization [94.82127738291749]
JavisDiT is able to generate high-quality audio and video content simultaneously from open-ended user prompts.
A new benchmark, JavisBench, consists of 10,140 high-quality text-captioned sounding videos spanning diverse scenes and complex real-world scenarios.
arXiv Detail & Related papers (2025-03-30T09:40:42Z)
- Multi-modal Multi-platform Person Re-Identification: Benchmark and Method [58.59888754340054]
MP-ReID is a novel dataset designed specifically for multi-modality and multi-platform ReID.
This benchmark compiles data from 1,930 identities across diverse modalities, including RGB, infrared, and thermal imaging.
We introduce Uni-Prompt ReID, a framework with specifically designed prompts tailored for cross-modality and cross-platform scenarios.
arXiv Detail & Related papers (2025-03-21T12:27:49Z)
- Event-based Asynchronous HDR Imaging by Temporal Incident Light Modulation [54.64335350932855]
We propose a Pixel-Asynchronous HDR imaging system, based on key insights into the challenges in HDR imaging.
Our proposed Asyn system integrates the Dynamic Vision Sensors (DVS) with a set of LCD panels.
The LCD panels modulate the irradiance incident upon the DVS by altering their transparency, thereby triggering the pixel-independent event streams.
arXiv Detail & Related papers (2024-03-14T13:45:09Z)
- An Asynchronous Linear Filter Architecture for Hybrid Event-Frame Cameras [9.69495347826584]
We present an asynchronous linear filter architecture, fusing event and frame camera data, for HDR video reconstruction and spatial convolution.
The proposed AKF pipeline outperforms other state-of-the-art methods in both absolute intensity error (69.4% reduction) and image similarity indexes (average 35.5% improvement).
arXiv Detail & Related papers (2023-09-03T12:37:59Z)
- Deep learning-based stereo camera multi-video synchronization [5.305803516459996]
A software-based synchronization method would reduce the cost, weight and size of the entire system.
This study paves the way to a production ready software-based video synchronization system.
arXiv Detail & Related papers (2023-03-22T21:14:36Z)
- Sparse in Space and Time: Audio-visual Synchronisation with Trainable Selectors [103.21152156339484]
The objective of this paper is audio-visual synchronisation of general videos 'in the wild'.
We make four contributions: (i) in order to handle longer temporal sequences required for sparse synchronisation signals, we design a multi-modal transformer model that employs 'selectors'.
We identify artefacts that can arise from the compression codecs used for audio and video and can be used by audio-visual models in training to artificially solve the synchronisation task.
arXiv Detail & Related papers (2022-10-13T14:25:37Z)
- Rolling Shutter Inversion: Bring Rolling Shutter Images to High Framerate Global Shutter Video [111.08121952640766]
This paper presents a novel deep-learning based solution to the RS temporal super-resolution problem.
By leveraging the multi-view geometry relationship of the RS imaging process, our framework successfully achieves high framerate GS generation.
Our method can produce high-quality GS image sequences with rich details, outperforming the state-of-the-art methods.
arXiv Detail & Related papers (2022-10-06T16:47:12Z)
- SmartPortraits: Depth Powered Handheld Smartphone Dataset of Human Portraits for State Estimation, Reconstruction and Synthesis [1.981491298222699]
We present a dataset of 1000 video sequences of human portraits recorded in real and uncontrolled conditions.
The collected dataset contains 200 people captured in different poses and locations.
The main purpose is to bridge the gap between raw measurements obtained from a smartphone and downstream applications.
arXiv Detail & Related papers (2022-04-21T15:47:38Z)
- Sub-millisecond Video Synchronization of Multiple Android Smartphones [2.283665431721732]
This paper addresses the problem of building an affordable easy-to-setup synchronized multi-view camera system.
We propose a solution for this problem - a publicly-available Android application for synchronized video recording on multiple smartphones with sub-millisecond accuracy.
arXiv Detail & Related papers (2021-07-02T11:56:33Z)
- Combining Events and Frames using Recurrent Asynchronous Multimodal Networks for Monocular Depth Prediction [51.072733683919246]
We introduce Recurrent Asynchronous Multimodal (RAM) networks to handle asynchronous and irregular data from multiple sensors.
Inspired by traditional RNNs, RAM networks maintain a hidden state that is updated asynchronously and can be queried at any time to generate a prediction.
We show an improvement over state-of-the-art methods by up to 30% in terms of mean depth absolute error.
arXiv Detail & Related papers (2021-02-18T13:24:35Z)
- CoMo: A novel co-moving 3D camera system [0.0]
CoMo is a co-moving camera system of two synchronized high speed cameras coupled with rotational stages.
We address the calibration of the external parameters measuring the position of the cameras and their three angles of yaw, pitch and roll in the system "home" configuration.
We evaluate the robustness and accuracy of the system by comparing reconstructed and measured 3D distances in what we call 3D tests, which show a relative error of the order of 1%.
arXiv Detail & Related papers (2021-01-26T13:29:13Z)
- Single-Frame based Deep View Synchronization for Unsynchronized Multi-Camera Surveillance [56.964614522968226]
Multi-camera surveillance has been an active research topic for understanding and modeling scenes.
It is usually assumed that the cameras are all temporally synchronized when designing models for these multi-camera based tasks.
Our view synchronization models are applied to different DNN-based multi-camera vision tasks under the unsynchronized setting.
arXiv Detail & Related papers (2020-07-08T04:39:38Z)
- Event-based Asynchronous Sparse Convolutional Networks [54.094244806123235]
Event cameras are bio-inspired sensors that respond to per-pixel brightness changes in the form of asynchronous and sparse "events".
We present a general framework for converting models trained on synchronous image-like event representations into asynchronous models with identical output.
We show both theoretically and experimentally that this drastically reduces the computational complexity and latency of high-capacity, synchronous neural networks.
arXiv Detail & Related papers (2020-03-20T08:39:49Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of this information and is not responsible for any consequences of its use.