V2CE: Video to Continuous Events Simulator
- URL: http://arxiv.org/abs/2309.08891v2
- Date: Fri, 26 Apr 2024 21:59:30 GMT
- Title: V2CE: Video to Continuous Events Simulator
- Authors: Zhongyang Zhang, Shuyang Cui, Kaidong Chai, Haowen Yu, Subhasis Dasgupta, Upal Mahbub, Tauhidur Rahman
- Abstract summary: We present a novel method for video-to-events stream conversion that approaches the problem from multiple perspectives, considering the specific characteristics of the Dynamic Vision Sensor (DVS).
A series of carefully designed timestamp losses significantly enhances the quality of the generated event voxels.
We also propose a novel local dynamic-aware inference strategy to accurately recover event timestamps from event voxels in a continuous fashion.
- Score: 1.1009908861287052
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Dynamic Vision Sensor (DVS)-based solutions have recently garnered significant interest across various computer vision tasks, offering notable benefits in terms of dynamic range, temporal resolution, and inference speed. However, as a relatively nascent vision sensor compared to Active Pixel Sensor (APS) devices such as RGB cameras, DVS suffers from a dearth of ample labeled datasets. Prior efforts to convert APS data into events often grapple with issues such as a considerable domain shift from real events, the absence of quantified validation, and layering problems within the time axis. In this paper, we present a novel method for video-to-events stream conversion from multiple perspectives, considering the specific characteristics of DVS. A series of carefully designed losses helps enhance the quality of generated event voxels significantly. We also propose a novel local dynamic-aware timestamp inference strategy to accurately recover event timestamps from event voxels in a continuous fashion and eliminate the temporal layering problem. Results from rigorous validation through quantified metrics at all stages of the pipeline establish our method unquestionably as the current state-of-the-art (SOTA).
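To make the pipeline concrete, here is a minimal Python sketch of the two representations the abstract refers to: accumulating events into a temporal voxel grid (bilinear interpolation along the time axis is a common choice in the event-camera literature, though V2CE's exact discretization may differ), and a naive uniform inverse that exhibits exactly the temporal layering problem the local dynamic-aware timestamp inference strategy is designed to eliminate. The function names and the uniform-spreading baseline are illustrative assumptions, not the paper's implementation.

```python
import numpy as np

def events_to_voxel_grid(xs, ys, ts, ps, num_bins, height, width):
    """Accumulate events into a temporal voxel grid using bilinear
    interpolation along the time axis (a common event representation;
    V2CE's exact discretization may differ)."""
    voxel = np.zeros((num_bins, height, width), dtype=np.float32)
    # Normalize timestamps to the range [0, num_bins - 1].
    span = max(float(ts.max() - ts.min()), 1e-9)
    t_norm = (ts - ts.min()) / span * (num_bins - 1)
    lower = np.floor(t_norm).astype(int)
    frac = (t_norm - lower).astype(np.float32)
    upper = np.minimum(lower + 1, num_bins - 1)
    # Split each event's polarity between its two neighboring time bins.
    np.add.at(voxel, (lower, ys, xs), ps * (1.0 - frac))
    np.add.at(voxel, (upper, ys, xs), ps * frac)
    return voxel

def voxel_to_events_uniform(voxel, bin_duration):
    """Naive inverse: spread each cell's (rounded) count uniformly over its
    bin. This reconstruction causes the temporal layering artifact; V2CE's
    local dynamic-aware inference replaces this step."""
    events = []
    num_bins, _, _ = voxel.shape
    for b in range(num_bins):
        ys, xs = np.nonzero(voxel[b])
        for y, x in zip(ys, xs):
            count = int(round(abs(float(voxel[b, y, x]))))
            polarity = 1.0 if voxel[b, y, x] > 0 else -1.0
            # Uniform timestamps inside a bin ignore local scene dynamics,
            # so reconstructed events pile up in flat temporal layers.
            for t in np.linspace(b * bin_duration, (b + 1) * bin_duration,
                                 count, endpoint=False):
                events.append((x, y, t, polarity))
    events.sort(key=lambda e: e[2])
    return events
```

The contrast between the two functions is the point: the forward direction loses sub-bin timing, and any inverse must infer it back from context rather than assume uniformity.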
Related papers
- Path-adaptive Spatio-Temporal State Space Model for Event-based Recognition with Arbitrary Duration [9.547947845734992]
Event cameras are bio-inspired sensors that asynchronously capture intensity changes and output event streams.
We present a novel framework, dubbed PAST-Act, exhibiting superior capacity in recognizing events with arbitrary duration.
We also build a minute-level event-based recognition dataset, named ArDVS100, with arbitrary duration for the benefit of the community.
arXiv Detail & Related papers (2024-09-25T14:08:37Z)
- DailyDVS-200: A Comprehensive Benchmark Dataset for Event-Based Action Recognition [51.96660522869841]
DailyDVS-200 is a benchmark dataset tailored for the event-based action recognition community.
It covers 200 action categories across real-world scenarios, recorded by 47 participants, and comprises more than 22,000 event sequences.
DailyDVS-200 is annotated with 14 attributes, ensuring a detailed characterization of the recorded actions.
arXiv Detail & Related papers (2024-07-06T15:25:10Z)
- EventZoom: A Progressive Approach to Event-Based Data Augmentation for Enhanced Neuromorphic Vision [9.447299017563841]
Dynamic Vision Sensors (DVS) capture event data with high temporal resolution and low power consumption.
Event data augmentation serves as an essential method for overcoming the limitations of scale and diversity in event datasets.
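For reference, here is a minimal Python sketch of what stream-level event augmentation can look like: a random horizontal flip and random temporal scaling. EventZoom's progressive scheme is more sophisticated than this; the function name and parameter choices below are illustrative assumptions.

```python
import numpy as np

def augment_event_stream(events, width, flip_p=0.5,
                         t_scale_range=(0.8, 1.2), rng=None):
    """Two basic event augmentations (illustrative only, not EventZoom's
    method): random horizontal flip and random temporal scaling.
    `events` is an (N, 4) float array of (x, y, t, polarity) rows."""
    rng = rng or np.random.default_rng()
    out = events.astype(np.float64)
    if rng.random() < flip_p:
        out[:, 0] = (width - 1) - out[:, 0]               # mirror x coordinates
    scale = rng.uniform(*t_scale_range)
    out[:, 2] = (out[:, 2] - out[:, 2].min()) * scale     # stretch/compress time
    return out
```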
arXiv Detail & Related papers (2024-05-29T08:39:31Z)
- Motion Segmentation for Neuromorphic Aerial Surveillance [42.04157319642197]
Event cameras offer superior temporal resolution, high dynamic range, and minimal power requirements.
Unlike traditional frame-based sensors that capture redundant information at fixed intervals, event cameras asynchronously record pixel-level brightness changes.
We introduce a novel motion segmentation method that leverages self-supervised vision transformers on both event data and optical flow information.
arXiv Detail & Related papers (2024-05-24T04:36:13Z)
- An Event-Oriented Diffusion-Refinement Method for Sparse Events Completion [36.64856578682197]
Event cameras or dynamic vision sensors (DVS) record asynchronous responses to brightness changes instead of conventional intensity frames.
We propose an inventive event sequence completion approach that conforms to the unique characteristics of event data in both the processing stage and the output form.
Specifically, we treat event streams as 3D event clouds in the temporal domain, develop a diffusion-based generative model to generate dense clouds in a coarse-to-fine manner, and recover exact timestamps to maintain the temporal resolution of the raw data.
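The re-interpretation of an event stream as a point cloud is simple to state in code. The sketch below covers only that representational step, normalizing (x, y, t) into the unit cube; it is not the paper's diffusion-refinement model, and the function name is an assumption.

```python
import numpy as np

def events_to_cloud(events, height, width):
    """Re-interpret an (N, 4) array of (x, y, t, polarity) events as a 3D
    point cloud in the unit cube, keeping polarity as a per-point feature."""
    xyz = events[:, :3].astype(np.float64)
    xyz[:, 0] /= max(width - 1, 1)
    xyz[:, 1] /= max(height - 1, 1)
    t0, t1 = xyz[:, 2].min(), xyz[:, 2].max()
    xyz[:, 2] = (xyz[:, 2] - t0) / max(t1 - t0, 1e-9)
    return xyz, events[:, 3]   # (N, 3) coordinates plus polarity feature
```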
arXiv Detail & Related papers (2024-01-06T08:09:54Z)
- LEAP-VO: Long-term Effective Any Point Tracking for Visual Odometry [52.131996528655094]
We present the Long-term Effective Any Point Tracking (LEAP) module.
LEAP innovatively combines visual, inter-track, and temporal cues with mindfully selected anchors for dynamic track estimation.
Based on these traits, we develop LEAP-VO, a robust visual odometry system adept at handling occlusions and dynamic scenes.
arXiv Detail & Related papers (2024-01-03T18:57:27Z)
- Implicit Event-RGBD Neural SLAM [54.74363487009845]
Implicit neural SLAM has achieved remarkable progress recently.
Existing methods face significant challenges in non-ideal scenarios.
We propose EN-SLAM, the first event-RGBD implicit neural SLAM framework.
arXiv Detail & Related papers (2023-11-18T08:48:58Z)
- Event-based Simultaneous Localization and Mapping: A Comprehensive Survey [52.73728442921428]
This survey reviews event-based vSLAM algorithms that exploit the benefits of asynchronous and irregular event streams for localization and mapping tasks.
It categorizes event-based vSLAM methods into four main categories: feature-based, direct, motion-compensation, and deep-learning methods.
arXiv Detail & Related papers (2023-04-19T16:21:14Z)
- Dual Memory Aggregation Network for Event-Based Object Detection with Learnable Representation [79.02808071245634]
Event-based cameras are bio-inspired sensors that capture the brightness change of every pixel in an asynchronous manner.
Event streams are divided into grids in the x-y-t coordinates for both positive and negative polarity, producing a set of pillars as a 3D tensor representation.
Long memory is encoded in the hidden state of adaptive convLSTMs, while short memory is modeled by computing the spatial-temporal correlation between event pillars.
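The pillar construction described above amounts to polarity-separated counting over an x-y-t grid. A minimal sketch follows (a generic construction; the paper's exact binning and the function name are assumptions):

```python
import numpy as np

def events_to_pillars(xs, ys, ts, ps, num_bins, height, width):
    """Count events into a (2, num_bins, H, W) tensor, with channel 0 for
    negative and channel 1 for positive polarity -- one way to build the
    x-y-t pillar representation."""
    grid = np.zeros((2, num_bins, height, width), dtype=np.float32)
    span = max(float(ts.max() - ts.min()), 1e-9)
    t_bin = np.minimum(((ts - ts.min()) / span * num_bins).astype(int),
                       num_bins - 1)
    pol = (ps > 0).astype(int)          # 0 = negative, 1 = positive
    np.add.at(grid, (pol, t_bin, ys, xs), 1.0)
    return grid
```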
arXiv Detail & Related papers (2023-03-17T12:12:41Z)
- Ev-TTA: Test-Time Adaptation for Event-Based Object Recognition [7.814941658661939]
Ev-TTA is a simple, effective test-time adaptation method for event-based object recognition.
Our formulation can be successfully applied regardless of the input representation and extended to regression tasks.
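For intuition, here is a generic entropy-minimization test-time adaptation step in PyTorch, in the spirit of methods like TENT. Ev-TTA's actual objective, which also exploits consistency across event representations, is not reproduced here, and the function name is an assumption.

```python
import torch
import torch.nn.functional as F

def tta_step(model, batch, optimizer):
    """One generic test-time adaptation step: minimize the entropy of the
    model's predictions on unlabeled test data (illustrative; Ev-TTA's own
    losses differ)."""
    model.train()                       # let normalization layers adapt
    logits = model(batch)               # `batch`: any event representation
    probs = F.softmax(logits, dim=1)
    entropy = -(probs * probs.clamp_min(1e-8).log()).sum(dim=1).mean()
    optimizer.zero_grad()
    entropy.backward()
    optimizer.step()
    return logits.detach()
```

In practice such methods often restrict the optimizer to normalization-layer parameters so that adaptation cannot destroy the pretrained features.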
arXiv Detail & Related papers (2022-03-23T07:43:44Z)
- Learning Monocular Dense Depth from Events [53.078665310545745]
Event cameras report brightness changes in the form of a stream of asynchronous events instead of intensity frames.
Recent learning-based approaches have been applied to event-based data, such as monocular depth prediction.
We propose a recurrent architecture to solve this task and show significant improvement over standard feed-forward methods.
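The gain from recurrence comes from carrying a hidden state across consecutive event slices. A minimal ConvLSTM cell in PyTorch illustrates the mechanism (a sketch of the general idea, not the paper's architecture; class and variable names are assumptions):

```python
import torch
import torch.nn as nn

class ConvLSTMCell(nn.Module):
    """Minimal convolutional LSTM cell: the hidden state lets the network
    integrate event evidence over time instead of seeing each slice alone."""
    def __init__(self, in_ch, hid_ch, k=3):
        super().__init__()
        self.gates = nn.Conv2d(in_ch + hid_ch, 4 * hid_ch, k, padding=k // 2)

    def forward(self, x, state):
        h, c = state
        i, f, o, g = self.gates(torch.cat([x, h], dim=1)).chunk(4, dim=1)
        c = torch.sigmoid(f) * c + torch.sigmoid(i) * torch.tanh(g)
        h = torch.sigmoid(o) * torch.tanh(c)
        return h, (h, c)

# Usage: carry (h, c) across consecutive event-voxel tensors.
cell = ConvLSTMCell(in_ch=5, hid_ch=32)
state = (torch.zeros(1, 32, 64, 64), torch.zeros(1, 32, 64, 64))
for voxel in torch.randn(10, 1, 5, 64, 64):    # ten dummy time slices
    feat, state = cell(voxel, state)           # feat feeds a depth decoder
```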
arXiv Detail & Related papers (2020-10-16T12:36:23Z)