Unleashing the Temporal Potential of Stereo Event Cameras for Continuous-Time 3D Object Detection
- URL: http://arxiv.org/abs/2508.02288v1
- Date: Mon, 04 Aug 2025 10:57:03 GMT
- Title: Unleashing the Temporal Potential of Stereo Event Cameras for Continuous-Time 3D Object Detection
- Authors: Jae-Young Kang, Hoonhee Cho, Kuk-Jin Yoon
- Abstract summary: Event cameras offer a solution by capturing motion continuously. We propose a novel stereo 3D object detection framework that relies solely on event cameras. Experiments show that our method outperforms prior approaches in dynamic environments.
- Score: 44.479946706395694
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: 3D object detection is essential for autonomous systems, enabling precise localization and dimension estimation. While LiDAR and RGB cameras are widely used, their fixed frame rates create perception gaps in high-speed scenarios. Event cameras, with their asynchronous nature and high temporal resolution, offer a solution by capturing motion continuously. A recent approach that integrates event cameras with conventional sensors for continuous-time detection struggles in fast-motion scenarios because it depends on synchronized sensors. We propose a novel stereo 3D object detection framework that relies solely on event cameras, eliminating the need for conventional 3D sensors. To compensate for the lack of semantic and geometric information in event data, we introduce a dual filter mechanism that extracts both (sketched below). Additionally, we enhance regression by aligning bounding boxes with object-centric information. Experiments show that our method outperforms prior approaches in dynamic environments, demonstrating the potential of event cameras for robust, continuous-time 3D perception. The code is available at https://github.com/mickeykang16/Ev-Stereo3D.
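To ground the two ingredients the abstract names, here is a minimal PyTorch sketch: an asynchronous event stream is densified into a voxel grid, and a hypothetical "dual filter" splits the resulting features into parallel semantic and geometric branches. All function and class names, tensor shapes, and the sensor resolution are illustrative assumptions, not the authors' released implementation (see the repository above for that).

```python
import torch
import torch.nn as nn

def events_to_voxel_grid(events, num_bins, height, width):
    """Accumulate (x, y, t, polarity) events into a (num_bins, H, W) grid."""
    grid = torch.zeros(num_bins, height, width)
    x, y = events[:, 0].long(), events[:, 1].long()
    t, p = events[:, 2], events[:, 3]
    t_norm = (t - t.min()) / (t.max() - t.min() + 1e-9)   # rescale time to [0, 1]
    b = (t_norm * (num_bins - 1)).long()                  # temporal bin per event
    grid.index_put_((b, y, x), p, accumulate=True)        # scatter-add polarities
    return grid

class DualFilter(nn.Module):
    """Hypothetical dual-branch filter: one head for semantics, one for geometry."""
    def __init__(self, in_ch, out_ch):
        super().__init__()
        self.semantic = nn.Sequential(nn.Conv2d(in_ch, out_ch, 3, padding=1), nn.ReLU())
        self.geometric = nn.Sequential(nn.Conv2d(in_ch, out_ch, 3, padding=1), nn.ReLU())

    def forward(self, x):
        return self.semantic(x), self.geometric(x)

# Toy usage: in a stereo setup, left and right grids would feed a matching head.
events = torch.rand(1000, 4)                  # synthetic (x, y, t, p) events
events[:, 0] *= 345                           # x within a 346-pixel-wide sensor
events[:, 1] *= 259                           # y within a 260-pixel-tall sensor
grid = events_to_voxel_grid(events, num_bins=5, height=260, width=346)
semantic_feat, geometric_feat = DualFilter(5, 16)(grid.unsqueeze(0))
```

In a full stereo pipeline the two filtered feature maps from each camera would be matched across views to recover depth; the sketch stops at feature extraction.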
Related papers
- TAPIP3D: Tracking Any Point in Persistent 3D Geometry [25.357437591411347]
We introduce TAPIP3D, a novel approach for long-term 3D point tracking in monocular and RGB-D videos. TAPIP3D represents videos as camera-stabilized feature clouds, leveraging depth and camera motion information. Our 3D-centric formulation significantly improves performance over existing 3D point tracking methods.
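The "camera-stabilized" idea can be grounded with standard multi-view geometry: back-project each pixel with its depth into 3D, then move the points into a fixed world frame using the camera pose, so a static point keeps (nearly) the same coordinates in every frame. A minimal sketch, assuming known intrinsics K and a camera-to-world pose (R_cw, t_cw); this is generic geometry, not TAPIP3D's code.

```python
import numpy as np

def backproject_to_world(depth, K, R_cw, t_cw):
    """depth: (H, W) in metres; K: 3x3 intrinsics; (R_cw, t_cw): camera-to-world pose."""
    H, W = depth.shape
    u, v = np.meshgrid(np.arange(W), np.arange(H))
    pix = np.stack([u, v, np.ones_like(u)], axis=-1).reshape(-1, 3)  # homogeneous pixels
    rays = pix @ np.linalg.inv(K).T              # pixel rays in the camera frame
    pts_cam = rays * depth.reshape(-1, 1)        # scale each ray by its depth
    pts_world = pts_cam @ R_cw.T + t_cw          # express points in the fixed world frame
    return pts_world.reshape(H, W, 3)
```

Features attached to these stabilized points can then be matched over long horizons without being confounded by camera motion.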
arXiv Detail & Related papers (2025-04-20T19:09:43Z)
- Ev-3DOD: Pushing the Temporal Boundaries of 3D Object Detection with Event Cameras [45.01523776589879]
We introduce asynchronous event cameras into 3D object detection for the first time. We leverage their high temporal resolution and low bandwidth to enable high-speed 3D object detection. We introduce the first event-based 3D object detection dataset, DSEC-3DOD, which includes ground-truth 3D bounding boxes at 100 FPS.
arXiv Detail & Related papers (2025-02-26T23:51:25Z)
- EF-3DGS: Event-Aided Free-Trajectory 3D Gaussian Splatting [72.60992807941885]
Event cameras, inspired by biological vision, record pixel-wise intensity changes asynchronously with high temporal resolution. We propose Event-Aided Free-Trajectory 3DGS, which seamlessly integrates the advantages of event cameras into 3DGS. We evaluate our method on the public Tanks and Temples benchmark and a newly collected real-world dataset, RealEv-DAVIS.
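The asynchronous intensity-change behaviour described here follows the standard idealized event model: a pixel emits an event whenever its log intensity drifts from a stored reference by a contrast threshold C. Below is a toy frame-based simulator of that model, for intuition only; EF-3DGS itself consumes real event streams, and the threshold value is an assumption.

```python
import numpy as np

def simulate_events(frames, C=0.2, eps=1e-6):
    """frames: (T, H, W) intensities -> list of (t_index, y, x, polarity) events."""
    log_ref = np.log(frames[0] + eps)                 # per-pixel reference log intensity
    events = []
    for k in range(1, len(frames)):
        diff = np.log(frames[k] + eps) - log_ref
        ys, xs = np.where(np.abs(diff) >= C)          # pixels crossing the threshold
        for y, x in zip(ys, xs):
            events.append((k, y, x, 1 if diff[y, x] > 0 else -1))
            log_ref[y, x] += np.sign(diff[y, x]) * C  # step the reference by one threshold
    return events
```

A real sensor emits several events for a large change and timestamps them with microsecond resolution; one event per frame step is kept here for simplicity.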
arXiv Detail & Related papers (2024-10-20T13:44:24Z)
- EventEgo3D: 3D Human Motion Capture from Egocentric Event Streams [59.77837807004765]
This paper introduces a new problem, i.e., 3D human motion capture from an egocentric monocular event camera with a fisheye lens.
Event streams have high temporal resolution and provide reliable cues for 3D human motion capture under high-speed human motions and rapidly changing illumination.
Our EE3D demonstrates robustness and superior 3D accuracy compared to existing solutions while supporting real-time 3D pose update rates of 140 Hz.
arXiv Detail & Related papers (2024-04-12T17:59:47Z)
- Dual Memory Aggregation Network for Event-Based Object Detection with Learnable Representation [79.02808071245634]
Event-based cameras are bio-inspired sensors that capture brightness changes of every pixel in an asynchronous manner.
Event streams are divided into grids along the x-y-t coordinates for both positive and negative polarity, producing a set of pillars as a 3D tensor representation (sketched below).
Long memory is encoded in the hidden state of adaptive convLSTMs, while short memory is modeled by computing spatial-temporal correlation between event pillars.
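The pillar representation described above can be sketched directly: bin events on an x-y-t grid, separately per polarity, yielding a dense tensor. A minimal version with assumed bin and cell sizes; the paper's actual grid parameters and learnable representation are not reproduced here.

```python
import numpy as np

def events_to_pillars(events, sensor_hw, t_bins=8, cell=4):
    """events: (N, 4) rows of (x, y, t, p) with p in {-1, +1}."""
    H, W = sensor_hw
    gh, gw = H // cell, W // cell                        # spatial grid of pillars
    pillars = np.zeros((2, t_bins, gh, gw))              # axes: polarity, t, y, x
    t = events[:, 2]
    tb = ((t - t.min()) / (t.max() - t.min() + 1e-9) * (t_bins - 1)).astype(int)
    gx = np.clip((events[:, 0] // cell).astype(int), 0, gw - 1)
    gy = np.clip((events[:, 1] // cell).astype(int), 0, gh - 1)
    pol = (events[:, 3] > 0).astype(int)                 # 0: negative, 1: positive
    np.add.at(pillars, (pol, tb, gy, gx), 1.0)           # scatter-add event counts
    return pillars
```

Per the description above, a ConvLSTM scanned along the t axis of such tensors would carry the long memory, while correlating pillars from adjacent time windows supplies the short-memory cue.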
arXiv Detail & Related papers (2023-03-17T12:12:41Z)
- Virtually increasing the measurement frequency of LIDAR sensor utilizing a single RGB camera [1.3706331473063877]
This research suggests using a mono camera to virtually enhance the frame rate of LIDARs.
We achieve state-of-the-art performance on large public datasets in terms of accuracy and similarity to real measurements.
arXiv Detail & Related papers (2023-02-10T11:43:35Z)
- Stereo Hybrid Event-Frame (SHEF) Cameras for 3D Perception [17.585862399941544]
Event cameras address these limitations, as they report brightness changes of each pixel independently with fine temporal resolution.
Integrated hybrid event-frame sensors (e.g., DAVIS) are available, but the quality of the data is compromised by pixel-level coupling in the circuit fabrication of such cameras.
This paper proposes a stereo hybrid event-frame (SHEF) camera system that offers a sensor modality with separate high-quality pure event and pure frame cameras.
arXiv Detail & Related papers (2021-10-11T04:03:36Z)
- Monocular Quasi-Dense 3D Object Tracking [99.51683944057191]
A reliable and accurate 3D tracking framework is essential for predicting future locations of surrounding objects and planning the observer's actions in numerous applications such as autonomous driving.
We propose a framework that can effectively associate moving objects over time and estimate their full 3D bounding box information from a sequence of 2D images captured on a moving platform.
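The association step such trackers depend on can be illustrated with a generic baseline: match current detections to existing tracks by solving an optimal assignment over a pairwise cost. The sketch below uses plain 3D box-center distance as the cost, whereas the paper learns quasi-dense similarities; the function name and distance gate are assumptions.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

def associate(track_centers, det_centers, max_dist=3.0):
    """track_centers: (N, 3), det_centers: (M, 3) box centers in metres."""
    cost = np.linalg.norm(track_centers[:, None] - det_centers[None, :], axis=-1)
    rows, cols = linear_sum_assignment(cost)       # optimal one-to-one matching
    # gate out implausibly distant pairs; unmatched detections spawn new tracks
    return [(r, c) for r, c in zip(rows, cols) if cost[r, c] <= max_dist]
```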
arXiv Detail & Related papers (2021-03-12T15:30:02Z)
- Event-based Stereo Visual Odometry [42.77238738150496]
We present a solution to the problem of visual odometry from the data acquired by a stereo event-based camera rig.
We seek to maximize the temporal consistency of stereo event-based data while using a simple and efficient representation.
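One simple and efficient representation commonly used for stereo event data, plausibly of the kind this abstract refers to, is the time surface: each pixel stores an exponential decay of the time since its most recent event, giving a smooth map that can be compared across the stereo pair. A minimal sketch with an assumed decay constant tau; not necessarily the paper's exact construction.

```python
import numpy as np

def time_surface(events, t_now, hw, tau=0.05):
    """events: (N, 3) rows of (x, y, t); returns an (H, W) map in [0, 1]."""
    H, W = hw
    last_t = np.full((H, W), -np.inf)            # most recent event time per pixel
    x = events[:, 0].astype(int)
    y = events[:, 1].astype(int)
    np.maximum.at(last_t, (y, x), events[:, 2])  # keep the newest timestamp per pixel
    return np.exp(-(t_now - last_t) / tau)       # recent events -> 1; silent pixels -> 0
```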
arXiv Detail & Related papers (2020-07-30T15:53:28Z)
This list is automatically generated from the titles and abstracts of the papers on this site.