Related papers: YCB-Ev SD: Synthetic event-vision dataset for 6DoF object pose estimation

URL: http://arxiv.org/abs/2511.11344v1
Date: Fri, 14 Nov 2025 14:32:03 GMT
Title: YCB-Ev SD: Synthetic event-vision dataset for 6DoF object pose estimation
Authors: Pavel Rojtberg, Julius Kühn
Abstract summary: YCB-Ev SD is a dataset of event-camera data at standard definition (SD) resolution for 6DoF object pose estimation. We present 50,000 event sequences of 34 ms duration each, synthesized from Physically Based Rendering scenes.
Score: 2.1485350418225244
License: http://creativecommons.org/licenses/by-sa/4.0/
Abstract: We introduce YCB-Ev SD, a synthetic dataset of event-camera data at standard definition (SD) resolution for 6DoF object pose estimation. While synthetic data has become fundamental in frame-based computer vision, event-based vision lacks comparable comprehensive resources. Addressing this gap, we present 50,000 event sequences of 34 ms duration each, synthesized from Physically Based Rendering (PBR) scenes of YCB-Video objects following the Benchmark for 6D Object Pose (BOP) methodology. Our generation framework employs simulated linear camera motion to ensure complete scene coverage, including background activity. Through systematic evaluation of event representations for CNN-based inference, we demonstrate that time-surfaces with linear decay and dual-channel polarity encoding achieve superior pose estimation performance, outperforming exponential decay and single-channel alternatives by significant margins. Our analysis reveals that polarity information contributes most substantially to performance gains, while linear temporal encoding preserves critical motion information more effectively than exponential decay. The dataset is provided in a structured format with both raw event streams and precomputed optimal representations to facilitate immediate research use and reproducible benchmarking. The dataset is publicly available at https://huggingface.co/datasets/paroj/ycbev_sd.
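
The representation named in the abstract (a time-surface with linear decay and dual-channel polarity encoding) can be illustrated with a minimal sketch. This is not the authors' implementation: the event tuple layout `(x, y, t, polarity)`, the function name `time_surface`, the normalization to [0, 1], and the use of the 34 ms sequence duration as the decay window are all assumptions made for illustration. An exponential variant is included only to show what the comparison in the abstract refers to.

```python
import math
from typing import Iterable, List, Optional, Tuple

# Assumed event layout (hypothetical, not the dataset's file format):
# (x, y, timestamp in seconds, polarity in {-1, +1})
Event = Tuple[int, int, float, int]


def time_surface(
    events: Iterable[Event],
    width: int,
    height: int,
    t_ref: float,
    window: float,
    decay: str = "linear",
) -> List[List[List[float]]]:
    """Build a 2-channel time-surface indexed as surface[channel][y][x].

    Channel 0 holds positive-polarity events, channel 1 negative ones.
    Each pixel stores a weight derived from the age of its most recent
    event relative to t_ref; pixels without events stay at 0.
    """
    # Most recent timestamp per channel and pixel (None = no event seen).
    last: List[List[List[Optional[float]]]] = [
        [[None] * width for _ in range(height)] for _ in range(2)
    ]
    for x, y, t, polarity in events:
        if t > t_ref:
            continue  # ignore events after the reference time
        channel = 0 if polarity > 0 else 1
        previous = last[channel][y][x]
        if previous is None or t > previous:
            last[channel][y][x] = t

    surface = [[[0.0] * width for _ in range(height)] for _ in range(2)]
    for channel in range(2):
        for y in range(height):
            for x in range(width):
                t = last[channel][y][x]
                if t is None:
                    continue
                age = t_ref - t
                if decay == "linear":
                    # Weight falls linearly from 1 (newest) to 0 at `window`.
                    surface[channel][y][x] = max(0.0, 1.0 - age / window)
                elif decay == "exponential":
                    # Alternative: exponential decay with time constant `window`.
                    surface[channel][y][x] = math.exp(-age / window)
                else:
                    raise ValueError(f"unknown decay: {decay}")
    return surface


# Usage on three made-up events inside one assumed 34 ms window.
demo_events = [(0, 0, 0.0, 1), (1, 0, 0.017, -1), (0, 0, 0.034, 1)]
demo = time_surface(demo_events, width=2, height=1, t_ref=0.034, window=0.034)
```

A single-channel variant would merge both polarities into one map and so discard the sign of the brightness change, which the abstract reports as the largest contributor to pose estimation performance.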

Related papers

UniE2F: A Unified Diffusion Framework for Event-to-Frame Reconstruction with Video Foundation Models [67.24086328473437]
Event cameras excel at recording relative intensity changes rather than absolute intensity. The resulting data streams suffer from a significant loss of spatial information and static texture details. We address this limitation by leveraging a pre-trained video diffusion model to reconstruct high-fidelity video frames from sparse event data.
arXiv Detail & Related papers (2026-02-22T14:06:49Z)
EPRBench: A High-Quality Benchmark Dataset for Event Stream Based Visual Place Recognition [54.55914886780534]
Event stream-based Visual Place Recognition (VPR) is an emerging research direction that offers a compelling solution to the instability of conventional visible-light cameras under challenging conditions such as low illumination, overexposure, and high-speed motion. We introduce EPRBench, a high-quality benchmark specifically designed for event stream-based VPR. EPRBench comprises 10K event sequences and 65K event frames, collected using both handheld and vehicle-mounted setups to comprehensively capture real-world challenges across diverse viewpoints, weather conditions, and lighting scenarios.
arXiv Detail & Related papers (2026-02-13T13:25:05Z)
Generative Spatiotemporal Data Augmentation [12.849046721804797]
We use video foundation models to generate realistic 3D spatial and temporal variations from an image dataset. Incorporating synthesized video clips as supplemental data yields consistent performance gains in low-data settings.
arXiv Detail & Related papers (2025-12-14T01:18:48Z)
4D-VLA: Spatiotemporal Vision-Language-Action Pretraining with Cross-Scene Calibration [31.111439909825627]
Existing methods typically model the dataset's action distribution using simple observations as inputs. We propose 4D-VLA, a novel approach that effectively integrates 4D information into the input to mitigate these sources of chaos. Our model consistently outperforms existing methods, demonstrating stronger spatial understanding and adaptability.
arXiv Detail & Related papers (2025-06-27T14:09:29Z)
Event-Based Crossing Dataset (EBCD) [0.9961452710097684]
Event-based vision revolutionizes traditional image sensing by capturing intensity variations rather than static frames. The Event-Based Crossing Dataset (EBCD) is tailored for pedestrian and vehicle detection in dynamic outdoor environments. This dataset facilitates an extensive assessment of object detection performance under varying conditions of sparsity and noise suppression.
arXiv Detail & Related papers (2025-03-21T19:20:58Z)
ESVO2: Direct Visual-Inertial Odometry with Stereo Event Cameras [41.992980062962495]
Event-based visual odometry aims at solving tracking and mapping subproblems (typically in parallel). We build an event-based stereo visual-inertial odometry system on top of a direct pipeline. The resulting system scales well with modern high-resolution event cameras.
arXiv Detail & Related papers (2024-10-12T05:35:27Z)
OmniPose6D: Towards Short-Term Object Pose Tracking in Dynamic Scenes from Monocular RGB [40.62577054196799]
We introduce a large-scale synthetic dataset, OmniPose6D, crafted to mirror the diversity of real-world conditions. We present a benchmarking framework for a comprehensive comparison of pose tracking algorithms.
arXiv Detail & Related papers (2024-10-09T09:01:40Z)
Evaluating Image-Based Face and Eye Tracking with Event Cameras [9.677797822200965]
Event cameras, also known as neuromorphic sensors, capture changes in local light intensity at the pixel level, producing asynchronously generated data termed "events". This data format mitigates common issues observed in conventional cameras, like under-sampling when capturing fast-moving objects. We evaluate the viability of integrating conventional algorithms with event-based data, transformed into a frame format.
arXiv Detail & Related papers (2024-08-19T20:27:08Z)
Video Dynamics Prior: An Internal Learning Approach for Robust Video Enhancements [83.5820690348833]
We present a framework for low-level vision tasks that does not require any external training data corpus. Our approach learns the weights of neural modules by optimizing over a corrupted test sequence, leveraging its spatio-temporal coherence and internal statistics.
arXiv Detail & Related papers (2023-12-13T01:57:11Z)
Generative Modeling with Phase Stochastic Bridges [49.4474628881673]
Diffusion models (DMs) represent state-of-the-art generative models for continuous inputs. We introduce a novel generative modeling framework grounded in phase space dynamics. Our framework demonstrates the capability to generate realistic data points at an early stage of dynamics propagation.
arXiv Detail & Related papers (2023-10-11T18:38:28Z)
Self-Supervised Scene Dynamic Recovery from Rolling Shutter Images and Events [63.984927609545856]
An Event-based Inter/intra-frame Compensator (E-IC) is proposed to predict the per-pixel dynamics between arbitrary time intervals. We show that the proposed method achieves state-of-the-art results and shows remarkable performance for event-based RS2GS inversion in real-world scenarios.
arXiv Detail & Related papers (2023-04-14T05:30:02Z)
A Unified Framework for Event-based Frame Interpolation with Ad-hoc Deblurring in the Wild [72.0226493284814]
We propose a unified framework for event-based frame interpolation that performs deblurring ad-hoc. Our network consistently outperforms previous state-of-the-art methods on frame interpolation, single image deblurring, and the joint task of both.
arXiv Detail & Related papers (2023-01-12T18:19:00Z)
HighlightMe: Detecting Highlights from Human-Centric Videos [52.84233165201391]
We present a domain- and user-preference-agnostic approach to detect highlightable excerpts from human-centric videos. We use an autoencoder network equipped with spatial-temporal graph convolutions to detect human activities and interactions. We observe a 4-12% improvement in the mean average precision of matching the human-annotated highlights over state-of-the-art methods.
arXiv Detail & Related papers (2021-10-05T01:18:15Z)

This list is automatically generated from the titles and abstracts of the papers on this site.