ControlEvents: Controllable Synthesis of Event Camera Data with Foundational Prior from Image Diffusion Models
- URL: http://arxiv.org/abs/2509.22864v1
- Date: Fri, 26 Sep 2025 19:22:07 GMT
- Title: ControlEvents: Controllable Synthesis of Event Camera Data with Foundational Prior from Image Diffusion Models
- Authors: Yixuan Hu, Yuxuan Xue, Simon Klenk, Daniel Cremers, Gerard Pons-Moll
- Abstract summary: We present a diffusion-based generative model designed to synthesize high-quality event data guided by diverse control signals. We demonstrate the effectiveness of our approach by synthesizing event data for visual recognition, 2D skeleton estimation, and 3D body pose estimation.
- Score: 61.17744115607788
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: In recent years, event cameras have gained significant attention due to their bio-inspired properties, such as high temporal resolution and high dynamic range. However, obtaining large-scale labeled ground-truth data for event-based vision tasks remains challenging and costly. In this paper, we present ControlEvents, a diffusion-based generative model designed to synthesize high-quality event data guided by diverse control signals such as class text labels, 2D skeletons, and 3D body poses. Our key insight is to leverage the diffusion prior from foundation models, such as Stable Diffusion, enabling high-quality event data generation with minimal fine-tuning and limited labeled data. Our method streamlines the data generation process and significantly reduces the cost of producing labeled event datasets. We demonstrate the effectiveness of our approach by synthesizing event data for visual recognition, 2D skeleton estimation, and 3D body pose estimation. Our experiments show that the synthesized labeled event data enhances model performance in all tasks. Additionally, our approach can generate events based on unseen text labels during training, illustrating the powerful text-based generation capabilities inherited from foundation models.
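The abstract gives no implementation details, but its recipe (a Stable Diffusion prior steered by control signals such as text labels and 2D skeletons, with minimal fine-tuning) can be approximated with an off-the-shelf ControlNet-style pipeline. The sketch below is a minimal illustration under that assumption: the openpose ControlNet checkpoint, the Stable Diffusion v1.5 base model, the input file names, and the rendering of events as image-like frames are stand-ins for illustration, not the authors' released method.

```python
# Hypothetical sketch: skeleton-conditioned synthesis in the spirit of the
# abstract, built from public diffusers components. The checkpoints, prompts,
# and file names below are assumptions, not the paper's implementation.
import torch
from PIL import Image
from diffusers import ControlNetModel, StableDiffusionControlNetPipeline

# A ControlNet trained on 2D skeletons stands in for the paper's control
# branch; the Stable Diffusion base model supplies the foundation prior.
controlnet = ControlNetModel.from_pretrained(
    "lllyasviel/sd-controlnet-openpose", torch_dtype=torch.float16
)
pipe = StableDiffusionControlNetPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5",
    controlnet=controlnet,
    torch_dtype=torch.float16,
).to("cuda")

# The 2D skeleton rendered as an image is the control signal; the text prompt
# plays the role of the class-label guidance mentioned in the abstract.
skeleton = Image.open("skeleton_pose.png")  # hypothetical conditioning image
event_frame = pipe(
    prompt="event camera frame of a person walking, sparse bright edges on black",
    image=skeleton,
    num_inference_steps=30,
).images[0]
event_frame.save("synthetic_event_frame.png")
```

In the setting the abstract describes, the base model would additionally be fine-tuned on a limited set of labeled event frames so that generated samples match the event-camera domain rather than natural images; the sketch only shows the conditioning mechanics.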
Related papers
- EvDiff: High Quality Video with an Event Camera [77.07279880903009]
Reconstructing intensity images from events is a highly ill-posed task due to the inherent ambiguity of absolute brightness. We propose EvDiff, an event-based diffusion model that follows a surrogate training framework to produce high-quality videos.
arXiv Detail & Related papers (2025-11-21T18:49:18Z) - Revealing Latent Information: A Physics-inspired Self-supervised Pre-training Framework for Noisy and Sparse Events [25.348660233701708]
Event cameras record data with high temporal resolution and wide dynamic range. Event data is inherently sparse and noisy, mainly reflecting brightness changes. We propose a self-supervised pre-training framework to fully reveal latent information in event data.
arXiv Detail & Related papers (2025-08-07T15:38:36Z) - Controlling Avatar Diffusion with Learnable Gaussian Embedding [27.651478116386354]
We introduce a novel control signal representation that is optimizable, dense, expressive, and 3D consistent. We synthesize a large-scale dataset with multiple poses and identities. Our model outperforms existing methods in terms of realism, expressiveness, and 3D consistency.
arXiv Detail & Related papers (2025-03-20T02:52:01Z) - 3D-VirtFusion: Synthetic 3D Data Augmentation through Generative Diffusion Models and Controllable Editing [52.68314936128752]
We propose a new paradigm to automatically generate 3D labeled training data by harnessing the power of pretrained large foundation models.
For each target semantic class, we first generate 2D images of a single object in various structures and appearances via diffusion models and ChatGPT-generated text prompts.
We transform these augmented images into 3D objects and construct virtual scenes by random composition.
arXiv Detail & Related papers (2024-08-25T09:31:22Z) - EventZoom: A Progressive Approach to Event-Based Data Augmentation for Enhanced Neuromorphic Vision [9.447299017563841]
Dynamic Vision Sensors (DVS) capture event data with high temporal resolution and low power consumption. Event data augmentation serves as an essential method for overcoming the limitations of scale and diversity in event datasets.
arXiv Detail & Related papers (2024-05-29T08:39:31Z) - Hardness-Aware Scene Synthesis for Semi-Supervised 3D Object Detection [59.33188668341604]
3D object detection serves as the fundamental task of autonomous driving perception.
It is costly to obtain high-quality annotations for point cloud data.
We propose a hardness-aware scene synthesis (HASS) method to generate adaptive synthetic scenes.
arXiv Detail & Related papers (2024-05-27T17:59:23Z) - DetDiffusion: Synergizing Generative and Perceptive Models for Enhanced Data Generation and Perception [78.26734070960886]
Current perceptive models heavily depend on resource-intensive datasets.
We introduce perception-aware loss (P.A. loss) through segmentation, improving both quality and controllability.
Our method customizes data augmentation by extracting and utilizing a perception-aware attribute (P.A. Attr) during generation.
arXiv Detail & Related papers (2024-03-20T04:58:03Z) - Exploring Event-based Human Pose Estimation with 3D Event Representations [26.34100847541989]
We introduce two 3D event representations: the Rasterized Event Point Cloud (Ras EPC) and the Decoupled Event Voxel (DEV).
The Ras EPC aggregates events within concise temporal slices at identical positions, preserving their 3D attributes along with statistical information, thereby significantly reducing memory and computational demands.
Our methods are tested on the DHP19 public dataset, MMHPSD dataset, and our EV-3DPW dataset, with further qualitative validation via a derived driving scene dataset EV-JAAD and an outdoor collection vehicle.
arXiv Detail & Related papers (2023-11-08T10:45:09Z) - EventMix: An Efficient Augmentation Strategy for Event-Based Data [4.8416725611508244]
Event cameras can provide high dynamic range and low-energy event stream data.
However, event datasets are smaller in scale and more difficult to obtain than traditional frame-based data.
This paper proposes an efficient data augmentation strategy for event stream data: EventMix.
arXiv Detail & Related papers (2022-05-24T13:07:33Z) - Event Data Association via Robust Model Fitting for Event-based Object Tracking [4.36706221903271]
We propose a novel Event Data Association (called EDA) approach to explicitly address the event association and fusion problem. The proposed EDA seeks event trajectories that best fit the event data, in order to perform unifying data association and information fusion. The experimental results show the effectiveness of EDA under challenging scenarios, such as high speed, motion blur, and high dynamic range conditions.
arXiv Detail & Related papers (2021-10-25T13:56:00Z) - Robust Event Classification Using Imperfect Real-world PMU Data [58.26737360525643]
We study robust event classification using imperfect real-world phasor measurement unit (PMU) data.
We develop a novel machine learning framework for training robust event classifiers.
arXiv Detail & Related papers (2021-10-19T17:41:43Z)