GERD: Geometric event response data generation
- URL: http://arxiv.org/abs/2412.03259v1
- Date: Wed, 04 Dec 2024 11:59:36 GMT
- Title: GERD: Geometric event response data generation
- Authors: Jens Egholm Pedersen, Dimitris Korakovounis, Jörg Conradt
- Abstract summary: Event-based vision sensors are appealing because of their time resolution, higher dynamic range, and low-power consumption. They also provide data that is fundamentally different from conventional frame-based cameras: events are sparse, discrete, and require integration in time. We introduce a method to generate event-based data under controlled transformations.
- Score: 1.5269221584932013
- License: http://creativecommons.org/licenses/by-sa/4.0/
- Abstract: Event-based vision sensors are appealing because of their time resolution, higher dynamic range, and low-power consumption. They also provide data that is fundamentally different from conventional frame-based cameras: events are sparse, discrete, and require integration in time. Unlike conventional models grounded in established geometric and physical principles, event-based models lack comparable foundations. We introduce a method to generate event-based data under controlled transformations. Specifically, we subject a prototypical object to transformations that change over time to produce carefully curated event videos. We hope this work simplifies studies for geometric approaches in event-based vision. GERD is available at https://github.com/ncskth/gerd
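The generation model behind such data can be pictured with the standard idealized event-camera equation: a pixel emits an event whenever its log intensity changes by more than a contrast threshold since its last event. The sketch below is a minimal illustration of that model under a controlled translation, not GERD's actual implementation; the function names, threshold value, and rendered square are assumptions for the example.
```python
import numpy as np

def frames_to_events(frames, timestamps, threshold=0.2, eps=1e-6):
    """Convert a sequence of intensity frames into (t, x, y, polarity) events.

    An event is emitted whenever the log intensity at a pixel changes by
    more than `threshold` since the last event at that pixel -- the
    standard idealized event-camera model.
    """
    log_ref = np.log(frames[0] + eps)  # per-pixel reference log intensity
    events = []
    for frame, t in zip(frames[1:], timestamps[1:]):
        log_now = np.log(frame + eps)
        diff = log_now - log_ref
        for polarity, mask in ((1, diff >= threshold), (-1, diff <= -threshold)):
            ys, xs = np.nonzero(mask)
            events.extend((t, x, y, polarity) for x, y in zip(xs, ys))
            log_ref[mask] = log_now[mask]  # reset reference where events fired
    return sorted(events)  # sorted by timestamp

# Example: a bright square translating one pixel per frame (a controlled transformation).
H, W, T = 64, 64, 10
frames = np.full((T, H, W), 0.1)
for i in range(T):
    frames[i, 20:30, 20 + i:30 + i] = 1.0
events = frames_to_events(frames, np.linspace(0.0, 0.09, T))
```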
Related papers
- UniE2F: A Unified Diffusion Framework for Event-to-Frame Reconstruction with Video Foundation Models [67.24086328473437]
Event cameras excel at recording relative intensity changes rather than absolute intensity. The resulting data streams suffer from a significant loss of spatial information and static texture details. We address this limitation by leveraging a pre-trained video diffusion model to reconstruct high-fidelity video frames from sparse event data.
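How sparse events condition a diffusion model is not spelled out in the summary; one common representation (an assumption here, not necessarily what UniE2F uses) is a polarity-weighted voxel grid with bilinear temporal weighting:
```python
import numpy as np

def events_to_voxel_grid(events, num_bins, height, width):
    """Accumulate polarity-weighted events, given as rows (t, x, y, p), into
    `num_bins` temporal slices with bilinear weighting between adjacent bins."""
    grid = np.zeros((num_bins, height, width))
    t, x, y, p = (events[:, 0], events[:, 1].astype(int),
                  events[:, 2].astype(int), events[:, 3])
    # Normalize timestamps to [0, num_bins - 1].
    t = (t - t.min()) / max(t.max() - t.min(), 1e-9) * (num_bins - 1)
    left = np.floor(t).astype(int)
    right = np.minimum(left + 1, num_bins - 1)
    w_right = t - left
    np.add.at(grid, (left, y, x), p * (1.0 - w_right))
    np.add.at(grid, (right, y, x), p * w_right)
    return grid
```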
arXiv Detail & Related papers (2026-02-22T14:06:49Z)
- Scalable Adaptation of 3D Geometric Foundation Models via Weak Supervision from Internet Video [76.32954467706581]
We propose SAGE, a framework for Scalable Adaptation of GEometric foundation models from raw video streams. We use a hierarchical mining pipeline to transform videos into training trajectories and hybrid supervision. Experiments show that SAGE significantly enhances zero-shot generalization, reducing Chamfer Distance by 20-42% on unseen benchmarks.
arXiv Detail & Related papers (2026-02-08T09:53:21Z)
- GPA-VGGT: Adapting VGGT to Large Scale Localization by Self-Supervised Learning with Geometry and Physics Aware Loss [15.633839321933385]
Recent advancements in Visual Geometry Grounded Transformer (VGGT) models have shown great promise in camera pose estimation and 3D reconstruction. These models typically rely on ground truth labels for training, posing challenges when adapting to unlabeled and unseen scenes. We propose a self-supervised framework to train VGGT with unlabeled data, thereby enhancing its localization capability in large-scale environments.
arXiv Detail & Related papers (2026-01-23T16:46:59Z)
- GS2E: Gaussian Splatting is an Effective Data Generator for Event Stream Generation [32.13436507983477]
We introduce GS2E (Gaussian Splatting to Event), a large-scale synthetic event dataset for high-fidelity event vision tasks. Results on event-based 3D reconstruction demonstrate GS2E's superior generalization capabilities and its practical value.
arXiv Detail & Related papers (2025-05-21T09:15:42Z)
- EMF: Event Meta Formers for Event-based Real-time Traffic Object Detection [5.143097874851516]
Event cameras have higher temporal resolution and require less storage and bandwidth than traditional RGB cameras.
Recent approaches in event-based object detection try to close the performance gap by employing computationally expensive transformer-based solutions.
Our proposed EMF becomes the fastest Progression-based architecture in the domain, outperforming the most efficient event-based object detectors.
arXiv Detail & Related papers (2025-04-05T09:48:40Z)
- STREAM: A Universal State-Space Model for Sparse Geometric Data [2.9483719973596303]
Handling unstructured geometric data, such as point clouds or event-based vision, is a pressing challenge in the field of machine vision.
We propose to encode geometric structure explicitly into the parameterization of a state-space model.
Our model deploys the Mamba selective state-space model with a modified kernel to map sparse data efficiently onto modern hardware.
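One way to picture encoding temporal structure into a state-space parameterization is to let each event carry its own discretization step, so state decay reflects the true inter-event gap rather than a uniform grid. The toy scan below assumes a diagonal transition matrix and invented names; it is a sketch of the general idea, not STREAM's kernel:
```python
import numpy as np

def event_ssm_scan(xs, dts, A, B, C):
    """Scan a diagonal state-space model over a sparse event sequence.

    Each event is integrated with its own time step `dt`, so the state
    decay exp(A * dt) reflects the actual gap between events."""
    state = np.zeros_like(A, dtype=float)   # hidden state, shape (d_state,)
    outputs = []
    for x, dt in zip(xs, dts):
        decay = np.exp(A * dt)              # zero-order-hold discretization
        state = decay * state + dt * B * x  # diagonal A => elementwise update
        outputs.append(C @ state)           # project state to a scalar output
    return np.array(outputs)
```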
arXiv Detail & Related papers (2024-11-19T16:06:32Z)
- Spatio-temporal Transformers for Action Unit Classification with Event Cameras [28.98336123799572]
We present FACEMORPHIC, a temporally synchronized multimodal face dataset composed of RGB videos and event streams.
We show how temporal synchronization can allow effective neuromorphic face analysis without the need to manually annotate videos.
arXiv Detail & Related papers (2024-10-29T11:23:09Z)
- GeoWizard: Unleashing the Diffusion Priors for 3D Geometry Estimation from a Single Image [94.56927147492738]
We introduce GeoWizard, a new generative foundation model designed for estimating geometric attributes from single images.
We show that leveraging diffusion priors can markedly improve generalization, detail preservation, and efficiency in resource usage.
We propose a simple yet effective strategy to segregate the complex data distribution of various scenes into distinct sub-distributions.
arXiv Detail & Related papers (2024-03-18T17:50:41Z)
- GeoDeformer: Geometric Deformable Transformer for Action Recognition [22.536307401874105]
Vision transformers have recently emerged as an effective alternative to convolutional networks for action recognition.
This paper proposes a novel approach, GeoDeformer, designed to capture the variations inherent in action video by integrating geometric comprehension directly into the ViT architecture.
arXiv Detail & Related papers (2023-11-29T16:55:55Z)
- GET: Group Event Transformer for Event-Based Vision [82.312736707534]
Event cameras are a type of novel neuromorphic sensor that has been gaining increasing attention.
We propose a novel Group-based vision Transformer backbone for Event-based vision, called Group Event Transformer (GET).
GET decouples temporal-polarity information from spatial information throughout the feature extraction process.
arXiv Detail & Related papers (2023-10-04T08:02:33Z)
- EvDNeRF: Reconstructing Event Data with Dynamic Neural Radiance Fields [80.94515892378053]
EvDNeRF is a pipeline for generating event data and training an event-based dynamic NeRF.
NeRFs offer geometric-based learnable rendering, but prior work with events has only considered reconstruction of static scenes.
We show that by training on varied batch sizes of events, we can improve test-time predictions of events at fine time resolutions.
arXiv Detail & Related papers (2023-10-03T21:08:41Z)
- SceNeRFlow: Time-Consistent Reconstruction of General Dynamic Scenes [75.9110646062442]
We propose SceNeRFlow to reconstruct a general, non-rigid scene in a time-consistent manner.
Our method takes multi-view RGB videos and background images from static cameras with known camera parameters as input.
We show experimentally that, unlike prior work that only handles small motion, our method enables the reconstruction of studio-scale motions.
arXiv Detail & Related papers (2023-08-16T09:50:35Z)
- Local-Global Information Interaction Debiasing for Dynamic Scene Graph Generation [51.92419880088668]
We propose a novel DynSGG model based on multi-task learning, DynSGG-MTL, which introduces local interaction information and global human-action interaction information.
Long-temporal human actions supervise the model to generate multiple scene graphs that conform to the global constraints, preventing the model from failing to learn the tail predicates.
arXiv Detail & Related papers (2023-08-10T01:24:25Z)
- Continuous-time convolutions model of event sequences [46.3471121117337]
Event sequences are non-uniform and sparse, making traditional models unsuitable.
We propose COTIC, a method based on an efficient convolution neural network designed to handle the non-uniform occurrence of events over time.
COTIC outperforms existing models in predicting the next event time and type, achieving an average rank of 1.5 compared to 3.714 for the nearest competitor.
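A continuous-time convolution can be read as evaluating a kernel k(Δt) at the true time lags between events, so the non-uniform sequence never has to be resampled onto a uniform grid. The sketch below uses a fixed exponential kernel as a stand-in for COTIC's learned kernel network; names and the causal window are illustrative:
```python
import numpy as np

def continuous_conv(times, values, kernel, window=5.0):
    """Causal continuous-time convolution: for each event i, sum
    kernel(t_i - t_j) * v_j over past events j within `window`."""
    out = np.zeros_like(values, dtype=float)
    for i, t_i in enumerate(times):
        for j in range(i):                 # causal: only past events
            lag = t_i - times[j]
            if lag <= window:
                out[i] += kernel(lag) * values[j]
    return out

# Stand-in for a learned kernel network: exponential decay in the lag.
events_t = np.array([0.0, 0.4, 1.1, 1.15, 3.0])   # non-uniform timestamps
events_v = np.ones_like(events_t)
feat = continuous_conv(events_t, events_v, kernel=lambda lag: np.exp(-lag))
```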
arXiv Detail & Related papers (2023-02-13T10:34:51Z)
- Event-based Non-Rigid Reconstruction from Contours [17.049602518532847]
We propose a novel approach for reconstructing non-rigid deformations using measurements from event-based cameras.
Under the assumption of a static background, where all events are generated by the motion, our approach estimates the deformation of objects from events generated at the object contour.
It associates events to mesh faces on the contour and maximizes the alignment of the line of sight through the event pixel with the associated face.
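That alignment objective can be pictured as back-projecting the event pixel to a viewing ray and penalizing the ray's deviation from the plane of its associated face. The cost below, with its intrinsics matrix K and squared ray-normal dot product, is an illustrative reading of the summary rather than the paper's exact formulation:
```python
import numpy as np

def ray_face_alignment(pixel, K, face_normal):
    """Return an alignment cost between the line of sight through an event
    pixel and its associated mesh face: zero when the viewing ray lies in
    the face plane (i.e., is perpendicular to the face normal)."""
    u, v = pixel
    ray = np.linalg.inv(K) @ np.array([u, v, 1.0])  # back-projected ray
    ray /= np.linalg.norm(ray)
    n = face_normal / np.linalg.norm(face_normal)
    return float(ray @ n) ** 2  # squared cosine to the normal
```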
arXiv Detail & Related papers (2022-10-12T14:53:11Z)
- Event Data Association via Robust Model Fitting for Event-based Object Tracking [66.05728523166755]
We propose a novel Event Data Association (EDA) approach to explicitly address the event association and fusion problem.
The proposed EDA seeks event trajectories that best fit the event data in order to perform unified data association and information fusion.
The experimental results show the effectiveness of EDA under challenging scenarios, such as high speed, motion blur, and high dynamic range conditions.
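Robust model fitting over event data can be pictured as RANSAC over simple motion models, e.g. constant-velocity trajectories in (x, y, t) space. The sketch below is a generic stand-in for EDA's actual fitting procedure; the model class, tolerance, and names are assumptions:
```python
import numpy as np

def ransac_trajectory(events, iters=200, tol=2.0, rng=None):
    """Fit a constant-velocity trajectory (x, y) = p0 + v * t to events
    given as rows (t, x, y), keeping the model with the most inliers."""
    rng = rng or np.random.default_rng(0)
    t, xy = events[:, 0], events[:, 1:3]
    best_inliers, best_model = np.zeros(len(events), bool), None
    for _ in range(iters):
        i, j = rng.choice(len(events), size=2, replace=False)
        dt = t[j] - t[i]
        if abs(dt) < 1e-9:
            continue                       # degenerate sample, skip
        v = (xy[j] - xy[i]) / dt           # velocity from two samples
        p0 = xy[i] - v * t[i]
        resid = np.linalg.norm(xy - (p0 + np.outer(t, v)), axis=1)
        inliers = resid < tol
        if inliers.sum() > best_inliers.sum():
            best_inliers, best_model = inliers, (p0, v)
    return best_model, best_inliers
```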
arXiv Detail & Related papers (2021-10-25T13:56:00Z)
- Differentiable Event Stream Simulator for Non-Rigid 3D Tracking [82.56690776283428]
Our differentiable simulator enables non-rigid 3D tracking of deformable objects from event streams.
We show the effectiveness of our approach for various types of non-rigid objects and compare it to existing methods for non-rigid 3D tracking.
arXiv Detail & Related papers (2021-04-30T17:58:07Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the content (including all information) and is not responsible for any consequences.