Event2Vec: Processing neuromorphic events directly by representations in vector space
- URL: http://arxiv.org/abs/2504.15371v1
- Date: Mon, 21 Apr 2025 18:21:18 GMT
- Title: Event2Vec: Processing neuromorphic events directly by representations in vector space
- Authors: Wei Fang, Priyadarshini Panda
- Abstract summary: Neuromorphic event cameras have advantages in temporal resolution, power efficiency, and dynamic range compared to traditional cameras. However, event cameras output asynchronous, sparse, and irregular events, which are not compatible with mainstream computer vision and deep learning methods. We propose the first event-to-vector (event2vec) representation, showing better parameter efficiency, accuracy, and speed than previous graph/image/voxel-based representations.
- Score: 12.165767356450289
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Neuromorphic event cameras have clear advantages in temporal resolution, power efficiency, and dynamic range compared to traditional cameras. However, event cameras output asynchronous, sparse, and irregular events, which are not compatible with mainstream computer vision and deep learning methods. Various methods have been proposed to solve this issue, but at the cost of long preprocessing procedures, lost temporal resolution, or incompatibility with massively parallel computation. Inspired by the great success of the word-to-vector (word2vec) representation, we summarize the similarities between words and events and then propose the first event-to-vector (event2vec) representation. We validate event2vec on ASL-DVS classification, showing better parameter efficiency, accuracy, and speed than previous graph/image/voxel-based representations. Beyond task performance, the most attractive advantage of event2vec is that it aligns events with the domain of natural language processing, showing the promising prospect of integrating events into large language and multimodal models. Our code, models, and training logs are available at https://github.com/fangwei123456/event2vec.
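The core idea above is that an event stream can be treated like a sentence: each discrete event plays the role of a word and is looked up in a learned embedding table. Below is a minimal sketch of that analogy, assuming events are discretized into a finite vocabulary of (x, y, polarity) tokens; the sensor size, embedding dimension, and token scheme are illustrative assumptions, not the authors' implementation (see the linked repository for that).

```python
import torch
import torch.nn as nn

# Illustrative assumptions: a 128x128 DVS sensor with 2 polarities.
W, H, P = 128, 128, 2
EMBED_DIM = 64

# Map each event (x, y, p) to a discrete token id from a finite
# "vocabulary", analogous to a word id; time gives the sequence order.
def event_to_token(x, y, p):
    return (y * W + x) * P + p

embedding = nn.Embedding(num_embeddings=W * H * P, embedding_dim=EMBED_DIM)

# A toy stream of events (x, y, t_us, polarity), already sorted by time.
events = torch.tensor([[3, 7, 10, 1],
                       [4, 7, 12, 0],
                       [3, 8, 15, 1]])
tokens = event_to_token(events[:, 0], events[:, 1], events[:, 3])
vectors = embedding(tokens)   # shape: (num_events, EMBED_DIM)
print(vectors.shape)          # torch.Size([3, 64])
```

Because the stream becomes an ordinary token sequence, it can be fed to standard sequence models, which is what makes the alignment with language and multimodal models plausible.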
Related papers
- EventGPT: Event Stream Understanding with Multimodal Large Language Models [59.65010502000344]
Event cameras record visual information as asynchronous pixel change streams, excelling at scene perception under unsatisfactory lighting or high-dynamic-range conditions. Existing multimodal large language models (MLLMs) concentrate on natural RGB images and fail in scenarios where event data fits better. We introduce EventGPT, the first MLLM for event stream understanding.
arXiv Detail & Related papers (2024-12-01T14:38:40Z)
- Text-to-Events: Synthetic Event Camera Streams from Conditional Text Input [8.365349007799296]
Event cameras are advantageous for tasks that require vision sensors with low-latency and sparse output responses.
This paper reports a method for creating new labelled event datasets by using a text-to-X model.
We demonstrate that the model can generate realistic event sequences of human gestures prompted by different text statements.
arXiv Detail & Related papers (2024-06-05T16:34:12Z)
- Scalable Event-by-event Processing of Neuromorphic Sensory Signals With Deep State-Space Models [2.551844666707809]
Event-based sensors are well suited for real-time processing.
Current methods either collapse events into frames or cannot scale up when processing the event data directly event-by-event.
arXiv Detail & Related papers (2024-04-29T08:50:27Z)
- GET: Group Event Transformer for Event-Based Vision [82.312736707534]
Event cameras are a novel type of neuromorphic sensor that has been gaining increasing attention.
We propose a novel Group-based vision Transformer backbone for Event-based vision, called Group Event Transformer (GET).
GET decouples temporal-polarity information from spatial information throughout the feature extraction process.
arXiv Detail & Related papers (2023-10-04T08:02:33Z)
- Graph-based Asynchronous Event Processing for Rapid Object Recognition [59.112755601918074]
Event cameras capture an asynchronous event stream in which each event encodes pixel location, trigger time, and the polarity of the brightness change.
We introduce a novel graph-based framework for event cameras, namely SlideGCN.
Our approach can efficiently process data event-by-event, unlocking the low-latency nature of event data while still maintaining the graph's structure internally (a generic graph-construction sketch follows this entry).
arXiv Detail & Related papers (2023-08-28T08:59:57Z)
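As a generic illustration of the graph-based idea in the SlideGCN entry above (not the paper's actual algorithm), the sketch below connects each event to its k nearest neighbors in normalized (x, y, t) space; the normalization constants and k are assumptions.

```python
import numpy as np

# Toy events: columns are (x, y, t_us, polarity).
events = np.array([[10, 12, 100, 1],
                   [11, 12, 105, 0],
                   [40, 50, 110, 1],
                   [10, 13, 120, 1]], dtype=np.float64)

# Scale space and time so neither dominates the distance (assumed constants).
coords = events[:, :3] / np.array([128.0, 128.0, 1000.0])

# Connect each event to its k nearest neighbors in (x, y, t); node features
# (e.g., polarity) would then be processed by a graph neural network.
k = 2
dists = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=-1)
np.fill_diagonal(dists, np.inf)        # exclude self-loops
neighbors = np.argsort(dists, axis=1)[:, :k]
edges = [(i, j) for i in range(len(events)) for j in neighbors[i]]
print(edges)
```

Appending a new event only touches its local neighborhood, which is what makes event-by-event (sliding) updates cheap in such frameworks.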
- EventBind: Learning a Unified Representation to Bind Them All for Event-based Open-world Understanding [7.797154022794006]
EventBind is a novel framework that unleashes the potential of vision-language models (VLMs) for event-based recognition.
We first introduce a novel event encoder that subtly models the temporal information from events.
We then design a text encoder that generates content prompts and utilizes hybrid text prompts to enhance EventBind's generalization ability.
arXiv Detail & Related papers (2023-08-06T15:05:42Z)
- Dual Memory Aggregation Network for Event-Based Object Detection with Learnable Representation [79.02808071245634]
Event-based cameras are bio-inspired sensors that capture the brightness change of every pixel in an asynchronous manner.
Event streams are divided into grids in x-y-t coordinates for both positive and negative polarity, producing a set of pillars as a 3D tensor representation (see the voxelization sketch after this entry).
Long memory is encoded in the hidden state of adaptive convLSTMs, while short memory is modeled by computing the spatial-temporal correlation between event pillars.
arXiv Detail & Related papers (2023-03-17T12:12:41Z)
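A minimal sketch of the x-y-t grid construction described in the Dual Memory Aggregation entry above, with one channel per polarity; the sensor size and number of time bins are illustrative assumptions, and the per-(x, y) columns of the resulting tensor are the "pillars".

```python
import numpy as np

# Toy events: (x, y, t_us, polarity in {0, 1}).
events = np.array([[3, 7, 10, 1],
                   [3, 7, 40, 0],
                   [9, 2, 55, 1]])

W, H, T_BINS = 128, 128, 5   # assumed sensor size and time discretization
t = events[:, 2]
t_bin = (t - t.min()) * T_BINS // (t.max() - t.min() + 1)

# One (T_BINS, H, W) volume per polarity: a dense tensor of event counts.
voxels = np.zeros((2, T_BINS, H, W))
np.add.at(voxels, (events[:, 3], t_bin, events[:, 1], events[:, 0]), 1)
print(voxels.shape, voxels.sum())   # (2, 5, 128, 128) 3.0
```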
- Event Transformer+. A multi-purpose solution for efficient event data processing [13.648678472312374]
Event cameras record sparse illumination changes with high temporal resolution and high dynamic range.
Current methods often ignore specific event-data properties, leading to the development of generic but computationally expensive algorithms.
We propose Event Transformer+, which improves on our seminal work EvT with a refined patch-based event representation.
arXiv Detail & Related papers (2022-11-22T12:28:37Z)
- Avoiding Post-Processing with Event-Based Detection in Biomedical Signals [69.34035527763916]
We propose an event-based modeling framework that directly works with events as learning targets.
We show that event-based modeling (without post-processing) performs on par with or better than epoch-based modeling with extensive post-processing.
arXiv Detail & Related papers (2022-09-22T13:44:13Z)
- Unifying Event Detection and Captioning as Sequence Generation via Pre-Training [53.613265415703815]
We propose a unified pre-training and fine-tuning framework to enhance the inter-task association between event detection and captioning.
Our model outperforms the state-of-the-art methods, and can be further boosted when pre-trained on extra large-scale video-text data.
arXiv Detail & Related papers (2022-07-18T14:18:13Z)
- Event Transformer [43.193463048148374]
The event camera's low power consumption and ability to capture brightness changes at microsecond resolution make it attractive for various computer vision tasks.
Existing event representation methods typically convert events into frames, voxel grids, or spikes for deep neural networks (DNNs); a frame-accumulation sketch of this conversion follows this entry.
This work introduces a novel token-based event representation, where each event is considered a fundamental processing unit termed an event-token.
arXiv Detail & Related papers (2022-04-11T15:05:06Z)
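For contrast with token-based representations, here is a sketch of the conventional frame conversion the Event Transformer entry above argues against: events are accumulated into a single 2D histogram, which discards their fine timing. The sensor resolution is an assumed parameter.

```python
import numpy as np

# Toy events: (x, y, t_us, polarity); the timestamps are lost in the frame.
events = np.array([[3, 7, 10, 1],
                   [3, 7, 40, 0],
                   [9, 2, 55, 1]])

H = W = 128                    # assumed sensor resolution
frame = np.zeros((H, W))
# Signed accumulation: +1 for ON events, -1 for OFF events at each pixel.
np.add.at(frame, (events[:, 1], events[:, 0]), 2 * events[:, 3] - 1)
print(frame[7, 3], frame[2, 9])   # 0.0 (ON and OFF cancel), 1.0
```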
- Unsupervised Feature Learning for Event Data: Direct vs Inverse Problem Formulation [53.850686395708905]
Event-based cameras record an asynchronous stream of per-pixel brightness changes.
In this paper, we focus on single-layer architectures for representation learning from event data.
We show improvements of up to 9% in recognition accuracy compared to state-of-the-art methods.
arXiv Detail & Related papers (2020-09-23T10:40:03Z)
This list is automatically generated from the titles and abstracts of the papers on this site.