Deep Learning-based Event Data Coding: A Joint Spatiotemporal and Polarity Solution
- URL: http://arxiv.org/abs/2502.03285v1
- Date: Wed, 05 Feb 2025 15:39:55 GMT
- Title: Deep Learning-based Event Data Coding: A Joint Spatiotemporal and Polarity Solution
- Authors: Abdelrahman Seleem, André F. R. Guarda, Nuno M. M. Rodrigues, Fernando Pereira
- Abstract summary: Event cameras generate a massive number of pixel-level events composed of spatiotemporal and polarity information.
This paper proposes a novel lossy Deep Learning-based Joint Event data Coding (DL-JEC) solution adopting a single-point cloud representation.
It is shown that lossy event data coding, with its reduced rate compared to lossless coding, does not compromise the target computer vision task performance.
- Score: 45.8313373627054
- Abstract: Neuromorphic vision sensors, commonly referred to as event cameras, have recently gained relevance for applications requiring high-speed, high-dynamic-range and low-latency data acquisition. Unlike traditional frame-based cameras that capture 2D images, event cameras generate a massive number of pixel-level events, composed of spatiotemporal and polarity information, with very high temporal resolution, thus demanding highly efficient coding solutions. Existing solutions focus on lossless coding of event data, assuming that no distortion is acceptable for the target use cases, mostly computer vision tasks. One promising coding approach exploits the similarity between event data and point clouds, allowing current point cloud coding solutions to be used for event data, typically with a two-point-cloud representation, one for each event polarity. This paper proposes a novel lossy Deep Learning-based Joint Event data Coding (DL-JEC) solution adopting a single-point-cloud representation, thus enabling the correlation between the spatiotemporal and polarity event information to be exploited. DL-JEC achieves significant compression performance gains when compared with relevant conventional and DL-based state-of-the-art event data coding solutions. Moreover, it is shown that lossy event data coding, with its reduced rate compared to lossless coding, can be used without compromising the target computer vision task performance, notably for event classification. Novel adaptive voxel binarization strategies, adapted to the target task, further enable DL-JEC to reach superior performance.
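For intuition, the sketch below shows how an event stream might be mapped to the single-point-cloud representation described in the abstract, with polarity carried as a per-point attribute rather than split into two clouds, followed by a simple voxel binarization as the lossy step. This is a minimal illustration under assumed conventions, not the DL-JEC implementation; the function names, the time-axis scaling, and the fixed voxel size are all assumptions.

```python
import numpy as np

def events_to_point_cloud(events: np.ndarray) -> tuple[np.ndarray, np.ndarray]:
    """Map raw events (x, y, t, p) to a single 3D point cloud.

    Each event becomes one point with geometry (x, y, t) and its
    polarity kept as a per-point attribute, so spatiotemporal and
    polarity information stay in one cloud and can be coded jointly.
    """
    xyz = events[:, :3].astype(np.float64)       # (x, y, t) geometry
    polarity = events[:, 3].astype(np.int8)      # +1 / -1 attribute
    # Rescale the time axis so it is commensurate with x and y
    # (the scale factor here is an arbitrary illustrative choice).
    t = xyz[:, 2]
    xyz[:, 2] = (t - t.min()) / max(t.max() - t.min(), 1e-9) * 1024.0
    return xyz, polarity

def binarize_voxels(xyz: np.ndarray, voxel_size: float = 1.0) -> np.ndarray:
    """Quantize points to binary voxel occupancy (the lossy step).

    Collapsing all points in a voxel to a single occupied flag is one
    simple binarization strategy; an adaptive variant could instead
    select voxels based on event density or the downstream task.
    """
    return np.unique(np.floor(xyz / voxel_size).astype(np.int64), axis=0)

# Toy usage: five events as rows of (x, y, t_microseconds, polarity).
events = np.array([[10, 12, 100,  1],
                   [10, 12, 180, -1],
                   [40,  7, 200,  1],
                   [41,  7, 210,  1],
                   [90, 55, 400, -1]], dtype=np.float64)
xyz, pol = events_to_point_cloud(events)
print(binarize_voxels(xyz, voxel_size=2.0).shape)
```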
Related papers
- Double Deep Learning-based Event Data Coding and Classification [45.8313373627054]
Event cameras have the ability to capture asynchronous per-pixel brightness changes, called "events".
This paper proposes a novel double deep learning-based architecture for both event data coding and classification, using a point cloud-based representation for events.
arXiv Detail & Related papers (2024-07-22T10:45:55Z)
- Relating Events and Frames Based on Self-Supervised Learning and Uncorrelated Conditioning for Unsupervised Domain Adaptation [23.871860648919593]
Event-based cameras provide accurate and high temporal resolution measurements for performing computer vision tasks.
Despite their advantages, utilizing deep learning for event-based vision encounters a significant obstacle due to the scarcity of annotated data.
We propose a new algorithm tailored for adapting a deep neural network trained on annotated frame-based data to generalize well on event-based unannotated data.
arXiv Detail & Related papers (2024-01-02T05:10:08Z)
- CrossZoom: Simultaneously Motion Deblurring and Event Super-Resolving [38.96663258582471]
CrossZoom is a novel unified neural network (CZ-Net) that jointly recovers sharp latent sequences within the exposure period of a blurry input and the corresponding High-Resolution (HR) events.
We present a multi-scale blur-event fusion architecture that leverages the scale-variant properties and effectively fuses cross-modality information to achieve cross-enhancement.
We propose a new dataset containing HR sharp-blurry images and the corresponding HR-LR event streams to facilitate future research.
arXiv Detail & Related papers (2023-09-29T03:27:53Z)
- Unsupervised Domain Adaptation for Training Event-Based Networks Using Contrastive Learning and Uncorrelated Conditioning [12.013345715187285]
Deep learning in event-based vision faces the challenge of annotated data scarcity due to the recency of event cameras.
We develop an unsupervised domain adaptation algorithm for training a deep network for event-based data image classification.
arXiv Detail & Related papers (2023-03-22T09:51:08Z)
- Dual Memory Aggregation Network for Event-Based Object Detection with Learnable Representation [79.02808071245634]
Event-based cameras are bio-inspired sensors that capture brightness change of every pixel in an asynchronous manner.
Event streams are divided into grids in the x-y-t coordinates for both positive and negative polarity, producing a set of pillars as 3D tensor representation.
Long memory is encoded in the hidden state of adaptive convLSTMs while short memory is modeled by computing spatial-temporal correlation between event pillars.
arXiv Detail & Related papers (2023-03-17T12:12:41Z)
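As a rough illustration of the grid/pillar representation the entry above describes, the sketch below bins events into an x-y-t grid separately per polarity, yielding a dense 3D tensor per polarity channel; the grid and bin sizes are assumptions for illustration, not the paper's settings.

```python
import numpy as np

def events_to_pillars(events: np.ndarray, h: int = 64, w: int = 64,
                      t_bins: int = 8) -> np.ndarray:
    """Bin events into a (2, t_bins, h, w) tensor: one x-y-t grid per polarity.

    events: rows of (x, y, t, p) with x < w, y < h and p in {-1, +1}.
    """
    grid = np.zeros((2, t_bins, h, w), dtype=np.float32)
    x = events[:, 0].astype(int)
    y = events[:, 1].astype(int)
    t = events[:, 2]
    # Scale timestamps to temporal bin indices in [0, t_bins - 1].
    tb = ((t - t.min()) / max(t.max() - t.min(), 1e-9) * (t_bins - 1)).astype(int)
    pol = (events[:, 3] > 0).astype(int)   # 0 = negative, 1 = positive polarity
    np.add.at(grid, (pol, tb, y, x), 1.0)  # count events per grid cell
    return grid

pillars = events_to_pillars(np.array([[3, 5, 10, 1], [3, 5, 90, -1]]))
print(pillars.shape)  # (2, 8, 64, 64)
```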
- Reducing Redundancy in the Bottleneck Representation of the Autoencoders [98.78384185493624]
Autoencoders are a type of unsupervised neural network that can be used to solve various tasks.
We propose a scheme to explicitly penalize feature redundancies in the bottleneck representation.
We tested our approach across different tasks: dimensionality reduction using three different datasets, image compression using the MNIST dataset, and image denoising using Fashion-MNIST.
arXiv Detail & Related papers (2022-02-09T18:48:02Z)
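One common way to penalize feature redundancy in a bottleneck, sketched below with NumPy, is to add a loss term on the off-diagonal entries of the batch correlation matrix of the latent codes. This is an illustrative stand-in for the idea, not necessarily the scheme that paper proposes.

```python
import numpy as np

def redundancy_penalty(z: np.ndarray) -> float:
    """Penalize correlated bottleneck features.

    z: (batch, dim) latent codes from the encoder. The penalty is the
    mean squared off-diagonal entry of the feature correlation matrix,
    so it is zero when features are pairwise decorrelated.
    """
    z = (z - z.mean(axis=0)) / (z.std(axis=0) + 1e-9)  # standardize per feature
    corr = (z.T @ z) / z.shape[0]                       # (dim, dim) correlation
    off_diag = corr - np.diag(np.diag(corr))
    return float((off_diag ** 2).mean())

# The total training loss would then combine reconstruction and penalty,
# e.g. reconstruction_loss + weight * redundancy_penalty(z).
print(redundancy_penalty(np.random.randn(32, 16)))
```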
- MEFNet: Multi-scale Event Fusion Network for Motion Deblurring [62.60878284671317]
Traditional frame-based cameras inevitably suffer from motion blur due to long exposure times.
As a kind of bio-inspired camera, the event camera records the intensity changes in an asynchronous way with high temporal resolution.
In this paper, we rethink the event-based image deblurring problem and unfold it into an end-to-end two-stage image restoration network.
arXiv Detail & Related papers (2021-11-30T23:18:35Z)
- Dynamic Network-Assisted D2D-Aided Coded Distributed Learning [59.29409589861241]
We propose a novel device-to-device (D2D)-aided coded federated learning method (D2D-CFL) for load balancing across devices.
We derive an optimal compression rate for achieving minimum processing time and establish its connection with the convergence time.
Our proposed method is beneficial for real-time collaborative applications, where the users continuously generate training data.
arXiv Detail & Related papers (2021-11-26T18:44:59Z)
- Learning to Detect Objects with a 1 Megapixel Event Camera [14.949946376335305]
Event cameras encode visual information with high temporal precision, low data-rate, and high-dynamic range.
Due to the novelty of the field, the performance of event-based systems on many vision tasks is still lower compared to conventional frame-based solutions.
arXiv Detail & Related papers (2020-09-28T16:03:59Z)