Double Deep Learning-based Event Data Coding and Classification
- URL: http://arxiv.org/abs/2407.15531v1
- Date: Mon, 22 Jul 2024 10:45:55 GMT
- Title: Double Deep Learning-based Event Data Coding and Classification
- Authors: Abdelrahman Seleem, André F. R. Guarda, Nuno M. M. Rodrigues, Fernando Pereira
- Abstract summary: Event cameras have the ability to capture asynchronous per-pixel brightness changes, called "events".
This paper proposes a novel double deep learning-based architecture for both event data coding and classification, using a point cloud-based representation for events.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Event cameras have the ability to capture asynchronous per-pixel brightness changes, called "events", offering advantages over traditional frame-based cameras for computer vision applications. Efficiently coding event data is critical for transmission and storage, given the significant volume of events. This paper proposes a novel double deep learning-based architecture for both event data coding and classification, using a point cloud-based representation for events. In this context, the conversions from events to point clouds and back to events are key steps in the proposed solution, and therefore their impact is evaluated in terms of compression and classification performance. Experimental results show that it is possible to achieve a classification performance for compressed events which is similar to that of the original events, even after applying a lossy point cloud codec, notably the recent learning-based JPEG Pleno Point Cloud Coding standard, with a clear rate reduction. Experimental results also demonstrate that events coded using JPEG PCC achieve better classification performance than those coded using the conventional lossy MPEG Geometry-based Point Cloud Coding standard. Furthermore, the adoption of learning-based coding offers high potential for performing computer vision tasks in the compressed domain, which allows skipping the decoding stage while mitigating the impact of coding artifacts.
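The events-to-point-cloud conversion at the core of the proposed pipeline can be sketched as follows. This is a minimal illustrative sketch: the function names, the microsecond time scale, and the plain (x, y, t) mapping are assumptions, not the paper's exact procedure (which, e.g., must also handle polarity).

```python
import numpy as np

def events_to_point_cloud(events, time_scale=1e6):
    """Map events (x, y, t) to 3D points (x, y, scaled t).

    Hypothetical sketch: timestamps (assumed in microseconds) are scaled
    so the temporal axis is comparable in magnitude to the spatial axes.
    """
    x = events["x"].astype(np.float32)
    y = events["y"].astype(np.float32)
    z = (events["t"] - events["t"].min()) / time_scale
    return np.stack([x, y, z], axis=1)

def point_cloud_to_events(points, time_scale=1e6, t0=0):
    """Invert the mapping; after a lossy codec, points may have been
    moved or dropped, so coordinates are rounded back to the pixel grid."""
    x = np.rint(points[:, 0]).astype(np.int32)
    y = np.rint(points[:, 1]).astype(np.int32)
    t = (points[:, 2] * time_scale + t0).astype(np.int64)
    order = np.argsort(t)  # restore temporal ordering
    return x[order], y[order], t[order]
```

The rounding step in the inverse conversion is where lossy coding artifacts surface as shifted or merged events, which is why the paper evaluates the conversions' impact on classification performance.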
Related papers
- Deep Learning-based Event Data Coding: A Joint Spatiotemporal and Polarity Solution
Event cameras generate a massive number of pixel-level events composed of spatiotemporal and polarity information.
This paper proposes a novel lossy Deep Learning-based Joint Event data Coding (DL-JEC) solution adopting a single point cloud representation.
It is shown that lossy event data coding, with its reduced rate compared to lossless coding, can be used without compromising the target computer vision task performance.
arXiv Detail & Related papers (2025-02-05T15:39:55Z) - Event Masked Autoencoder: Point-wise Action Recognition with Event-Based Cameras [8.089601548579116]
We propose a novel framework that preserves and exploits the structure of event data for action recognition.
Our framework consists of two main components: 1) a point-wise event masked autoencoder (MAE) that learns a compact and discriminative representation of event patches by reconstructing them from masked raw event camera point data; 2) an improved event point patch generation algorithm that leverages an event data inlier model and point-wise data augmentation techniques to enhance the quality and diversity of event point patches.
arXiv Detail & Related papers (2025-01-02T03:49:03Z) - CALLIC: Content Adaptive Learning for Lossless Image Compression [64.47244912937204]
CALLIC sets a new state-of-the-art (SOTA) for learned lossless image compression.
We propose a content-aware autoregressive self-attention mechanism by leveraging convolutional gating operations.
During encoding, we decompose pre-trained layers, including depth-wise convolutions, using low-rank matrices and then adapt the incremental weights on the testing image by Rate-guided Progressive Fine-Tuning (RPFT).
RPFT fine-tunes with gradually increasing patches that are sorted in descending order by estimated entropy, optimizing the learning process and reducing adaptation time.
arXiv Detail & Related papers (2024-12-23T10:41:18Z) - The JPEG Pleno Learning-based Point Cloud Coding Standard: Serving Man and Machine [49.16996486119006]
Deep learning has emerged as a powerful tool in point cloud coding.
JPEG has recently finalized the JPEG Pleno Learning-based Point Cloud Coding standard.
This paper provides a complete technical description of the JPEG PCC standard.
arXiv Detail & Related papers (2024-09-12T15:20:23Z) - EZSR: Event-based Zero-Shot Recognition [21.10165234725309]
This paper studies zero-shot object recognition using event camera data.
We develop an event encoder without relying on additional reconstruction networks.
Our model with a ViT-B/16 backbone achieves 47.84% zero-shot accuracy on the N-ImageNet dataset.
arXiv Detail & Related papers (2024-07-31T14:06:06Z) - EventCLIP: Adapting CLIP for Event-based Object Recognition [26.35633454924899]
EventCLIP is a novel approach that utilizes CLIP for zero-shot and few-shot event-based object recognition.
We first generalize CLIP's image encoder to event data by converting raw events to 2D grid-based representations.
We evaluate EventCLIP on N-Caltech, N-Cars, and N-ImageNet datasets, achieving state-of-the-art few-shot performance.
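Converting raw events to a 2D grid-based representation, the step EventCLIP uses before applying CLIP's image encoder, can be illustrated with a simple per-polarity event histogram. This is a generic sketch, not EventCLIP's exact representation (which may, for instance, weight events by timestamp):

```python
import numpy as np

def events_to_grid(x, y, p, height, width):
    """Accumulate events into a 2-channel 2D histogram, one channel per
    polarity -- a common grid-based event representation."""
    grid = np.zeros((2, height, width), dtype=np.float32)
    pos = p > 0
    # np.add.at performs unbuffered accumulation, so repeated events at
    # the same pixel are all counted.
    np.add.at(grid[0], (y[pos], x[pos]), 1.0)   # positive-polarity events
    np.add.at(grid[1], (y[~pos], x[~pos]), 1.0) # negative-polarity events
    return grid
```

Once events are on a dense 2D grid, any pretrained image backbone (such as CLIP's encoder) can consume them, at the cost of discarding the fine temporal structure within the accumulation window.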
arXiv Detail & Related papers (2023-06-10T06:05:35Z) - Bilevel Fast Scene Adaptation for Low-Light Image Enhancement [50.639332885989255]
Enhancing images in low-light scenes is a challenging but widely studied task in computer vision.
The main obstacle lies in modeling the distribution discrepancy across different scenes.
We introduce the bilevel paradigm to model the above latent correspondence.
A bilevel learning framework is constructed to endow the scene-irrelevant generality of the encoder towards diverse scenes.
arXiv Detail & Related papers (2023-06-02T08:16:21Z) - Learning Monocular Dense Depth from Events [53.078665310545745]
Event cameras capture brightness changes as a stream of asynchronous events instead of intensity frames.
Recent learning-based approaches have been applied to event-based data, such as monocular depth prediction.
We propose a recurrent architecture to solve this task and show significant improvement over standard feed-forward methods.
arXiv Detail & Related papers (2020-10-16T12:36:23Z)
This list is automatically generated from the titles and abstracts of the papers on this site.