ALERT-Transformer: Bridging Asynchronous and Synchronous Machine Learning for Real-Time Event-based Spatio-Temporal Data
- URL: http://arxiv.org/abs/2402.01393v3
- Date: Tue, 30 Jul 2024 11:20:47 GMT
- Title: ALERT-Transformer: Bridging Asynchronous and Synchronous Machine Learning for Real-Time Event-based Spatio-Temporal Data
- Authors: Carmen Martin-Turrero, Maxence Bouvier, Manuel Breitenstein, Pietro Zanuttigh, Vincent Parret
- Abstract summary: We propose a hybrid pipeline composed of asynchronous sensing and synchronous processing.
We achieve state-of-the-art performance with lower latency than competitors.
- Score: 8.660721666999718
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: We seek to enable classic processing of continuous ultra-sparse spatio-temporal data generated by event-based sensors with dense machine learning models. We propose a novel hybrid pipeline composed of asynchronous sensing and synchronous processing that combines several ideas: (1) an embedding based on PointNet models -- the ALERT module -- that can continuously integrate new events and dismiss old ones thanks to a leakage mechanism, (2) a flexible readout of the embedded data that allows any downstream model to be fed with always up-to-date features at any sampling rate, (3) exploiting the input sparsity in a patch-based approach inspired by the Vision Transformer to optimize the efficiency of the method. These embeddings are then processed by a transformer model trained for object and gesture recognition. Using this approach, we achieve state-of-the-art performance with lower latency than competitors. We also demonstrate that our asynchronous model can operate at any desired sampling rate.
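To make the leakage idea above concrete, here is a minimal sketch of a leaky event-integration buffer with an arbitrary-time readout. This is not the authors' ALERT implementation: the exponential-decay form, the per-patch state layout, and every name below are assumptions made purely for illustration.

```python
import numpy as np

class LeakyEventBuffer:
    """Toy leaky integrator for event features (illustrative only).

    Assumption: "dismissing old events" is modeled as exponential
    decay with time constant tau; the paper's ALERT module may
    implement its leakage mechanism differently.
    """

    def __init__(self, num_patches: int, feat_dim: int, tau: float = 0.05):
        self.state = np.zeros((num_patches, feat_dim))  # per-patch features
        self.last_t = 0.0                               # last update time (s)
        self.tau = tau                                  # leak time constant (s)

    def integrate(self, t: float, patch_idx: int, event_feat: np.ndarray) -> None:
        """Decay the stored features, then accumulate a new event embedding."""
        self.state *= np.exp(-(t - self.last_t) / self.tau)  # old events fade
        self.state[patch_idx] += event_feat                  # new event enters
        self.last_t = t

    def readout(self, t: float) -> np.ndarray:
        """Return up-to-date features at any sampling time t >= last_t."""
        return self.state * np.exp(-(t - self.last_t) / self.tau)
```

A downstream model (e.g. the transformer classifier) could then call readout(t) at whatever rate it likes, which matches the decoupling of asynchronous sensing from synchronous processing that the abstract describes.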
Related papers
- Diffusion Auto-regressive Transformer for Effective Self-supervised Time Series Forecasting [47.58016750718323]
We propose a novel generative self-supervised method called TimeDART.
TimeDART captures both global sequence dependencies and local detail features within time series data.
Our code is publicly available at https://github.com/Melmaphother/TimeDART.
arXiv Detail & Related papers (2024-10-08T06:08:33Z) - Time-series Generation by Contrastive Imitation [87.51882102248395]
We study a generative framework that seeks to combine the strengths of both: Motivated by a moment-matching objective to mitigate compounding error, we optimize a local (but forward-looking) transition policy.
At inference, the learned policy serves as the generator for iterative sampling, and the learned energy serves as a trajectory-level measure for evaluating sample quality.
arXiv Detail & Related papers (2023-11-02T16:45:25Z) - STMT: A Spatial-Temporal Mesh Transformer for MoCap-Based Action Recognition [50.064502884594376]
We study the problem of human action recognition using motion capture (MoCap) sequences.
We propose a novel Spatial-Temporal Mesh Transformer (STMT) to directly model the mesh sequences.
The proposed method achieves state-of-the-art performance compared to skeleton-based and point-cloud-based models.
arXiv Detail & Related papers (2023-03-31T16:19:27Z) - Joint Spatial-Temporal and Appearance Modeling with Transformer for Multiple Object Tracking [59.79252390626194]
We propose a novel solution named TransSTAM, which leverages Transformer to model both the appearance features of each object and the spatial-temporal relationships among objects.
The proposed method is evaluated on multiple public benchmarks including MOT16, MOT17, and MOT20, and it achieves a clear performance improvement in both IDF1 and HOTA.
arXiv Detail & Related papers (2022-05-31T01:19:18Z) - Federated Action Recognition on Heterogeneous Embedded Devices [16.88104153104136]
In this work, we enable clients with limited computing power to perform action recognition, a computationally heavy task.
We first perform model compression at the central server through knowledge distillation on a large dataset.
Fine-tuning is then required because the limited data present in smaller datasets is not adequate for action recognition models to learn complex spatio-temporal features.
arXiv Detail & Related papers (2021-07-18T02:33:24Z) - Real-time End-to-End Federated Learning: An Automotive Case Study [16.79939549201032]
We introduce an approach to real-time end-to-end Federated Learning combined with a novel asynchronous model aggregation protocol.
Our results show that asynchronous Federated Learning can significantly improve the prediction performance of local edge models and reach the same accuracy level as the centralized machine learning method.
arXiv Detail & Related papers (2021-03-22T14:16:16Z) - Robust Motion In-betweening [17.473287573543065]
We present a novel, robust transition generation technique that can serve as a new tool for 3D animators.
The system synthesizes high-quality motions that use temporally-sparse keyframes as animation constraints.
We present a custom MotionBuilder plugin that uses our trained model to perform in-betweening in production scenarios.
arXiv Detail & Related papers (2021-02-09T16:52:45Z) - Event-based Asynchronous Sparse Convolutional Networks [54.094244806123235]
Event cameras are bio-inspired sensors that respond to per-pixel brightness changes in the form of asynchronous and sparse "events".
We present a general framework for converting models trained on synchronous image-like event representations into asynchronous models with identical output.
We show both theoretically and experimentally that this drastically reduces the computational complexity and latency of high-capacity, synchronous neural networks.
arXiv Detail & Related papers (2020-03-20T08:39:49Z) - A Generative Learning Approach for Spatio-temporal Modeling in Connected Vehicular Network [55.852401381113786]
This paper proposes LaMI (Latency Model Inpainting), a novel framework to generate a comprehensive spatio-temporal quality map of wireless access latency for connected vehicles.
LaMI adopts the idea from image inpainting and synthesizing and can reconstruct the missing latency samples by a two-step procedure.
In particular, it first discovers the spatial correlation between samples collected in various regions using a patching-based approach and then feeds the original and highly correlated samples into a Variational Autoencoder (VAE).
arXiv Detail & Related papers (2020-03-16T03:43:59Z) - Convolutional Tensor-Train LSTM for Spatio-temporal Learning [116.24172387469994]
We propose a higher-order LSTM model that can efficiently learn long-term correlations in the video sequence.
This is accomplished through a novel tensor train module that performs prediction by combining convolutional features across time.
Our results achieve state-of-the-art performance in a wide range of applications and datasets.
arXiv Detail & Related papers (2020-02-21T05:00:01Z)