Deep Learning Method for Cell-Wise Object Tracking, Velocity Estimation
and Projection of Sensor Data over Time
- URL: http://arxiv.org/abs/2306.06126v2
- Date: Sun, 18 Jun 2023 19:17:33 GMT
- Title: Deep Learning Method for Cell-Wise Object Tracking, Velocity Estimation
and Projection of Sensor Data over Time
- Authors: Marco Braun, Moritz Luszek, Mirko Meuter, Dominic Spata, Kevin Kollek
and Anton Kummert
- Abstract summary: We show how ConvNets suffer from architectural restrictions for this task.
In a last step, the memory state of the Recurrent Neural Network is projected based on extracted velocity estimates.
- Score: 0.7340017786387767
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Current Deep Learning methods for environment segmentation and velocity
estimation rely on Convolutional Recurrent Neural Networks to exploit
spatio-temporal relationships within obtained sensor data. These approaches
derive scene dynamics implicitly by correlating novel input and memorized data
utilizing ConvNets. We show how ConvNets suffer from architectural restrictions
for this task. Based on these findings, we then provide solutions to various
issues on exploiting spatio-temporal correlations in a sequence of sensor
recordings by presenting a novel Recurrent Neural Network unit utilizing
Transformer mechanisms. Within this unit, object encodings are tracked across
consecutive frames by correlating key-query pairs derived from sensor inputs
and memory states, respectively. We then use resulting tracking patterns to
obtain scene dynamics and regress velocities. In a last step, the memory state
of the Recurrent Neural Network is projected based on extracted velocity
estimates to resolve aforementioned spatio-temporal misalignment.
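The abstract's tracking-and-projection step can be sketched roughly as follows. This is a minimal NumPy illustration of correlating key-query pairs between memory state and sensor input, regressing cell-wise velocities from the resulting tracking pattern, and projecting the memory state; the weight matrices `W_k`/`W_q`, the grid shapes, and the soft-gather projection are all assumptions for illustration, not the authors' architecture.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def track_and_project(memory, inputs, W_k, W_q, dt=0.1):
    """One recurrent step over a grid of cell-wise object encodings.

    memory: (H, W, D) memory state from the previous frame
    inputs: (H, W, D) encoded sensor data of the current frame
    Keys come from the new input, queries from the memory state; their
    correlation yields a per-cell tracking (attention) pattern.
    """
    H, W, D = memory.shape
    keys = inputs.reshape(-1, D) @ W_k       # (H*W, D)
    queries = memory.reshape(-1, D) @ W_q    # (H*W, D)
    attn = softmax(queries @ keys.T / np.sqrt(D), axis=-1)  # (H*W, H*W)

    # Tracking pattern -> expected displacement of each cell's encoding
    coords = np.stack(np.meshgrid(np.arange(H), np.arange(W), indexing="ij"),
                      axis=-1).reshape(-1, 2).astype(float)
    matched = attn @ coords                  # soft-matched position per cell
    velocity = (matched - coords) / dt       # regressed cell-wise velocity

    # Project the memory state along the estimated motion to re-align it
    # with the next frame (here: a soft gather via the same attention map).
    projected_memory = (attn @ memory.reshape(-1, D)).reshape(H, W, D)
    return velocity.reshape(H, W, 2), projected_memory
```

The projection step is what resolves the spatio-temporal misalignment: instead of letting a ConvNet implicitly re-discover where each object moved, the memory is explicitly shifted according to the estimated velocities.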
Related papers
- Kriformer: A Novel Spatiotemporal Kriging Approach Based on Graph Transformers [5.4381914710364665]
This study addresses the challenges posed by sparse sensor deployment and unreliable data by framing the problem as a spatiotemporal kriging task.
A graph-transformer-based model, Kriformer, estimates data at locations without sensors by mining spatial and temporal correlations, even with limited resources.
arXiv Detail & Related papers (2024-09-23T11:01:18Z)
- TCCT-Net: Two-Stream Network Architecture for Fast and Efficient Engagement Estimation via Behavioral Feature Signals [58.865901821451295]
We present a novel two-stream feature fusion "Tensor-Convolution and Convolution-Transformer Network" (TCCT-Net) architecture.
To better learn the meaningful patterns in the temporal-spatial domain, we design a "CT" stream that integrates a hybrid convolutional-transformer.
In parallel, to efficiently extract rich patterns from the temporal-frequency domain, we introduce a "TC" stream that uses Continuous Wavelet Transform (CWT) to represent information in a 2D tensor form.
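As a rough illustration of how a CWT turns a 1D behavioral signal into a 2D time-frequency tensor (the kind of representation the "TC" stream consumes), here is a minimal NumPy sketch using a complex Morlet wavelet; the wavelet choice, scales, and normalization are assumptions for illustration, not TCCT-Net's actual preprocessing.

```python
import numpy as np

def morlet(t, scale, w0=5.0):
    """Complex Morlet wavelet evaluated at times t for a given scale."""
    x = t / scale
    return np.exp(1j * w0 * x) * np.exp(-0.5 * x**2) / np.sqrt(scale)

def cwt_tensor(signal, scales, fs=100.0):
    """Continuous Wavelet Transform: 1D signal -> 2D (scale x time) tensor."""
    n = len(signal)
    t = (np.arange(n) - n // 2) / fs          # centered time axis in seconds
    out = np.empty((len(scales), n), dtype=complex)
    for i, s in enumerate(scales):
        # Correlate the signal with the conjugated wavelet at this scale
        out[i] = np.convolve(signal, np.conj(morlet(t, s))[::-1], mode="same")
    return np.abs(out)  # magnitude scalogram, usable as a 2D image-like input
```

The resulting scalogram can then be fed to a 2D convolutional backbone just like an image, which is the point of recasting the temporal-frequency information as a tensor.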
arXiv Detail & Related papers (2024-04-15T06:01:48Z)
- Leveraging arbitrary mobile sensor trajectories with shallow recurrent decoder networks for full-state reconstruction [4.243926243206826]
We show that with a sequence-to-vector model, such as an LSTM (long short-term memory) network, combined with a shallow decoder network, dynamic information can be mapped to full state-space estimates.
The exceptional performance of the network architecture is demonstrated on three applications.
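A toy sketch of the sequence-to-vector idea: a recurrent encoder compresses a mobile-sensor trajectory into one latent vector, and a shallow decoder maps that vector to a full-state estimate. A plain tanh RNN cell stands in for the LSTM here, and all weight names and dimensions are hypothetical.

```python
import numpy as np

def encode_sequence(x_seq, W_x, W_h):
    """Sequence-to-vector encoder: a minimal tanh RNN standing in for the
    LSTM; returns the final hidden state as a summary of the trajectory."""
    h = np.zeros(W_h.shape[0])
    for x_t in x_seq:                  # x_seq: (T, n_sensors)
        h = np.tanh(W_x @ x_t + W_h @ h)
    return h

def shallow_decode(h, W_out):
    """Shallow (single linear layer) decoder mapping the latent summary of
    sensor measurements to a full state-space estimate."""
    return W_out @ h                   # (n_state,)
```

The appeal of the shallow decoder is that the recurrent encoder carries the temporal burden, so the reconstruction head itself can stay very small.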
arXiv Detail & Related papers (2023-07-20T21:42:01Z)
- Convolutional generative adversarial imputation networks for spatio-temporal missing data in storm surge simulations [86.5302150777089]
Generative Adversarial Imputation Nets (GAIN) and GAN-based techniques have attracted attention as unsupervised machine learning methods.
We name our proposed method Convolutional Generative Adversarial Imputation Nets (Conv-GAIN).
arXiv Detail & Related papers (2021-11-03T03:50:48Z) - Spatio-Temporal Recurrent Networks for Event-Based Optical Flow
Estimation [47.984368369734995]
We introduce a novel recurrent encoding-decoding neural network architecture for event-based optical flow estimation.
The network is end-to-end trained with self-supervised learning on the Multi-Vehicle Stereo Event Camera dataset.
It outperforms all existing state-of-the-art methods by a large margin.
arXiv Detail & Related papers (2021-09-10T13:37:37Z) - SignalNet: A Low Resolution Sinusoid Decomposition and Estimation
Network [79.04274563889548]
We propose SignalNet, a neural network architecture that detects the number of sinusoids and estimates their parameters from quantized in-phase and quadrature samples.
We introduce a worst-case learning threshold for comparing the results of our network relative to the underlying data distributions.
In simulation, we find that our algorithm is always able to surpass the threshold for three-bit data but often cannot exceed the threshold for one-bit data.
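To make the one-bit versus three-bit contrast concrete, here is a generic uniform mid-rise quantizer applied separately to the in-phase and quadrature components; it only illustrates what low-resolution I/Q inputs look like and is not SignalNet's actual front end.

```python
import numpy as np

def quantize_iq(samples, bits):
    """Quantize complex I/Q samples component-wise to `bits` bits.

    Uses a uniform mid-rise quantizer on [-1, 1); the reconstruction
    points are the midpoints of the 2**bits quantization cells.
    """
    levels = 2 ** bits

    def q(x):
        # Map each value to its cell index, then to the cell midpoint
        idx = np.clip(np.floor((np.clip(x, -1, 1 - 1e-9) + 1) / 2 * levels),
                      0, levels - 1)
        return (idx + 0.5) / levels * 2 - 1

    return q(samples.real) + 1j * q(samples.imag)
```

With `bits=1` each component collapses to ±0.5, destroying nearly all amplitude information; with `bits=3` eight levels per component remain, which is consistent with the reported gap between the one-bit and three-bit cases.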
arXiv Detail & Related papers (2021-06-10T04:21:20Z) - Deep Cellular Recurrent Network for Efficient Analysis of Time-Series
Data with Spatial Information [52.635997570873194]
This work proposes a novel deep cellular recurrent neural network (DCRNN) architecture to process complex multi-dimensional time series data with spatial information.
The proposed architecture achieves state-of-the-art performance while utilizing substantially less trainable parameters when compared to comparable methods in the literature.
arXiv Detail & Related papers (2021-01-12T20:08:18Z) - A Prospective Study on Sequence-Driven Temporal Sampling and Ego-Motion
Compensation for Action Recognition in the EPIC-Kitchens Dataset [68.8204255655161]
Action recognition is one of the most challenging research fields in computer vision.
Sequences recorded under ego-motion have become particularly relevant.
The proposed method copes with this by estimating the ego-motion, or camera motion.
arXiv Detail & Related papers (2020-08-26T14:44:45Z) - Handling Variable-Dimensional Time Series with Graph Neural Networks [20.788813485815698]
Internet of Things (IoT) technology involves capturing data from multiple sensors resulting in multi-sensor time series.
Existing neural networks based approaches for such multi-sensor time series modeling assume fixed input dimension or number of sensors.
We consider training neural network models from such multi-sensor time series, where the time series have varying input dimensionality owing to availability or installation of a different subset of sensors at each source of time series.
arXiv Detail & Related papers (2020-07-01T12:11:16Z) - Deep ConvLSTM with self-attention for human activity decoding using
wearables [0.0]
We propose a deep neural network architecture that not only captures features of multiple sensor time-series data but also selects important time points.
We show the validity of the proposed approach across different data sampling strategies and demonstrate that the self-attention mechanism gives a significant improvement.
The proposed methods open avenues for better decoding of human activity from multiple body sensors over extended periods of time.
arXiv Detail & Related papers (2020-05-02T04:30:31Z)
This list is automatically generated from the titles and abstracts of the papers in this site.