Learning to track environment state via predictive autoencoding
- URL: http://arxiv.org/abs/2112.07745v1
- Date: Tue, 14 Dec 2021 21:07:21 GMT
- Title: Learning to track environment state via predictive autoencoding
- Authors: Marian Andrecki, Nicholas K. Taylor
- Abstract summary: This work introduces a neural architecture for learning forward models of stochastic environments.
The task is achieved solely through learning from unstructured temporal observations in the form of images.
The network can output both the expectation over future observations and samples from the belief distribution.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: This work introduces a neural architecture for learning forward models of
stochastic environments. The task is achieved solely through learning from
unstructured temporal observations in the form of images. Once trained, the
model allows for tracking of the environment state in the presence of noise or
with new percepts arriving intermittently. Additionally, the state estimate can
be propagated in observation-blind mode, thus allowing for long-term
predictions. The network can output both the expectation over future observations
and samples from the belief distribution. The resulting functionalities are similar
to those of a Particle Filter (PF). The architecture is evaluated in an
environment in which we simulate moving objects. As the forward and sensor models
are available, we implement a PF to gauge the quality of the models learnt from
the data.
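As a point of reference for the PF baseline mentioned in the abstract, a bootstrap particle filter can be sketched as follows. This is a generic illustration, not the paper's implementation: the 1-D Gaussian forward and sensor models, the parameter names, and the use of `None` to mark missing percepts are all assumptions made for the sketch.

```python
import numpy as np

def bootstrap_particle_filter(observations, n_particles=500,
                              motion_std=0.5, obs_std=1.0, seed=0):
    """Track a 1-D state from noisy observations with a bootstrap PF.

    Assumed models (illustrative only):
      forward model:  x_t = x_{t-1} + N(0, motion_std^2)
      sensor model:   y_t = x_t     + N(0, obs_std^2)
    A missing observation (None) triggers observation-blind propagation,
    mirroring the functionality described in the abstract.
    """
    rng = np.random.default_rng(seed)
    particles = rng.normal(0.0, 5.0, n_particles)  # diffuse prior
    estimates = []
    for y in observations:
        # Propagate every particle through the forward model.
        particles = particles + rng.normal(0.0, motion_std, n_particles)
        if y is not None:
            # Weight particles by the observation likelihood.
            w = np.exp(-0.5 * ((y - particles) / obs_std) ** 2)
            w /= w.sum()
            # Multinomial resampling concentrates particles on likely states.
            particles = rng.choice(particles, size=n_particles, p=w)
        # The particle mean is the point estimate; the particles themselves
        # are samples from the belief distribution.
        estimates.append(particles.mean())
    return np.array(estimates)
```

The learned model in the paper provides the same two outputs (an expectation and belief samples) without access to the ground-truth forward and sensor models that this sketch assumes.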
Related papers
- OpenSTL: A Comprehensive Benchmark of Spatio-Temporal Predictive Learning
We propose OpenSTL to categorize prevalent approaches into recurrent-based and recurrent-free models.
We conduct standard evaluations on datasets across various domains, including synthetic moving object trajectory, human motion, driving scenes, traffic flow and forecasting weather.
We find that recurrent-free models achieve a better balance between efficiency and performance than recurrent models.
arXiv Detail & Related papers (2023-06-20T03:02:14Z)
- TempSAL -- Uncovering Temporal Information for Deep Saliency Prediction
We introduce a novel saliency prediction model that learns to output saliency maps in sequential time intervals.
Our approach locally modulates the saliency predictions by combining the learned temporal maps.
Our code will be publicly available on GitHub.
arXiv Detail & Related papers (2023-01-05T22:10:16Z)
- PRISM: Probabilistic Real-Time Inference in Spatial World Models
PRISM is a method for real-time filtering in a probabilistic generative model of agent motion and visual perception.
The proposed solution runs at 10Hz real-time and is similarly accurate to state-of-the-art SLAM in small to medium-sized indoor environments.
arXiv Detail & Related papers (2022-12-06T13:59:06Z)
- Forecasting Unobserved Node States with spatio-temporal Graph Neural Networks
We develop a framework that allows forecasting the state at entirely unobserved locations based on spatial-temporal correlations and the graph inductive bias.
Our framework can be combined with any Graph Neural Network that exploits correlations with surrounding observed locations by using the network's graph structure.
Our empirical evaluation of both simulated and real-world datasets demonstrates that Graph Neural Networks are well-suited for this task.
arXiv Detail & Related papers (2022-11-21T15:52:06Z)
- Wireless Channel Prediction in Partially Observed Environments
Site-specific radio frequency (RF) propagation prediction increasingly relies on models built from visual data captured by cameras and LIDAR sensors.
This paper introduces a method to extract statistical channel models, given partial observations of the surrounding environment.
It is shown that the proposed method can interpolate between fully statistical models when no partial information is available and fully deterministic models when the environment is completely observed.
arXiv Detail & Related papers (2022-07-03T01:46:57Z)
- Conditioned Human Trajectory Prediction using Iterative Attention Blocks
We present a simple yet effective trajectory prediction model aimed at predicting pedestrian positions in urban-like environments.
Our model is a neural-based architecture that can run several layers of attention blocks and transformers in an iterative sequential fashion.
We show that without the explicit introduction of social masks, dynamical models, social pooling layers, or complicated graph-like structures, it is possible to produce results on par with SoTA models.
arXiv Detail & Related papers (2022-06-29T07:49:48Z)
- Learning Multi-Object Dynamics with Compositional Neural Radiance Fields
We present a method to learn compositional predictive models from image observations based on implicit object encoders, Neural Radiance Fields (NeRFs), and graph neural networks.
NeRFs have become a popular choice for representing scenes due to their strong 3D prior.
For planning, we utilize RRTs in the learned latent space, where we can exploit our model and the implicit object encoder to make sampling the latent space informative and more efficient.
arXiv Detail & Related papers (2022-02-24T01:31:29Z)
- Multi-Branch Deep Radial Basis Function Networks for Facial Emotion Recognition
We propose a CNN based architecture enhanced with multiple branches formed by radial basis function (RBF) units.
RBF units capture local patterns shared by similar instances using an intermediate representation.
We show that it is the incorporation of local information that makes the proposed model competitive.
arXiv Detail & Related papers (2021-09-07T21:05:56Z)
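The RBF units described in the entry above can be sketched with standard Gaussian RBF activations. This is a generic illustration of how such units respond to local patterns; the function name, the fixed `gamma`, and the pre-set centers stand in for parameters that would be learned in the actual architecture.

```python
import numpy as np

def rbf_branch(features, centers, gamma=1.0):
    """Gaussian RBF units applied to an intermediate representation.

    Each unit's activation measures how close a feature vector lies to
    its center, so similar instances produce similar local responses.
    `centers` (n_units, dim) and `gamma` are illustrative placeholders
    for learned parameters.
    """
    # Pairwise squared distances, shape (batch, n_units).
    d2 = ((features[:, None, :] - centers[None, :, :]) ** 2).sum(axis=-1)
    # Activation lies in (0, 1] and peaks when a feature hits a center.
    return np.exp(-gamma * d2)
```

A feature vector that coincides with a center yields an activation of exactly 1, while distant features decay toward 0, which is what lets the units encode local structure.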
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.