Visual Forecasting as a Mid-level Representation for Avoidance
- URL: http://arxiv.org/abs/2310.07724v1
- Date: Sun, 17 Sep 2023 13:32:03 GMT
- Title: Visual Forecasting as a Mid-level Representation for Avoidance
- Authors: Hsuan-Kung Yang, Tsung-Chih Chiang, Ting-Ru Liu, Chun-Wei Huang,
Jou-Min Liu, Chun-Yi Lee
- Abstract summary: The challenge of navigation in environments with dynamic objects continues to be a central issue in the study of autonomous agents.
While predictive methods hold promise, their reliance on precise state information makes them less practical for real-world implementation.
This study presents visual forecasting as an innovative alternative.
- Score: 8.712750753534532
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The challenge of navigation in environments with dynamic objects continues to
be a central issue in the study of autonomous agents. While predictive methods
hold promise, their reliance on precise state information makes them less
practical for real-world implementation. This study presents visual forecasting
as an innovative alternative. By introducing intuitive visual cues, this
approach projects the future trajectories of dynamic objects to improve agent
perception and enable anticipatory actions. Our research explores two distinct
strategies for conveying predictive information through visual forecasting: (1)
sequences of bounding boxes, and (2) augmented paths. To validate the proposed
visual forecasting strategies, we initiate evaluations in simulated
environments using the Unity engine and then extend these evaluations to
real-world scenarios to assess both practicality and effectiveness. The results
confirm the viability of visual forecasting as a promising solution for
navigation and obstacle avoidance in dynamic environments.
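As a rough illustration of the first strategy named in the abstract (sequences of bounding boxes), the Python sketch below draws forecast boxes onto a grayscale frame so that a downstream agent could see where a moving object is expected to go. The constant-velocity predictor, the function names (`forecast_boxes`, `draw_box_outline`, `overlay_forecast`), and the fading-intensity scheme are assumptions made for illustration. They are not taken from the paper and do not represent the authors' method.

```python
from typing import List, Tuple

# (x_min, y_min, x_max, y_max) in pixel coordinates; hypothetical convention.
Box = Tuple[int, int, int, int]


def forecast_boxes(prev: Box, curr: Box, steps: int) -> List[Box]:
    """Extrapolate future boxes assuming constant velocity (an assumed, simplistic predictor)."""
    delta = [c - p for p, c in zip(prev, curr)]
    return [
        tuple(c + k * d for c, d in zip(curr, delta))
        for k in range(1, steps + 1)
    ]


def draw_box_outline(image: List[List[int]], box: Box, value: int) -> None:
    """Draw a one-pixel box outline in place, clipped to the image bounds."""
    height, width = len(image), len(image[0])
    x0, y0, x1, y1 = box
    # Top and bottom edges.
    for x in range(max(x0, 0), min(x1, width - 1) + 1):
        for y in (y0, y1):
            if 0 <= y < height:
                image[y][x] = value
    # Left and right edges.
    for y in range(max(y0, 0), min(y1, height - 1) + 1):
        for x in (x0, x1):
            if 0 <= x < width:
                image[y][x] = value


def overlay_forecast(
    image: List[List[int]], prev: Box, curr: Box, steps: int = 3
) -> List[List[int]]:
    """Return a copy of the frame with forecast boxes drawn as progressively dimmer outlines."""
    out = [row[:] for row in image]
    for k, box in enumerate(forecast_boxes(prev, curr, steps), start=1):
        # Later steps are drawn dimmer to hint at growing uncertainty (assumed scheme).
        draw_box_outline(out, box, 255 - 60 * (k - 1))
    return out


# Usage: an 8x8 blank frame with an object moving one pixel to the right per frame.
frame = [[0] * 8 for _ in range(8)]
augmented = overlay_forecast(frame, prev=(0, 0, 2, 2), curr=(1, 0, 3, 2), steps=2)
```

The augmented frame, rather than precise state vectors, would then be the input to the navigation policy, which is the sense in which the forecast acts as a mid-level representation.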
Related papers
- Narrowing the Gap between Vision and Action in Navigation [28.753809306008996]
We introduce a low-level action decoder jointly trained with high-level action prediction.
Our agent can improve navigation performance metrics compared to the strong baselines on both high-level and low-level actions.
arXiv Detail & Related papers (2024-08-19T20:09:56Z)
- AdaOcc: Adaptive Forward View Transformation and Flow Modeling for 3D Occupancy and Flow Prediction [56.72301849123049]
We present our solution for the Vision-Centric 3D Occupancy and Flow Prediction track in the nuScenes Open-Occ dataset challenge at CVPR 2024.
Our innovative approach involves a dual-stage framework that enhances 3D occupancy and flow predictions by incorporating adaptive forward view transformation and flow modeling.
Our method combines regression with classification to address scale variations in different scenes, and leverages predicted flow to warp current voxel features to future frames, guided by future frame ground truth.
arXiv Detail & Related papers (2024-07-01T16:32:15Z)
- OOSTraj: Out-of-Sight Trajectory Prediction With Vision-Positioning Denoising [49.86409475232849]
Trajectory prediction is fundamental in computer vision and autonomous driving.
Existing approaches in this field often assume precise and complete observational data.
We present a novel method for out-of-sight trajectory prediction that leverages a vision-positioning technique.
arXiv Detail & Related papers (2024-04-02T18:30:29Z)
- A Reliable Representation with Bidirectional Transition Model for Visual Reinforcement Learning Generalization [39.6041403130768]
We introduce a Bidirectional Transition (BiT) model, which leverages the ability to bidirectionally predict environmental transitions both forward and backward to extract reliable representations.
Our model demonstrates competitive generalization performance and sample efficiency on two settings of the DeepMind Control suite.
arXiv Detail & Related papers (2023-12-04T14:19:36Z)
- Towards Motion Forecasting with Real-World Perception Inputs: Are End-to-End Approaches Competitive? [93.10694819127608]
We propose a unified evaluation pipeline for forecasting methods with real-world perception inputs.
Our in-depth study uncovers a substantial performance gap when transitioning from curated to perception-based data.
arXiv Detail & Related papers (2023-06-15T17:03:14Z)
- Motion-Scenario Decoupling for Rat-Aware Video Position Prediction: Strategy and Benchmark [49.58762201363483]
We introduce RatPose, a bio-robot motion prediction dataset constructed by considering the influence factors of individuals and environments.
We propose a Dual-stream Motion-Scenario Decoupling framework that effectively separates scenario-oriented and motion-oriented features.
We demonstrate significant performance improvements of the proposed DMSD framework on tasks of different difficulty levels.
arXiv Detail & Related papers (2023-05-17T14:14:31Z)
- Predictive Experience Replay for Continual Visual Control and Forecasting [62.06183102362871]
We present a new continual learning approach for visual dynamics modeling and explore its efficacy in visual control and forecasting.
We first propose the mixture world model that learns task-specific dynamics priors with a mixture of Gaussians, and then introduce a new training strategy to overcome catastrophic forgetting.
Our model remarkably outperforms the naive combinations of existing continual learning and visual RL algorithms on DeepMind Control and Meta-World benchmarks with continual visual control tasks.
arXiv Detail & Related papers (2023-03-12T05:08:03Z)
- Dynamics-Aware Spatiotemporal Occupancy Prediction in Urban Environments [37.00873004170998]
We propose a framework that integrates two capabilities together using deep network architectures.
Our method is validated on the real-world Waymo Open Dataset and demonstrates higher prediction accuracy than baseline methods.
arXiv Detail & Related papers (2022-09-27T06:12:34Z)
- Visual Sensor Pose Optimisation Using Rendering-based Visibility Models for Robust Cooperative Perception [4.5144287492490625]
Visual Sensor Networks can be used in a variety of perception applications such as infrastructure support for autonomous driving in complex road segments.
The pose of the sensors in such networks directly determines the coverage of the environment and objects therein.
This paper proposes two novel sensor pose optimisation methods, based on gradient-ascent and Integer Programming techniques.
arXiv Detail & Related papers (2021-06-09T18:02:32Z)
- Instance-Aware Predictive Navigation in Multi-Agent Environments [93.15055834395304]
We propose an Instance-Aware Predictive Control (IPC) approach, which forecasts interactions between agents as well as future scene structures.
We adopt a novel multi-instance event prediction module to estimate the possible interaction among agents in the ego-centric view.
We design a sequential action sampling strategy to better leverage predicted states on both scene-level and instance-level.
arXiv Detail & Related papers (2021-01-14T22:21:25Z)
This list is automatically generated from the titles and abstracts of the papers in this site.