Path Planning based on 2D Object Bounding-box
- URL: http://arxiv.org/abs/2402.14933v1
- Date: Thu, 22 Feb 2024 19:34:56 GMT
- Title: Path Planning based on 2D Object Bounding-box
- Authors: Yanliang Huang, Liguo Zhou, Chang Liu, Alois Knoll
- Abstract summary: We present a path planning method that utilizes 2D bounding boxes of objects, developed through imitation learning in urban driving scenarios.
This is achieved by integrating high-definition (HD) map data with images captured by surrounding cameras.
We evaluated our model on the nuPlan planning task and observed that it performs competitively compared to existing vision-centric methods.
- License: http://creativecommons.org/licenses/by-sa/4.0/
- Abstract: The implementation of Autonomous Driving (AD) technologies within urban
environments presents significant challenges. These challenges necessitate the
development of advanced perception systems and motion planning algorithms
capable of managing situations of considerable complexity. Although the
end-to-end AD method utilizing LiDAR sensors has achieved significant success
in this scenario, we argue that its drawbacks may hinder its practical
application. Instead, we propose the vision-centric AD as a promising
alternative offering a streamlined model without compromising performance. In
this study, we present a path planning method that utilizes 2D bounding boxes
of objects, developed through imitation learning in urban driving scenarios.
This is achieved by integrating high-definition (HD) map data with images
captured by surrounding cameras. Subsequent perception tasks involve
bounding-box detection and tracking, while the planning phase employs both
local embeddings via Graph Neural Network (GNN) and global embeddings via
Transformer for temporal-spatial feature aggregation, ultimately producing
optimal path planning information. We evaluated our model on the nuPlan
planning task and observed that it performs competitively in comparison to
existing vision-centric methods.
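The planning pipeline described in the abstract (per-object features from 2D bounding boxes, locally aggregated by a GNN, then globally fused with Transformer-style attention) can be sketched minimally. The sketch below is an illustrative assumption, not the paper's actual architecture: the feature sizes, the fully connected object graph, the single mean-aggregation GNN layer, and the single-head attention are all stand-ins for the learned components.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical inputs: N detected objects in one frame, each a 2D
# bounding box (cx, cy, w, h) in ego-vehicle image coordinates.
N, D = 6, 16
boxes = rng.normal(size=(N, 4))

# Per-object embedding (stand-in for a learned box encoder).
W_embed = rng.normal(size=(4, D))
feats = np.tanh(boxes @ W_embed)                      # (N, D)

# Local embedding: one round of mean-aggregation message passing
# over a fully connected object graph (a toy GNN layer).
adj = np.ones((N, N)) - np.eye(N)                     # neighbor mask
msgs = (adj @ feats) / adj.sum(axis=1, keepdims=True)
W_local = rng.normal(size=(2 * D, D))
local = np.tanh(np.concatenate([feats, msgs], axis=1) @ W_local)

# Global embedding: single-head scaled dot-product self-attention
# (a toy Transformer layer) over the locally aggregated features.
Wq, Wk, Wv = (rng.normal(size=(D, D)) for _ in range(3))
q, k, v = local @ Wq, local @ Wk, local @ Wv
scores = q @ k.T / np.sqrt(D)
attn = np.exp(scores - scores.max(axis=1, keepdims=True))
attn /= attn.sum(axis=1, keepdims=True)               # softmax rows
global_feats = attn @ v                               # (N, D)

# Pool into a scene embedding from which a decoder would produce
# the path-planning output (e.g. future waypoints).
scene = global_feats.mean(axis=0)                     # (D,)
print(scene.shape)
```

In the real system these weights are trained end-to-end via imitation learning on nuPlan; the sketch only shows how local (graph) and global (attention) aggregation compose.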
Related papers
- OccNeRF: Advancing 3D Occupancy Prediction in LiDAR-Free Environments [77.0399450848749]
We propose an OccNeRF method for training occupancy networks without 3D supervision.
We parameterize the reconstructed occupancy fields and reorganize the sampling strategy to align with the cameras' infinite perceptive range.
For semantic occupancy prediction, we design several strategies to polish the prompts and filter the outputs of a pretrained open-vocabulary 2D segmentation model.
arXiv Detail & Related papers (2023-12-14T18:58:52Z)
- Planning as In-Painting: A Diffusion-Based Embodied Task Planning Framework for Environments under Uncertainty [56.30846158280031]
Task planning for embodied AI has been one of the most challenging problems.
We propose a task-agnostic method named 'planning as in-painting'.
The proposed framework achieves promising performance across various embodied AI tasks.
arXiv Detail & Related papers (2023-12-02T10:07:17Z)
- ST-P3: End-to-end Vision-based Autonomous Driving via Spatial-Temporal Feature Learning [132.20119288212376]
We propose a spatial-temporal feature learning scheme towards a set of more representative features for perception, prediction and planning tasks simultaneously.
To the best of our knowledge, we are the first to systematically investigate each part of an interpretable end-to-end vision-based autonomous driving system.
arXiv Detail & Related papers (2022-07-15T16:57:43Z)
- Uncertainty-driven Planner for Exploration and Navigation [36.933903274373336]
We consider the problems of exploration and point-goal navigation in previously unseen environments.
We argue that learning occupancy priors over indoor maps provides significant advantages towards addressing these problems.
We present a novel planning framework that first learns to generate occupancy maps beyond the field-of-view of the agent.
arXiv Detail & Related papers (2022-02-24T05:25:31Z)
- Trajectory-Constrained Deep Latent Visual Attention for Improved Local Planning in Presence of Heterogeneous Terrain [35.12388111707609]
We present a reward-predictive, model-based deep learning method featuring trajectory-constrained visual attention for use in mapless, local visual navigation tasks.
Our method learns to place visual attention at locations in latent image space which follow trajectories caused by vehicle control actions to enhance predictive accuracy during planning.
We validated our model on visual navigation tasks: planning low-turbulence, collision-free trajectories in off-road settings, and hill climbing with locking differentials in the presence of slippery terrain.
arXiv Detail & Related papers (2021-12-09T03:38:28Z)
- Neural Motion Planning for Autonomous Parking [6.1805402105389895]
This paper presents a hybrid motion planning strategy that combines a deep generative network with a conventional motion planning method.
The proposed method effectively learns the representations of a given state, and shows improvement in terms of algorithm performance.
arXiv Detail & Related papers (2021-11-12T14:29:38Z)
- Temporal Predictive Coding For Model-Based Planning In Latent Space [80.99554006174093]
We present an information-theoretic approach that employs temporal predictive coding to encode elements in the environment that can be predicted across time.
We evaluate our model on a challenging modification of standard DMControl tasks where the background is replaced with natural videos that contain complex but irrelevant information to the planning task.
arXiv Detail & Related papers (2021-06-14T04:31:15Z)
- End-to-end Interpretable Neural Motion Planner [78.69295676456085]
We propose a neural motion planner (NMP) for learning to drive autonomously in complex urban scenarios.
We design a holistic model that takes as input raw LiDAR data and an HD map and produces interpretable intermediate representations.
We demonstrate the effectiveness of our approach in real-world driving data captured in several cities in North America.
arXiv Detail & Related papers (2021-01-17T14:16:12Z)
- Latent Space Roadmap for Visual Action Planning of Deformable and Rigid Object Manipulation [74.88956115580388]
Planning is performed in a low-dimensional latent state space that embeds images.
Our framework consists of two main components: a Visual Foresight Module (VFM) that generates a visual plan as a sequence of images, and an Action Proposal Network (APN) that predicts the actions between them.
arXiv Detail & Related papers (2020-03-19T18:43:26Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this list (including all information) and is not responsible for any consequences of its use.