PlanT: Explainable Planning Transformers via Object-Level
Representations
- URL: http://arxiv.org/abs/2210.14222v1
- Date: Tue, 25 Oct 2022 17:59:46 GMT
- Title: PlanT: Explainable Planning Transformers via Object-Level
Representations
- Authors: Katrin Renz, Kashyap Chitta, Otniel-Bogdan Mercea, A. Sophia Koepke,
Zeynep Akata, Andreas Geiger
- Abstract summary: PlanT is a novel approach for planning in the context of self-driving.
PlanT is based on imitation learning with a compact object-level input representation.
Our results indicate that PlanT can focus on the most relevant object in the scene, even when this object is geometrically distant.
- Score: 64.93938686101309
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Planning an optimal route in a complex environment requires efficient
reasoning about the surrounding scene. While human drivers prioritize important
objects and ignore details not relevant to the decision, learning-based
planners typically extract features from dense, high-dimensional grid
representations containing all vehicle and road context information. In this
paper, we propose PlanT, a novel approach for planning in the context of
self-driving that uses a standard transformer architecture. PlanT is based on
imitation learning with a compact object-level input representation. On the
Longest6 benchmark for CARLA, PlanT outperforms all prior methods (matching the
driving score of the expert) while being 5.3x faster than equivalent
pixel-based planning baselines during inference. Combining PlanT with an
off-the-shelf perception module provides a sensor-based driving system that is
more than 10 points better in terms of driving score than the existing state of
the art. Furthermore, we propose an evaluation protocol to quantify the ability
of planners to identify relevant objects, providing insights regarding their
decision-making. Our results indicate that PlanT can focus on the most relevant
object in the scene, even when this object is geometrically distant.
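To make the object-level idea concrete, here is a minimal sketch of the kind of planner the abstract describes: each vehicle and route segment becomes a small feature vector, a standard transformer encoder attends over these object tokens, and a learned planning token regresses future waypoints for imitation learning. The feature layout, model sizes, and L1 waypoint loss below are illustrative assumptions, not the authors' implementation.

```python
import torch
import torch.nn as nn

class ObjectLevelPlanner(nn.Module):
    """PlanT-style sketch: object tokens -> standard transformer -> waypoints.
    The feature layout (x, y, extent, yaw, speed, class flag) and layer sizes
    are assumptions for illustration, not the paper's exact configuration."""

    def __init__(self, obj_dim=6, d_model=128, n_heads=4, n_layers=4, n_waypoints=4):
        super().__init__()
        self.n_waypoints = n_waypoints
        self.embed = nn.Linear(obj_dim, d_model)              # one token per object
        layer = nn.TransformerEncoderLayer(d_model, n_heads, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, n_layers)
        self.plan_token = nn.Parameter(torch.zeros(1, 1, d_model))
        self.head = nn.Linear(d_model, n_waypoints * 2)       # (x, y) per waypoint

    def forward(self, objects, pad_mask=None):
        # objects: (B, N, obj_dim), one row per vehicle / route segment
        tokens = self.embed(objects)
        plan = self.plan_token.expand(tokens.size(0), -1, -1)
        x = torch.cat([plan, tokens], dim=1)
        if pad_mask is not None:                              # True marks padded slots
            pad_mask = torch.cat([torch.zeros_like(pad_mask[:, :1]), pad_mask], dim=1)
        h = self.encoder(x, src_key_padding_mask=pad_mask)
        wp = self.head(h[:, 0])                               # read out planning token
        return wp.view(-1, self.n_waypoints, 2)               # waypoints in ego frame

# Imitation learning: regress expert waypoints (illustrative L1 objective).
planner = ObjectLevelPlanner()
scene = torch.randn(8, 20, 6)                                 # 8 scenes, 20 objects each
loss = nn.functional.l1_loss(planner(scene), torch.randn(8, 4, 2))
```

A side benefit of attending over explicit object tokens is that the attention weights give a natural handle for the explainability analysis the abstract mentions, i.e., checking whether the planner focuses on the relevant objects.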
Related papers
- Stable Object Placement Planning From Contact Point Robustness [12.575068666209832]
Our planner selects contact points first and then determines a placement pose that engages the selected contact points.
Our algorithm facilitates stability-aware object placement planning, imposing no restrictions on object shape, convexity, or mass density homogeneity.
arXiv Detail & Related papers (2024-10-16T12:02:15Z)
- SparseDrive: End-to-End Autonomous Driving via Sparse Scene Representation [11.011219709863875]
We propose a new end-to-end autonomous driving paradigm named SparseDrive.
SparseDrive consists of a symmetric sparse perception module and a parallel motion planner.
For motion prediction and planning, we revisit the great similarity between these two tasks, which motivates a parallel design of the motion planner.
arXiv Detail & Related papers (2024-05-30T02:13:56Z)
- On the Road to Portability: Compressing End-to-End Motion Planner for Autonomous Driving [38.35997586629021]
End-to-end motion planning models equipped with deep neural networks have shown great potential for enabling full autonomous driving.
However, the oversized neural networks render these models impractical for deployment on resource-constrained systems, requiring excessive computation time and resources during inference.
We propose PlanKD, the first knowledge distillation framework tailored for compressing end-to-end motion planners.
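As a rough illustration of the general distillation recipe (not PlanKD's specific objective, which the paper tailors to motion planning), a small student planner can be trained to match both the output waypoints and the intermediate features of a large teacher:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def planner_distill_loss(student_wp, teacher_wp, student_feat, teacher_feat,
                         proj, alpha=0.5):
    """Generic planner distillation (illustrative; PlanKD's actual losses
    are planning-specific): imitate the teacher's waypoints and match its
    intermediate features through a learned projection."""
    wp_loss = F.l1_loss(student_wp, teacher_wp.detach())
    feat_loss = F.mse_loss(proj(student_feat), teacher_feat.detach())
    return wp_loss + alpha * feat_loss

# Example shapes: 4 waypoints; student/teacher feature widths 64 / 256.
proj = nn.Linear(64, 256)  # maps student features into the teacher's space
loss = planner_distill_loss(torch.randn(8, 4, 2), torch.randn(8, 4, 2),
                            torch.randn(8, 64), torch.randn(8, 256), proj)
```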
arXiv Detail & Related papers (2024-03-02T15:47:42Z)
- PAS-SLAM: A Visual SLAM System for Planar Ambiguous Scenes [41.47703182059505]
We propose a visual SLAM system based on planar features designed for planar ambiguous scenes.
We present an integrated data association strategy that combines plane parameters, semantic information, projection IoU, and non-parametric tests.
Finally, we design a set of multi-constraint factor graphs for camera pose optimization.
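As a toy illustration of such an integrated association score (the weights, threshold, and box-based IoU below are assumptions, and the paper's non-parametric tests are omitted), a candidate plane observation can be matched to the landmark maximizing a weighted combination of the cues:

```python
import numpy as np

def box_iou(a, b):
    """IoU of two axis-aligned boxes (x0, y0, x1, y1); a simple stand-in
    for projection IoU between plane regions."""
    x0, y0 = max(a[0], b[0]), max(a[1], b[1])
    x1, y1 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, x1 - x0) * max(0.0, y1 - y0)
    union = (a[2]-a[0])*(a[3]-a[1]) + (b[2]-b[0])*(b[3]-b[1]) - inter
    return inter / union if union > 0 else 0.0

def associate_plane(obs, landmarks, w=(0.5, 0.2, 0.3), iou_min=0.3):
    """Combine plane-parameter similarity, semantic agreement, and
    projection IoU into one score (weights/threshold are assumptions)."""
    best, best_score = None, -np.inf
    for lm in landmarks:
        param_sim = float(np.exp(-np.linalg.norm(obs["plane"] - lm["plane"])))
        sem_sim = 1.0 if obs["label"] == lm["label"] else 0.0
        iou = box_iou(obs["box"], lm["box"])
        if iou < iou_min:
            continue  # reject geometrically implausible matches early
        score = w[0]*param_sim + w[1]*sem_sim + w[2]*iou
        if score > best_score:
            best, best_score = lm, score
    return best
```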
arXiv Detail & Related papers (2024-02-09T01:34:26Z)
- Learning adaptive planning representations with natural language guidance [90.24449752926866]
This paper describes Ada, a framework for automatically constructing task-specific planning representations.
Ada interactively learns a library of planner-compatible high-level action abstractions and low-level controllers adapted to a particular domain of planning tasks.
arXiv Detail & Related papers (2023-12-13T23:35:31Z)
- Embodied Task Planning with Large Language Models [86.63533340293361]
We propose a TAsk Planning Agent (TaPA) for grounded planning in embodied tasks under physical scene constraints.
During inference, we discover the objects in the scene by extending open-vocabulary object detectors to multi-view RGB images collected in different achievable locations.
Experimental results show that the generated plan from our TaPA framework can achieve higher success rate than LLaVA and GPT-3.5 by a sizable margin.
arXiv Detail & Related papers (2023-07-04T17:58:25Z)
- Planning Irregular Object Packing via Hierarchical Reinforcement Learning [85.64313062912491]
We propose a deep hierarchical reinforcement learning approach to plan packing sequence and placement for irregular objects.
We show that our approach can pack more objects with less time cost than the state-of-the-art packing methods of irregular objects.
arXiv Detail & Related papers (2022-11-17T07:16:37Z)
- Online Grounding of PDDL Domains by Acting and Sensing in Unknown Environments [62.11612385360421]
This paper proposes a framework that allows an agent to plan and perform different tasks in initially unknown environments.
We integrate machine learning models to abstract the sensory data, symbolic planning for goal achievement and path planning for navigation.
We evaluate the proposed method in accurate simulated environments, where the sensors are an on-board RGB-D camera, GPS, and compass.
arXiv Detail & Related papers (2021-12-18T21:48:20Z)
- Differentiable Spatial Planning using Transformers [87.90709874369192]
We propose Spatial Planning Transformers (SPT), which given an obstacle map learns to generate actions by planning over long-range spatial dependencies.
In the setting where the ground truth map is not known to the agent, we leverage pre-trained SPTs in an end-to-end framework.
SPTs outperform prior state-of-the-art differentiable planners across all the setups for both manipulation and navigation tasks.
arXiv Detail & Related papers (2021-12-02T06:48:16Z)
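A minimal sketch of the idea as summarized above (the per-cell features, depth, and action head are assumptions): flatten the obstacle and goal maps into per-cell tokens so that self-attention can propagate long-range spatial dependencies, then predict an action for every cell.

```python
import torch
import torch.nn as nn

class SpatialPlanningTransformer(nn.Module):
    """SPT-style sketch: map cells become tokens, self-attention captures
    long-range spatial structure, and a head emits per-cell action logits.
    Sizes and heads are illustrative assumptions."""

    def __init__(self, map_size=15, d_model=64, n_heads=4, n_layers=5, n_actions=4):
        super().__init__()
        self.proj = nn.Linear(2, d_model)  # per cell: (obstacle flag, goal flag)
        self.pos = nn.Parameter(torch.randn(1, map_size * map_size, d_model))
        layer = nn.TransformerEncoderLayer(d_model, n_heads, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, n_layers)
        self.head = nn.Linear(d_model, n_actions)  # action logits per cell

    def forward(self, obstacle_map, goal_map):
        # obstacle_map, goal_map: (B, H, W) binary grids
        b = obstacle_map.size(0)
        cells = torch.stack([obstacle_map, goal_map], dim=-1).view(b, -1, 2)
        h = self.encoder(self.proj(cells.float()) + self.pos)
        return self.head(h)  # (B, H*W, n_actions): an action for every cell

planner = SpatialPlanningTransformer()
logits = planner(torch.randint(0, 2, (2, 15, 15)), torch.zeros(2, 15, 15))
```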