RelTopo: Enhancing Relational Modeling for Driving Scene Topology Reasoning
- URL: http://arxiv.org/abs/2506.13553v1
- Date: Mon, 16 Jun 2025 14:40:28 GMT
- Title: RelTopo: Enhancing Relational Modeling for Driving Scene Topology Reasoning
- Authors: Yueru Luo, Changqing Zhou, Yiming Yang, Erlong Li, Chao Zheng, Shuqi Mei, Shuguang Cui, Zhen Li
- Abstract summary: Road topology reasoning is critical for autonomous driving, enabling effective navigation and adherence to traffic regulations. Existing methods typically focus on either lane detection or Lane-to-Lane (L2L) topology reasoning, often neglecting Lane-to-Traffic-element (L2T) relationships or failing to optimize these tasks jointly. We argue that relational modeling is beneficial for both perception and reasoning, as humans naturally leverage contextual relationships for road element recognition and their connectivity inference.
- Score: 55.21557415676928
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Accurate road topology reasoning is critical for autonomous driving, enabling effective navigation and adherence to traffic regulations. Central to this task are lane perception and topology reasoning. However, existing methods typically focus on either lane detection or Lane-to-Lane (L2L) topology reasoning, often neglecting Lane-to-Traffic-element (L2T) relationships or failing to optimize these tasks jointly. Furthermore, most approaches either overlook relational modeling or apply it in a limited scope, despite the inherent spatial relationships among road elements. We argue that relational modeling is beneficial for both perception and reasoning, as humans naturally leverage contextual relationships for road element recognition and their connectivity inference. To this end, we introduce relational modeling into both perception and reasoning, jointly enhancing structural understanding. Specifically, we propose: 1) a relation-aware lane detector, where our geometry-biased self-attention and \curve\ cross-attention refine lane representations by capturing relational dependencies; 2) relation-enhanced topology heads, including a geometry-enhanced L2L head and a cross-view L2T head, boosting reasoning with relational cues; and 3) a contrastive learning strategy with InfoNCE loss to regularize relationship embeddings. Extensive experiments on OpenLane-V2 demonstrate that our approach significantly improves both detection and topology reasoning metrics, achieving +3.1 in DET$_l$, +5.3 in TOP$_{ll}$, +4.9 in TOP$_{lt}$, and an overall +4.4 in OLS, setting a new state-of-the-art. Code will be released.
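The abstract names an InfoNCE loss as the regularizer for relationship embeddings but does not say how pairs are formed. The following is a minimal sketch of the generic InfoNCE objective only, not the authors' implementation. The function names, the cosine similarity, the temperature value, and the choice of which embeddings count as positive or negative are all assumptions made for illustration.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity of two non-zero vectors given as lists of floats."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def info_nce_loss(anchor, positive, negatives, temperature=0.1):
    """Generic InfoNCE: -log(exp(s_pos/t) / (exp(s_pos/t) + sum_k exp(s_neg_k/t))).

    Hypothetical helper: how anchors, positives, and negatives are chosen
    (e.g. connected vs. unconnected lane pairs) is an assumption, not
    something stated in the abstract.
    """
    logits = [cosine_similarity(anchor, positive) / temperature]
    logits += [cosine_similarity(anchor, neg) / temperature for neg in negatives]
    # Log-sum-exp with the maximum subtracted for numerical stability.
    max_logit = max(logits)
    log_sum_exp = max_logit + math.log(sum(math.exp(v - max_logit) for v in logits))
    # Cross-entropy with the positive at index 0.
    return log_sum_exp - logits[0]


# Toy example: the anchor is close to its positive and far from both negatives.
example_loss = info_nce_loss(
    anchor=[1.0, 0.0],
    positive=[0.9, 0.1],
    negatives=[[0.0, 1.0], [-1.0, 0.0]],
)
```

The loss is small when the anchor is much more similar to its positive than to any negative, and it grows when a negative is the closer match. That pressure is what pulls related embeddings together and pushes unrelated ones apart.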
Related papers
- Reusing Attention for One-stage Lane Topology Understanding [32.464423838732635]
We propose a one-stage architecture that simultaneously predicts traffic elements, lane centerlines and topology relationship. Our key innovation lies in reusing intermediate attention resources within distinct transformer decoders. Our approach outperforms baseline methods in both accuracy and efficiency.
arXiv Detail & Related papers (2025-07-23T15:48:16Z)
- TopoStreamer: Temporal Lane Segment Topology Reasoning in Autonomous Driving [52.25176274203747]
TopoStreamer is an end-to-end temporal perception model for lane segment topology reasoning. TopoStreamer introduces three key improvements: streaming attribute constraints, dynamic lane boundary positional encoding, and lane segment denoising. On the OpenLane-V2 dataset, TopoStreamer demonstrates significant improvements over state-of-the-art methods.
arXiv Detail & Related papers (2025-07-01T12:10:46Z)
- T2SG: Traffic Topology Scene Graph for Topology Reasoning in Autonomous Driving [26.038699227233227]
Traffic Topology Scene Graph is a unified scene graph explicitly modeling the lane, controlled and guided by different road signals. For the generation of T2SG, we propose TopoFormer, a novel one-stage Topology Scene Graph TransFormer with two newly designed layers.
arXiv Detail & Related papers (2024-11-28T03:55:50Z)
- TopoLogic: An Interpretable Pipeline for Lane Topology Reasoning on Driving Scenes [27.930213859199473]
We propose an interpretable method for lane topology reasoning based on lane geometric distance and lane query similarity.
Our approach significantly outperforms the existing state-of-the-art methods on the mainstream benchmark OpenLane-V2.
Our proposed geometric distance topology reasoning method can be incorporated into well-trained models without re-training.
arXiv Detail & Related papers (2024-05-23T16:15:17Z)
- TopoMLP: A Simple yet Strong Pipeline for Driving Topology Reasoning [51.29906807247014]
Topology reasoning aims to understand road scenes and present drivable routes in autonomous driving.
It requires detecting road centerlines (lane) and traffic elements, further reasoning their topology relationship, i.e., lane-lane topology, and lane-traffic topology.
We introduce a powerful 3D lane detector and an improved 2D traffic element detector to extend the upper limit of topology performance.
arXiv Detail & Related papers (2023-10-10T16:24:51Z)
- Separated RoadTopoFormer [13.304343390479191]
Separated RoadTopoFormer is an end-to-end framework that detects lane centerline and traffic elements with reasoning relationships among them.
Our final submission achieves 0.445 OLS, which is competitive in both sub-task and combined scores.
arXiv Detail & Related papers (2023-07-04T08:21:39Z)
- OpenLane-V2: A Topology Reasoning Benchmark for Unified 3D HD Mapping [84.65114565766596]
We present OpenLane-V2, the first dataset on topology reasoning for traffic scene structure.
OpenLane-V2 consists of 2,000 annotated road scenes that describe traffic elements and their correlation to the lanes.
We evaluate various state-of-the-art methods, and present their quantitative and qualitative results on OpenLane-V2 to indicate future avenues for investigating topology reasoning in traffic scenes.
arXiv Detail & Related papers (2023-04-20T16:31:22Z)
- Graph-based Topology Reasoning for Driving Scenes [102.35885039110057]
We present TopoNet, the first end-to-end framework capable of abstracting traffic knowledge beyond conventional perception tasks.
We evaluate TopoNet on the challenging scene understanding benchmark, OpenLane-V2.
arXiv Detail & Related papers (2023-04-11T15:23:29Z)
- Cascaded Human-Object Interaction Recognition [175.60439054047043]
We introduce a cascade architecture for a multi-stage, coarse-to-fine HOI understanding.
At each stage, an instance localization network progressively refines HOI proposals and feeds them into an interaction recognition network.
With our carefully-designed human-centric relation features, these two modules work collaboratively towards effective interaction understanding.
arXiv Detail & Related papers (2020-03-09T17:05:04Z)
This list is automatically generated from the titles and abstracts of the papers in this site.