Exploring Semantic Clustering and Similarity Search for Heterogeneous Traffic Scenario Graph
- URL: http://arxiv.org/abs/2507.05086v1
- Date: Mon, 07 Jul 2025 15:10:03 GMT
- Title: Exploring Semantic Clustering and Similarity Search for Heterogeneous Traffic Scenario Graph
- Authors: Ferdinand Mütsch, Maximilian Zipfl, Nikolai Polley, J. Marius Zöllner
- Abstract summary: We first propose an expressive and flexible heterogeneous, temporal graph model for representing traffic scenarios. We then propose a self-supervised method to learn a universal embedding space for scenario graphs. In particular, we implement contrastive learning alongside a bootstrapping-based approach and evaluate their suitability for the scenario space.
- Score: 41.2584175136191
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Scenario-based testing is an indispensable instrument for the comprehensive validation and verification of automated vehicles (AVs). However, finding a manageable and finite, yet representative subset of scenarios in a scalable, possibly unsupervised manner is notoriously challenging. Our work is meant to constitute a cornerstone to facilitate sample-efficient testing, while still capturing the diversity of relevant operational design domains (ODDs) and accounting for the "long tail" phenomenon in particular. To this end, we first propose an expressive and flexible heterogeneous, spatio-temporal graph model for representing traffic scenarios. Leveraging recent advances of graph neural networks (GNNs), we then propose a self-supervised method to learn a universal embedding space for scenario graphs that enables clustering and similarity search. In particular, we implement contrastive learning alongside a bootstrapping-based approach and evaluate their suitability for partitioning the scenario space. Experiments on the nuPlan dataset confirm the model's ability to capture semantics and thus group related scenarios in a meaningful way despite the absence of discrete class labels. Different scenario types materialize as distinct clusters. Our results demonstrate how variable-length traffic scenarios can be condensed into single vector representations that enable nearest-neighbor retrieval of representative candidates for distinct scenario categories. Notably, this is achieved without manual labeling or bias towards an explicit objective such as criticality. Ultimately, our approach can serve as a basis for scalable selection of scenarios to further enhance the efficiency and robustness of testing AVs in simulation.
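The abstract describes condensing each scenario into a single vector and retrieving representative candidates by nearest-neighbor search. The following is a minimal sketch of that retrieval step only, assuming cosine similarity as the distance measure and using made-up three-dimensional embeddings; the scenario names, vector values, and the `nearest_scenarios` helper are hypothetical and are not taken from the paper, which learns its embeddings with a GNN.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def nearest_scenarios(query, embeddings, k=2):
    """Return the ids of the k stored embeddings most similar to the query."""
    ranked = sorted(
        embeddings,
        key=lambda sid: cosine_similarity(query, embeddings[sid]),
        reverse=True,
    )
    return ranked[:k]


# Hypothetical embeddings; a real system would obtain these from a trained encoder.
scenario_embeddings = {
    "lane_change_a": [0.9, 0.1, 0.0],
    "lane_change_b": [0.8, 0.2, 0.1],
    "unprotected_turn": [0.1, 0.9, 0.2],
    "pedestrian_crossing": [0.0, 0.2, 0.9],
}

query = [1.0, 0.0, 0.0]  # embedding of the scenario we want neighbors for
top = nearest_scenarios(query, scenario_embeddings, k=2)
print(top)
```

With these toy values, the two lane-change vectors rank first because they point in nearly the same direction as the query. A brute-force scan like this is only practical for small collections; larger scenario databases would typically use an approximate nearest-neighbor index instead.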
Related papers
- Towards Predicting Any Human Trajectory In Context [10.332817296500533]
We introduce TrajICL, an In-Context Learning framework for pedestrian trajectory prediction.
TrajICL enables rapid adaptation without fine-tuning on scenario-specific data.
We train our model on a large-scale synthetic dataset to enhance its prediction ability.
arXiv Detail & Related papers (2025-06-01T07:18:47Z)
- Downstream-Pretext Domain Knowledge Traceback for Active Learning [138.02530777915362]
We propose a downstream-pretext domain knowledge traceback (DOKT) method that traces the data interactions of downstream knowledge and pre-training guidance.
DOKT consists of a traceback diversity indicator and a domain-based uncertainty estimator.
Experiments conducted on ten datasets show that our model outperforms other state-of-the-art methods.
arXiv Detail & Related papers (2024-07-20T01:34:13Z)
- Neural Clustering based Visual Representation Learning [61.72646814537163]
Clustering is one of the most classic approaches in machine learning and data analysis.
We propose feature extraction with clustering (FEC), which views feature extraction as a process of selecting representatives from data.
FEC alternates between grouping pixels into individual clusters to abstract representatives and updating the deep features of pixels with current representatives.
arXiv Detail & Related papers (2024-03-26T06:04:50Z)
- VistaScenario: Interaction Scenario Engineering for Vehicles with Intelligent Systems for Transport Automation [18.897103921181255]
We propose the VistaScenario framework to conduct scenario engineering for vehicles with intelligent systems for transport automation.
Based on summarized basic types of vehicle interactions, we slice the scenario data stream into segments via a scenario evolution tree.
We also propose the scenario metric Graph-DTW, based on Graph Tree and Dynamic Time Warping, to conduct scenario comparison and labeling.
arXiv Detail & Related papers (2024-02-12T15:34:04Z)
- Graph Convolutional Networks for Complex Traffic Scenario Classification [0.7919810878571297]
A scenario-based testing approach can reduce the time required to obtain statistically significant evidence of the safety of Automated Driving Systems.
Most methods on scenario classification do not work for complex scenarios with diverse environments.
We propose a method for complex traffic scenario classification that is able to model the interaction of a vehicle with the environment.
arXiv Detail & Related papers (2023-10-26T20:51:24Z)
- Traffic Scene Similarity: a Graph-based Contrastive Learning Approach [4.451479907610764]
We propose an extension to a contrastive learning approach utilizing graphs to construct a meaningful embedding space.
Our approach demonstrates the continuous mapping of scenes using scene-specific features and the formation of thematically similar clusters.
Based on the found clusters, similar scenes could be identified in the subsequent test process, which can lead to a reduction in redundant test runs.
arXiv Detail & Related papers (2023-09-18T12:35:08Z)
- Deep Incomplete Multi-view Clustering with Cross-view Partial Sample and Prototype Alignment [50.82982601256481]
We propose a Cross-view Partial Sample and Prototype Alignment Network (CPSPAN) for Deep Incomplete Multi-view Clustering.
Unlike existing contrastive-based methods, we adopt pair-observed data alignment as 'proxy supervised signals' to guide instance-to-instance correspondence construction.
arXiv Detail & Related papers (2023-03-28T02:31:57Z)
- Toward Unsupervised Test Scenario Extraction for Automated Driving Systems from Urban Naturalistic Road Traffic Data [0.0]
The presented method deploys an unsupervised machine learning pipeline to extract scenarios from road traffic data.
It is evaluated for naturalistic road traffic data at urban intersections from the inD and the Silicon Valley Intersections datasets.
Using hierarchical clustering, the results show a jump in overall accuracy of around 20% when moving from 4 to 5 clusters, and a saturation effect starting at 41 clusters with an overall accuracy of 84%.
arXiv Detail & Related papers (2022-02-14T10:55:14Z)
- Reliable Shot Identification for Complex Event Detection via Visual-Semantic Embedding [72.9370352430965]
We propose a visual-semantic guided loss method for event detection in videos.
Motivated by curriculum learning, we introduce a negative elastic regularization term to start training the classifier with instances of high reliability.
An alternative optimization algorithm is developed to solve the proposed challenging non-convex regularization problem.
arXiv Detail & Related papers (2021-10-12T11:46:56Z)
- Traffic Scenario Clustering by Iterative Optimisation of Self-Supervised Networks Using a Random Forest Activation Pattern Similarity [0.9711326718689492]
This work introduces a clustering technique based on a novel data-adaptive similarity measure, called Random Forest Activation Pattern (RFAP) similarity.
The RFAP similarity is generated using a tree encoding scheme in a Random Forest algorithm.
The proposed clustering method takes into account that labelled scenarios are available and that information from them can help guide the clustering of unlabelled scenarios.
arXiv Detail & Related papers (2021-05-17T06:54:59Z)
- Attentional Prototype Inference for Few-Shot Segmentation [128.45753577331422]
We propose attentional prototype inference (API), a probabilistic latent variable framework for few-shot segmentation.
We define a global latent variable to represent the prototype of each object category, which we model as a probabilistic distribution.
We conduct extensive experiments on four benchmarks, where our proposal obtains at least competitive and often better performance than state-of-the-art prototype-based methods.
arXiv Detail & Related papers (2021-05-14T06:58:44Z)
This list is automatically generated from the titles and abstracts of the papers on this site.