Data-efficient Trajectory Prediction via Coreset Selection
- URL: http://arxiv.org/abs/2409.17385v1
- Date: Wed, 25 Sep 2024 22:00:11 GMT
- Title: Data-efficient Trajectory Prediction via Coreset Selection
- Authors: Ruining Yang and Lili Su
- Abstract summary: Training trajectory prediction models is challenging in two ways.
Easy-medium driving scenarios often overwhelmingly dominate the dataset.
We propose a novel data-efficient training method based on coreset selection.
- Score: 4.682090083225856
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Modern vehicles are equipped with multiple information-collection devices
such as sensors and cameras, continuously generating a large volume of raw
data. Accurately predicting the trajectories of neighboring vehicles is a vital
component in understanding the complex driving environment. Yet, training
trajectory prediction models is challenging in two ways. Processing the
large-scale data is computation-intensive. Moreover, easy-medium driving
scenarios often overwhelmingly dominate the dataset, leaving challenging
driving scenarios such as dense traffic under-represented. For example, in the
Argoverse motion prediction dataset, there are very few instances with $\ge 50$
agents, while scenarios with $10 \thicksim 20$ agents are far more common. In
this paper, to mitigate data redundancy in the over-represented driving
scenarios and to reduce the bias rooted in the data scarcity of complex ones,
we propose a novel data-efficient training method based on coreset selection.
This method strategically selects a small but representative subset of data
while balancing the proportions of different scenario difficulties. To the best
of our knowledge, we are the first to introduce a method capable of effectively
condensing large-scale trajectory datasets while achieving a state-of-the-art
compression ratio. Notably, even when using only 50% of the Argoverse dataset,
the model can be trained with little to no decline in performance. Moreover,
the selected coreset maintains excellent generalization ability.
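
The abstract's idea of keeping a small subset while balancing the proportions of scenario difficulties can be illustrated with a simple stratified sampler. This is only a minimal sketch under stated assumptions, not the authors' coreset selection method: it uses the number of agents per scenario as a stand-in for difficulty (following the Argoverse example in the abstract), and the bin edges, the function names `difficulty_bin` and `balanced_subset`, and the even split of the budget across bins are all hypothetical choices made for illustration.

```python
import random
from collections import defaultdict


def difficulty_bin(num_agents, edges=(10, 20, 50)):
    """Map an agent count to a bin index. The edges are illustrative assumptions."""
    return sum(num_agents >= e for e in edges)


def balanced_subset(agent_counts, keep_fraction=0.5, seed=0):
    """Select about keep_fraction of the scenarios, spreading the budget over bins.

    Bins smaller than their share are kept whole, and the unused budget is
    passed on to the larger bins (assumes 0 <= keep_fraction <= 1).
    """
    rng = random.Random(seed)
    bins = defaultdict(list)
    for idx, n in enumerate(agent_counts):
        bins[difficulty_bin(n)].append(idx)

    budget = round(keep_fraction * len(agent_counts))
    selected = []
    # Visit the smallest bins first so leftover budget flows to larger bins.
    ordered = sorted(bins.values(), key=len)
    for i, members in enumerate(ordered):
        share = (budget - len(selected)) // (len(ordered) - i)
        take = min(share, len(members))
        selected.extend(rng.sample(members, take))
    return sorted(selected)


if __name__ == "__main__":
    # Toy data: many easy scenarios, few dense ones (hypothetical counts).
    counts = [12] * 80 + [30] * 15 + [60] * 5
    subset = balanced_subset(counts, keep_fraction=0.5)
    print(len(subset), "of", len(counts), "scenarios kept")
```

In the toy data, the rare dense-traffic scenarios all survive the 50% cut, while the over-represented easy scenarios absorb most of the reduction. A real coreset method would also score how representative each sample is, which this sketch does not attempt.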
Related papers
- A CLIP-Powered Framework for Robust and Generalizable Data Selection [51.46695086779598]
Real-world datasets often contain redundant and noisy data, imposing a negative impact on training efficiency and model performance.
Data selection has shown promise in identifying the most representative samples from the entire dataset.
We propose a novel CLIP-powered data selection framework that leverages multimodal information for more robust and generalizable sample selection.
arXiv Detail & Related papers (2024-10-15T03:00:58Z)
- TrACT: A Training Dynamics Aware Contrastive Learning Framework for Long-tail Trajectory Prediction [7.3292387742640415]
We propose to incorporate richer training dynamics information into a prototypical contrastive learning framework.
We conduct empirical evaluations of our approach using two large-scale naturalistic datasets.
arXiv Detail & Related papers (2024-04-18T23:12:46Z)
- Distil the informative essence of loop detector data set: Is network-level traffic forecasting hungry for more data? [0.8002196839441036]
We propose an uncertainty-aware traffic forecasting framework to explore how many samples of loop data are truly effective for training forecasting models.
The proposed methodology proves valuable in evaluating large traffic datasets' true information content.
arXiv Detail & Related papers (2023-10-31T11:23:10Z)
- Pre-training on Synthetic Driving Data for Trajectory Prediction [61.520225216107306]
We propose a pipeline-level solution to mitigate the issue of data scarcity in trajectory forecasting.
We adopt HD map augmentation and trajectory synthesis for generating driving data, and then we learn representations by pre-training on them.
We conduct extensive experiments to demonstrate the effectiveness of our data expansion and pre-training strategies.
arXiv Detail & Related papers (2023-09-18T19:49:22Z)
- Large-scale Fully-Unsupervised Re-Identification [78.47108158030213]
We propose two strategies to learn from large-scale unlabeled data.
The first strategy performs a local neighborhood sampling to reduce the dataset size in each iteration without violating neighborhood relationships.
A second strategy leverages a novel Re-Ranking technique, which has a lower upper bound on time complexity and reduces the memory complexity from $O(n^2)$ to $O(kn)$ with $k \ll n$.
arXiv Detail & Related papers (2023-07-26T16:19:19Z)
- Improved Distribution Matching for Dataset Condensation [91.55972945798531]
We propose a novel dataset condensation method based on distribution matching.
Our simple yet effective method outperforms most previous optimization-oriented methods with much fewer computational resources.
arXiv Detail & Related papers (2023-07-19T04:07:33Z)
- Large Scale Autonomous Driving Scenarios Clustering with Self-supervised Feature Extraction [6.804209932400134]
This article proposes a comprehensive data clustering framework for a large set of vehicle driving data.
Our approach thoroughly considers the traffic elements, including both in-traffic agent objects and map information.
With the newly designed driving data clustering evaluation metrics based on data-augmentation, the accuracy assessment does not require a human-labeled data-set.
arXiv Detail & Related papers (2021-03-30T06:22:40Z)
- Injecting Knowledge in Data-driven Vehicle Trajectory Predictors [82.91398970736391]
Vehicle trajectory prediction tasks have been commonly tackled from two perspectives: knowledge-driven or data-driven.
In this paper, we propose to learn a "Realistic Residual Block" (RRB) which effectively connects these two perspectives.
Our proposed method outputs realistic predictions by confining the residual range and taking into account its uncertainty.
arXiv Detail & Related papers (2021-03-08T16:03:09Z)
- SMART: Simultaneous Multi-Agent Recurrent Trajectory Prediction [72.37440317774556]
We propose advances that address two key challenges in future trajectory prediction: multimodality in both training data and predictions, and constant-time inference regardless of the number of agents.
arXiv Detail & Related papers (2020-07-26T08:17:10Z)
This list is automatically generated from the titles and abstracts of the papers in this site.