Diverse Controllable Diffusion Policy with Signal Temporal Logic
- URL: http://arxiv.org/abs/2503.02924v1
- Date: Tue, 04 Mar 2025 18:59:00 GMT
- Title: Diverse Controllable Diffusion Policy with Signal Temporal Logic
- Authors: Yue Meng, Chuchu fan,
- Abstract summary: We leverage Signal Temporal Logic (STL) and Diffusion Models to learn controllable, diverse, and rule-aware policy.<n>In the closed-loop testing, our approach reaches the highest diversity, rule satisfaction rate, and the least collision rate.<n>A case study on human-robot encounter scenarios shows our approach can generate diverse and closed-to-oracle trajectories.
- Score: 13.154661571539577
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Generating realistic simulations is critical for autonomous system applications such as self-driving and human-robot interactions. However, driving simulators nowadays still have difficulty in generating controllable, diverse, and rule-compliant behaviors for road participants: Rule-based models cannot produce diverse behaviors and require careful tuning, whereas learning-based methods imitate the policy from data but are not designed to follow the rules explicitly. Besides, the real-world datasets are by nature "single-outcome", making the learning method hard to generate diverse behaviors. In this paper, we leverage Signal Temporal Logic (STL) and Diffusion Models to learn controllable, diverse, and rule-aware policy. We first calibrate the STL on the real-world data, then generate diverse synthetic data using trajectory optimization, and finally learn the rectified diffusion policy on the augmented dataset. We test on the NuScenes dataset and our approach can achieve the most diverse rule-compliant trajectories compared to other baselines, with a runtime 1/17X to the second-best approach. In the closed-loop testing, our approach reaches the highest diversity, rule satisfaction rate, and the least collision rate. Our method can generate varied characteristics conditional on different STL parameters in testing. A case study on human-robot encounter scenarios shows our approach can generate diverse and closed-to-oracle trajectories. The annotation tool, augmented dataset, and code are available at https://github.com/mengyuest/pSTL-diffusion-policy.
Related papers
- IMLE Policy: Fast and Sample Efficient Visuomotor Policy Learning via Implicit Maximum Likelihood Estimation [3.7584322469996896]
IMLE Policy is a novel behaviour cloning approach based on Implicit Maximum Likelihood Estimation (IMLE)<n>It excels in low-data regimes, effectively learning from minimal demonstrations and requiring 38% less data on average to match the performance of baseline methods in learning complex multi-modal behaviours.<n>We validate our approach across diverse manipulation tasks in simulated and real-world environments, showcasing its ability to capture complex behaviours under data constraints.
arXiv Detail & Related papers (2025-02-17T23:22:49Z) - Gradient-based Trajectory Optimization with Parallelized Differentiable Traffic Simulation [24.95575815501035]
We present a parallelized differentiable traffic simulator based on the Intelligent Driver Model (IDM)<n>Our vehicle simulator efficiently models vehicle motion, generating trajectories that can be supervised to fit real-world data.<n>We show that we can use the simulator to filter noise in the input trajectories (trajectory filtering), reconstruct dense trajectories from sparse ones (trajectory reconstruction), and predict future trajectories.
arXiv Detail & Related papers (2024-12-21T19:53:38Z) - Few-shot Steerable Alignment: Adapting Rewards and LLM Policies with Neural Processes [50.544186914115045]
Large language models (LLMs) are increasingly embedded in everyday applications.<n> Ensuring their alignment with the diverse preferences of individual users has become a critical challenge.<n>We present a novel framework for few-shot steerable alignment.
arXiv Detail & Related papers (2024-12-18T16:14:59Z) - Autonomous Vehicle Controllers From End-to-End Differentiable Simulation [60.05963742334746]
We propose a differentiable simulator and design an analytic policy gradients (APG) approach to training AV controllers.
Our proposed framework brings the differentiable simulator into an end-to-end training loop, where gradients of environment dynamics serve as a useful prior to help the agent learn a more grounded policy.
We find significant improvements in performance and robustness to noise in the dynamics, as well as overall more intuitive human-like handling.
arXiv Detail & Related papers (2024-09-12T11:50:06Z) - Online Analytic Exemplar-Free Continual Learning with Large Models for Imbalanced Autonomous Driving Task [25.38082751323396]
We propose an Analytic Exemplar-Free Online Continual Learning algorithm (AEF-OCL)
The AEF-OCL leverages analytic continual learning principles and employs ridge regression as a classifier for features extracted by a large backbone network.
Experimental results demonstrate that despite being an exemplar-free strategy, our method outperforms various methods on the autonomous driving SODA10M dataset.
arXiv Detail & Related papers (2024-05-28T03:19:15Z) - Robust Visual Sim-to-Real Transfer for Robotic Manipulation [79.66851068682779]
Learning visuomotor policies in simulation is much safer and cheaper than in the real world.
However, due to discrepancies between the simulated and real data, simulator-trained policies often fail when transferred to real robots.
One common approach to bridge the visual sim-to-real domain gap is domain randomization (DR)
arXiv Detail & Related papers (2023-07-28T05:47:24Z) - Unpaired Image Translation to Mitigate Domain Shift in Liquid Argon Time Projection Chamber Detector Responses [5.799883835053843]
The " domain shift problem" is prevalent in many scientific domains where algorithms are trained on simulated data but applied to real-world datasets.
This work explores the feasibility of using an alternative way to solve the domain shift problem that is not specific to any downstream algorithm.
The proposed approach relies on modern Unpaired Image-to-Image translation techniques, designed to find translations between different image domains in a fully unsupervised fashion.
arXiv Detail & Related papers (2023-04-25T14:29:54Z) - Robust Test-Time Adaptation in Dynamic Scenarios [9.475271284789969]
Test-time adaptation (TTA) intends to adapt the pretrained model to test distributions with only unlabeled test data streams.
We elaborate a Robust Test-Time Adaptation (RoTTA) method against the complex data stream in PTTA.
Our method is easy to implement, making it a good choice for rapid deployment.
arXiv Detail & Related papers (2023-03-24T10:19:14Z) - Towards Optimal Strategies for Training Self-Driving Perception Models
in Simulation [98.51313127382937]
We focus on the use of labels in the synthetic domain alone.
Our approach introduces both a way to learn neural-invariant representations and a theoretically inspired view on how to sample the data from the simulator.
We showcase our approach on the bird's-eye-view vehicle segmentation task with multi-sensor data.
arXiv Detail & Related papers (2021-11-15T18:37:43Z) - TrafficSim: Learning to Simulate Realistic Multi-Agent Behaviors [74.67698916175614]
We propose TrafficSim, a multi-agent behavior model for realistic traffic simulation.
In particular, we leverage an implicit latent variable model to parameterize a joint actor policy.
We show TrafficSim generates significantly more realistic and diverse traffic scenarios as compared to a diverse set of baselines.
arXiv Detail & Related papers (2021-01-17T00:29:30Z) - SMART: Simultaneous Multi-Agent Recurrent Trajectory Prediction [72.37440317774556]
We propose advances that address two key challenges in future trajectory prediction.
multimodality in both training data and predictions and constant time inference regardless of number of agents.
arXiv Detail & Related papers (2020-07-26T08:17:10Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.