Related papers: WAM-Flow: Parallel Coarse-to-Fine Motion Planning via Discrete Flow Matching for Autonomous Driving

URL: http://arxiv.org/abs/2512.06112v2
Date: Thu, 11 Dec 2025 16:06:13 GMT
Title: WAM-Flow: Parallel Coarse-to-Fine Motion Planning via Discrete Flow Matching for Autonomous Driving
Authors: Yifang Xu, Jiahao Cui, Feipeng Cai, Zhihao Zhu, Hanlin Shang, Shan Luan, Mingwang Xu, Neng Zhang, Yaoyi Li, Jia Cai, Siyu Zhu
Abstract summary: We introduce WAM-Flow, a vision-language-action (VLA) model that casts ego-trajectory planning as discrete flow matching over a structured token space. WAM-Flow performs fully parallel, bidirectional denoising, enabling coarse-to-fine refinement with a tunable compute-accuracy trade-off. These results establish discrete flow matching as a promising new paradigm for end-to-end autonomous driving.
Score: 9.719456684859606
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: We introduce WAM-Flow, a vision-language-action (VLA) model that casts ego-trajectory planning as discrete flow matching over a structured token space. In contrast to autoregressive decoders, WAM-Flow performs fully parallel, bidirectional denoising, enabling coarse-to-fine refinement with a tunable compute-accuracy trade-off. Specifically, the approach combines a metric-aligned numerical tokenizer that preserves scalar geometry via triplet-margin learning, a geometry-aware flow objective, and a simulator-guided GRPO alignment that integrates safety, ego progress, and comfort rewards while retaining parallel generation. A multi-stage adaptation converts a pre-trained autoregressive backbone (Janus-1.5B) from causal decoding to a non-causal flow model and strengthens road-scene competence through continued multimodal pretraining. Thanks to the inherent nature of consistency model training and parallel decoding inference, WAM-Flow achieves superior closed-loop performance against autoregressive and diffusion-based VLA baselines, with 1-step inference attaining 89.1 PDMS and 5-step inference reaching 90.3 PDMS on the NAVSIM v1 benchmark. These results establish discrete flow matching as a promising new paradigm for end-to-end autonomous driving. The code will be publicly available soon.

Related papers

MeanFuser: Fast One-Step Multi-Modal Trajectory Generation and Adaptive Reconstruction via MeanFlow for End-to-End Autonomous Driving [23.013043338076745]
MeanFuser is an end-to-end autonomous driving method. We introduce GMN to guide generative sampling and adapt the "MeanFlow Identity" to end-to-end planning. Experiments on the NAVSIM closed-loop benchmark demonstrate that MeanFuser achieves outstanding performance without the supervision of the PDM Score.
arXiv Detail & Related papers (2026-02-23T17:17:26Z)
FlowConsist: Make Your Flow Consistent with Real Trajectory [99.22869983378062]
We argue that current fast-flow training paradigms suffer from two fundamental issues. Conditional velocities constructed from randomly paired noise-data samples introduce systematic trajectory drift. We propose FlowConsist, a training framework designed to enforce trajectory consistency in fast flows.
arXiv Detail & Related papers (2026-02-06T03:24:23Z)
GuideFlow: Constraint-Guided Flow Matching for Planning in End-to-End Autonomous Driving [22.92109402334754]
Driving planning is a critical component of end-to-end (E2E) autonomous driving. GuideFlow explicitly models the flow matching process, which inherently mitigates mode collapse. GuideFlow parameterizes driving aggressiveness as a control signal during generation, enabling precise manipulation of trajectory style.
arXiv Detail & Related papers (2025-11-24T03:45:32Z)
Optimal Control Meets Flow Matching: A Principled Route to Multi-Subject Fidelity [35.95129874095729]
Text-to-image (T2I) models excel on single-entity prompts but struggle with multi-subject descriptions. We introduce the first theoretical framework with a principled, optimizable objective for steering sampling dynamics toward multi-subject fidelity.
arXiv Detail & Related papers (2025-10-02T17:59:58Z)
Rethinking Unsupervised Cross-modal Flow Estimation: Learning from Decoupled Optimization and Consistency Constraint [20.46870753632375]
DCFlow is a novel unsupervised cross-modal flow estimation framework. We introduce a decoupled optimization strategy with task-specific supervision to address modality discrepancy and geometric misalignment distinctly. For evaluation, we construct a comprehensive cross-modal flow benchmark by repurposing public datasets.
arXiv Detail & Related papers (2025-09-29T08:10:41Z)
Accelerating Diffusion LLMs via Adaptive Parallel Decoding [60.407727995313074]
We introduce adaptive parallel decoding (APD), a novel method that dynamically adjusts the number of tokens sampled in parallel. APD provides markedly higher throughput with minimal quality degradation on downstream benchmarks.
arXiv Detail & Related papers (2025-05-31T06:10:10Z)
Balancing Computation Load and Representation Expressivity in Parallel Hybrid Neural Networks [5.877451898618022]
FlowHN is a novel parallel hybrid network architecture that accommodates various strategies for load balancing. Two innovative differentiating factors in FlowHN include a FLOP-aware dynamic token split between the attention and SSM branches.
arXiv Detail & Related papers (2025-05-26T03:52:22Z)
FlowTS: Time Series Generation via Rectified Flow [67.41208519939626]
FlowTS is an ODE-based model that leverages rectified flow with straight-line transport in probability space. In the unconditional setting, FlowTS achieves state-of-the-art performance, with context FID scores of 0.019 and 0.011 on the Stock and ETTh datasets. In the conditional setting, we achieve superior performance in solar forecasting.
arXiv Detail & Related papers (2024-11-12T03:03:23Z)
Manifold Interpolating Optimal-Transport Flows for Trajectory Inference [64.94020639760026]
We present a method called Manifold Interpolating Optimal-Transport Flow (MIOFlow). MIOFlow learns continuous population dynamics from static snapshot samples taken at sporadic timepoints. We evaluate our method on simulated data with bifurcations and merges, as well as scRNA-seq data from embryoid body differentiation and acute myeloid leukemia treatment.
arXiv Detail & Related papers (2022-06-29T22:19:03Z)
GMFlow: Learning Optical Flow via Global Matching [124.57850500778277]
We propose the GMFlow framework for learning optical flow estimation. It consists of three main components: a customized Transformer for feature enhancement, a correlation and softmax layer for global feature matching, and a self-attention layer for flow propagation. Our new framework outperforms 32-iteration RAFT on the challenging Sintel benchmark.
arXiv Detail & Related papers (2021-11-26T18:59:56Z)
Prediction of Traffic Flow via Connected Vehicles [77.11902188162458]
We propose a Short-term Traffic flow Prediction framework so that transportation authorities can take early action to control flow and prevent congestion. We anticipate flow at future time frames on a target road segment based on historical flow data and innovative features such as real-time feeds and trajectory data provided by Connected Vehicles (CV) technology. We show how this novel approach allows advanced modelling by integrating into the flow forecast the impact of various events that CVs realistically encounter on segments along their trajectory.
arXiv Detail & Related papers (2020-07-10T16:00:44Z)

This list is automatically generated from the titles and abstracts of the papers on this site.