Rethinking Unsupervised Cross-modal Flow Estimation: Learning from Decoupled Optimization and Consistency Constraint
- URL: http://arxiv.org/abs/2509.24423v1
- Date: Mon, 29 Sep 2025 08:10:41 GMT
- Title: Rethinking Unsupervised Cross-modal Flow Estimation: Learning from Decoupled Optimization and Consistency Constraint
- Authors: Runmin Zhang, Jialiang Wang, Si-Yuan Cao, Zhu Yu, Junchen Yu, Guangyi Zhang, Hui-Liang Shen
- Abstract summary: DCFlow is a novel unsupervised cross-modal flow estimation framework. We introduce a decoupled optimization strategy with task-specific supervision to address modality discrepancy and geometric misalignment distinctly. For evaluation, we construct a comprehensive cross-modal flow benchmark by repurposing public datasets.
- Score: 20.46870753632375
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: This work presents DCFlow, a novel unsupervised cross-modal flow estimation framework that integrates a decoupled optimization strategy and a cross-modal consistency constraint. Unlike previous approaches that implicitly learn flow estimation solely from appearance similarity, we introduce a decoupled optimization strategy with task-specific supervision to address modality discrepancy and geometric misalignment distinctly. This is achieved by collaboratively training a modality transfer network and a flow estimation network. To enable reliable motion supervision without ground-truth flow, we propose a geometry-aware data synthesis pipeline combined with an outlier-robust loss. Additionally, we introduce a cross-modal consistency constraint to jointly optimize both networks, significantly improving flow prediction accuracy. For evaluation, we construct a comprehensive cross-modal flow benchmark by repurposing public datasets. Experimental results demonstrate that DCFlow can be integrated with various flow estimation networks and achieves state-of-the-art performance among unsupervised approaches.
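The abstract mentions an outlier-robust loss for motion supervision on synthesized data but does not give its form. The following is only a minimal sketch, assuming a generalized Charbonnier penalty on per-pixel endpoint error; the function names, `eps`, and `q` are hypothetical and are not taken from the paper. It shows how a robust penalty limits the influence of a single badly synthesized pixel compared with a squared-error loss.

```python
import numpy as np


def robust_flow_loss(pred, target, eps=1e-3, q=0.5):
    """Generalized Charbonnier penalty on endpoint error (assumed form, not the paper's)."""
    # pred, target: flow fields of shape (H, W, 2)
    epe = np.sqrt(np.sum((pred - target) ** 2, axis=-1))
    return float(np.mean((epe ** 2 + eps ** 2) ** q))


def l2_flow_loss(pred, target):
    """Plain mean squared endpoint error, shown for comparison."""
    return float(np.mean(np.sum((pred - target) ** 2, axis=-1)))


# Hypothetical synthesized pair: zero ground-truth flow with one outlier prediction.
target = np.zeros((4, 4, 2))
pred = target.copy()
pred[0, 0] = [30.0, 40.0]  # endpoint error of 50 at a single pixel

robust = robust_flow_loss(pred, target)
squared = l2_flow_loss(pred, target)
print(f"robust={robust:.4f} squared={squared:.2f}")
```

With `q=0.5` the penalty grows linearly in the endpoint error, so the single outlier contributes far less than it does under the squared loss; smaller `q` would suppress it further.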
Related papers
- Temporal Pair Consistency for Variance-Reduced Flow Matching [13.328987133593154]
Temporal Pair Consistency (TPC) is a lightweight variance-reduction principle that couples velocity predictions at paired timesteps along the same probability path. Instantiated within flow matching, TPC improves sample quality and efficiency across CIFAR-10 and ImageNet at multiple resolutions.
arXiv Detail & Related papers (2026-02-04T00:05:21Z) - WAM-Flow: Parallel Coarse-to-Fine Motion Planning via Discrete Flow Matching for Autonomous Driving [9.719456684859606]
We introduce WAM-Flow, a vision-language-action (VLA) model that casts ego-trajectory planning as discrete flow matching over a structured token space. WAM-Flow performs fully parallel, bidirectional denoising, enabling coarse-to-fine refinement with a tunable compute-accuracy trade-off. These results establish discrete flow matching as a promising new paradigm for end-to-end autonomous driving.
arXiv Detail & Related papers (2025-12-05T19:36:46Z) - Iterative Refinement of Flow Policies in Probability Space for Online Reinforcement Learning [56.47948583452555]
We introduce the Stepwise Flow Policy (SWFP) framework, founded on the key insight that discretizing the flow matching inference process via a fixed-step Euler scheme aligns it with the variational Jordan-Kinderlehrer-Otto principle from optimal transport. SWFP decomposes the global flow into a sequence of small, incremental transformations between proximate distributions. This decomposition yields an efficient algorithm that fine-tunes pre-trained flows via a cascade of small flow blocks, offering significant advantages.
arXiv Detail & Related papers (2025-10-17T07:43:51Z) - Topology-Aware Conformal Prediction for Stream Networks [54.505880918607296]
We propose Spatio-Temporal Adaptive Conformal Inference (CISTA), a novel framework that integrates network topology and temporal dynamics into the conformal prediction framework. Our results show that CISTA effectively balances prediction efficiency and coverage, outperforming existing conformal prediction methods for stream networks.
arXiv Detail & Related papers (2025-03-06T21:21:15Z) - Joint Optimal Transport and Embedding for Network Alignment [66.49765320358361]
We propose a joint optimal transport and embedding framework for network alignment named JOENA. With a unified objective, the mutual benefits of both methods can be achieved by an alternating optimization schema with guaranteed convergence. Experiments on real-world networks validate the effectiveness and scalability of JOENA, achieving up to 16% improvement in MRR and 20x speedup.
arXiv Detail & Related papers (2025-02-26T17:28:08Z) - Efficient Text-driven Motion Generation via Latent Consistency Training [21.348658259929053]
We propose a motion latent consistency training framework (MLCT) to solve nonlinear reverse diffusion trajectories. By combining these enhancements, we achieve stable and consistent training in non-pixel modality and latent representation spaces.
arXiv Detail & Related papers (2024-05-05T02:11:57Z) - Over-the-Air Federated Learning and Optimization [52.5188988624998]
We focus on federated learning (FL) via over-the-air computation (AirComp).
We describe the convergence of AirComp-based FedAvg (AirFedAvg) algorithms under both convex and non-convex settings.
For different types of local updates that can be transmitted by edge devices (i.e., model, gradient, model difference), we reveal that transmitting in AirFedAvg may cause an aggregation error.
In addition, we consider more practical signal processing schemes to improve the communication efficiency and extend the convergence analysis to different forms of model aggregation error caused by these signal processing schemes.
arXiv Detail & Related papers (2023-10-16T05:49:28Z) - DistractFlow: Improving Optical Flow Estimation via Realistic Distractions and Pseudo-Labeling [49.46842536813477]
We propose a novel data augmentation approach, DistractFlow, for training optical flow estimation models.
We combine one of the frames in the pair with a distractor image depicting a similar domain, which allows for inducing visual perturbations congruent with natural objects and scenes.
Our approach allows increasing the number of available training pairs significantly without requiring additional annotations.
arXiv Detail & Related papers (2023-03-24T15:42:54Z) - Flow Guidance Deformable Compensation Network for Video Frame Interpolation [33.106776459443275]
We propose a flow guidance deformable compensation network (FGDCN) to overcome the drawbacks of existing motion-based methods.
FGDCN decomposes the frame sampling process into two steps: a flow step and a deformation step.
Experimental results show that the proposed algorithm achieves excellent performance on various datasets with fewer parameters.
arXiv Detail & Related papers (2022-11-22T09:35:14Z) - Complementing Brightness Constancy with Deep Networks for Optical Flow Prediction [30.10864927536864]
COMBO is a deep network that exploits the brightness constancy (BC) model used in traditional methods.
We derive a joint training scheme for learning the different components of the decomposition, ensuring optimal cooperation between them.
Experiments show that COMBO can improve performance over state-of-the-art supervised networks.
arXiv Detail & Related papers (2022-07-08T09:42:40Z) - GMFlow: Learning Optical Flow via Global Matching [124.57850500778277]
We propose a GMFlow framework for learning optical flow estimation.
It consists of three main components: a customized Transformer for feature enhancement, a correlation and softmax layer for global feature matching, and a self-attention layer for flow propagation.
Our new framework outperforms 32-iteration RAFT on the challenging Sintel benchmark.
arXiv Detail & Related papers (2021-11-26T18:59:56Z) - What Matters in Unsupervised Optical Flow [51.45112526506455]
We compare and analyze a set of key components in unsupervised optical flow.
We construct a number of novel improvements to unsupervised flow models.
We present a new unsupervised flow technique that significantly outperforms the previous state-of-the-art.
arXiv Detail & Related papers (2020-06-08T19:36:26Z)
This list is automatically generated from the titles and abstracts of the papers on this site.