FlowBind: Efficient Any-to-Any Generation with Bidirectional Flows
- URL: http://arxiv.org/abs/2512.15420v1
- Date: Wed, 17 Dec 2025 13:08:18 GMT
- Title: FlowBind: Efficient Any-to-Any Generation with Bidirectional Flows
- Authors: Yeonwoo Cha, Semin Kim, Jinhyeon Kwon, Seunghoon Hong
- Abstract summary: FlowBind is an efficient framework for any-to-any generation. It learns a shared latent space capturing cross-modal information, with modality-specific invertible flows bridging this latent to each modality. Experiments on text, image, and audio demonstrate that FlowBind attains comparable quality while requiring up to 6x fewer parameters and training 10x faster than prior methods.
- Score: 17.924626622563924
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Any-to-any generation seeks to translate between arbitrary subsets of modalities, enabling flexible cross-modal synthesis. Despite recent success, existing flow-based approaches are challenged by their inefficiency: they require large-scale datasets, often with restrictive pairing constraints, incur high computational cost from modeling the joint distribution, and rely on complex multi-stage training. We propose FlowBind, an efficient framework for any-to-any generation. Our approach is distinguished by its simplicity: it learns a shared latent space capturing cross-modal information, with modality-specific invertible flows bridging this latent to each modality. Both components are optimized jointly under a single flow-matching objective, and at inference the invertible flows act as encoders and decoders for direct translation across modalities. By factorizing interactions through the shared latent, FlowBind naturally leverages arbitrary subsets of modalities for training, and achieves competitive generation quality while substantially reducing data requirements and computational cost. Experiments on text, image, and audio demonstrate that FlowBind attains comparable quality while requiring up to 6x fewer parameters and training 10x faster than prior methods. The project page with code is available at https://yeonwoo378.github.io/official_flowbind.
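The abstract names a single flow-matching objective but does not spell out its form. As a rough illustration, here is the standard conditional flow-matching loss with a straight-line interpolation path, which such objectives typically build on; the function name and toy data below are illustrative, not from the paper:

```python
import numpy as np

def flow_matching_loss(v_pred, x0, x1):
    """Conditional flow-matching loss for the straight-line path
    x_t = (1 - t) * x0 + t * x1, whose target velocity is x1 - x0.
    The model's predicted velocity v_pred is regressed onto that target."""
    target = x1 - x0
    return float(np.mean((v_pred - target) ** 2))

# A predictor that outputs the true velocity x1 - x0 incurs zero loss.
x0 = np.zeros((4, 2))  # source (e.g. noise) samples
x1 = np.ones((4, 2))   # data samples
assert flow_matching_loss(x1 - x0, x0, x1) == 0.0
```

In FlowBind's setting, per the abstract, one such objective would jointly train the shared latent and the per-modality invertible flows; the details of that coupling are in the paper, not this sketch.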
Related papers
- Blockwise Flow Matching: Improving Flow Matching Models For Efficient High-Quality Generation [33.177998521195114]
Flow Matching models have pushed the boundaries of high-fidelity data generation across a wide range of domains. We propose Blockwise Flow Matching (BFM), a novel framework that partitions the generative trajectory into multiple temporal segments. BFM achieves 2.1x to 4.9x accelerations in inference complexity at comparable generation performance.
arXiv Detail & Related papers (2025-10-24T05:41:23Z) - OneFlow: Concurrent Mixed-Modal and Interleaved Generation with Edit Flows [59.052955667723985]
We present OneFlow, the first non-autoregressive multimodal model that enables variable-length and concurrent mixed-modal generation. Unlike autoregressive models that enforce rigid causal ordering between text and image generation, OneFlow combines an insertion-based Edit Flow for discrete text tokens with Flow Matching for image latents.
arXiv Detail & Related papers (2025-10-03T20:40:30Z) - Contrastive Flow Matching [61.60002028726023]
We introduce Contrastive Flow Matching, an extension to the flow matching objective that explicitly enforces uniqueness across all conditional flows. Our approach adds a contrastive objective that maximizes dissimilarities between predicted flows from arbitrary sample pairs. We find that training models with Contrastive Flow Matching (1) improves training speed by a factor of up to 9x, (2) requires up to 5x fewer denoising steps, and (3) lowers FID by up to 8.9 compared to training the same models with flow matching.
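The exact form of the contrastive term is not reproduced in this summary. One plausible sketch, assuming a squared-error repulsion between the predicted flows of mismatched sample pairs and a hypothetical weight `lam` (both assumptions, not the paper's formulation):

```python
import numpy as np

def contrastive_flow_loss(v_pred, target, v_other, lam=0.1):
    """Flow matching plus a hypothetical contrastive term: match v_pred
    to its own velocity target while pushing it away from the predicted
    flow v_other of a randomly paired, unrelated sample."""
    match = np.mean((v_pred - target) ** 2)   # standard flow-matching term
    repel = np.mean((v_pred - v_other) ** 2)  # dissimilarity to the paired flow
    return float(match - lam * repel)
```

The repulsion rewards conditional flows that stay distinct across samples, which is the property the abstract says the method enforces; the actual loss in the paper may differ in form and weighting.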
arXiv Detail & Related papers (2025-06-05T17:59:58Z) - FlowUnits: Extending Dataflow for the Edge-to-Cloud Computing Continuum [41.94295877935867]
FlowUnits organizes processing operators into cohesive, independently manageable components that can be transparently replicated across different regions. Our approach maintains the simplicity of dataflow while enabling seamless integration of edge and cloud resources into unified data processing pipelines.
arXiv Detail & Related papers (2025-04-15T17:14:08Z) - Flow Matching for Collaborative Filtering [37.27712576496578]
FlowCF is a flow-based recommendation system for collaborative filtering. It achieves state-of-the-art recommendation accuracy across various datasets with the fastest inference speed.
arXiv Detail & Related papers (2025-02-11T07:01:19Z) - Self-Corrected Flow Distillation for Consistent One-Step and Few-Step Text-to-Image Generation [3.8959351616076745]
Flow matching has emerged as a promising framework for training generative models. We introduce a self-corrected flow distillation method that integrates consistency models and adversarial training. This work is the first to achieve consistent generation quality in both few-step and one-step sampling.
arXiv Detail & Related papers (2024-12-22T07:48:49Z) - Consistency Flow Matching: Defining Straight Flows with Velocity Consistency [97.28511135503176]
We introduce Consistency Flow Matching (Consistency-FM), a novel FM method that explicitly enforces self-consistency in the velocity field.
Preliminary experiments demonstrate that our Consistency-FM significantly improves training efficiency by converging 4.4x faster than consistency models.
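The self-consistency idea can be illustrated with a minimal sketch: a straight flow has a constant velocity along each trajectory, so one simplified penalty (my simplification, not necessarily the paper's exact loss) compares the predicted velocities at two times on the same trajectory:

```python
import numpy as np

def velocity_consistency_loss(v_t, v_t_next):
    """Penalize change in the predicted velocity between two times on the
    same trajectory; the penalty is zero exactly when the velocity is
    constant, i.e. when the flow is straight."""
    return float(np.mean((v_t - v_t_next) ** 2))

# A constant velocity field incurs no penalty.
assert velocity_consistency_loss(np.ones(3), np.ones(3)) == 0.0
```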
arXiv Detail & Related papers (2024-07-02T16:15:37Z) - Language Rectified Flow: Advancing Diffusion Language Generation with Probabilistic Flows [53.31856123113228]
This paper proposes Language Rectified Flow.
Our method is based on the reformulation of the standard probabilistic flow models.
Experiments and ablation studies demonstrate that our method can be general, effective, and beneficial for many NLP tasks.
arXiv Detail & Related papers (2024-03-25T17:58:22Z) - Guided Flows for Generative Modeling and Decision Making [55.42634941614435]
We show that Guided Flows significantly improve sample quality in conditional image generation and zero-shot text-to-speech synthesis.
Notably, we are the first to apply flow models for plan generation in the offline reinforcement learning setting, achieving a speedup compared to diffusion models.
arXiv Detail & Related papers (2023-11-22T15:07:59Z) - GMFlow: Learning Optical Flow via Global Matching [124.57850500778277]
We propose a GMFlow framework for learning optical flow estimation.
It consists of three main components: a customized Transformer for feature enhancement, a correlation and softmax layer for global feature matching, and a self-attention layer for flow propagation.
Our new framework outperforms the 32-iteration RAFT on the challenging Sintel benchmark.
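The correlation-and-softmax matching step in GMFlow can be sketched as follows, in a simplified 1-D form with illustrative names (the real model operates on 2-D feature maps with Transformer-enhanced features): each feature in the first image attends to all features in the second, and the flow is the softmax-weighted expected match location minus the source location.

```python
import numpy as np

def global_match(feat1, feat2, coords):
    """Simplified global matching: build a full correlation volume,
    normalize it with a softmax over all positions in image 2, and read
    off flow as the expected matched coordinate minus the source one."""
    corr = feat1 @ feat2.T                            # (N1, N2) correlation volume
    attn = np.exp(corr - corr.max(axis=1, keepdims=True))
    attn /= attn.sum(axis=1, keepdims=True)           # softmax matching weights
    matched = attn @ coords                           # expected target coordinates
    return matched - coords                           # flow vectors

# Identical, strongly peaked features match to themselves: near-zero flow.
feat = np.eye(3) * 50.0
coords = np.array([[0.0], [1.0], [2.0]])
assert np.allclose(global_match(feat, feat, coords), 0.0, atol=1e-6)
```

Because the softmax spans every position, this formulation can recover large displacements that local correlation windows miss, which matches the "global matching" framing in the title.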
arXiv Detail & Related papers (2021-11-26T18:59:56Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the accuracy of this information and is not responsible for any consequences of its use.