CaReFlow: Cyclic Adaptive Rectified Flow for Multimodal Fusion
- URL: http://arxiv.org/abs/2602.19140v1
- Date: Sun, 22 Feb 2026 12:12:05 GMT
- Title: CaReFlow: Cyclic Adaptive Rectified Flow for Multimodal Fusion
- Authors: Sijie Mai, Shiqin Han
- Abstract summary: The modality gap significantly restricts the effectiveness of multimodal fusion. Previous methods often use techniques such as diffusion models and adversarial learning to reduce the modality gap.
- Score: 6.3310165899037045
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The modality gap significantly restricts the effectiveness of multimodal fusion. Previous methods often use techniques such as diffusion models and adversarial learning to reduce the modality gap, but they typically focus on one-to-one alignment without exposing the data points of the source modality to the global distribution information of the target modality. To this end, leveraging the ability of rectified flow to map one distribution to another along a straight trajectory, we extend rectified flow to modality distribution mapping. Specifically, we leverage the 'one-to-many mapping' strategy in rectified flow, which allows each data point of the source modality to observe the overall target distribution. This also alleviates the issue of insufficient paired data within each sample, enabling a more robust distribution transformation. Moreover, to achieve more accurate distribution mapping and to resolve the ambiguous flow directions in one-to-many mapping, we design 'adaptive relaxed alignment', enforcing stricter alignment for modality pairs belonging to the same sample while applying relaxed mapping for pairs not belonging to the same sample or category. Additionally, to prevent information loss during distribution mapping, we introduce 'cyclic rectified flow' to ensure that the transferred features can be translated back to the original features, allowing multimodal representations to retain sufficient modality-specific information. After distribution alignment, our approach achieves very competitive results on multiple multimodal affective computing tasks even with a simple fusion method, and visualizations verify that it effectively reduces the modality gap.
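The one-to-many rectified-flow coupling the abstract describes can be sketched as follows. This is a minimal illustration, not the authors' code: the function name, feature shapes, and random cross-sample pairing are assumptions chosen to show how each source point observes the whole target distribution through a straight-line interpolant.

```python
import numpy as np

def rectified_flow_targets(x_src, x_tgt, rng):
    """Build one-to-many rectified-flow training targets.

    Each source-modality feature is paired with a randomly drawn
    target-modality feature, so over training every source point
    observes the overall target distribution, not just its own pair.
    """
    n = x_src.shape[0]
    perm = rng.permutation(n)              # random cross-sample pairing
    x1 = x_tgt[perm]
    t = rng.uniform(size=(n, 1))           # one time per sample, t in [0, 1]
    x_t = (1.0 - t) * x_src + t * x1       # straight-line interpolant
    v_target = x1 - x_src                  # constant velocity along the line
    return x_t, t, v_target

rng = np.random.default_rng(0)
x_src = rng.normal(size=(4, 8))            # e.g. text-modality features
x_tgt = rng.normal(size=(4, 8))            # e.g. audio-modality features
x_t, t, v = rectified_flow_targets(x_src, x_tgt, rng)
```

In a full system, a velocity network would regress `v` from `(x_t, t)`; at inference, integrating the learned field transports source features onto the target distribution along near-straight paths.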
Related papers
- Test-time scaling of diffusions with flow maps [68.79792714591564]
A common recipe to improve diffusion models at test-time is to introduce the gradient of the reward into the dynamics of the diffusion itself. We propose a simple solution by working directly with a flow map. By exploiting a relationship between the flow map and the velocity field governing the instantaneous transport, we construct an algorithm, Flow Map Trajectory Tilting (FMTT), which provably performs better ascent on the reward than standard test-time methods.
arXiv Detail & Related papers (2025-11-27T18:44:12Z) - Calibrated Multimodal Representation Learning with Missing Modalities [100.55774771852468]
Multimodal representation learning harmonizes distinct modalities by aligning them into a unified latent space. Recent research generalizes traditional cross-modal alignment to produce enhanced multimodal synergy but requires all modalities to be present for a common instance. We provide theoretical insights into this issue from an anchor-shift perspective. We propose CalMRL for multimodal representation learning to calibrate incomplete alignments caused by missing modalities.
arXiv Detail & Related papers (2025-11-15T05:01:43Z) - Adaptive Redundancy Regulation for Balanced Multimodal Information Refinement [49.596978957463385]
Long-term dominance of the dominant modality weakens representation-output coupling. Previous methods often directly and uniformly adjust the gradients of the advantaged modality. We propose Adaptive Redundancy Regulation for Balanced Multimodal Information Refinement.
arXiv Detail & Related papers (2025-11-14T04:44:34Z) - Three Forms of Stochastic Injection for Improved Distribution-to-Distribution Generative Modeling [40.63772844645927]
Flow matching offers a natural framework for modeling transformations between arbitrary data distributions. We propose a simple and computationally efficient method that injects stochasticity into the training process by perturbing source samples and flow interpolants. Our approach also reduces the transport cost between input and generated samples to better highlight the true effect of the transformation.
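The stochastic-injection idea summarized above can be illustrated with a toy sketch. This is not the paper's implementation; the function name, noise scales, and the choice of which quantities to perturb are illustrative assumptions.

```python
import numpy as np

def noisy_flow_interpolant(x0, x1, sigma_src, sigma_path, rng):
    """Flow-matching interpolant with two stochastic injections:
    perturb the source samples, then perturb the interpolant itself.
    """
    x0_noisy = x0 + sigma_src * rng.normal(size=x0.shape)    # perturb source samples
    t = rng.uniform(size=(x0.shape[0], 1))
    x_t = (1.0 - t) * x0_noisy + t * x1                      # straight interpolant
    x_t = x_t + sigma_path * rng.normal(size=x_t.shape)      # perturb the interpolant
    v_target = x1 - x0_noisy                                 # regression target for a velocity net
    return x_t, t, v_target

rng = np.random.default_rng(1)
x0 = rng.normal(size=(16, 4))          # source distribution samples
x1 = rng.normal(size=(16, 4)) + 3.0    # target distribution, shifted mean
x_t, t, v = noisy_flow_interpolant(x0, x1, 0.1, 0.05, rng)
```

The injected noise smooths the training objective over nearby source points and interpolant states, which is the kind of perturbation the summary describes.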
arXiv Detail & Related papers (2025-10-08T04:36:34Z) - Discrete Guidance Matching: Exact Guidance for Discrete Flow Matching [36.348940136801296]
A novel guidance framework for discrete data is proposed to address this problem. We derive the exact transition rate for the desired distribution given a learned discrete flow matching model. We demonstrate the effectiveness of our proposed guidance on energy-guided simulations and preference alignment on text-to-image generation and multimodal understanding tasks.
arXiv Detail & Related papers (2025-09-26T05:51:31Z) - Generative Assignment Flows for Representing and Learning Joint Distributions of Discrete Data [2.6499018693213316]
We introduce a novel generative model for the representation of joint probability distributions of discrete random variables. The approach uses measure transport by randomized assignment flows on the statistical submanifold of factorizing distributions.
arXiv Detail & Related papers (2024-06-06T21:58:33Z) - Distributed Markov Chain Monte Carlo Sampling based on the Alternating Direction Method of Multipliers [143.6249073384419]
In this paper, we propose a distributed sampling scheme based on the alternating direction method of multipliers.
We provide both theoretical guarantees of our algorithm's convergence and experimental evidence of its superiority to the state-of-the-art.
In simulation, we deploy our algorithm on linear and logistic regression tasks and illustrate its fast convergence compared to existing gradient-based methods.
arXiv Detail & Related papers (2024-01-29T02:08:40Z) - Augmented Bridge Matching [32.668433085737036]
Flow and bridge matching processes can interpolate between arbitrary data distributions.
We show that a simple modification of the matching process recovers this coupling by augmenting the velocity field.
We illustrate the efficiency of our augmentation in learning mixture of image translation tasks.
arXiv Detail & Related papers (2023-11-12T22:42:34Z) - Cooperative Distribution Alignment via JSD Upper Bound [7.071749623370137]
Unsupervised distribution alignment estimates a transformation that maps two or more source distributions to a shared aligned distribution.
This task has many applications including generative modeling, unsupervised domain adaptation, and socially aware learning.
We propose to unify and generalize previous flow-based approaches under a single non-adversarial framework.
arXiv Detail & Related papers (2022-07-05T20:09:03Z) - KL Guided Domain Adaptation [88.19298405363452]
Domain adaptation is an important problem and often needed for real-world applications.
A common approach in the domain adaptation literature is to learn a representation of the input that has the same distributions over the source and the target domain.
We show that with a probabilistic representation network, the KL term can be estimated efficiently via minibatch samples.
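The minibatch KL estimate mentioned above can be sketched for diagonal-Gaussian representations. This is a generic Monte Carlo illustration, not the paper's code; the function name, sample count, and diagonal-Gaussian assumption are choices made for the sketch.

```python
import numpy as np

def kl_minibatch(mu_p, logvar_p, mu_q, logvar_q, rng, n_samples=4096):
    """Monte Carlo estimate of KL(p || q) between diagonal Gaussians.

    Draw x ~ p and average log p(x) - log q(x): the kind of sample-based
    KL estimate a probabilistic representation network makes tractable.
    """
    std_p = np.exp(0.5 * logvar_p)
    x = mu_p + std_p * rng.normal(size=(n_samples, mu_p.shape[0]))

    def log_gauss(x, mu, logvar):
        return -0.5 * np.sum(logvar + np.log(2 * np.pi)
                             + (x - mu) ** 2 / np.exp(logvar), axis=1)

    return np.mean(log_gauss(x, mu_p, logvar_p) - log_gauss(x, mu_q, logvar_q))

rng = np.random.default_rng(0)
mu, lv = np.zeros(3), np.zeros(3)
kl_same = kl_minibatch(mu, lv, mu, lv, rng)          # identical distributions -> 0
kl_diff = kl_minibatch(mu, lv, mu + 1.0, lv, rng)    # shifted mean -> positive
```

For unit-variance Gaussians shifted by 1 in each of 3 dimensions, the true KL is 1.5, and the minibatch estimate concentrates around that value as the sample count grows.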
arXiv Detail & Related papers (2021-06-14T22:24:23Z) - Semi-Supervised Learning with Normalizing Flows [54.376602201489995]
FlowGMM is an end-to-end approach to generative semi-supervised learning with normalizing flows.
We show promising results on a wide range of applications, including AG-News and Yahoo Answers text data.
arXiv Detail & Related papers (2019-12-30T17:36:33Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information it presents and is not responsible for any consequences arising from its use.