Adaptive Domain Shift in Diffusion Models for Cross-Modality Image Translation
- URL: http://arxiv.org/abs/2601.18623v2
- Date: Mon, 02 Feb 2026 14:16:07 GMT
- Title: Adaptive Domain Shift in Diffusion Models for Cross-Modality Image Translation
- Authors: Zihao Wang, Yuzhou Chen, Shaogang Ren
- Abstract summary: Cross-modal image translation remains brittle and inefficient. Standard diffusion approaches often rely on a single, global linear transfer between domains. We embed domain-shift dynamics directly into the generative process.
- Score: 35.54089670586124
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Cross-modal image translation remains brittle and inefficient. Standard diffusion approaches often rely on a single, global linear transfer between domains. We find that this shortcut forces the sampler to traverse off-manifold, high-cost regions, inflating the correction burden and inviting semantic drift. We refer to this shared failure mode as fixed-schedule domain transfer. In this paper, we embed domain-shift dynamics directly into the generative process. Our model predicts a spatially varying mixing field at every reverse step and injects an explicit, target-consistent restoration term into the drift. This in-step guidance keeps large updates on-manifold and shifts the model's role from global alignment to local residual correction. We provide a continuous-time formulation with an exact solution form and derive a practical first-order sampler that preserves marginal consistency. Empirically, across translation tasks in medical imaging, remote sensing, and electroluminescence semantic mapping, our framework improves structural fidelity and semantic consistency while converging in fewer denoising steps.
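The abstract describes a reverse update whose drift blends the standard score term with an explicit, target-consistent restoration term, weighted by a spatially varying mixing field predicted at every step. Below is a minimal illustrative sketch of one such first-order reverse step. All names (`reverse_step`, `mix_field_fn`, `target_guess_fn`) are assumptions for illustration, not the paper's actual interface, and the blending rule is a plausible reading of the abstract rather than the authors' exact formulation.

```python
import numpy as np

def reverse_step(x_t, t, dt, score_fn, mix_field_fn, target_guess_fn):
    """One hypothetical first-order reverse update with in-step restoration.

    x_t             : current sample, shape (H, W, C)
    score_fn        : learned denoising drift (score estimate) at (x_t, t)
    mix_field_fn    : predicts a per-pixel mixing field m(x, t) in [0, 1]
    target_guess_fn : coarse estimate of the target-domain image from x_t
    """
    m = mix_field_fn(x_t, t)                 # spatially varying field, (H, W, 1)
    score = score_fn(x_t, t)                 # standard denoising drift
    restore = target_guess_fn(x_t, t) - x_t  # explicit restoration term toward target
    # Per-pixel blend of the usual score update with the target-consistent
    # restoration; large m shifts the update toward local residual correction.
    drift = (1.0 - m) * score + m * restore
    return x_t + dt * drift
```

With dummy callables the step is a plain Euler update, so it is easy to sanity-check shapes and magnitudes before plugging in learned networks.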
Related papers
- SRasP: Self-Reorientation Adversarial Style Perturbation for Cross-Domain Few-Shot Learning [29.002306547742347]
Cross-Domain Few-Shot Learning aims to transfer knowledge from a seen source domain to unseen target domains. Existing style-based perturbation methods mitigate domain shift but often suffer from instability and convergence to sharp minima. We propose a novel crop-global style perturbation network, termed Self-Reorientation Adversarial Style Perturbation (SRasP).
arXiv Detail & Related papers (2026-03-05T13:03:35Z)
- Unpaired Image-to-Image Translation via a Self-Supervised Semantic Bridge [59.247871132422006]
Adversarial diffusion and diffusion-inversion methods have advanced unpaired image-to-image translation, but each faces key limitations. We propose the Self-Supervised Semantic Bridge (SSB), a versatile framework that integrates external semantic priors into diffusion bridge models. Our key idea is to leverage self-supervised visual encoders to learn representations that are invariant to appearance changes but capture geometric structure.
arXiv Detail & Related papers (2026-02-18T18:05:00Z)
- On Exact Editing of Flow-Based Diffusion Models [97.0633397035926]
We propose Conditioned Velocity Correction (CVC) to reformulate flow-based editing as a distribution transformation problem driven by a known source prior. CVC rethinks the role of velocity in inter-distribution transformation by introducing a dual-perspective velocity conversion mechanism. We show that CVC consistently achieves superior fidelity, better semantic alignment, and more reliable editing behavior across diverse tasks.
arXiv Detail & Related papers (2025-12-30T06:29:20Z)
- Simulating Distribution Dynamics: Liquid Temporal Feature Evolution for Single-Domain Generalized Object Detection [58.25418970608328]
Single-Domain Generalized Object Detection (Single-DGOD) aims to transfer a detector trained on one source domain to multiple unknown domains. Existing methods for Single-DGOD typically rely on discrete data augmentation or static perturbation methods to expand data diversity. We propose a new method that simulates the progressive evolution of features from the source domain to simulated latent distributions.
arXiv Detail & Related papers (2025-11-13T03:10:39Z)
- Multimodal LLM-Guided Semantic Correction in Text-to-Image Diffusion [52.315729095824906]
MLLM Semantic-Corrected Ping-Pong-Ahead Diffusion (PPAD) is a novel framework that introduces a Multimodal Large Language Model (MLLM) as a semantic observer during inference. It performs real-time analysis on intermediate generations, identifies latent semantic inconsistencies, and translates feedback into controllable signals that actively guide the remaining denoising steps. Extensive experiments demonstrate PPAD's significant improvements.
arXiv Detail & Related papers (2025-05-26T14:42:35Z)
- Geometrically Regularized Transfer Learning with On-Manifold and Off-Manifold Perturbation [0.0]
MAADA is a novel framework that decomposes adversarial perturbations into on-manifold and off-manifold components. We show that MAADA consistently outperforms existing adversarial and adaptation methods in both unsupervised and few-shot settings.
arXiv Detail & Related papers (2025-05-21T07:13:09Z)
- Improving Diffusion-based Image Translation using Asymmetric Gradient Guidance [51.188396199083336]
We present an approach that guides the reverse process of diffusion sampling by applying asymmetric gradient guidance.
Our model's adaptability allows it to be implemented with both image-fusion and latent-diffusion models.
Experiments show that our method outperforms various state-of-the-art models in image translation tasks.
arXiv Detail & Related papers (2023-06-07T12:56:56Z)
- Smooth image-to-image translations with latent space interpolations [64.8170758294427]
Multi-domain image-to-image (I2I) translations can transform a source image according to the style of a target domain.
We show that our regularization techniques can improve the state-of-the-art I2I translations by a large margin.
arXiv Detail & Related papers (2022-10-03T11:57:30Z)
- MIDMs: Matching Interleaved Diffusion Models for Exemplar-based Image Translation [29.03892463588357]
We present a novel method for exemplar-based image translation, called matching interleaved diffusion models (MIDMs).
We formulate a diffusion-based matching-and-generation framework that interleaves cross-domain matching and diffusion steps in the latent space.
To improve the reliability of the diffusion process, we design a confidence-aware process using cycle-consistency to consider only confident regions.
arXiv Detail & Related papers (2022-09-22T14:43:52Z)
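The MIDMs summary above describes interleaving cross-domain matching with diffusion steps, gated by a cycle-consistency confidence so that only confident regions guide generation. The sketch below is a hypothetical rendering of one such interleaved step; the function names (`midm_step`, `match_fn`, `confidence_fn`) and the convex-combination gating are illustrative assumptions, not the paper's actual implementation.

```python
import numpy as np  # used in the example below

def midm_step(z, exemplar_feat, denoise_fn, match_fn, confidence_fn, t):
    """One hypothetical interleaved matching-and-denoising step.

    z             : latent being denoised, shape (H, W, D)
    exemplar_feat : features of the exemplar image
    denoise_fn    : one diffusion (denoising) step on a latent at time t
    match_fn      : warps exemplar features into alignment with z
                    (cross-domain matching)
    confidence_fn : per-pixel confidence in [0, 1], e.g. from
                    cycle-consistency of the matching
    """
    warped = match_fn(z, exemplar_feat)     # cross-domain correspondence
    conf = confidence_fn(z, exemplar_feat)  # trust only confident regions
    # Blend the warped exemplar into the latent where matching is confident,
    # then run the diffusion step on the guided latent.
    z_guided = conf * warped + (1.0 - conf) * z
    return denoise_fn(z_guided, t)
```

Gating by confidence keeps unreliable correspondences from corrupting the latent, which is the role the summary attributes to the cycle-consistency check.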
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.