Related papers: AGILE: A Diffusion-Based Attention-Guided Image and Label Translation for Efficient Cross-Domain Plant Trait Identification

URL: http://arxiv.org/abs/2503.22019v1
Date: Thu, 27 Mar 2025 22:20:15 GMT
Title: AGILE: A Diffusion-Based Attention-Guided Image and Label Translation for Efficient Cross-Domain Plant Trait Identification
Authors: Earl Ranario, Lars Lundqvist, Heesup Yun, Brian N. Bailey, J. Mason Earles
Abstract summary: Cross-domain image translation facilitates the generation of training data by transferring labels across different domains. Existing generative models struggle to maintain object-level accuracy when translating images between domains. We introduce AGILE, a diffusion-based framework that leverages optimized text embeddings and attention guidance to semantically constrain image translation.
Score: 0.0
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Semantically consistent cross-domain image translation facilitates the generation of training data by transferring labels across different domains, making it particularly useful for plant trait identification in agriculture. However, existing generative models struggle to maintain object-level accuracy when translating images between domains, especially when domain gaps are significant. In this work, we introduce AGILE (Attention-Guided Image and Label Translation for Efficient Cross-Domain Plant Trait Identification), a diffusion-based framework that leverages optimized text embeddings and attention guidance to semantically constrain image translation. AGILE utilizes pretrained diffusion models and publicly available agricultural datasets to improve the fidelity of translated images while preserving critical object semantics. Our approach optimizes text embeddings to strengthen the correspondence between source and target images and guides attention maps during the denoising process to control object placement. We evaluate AGILE on cross-domain plant datasets and demonstrate its effectiveness in generating semantically accurate translated images. Quantitative experiments show that AGILE enhances object detection performance in the target domain while maintaining realism and consistency. Compared to prior image translation methods, AGILE achieves superior semantic alignment, particularly in challenging cases where objects vary significantly or domain gaps are substantial.

Related papers

Tell, Don't Show!: Language Guidance Eases Transfer Across Domains in Images and Videos [69.29778009769862]
We introduce LaGTran, a framework that guides robust transfer of discriminative knowledge from labeled source to unlabeled target data with domain gaps. Motivated by our observation that semantically richer text modality has more favorable transfer properties, we devise a transfer mechanism to use a source-trained text-classifier to generate predictions on the target text descriptions. Our approach driven by language guidance is surprisingly easy and simple, yet significantly outperforms all prior approaches on challenging datasets like GeoNet and DomainNet.
arXiv Detail & Related papers (2024-03-08T18:58:46Z)
Diffusion-based Image Translation with Label Guidance for Domain Adaptive Semantic Segmentation [35.44771460784343]
Translating images from a source domain to a target domain for learning target models is one of the most common strategies in domain adaptive semantic segmentation (DASS). Existing methods still struggle to preserve semantically consistent local details between the original and translated images. We present an innovative approach that addresses this challenge by using source-domain labels as explicit guidance during image translation.
arXiv Detail & Related papers (2023-08-23T18:01:01Z)
Unsupervised Domain Adaptation for Semantic Segmentation using One-shot Image-to-Image Translation via Latent Representation Mixing [9.118706387430883]
We propose a new unsupervised domain adaptation method for the semantic segmentation of very high resolution images. An image-to-image translation paradigm is proposed, based on an encoder-decoder principle where latent content representations are mixed across domains. Cross-city comparative experiments have shown that the proposed method outperforms state-of-the-art domain adaptation methods.
arXiv Detail & Related papers (2022-12-07T18:16:17Z)
Marginal Contrastive Correspondence for Guided Image Generation [58.0605433671196]
Exemplar-based image translation establishes dense correspondences between a conditional input and an exemplar from two different domains. Existing work builds the cross-domain correspondences implicitly by minimizing feature-wise distances across the two domains. We design a Marginal Contrastive Learning Network (MCL-Net) that explores contrastive learning to learn domain-invariant features for realistic exemplar-based image translation.
arXiv Detail & Related papers (2022-04-01T13:55:44Z)
AFAN: Augmented Feature Alignment Network for Cross-Domain Object Detection [90.18752912204778]
Unsupervised domain adaptation for object detection is a challenging problem with many real-world applications. We propose a novel augmented feature alignment network (AFAN) which integrates intermediate domain image generation and domain-adversarial training. Our approach significantly outperforms the state-of-the-art methods on standard benchmarks for both similar and dissimilar domain adaptations.
arXiv Detail & Related papers (2021-06-10T05:01:20Z)
Continuous and Diverse Image-to-Image Translation via Signed Attribute Vectors [120.13149176992896]
We present an effective signed attribute vector, which enables continuous translation on diverse mapping paths across various domains. To enhance the visual quality of continuous translation results, we generate a trajectory between two sign-symmetrical attribute vectors.
arXiv Detail & Related papers (2020-11-02T18:59:03Z)
Structured Domain Adaptation with Online Relation Regularization for Unsupervised Person Re-ID [62.90727103061876]
Unsupervised domain adaptation (UDA) aims at adapting the model trained on a labeled source-domain dataset to an unlabeled target-domain dataset. We propose an end-to-end structured domain adaptation framework with an online relation-consistency regularization term. Our proposed framework is shown to achieve state-of-the-art performance on multiple UDA tasks of person re-ID.
arXiv Detail & Related papers (2020-03-14T14:45:18Z)
Label-Driven Reconstruction for Domain Adaptation in Semantic Segmentation [43.09068177612067]
Unsupervised domain adaptation alleviates the need for pixel-wise annotation in semantic segmentation. One of the most common strategies is to translate images from the source domain to the target domain and then align their marginal distributions in the feature space using adversarial learning. Here, we present an innovative framework, designed to mitigate the image translation bias and align cross-domain features with the same category.
arXiv Detail & Related papers (2020-03-10T10:06:35Z)
CrDoCo: Pixel-level Domain Transfer with Cross-Domain Consistency [119.45667331836583]
Unsupervised domain adaptation algorithms aim to transfer the knowledge learned from one domain to another. We present a novel pixel-wise adversarial domain adaptation algorithm.
arXiv Detail & Related papers (2020-01-09T19:00:35Z)

This list is automatically generated from the titles and abstracts of the papers on this site.
