Related papers: UGG-ReID: Uncertainty-Guided Graph Model for Multi-Modal Object Re-Identification

URL: http://arxiv.org/abs/2507.04638v2
Date: Tue, 08 Jul 2025 02:49:43 GMT
Title: UGG-ReID: Uncertainty-Guided Graph Model for Multi-Modal Object Re-Identification
Authors: Xixi Wan, Aihua Zheng, Bo Jiang, Beibei Wang, Chenglong Li, Jin Tang
Abstract summary: We propose a robust approach named Uncertainty-Guided Graph model for multi-modal object ReID (UGG-ReID). UGG-ReID is designed to mitigate noise interference and facilitate effective multi-modal fusion. Experimental results show that the proposed method achieves excellent performance on all datasets.
Score: 26.770271366177603
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Multi-modal object Re-IDentification (ReID) has gained considerable attention with the goal of retrieving specific targets across cameras using heterogeneous visual data sources. Existing methods primarily aim to improve identification performance, but often overlook the uncertainty arising from inherent defects, such as intra-modal noise and inter-modal conflicts. This uncertainty is particularly significant in the case of fine-grained local occlusion and frame loss, which becomes a challenge in multi-modal learning. To address the above challenge, we propose a robust approach named Uncertainty-Guided Graph model for multi-modal object ReID (UGG-ReID). UGG-ReID is designed to mitigate noise interference and facilitate effective multi-modal fusion by estimating both local and sample-level aleatoric uncertainty and explicitly modeling their dependencies. Specifically, we first propose the Gaussian patch-graph representation model that leverages uncertainty to quantify fine-grained local cues and capture their structural relationships. This process boosts the expressiveness of modal-specific information, ensuring that the generated embeddings are both more informative and robust. Subsequently, we design an uncertainty-guided mixture of experts strategy that dynamically routes samples to experts exhibiting low uncertainty. This strategy effectively suppresses noise-induced instability, leading to enhanced robustness. Meanwhile, we design an uncertainty-guided routing to strengthen the multi-modal interaction, improving the performance. UGG-ReID is comprehensively evaluated on five representative multi-modal object ReID datasets, encompassing diverse spectral modalities. Experimental results show that the proposed method achieves excellent performance on all datasets and is significantly better than current methods in terms of noise immunity. Our code will be made public upon acceptance.
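
Illustrative sketch (not the authors' code, which the abstract says is not yet public): the abstract describes routing samples toward experts that exhibit low uncertainty. One simple way to picture this is a gate that applies a softmax to the negative predicted log-variance of each expert, so lower uncertainty yields a larger fusion weight. The function names `uncertainty_gate` and `fuse_experts`, the log-variance parameterization, and the `temperature` parameter are assumptions made for illustration only.

```python
import math


def uncertainty_gate(log_variances, temperature=1.0):
    """Turn per-expert log-variances into weights (hypothetical helper).

    A softmax over the negative log-variance gives experts with lower
    predicted uncertainty a larger share of the total weight.
    """
    scores = [-lv / temperature for lv in log_variances]
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def fuse_experts(expert_outputs, log_variances, temperature=1.0):
    """Weighted sum of expert feature vectors using the gate above."""
    weights = uncertainty_gate(log_variances, temperature)
    dim = len(expert_outputs[0])
    fused = [
        sum(w * out[d] for w, out in zip(weights, expert_outputs))
        for d in range(dim)
    ]
    return fused, weights


if __name__ == "__main__":
    # Two toy experts: the first reports lower uncertainty than the second.
    outputs = [[1.0, 0.0], [0.0, 1.0]]
    fused, weights = fuse_experts(outputs, log_variances=[-2.0, 1.0])
    print("weights:", weights)
    print("fused:", fused)
```

This is soft weighting rather than hard top-k routing; the abstract does not say which the method uses, so the choice here is only the simplest option to write down.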

Related papers

Confidence-Aware Self-Distillation for Multimodal Sentiment Analysis with Incomplete Modalities [15.205192581534973]
Multimodal sentiment analysis aims to understand human sentiment through multimodal data. Existing methods for handling modality missingness are based on data reconstruction or common subspace projections. We propose a Confidence-Aware Self-Distillation (CASD) strategy that effectively incorporates multimodal probabilistic embeddings.
arXiv Detail & Related papers (2025-06-02T09:48:41Z)
RODEO: Robust Outlier Detection via Exposing Adaptive Out-of-Distribution Samples [4.76428036044684]
We introduce RODEO, a data-centric approach that generates effective outliers for robust outlier detection. We show that incorporating outlier exposure (OE) and adversarial training can be an effective strategy for this purpose. We demonstrate both quantitatively and qualitatively that our adaptive OE method effectively generates "diverse" and "near-distribution" outliers.
arXiv Detail & Related papers (2025-01-28T14:13:17Z)
Uncertainty Quantification via Hölder Divergence for Multi-View Representation Learning [18.076966572539547]
This paper introduces a novel algorithm based on Hölder Divergence (HD) to enhance the reliability of multi-view learning. Through Dempster-Shafer theory, it integrates uncertainty from different modalities, thereby generating a comprehensive result. Mathematically, HD proves to better measure the "distance" between the real data distribution and the predictive distribution of the model.
arXiv Detail & Related papers (2024-10-29T04:29:44Z)
Trusted Multi-view Learning under Noisy Supervision [20.668620759102115]
We propose a method to develop a reliable multi-view learning model under the guidance of noisy labels. TMNR employs evidential deep neural networks to construct view-specific opinions that capture both beliefs and uncertainty. TMNR2 identifies potentially mislabeled samples through evidence-label consistency and generates pseudo-labels from neighboring information.
arXiv Detail & Related papers (2024-04-18T06:47:30Z)
A Study of Dropout-Induced Modality Bias on Robustness to Missing Video Frames for Audio-Visual Speech Recognition [53.800937914403654]
Advanced Audio-Visual Speech Recognition (AVSR) systems have been observed to be sensitive to missing video frames. While applying the dropout technique to the video modality enhances robustness to missing frames, it simultaneously results in a performance loss when dealing with complete data input. We propose a novel Multimodal Distribution Approximation with Knowledge Distillation (MDA-KD) framework to reduce over-reliance on the audio modality.
arXiv Detail & Related papers (2024-03-07T06:06:55Z)
The Risk of Federated Learning to Skew Fine-Tuning Features and Underperform Out-of-Distribution Robustness [50.52507648690234]
Federated learning has the risk of skewing fine-tuning features and compromising the robustness of the model. We introduce three robustness indicators and conduct experiments across diverse robust datasets. Our approach markedly enhances the robustness across diverse scenarios, encompassing various parameter-efficient fine-tuning methods.
arXiv Detail & Related papers (2024-01-25T09:18:51Z)
Model Stealing Attack against Graph Classification with Authenticity, Uncertainty and Diversity [80.16488817177182]
GNNs are vulnerable to the model stealing attack, a nefarious endeavor geared towards duplicating the target model via query permissions. We introduce three model stealing attacks to adapt to different actual scenarios.
arXiv Detail & Related papers (2023-12-18T05:42:31Z)
Informative Data Selection with Uncertainty for Multi-modal Object Detection [25.602915381482468]
We propose a universal uncertainty-aware multi-modal fusion model. Our model reduces the randomness in fusion and generates reliable output. Our fusion model is proven to resist severe noise interference like Gaussian, motion blur, and frost, with only slight degradation.
arXiv Detail & Related papers (2023-04-23T16:36:13Z)
Composed Image Retrieval with Text Feedback via Multi-grained Uncertainty Regularization [73.04187954213471]
We introduce a unified learning approach to simultaneously model coarse- and fine-grained retrieval. The proposed method has achieved +4.03%, +3.38%, and +2.40% Recall@50 accuracy over a strong baseline.
arXiv Detail & Related papers (2022-11-14T14:25:40Z)
Trustworthy Multimodal Regression with Mixture of Normal-inverse Gamma Distributions [91.63716984911278]
We introduce a novel Mixture of Normal-Inverse Gamma distributions (MoNIG) algorithm, which efficiently estimates uncertainty in principle for adaptive integration of different modalities and produces a trustworthy regression result. Experimental results on both synthetic and different real-world data demonstrate the effectiveness and trustworthiness of our method on various multimodal regression tasks.
arXiv Detail & Related papers (2021-11-11T14:28:12Z)
Diversity inducing Information Bottleneck in Model Ensembles [73.80615604822435]
In this paper, we target the problem of generating effective ensembles of neural networks by encouraging diversity in prediction. We explicitly optimize a diversity inducing adversarial loss for learning latent variables and thereby obtain diversity in the output predictions necessary for modeling multi-modal data. Compared to the most competitive baselines, we show significant improvements in classification accuracy, under a shift in the data distribution.
arXiv Detail & Related papers (2020-03-10T03:10:41Z)

This list is automatically generated from the titles and abstracts of the papers on this site.