Related papers: ICON: Invariant Counterfactual Optimization with Neuro-Symbolic Priors for Text-Based Person Search

ICON: Invariant Counterfactual Optimization with Neuro-Symbolic Priors for Text-Based Person Search

URL: http://arxiv.org/abs/2601.15931v1
Date: Thu, 22 Jan 2026 13:09:22 GMT
Title: ICON: Invariant Counterfactual Optimization with Neuro-Symbolic Priors for Text-Based Person Search
Authors: Xiangyu Wang, Zhixin Lv, Yongjiao Sun, Anrui Han, Ye Yuan, Hangxu Ji,
Abstract summary: Text-Based Person Search holds unique value in real-world surveillance bridging visual perception and language understanding.<n>Current paradigms utilizing pre-training models often fail to transfer effectively to complex open-world scenarios.<n>This paper proposes ICON, a framework integrating causal and topological priors.
Score: 6.247167721048087
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Text-Based Person Search (TBPS) holds unique value in real-world surveillance bridging visual perception and language understanding, yet current paradigms utilizing pre-training models often fail to transfer effectively to complex open-world scenarios. The reliance on "Passive Observation" leads to multifaceted spurious correlations and spatial semantic misalignment, causing a lack of robustness against distribution shifts. To fundamentally resolve these defects, this paper proposes ICON (Invariant Counterfactual Optimization with Neuro-symbolic priors), a framework integrating causal and topological priors. First, we introduce Rule-Guided Spatial Intervention to strictly penalize sensitivity to bounding box noise, forcibly severing location shortcuts to achieve geometric invariance. Second, Counterfactual Context Disentanglement is implemented via semantic-driven background transplantation, compelling the model to ignore background interference for environmental independence. Then, we employ Saliency-Driven Semantic Regularization with adaptive masking to resolve local saliency bias and guarantee holistic completeness. Finally, Neuro-Symbolic Topological Alignment utilizes neuro-symbolic priors to constrain feature matching, ensuring activated regions are topologically consistent with human structural logic. Experimental results demonstrate that ICON not only maintains leading performance on standard benchmarks but also exhibits exceptional robustness against occlusion, background interference, and localization noise. This approach effectively advances the field by shifting from fitting statistical co-occurrences to learning causal invariance.

Related papers

Not All Preferences Are Created Equal: Stability-Aware and Gradient-Efficient Alignment for Reasoning Models [52.48582333951919]
We propose a dynamic framework designed to enhance alignment reliability by maximizing the Signal-to-Noise Ratio of policy updates.<n>SAGE (Stability-Aware Gradient Efficiency) integrates a coarse-grained curriculum mechanism that refreshes candidate pools based on model competence.<n> Experiments on multiple mathematical reasoning benchmarks demonstrate that SAGE significantly accelerates convergence and outperforms static baselines.
arXiv Detail & Related papers (2026-02-01T12:56:10Z)
Latent Adversarial Regularization for Offline Preference Optimization [21.271580780278473]
We introduce GANPO, which achieves latent-space regularization by penalizing divergence between internal representations of a policy model and a reference model.<n>Experiments across multiple model architectures and tasks show consistent improvements from latent-space regularization.
arXiv Detail & Related papers (2026-01-29T18:21:57Z)
Ontology Neural Networks for Topologically Conditioned Constraint Satisfaction [0.0]
We present an enhanced framework that integrates topological conditioning with gradient stabilization mechanisms.<n>The framework exhibits seed-independent convergence and graceful scaling behavior up to twenty-node problems.
arXiv Detail & Related papers (2026-01-08T18:01:52Z)
BiPrompt: Bilateral Prompt Optimization for Visual and Textual Debiasing in Vision-Language Models [7.174865411448373]
We propose a bilateral prompt optimization framework (BiPrompt) that simultaneously mitigates non-causal feature reliance in both modalities during test-time adaptation.<n>On the visual side, it employs structured attention-guided erasure to suppress background activations and enforce prediction consistency between causal and spurious regions.<n>On the textual side, it introduces balanced prompt normalization, a learnable re-centering mechanism that aligns class embeddings toward an isotropic semantic space.
arXiv Detail & Related papers (2026-01-05T14:22:20Z)
Adaptive Dual Uncertainty Optimization: Boosting Monocular 3D Object Detection under Test-Time Shifts [80.32933059529135]
Test-Time Adaptation (TTA) methods have emerged to adapt to target distributions during inference.<n>We propose Dual Uncertainty Optimization (DUO), the first TTA framework designed to jointly minimize both uncertainties for robust M3OD.<n>In parallel, we design a semantic-aware normal field constraint that preserves geometric coherence in regions with clear semantic cues.
arXiv Detail & Related papers (2025-08-28T07:09:21Z)
TRUST: Leveraging Text Robustness for Unsupervised Domain Adaptation [9.906359339999039]
We introduce a novel UDA approach that exploits the robustness of the language modality to guide the adaptation of a vision model.<n>We propose a multimodal soft-contrastive learning loss that aligns the vision and language feature spaces.<n>Our approach outperforms previous methods, setting the new state-of-the-art on classical (DomainNet) and complex (GeoNet) domain shifts.
arXiv Detail & Related papers (2025-08-08T16:51:44Z)
Latent Diffusion Model Based Denoising Receiver for 6G Semantic Communication: From Stochastic Differential Theory to Application [11.385703484113552]
We propose a novel semantic communication framework empowered by generative artificial intelligence (GAI)<n>A latent diffusion model (LDM)-based semantic communication framework is proposed that combines a variational autoencoder for semantic features extraction.<n>The proposed system is a training-free framework that supports zero-shot generalization, and achieves superior performance under low-SNR and out-of-distribution conditions.
arXiv Detail & Related papers (2025-06-06T03:20:32Z)
Certified Neural Approximations of Nonlinear Dynamics [51.01318247729693]
In safety-critical contexts, the use of neural approximations requires formal bounds on their closeness to the underlying system.<n>We propose a novel, adaptive, and parallelizable verification method based on certified first-order models.
arXiv Detail & Related papers (2025-05-21T13:22:20Z)
Tuning for Trustworthiness -- Balancing Performance and Explanation Consistency in Neural Network Optimization [49.567092222782435]
We introduce the novel concept of XAI consistency, defined as the agreement among different feature attribution methods.<n>We create a multi-objective optimization framework that balances predictive performance with explanation.<n>Our research provides a foundation for future investigations into whether models from the trade-off zone-balancing performance loss and XAI consistency-exhibit greater robustness.
arXiv Detail & Related papers (2025-05-12T13:19:14Z)
Topology-Aware Conformal Prediction for Stream Networks [68.02503121089633]
We propose Spatio-Temporal Adaptive Conformal Inference (textttCISTA), a novel framework that integrates network topology and temporal dynamics into the conformal prediction framework.<n>Our results show that textttCISTA effectively balances prediction efficiency and coverage, outperforming existing conformal prediction methods for stream networks.
arXiv Detail & Related papers (2025-03-06T21:21:15Z)
Neighbor-Aware Calibration of Segmentation Networks with Penalty-Based Constraints [19.897181782914437]
We propose a principled and simple solution based on equality constraints on the logit values, which enables to control explicitly both the enforced constraint and the weight of the penalty. Our approach can be used to train a wide span of deep segmentation networks.
arXiv Detail & Related papers (2024-01-25T19:46:57Z)
Towards Continual Learning Desiderata via HSIC-Bottleneck Orthogonalization and Equiangular Embedding [55.107555305760954]
We propose a conceptually simple yet effective method that attributes forgetting to layer-wise parameter overwriting and the resulting decision boundary distortion. Our method achieves competitive accuracy performance, even with absolute superiority of zero exemplar buffer and 1.02x the base model.
arXiv Detail & Related papers (2024-01-17T09:01:29Z)
Achieving Constraints in Neural Networks: A Stochastic Augmented Lagrangian Approach [49.1574468325115]
Regularizing Deep Neural Networks (DNNs) is essential for improving generalizability and preventing overfitting. We propose a novel approach to DNN regularization by framing the training process as a constrained optimization problem. We employ the Augmented Lagrangian (SAL) method to achieve a more flexible and efficient regularization mechanism.
arXiv Detail & Related papers (2023-10-25T13:55:35Z)
Robust Regularization with Adversarial Labelling of Perturbed Samples [22.37046166576859]
We propose Adversarial Labelling of Perturbed Samples (ALPS) as a regularization scheme. ALPS trains neural networks with synthetic samples formed by perturbing each authentic input sample towards another one along with an adversarially assigned label. Experiments on the SVHN, CIFAR-10, CIFAR-100 and Tiny-ImageNet datasets show that the ALPS has a state-of-the-art regularization performance.
arXiv Detail & Related papers (2021-05-28T11:26:49Z)

This list is automatically generated from the titles and abstracts of the papers in this site.