Related papers: Bridging the Semantic Chasm: Synergistic Conceptual Anchoring for Generalized Few-Shot and Zero-Shot OOD Perception

Bridging the Semantic Chasm: Synergistic Conceptual Anchoring for Generalized Few-Shot and Zero-Shot OOD Perception

URL: http://arxiv.org/abs/2602.00340v1
Date: Fri, 30 Jan 2026 21:44:33 GMT
Title: Bridging the Semantic Chasm: Synergistic Conceptual Anchoring for Generalized Few-Shot and Zero-Shot OOD Perception
Authors: Alexandros Christoforos, Sarah Jenkins, Michael Brown, Tuan Pham, David Chen,
Abstract summary: This manuscript presents a pioneering Synergistic Neural Agents Network (SynerNet) framework designed to mitigate the phenomenon of cross-modal alignment degeneration.<n>Four specialized computational units - visual perception, linguistic context, nominal embedding, and global coordination - collaboratively rectify modality disparities.<n> Empirical evaluations conducted on the VISTA-Beyond benchmark demonstrate that SynerNet yields substantial performance augmentations in both few-shot and zero-shot scenarios.
Score: 39.37877716254272
License: http://creativecommons.org/licenses/by/4.0/
Abstract: This manuscript presents a pioneering Synergistic Neural Agents Network (SynerNet) framework designed to mitigate the phenomenon of cross-modal alignment degeneration in Vision-Language Models (VLMs) when encountering Out-of-Distribution (OOD) concepts. Specifically, four specialized computational units - visual perception, linguistic context, nominal embedding, and global coordination - collaboratively rectify modality disparities via a structured message-propagation protocol. The principal contributions encompass a multi-agent latent space nomenclature acquisition framework, a semantic context-interchange algorithm for enhanced few-shot adaptation, and an adaptive dynamic equilibrium mechanism. Empirical evaluations conducted on the VISTA-Beyond benchmark demonstrate that SynerNet yields substantial performance augmentations in both few-shot and zero-shot scenarios, exhibiting precision improvements ranging from 1.2% to 5.4% across a diverse array of domains.

Related papers

AdvSynGNN: Structure-Adaptive Graph Neural Nets via Adversarial Synthesis and Self-Corrective Propagation [8.765438402697892]
Graph neural networks frequently encounter significant performance degradation when confronted with structural noise or non-homophilous topologies.<n>We present AdvSynGNN, a comprehensive architecture designed for resilient node-level representation learning.
arXiv Detail & Related papers (2026-02-19T04:26:57Z)
SSVP: Synergistic Semantic-Visual Prompting for Industrial Zero-Shot Anomaly Detection [55.54007781679915]
We propose Synergistic Semantic-Visual Prompting (SSVP), that efficiently fuses diverse visual encodings to elevate model's fine-grained perception.<n>SSVP achieves state-of-the-art performance with 93.0% Image-AUROC and 92.2% Pixel-AUROC on MVTec-AD, significantly outperforming existing zero-shot approaches.
arXiv Detail & Related papers (2026-01-14T04:42:19Z)
Multi-Agent Cooperative Learning for Robust Vision-Language Alignment under OOD Concepts [0.0]
This paper introduces a novel Multi-Agent Cooperative Learning framework to address cross-modal alignment collapse in vision-language models.<n>Experiments on the VISTA-Beyond dataset demonstrate that MACL significantly improves performance in both few-shot and zero-shot settings.
arXiv Detail & Related papers (2026-01-11T20:36:47Z)
GTMA: Dynamic Representation Optimization for OOD Vision-Language Models [10.940718051047023]
Vision-Matching models (VLMs) struggle in open-world applications, where out-of-distribution (OOD) concepts often trigger cross-modal alignment collapse.<n>We propose dynamic representation optimization, realized through the Guided Target-language Adaptation (GTMA) framework.<n> Experiments on ImageNet-R and the VISTA-Beyond benchmark demonstrate that GTMA improves zero-shot and few-shot OOD accuracy by up to 15-20 percent over the base VLM.
arXiv Detail & Related papers (2025-12-20T20:44:07Z)
Perception-Consistency Multimodal Large Language Models Reasoning via Caption-Regularized Policy Optimization [72.30168853571216]
multimodal large language models excel at tasks that integrate visual perception with symbolic reasoning.<n>CapPO integrates two key mechanisms: (1) a caption-based consistency regularization, which minimizes the divergence between responses conditioned on raw images and those conditioned on captions, and (2) a KL-weighted advantage estimation scheme, which adaptively scales reinforcement signals to strengthen perceptually consistent trajectories.
arXiv Detail & Related papers (2025-09-26T04:32:26Z)
Tuning for Trustworthiness -- Balancing Performance and Explanation Consistency in Neural Network Optimization [49.567092222782435]
We introduce the novel concept of XAI consistency, defined as the agreement among different feature attribution methods.<n>We create a multi-objective optimization framework that balances predictive performance with explanation.<n>Our research provides a foundation for future investigations into whether models from the trade-off zone-balancing performance loss and XAI consistency-exhibit greater robustness.
arXiv Detail & Related papers (2025-05-12T13:19:14Z)
Federated Contrastive Learning for Personalized Semantic Communication [55.46383524190467]
We design a federated contrastive learning framework aimed at supporting personalized semantic communication. FedCL enables collaborative training of local semantic encoders across multiple clients and a global semantic decoder owned by the base station. To tackle the semantic imbalance issue arising from heterogeneous datasets across distributed clients, we employ contrastive learning to train a semantic centroid generator.
arXiv Detail & Related papers (2024-06-13T14:45:35Z)
Reasoning with the Theory of Mind for Pragmatic Semantic Communication [62.87895431431273]
A pragmatic semantic communication framework is proposed in this paper. It enables effective goal-oriented information sharing between two-intelligent agents. Numerical evaluations demonstrate the framework's ability to achieve efficient communication with a reduced amount of bits.
arXiv Detail & Related papers (2023-11-30T03:36:19Z)

This list is automatically generated from the titles and abstracts of the papers in this site.

This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.