Distributed Zero-Shot Learning for Visual Recognition
- URL: http://arxiv.org/abs/2511.08170v1
- Date: Wed, 12 Nov 2025 01:44:24 GMT
- Title: Distributed Zero-Shot Learning for Visual Recognition
- Authors: Zhi Chen, Yadan Luo, Zi Huang, Jingjing Li, Sen Wang, Xin Yu,
- Abstract summary: A Distributed Zero-Shot Learning (DistZSL) framework can fully exploit decentralized data to learn an effective model for unseen classes.<n>We introduce two key components to ensure the effective learning of DistZSL: a cross-node attribute regularizer and a global attribute-to-visual consensus.
- Score: 54.776277273875195
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: In this paper, we propose a Distributed Zero-Shot Learning (DistZSL) framework that can fully exploit decentralized data to learn an effective model for unseen classes. Considering the data heterogeneity issues across distributed nodes, we introduce two key components to ensure the effective learning of DistZSL: a cross-node attribute regularizer and a global attribute-to-visual consensus. Our proposed cross-node attribute regularizer enforces the distances between attribute features to be similar across different nodes. In this manner, the overall attribute feature space would be stable during learning, and thus facilitate the establishment of visual-to-attribute(V2A) relationships. Then, we introduce the global attribute-tovisual consensus to mitigate biased V2A mappings learned from individual nodes. Specifically, we enforce the bilateral mapping between the attribute and visual feature distributions to be consistent across different nodes. Thus, the learned consistent V2A mapping can significantly enhance zero-shot learning across different nodes. Extensive experiments demonstrate that DistZSL achieves superior performance to the state-of-the-art in learning from distributed data.
Related papers
- Semantic-Aligned Learning with Collaborative Refinement for Unsupervised VI-ReID [82.12123628480371]
Unsupervised person re-identification (USL-VI-ReID) seeks to match pedestrian images of the same individual across different modalities without human annotations for model learning.<n>Previous methods unify pseudo-labels of cross-modality images through label association algorithms and then design contrastive learning framework for global feature learning.<n>We propose a Semantic-Aligned Learning with Collaborative Refinement (SALCR) framework, which builds up objective for specific fine-grained patterns emphasized by each modality.
arXiv Detail & Related papers (2025-04-27T13:58:12Z) - Hybrid Discriminative Attribute-Object Embedding Network for Compositional Zero-Shot Learning [83.10178754323955]
Hybrid Discriminative Attribute-Object Embedding (HDA-OE) network is proposed to solve the problem of complex interactions between attributes and object visual representations.<n>To increase the variability of training data, HDA-OE introduces an attribute-driven data synthesis (ADDS) module.<n>To further improve the discriminative ability of the model, HDA-OE introduces the subclass-driven discriminative embedding (SDDE) module.<n>The proposed model has been evaluated on three benchmark datasets, and the results verify its effectiveness and reliability.
arXiv Detail & Related papers (2024-11-28T09:50:25Z) - Matcha: Mitigating Graph Structure Shifts with Test-Time Adaptation [66.40525136929398]
Test-time adaptation (TTA) has attracted attention due to its ability to adapt a pre-trained model to a target domain, without re-accessing the source domain.<n>We propose Matcha, an innovative framework designed for effective and efficient adaptation to structure shifts in graphs.<n>We validate the effectiveness of Matcha on both synthetic and real-world datasets, demonstrating its robustness across various combinations of structure and attribute shifts.
arXiv Detail & Related papers (2024-10-09T15:15:40Z) - CREST: Cross-modal Resonance through Evidential Deep Learning for Enhanced Zero-Shot Learning [48.46511584490582]
Zero-shot learning (ZSL) enables the recognition of novel classes by leveraging semantic knowledge transfer from known to unknown categories.
Real-world challenges such as distribution imbalances and attribute co-occurrence hinder the discernment of local variances in images.
We propose a bidirectional cross-modal ZSL approach CREST to overcome these challenges.
arXiv Detail & Related papers (2024-04-15T10:19:39Z) - Dual Feature Augmentation Network for Generalized Zero-shot Learning [14.410978100610489]
Zero-shot learning (ZSL) aims to infer novel classes without training samples by transferring knowledge from seen classes.
Existing embedding-based approaches for ZSL typically employ attention mechanisms to locate attributes on an image.
We propose a novel Dual Feature Augmentation Network (DFAN), which comprises two feature augmentation modules.
arXiv Detail & Related papers (2023-09-25T02:37:52Z) - MUSE: Multi-View Contrastive Learning for Heterophilic Graphs [4.409889336732851]
We propose a multi-view contrastive learning model for heterophilic graphs, namely, MUSE.
In this work, we construct two views to capture the information of the ego node and its neighborhood by GNNs enhanced with contrastive learning.
We employ an information fusion controller to model the diversity of node-neighborhood similarity at both the local and global levels.
arXiv Detail & Related papers (2023-07-29T16:54:34Z) - STERLING: Synergistic Representation Learning on Bipartite Graphs [78.86064828220613]
A fundamental challenge of bipartite graph representation learning is how to extract node embeddings.
Most recent bipartite graph SSL methods are based on contrastive learning which learns embeddings by discriminating positive and negative node pairs.
We introduce a novel synergistic representation learning model (STERLING) to learn node embeddings without negative node pairs.
arXiv Detail & Related papers (2023-01-25T03:21:42Z) - Variational Co-embedding Learning for Attributed Network Clustering [30.7006907516984]
Recent works for attributed network clustering utilize graph convolution to obtain node embeddings and simultaneously perform clustering assignments on the embedding space.
We propose a variational co-embedding learning model for attributed network clustering (ANC)
ANC is composed of dual variational auto-encoders to simultaneously embed nodes and attributes.
arXiv Detail & Related papers (2021-04-15T08:11:47Z) - Bidirectional Mapping Coupled GAN for Generalized Zero-Shot Learning [7.22073260315824]
Bidirectional mapping-based generalized zero-shot learning (GZSL) methods rely on the quality of synthesized features to recognize seen and unseen data.
We learn a joint distribution of seen-unseen domains and preserving domain distinction is crucial for these methods.
In this work, we utilize the available unseen class semantics alongside seen class semantics and learn joint distribution through a strong visual-semantic coupling.
arXiv Detail & Related papers (2020-12-30T06:11:29Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.