HOMOE: A Memory-Based and Composition-Aware Framework for Zero-Shot
Learning with Hopfield Network and Soft Mixture of Experts
- URL: http://arxiv.org/abs/2311.14747v1
- Date: Thu, 23 Nov 2023 07:32:20 GMT
- Title: HOMOE: A Memory-Based and Composition-Aware Framework for Zero-Shot
Learning with Hopfield Network and Soft Mixture of Experts
- Authors: Do Huu Dat, Po Yuan Mao, Tien Hoang Nguyen, Wray Buntine, Mohammed
Bennamoun
- Abstract summary: We propose a novel framework that combines the Modern Hopfield Network with a Mixture of Experts to classify the compositions of previously unseen objects.
Our approach achieves SOTA performance on several benchmarks, including MIT-States and UT-Zappos.
- Score: 25.930021907054797
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Compositional Zero-Shot Learning (CZSL) has emerged as an essential paradigm
in machine learning, aiming to overcome the constraints of traditional
zero-shot learning by incorporating compositional thinking into its
methodology. Conventional zero-shot learning has difficulty managing unfamiliar
combinations of seen and unseen classes because it depends on pre-defined class
embeddings. In contrast, Compositional Zero-Shot Learning uses the inherent
hierarchies and structural connections among classes, creating new class
representations by combining attributes, components, or other semantic
elements. In our paper, we propose a novel framework that, for the first time,
combines the Modern Hopfield Network with a Mixture of Experts (HOMOE) to
classify the compositions of previously unseen objects. Specifically, the
Modern Hopfield Network creates a memory that stores label prototypes and
identifies relevant labels for a given input image. Following this, the Mixture
of Experts model integrates the image with the matching prototype to produce the
final composition classification. Our approach achieves SOTA performance on
several benchmarks, including MIT-States and UT-Zappos. We also examine how
each component contributes to improved generalization.
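To make the two-stage design concrete, below is a minimal NumPy sketch of the pipeline the abstract describes: a Modern Hopfield update retrieves a soft combination of stored label prototypes for an image feature, and a mixture of experts maps the fused result to composition logits. The dimensions, inverse temperature, concatenation-based fusion, and linear experts are illustrative assumptions, not the paper's exact architecture.

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(z, axis=-1):
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

# Hypothetical sizes: feature dim, stored label prototypes, experts, compositions.
d, n_labels, n_experts, n_comp = 64, 115, 4, 400
prototypes = rng.normal(size=(n_labels, d))          # Hopfield memory of label prototypes
W_gate = rng.normal(size=(2 * d, n_experts)) * 0.1   # gating weights for the expert mixture
experts = [rng.normal(size=(2 * d, n_comp)) * 0.1 for _ in range(n_experts)]

def homoe_forward(img_feat, beta=1.0):
    # Modern Hopfield update: softmax attention over stored prototypes
    # returns a convex combination of the labels relevant to the image.
    attn = softmax(beta * prototypes @ img_feat)
    retrieved = attn @ prototypes
    # Fuse the image feature with the retrieved prototype (concatenation here).
    fused = np.concatenate([img_feat, retrieved])
    # Mixture of experts: each (linear) expert scores the fused input,
    # and the gate mixes their outputs softly.
    gates = softmax(fused @ W_gate)
    logits = sum(g * (fused @ E) for g, E in zip(gates, experts))
    return softmax(logits)  # distribution over compositions

probs = homoe_forward(rng.normal(size=d))
print(probs.shape, round(probs.sum(), 6))  # (400,) 1.0
```

The Hopfield step above is the standard modern-Hopfield attention update (softmax over dot products with stored patterns); the paper's actual expert networks and fusion scheme may differ.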
Related papers
- Meta-Exploiting Frequency Prior for Cross-Domain Few-Shot Learning [86.99944014645322]
We introduce a novel framework, Meta-Exploiting Frequency Prior for Cross-Domain Few-Shot Learning.
We decompose each query image into its high-frequency and low-frequency components and incorporate both into the feature embedding network in parallel (a decomposition sketch follows this entry).
Our framework establishes new state-of-the-art results on multiple cross-domain few-shot learning benchmarks.
arXiv Detail & Related papers (2024-11-03T04:02:35Z)
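As a rough illustration of the decomposition step this entry mentions, here is a hedged NumPy sketch that splits an image into low- and high-frequency components with an FFT and a hypothetical radial cutoff; the paper's actual filter, and how each stream feeds the embedding network, may differ.

```python
import numpy as np

def frequency_split(image, cutoff=0.1):
    """Split a 2-D image into low- and high-frequency components via FFT."""
    h, w = image.shape
    spectrum = np.fft.fftshift(np.fft.fft2(image))
    yy, xx = np.ogrid[:h, :w]
    radius = np.hypot(yy - h / 2, xx - w / 2)
    low_mask = radius <= cutoff * min(h, w)   # keep a central low-frequency disc
    low = np.fft.ifft2(np.fft.ifftshift(spectrum * low_mask)).real
    high = image - low                        # residual carries the high frequencies
    return low, high

img = np.random.default_rng(0).normal(size=(84, 84))
low, high = frequency_split(img)
print(np.allclose(low + high, img))  # True: the split is exact by construction
```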
- IterComp: Iterative Composition-Aware Feedback Learning from Model Gallery for Text-to-Image Generation [70.8833857249951]
IterComp is a novel framework that aggregates composition-aware model preferences from multiple models.
We propose an iterative feedback learning method to enhance compositionality in a closed-loop manner.
IterComp opens new research avenues in reward feedback learning for diffusion models and compositional generation.
arXiv Detail & Related papers (2024-10-09T17:59:13Z)
- Cross-composition Feature Disentanglement for Compositional Zero-shot Learning [49.919635694894204]
Disentanglement of visual features of primitives (i.e., attributes and objects) has shown exceptional results in Compositional Zero-shot Learning (CZSL).
We propose the solution of cross-composition feature disentanglement, which takes multiple primitive-sharing compositions as inputs and constrains the disentangled primitive features to be general across these compositions (a toy sketch of this constraint follows this entry).
arXiv Detail & Related papers (2024-08-19T08:23:09Z)
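Below is a toy NumPy sketch of the cross-composition idea: features from compositions sharing an attribute are disentangled by (assumed) linear heads, and a simple variance penalty pulls the shared-attribute features together. Both the heads and the penalty form are illustrative stand-ins for the paper's actual losses.

```python
import numpy as np

rng = np.random.default_rng(0)
d = 64
W_attr = rng.normal(size=(d, d))  # (assumed) linear head for attribute features
W_obj = rng.normal(size=(d, d))   # (assumed) linear head for object features

def disentangle(feats):
    """Split composition features into attribute and object parts."""
    return feats @ W_attr, feats @ W_obj

def cross_composition_penalty(attr_feats):
    """Pull attribute features from attribute-sharing compositions together,
    encouraging one general representation of the shared attribute."""
    center = attr_feats.mean(axis=0)
    return np.mean(np.sum((attr_feats - center) ** 2, axis=1))

# Features from three compositions sharing one attribute, e.g. "wet dog",
# "wet car", "wet street".
feats = rng.normal(size=(3, d))
attr_f, obj_f = disentangle(feats)
print(cross_composition_penalty(attr_f))  # lower = more general attribute feature
```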
- CSCNET: Class-Specified Cascaded Network for Compositional Zero-Shot Learning [62.090051975043544]
Attribute and object (A-O) disentanglement is a fundamental and critical problem for Compositional Zero-shot Learning (CZSL).
We propose a novel A-O disentangled framework for CZSL, namely Class-specified Cascaded Network (CSCNet).
arXiv Detail & Related papers (2024-03-09T14:18:41Z)
- Hierarchical Visual Primitive Experts for Compositional Zero-Shot Learning [52.506434446439776]
Compositional zero-shot learning (CZSL) aims to recognize compositions with prior knowledge of known primitives (attributes and objects).
We propose a simple and scalable framework called Composition Transformer (CoT) to address these issues.
Our method achieves SoTA performance on several benchmarks, including MIT-States, C-GQA, and VAW-CZSL.
arXiv Detail & Related papers (2023-08-08T03:24:21Z)
- Mutual Balancing in State-Object Components for Compositional Zero-Shot Learning [0.0]
Compositional Zero-Shot Learning (CZSL) aims to recognize unseen compositions from seen states and objects.
We propose a novel method called MUtual balancing in STate-object components (MUST) for CZSL, which provides a balancing inductive bias for the model.
Our approach significantly outperforms the state-of-the-art on MIT-States, UT-Zappos, and C-GQA when combined with the basic CZSL frameworks.
arXiv Detail & Related papers (2022-11-19T10:21:22Z)
- KG-SP: Knowledge Guided Simple Primitives for Open World Compositional Zero-Shot Learning [52.422873819371276]
The goal of open-world compositional zero-shot learning (OW-CZSL) is to recognize compositions of states and objects in images.
Here, we revisit a simple CZSL baseline and predict the primitives, i.e. states and objects, independently.
We estimate the feasibility of each composition through external knowledge, using this prior to remove unfeasible compositions from the output space.
Our model, Knowledge-Guided Simple Primitives (KG-SP), achieves state of the art in both OW-CZSL and pCZSL (a toy sketch of this predict-then-mask recipe follows this entry).
arXiv Detail & Related papers (2022-05-13T17:18:15Z)
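A minimal NumPy sketch of the recipe summarized above: predict states and objects independently, mask out pairs an external knowledge source deems unfeasible, and renormalize. The tiny hand-written feasibility table here is a hypothetical stand-in for the paper's external knowledge prior.

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

states = ["wet", "dry", "sliced"]
objects = ["dog", "car", "apple"]
# Hypothetical external-knowledge prior: 1 = feasible composition, 0 = not.
feasible = np.array([[1, 1, 1],   # wet dog / wet car / wet apple
                     [1, 1, 1],   # dry dog / dry car / dry apple
                     [0, 0, 1]])  # only "sliced apple" is plausible

def predict(state_logits, obj_logits):
    # Predict the two primitives independently, as the summary describes.
    p_state, p_obj = softmax(state_logits), softmax(obj_logits)
    # Remove unfeasible pairs from the output space, then renormalize.
    joint = np.outer(p_state, p_obj) * feasible
    joint /= joint.sum()
    s, o = np.unravel_index(joint.argmax(), joint.shape)
    return states[s], objects[o]

rng = np.random.default_rng(0)
print(predict(rng.normal(size=3), rng.normal(size=3)))
```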
- COMPAS: Representation Learning with Compositional Part Sharing for Few-Shot Classification [10.718573053194742]
Few-shot image classification consists of two consecutive learning processes.
Inspired by the compositional representation of objects in humans, we train a neural network architecture that explicitly represents objects as a set of parts.
We demonstrate the value of our compositional learning framework for a few-shot classification using miniImageNet, tieredImageNet, CIFAR-FS, and FC100.
arXiv Detail & Related papers (2021-01-28T09:16:21Z)
- Compositional Embeddings for Multi-Label One-Shot Learning [30.748605784254355]
We present a compositional embedding framework that infers not just a single class per input image, but a set of classes, in the setting of one-shot learning.
Experiments on the OmniGlot, Open Images, and COCO datasets show that the proposed compositional embedding models outperform existing embedding methods (a toy set-prediction sketch follows this entry).
arXiv Detail & Related papers (2020-02-11T03:54:30Z)
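As a rough sketch of the set-prediction idea, the toy NumPy code below scores an image embedding against one-shot class embeddings and returns every class above a threshold instead of a single argmax; the cosine score and fixed threshold are illustrative assumptions, not the paper's model.

```python
import numpy as np

rng = np.random.default_rng(0)
d, n_classes = 64, 10
class_emb = rng.normal(size=(n_classes, d))  # one embedding per class (one-shot)

def predict_set(img_emb, threshold=0.2):
    """Return every class whose cosine similarity clears the threshold."""
    sims = class_emb @ img_emb
    sims = sims / (np.linalg.norm(class_emb, axis=1) * np.linalg.norm(img_emb))
    return np.flatnonzero(sims > threshold)

# An image containing classes 2 and 5: its embedding mixes both plus noise.
img = class_emb[2] + class_emb[5] + 0.1 * rng.normal(size=d)
print(predict_set(img))  # likely [2 5]
```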
This list is automatically generated from the titles and abstracts of the papers on this site.