Fuzzy Simplicial Networks: A Topology-Inspired Model to Improve Task
Generalization in Few-shot Learning
- URL: http://arxiv.org/abs/2009.11253v1
- Date: Wed, 23 Sep 2020 17:01:09 GMT
- Title: Fuzzy Simplicial Networks: A Topology-Inspired Model to Improve Task
Generalization in Few-shot Learning
- Authors: Henry Kvinge, Zachary New, Nico Courts, Jung H. Lee, Lauren A.
Phillips, Courtney D. Corley, Aaron Tuor, Andrew Avila, Nathan O. Hodas
- Abstract summary: Few-shot learning algorithms are designed to generalize well to new tasks with limited data.
We introduce a new few-shot model called Fuzzy Simplicial Networks (FSN) which leverages a construction from topology to more flexibly represent each class from limited data.
- Score: 1.0062040918634414
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Deep learning has shown great success in settings with massive amounts of
data but has struggled when data is limited. Few-shot learning algorithms,
which seek to address this limitation, are designed to generalize well to new
tasks with limited data. Typically, models are evaluated on unseen classes and
datasets that are defined by the same fundamental task as they are trained for
(e.g. category membership). One can also ask how well a model can generalize to
fundamentally different tasks within a fixed dataset (for example: moving from
category membership to tasks that involve detecting object orientation or
quantity). To formalize this kind of shift we define a notion of "independence
of tasks" and identify three new sets of labels for established computer vision
datasets that test a model's ability to generalize to tasks which draw on
orthogonal attributes in the data. We use these datasets to investigate the
failure modes of metric-based few-shot models. Based on our findings, we
introduce a new few-shot model called Fuzzy Simplicial Networks (FSN) which
leverages a construction from topology to more flexibly represent each class
from limited data. In particular, FSN models can not only form multiple
representations for a given class but can also begin to capture the
low-dimensional structure which characterizes class manifolds in the encoded
space of deep networks. We show that FSN outperforms state-of-the-art models on
the challenging tasks we introduce in this paper while remaining competitive on
standard few-shot benchmarks.
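The abstract describes the construction only at a high level. Purely as a loose illustration (not the authors' implementation), the sketch below mimics the flavor of a fuzzy-simplicial class representation using UMAP-style fuzzy membership strengths: each class keeps all of its encoded support points, and a query is scored by its aggregate membership to each class, which allows a class to retain multiple representations rather than a single prototype. The encoder, the kernel width `sigma`, and the sum-based scoring rule are all assumptions; the paper's FSN builds a richer fuzzy simplicial set per class.

```python
import numpy as np

def fuzzy_membership(dists, rho, sigma):
    """UMAP-style fuzzy membership strength: close to 1 at the nearest
    support point, decaying exponentially with additional distance."""
    return np.exp(-np.maximum(dists - rho, 0.0) / sigma)

def classify_query(query_emb, support_embs_by_class, sigma=1.0):
    """Score a query embedding against each class's support embeddings.

    support_embs_by_class maps a class label to a (k, d) array of
    encoded support examples. The encoder producing these embeddings
    and the choice of sigma are placeholders, not from the paper.
    """
    scores = {}
    for label, embs in support_embs_by_class.items():
        dists = np.linalg.norm(embs - query_emb, axis=1)
        rho = dists.min()  # local-connectivity offset, as in UMAP
        # Summing memberships lets a class keep several distinct modes
        # of representation instead of collapsing to one prototype.
        scores[label] = float(fuzzy_membership(dists, rho, sigma).sum())
    return max(scores, key=scores.get)

# Toy usage: a 2-way, 5-shot episode with random 16-d embeddings.
rng = np.random.default_rng(0)
support = {"cat": rng.normal(0.0, 1.0, (5, 16)),
           "dog": rng.normal(3.0, 1.0, (5, 16))}
query = rng.normal(0.0, 1.0, 16)
print(classify_query(query, support))  # most likely "cat"
```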
Related papers
- Low-Rank Mixture-of-Experts for Continual Medical Image Segmentation [18.984447545932706]
"catastrophic forgetting" problem occurs when model forgets previously learned features when it is extended to new categories or tasks.
We propose a network by introducing the data-specific Mixture of Experts structure to handle the new tasks or categories.
We validate our method on both class-level and task-level continual learning challenges.
arXiv Detail & Related papers (2024-06-19T14:19:50Z)
- UniFS: Universal Few-shot Instance Perception with Point Representations [36.943019984075065]
We propose UniFS, a universal few-shot instance perception model that unifies a wide range of instance perception tasks.
Our approach makes minimal assumptions about the tasks, yet it achieves competitive results compared to highly specialized and well-optimized specialist models.
arXiv Detail & Related papers (2024-04-30T09:47:44Z)
- An Efficient General-Purpose Modular Vision Model via Multi-Task Heterogeneous Training [79.78201886156513]
We present a model that can perform multiple vision tasks and can be adapted to other downstream tasks efficiently.
Our approach achieves comparable results to single-task state-of-the-art models and demonstrates strong generalization on downstream tasks.
arXiv Detail & Related papers (2023-06-29T17:59:57Z)
- LLM2Loss: Leveraging Language Models for Explainable Model Diagnostics [5.33024001730262]
We propose an approach that can provide semantic insights into a model's patterns of failures and biases.
We show that an ensemble of such lightweight models can be used to generate insights on the performance of the black-box model.
arXiv Detail & Related papers (2023-05-04T23:54:37Z)
- Generalization Properties of Retrieval-based Models [50.35325326050263]
Retrieval-based machine learning methods have enjoyed success on a wide range of problems.
Despite growing literature showcasing the promise of these models, the theoretical underpinning for such models remains underexplored.
We present a formal treatment of retrieval-based models to characterize their generalization ability.
arXiv Detail & Related papers (2022-10-06T00:33:01Z)
- CHALLENGER: Training with Attribution Maps [63.736435657236505]
We show that utilizing attribution maps for training neural networks can improve regularization of models and thus increase performance.
In particular, we show that our generic domain-independent approach yields state-of-the-art results in vision, natural language processing and on time series tasks.
arXiv Detail & Related papers (2022-05-30T13:34:46Z)
- Few-Shot Class-Incremental Learning by Sampling Multi-Phase Tasks [59.12108527904171]
A model should recognize new classes and maintain discriminability over old classes.
The task of recognizing few-shot new classes without forgetting old classes is called few-shot class-incremental learning (FSCIL).
We propose a new paradigm for FSCIL based on meta-learning by LearnIng Multi-phase Incremental Tasks (LIMIT).
arXiv Detail & Related papers (2022-03-31T13:46:41Z)
- Mixture of basis for interpretable continual learning with distribution shifts [1.6114012813668934]
Continual learning in environments with shifting data distributions is a challenging problem with several real-world applications.
We propose a novel approach called mixture of basis models (MoB) to address this problem setting.
arXiv Detail & Related papers (2022-01-05T22:53:15Z)
- Self-Supervision based Task-Specific Image Collection Summarization [3.115375810642661]
We propose a novel approach to task-specific image corpus summarization using semantic information and self-supervision.
Our method uses a classification-based Wasserstein generative adversarial network (WGAN) as a feature generating network.
The model then generates a summary at inference time by using K-means clustering in the semantic embedding space (see the sketch after this list).
arXiv Detail & Related papers (2020-12-19T10:58:04Z)
- KGPT: Knowledge-Grounded Pre-Training for Data-to-Text Generation [100.79870384880333]
We propose knowledge-grounded pre-training (KGPT) to generate knowledge-enriched text.
We adopt three settings, namely fully-supervised, zero-shot, and few-shot, to evaluate its effectiveness.
Under the zero-shot setting, our model achieves over 30 ROUGE-L on WebNLG while all other baselines fail.
arXiv Detail & Related papers (2020-10-05T19:59:05Z)
- Pre-Trained Models for Heterogeneous Information Networks [57.78194356302626]
We propose a self-supervised pre-training and fine-tuning framework, PF-HIN, to capture the features of a heterogeneous information network.
PF-HIN consistently and significantly outperforms state-of-the-art alternatives on each of its evaluation tasks, across four datasets.
arXiv Detail & Related papers (2020-07-07T03:36:28Z)
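A side note on the image-collection-summarization entry above: its inference step (K-means clustering in a learned embedding space) is generic enough to sketch. The snippet below is a minimal, hedged mock-up of that step alone; the embeddings are assumed to come from some feature network (the paper uses a WGAN-based feature generator, which is not reproduced here), and the centroid-nearest selection rule is an assumption.

```python
import numpy as np
from sklearn.cluster import KMeans

def summarize(embeddings, k):
    """Pick k representative items: cluster the embeddings and return,
    for each cluster, the index of the item nearest its centroid.

    embeddings is an (n, d) array from some feature network; any
    encoder works for this sketch.
    """
    km = KMeans(n_clusters=k, n_init=10, random_state=0).fit(embeddings)
    picks = []
    for c in range(k):
        members = np.where(km.labels_ == c)[0]
        dists = np.linalg.norm(
            embeddings[members] - km.cluster_centers_[c], axis=1)
        picks.append(int(members[np.argmin(dists)]))
    return picks

# Toy usage: summarize 100 random "image embeddings" down to 3 picks.
rng = np.random.default_rng(0)
embs = rng.normal(size=(100, 64))
print(summarize(embs, k=3))
```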
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.