Learning from One and Only One Shot
- URL: http://arxiv.org/abs/2201.08815v2
- Date: Tue, 21 May 2024 05:19:52 GMT
- Title: Learning from One and Only One Shot
- Authors: Haizi Yu, Igor Mineyev, Lav R. Varshney, James A. Evans
- Abstract summary: Humans can generalize from only a few examples and from little pretraining on similar tasks.
Motivated by nativism and artificial general intelligence, we model human-innate priors in abstract visual tasks.
We achieve human-level recognition with only $1$--$10$ examples per class and no pretraining.
- Score: 15.835306446986856
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Humans can generalize from only a few examples and from little pretraining on similar tasks. Yet, machine learning (ML) typically requires large data to learn or pre-learn to transfer. Motivated by nativism and artificial general intelligence, we directly model human-innate priors in abstract visual tasks such as character and doodle recognition. This yields a white-box model that learns general-appearance similarity by mimicking how humans naturally "distort" an object at first sight. Using just nearest-neighbor classification on this cognitively-inspired similarity space, we achieve human-level recognition with only $1$--$10$ examples per class and no pretraining. This differs from few-shot learning, which relies on massive pretraining. In the tiny-data regime of the MNIST, EMNIST, Omniglot, and QuickDraw benchmarks, we outperform both modern neural networks and classical ML. For unsupervised learning, by learning the non-Euclidean, general-appearance similarity space in a $k$-means style, we achieve multifarious visual realizations of abstract concepts by generating human-intuitive archetypes as cluster centroids.
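To make the classification step concrete, here is a minimal Python sketch of 1-NN over a pluggable similarity function. It is not the authors' code: the paper's contribution is the white-box, cognitively-inspired general-appearance similarity itself, and the placeholder `similarity` (negative Euclidean distance) and the name `one_shot_classify` below are illustrative assumptions only.

```python
# Minimal sketch (not the paper's implementation): 1-NN classification on
# top of a pluggable similarity function. The placeholder `similarity`
# stands in for the paper's learned, non-Euclidean general-appearance
# similarity; negative Euclidean distance is used purely for runnability.
import numpy as np

def similarity(a: np.ndarray, b: np.ndarray) -> float:
    """Placeholder similarity between two images (higher = more alike)."""
    return -float(np.linalg.norm(a.ravel() - b.ravel()))

def one_shot_classify(query, exemplars, labels):
    """1-NN: assign the query the label of its most similar exemplar."""
    scores = [similarity(query, e) for e in exemplars]
    return labels[int(np.argmax(scores))]

# Toy usage: one 4x4 exemplar per class ("one and only one shot"), then
# classify a lightly perturbed copy of the class-"B" exemplar.
rng = np.random.default_rng(0)
exemplars = [rng.random((4, 4)) for _ in range(3)]
labels = ["A", "B", "C"]
query = exemplars[1] + 0.05 * rng.standard_normal((4, 4))
print(one_shot_classify(query, exemplars, labels))  # expected: B
```

Swapping the paper's distortion-based similarity into `similarity` would leave the surrounding 1-NN logic unchanged; the $k$-means-style clustering the abstract mentions would likewise replace Euclidean distance with this similarity when assigning points and forming archetype centroids.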
Related papers
- Evaluating alignment between humans and neural network representations in image-based learning tasks [5.657101730705275]
We tested how well the representations of $86$ pretrained neural network models mapped to human learning trajectories.
We found that while training dataset size was a core determinant of alignment with human choices, contrastive training with multi-modal data (text and imagery) was a common feature of publicly available models that predicted human generalisation.
In conclusion, pretrained neural networks can serve to extract representations for cognitive models, as they appear to capture some fundamental aspects of cognition that are transferable across tasks.
arXiv Detail & Related papers (2023-06-15T08:18:29Z)
- Neural Novel Actor: Learning a Generalized Animatable Neural Representation for Human Actors [98.24047528960406]
We propose a new method for learning a generalized animatable neural representation from a sparse set of multi-view imagery of multiple persons.
The learned representation can be used to synthesize novel view images of an arbitrary person from a sparse set of cameras, and further animate them with the user's pose control.
arXiv Detail & Related papers (2022-08-25T07:36:46Z)
- Learning Primitive-aware Discriminative Representations for Few-shot Learning [28.17404445820028]
Few-shot learning aims to learn a classifier that can be easily adapted to recognize novel classes with only a few labeled examples.
We propose a Primitive Mining and Reasoning Network (PMRN) to learn primitive-aware representations.
Our method achieves state-of-the-art results on six standard benchmarks.
arXiv Detail & Related papers (2022-08-20T16:22:22Z)
- Rethinking Nearest Neighbors for Visual Classification [56.00783095670361]
k-NN is a lazy learning method that aggregates the distances between a test image and its top-k neighbors in the training set.
We adopt k-NN with pre-trained visual representations produced by either supervised or self-supervised methods in two steps.
Via extensive experiments on a wide range of classification tasks, our study reveals the generality and flexibility of k-NN integration (a hedged sketch of the two-step recipe appears after this list).
arXiv Detail & Related papers (2021-12-15T20:15:01Z)
- WenLan 2.0: Make AI Imagine via a Multimodal Foundation Model [74.4875156387271]
We develop a novel foundation model pre-trained on large-scale multimodal (visual and textual) data.
We show that state-of-the-art results can be obtained on a wide range of downstream tasks.
arXiv Detail & Related papers (2021-10-27T12:25:21Z)
- Fast Concept Mapping: The Emergence of Human Abilities in Artificial Neural Networks when Learning Embodied and Self-Supervised [0.0]
We introduce a setup in which an artificial agent first learns in a simulated world through self-supervised exploration.
We then apply a method we call fast concept mapping, which exploits correlated firing patterns of neurons to define and detect semantic concepts.
arXiv Detail & Related papers (2021-02-03T17:19:49Z)
- CLASTER: Clustering with Reinforcement Learning for Zero-Shot Action Recognition [52.66360172784038]
We propose a clustering-based model, which considers all training samples at once, instead of optimizing for each instance individually.
We call the proposed method CLASTER and observe that it consistently improves over the state-of-the-art across all standard datasets.
arXiv Detail & Related papers (2021-01-18T12:46:24Z)
- Reinforcement Based Learning on Classification Task Could Yield Better Generalization and Adversarial Accuracy [0.0]
We propose a novel method to train deep learning models on an image classification task.
We use a reward-based optimization objective, similar to the vanilla policy gradient method used in reinforcement learning (a hedged sketch of this idea also appears after this list).
arXiv Detail & Related papers (2020-12-08T11:03:17Z)
- Unsupervised One-shot Learning of Both Specific Instances and Generalised Classes with a Hippocampal Architecture [0.0]
Distinguishing specific instances is necessary for many real-world tasks, such as remembering which cup belongs to you.
Generalisation within classes conflicts with the ability to separate instances of classes, making it difficult to achieve both capabilities within a single architecture.
We propose an extension to the standard Omniglot classification-generalisation framework that tests the ability to distinguish specific instances after one exposure.
arXiv Detail & Related papers (2020-10-30T00:10:23Z)
- What Can You Learn from Your Muscles? Learning Visual Representation from Human Interactions [50.435861435121915]
We use human interaction and attention cues to investigate whether we can learn better representations compared to visual-only representations.
Our experiments show that our "muscly-supervised" representation outperforms MoCo, a state-of-the-art visual-only method.
arXiv Detail & Related papers (2020-10-02T03:19:46Z)
- Bongard-LOGO: A New Benchmark for Human-Level Concept Learning and Reasoning [78.13740873213223]
Bongard problems (BPs) were introduced as an inspirational challenge for visual cognition in intelligent systems.
We propose Bongard-LOGO, a new benchmark for human-level concept learning and reasoning.
arXiv Detail & Related papers (2020-10-02T03:19:46Z)
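As referenced in the "Rethinking Nearest Neighbors for Visual Classification" entry above, here is a hedged sketch of its two-step recipe: embed images with a frozen pre-trained encoder, then vote among the k most similar training embeddings. The `embed` stand-in (flatten plus L2-normalize), the cosine similarity, and `k=5` are illustrative assumptions, not the paper's exact setup.

```python
# Hedged sketch of the two-step k-NN recipe: (1) map images to embeddings
# with a frozen, pre-trained encoder; (2) classify by majority vote among
# the k nearest training embeddings. `embed` stands in for the supervised
# or self-supervised backbones the paper actually uses.
import numpy as np

def embed(images: np.ndarray) -> np.ndarray:
    """Placeholder encoder: flatten and L2-normalize each image."""
    flat = images.reshape(len(images), -1).astype(float)
    return flat / np.linalg.norm(flat, axis=1, keepdims=True)

def knn_predict(train_x, train_y, test_x, k=5):
    """Cosine-similarity k-NN with majority voting (integer labels)."""
    tr, te = embed(train_x), embed(test_x)
    sims = te @ tr.T                          # cosine sims (unit vectors)
    topk = np.argsort(-sims, axis=1)[:, :k]   # indices of k nearest
    votes = train_y[topk]                     # their labels, shape (n, k)
    return np.array([np.bincount(v).argmax() for v in votes])

# Toy usage with random "images" and integer class labels.
rng = np.random.default_rng(0)
train_x, train_y = rng.random((30, 8, 8)), rng.integers(0, 3, size=30)
test_x = rng.random((5, 8, 8))
print(knn_predict(train_x, train_y, test_x, k=5))
```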
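Similarly, for the "Reinforcement Based Learning on Classification Task" entry, here is a hedged PyTorch sketch of reward-based classification training in the spirit of vanilla policy gradient (REINFORCE). The linear model, the +1/-1 reward scheme, and the hyperparameters are assumptions for illustration, not that paper's setup.

```python
# Hedged sketch: treat the predicted class as an action sampled from the
# model's softmax policy and reinforce it with a +1/-1 reward (REINFORCE).
# Model, reward scheme, and hyperparameters are illustrative assumptions.
import torch
import torch.nn as nn

model = nn.Sequential(nn.Flatten(), nn.Linear(28 * 28, 10))
opt = torch.optim.SGD(model.parameters(), lr=0.01)

def reinforce_step(images: torch.Tensor, targets: torch.Tensor) -> float:
    logits = model(images)
    dist = torch.distributions.Categorical(logits=logits)
    actions = dist.sample()                        # sampled class labels
    rewards = (actions == targets).float() * 2.0 - 1.0  # +1 right, -1 wrong
    loss = -(rewards * dist.log_prob(actions)).mean()
    opt.zero_grad()
    loss.backward()
    opt.step()
    return loss.item()

# Toy usage on random MNIST-shaped data.
x, y = torch.randn(32, 1, 28, 28), torch.randint(0, 10, (32,))
print(reinforce_step(x, y))
```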