Neural networks that overcome classic challenges through practice
- URL: http://arxiv.org/abs/2410.10596v1
- Date: Mon, 14 Oct 2024 15:07:37 GMT
- Title: Neural networks that overcome classic challenges through practice
- Authors: Kazuki Irie, Brenden M. Lake
- Abstract summary: We review recent work that has used metalearning to help overcome some of these challenges.
We review applications of this principle to four classic challenges: systematicity, catastrophic forgetting, few-shot learning and multi-step reasoning.
- Score: 22.741266810854228
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Since the earliest proposals for neural network models of the mind and brain, critics have pointed out key weaknesses in these models compared to human cognitive abilities. Here we review recent work that has used metalearning to help overcome some of these challenges. We characterize these successes as addressing an important developmental problem: they provide machines with an incentive to improve X (where X represents the desired capability) and opportunities to practice it, through explicit optimization for X, unlike conventional approaches that hope to achieve X through generalization from related but different objectives. We review applications of this principle to four classic challenges: systematicity, catastrophic forgetting, few-shot learning and multi-step reasoning; we also discuss related aspects of human development in natural environments.
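To make the central principle concrete, here is a minimal sketch of optimizing directly for a capability through practice episodes, using a first-order MAML-style meta-learning loop on a toy linear-regression family. The setting and all constants are illustrative assumptions, not from the paper.
```python
import numpy as np

rng = np.random.default_rng(0)

def sample_task():
    # Each practice episode is a fresh 1-D linear regression y = a*x + b.
    a, b = rng.uniform(-2, 2, size=2)
    x = rng.uniform(-1, 1, size=10)
    return x, a * x + b

def loss_grad(w, x, y):
    # Model y_hat = w[0]*x + w[1]; returns MSE and its gradient w.r.t. w.
    err = w[0] * x + w[1] - y
    return np.mean(err ** 2), np.array([2 * np.mean(err * x), 2 * np.mean(err)])

w_meta = np.zeros(2)           # meta-learned initialization
inner_lr, outer_lr = 0.5, 0.05

for episode in range(2000):    # many opportunities to practice adaptation
    x, y = sample_task()
    xs, ys, xq, yq = x[:5], y[:5], x[5:], y[5:]   # support / query split
    _, g = loss_grad(w_meta, xs, ys)
    w_adapted = w_meta - inner_lr * g             # fast inner adaptation
    # Outer step (first-order MAML): improve the initialization by explicitly
    # optimizing post-adaptation (query) performance -- the "incentive" for
    # the desired capability X, rather than hoping X emerges as a side effect.
    _, gq = loss_grad(w_adapted, xq, yq)
    w_meta -= outer_lr * gq

print("meta-learned initialization:", w_meta)
```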
Related papers
- How Vision-Language Tasks Benefit from Large Pre-trained Models: A Survey [59.23394353614928]
In recent years, the rise of pre-trained models has driven research on vision-language tasks.
Inspired by the powerful capabilities of pre-trained models, new paradigms have emerged to solve the classic challenges.
arXiv Detail & Related papers (2024-12-11T07:29:04Z) - BloomWise: Enhancing Problem-Solving capabilities of Large Language Models using Bloom's-Taxonomy-Inspired Prompts [59.83547898874152]
We introduce BloomWise, a new prompting technique inspired by Bloom's taxonomy, to improve the performance of Large Language Models (LLMs).
The decision to employ more sophisticated cognitive skills is based on self-evaluation performed by the LLM itself.
In extensive experiments across four popular math reasoning datasets, we demonstrate the effectiveness of the proposed approach.
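A hedged sketch of what such an escalation loop could look like; the `llm` callable, the prompts, and the stopping rule below are placeholders, not the paper's exact implementation:
```python
BLOOM_LEVELS = ["remember", "understand", "apply", "analyze", "evaluate", "create"]

def llm(prompt: str) -> str:
    # Placeholder: plug in any chat-completion client here.
    raise NotImplementedError

def bloomwise(question: str) -> str:
    answer = ""
    for level in BLOOM_LEVELS:  # escalate through Bloom's cognitive skills
        answer = llm(f"Solve the problem using the '{level}' skill "
                     f"from Bloom's taxonomy:\n{question}")
        verdict = llm(f"Question: {question}\nAnswer: {answer}\n"
                      "Is this answer correct? Reply YES or NO.")
        if verdict.strip().upper().startswith("YES"):
            break  # self-evaluation judges this cognitive level sufficient
    return answer
```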
arXiv Detail & Related papers (2024-10-05T09:27:52Z) - Benign or Not-Benign Overfitting in Token Selection of Attention Mechanism [34.316270145027616]
We analyze benign overfitting in the token selection mechanism of the attention architecture.
To the best of our knowledge, this is the first study to characterize benign overfitting for the attention mechanism.
arXiv Detail & Related papers (2024-09-26T08:20:05Z) - Learn Beyond The Answer: Training Language Models with Reflection for Mathematical Reasoning [59.98430756337374]
Supervised fine-tuning enhances the problem-solving abilities of language models across various mathematical reasoning tasks.
Our work introduces a novel technique aimed at cultivating a deeper understanding of the training problems at hand.
We propose reflective augmentation, a method that embeds problem reflection into each training instance.
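As a rough illustration of embedding reflection into a training instance, the template below appends a reflection section to each training target; the exact format is an assumption, not the paper's template:
```python
def reflect_augment(problem: str, solution: str, reflection: str) -> dict:
    # Append a reflection (alternative view / generalization) to the target,
    # so the model is trained to look beyond the answer itself.
    return {"input": problem,
            "target": f"{solution}\n\n### Reflection\n{reflection}"}

example = reflect_augment(
    problem="If 3x + 5 = 20, what is x?",
    solution="3x = 20 - 5 = 15, so x = 5.",
    reflection="More generally, any equation ax + b = c is solved by "
               "x = (c - b) / a; here x = (20 - 5) / 3 = 5.",
)
```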
arXiv Detail & Related papers (2024-06-17T19:42:22Z) - Failures Are Fated, But Can Be Faded: Characterizing and Mitigating Unwanted Behaviors in Large-Scale Vision and Language Models [7.736445799116692]
Large deep neural networks that seem to perform surprisingly well on many tasks also exhibit failures related to accuracy, social biases, and alignment with human values.
We introduce a post-hoc method that uses deep reinforcement learning to explore and construct the landscape of failure modes in pre-trained discriminative and generative models.
We empirically show the effectiveness of the proposed method across common Computer Vision, Natural Language Processing, and Vision-Language tasks.
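The paper's method is deep-RL-based; as a simple stand-in, here is a minimal epsilon-greedy bandit that searches a space of input perturbations for failure modes (prediction flips). The perturbation operators and parameters are illustrative assumptions:
```python
import random

PERTURBATIONS = {
    "leet_typo": lambda s: s.replace("e", "3"),
    "uppercase": str.upper,
    "negation": lambda s: "not " + s,
}

def find_failures(model, inputs, steps=1000, eps=0.1):
    value = {name: 0.0 for name in PERTURBATIONS}  # running failure rate
    count = {name: 0 for name in PERTURBATIONS}
    failures = []
    for _ in range(steps):
        # epsilon-greedy: mostly exploit the operator that fails most often
        name = (random.choice(list(PERTURBATIONS)) if random.random() < eps
                else max(value, key=value.get))
        x = random.choice(inputs)
        failed = model(PERTURBATIONS[name](x)) != model(x)  # prediction flip
        count[name] += 1
        value[name] += (failed - value[name]) / count[name]
        if failed:
            failures.append((name, x))
    return value, failures
```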
arXiv Detail & Related papers (2024-06-11T10:45:41Z) - Avoiding Catastrophic Forgetting in Visual Classification Using Human Concept Formation [0.8159711103888622]
We propose Cobweb4V, a novel visual classification approach that builds on Cobweb, a human-like learning system.
In this research, we conduct a comprehensive evaluation, showcasing the proficiency of Cobweb4V in learning visual concepts.
These characteristics align with learning strategies in human cognition, positioning Cobweb4V as a promising alternative to neural network approaches.
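Cobweb organizes concepts by greedily maximizing category utility; a sketch of that classic score follows (the Cobweb4V-specific visual machinery is omitted):
```python
from collections import Counter

def category_utility(partition):
    """partition: list of clusters, each a list of instances; an instance is
    a dict mapping a nominal attribute name to its value."""
    instances = [x for cluster in partition for x in cluster]
    n = len(instances)

    def expected_correct(items):
        # Sum over attributes of sum_v P(attribute = v)^2.
        total = 0.0
        for attr in items[0]:
            counts = Counter(x[attr] for x in items)
            total += sum((c / len(items)) ** 2 for c in counts.values())
        return total

    base = expected_correct(instances)
    gain = sum(len(c) / n * (expected_correct(c) - base) for c in partition)
    return gain / len(partition)   # average gain per category

clusters = [
    [{"color": "red", "shape": "square"}, {"color": "red", "shape": "circle"}],
    [{"color": "blue", "shape": "circle"}],
]
print(category_utility(clusters))
```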
arXiv Detail & Related papers (2024-02-26T17:20:16Z) - Simple and Effective Transfer Learning for Neuro-Symbolic Integration [50.592338727912946]
A potential solution to this issue is Neuro-Symbolic Integration (NeSy), where neural approaches are combined with symbolic reasoning.
Most of these methods exploit a neural network to map perceptions to symbols and a logical reasoner to predict the output of the downstream task.
They suffer from several issues, including slow convergence, learning difficulties with complex perception tasks, and convergence to local minima.
This paper proposes a simple yet effective method to ameliorate these problems.
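The perception-to-symbols pipeline described above can be sketched as follows, using the classic MNIST-addition setting; the perception stub stands in for any trained classifier:
```python
def perceive(image) -> int:
    # Neural component: map a raw image to a digit symbol in 0..9.
    raise NotImplementedError("plug in a trained digit classifier here")

def reason(d1: int, d2: int) -> int:
    # Symbolic component: exact arithmetic over the predicted symbols.
    return d1 + d2

def nesy_addition(img1, img2) -> int:
    # Downstream task: only the sum is supervised, not the digit labels,
    # which is what makes the perception mapping hard to learn.
    return reason(perceive(img1), perceive(img2))
```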
arXiv Detail & Related papers (2024-02-21T15:51:01Z) - Towards Improving Robustness Against Common Corruptions using Mixture of Class Specific Experts [10.27974860479791]
This paper introduces a novel paradigm known as the Mixture of Class-Specific Expert Architecture.
The proposed architecture aims to mitigate vulnerabilities associated with common neural network structures.
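A minimal sketch of the general idea, with one small expert scoring each class and an argmax over expert scores; the paper's concrete architecture and training details differ:
```python
import numpy as np

class ClassExpert:
    """One small scorer dedicated to a single class."""
    def __init__(self, dim, rng):
        self.w = rng.normal(scale=0.01, size=dim)

    def score(self, x):
        return float(x @ self.w)   # confidence that x belongs to this class

class MixtureOfClassExperts:
    def __init__(self, dim, n_classes, seed=0):
        rng = np.random.default_rng(seed)
        self.experts = [ClassExpert(dim, rng) for _ in range(n_classes)]

    def predict(self, x):
        # Each class is judged by its own expert; a corruption that confuses
        # one expert need not confuse the others.
        return int(np.argmax([e.score(x) for e in self.experts]))
```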
arXiv Detail & Related papers (2023-11-16T20:09:47Z) - A Survey on Transferability of Adversarial Examples across Deep Neural Networks [53.04734042366312]
Adversarial examples can manipulate machine learning models into making erroneous predictions.
The transferability of adversarial examples enables black-box attacks which circumvent the need for detailed knowledge of the target model.
This survey explores the landscape of the transferability of adversarial examples.
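A self-contained sketch of the transfer setting: craft an adversarial example with FGSM on a white-box surrogate, then check whether it also fools a separate black-box target. Logistic-regression models keep the gradient exact; all numbers are illustrative:
```python
import numpy as np

def fgsm(x, y, w, b, eps=0.3):
    # For logistic regression, d(loss)/dx = (sigmoid(w.x + b) - y) * w.
    p = 1.0 / (1.0 + np.exp(-(x @ w + b)))
    return x + eps * np.sign((p - y) * w)

rng = np.random.default_rng(0)
w_sur = rng.normal(size=5)                  # surrogate: attacker sees this
w_tgt = w_sur + 0.3 * rng.normal(size=5)    # target: black box, but related

x, y = rng.normal(size=5), 1.0
x_adv = fgsm(x, y, w_sur, b=0.0)
transfers = ((x_adv @ w_tgt) > 0) != ((x @ w_tgt) > 0)
print("adversarial example transfers to the target:", bool(transfers))
```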
arXiv Detail & Related papers (2023-10-26T17:45:26Z) - A Neuro-mimetic Realization of the Common Model of Cognition via Hebbian Learning and Free Energy Minimization [55.11642177631929]
Large neural generative models are capable of synthesizing semantically rich passages of text or producing complex images.
We discuss the COGnitive Neural GENerative system, an architecture that casts the Common Model of Cognition in terms of Hebbian learning and free energy minimization.
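The Hebbian component can be sketched in a few lines: weights change in proportion to correlated pre- and post-synaptic activity, with no backpropagated error signal. The decay term and constants below are illustrative:
```python
import numpy as np

def hebbian_step(W, pre, post, lr=0.01, decay=0.001):
    # Delta W = lr * post pre^T, plus a small decay term for stability.
    return W + lr * np.outer(post, pre) - decay * W

rng = np.random.default_rng(0)
W = rng.normal(scale=0.1, size=(3, 5))
pre = rng.normal(size=5)
post = np.tanh(W @ pre)        # purely local activity, no global error
W = hebbian_step(W, pre, post)
```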
arXiv Detail & Related papers (2023-10-14T23:28:48Z) - Learning Expressive Priors for Generalization and Uncertainty Estimation in Neural Networks [77.89179552509887]
We propose a novel prior learning method for advancing generalization and uncertainty estimation in deep neural networks.
The key idea is to exploit scalable and structured posteriors of neural networks as informative priors with generalization guarantees.
We exhaustively show the effectiveness of this method for uncertainty estimation and generalization.
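One common way to use a learned posterior as an informative prior is to penalize a new task's loss with the prior's negative log density; whether the paper uses exactly this diagonal-Gaussian form is an assumption:
```python
import numpy as np

def prior_regularized_loss(task_loss, w, prior_mean, prior_var):
    # Negative log of N(w | prior_mean, diag(prior_var)), up to a constant:
    # pulls the new weights toward the learned prior, most strongly where
    # the prior is confident (small variance).
    nll_prior = 0.5 * np.sum((w - prior_mean) ** 2 / prior_var)
    return task_loss + nll_prior
```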
arXiv Detail & Related papers (2023-07-15T09:24:33Z) - Abrupt and spontaneous strategy switches emerge in simple regularised neural networks [8.737068885923348]
We study whether insight-like behaviour can occur in simple artificial neural networks.
Analyses of network architectures and learning dynamics revealed that insight-like behaviour crucially depended on a regularised gating mechanism.
This suggests that insight-like behaviour can arise naturally from gradual learning in simple neural networks.
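A toy version of a regularised gate (not the paper's exact setup): an L1 penalty holds a multiplicative gate on the better-predicting cue near zero until the task gradient outweighs it, after which the multiplicative dynamics open the gate quickly, producing an abrupt switch:
```python
import numpy as np

rng = np.random.default_rng(0)
w_slow, w_fast, gate = 1.0, 0.1, 0.1   # small nonzero init so the
lr, lam = 0.02, 0.01                   # multiplicative path is not stuck at 0

for step in range(5000):
    x_slow, x_fast = rng.normal(), rng.normal()
    y = x_fast + 0.3 * x_slow          # the gated cue is the better predictor
    err = w_slow * x_slow + gate * w_fast * x_fast - y
    # plain SGD on squared error, with an L1 penalty on the gate
    w_slow -= lr * err * x_slow
    w_fast -= lr * err * gate * x_fast
    gate   -= lr * (err * w_fast * x_fast + lam * np.sign(gate))

print("gate after training:", gate)    # roughly 1/w_fast once it has opened
```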
arXiv Detail & Related papers (2023-02-22T12:48:45Z) - Technical Challenges for Training Fair Neural Networks [62.466658247995404]
We conduct experiments on both facial recognition and automated medical diagnosis datasets using state-of-the-art architectures.
We observe that large models overfit to fairness objectives, and produce a range of unintended and undesirable consequences.
arXiv Detail & Related papers (2021-02-12T20:36:45Z) - Developing Constrained Neural Units Over Time [81.19349325749037]
This paper focuses on an alternative way of defining neural networks that differs from the majority of existing approaches.
The structure of the neural architecture is defined by means of a special class of constraints that are extended also to the interaction with data.
The proposed theory is cast into the time domain, in which data are presented to the network in an ordered manner.
arXiv Detail & Related papers (2020-09-01T09:07:25Z) - On the Reliability and Generalizability of Brain-inspired Reinforcement Learning Algorithms [10.09712608508383]
We show that the computational model combining model-based and model-free control, which we term the prefrontal RL, reliably encodes the information of high-level policy that humans learned.
This is the first attempt to formally test the possibility that computational models mimicking the way the brain solves general problems can lead to practical solutions.
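A compact sketch of the arbitration idea behind the prefrontal RL model: blend model-based (MB) and model-free (MF) value estimates according to the MB system's tracked reliability. The mixing and reliability-update rules below are illustrative simplifications:
```python
def mixed_q(q_mb: float, q_mf: float, reliability_mb: float) -> float:
    # Reliability-weighted blend of the two controllers' value estimates.
    return reliability_mb * q_mb + (1.0 - reliability_mb) * q_mf

def update_reliability(reliability: float, prediction_error: float,
                       rate: float = 0.1) -> float:
    # Reliability drifts up when the model-based system predicts outcomes
    # well (low error) and down when it does not.
    target = 1.0 - min(abs(prediction_error), 1.0)
    return reliability + rate * (target - reliability)
```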
arXiv Detail & Related papers (2020-07-09T06:32:42Z) - Revisit Systematic Generalization via Meaningful Learning [15.90288956294373]
Recent studies argue that neural networks appear inherently ineffective at such systematic generalization.
We reassess the compositional skills of sequence-to-sequence models conditioned on the semantic links between new and old concepts.
arXiv Detail & Related papers (2020-03-14T15:27:29Z) - Neuro-evolutionary Frameworks for Generalized Learning Agents [1.2691047660244335]
Recent successes of deep learning and deep reinforcement learning have firmly established their status as state-of-the-art artificial learning techniques.
Longstanding drawbacks of these approaches point to a need for re-thinking the way such systems are designed and deployed.
We discuss the anticipated improvements from such neuro-evolutionary frameworks, along with the associated challenges.
arXiv Detail & Related papers (2020-02-04T02:11:56Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the content (including all information) and is not responsible for any consequences arising from its use.