Beneficial Perturbation Network for designing general adaptive
artificial intelligence systems
- URL: http://arxiv.org/abs/2009.13954v2
- Date: Tue, 2 Feb 2021 02:12:37 GMT
- Title: Beneficial Perturbation Network for designing general adaptive
artificial intelligence systems
- Authors: Shixian Wen, Amanda Rios, Yunhao Ge, Laurent Itti
- Abstract summary: We propose a new type of deep neural network with extra, out-of-network, task-dependent biasing units to accommodate dynamic situations.
Our approach is memory-efficient and parameter-efficient, can accommodate many tasks, and achieves state-of-the-art performance across different tasks and domains.
- Score: 14.226973149346886
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The human brain is the gold standard of adaptive learning. It not only can
learn and benefit from experience, but also can adapt to new situations. In
contrast, deep neural networks only learn one sophisticated but fixed mapping
from inputs to outputs. This limits their applicability to more dynamic
situations, where the input-to-output mapping may change with context. A
salient example is continual learning: learning new, independent tasks
sequentially without forgetting previous tasks. Continual learning of multiple
tasks in artificial neural networks using gradient descent leads to
catastrophic forgetting, whereby a previously learned mapping of an old task is
erased when learning new mappings for new tasks. Here, we propose a new
biologically plausible type of deep neural network with extra, out-of-network,
task-dependent biasing units to accommodate these dynamic situations. This
allows, for the first time, a single network to learn potentially unlimited
parallel input-to-output mappings, and to switch on the fly between them at
runtime. Biasing units are programmed by leveraging beneficial perturbations
(the opposite of well-known adversarial perturbations) for each task. Beneficial
perturbations for a given task bias the network toward that task, essentially
switching the network into a different mode to process that task. This largely
eliminates catastrophic interference between tasks. Our approach is
memory-efficient and parameter-efficient, can accommodate many tasks, and
achieves state-of-the-art performance across different tasks and domains.
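To make the mechanism concrete, the sketch below (not the authors' released code) illustrates the biasing-unit idea in PyTorch: a shared backbone holds one fixed set of weights, while a small out-of-network bias vector per task is added to a hidden layer and selected at runtime by a task id. The layer sizes, the single biased layer, and the toy forward passes are illustrative assumptions.
```python
# Minimal, hedged sketch of task-dependent "biasing units" added to a shared
# backbone; not the paper's implementation.
import torch
import torch.nn as nn

class BiasedMLP(nn.Module):
    def __init__(self, in_dim=784, hidden=256, out_dim=10, num_tasks=5):
        super().__init__()
        self.fc1 = nn.Linear(in_dim, hidden)
        self.fc2 = nn.Linear(hidden, out_dim)
        # One bias vector per task, stored outside the shared weights.
        self.task_bias = nn.ParameterList(
            [nn.Parameter(torch.zeros(hidden)) for _ in range(num_tasks)]
        )

    def forward(self, x, task_id):
        # Adding the bias for task_id switches the network into that task's mode.
        h = torch.relu(self.fc1(x) + self.task_bias[task_id])
        return self.fc2(h)

model = BiasedMLP()
x = torch.randn(8, 784)
logits_task0 = model(x, task_id=0)   # same shared weights, task-0 mode
logits_task1 = model(x, task_id=1)   # switch on the fly to task-1 mode
```
In this sketch, switching tasks amounts to swapping which bias vector is added, rather than retraining the shared weights.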
Related papers
- Negotiated Representations to Prevent Forgetting in Machine Learning Applications [0.0]
Catastrophic forgetting is a significant challenge in the field of machine learning.
We propose a novel method for preventing catastrophic forgetting in machine learning applications.
arXiv Detail & Related papers (2023-11-30T22:43:50Z)
- Adaptive Reorganization of Neural Pathways for Continual Learning with Spiking Neural Networks [9.889775504641925]
We propose a brain-inspired continual learning algorithm with adaptive reorganization of neural pathways.
The proposed model demonstrates consistent superiority in performance, energy consumption, and memory capacity on diverse continual learning tasks.
arXiv Detail & Related papers (2023-09-18T07:56:40Z)
- Provable Multi-Task Representation Learning by Two-Layer ReLU Neural Networks [69.38572074372392]
We present the first results proving that feature learning occurs during training with a nonlinear model on multiple tasks.
Our key insight is that multi-task pretraining induces a pseudo-contrastive loss that favors representations that align points that typically have the same label across tasks.
arXiv Detail & Related papers (2023-07-13T16:39:08Z)
- Multi-Task Neural Processes [105.22406384964144]
We develop multi-task neural processes, a new variant of neural processes for multi-task learning.
In particular, we propose to explore transferable knowledge from related tasks in the function space to provide inductive bias for improving each individual task.
Results demonstrate the effectiveness of multi-task neural processes in transferring useful knowledge among tasks for multi-task learning.
arXiv Detail & Related papers (2021-11-10T17:27:46Z)
- Thinking Deeply with Recurrence: Generalizing from Easy to Hard Sequential Reasoning Problems [51.132938969015825]
We observe that recurrent networks have the uncanny ability to closely emulate the behavior of non-recurrent deep models.
We show that recurrent networks that are trained to solve simple mazes with few recurrent steps can indeed solve much more complex problems simply by performing additional recurrences during inference.
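As an illustration of that mechanism (a hedged sketch, not the paper's architecture), a weight-tied recurrent block can simply be applied for more iterations at inference than during training; the module sizes and step counts below are assumptions.
```python
# Hedged sketch of "thinking deeper" by running extra recurrences at inference.
import torch
import torch.nn as nn

class RecurrentSolver(nn.Module):
    def __init__(self, dim=64):
        super().__init__()
        self.encode = nn.Linear(dim, dim)
        self.step = nn.Linear(dim, dim)    # weight-tied block reused every iteration
        self.decode = nn.Linear(dim, dim)

    def forward(self, x, num_steps):
        h = torch.relu(self.encode(x))
        for _ in range(num_steps):         # more iterations = "thinking longer"
            h = torch.relu(self.step(h))
        return self.decode(h)

model = RecurrentSolver()
x = torch.randn(8, 64)
out_easy = model(x, num_steps=5)    # training regime: few recurrent steps
out_hard = model(x, num_steps=50)   # harder instances: extra recurrences at inference
```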
arXiv Detail & Related papers (2021-02-22T14:09:20Z)
- Routing Networks with Co-training for Continual Learning [5.957609459173546]
We propose the use of sparse routing networks for continual learning.
For each input, these network architectures activate a different path through a network of experts.
In practice, we find it is necessary to develop a new training method for routing networks, which we call co-training.
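A minimal sketch of the per-input routing idea follows; the co-training procedure itself is not reproduced, and the router, expert shapes, and hard top-1 routing rule are assumptions made for illustration.
```python
# Hedged sketch of a routing network: a small router picks one expert path per input.
import torch
import torch.nn as nn

class RoutedNet(nn.Module):
    def __init__(self, dim=32, num_experts=4, out_dim=10):
        super().__init__()
        self.router = nn.Linear(dim, num_experts)
        self.experts = nn.ModuleList([nn.Linear(dim, out_dim) for _ in range(num_experts)])

    def forward(self, x):
        # Hard top-1 routing: each input activates a different path through the experts.
        choice = self.router(x).argmax(dim=-1)
        out = torch.stack([self.experts[int(i)](xi) for xi, i in zip(x, choice)])
        return out, choice

net = RoutedNet()
logits, used_experts = net(torch.randn(6, 32))
```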
arXiv Detail & Related papers (2020-09-09T15:58:51Z)
- Reparameterizing Convolutions for Incremental Multi-Task Learning without Task Interference [75.95287293847697]
Two common challenges in developing multi-task models are often overlooked in the literature.
First, enabling the model to be inherently incremental, continuously incorporating information from new tasks without forgetting previously learned ones (incremental learning).
Second, eliminating adverse interactions among tasks, which has been shown to significantly degrade single-task performance in a multi-task setup (task interference).
arXiv Detail & Related papers (2020-07-24T14:44:46Z)
- Auxiliary Learning by Implicit Differentiation [54.92146615836611]
Training neural networks with auxiliary tasks is a common practice for improving the performance on a main task of interest.
Here, we propose a novel framework, AuxiLearn, that targets both challenges based on implicit differentiation.
First, when useful auxiliaries are known, we propose learning a network that combines all losses into a single coherent objective function.
Second, when no useful auxiliary task is known, we describe how to learn a network that generates a meaningful, novel auxiliary task.
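As a rough sketch of the first case (combining losses), a tiny combiner network can map the vector of per-task losses to a single scalar objective; the bilevel training of this combiner via implicit differentiation, which is the paper's actual contribution, is not shown, and all shapes and names below are assumptions.
```python
# Hedged sketch: a small network combines main and auxiliary losses into one objective.
import torch
import torch.nn as nn

class LossCombiner(nn.Module):
    def __init__(self, num_losses=3):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(num_losses, 8), nn.ReLU(), nn.Linear(8, 1))

    def forward(self, losses):                 # losses: 1-D tensor of per-task losses
        return self.net(losses).squeeze(-1)    # single scalar training objective

combiner = LossCombiner()
main_loss, aux_loss_1, aux_loss_2 = torch.rand(3)   # stand-ins for real task losses
total = combiner(torch.stack([main_loss, aux_loss_1, aux_loss_2]))
total.backward()   # in a full setup, this gradient would reach the main network's weights
```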
arXiv Detail & Related papers (2020-06-22T19:35:07Z)
- Learning to Branch for Multi-Task Learning [12.49373126819798]
We present an automated multi-task learning algorithm that learns where to share or branch within a network.
We propose a novel tree-structured design space that casts a tree branching operation as a gumbel-softmax sampling procedure.
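A hedged sketch of casting a branching decision as Gumbel-Softmax sampling is shown below; the two-branch layer, task-specific logits, and sizes are illustrative assumptions rather than the paper's actual search space.
```python
# Hedged sketch: each task differentiably picks which child branch to route through.
import torch
import torch.nn as nn
import torch.nn.functional as F

class BranchingLayer(nn.Module):
    def __init__(self, dim=32, num_branches=2, num_tasks=3):
        super().__init__()
        self.branches = nn.ModuleList([nn.Linear(dim, dim) for _ in range(num_branches)])
        # Learnable logits: for each task, a distribution over child branches.
        self.branch_logits = nn.Parameter(torch.zeros(num_tasks, num_branches))

    def forward(self, x, task_id, tau=1.0):
        # Differentiable (straight-through) one-hot choice of branch for this task.
        gate = F.gumbel_softmax(self.branch_logits[task_id], tau=tau, hard=True)
        outs = torch.stack([branch(x) for branch in self.branches])   # (branches, batch, dim)
        return torch.einsum("b,bnd->nd", gate, outs)

layer = BranchingLayer()
y = layer(torch.randn(4, 32), task_id=0)
```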
arXiv Detail & Related papers (2020-06-02T19:23:21Z)
- Side-Tuning: A Baseline for Network Adaptation via Additive Side Networks [95.51368472949308]
Adaptation can be useful in cases when training data is scarce, or when one wishes to encode priors in the network.
In this paper, we propose a straightforward alternative: side-tuning.
arXiv Detail & Related papers (2019-12-31T18:52:32Z)
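A minimal sketch of the side-tuning idea (a frozen base network plus a small trainable side network, fused additively) follows; the linear stand-in for the pre-trained backbone and the learnable blending weight are assumptions made for illustration.
```python
# Hedged sketch of side-tuning: frozen base + small trainable side network, blended.
import torch
import torch.nn as nn

class SideTuned(nn.Module):
    def __init__(self, base, dim=128):
        super().__init__()
        self.base = base
        for p in self.base.parameters():          # the pre-trained base stays fixed
            p.requires_grad_(False)
        self.side = nn.Sequential(nn.Linear(dim, dim), nn.ReLU(), nn.Linear(dim, dim))
        self.alpha = nn.Parameter(torch.tensor(0.0))   # learnable blending weight

    def forward(self, x):
        a = torch.sigmoid(self.alpha)
        # Additive fusion of the frozen base output and the small side network.
        return a * self.base(x) + (1 - a) * self.side(x)

base = nn.Linear(128, 128)                        # stand-in for a pre-trained backbone
model = SideTuned(base)
y = model(torch.randn(2, 128))
```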
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.