Is a Modular Architecture Enough?
- URL: http://arxiv.org/abs/2206.02713v1
- Date: Mon, 6 Jun 2022 16:12:06 GMT
- Title: Is a Modular Architecture Enough?
- Authors: Sarthak Mittal, Yoshua Bengio and Guillaume Lajoie
- Abstract summary: We provide a thorough assessment of common modular architectures, through the lens of simple and known modular data distributions.
We highlight the benefits of modularity and sparsity and reveal insights on the challenges faced while optimizing modular systems.
- Score: 80.32451720642209
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Inspired from human cognition, machine learning systems are gradually
revealing advantages of sparser and more modular architectures. Recent work
demonstrates that not only do some modular architectures generalize well, but
they also lead to better out-of-distribution generalization, scaling
properties, learning speed, and interpretability. A key intuition behind the
success of such systems is that the data generating system for most real-world
settings is considered to consist of sparsely interacting parts, and endowing
models with similar inductive biases will be helpful. However, the field has
been lacking in a rigorous quantitative assessment of such systems because
these real-world data distributions are complex and unknown. In this work, we
provide a thorough assessment of common modular architectures, through the lens
of simple and known modular data distributions. We highlight the benefits of
modularity and sparsity and reveal insights on the challenges faced while
optimizing modular systems. In doing so, we propose evaluation metrics that
highlight the benefits of modularity, the regimes in which these benefits are
substantial, as well as the sub-optimality of current end-to-end learned
modular systems as opposed to their claimed potential.
Related papers
- On The Specialization of Neural Modules [16.83151955540625]
We study the ability of network modules to specialize to useful structures in a dataset and achieve systematic generalization.
Our results shed light on the difficulty of module specialization, what is required for modules to successfully specialize, and the necessity of modular architectures to achieve systematicity.
arXiv Detail & Related papers (2024-09-23T12:58:11Z) - Discovering modular solutions that generalize compositionally [55.46688816816882]
We show that identification up to linear transformation purely from demonstrations is possible without having to learn an exponential number of module combinations.
We further demonstrate empirically that meta-learning from finite data can discover modular policies that generalize compositionally in a number of complex environments.
arXiv Detail & Related papers (2023-12-22T16:33:50Z) - Modularity in Deep Learning: A Survey [0.0]
We review the notion of modularity in deep learning around three axes: data, task, and model.
Data modularity refers to the observation or creation of data groups for various purposes.
Task modularity refers to the decomposition of tasks into sub-tasks.
Model modularity means that the architecture of a neural network system can be decomposed into identifiable modules.
arXiv Detail & Related papers (2023-10-02T12:41:34Z) - ModuleFormer: Modularity Emerges from Mixture-of-Experts [60.6148988099284]
This paper proposes a new neural network architecture, ModuleFormer, to improve the efficiency and flexibility of large language models.
Unlike the previous SMoE-based modular language model, ModuleFormer can induce modularity from uncurated data.
arXiv Detail & Related papers (2023-06-07T17:59:57Z) - Modular Deep Learning [120.36599591042908]
Transfer learning has recently become the dominant paradigm of machine learning.
It remains unclear how to develop models that specialise towards multiple tasks without incurring negative interference.
Modular deep learning has emerged as a promising solution to these challenges.
arXiv Detail & Related papers (2023-02-22T18:11:25Z) - Learning Modular Simulations for Homogeneous Systems [23.355189771765644]
We present a modular simulation framework for modeling homogeneous multibody dynamical systems.
An arbitrary number of modules can be combined to simulate systems of a variety of coupling topologies.
We show that our models can be transferred to new system configurations lower with data requirement and training effort, compared to those trained from scratch.
arXiv Detail & Related papers (2022-10-28T17:48:01Z) - Neural Attentive Circuits [93.95502541529115]
We introduce a general purpose, yet modular neural architecture called Neural Attentive Circuits (NACs)
NACs learn the parameterization and a sparse connectivity of neural modules without using domain knowledge.
NACs achieve an 8x speedup at inference time while losing less than 3% performance.
arXiv Detail & Related papers (2022-10-14T18:00:07Z) - S2RMs: Spatially Structured Recurrent Modules [105.0377129434636]
We take a step towards exploiting dynamic structure that are capable of simultaneously exploiting both modular andtemporal structures.
We find our models to be robust to the number of available views and better capable of generalization to novel tasks without additional training.
arXiv Detail & Related papers (2020-07-13T17:44:30Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.