Related papers: Is a Modular Architecture Enough?

Is a Modular Architecture Enough?

URL: http://arxiv.org/abs/2206.02713v1
Date: Mon, 6 Jun 2022 16:12:06 GMT
Title: Is a Modular Architecture Enough?
Authors: Sarthak Mittal, Yoshua Bengio and Guillaume Lajoie
Abstract summary: We provide a thorough assessment of common modular architectures, through the lens of simple and known modular data distributions. We highlight the benefits of modularity and sparsity and reveal insights on the challenges faced while optimizing modular systems.
Score: 80.32451720642209
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Inspired from human cognition, machine learning systems are gradually revealing advantages of sparser and more modular architectures. Recent work demonstrates that not only do some modular architectures generalize well, but they also lead to better out-of-distribution generalization, scaling properties, learning speed, and interpretability. A key intuition behind the success of such systems is that the data generating system for most real-world settings is considered to consist of sparsely interacting parts, and endowing models with similar inductive biases will be helpful. However, the field has been lacking in a rigorous quantitative assessment of such systems because these real-world data distributions are complex and unknown. In this work, we provide a thorough assessment of common modular architectures, through the lens of simple and known modular data distributions. We highlight the benefits of modularity and sparsity and reveal insights on the challenges faced while optimizing modular systems. In doing so, we propose evaluation metrics that highlight the benefits of modularity, the regimes in which these benefits are substantial, as well as the sub-optimality of current end-to-end learned modular systems as opposed to their claimed potential.

Related papers

The Generalist Brain Module: Module Repetition in Neural Networks in Light of the Minicolumn Hypothesis [0.0]
Review aims to synthesizing historical, theoretical, and methodological perspectives on neural module repetition.<n>We believe that a system that adopts the benefits of CI, while adhering to architectural and functional principles of the minicolumns, could challenge the modern AI problems of scalability, energy consumption, and democratization.
arXiv Detail & Related papers (2025-07-01T09:13:10Z)
Adaptive Orchestration of Modular Generative Information Access Systems [59.102816309859584]
We argue that the architecture of future modular generative information access systems will not just assemble powerful components, but enable a self-organizing system. This perspective urges the IR community to rethink modular system designs for developing adaptive, self-optimizing, and future-ready architectures.
arXiv Detail & Related papers (2025-04-24T11:35:43Z)
On The Specialization of Neural Modules [16.83151955540625]
We study the ability of network modules to specialize to useful structures in a dataset and achieve systematic generalization. Our results shed light on the difficulty of module specialization, what is required for modules to successfully specialize, and the necessity of modular architectures to achieve systematicity.
arXiv Detail & Related papers (2024-09-23T12:58:11Z)
Discovering modular solutions that generalize compositionally [55.46688816816882]
We show that identification up to linear transformation purely from demonstrations is possible without having to learn an exponential number of module combinations. We further demonstrate empirically that meta-learning from finite data can discover modular policies that generalize compositionally in a number of complex environments.
arXiv Detail & Related papers (2023-12-22T16:33:50Z)
Modularity in Deep Learning: A Survey [0.0]
We review the notion of modularity in deep learning around three axes: data, task, and model. Data modularity refers to the observation or creation of data groups for various purposes. Task modularity refers to the decomposition of tasks into sub-tasks. Model modularity means that the architecture of a neural network system can be decomposed into identifiable modules.
arXiv Detail & Related papers (2023-10-02T12:41:34Z)
A Deep Unrolling Model with Hybrid Optimization Structure for Hyperspectral Image Deconvolution [50.13564338607482]
We propose a novel optimization framework for the hyperspectral deconvolution problem, called DeepMix.<n>It consists of three distinct modules, namely, a data consistency module, a module that enforces the effect of the handcrafted regularizers, and a denoising module.<n>This work proposes a context aware denoising module designed to sustain the advancements achieved by the cooperative efforts of the other modules.
arXiv Detail & Related papers (2023-06-10T08:25:16Z)
ModuleFormer: Modularity Emerges from Mixture-of-Experts [60.6148988099284]
This paper proposes a new neural network architecture, ModuleFormer, to improve the efficiency and flexibility of large language models. Unlike the previous SMoE-based modular language model, ModuleFormer can induce modularity from uncurated data.
arXiv Detail & Related papers (2023-06-07T17:59:57Z)
Modular Deep Learning [120.36599591042908]
Transfer learning has recently become the dominant paradigm of machine learning. It remains unclear how to develop models that specialise towards multiple tasks without incurring negative interference. Modular deep learning has emerged as a promising solution to these challenges.
arXiv Detail & Related papers (2023-02-22T18:11:25Z)
Learning Modular Simulations for Homogeneous Systems [23.355189771765644]
We present a modular simulation framework for modeling homogeneous multibody dynamical systems. An arbitrary number of modules can be combined to simulate systems of a variety of coupling topologies. We show that our models can be transferred to new system configurations lower with data requirement and training effort, compared to those trained from scratch.
arXiv Detail & Related papers (2022-10-28T17:48:01Z)
Neural Attentive Circuits [93.95502541529115]
We introduce a general purpose, yet modular neural architecture called Neural Attentive Circuits (NACs) NACs learn the parameterization and a sparse connectivity of neural modules without using domain knowledge. NACs achieve an 8x speedup at inference time while losing less than 3% performance.
arXiv Detail & Related papers (2022-10-14T18:00:07Z)
S2RMs: Spatially Structured Recurrent Modules [105.0377129434636]
We take a step towards exploiting dynamic structure that are capable of simultaneously exploiting both modular andtemporal structures. We find our models to be robust to the number of available views and better capable of generalization to novel tasks without additional training.
arXiv Detail & Related papers (2020-07-13T17:44:30Z)

This list is automatically generated from the titles and abstracts of the papers in this site.