A Probabilistic Framework for Modular Continual Learning
- URL: http://arxiv.org/abs/2306.06545v2
- Date: Thu, 2 May 2024 12:03:53 GMT
- Title: A Probabilistic Framework for Modular Continual Learning
- Authors: Lazar Valkov, Akash Srivastava, Swarat Chaudhuri, Charles Sutton
- Abstract summary: We develop a modular continual learning (CL) framework, PICLE, to search through the large, discrete space of module compositions.
We show PICLE is the first modular CL algorithm to achieve perceptual, few-shot, and latent transfer while scaling well to large search spaces.
- Score: 27.398496741452554
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Modular approaches that use a different composition of modules for each problem are a promising direction in continual learning (CL). However, searching through the large, discrete space of module compositions is challenging, especially because evaluating a composition's performance requires a round of neural network training. We address this challenge through a modular CL framework, PICLE, that uses a probabilistic model to cheaply compute the fitness of each composition, allowing PICLE to achieve perceptual, few-shot, and latent transfer. The model combines prior knowledge about good module compositions with dataset-specific information. We evaluate PICLE using two benchmark suites designed to assess different desiderata of CL techniques. Comparing it to a wide range of approaches, we show that PICLE is the first modular CL algorithm to achieve perceptual, few-shot, and latent transfer while scaling well to large search spaces, outperforming previous state-of-the-art modular CL approaches on long problem sequences.
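To make the search idea concrete, below is a minimal sketch of scoring module compositions with a prior-times-likelihood model and picking the highest-scoring one without training each candidate. The module library, reuse counts, and per-module input signatures are invented for illustration; this is not PICLE's actual probabilistic model.

```python
import math
from itertools import product

# Hypothetical module library: candidate modules per layer position (invented).
LIBRARY = [["m0a", "m0b"], ["m1a", "m1b"], ["m2a", "m2b"]]

# Illustrative prior knowledge: how often each module appeared in past solutions.
USE_COUNTS = {"m0a": 5, "m0b": 1, "m1a": 4, "m1b": 2, "m2a": 3, "m2b": 3}

# Illustrative per-module "input signature": a 1-D summary of the data each
# module was trained on, used to cheaply score fit to a new dataset.
SIGNATURE = {"m0a": 0.1, "m0b": 0.9, "m1a": 0.2, "m1b": 0.8, "m2a": 0.5, "m2b": 0.4}

def log_prior(path):
    # Favour modules that were reused often before (Laplace-smoothed counts).
    total = sum(USE_COUNTS.values())
    return sum(math.log((USE_COUNTS[m] + 1) / (total + len(USE_COUNTS))) for m in path)

def log_likelihood(path, dataset_summary):
    # Toy dataset-specific term: Gaussian fit between each module's signature
    # and a summary of the new dataset.
    return sum(-0.5 * (SIGNATURE[m] - dataset_summary) ** 2 for m in path)

def best_composition(dataset_summary):
    candidates = product(*LIBRARY)
    return max(candidates, key=lambda p: log_prior(p) + log_likelihood(p, dataset_summary))

print(best_composition(0.15))  # composition whose modules best match the new data
```

Because every candidate is scored in closed form, the search cost is decoupled from neural network training, which is the bottleneck the abstract highlights.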
Related papers
- GroupMamba: Parameter-Efficient and Accurate Group Visual State Space Model [66.35608254724566]
State-space models (SSMs) have showcased effective performance in modeling long-range dependencies with subquadratic complexity.
However, pure SSM-based models still face challenges with stability and with achieving optimal performance on computer vision tasks.
Our paper addresses the challenges of scaling SSM-based models for computer vision, particularly the instability and inefficiency of large model sizes.
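As a reminder of why SSMs scale subquadratically, here is a toy diagonal state-space scan that processes a sequence in a single linear-time pass; the matrices are random stand-ins, not GroupMamba's parameterization.

```python
import numpy as np

# Minimal diagonal state-space recurrence: x_t = A x_{t-1} + B u_t, y_t = C x_t.
# The scan is linear in sequence length, unlike quadratic self-attention.
rng = np.random.default_rng(0)
d_state, seq_len = 4, 16
A = np.diag(rng.uniform(0.5, 0.99, d_state))  # stable diagonal transition
B = rng.normal(size=(d_state, 1))
C = rng.normal(size=(1, d_state))

u = rng.normal(size=(seq_len, 1))  # input sequence
x = np.zeros((d_state, 1))
ys = []
for t in range(seq_len):           # O(seq_len) scan
    x = A @ x + B * u[t, 0]
    ys.append((C @ x).item())
print(ys[:3])
```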
arXiv Detail & Related papers (2024-07-18T17:59:58Z) - SAPT: A Shared Attention Framework for Parameter-Efficient Continual Learning of Large Language Models [71.78800549517298]
Continual learning (CL) ability is vital for deploying large language models (LLMs) in the dynamic world.
Existing methods devise the learning module to acquire task-specific knowledge with parameter-efficient tuning (PET) block and the selection module to pick out the corresponding one for the testing input.
We propose a novel Shared Attention Framework (SAPT) to align the PET learning and selection via the Shared Attentive Learning & Selection module.
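A minimal sketch of the shared-attention idea, assuming invented adapter shapes and keys: the same attention weights over PET blocks drive both the soft mix used during learning and the hard pick used for a test input. This is an illustration, not SAPT's implementation.

```python
import numpy as np

rng = np.random.default_rng(1)
d, n_tasks = 8, 3

# Hypothetical per-task PET blocks: tiny adapter matrices (shapes invented).
adapters = [rng.normal(scale=0.1, size=(d, d)) for _ in range(n_tasks)]
keys = rng.normal(size=(n_tasks, d))  # one key per adapter, shared by both paths

def shared_attention(h):
    """Attention weights over adapter blocks computed from the input
    representation; learning and selection reuse the same weights."""
    scores = keys @ h / np.sqrt(d)
    w = np.exp(scores - scores.max())
    return w / w.sum()

h = rng.normal(size=d)                                          # input hidden state
w = shared_attention(h)
h_train = h + sum(wi * (A @ h) for wi, A in zip(w, adapters))   # soft mix while learning
h_test = h + adapters[int(np.argmax(w))] @ h                    # hard pick at test time
print(w.round(3), int(np.argmax(w)))
```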
arXiv Detail & Related papers (2024-01-16T11:45:03Z) - Modularity in Deep Learning: A Survey [0.0]
We review the notion of modularity in deep learning around three axes: data, task, and model.
Data modularity refers to the observation or creation of data groups for various purposes.
Task modularity refers to the decomposition of tasks into sub-tasks.
Model modularity means that the architecture of a neural network system can be decomposed into identifiable modules.
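A small illustration of model modularity, with invented module names and sizes: the network is assembled from identifiable, independently swappable parts.

```python
import numpy as np

rng = np.random.default_rng(2)
W_enc = rng.normal(size=(4, 8))
W_cls = rng.normal(size=(8, 3))
W_reg = rng.normal(size=(8, 1))

# A network as a registry of identifiable modules (names are illustrative).
modules = {
    "encoder": lambda x: np.tanh(x @ W_enc),
    "head_classify": lambda z: z @ W_cls,
    "head_regress": lambda z: z @ W_reg,
}

def compose(*names):
    def net(x):
        for name in names:
            x = modules[name](x)
        return x
    return net

x = rng.normal(size=(2, 4))
print(compose("encoder", "head_classify")(x).shape)  # (2, 3)
print(compose("encoder", "head_regress")(x).shape)   # (2, 1)
```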
arXiv Detail & Related papers (2023-10-02T12:41:34Z) - Modular Deep Learning [120.36599591042908]
Transfer learning has recently become the dominant paradigm of machine learning.
It remains unclear how to develop models that specialise towards multiple tasks without incurring negative interference.
Modular deep learning has emerged as a promising solution to these challenges.
arXiv Detail & Related papers (2023-02-22T18:11:25Z) - Neural Attentive Circuits [93.95502541529115]
We introduce a general-purpose, yet modular neural architecture called Neural Attentive Circuits (NACs).
NACs learn the parameterization and a sparse connectivity of neural modules without using domain knowledge.
NACs achieve an 8x speedup at inference time while losing less than 3% performance.
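A toy sketch of learned sparse connectivity between modules, with invented scores and shapes: edge scores are thresholded to keep only the strongest links, and each module aggregates messages only from its connected neighbours. It is not NACs' actual mechanism.

```python
import numpy as np

rng = np.random.default_rng(3)
n_modules, d = 4, 6

# Stand-in for learned connectivity: a score per edge, thresholded so that
# only the strongest 25% of links survive (scores invented here).
edge_scores = rng.normal(size=(n_modules, n_modules))
connectivity = edge_scores > np.quantile(edge_scores, 0.75)

weights = rng.normal(scale=0.3, size=(n_modules, d, d))
states = rng.normal(size=(n_modules, d))  # one state vector per module

# One message-passing step: module i aggregates only from connected modules j.
new_states = np.stack([
    np.tanh(sum(weights[j] @ states[j] for j in range(n_modules) if connectivity[i, j])
            + states[i])
    for i in range(n_modules)
])
print(connectivity.astype(int))
```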
arXiv Detail & Related papers (2022-10-14T18:00:07Z) - A Unifying Multi-sampling-ratio CS-MRI Framework With Two-grid-cycle Correction and Geometric Prior Distillation [7.643154460109723]
We propose a unifying deep unfolding multi-sampling-ratio CS-MRI framework that merges the advantages of model-based and deep learning-based methods.
Inspired by the multigrid algorithm, we first embed the CS-MRI optimization algorithm into a correction-distillation scheme.
We employ a condition module that adaptively learns the step length and noise level from the compressive sampling ratio at every stage.
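A minimal sketch of one idea from the summary: an unrolled ISTA-style reconstruction in which a toy condition module maps the sampling ratio and stage index to a step length and a noise (threshold) level. All shapes, schedules, and constants are invented for illustration.

```python
import numpy as np

# Toy "condition module": derives a step length and a noise level from the
# compressive sampling ratio and the stage index (schedule invented).
def condition_module(sampling_ratio, stage):
    step = 0.3 * sampling_ratio / (1 + 0.1 * stage)  # smaller steps at low ratios
    noise = 0.05 * (1 - sampling_ratio)              # more denoising when undersampled
    return step, noise

def soft_threshold(x, tau):
    return np.sign(x) * np.maximum(np.abs(x) - tau, 0.0)

rng = np.random.default_rng(4)
n, m = 32, 12                                  # signal length, measurements
A = rng.normal(size=(m, n)) / np.sqrt(m)
x_true = soft_threshold(rng.normal(size=n), 1.0)  # sparse ground truth
y = A @ x_true

x = np.zeros(n)
for stage in range(20):                        # unrolled ISTA-style stages
    step, noise = condition_module(m / n, stage)
    x = soft_threshold(x + step * A.T @ (y - A @ x), noise)
print(np.linalg.norm(x - x_true) / np.linalg.norm(x_true))  # relative error
```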
arXiv Detail & Related papers (2022-05-14T13:36:27Z) - Continual Learning via Local Module Composition [11.380264053565082]
Local module composition (LMC) is an approach to modular continual learning.
LMC provides each module a local structural component that estimates a module's relevance to the input.
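A minimal sketch of local module composition under assumed shapes: each module carries a local relevance scorer (here, distance to a prototype), and the layer mixes module outputs by these locally computed weights, with no global controller. This is an illustration, not LMC's actual structural component.

```python
import numpy as np

rng = np.random.default_rng(5)
d, n_modules = 6, 3

# Each module pairs a function with a local "structural" scorer estimating the
# module's relevance to its own input (both invented for this sketch).
W = [rng.normal(scale=0.4, size=(d, d)) for _ in range(n_modules)]
prototypes = [rng.normal(size=d) for _ in range(n_modules)]

def relevance(i, h):
    # Local score: closeness of the input to this module's prototype.
    return np.exp(-np.linalg.norm(h - prototypes[i]) ** 2 / d)

def lmc_layer(h):
    scores = np.array([relevance(i, h) for i in range(n_modules)])
    weights = scores / scores.sum()  # normalized locally, no global controller
    return sum(w * np.tanh(Wi @ h) for w, Wi in zip(weights, W))

h = rng.normal(size=d)
print(lmc_layer(h).shape)
```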
arXiv Detail & Related papers (2021-11-15T13:34:15Z) - Evaluating Modules in Graph Contrastive Learning [29.03038320344791]
We propose a framework that decomposes graph contrastive learning models into four modules.
We conduct experiments on node and graph classification tasks.
We release our implementations and results as OpenGCL, a modularized toolkit.
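A hedged sketch of such a four-module decomposition; the module names used here (sampler, encoder, discriminator, estimator) and all details are assumptions for illustration, not necessarily the paper's decomposition.

```python
import numpy as np

rng = np.random.default_rng(6)

def sampler(x):                      # produce two stochastic views (toy masking)
    return x * (rng.random(x.shape) > 0.2), x * (rng.random(x.shape) > 0.2)

def encoder(x, W):                   # shared encoder
    return np.tanh(x @ W)

def discriminator(z1, z2):           # similarity of paired views
    return (z1 * z2).sum(axis=1)

def estimator(pos, neg):             # InfoNCE-style objective estimate
    return -np.mean(pos - np.log(np.exp(pos) + np.exp(neg)))

W = rng.normal(scale=0.3, size=(8, 4))
x = rng.normal(size=(16, 8))
v1, v2 = sampler(x)
z1, z2 = encoder(v1, W), encoder(v2, W)
pos = discriminator(z1, z2)
neg = discriminator(z1, np.roll(z2, 1, axis=0))  # shifted pairs as negatives
print(estimator(pos, neg))
```

Swapping any one of the four functions while keeping the others fixed is the kind of controlled module-level comparison such a toolkit enables.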
arXiv Detail & Related papers (2021-06-15T14:14:23Z) - Deep Keypoint-Based Camera Pose Estimation with Geometric Constraints [80.60538408386016]
Estimating relative camera poses from consecutive frames is a fundamental problem in visual odometry.
We propose an end-to-end trainable framework consisting of learnable modules for detection, feature extraction, matching and outlier rejection.
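A sketch of the four-stage pipeline as composable functions, with every stage replaced by an invented stand-in (random keypoints and descriptors, mutual nearest-neighbour matching, a similarity-threshold outlier filter); the real framework learns each module end to end.

```python
import numpy as np

rng = np.random.default_rng(7)

def detect(img):                      # stand-in keypoint detector
    return rng.uniform(0, 1, size=(50, 2))

def describe(img, kps):               # stand-in per-keypoint descriptors
    return rng.normal(size=(len(kps), 16))

def match(d1, d2):                    # mutual nearest-neighbour matching
    sim = d1 @ d2.T
    nn12, nn21 = sim.argmax(1), sim.argmax(0)
    return [(i, j) for i, j in enumerate(nn12) if nn21[j] == i]

def reject_outliers(matches, d1, d2, thresh=0.0):
    return [(i, j) for i, j in matches if d1[i] @ d2[j] > thresh]

img1 = img2 = None                    # images omitted in this sketch
k1, k2 = detect(img1), detect(img2)
f1, f2 = describe(img1, k1), describe(img2, k2)
inliers = reject_outliers(match(f1, f2), f1, f2)
print(len(inliers))                   # matches that would feed pose estimation
```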
arXiv Detail & Related papers (2020-07-29T21:41:31Z) - One-Shot Object Detection without Fine-Tuning [62.39210447209698]
We introduce a two-stage model consisting of a first stage Matching-FCOS network and a second stage Structure-Aware Relation Module.
We also propose novel training strategies that effectively improve detection performance.
Our method exceeds the state-of-the-art one-shot performance consistently on multiple datasets.
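A toy two-stage cascade in the same spirit, with invented features and scores: stage one matches candidate boxes against the single support embedding, and stage two re-ranks the survivors with a relation score. It is a stand-in, not the paper's networks.

```python
import numpy as np

rng = np.random.default_rng(8)

support = rng.normal(size=16)                       # embedding of the one shot
boxes = rng.normal(size=(100, 16))                  # candidate box features

# Stage one: matching scores against the support, keep the top proposals.
stage1 = boxes @ support / np.linalg.norm(support)
keep = np.argsort(stage1)[-10:]

def relation(a, b):                                 # toy relation module
    return -np.linalg.norm(a - b)

# Stage two: re-rank survivors by relation to the support.
stage2 = np.array([relation(boxes[i], support) for i in keep])
best = keep[stage2.argmax()]
print(best, stage1[best])
```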
arXiv Detail & Related papers (2020-05-08T01:59:23Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information it presents and is not responsible for any consequences arising from its use.