A Probabilistic Framework for Modular Continual Learning
- URL: http://arxiv.org/abs/2306.06545v2
- Date: Thu, 2 May 2024 12:03:53 GMT
- Title: A Probabilistic Framework for Modular Continual Learning
- Authors: Lazar Valkov, Akash Srivastava, Swarat Chaudhuri, Charles Sutton
- Abstract summary: We develop a modular continual learning (CL) framework, PICLE, to search through the large, discrete space of module compositions.
We show PICLE is the first modular CL algorithm to achieve perceptual, few-shot, and latent transfer while scaling well to large search spaces.
- Score: 27.398496741452554
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Modular approaches that use a different composition of modules for each problem are a promising direction in continual learning (CL). However, searching through the large, discrete space of module compositions is challenging, especially because evaluating a composition's performance requires a round of neural network training. We address this challenge through a modular CL framework, PICLE, that uses a probabilistic model to cheaply compute the fitness of each composition, allowing PICLE to achieve perceptual, few-shot, and latent transfer. The model combines prior knowledge about good module compositions with dataset-specific information. We evaluate PICLE using two benchmark suites designed to assess different desiderata of CL techniques. Comparing it to a wide range of approaches, we show that PICLE is the first modular CL algorithm to achieve perceptual, few-shot, and latent transfer while scaling well to large search spaces, outperforming previous state-of-the-art modular CL approaches on long problem sequences.
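To make the search idea concrete, below is a minimal sketch of scoring module compositions with a prior-times-likelihood model and picking the highest-scoring one without training each candidate. The module library, reuse counts, and per-module input signatures are invented for illustration; this is not PICLE's actual probabilistic model.

```python
import math
from itertools import product

# Hypothetical module library: candidate modules per layer position (invented).
LIBRARY = [["m0a", "m0b"], ["m1a", "m1b"], ["m2a", "m2b"]]

# Illustrative prior knowledge: how often each module appeared in past solutions.
USE_COUNTS = {"m0a": 5, "m0b": 1, "m1a": 4, "m1b": 2, "m2a": 3, "m2b": 3}

# Illustrative per-module "input signature": a 1-D summary of the data each
# module was trained on, used to cheaply score fit to a new dataset.
SIGNATURE = {"m0a": 0.1, "m0b": 0.9, "m1a": 0.2, "m1b": 0.8, "m2a": 0.5, "m2b": 0.4}

def log_prior(path):
    # Favour modules that were reused often before (Laplace-smoothed counts).
    total = sum(USE_COUNTS.values())
    return sum(math.log((USE_COUNTS[m] + 1) / (total + len(USE_COUNTS))) for m in path)

def log_likelihood(path, dataset_summary):
    # Toy dataset-specific term: Gaussian fit between each module's signature
    # and a summary of the new dataset.
    return sum(-0.5 * (SIGNATURE[m] - dataset_summary) ** 2 for m in path)

def best_composition(dataset_summary):
    candidates = product(*LIBRARY)
    return max(candidates, key=lambda p: log_prior(p) + log_likelihood(p, dataset_summary))

print(best_composition(0.15))  # composition whose modules best match the new data
```

Because every candidate is scored in closed form, the search cost is decoupled from neural network training, which is the bottleneck the abstract highlights.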
Related papers
- GroupMamba: Parameter-Efficient and Accurate Group Visual State Space Model [66.35608254724566]
State-space models (SSMs) have showcased effective performance in modeling long-range dependencies with subquadratic complexity.
However, pure SSM-based models still face challenges with stability and with achieving optimal performance on computer vision tasks.
Our paper addresses the challenges of scaling SSM-based models for computer vision, particularly the instability and inefficiency of large model sizes.
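As a reminder of why SSMs scale subquadratically, here is a toy diagonal state-space scan that processes a sequence in a single linear-time pass; the matrices are random stand-ins, not GroupMamba's parameterization.

```python
import numpy as np

# Minimal diagonal state-space recurrence: x_t = A x_{t-1} + B u_t, y_t = C x_t.
# The scan is linear in sequence length, unlike quadratic self-attention.
rng = np.random.default_rng(0)
d_state, seq_len = 4, 16
A = np.diag(rng.uniform(0.5, 0.99, d_state))  # stable diagonal transition
B = rng.normal(size=(d_state, 1))
C = rng.normal(size=(1, d_state))

u = rng.normal(size=(seq_len, 1))  # input sequence
x = np.zeros((d_state, 1))
ys = []
for t in range(seq_len):           # O(seq_len) scan
    x = A @ x + B * u[t, 0]
    ys.append((C @ x).item())
print(ys[:3])
```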
arXiv Detail & Related papers (2024-07-18T17:59:58Z) - SAPT: A Shared Attention Framework for Parameter-Efficient Continual Learning of Large Language Models [71.78800549517298]
Continual learning (CL) ability is vital for deploying large language models (LLMs) in the dynamic world.
Existing methods devise the learning module to acquire task-specific knowledge with parameter-efficient tuning (PET) block and the selection module to pick out the corresponding one for the testing input.
We propose a novel Shared Attention Framework (SAPT) to align the PET learning and selection via the Shared Attentive Learning & Selection module.
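A minimal sketch of the shared-attention idea, assuming invented adapter shapes and keys: the same attention weights over PET blocks drive both the soft mix used during learning and the hard pick used for a test input. This is an illustration, not SAPT's implementation.

```python
import numpy as np

rng = np.random.default_rng(1)
d, n_tasks = 8, 3

# Hypothetical per-task PET blocks: tiny adapter matrices (shapes invented).
adapters = [rng.normal(scale=0.1, size=(d, d)) for _ in range(n_tasks)]
keys = rng.normal(size=(n_tasks, d))  # one key per adapter, shared by both paths

def shared_attention(h):
    """Attention weights over adapter blocks computed from the input
    representation; learning and selection reuse the same weights."""
    scores = keys @ h / np.sqrt(d)
    w = np.exp(scores - scores.max())
    return w / w.sum()

h = rng.normal(size=d)                                          # input hidden state
w = shared_attention(h)
h_train = h + sum(wi * (A @ h) for wi, A in zip(w, adapters))   # soft mix while learning
h_test = h + adapters[int(np.argmax(w))] @ h                    # hard pick at test time
print(w.round(3), int(np.argmax(w)))
```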
arXiv Detail & Related papers (2024-01-16T11:45:03Z) - Modularity in Deep Learning: A Survey [0.0]
We review the notion of modularity in deep learning around three axes: data, task, and model.
Data modularity refers to the observation or creation of data groups for various purposes.
Task modularity refers to the decomposition of tasks into sub-tasks.
Model modularity means that the architecture of a neural network system can be decomposed into identifiable modules.
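A small illustration of model modularity, with invented module names and sizes: the network is assembled from identifiable, independently swappable parts.

```python
import numpy as np

rng = np.random.default_rng(2)
W_enc = rng.normal(size=(4, 8))
W_cls = rng.normal(size=(8, 3))
W_reg = rng.normal(size=(8, 1))

# A network as a registry of identifiable modules (names are illustrative).
modules = {
    "encoder": lambda x: np.tanh(x @ W_enc),
    "head_classify": lambda z: z @ W_cls,
    "head_regress": lambda z: z @ W_reg,
}

def compose(*names):
    def net(x):
        for name in names:
            x = modules[name](x)
        return x
    return net

x = rng.normal(size=(2, 4))
print(compose("encoder", "head_classify")(x).shape)  # (2, 3)
print(compose("encoder", "head_regress")(x).shape)   # (2, 1)
```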
arXiv Detail & Related papers (2023-10-02T12:41:34Z) - Modular Deep Learning [120.36599591042908]
Transfer learning has recently become the dominant paradigm of machine learning.
It remains unclear how to develop models that specialise towards multiple tasks without incurring negative interference.
Modular deep learning has emerged as a promising solution to these challenges.
arXiv Detail & Related papers (2023-02-22T18:11:25Z) - Neural Attentive Circuits [93.95502541529115]
We introduce a general-purpose, yet modular neural architecture called Neural Attentive Circuits (NACs).
NACs learn the parameterization and a sparse connectivity of neural modules without using domain knowledge.
NACs achieve an 8x speedup at inference time while losing less than 3% performance.
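A toy sketch of learned sparse connectivity between modules, with invented scores and shapes: edge scores are thresholded to keep only the strongest links, and each module aggregates messages only from its connected neighbours. It is not NACs' actual mechanism.

```python
import numpy as np

rng = np.random.default_rng(3)
n_modules, d = 4, 6

# Stand-in for learned connectivity: a score per edge, thresholded so that
# only the strongest 25% of links survive (scores invented here).
edge_scores = rng.normal(size=(n_modules, n_modules))
connectivity = edge_scores > np.quantile(edge_scores, 0.75)

weights = rng.normal(scale=0.3, size=(n_modules, d, d))
states = rng.normal(size=(n_modules, d))  # one state vector per module

# One message-passing step: module i aggregates only from connected modules j.
new_states = np.stack([
    np.tanh(sum(weights[j] @ states[j] for j in range(n_modules) if connectivity[i, j])
            + states[i])
    for i in range(n_modules)
])
print(connectivity.astype(int))
```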
arXiv Detail & Related papers (2022-10-14T18:00:07Z) - A Unifying Multi-sampling-ratio CS-MRI Framework With Two-grid-cycle Correction and Geometric Prior Distillation [7.643154460109723]
We propose a unifying deep unfolding multi-sampling-ratio CS-MRI framework that merges the advantages of model-based and deep learning-based methods.
Inspired by the multigrid algorithm, we first embed the CS-MRI optimization algorithm into a correction-distillation scheme.
We employ a condition module that adaptively learns the step length and noise level from the compressive sampling ratio at every stage.
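A minimal sketch of one idea from the summary: an unrolled ISTA-style reconstruction in which a toy condition module maps the sampling ratio and stage index to a step length and a noise (threshold) level. All shapes, schedules, and constants are invented for illustration.

```python
import numpy as np

# Toy "condition module": derives a step length and a noise level from the
# compressive sampling ratio and the stage index (schedule invented).
def condition_module(sampling_ratio, stage):
    step = 0.3 * sampling_ratio / (1 + 0.1 * stage)  # smaller steps at low ratios
    noise = 0.05 * (1 - sampling_ratio)              # more denoising when undersampled
    return step, noise

def soft_threshold(x, tau):
    return np.sign(x) * np.maximum(np.abs(x) - tau, 0.0)

rng = np.random.default_rng(4)
n, m = 32, 12                                  # signal length, measurements
A = rng.normal(size=(m, n)) / np.sqrt(m)
x_true = soft_threshold(rng.normal(size=n), 1.0)  # sparse ground truth
y = A @ x_true

x = np.zeros(n)
for stage in range(20):                        # unrolled ISTA-style stages
    step, noise = condition_module(m / n, stage)
    x = soft_threshold(x + step * A.T @ (y - A @ x), noise)
print(np.linalg.norm(x - x_true) / np.linalg.norm(x_true))  # relative error
```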
arXiv Detail & Related papers (2022-05-14T13:36:27Z) - Continual Learning via Local Module Composition [11.380264053565082]
Local module composition (LMC) is an approach to modular continual learning.
LMC provides each module a local structural component that estimates a module's relevance to the input.
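A minimal sketch of local module composition under assumed shapes: each module carries a local relevance scorer (here, distance to a prototype), and the layer mixes module outputs by these locally computed weights, with no global controller. This is an illustration, not LMC's actual structural component.

```python
import numpy as np

rng = np.random.default_rng(5)
d, n_modules = 6, 3

# Each module pairs a function with a local "structural" scorer estimating the
# module's relevance to its own input (both invented for this sketch).
W = [rng.normal(scale=0.4, size=(d, d)) for _ in range(n_modules)]
prototypes = [rng.normal(size=d) for _ in range(n_modules)]

def relevance(i, h):
    # Local score: closeness of the input to this module's prototype.
    return np.exp(-np.linalg.norm(h - prototypes[i]) ** 2 / d)

def lmc_layer(h):
    scores = np.array([relevance(i, h) for i in range(n_modules)])
    weights = scores / scores.sum()  # normalized locally, no global controller
    return sum(w * np.tanh(Wi @ h) for w, Wi in zip(weights, W))

h = rng.normal(size=d)
print(lmc_layer(h).shape)
```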
arXiv Detail & Related papers (2021-11-15T13:34:15Z) - Evaluating Modules in Graph Contrastive Learning [29.03038320344791]
We propose a framework that decomposes graph contrastive learning models into four modules.
We conduct experiments on node and graph classification tasks.
We release our implementations and results as OpenGCL, a modularized toolkit.
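A hedged sketch of such a four-module decomposition; the module names used here (sampler, encoder, discriminator, estimator) and all details are assumptions for illustration, not necessarily the paper's decomposition.

```python
import numpy as np

rng = np.random.default_rng(6)

def sampler(x):                      # produce two stochastic views (toy masking)
    return x * (rng.random(x.shape) > 0.2), x * (rng.random(x.shape) > 0.2)

def encoder(x, W):                   # shared encoder
    return np.tanh(x @ W)

def discriminator(z1, z2):           # similarity of paired views
    return (z1 * z2).sum(axis=1)

def estimator(pos, neg):             # InfoNCE-style objective estimate
    return -np.mean(pos - np.log(np.exp(pos) + np.exp(neg)))

W = rng.normal(scale=0.3, size=(8, 4))
x = rng.normal(size=(16, 8))
v1, v2 = sampler(x)
z1, z2 = encoder(v1, W), encoder(v2, W)
pos = discriminator(z1, z2)
neg = discriminator(z1, np.roll(z2, 1, axis=0))  # shifted pairs as negatives
print(estimator(pos, neg))
```

Swapping any one of the four functions while keeping the others fixed is the kind of controlled module-level comparison such a toolkit enables.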
arXiv Detail & Related papers (2021-06-15T14:14:23Z) - Deep Keypoint-Based Camera Pose Estimation with Geometric Constraints [80.60538408386016]
Estimating relative camera poses from consecutive frames is a fundamental problem in visual odometry.
We propose an end-to-end trainable framework consisting of learnable modules for detection, feature extraction, matching and outlier rejection.
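A sketch of the four-stage pipeline as composable functions, with every stage replaced by an invented stand-in (random keypoints and descriptors, mutual nearest-neighbour matching, a similarity-threshold outlier filter); the real framework learns each module end to end.

```python
import numpy as np

rng = np.random.default_rng(7)

def detect(img):                      # stand-in keypoint detector
    return rng.uniform(0, 1, size=(50, 2))

def describe(img, kps):               # stand-in per-keypoint descriptors
    return rng.normal(size=(len(kps), 16))

def match(d1, d2):                    # mutual nearest-neighbour matching
    sim = d1 @ d2.T
    nn12, nn21 = sim.argmax(1), sim.argmax(0)
    return [(i, j) for i, j in enumerate(nn12) if nn21[j] == i]

def reject_outliers(matches, d1, d2, thresh=0.0):
    return [(i, j) for i, j in matches if d1[i] @ d2[j] > thresh]

img1 = img2 = None                    # images omitted in this sketch
k1, k2 = detect(img1), detect(img2)
f1, f2 = describe(img1, k1), describe(img2, k2)
inliers = reject_outliers(match(f1, f2), f1, f2)
print(len(inliers))                   # matches that would feed pose estimation
```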
arXiv Detail & Related papers (2020-07-29T21:41:31Z) - One-Shot Object Detection without Fine-Tuning [62.39210447209698]
We introduce a two-stage model consisting of a first stage Matching-FCOS network and a second stage Structure-Aware Relation Module.
We also propose novel training strategies that effectively improve detection performance.
Our method exceeds the state-of-the-art one-shot performance consistently on multiple datasets.
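A toy two-stage cascade in the same spirit, with invented features and scores: stage one matches candidate boxes against the single support embedding, and stage two re-ranks the survivors with a relation score. It is a stand-in, not the paper's networks.

```python
import numpy as np

rng = np.random.default_rng(8)

support = rng.normal(size=16)                       # embedding of the one shot
boxes = rng.normal(size=(100, 16))                  # candidate box features

# Stage one: matching scores against the support, keep the top proposals.
stage1 = boxes @ support / np.linalg.norm(support)
keep = np.argsort(stage1)[-10:]

def relation(a, b):                                 # toy relation module
    return -np.linalg.norm(a - b)

# Stage two: re-rank survivors by relation to the support.
stage2 = np.array([relation(boxes[i], support) for i in keep])
best = keep[stage2.argmax()]
print(best, stage1[best])
```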
arXiv Detail & Related papers (2020-05-08T01:59:23Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information it presents and is not responsible for any consequences arising from its use.