Extending Contextual Self-Modulation: Meta-Learning Across Modalities, Task Dimensionalities, and Data Regimes
- URL: http://arxiv.org/abs/2410.01655v1
- Date: Wed, 2 Oct 2024 15:19:35 GMT
- Title: Extending Contextual Self-Modulation: Meta-Learning Across Modalities, Task Dimensionalities, and Data Regimes
- Authors: Roussel Desmond Nzoyem, David A. W. Barton, Tom Deakin
- Abstract summary: Contextual Self-Modulation (CSM) is a potent regularization mechanism for the Neural Context Flow (NCF) framework.
We introduce two extensions: $i$CSM, which expands CSM to infinite-dimensional tasks, and StochasticNCF, which improves scalability.
These extensions are demonstrated through comprehensive experimentation on a range of tasks.
- Score: 0.7373617024876724
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Contextual Self-Modulation (CSM) is a potent regularization mechanism for the Neural Context Flow (NCF) framework which demonstrates powerful meta-learning of physical systems. However, CSM has limitations in its applicability across different modalities and in high-data regimes. In this work, we introduce two extensions: $i$CSM, which expands CSM to infinite-dimensional tasks, and StochasticNCF, which improves scalability. These extensions are demonstrated through comprehensive experimentation on a range of tasks, including dynamical systems with parameter variations, computer vision challenges, and curve fitting problems. $i$CSM embeds the contexts into an infinite-dimensional function space, as opposed to CSM which uses finite-dimensional context vectors. StochasticNCF enables the application of both CSM and $i$CSM to high-data scenarios by providing an unbiased approximation of meta-gradient updates through a sampled set of nearest environments. Additionally, we incorporate higher-order Taylor expansions via Taylor-Mode automatic differentiation, revealing that higher-order approximations do not necessarily enhance generalization. Finally, we demonstrate how CSM can be integrated into other meta-learning frameworks with FlashCAVIA, a computationally efficient extension of the CAVIA meta-learning framework (Zintgraf et al. 2019). FlashCAVIA outperforms its predecessor across various benchmarks and reinforces the utility of bi-level optimization techniques. Together, these contributions establish a robust framework for tackling an expanded spectrum of meta-learning tasks, offering practical insights for out-of-distribution generalization. Our open-sourced library, designed for flexible integration of self-modulation into contextual meta-learning workflows, is available at github.com/ddrous/self-mod.
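To make the abstract's two headline mechanisms concrete, here is a minimal JAX sketch written against the description above, not the actual API of the self-mod library: first-order contextual self-modulation as a Taylor expansion of the model around a neighbouring environment's context, and StochasticNCF as restriction of the context pool to a few nearest environments. The loss form and the top-k selection rule are assumptions.

```python
import jax
import jax.numpy as jnp

def modulated_prediction(f, x, ctx_e, ctx_j):
    # First-order CSM: Taylor-expand f(x, .) around a neighbouring context
    # ctx_j and evaluate the expansion at this environment's context ctx_e.
    # jax.jvp returns (f(x, ctx_j), d/dc f(x, c)|_{ctx_j} . (ctx_e - ctx_j)).
    val, tangent = jax.jvp(lambda c: f(x, c), (ctx_j,), (ctx_e - ctx_j,))
    return val + tangent

def stochastic_ncf_loss(f, x, y, ctx_e, ctx_pool, k=4):
    # StochasticNCF: average modulated predictions over only the k nearest
    # contexts in the pool rather than all environments. The abstract
    # describes a *sampled* set of nearest environments; this deterministic
    # top-k selection is a simplifying assumption.
    dists = jnp.linalg.norm(ctx_pool - ctx_e, axis=-1)
    nearest = ctx_pool[jnp.argsort(dists)[:k]]
    preds = jax.vmap(lambda c_j: modulated_prediction(f, x, ctx_e, c_j))(nearest)
    return jnp.mean((preds - y[None]) ** 2)
```

Higher-order variants would swap `jax.jvp` for Taylor-Mode automatic differentiation (e.g. `jax.experimental.jet`); per the abstract, those higher orders do not necessarily improve generalization.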
Related papers
- A Unified Gradient-based Framework for Task-agnostic Continual Learning-Unlearning [30.2773429357068]
Recent advancements in deep models have highlighted the need for intelligent systems that combine continual learning (CL) for knowledge acquisition with machine unlearning (MU) for data removal.
We reveal their intrinsic connection through a unified optimization framework based on Kullback-Leibler divergence minimization.
Experiments demonstrate that the proposed UG-CLU framework effectively coordinates incremental learning, precise unlearning, and knowledge stability across multiple datasets and model architectures.
arXiv Detail & Related papers (2025-05-21T06:49:05Z)
- Feature Calibration enhanced Parameter Synthesis for CLIP-based Class-incremental Learning [10.253058594622017]
Class-Incremental Learning (CIL) enables models to continuously learn new class knowledge while retaining previous classes.
Traditional CIL methods rely primarily on visual features, which limits their effectiveness in complex, multimodal scenarios.
We propose a Feature Calibration enhanced Parameter Synthesis (FCPS) framework that mitigates catastrophic forgetting while preserving the model's intrinsic generalization capability.
arXiv Detail & Related papers (2025-03-24T13:44:12Z)
- Semi-supervised Semantic Segmentation with Multi-Constraint Consistency Learning [81.02648336552421]
We propose a Multi-Constraint Consistency Learning approach to facilitate the staged enhancement of the encoder and decoder.
Self-adaptive feature masking and noise injection are designed in an instance-specific manner to perturb the features for robust learning of the decoder.
Experimental results on Pascal VOC2012 and Cityscapes datasets demonstrate that our proposed MCCL achieves new state-of-the-art performance.
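The summary does not specify MCCL's masking and noise-injection schedules; purely as a hedged illustration of instance-specific feature perturbation of this kind, a JAX sketch with placeholder rates:

```python
import jax
import jax.numpy as jnp

def perturb_features(key, feats, mask_rate=0.3, noise_std=0.1):
    # Randomly zero out feature entries and inject Gaussian noise, drawing an
    # independent pattern per call so each instance sees its own corruption.
    k_mask, k_noise = jax.random.split(key)
    keep = jax.random.bernoulli(k_mask, 1.0 - mask_rate, feats.shape)
    return feats * keep + noise_std * jax.random.normal(k_noise, feats.shape)
```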
arXiv Detail & Related papers (2025-03-23T03:21:33Z)
- Will Pre-Training Ever End? A First Step Toward Next-Generation Foundation MLLMs via Self-Improving Systematic Cognition [86.21199607040147]
Self-Improving cognition (SIcog) is a self-learning framework for constructing next-generation foundation language models.
We introduce Chain-of-Description, a step-by-step visual understanding method, and integrate structured chain-of-thought (CoT) reasoning to support in-depth multimodal reasoning.
Extensive experiments demonstrate that SIcog produces next-generation foundation MLLMs with substantially improved multimodal cognition.
arXiv Detail & Related papers (2025-03-16T00:25:13Z)
- To Code or not to Code? Adaptive Tool Integration for Math Language Models via Expectation-Maximization [30.057052324461534]
We propose a novel framework that synergizes structured exploration (E-step) with off-policy optimization (M-step) to create a self-reinforcing cycle between metacognitive tool-use decisions and evolving capabilities.
Our 7B model improves by over 11% on MATH500 and by 9.4% on AIME without o1-like CoT.
arXiv Detail & Related papers (2025-02-02T06:32:23Z)
- ConfigX: Modular Configuration for Evolutionary Algorithms via Multitask Reinforcement Learning [24.66031883340297]
We introduce ConfigX, a new paradigm of the MetaBBO framework that is capable of learning a universal configuration agent (model) for boosting diverse EAs.
Our ConfigX, after large-scale pre-training, achieves robust zero-shot generalization to unseen tasks and outperforms state-of-the-art baselines.
arXiv Detail & Related papers (2024-12-10T13:43:51Z)
- MetaSSC: Enhancing 3D Semantic Scene Completion for Autonomous Driving through Meta-Learning and Long-sequence Modeling [3.139165705827712]
We introduce MetaSSC, a novel meta-learning-based framework for semantic scene completion (SSC).
Our approach begins with a voxel-based semantic segmentation (SS) pretraining task, aimed at exploring the semantics and geometry of incomplete regions.
Using simulated cooperative perception datasets, we supervise the perception training of a single vehicle using aggregated sensor data.
This meta-knowledge is then adapted to the target domain through a dual-phase training strategy, enabling efficient deployment.
arXiv Detail & Related papers (2024-11-06T05:11:25Z)
- LLMs Can Evolve Continually on Modality for X-Modal Reasoning [62.2874638875554]
Existing methods rely heavily on modal-specific pretraining and joint-modal tuning, leading to significant computational burdens when expanding to new modalities.
We propose PathWeave, a flexible and scalable framework with modal-Path sWitching and ExpAnsion abilities.
PathWeave performs comparably to state-of-the-art MLLMs while concurrently reducing parameter training burdens by 98.73%.
arXiv Detail & Related papers (2024-10-26T13:19:57Z)
- Mamba-FSCIL: Dynamic Adaptation with Selective State Space Model for Few-Shot Class-Incremental Learning [113.89327264634984]
Few-shot class-incremental learning (FSCIL) confronts the challenge of integrating new classes into a model with minimal training samples.
Traditional methods widely adopt static adaptation relying on a fixed parameter space to learn from data that arrive sequentially.
We propose a dual selective SSM projector that dynamically adjusts the projection parameters based on the intermediate features for dynamic adaptation.
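The dual selective SSM projector is specific to Mamba-FSCIL and not reproduced here; as a generic illustration of input-conditioned ("dynamic") projection, one can gate a base projection with weights computed from the intermediate features themselves (all parameter names below are hypothetical):

```python
import jax
import jax.numpy as jnp

def dynamic_projection(params, feats):
    # A small gating head derives per-channel scales from the input, so the
    # effective projection adapts to each feature vector it receives.
    gates = jax.nn.sigmoid(feats @ params["gate_w"] + params["gate_b"])
    return (feats * gates) @ params["proj_w"]
```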
arXiv Detail & Related papers (2024-07-08T17:09:39Z)
- Unleashing Network Potentials for Semantic Scene Completion [50.95486458217653]
This paper proposes a novel SSC framework, the Adversarial Modality Modulation Network (AMMNet).
AMMNet introduces two core modules: a cross-modal modulation enabling the interdependence of gradient flows between modalities, and a customized adversarial training scheme leveraging dynamic gradient competition.
Extensive experimental results demonstrate that AMMNet outperforms state-of-the-art SSC methods by a large margin.
arXiv Detail & Related papers (2024-03-12T11:48:49Z)
- Interactive Continual Learning: Fast and Slow Thinking [19.253164551254734]
This paper presents a novel Interactive Continual Learning framework, enabled by collaborative interactions among models of various sizes.
To improve memory retrieval in System 1, we introduce the CL-vMF mechanism, based on the von Mises-Fisher (vMF) distribution.
Comprehensive evaluation of our proposed ICL demonstrates significant resistance to forgetting and superior performance relative to existing methods.
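The CL-vMF mechanism itself is not detailed in this summary; as a hedged sketch of the underlying distribution, the snippet below scores a query embedding against stored class-mean directions under a von Mises-Fisher likelihood (the fixed concentration kappa is a placeholder):

```python
import jax.numpy as jnp

def vmf_scores(query, class_means, kappa=16.0):
    # von Mises-Fisher log-likelihood up to an additive constant:
    # log p(x | mu_c, kappa) = kappa * mu_c^T x + const for unit vectors,
    # so retrieval reduces to concentration-scaled cosine similarity.
    q = query / jnp.linalg.norm(query)
    mu = class_means / jnp.linalg.norm(class_means, axis=-1, keepdims=True)
    return kappa * (mu @ q)
```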
arXiv Detail & Related papers (2024-03-05T03:37:28Z)
- USER: Unified Semantic Enhancement with Momentum Contrast for Image-Text Retrieval [115.28586222748478]
Image-Text Retrieval (ITR) aims at searching for the target instances that are semantically relevant to the given query from the other modality.
Existing approaches typically suffer from two major limitations.
arXiv Detail & Related papers (2023-01-17T12:42:58Z)
- Leveraging Memory Effects and Gradient Information in Consensus-Based Optimization: On Global Convergence in Mean-Field Law [0.0]
We present a versatile, flexible and customizable consensus-based optimization (CBO) method suitable for global and non-smooth optimizations in high dimensions.
We prove that the dynamics converge to a global minimizer of the objective function in mean-field law.
We present evidence for the superiority of this CBO variant in applications such as machine learning and compressed sensing, which widen the scope of applications of CBO.
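For context, plain CBO (without the memory effects and gradient information this variant adds) evolves a swarm of particles toward a Gibbs-weighted consensus point. A minimal discretized step, with illustrative hyperparameters:

```python
import jax
import jax.numpy as jnp

def cbo_step(key, X, f, alpha=30.0, lam=1.0, sigma=0.7, dt=0.01):
    # X: (N, d) particle positions; f: objective mapping (d,) -> scalar.
    # The consensus point is a softmin-weighted average that concentrates
    # on particles with low objective values.
    weights = jax.nn.softmax(-alpha * jax.vmap(f)(X))
    consensus = weights @ X                                # shape (d,)
    drift = -lam * (X - consensus) * dt                    # pull toward consensus
    noise = sigma * (X - consensus) * jnp.sqrt(dt) * jax.random.normal(key, X.shape)
    return X + drift + noise                               # anisotropic diffusion
```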
arXiv Detail & Related papers (2022-11-22T11:22:11Z)
- Continual Variational Autoencoder Learning via Online Cooperative Memorization [11.540150938141034]
Variational Autoencoders (VAE) have been successfully used in continual learning classification tasks.
However, their ability to generate images with specifications corresponding to the classes and databases learned during Continual Learning is not well understood.
We develop a new theoretical framework that formulates CL as a dynamic optimal transport problem.
We then propose a novel memory buffering approach, namely the Online Cooperative Memorization (OCM) framework.
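The dynamic optimal transport formulation is the paper's own contribution and is not reproduced here; purely to make the optimal transport ingredient concrete, here is a standard entropic OT solver (Sinkhorn iterations):

```python
import jax.numpy as jnp

def sinkhorn(cost, a, b, eps=0.1, iters=200):
    # Entropic OT between histograms a (n,) and b (m,) under a cost matrix
    # (n, m); returns the transport plan. A textbook algorithm, shown only
    # to illustrate what an OT problem looks like computationally.
    K = jnp.exp(-cost / eps)            # Gibbs kernel
    u = jnp.ones_like(a)
    for _ in range(iters):
        v = b / (K.T @ u)               # match column marginals
        u = a / (K @ v)                 # match row marginals
    return u[:, None] * K * v[None, :]
```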
arXiv Detail & Related papers (2022-07-20T18:19:27Z)
- Learning towards Synchronous Network Memorizability and Generalizability for Continual Segmentation across Multiple Sites [52.84959869494459]
In clinical practice, a segmentation network is often required to continually learn on a sequential data stream from multiple sites.
Existing methods are usually restricted in either network memorizability on previous sites or generalizability on unseen sites.
This paper aims to tackle the problem of Synchronous Memorizability and Generalizability with a novel proposed SMG-learning framework.
arXiv Detail & Related papers (2022-06-14T13:04:36Z)
- Meta-Learning with Neural Tangent Kernels [58.06951624702086]
We propose the first meta-learning paradigm in the Reproducing Kernel Hilbert Space (RKHS) induced by the meta-model's Neural Tangent Kernel (NTK).
Within this paradigm, we introduce two meta-learning algorithms, which no longer need a sub-optimal iterative inner-loop adaptation as in the MAML framework.
We achieve this goal by 1) replacing the adaptation with a fast-adaptive regularizer in the RKHS; and 2) solving the adaptation analytically based on the NTK theory.
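As a hedged sketch of what "solving the adaptation analytically" can look like, the snippet below replaces an iterative inner loop with kernel ridge regression under a user-supplied NTK-style kernel (the interface and regularizer are assumptions, not the paper's algorithm; empirical NTKs for a given architecture can be computed with the neural-tangents library):

```python
import jax.numpy as jnp

def analytic_adaptation(kernel, x_support, y_support, x_query, reg=1e-3):
    # Closed-form adaptation: with a fixed kernel, fitting the support set
    # reduces to one linear solve instead of gradient-based fine-tuning.
    K = kernel(x_support, x_support)                       # (n, n) Gram matrix
    alpha = jnp.linalg.solve(K + reg * jnp.eye(K.shape[0]), y_support)
    return kernel(x_query, x_support) @ alpha              # query predictions
```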
arXiv Detail & Related papers (2021-02-07T20:53:23Z)
- Optimization-Inspired Learning with Architecture Augmentations and Control Mechanisms for Low-Level Vision [74.9260745577362]
This paper proposes a unified optimization-inspired learning framework to aggregate Generative, Discriminative, and Corrective (GDC) principles.
We construct three propagative modules to effectively solve the optimization models with flexible combinations.
Experiments across varied low-level vision tasks validate the efficacy and adaptability of GDC.
arXiv Detail & Related papers (2020-12-10T03:24:53Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences arising from its use.