Related papers: Task agnostic continual learning with Pairwise layer architecture

URL: http://arxiv.org/abs/2405.13632v1
Date: Wed, 22 May 2024 13:30:01 GMT
Title: Task agnostic continual learning with Pairwise layer architecture
Authors: Santtu Keskinen
Abstract summary: We show that we can improve the continual learning performance by replacing the final layer of our networks with our pairwise interaction layer. The networks using this architecture show competitive performance in MNIST and FashionMNIST-based continual image classification experiments.
Score: 0.0
License: http://creativecommons.org/licenses/by-sa/4.0/
Abstract: Most of the dominant approaches to continual learning are based on either memory replay, parameter isolation, or regularization techniques that require task boundaries to calculate task statistics. We propose a static architecture-based method that doesn't use any of these. We show that we can improve the continual learning performance by replacing the final layer of our networks with our pairwise interaction layer. The pairwise interaction layer uses sparse representations from a Winner-take-all style activation function to find the relevant correlations in the hidden layer representations. The networks using this architecture show competitive performance in MNIST and FashionMNIST-based continual image classification experiments. We demonstrate this in an online streaming continual learning setup where the learning system cannot access task labels or boundaries.
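
Illustrative sketch (not the paper's implementation): the abstract names two ingredients, a Winner-take-all style activation that produces sparse hidden representations, and a final layer that works on pairwise interactions between hidden units. The Python below shows one plausible reading of that idea under stated assumptions: the activation is a top-k rule, the interaction is the product of two active units, and each pair has its own class weights. The function names, the `weights` dictionary layout, and the top-k rule are all hypothetical and are not taken from the paper.

```python
from itertools import combinations


def k_winner_take_all(hidden, k):
    """Keep the k largest activations and zero the rest.

    Assumption: a simple top-k rule; ties go to the lower index
    because Python's sort is stable.
    """
    if k <= 0:
        return [0.0] * len(hidden)
    order = sorted(range(len(hidden)), key=lambda i: hidden[i], reverse=True)
    winners = set(order[:k])
    return [h if i in winners else 0.0 for i, h in enumerate(hidden)]


def pairwise_features(sparse):
    """Return products for every pair of units that are both active."""
    return {
        (i, j): sparse[i] * sparse[j]
        for i, j in combinations(range(len(sparse)), 2)
        if sparse[i] != 0.0 and sparse[j] != 0.0
    }


def pairwise_logits(sparse, weights, n_classes):
    """Sum hypothetical per-pair class weights over the active pairs.

    `weights` maps a unit pair (i, j) to a list of n_classes weights;
    pairs without an entry contribute nothing.
    """
    logits = [0.0] * n_classes
    for pair, value in pairwise_features(sparse).items():
        for c, w in enumerate(weights.get(pair, [0.0] * n_classes)):
            logits[c] += w * value
    return logits


if __name__ == "__main__":
    hidden = [0.1, 0.9, 0.3, 0.7]
    sparse = k_winner_take_all(hidden, k=2)
    print(sparse)
    print(pairwise_logits(sparse, {(1, 3): [1.0, -1.0]}, n_classes=2))
```

Because only pairs of simultaneously active units contribute, an update driven by one input touches the weights of a small set of pairs. That is one intuition for why sparse pairwise interactions could reduce interference between inputs from different tasks, though the paper itself should be consulted for the actual mechanism.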

Related papers

Do All Individual Layers Help? An Empirical Study of Task-Interfering Layers in Vision-Language Models [51.754991950934375]
In a pretrained VLM, all layers are engaged by default to make predictions on downstream tasks. We find that intervening on a single layer, such as by zeroing its parameters, can improve the performance on certain tasks. We propose TaLo, a training-free, test-time adaptation method that dynamically identifies and bypasses the most interfering layer for a given task.
arXiv Detail & Related papers (2026-02-01T11:37:05Z)
Learning with Preserving for Continual Multitask Learning [4.847042727427382]
We introduce Learning with Preserving (LwP), a novel framework that shifts the focus from preserving task outputs to maintaining the shared representation space. LwP not only mitigates catastrophic forgetting but also consistently outperforms state-of-the-art baselines in CMTL tasks.
arXiv Detail & Related papers (2025-11-11T22:23:20Z)
Hierarchical Context Transformer for Multi-level Semantic Scene Understanding [37.35498412336018]
We propose to represent the task set as multi-level semantic scene understanding (MSSU). To this end, we propose a novel hierarchical context transformer (HCT) network. Experiments on our cataract dataset and a publicly available endoscopic PSI-AVA dataset demonstrate the outstanding performance of our method.
arXiv Detail & Related papers (2025-02-21T03:36:16Z)
Strengthening Layer Interaction via Dynamic Layer Attention [12.341997220052486]
Existing layer attention methods achieve layer interaction on fixed feature maps in a static manner. To restore the dynamic context representation capability of the attention mechanism, we propose a Dynamic Layer Attention architecture. Experimental results demonstrate the effectiveness of the proposed DLA architecture, outperforming other state-of-the-art methods in image recognition and object detection tasks.
arXiv Detail & Related papers (2024-06-19T09:35:14Z)
Multilinear Operator Networks [60.7432588386185]
Polynomial Networks are a class of models that do not require activation functions. We propose MONet, which relies solely on multilinear operators.
arXiv Detail & Related papers (2024-01-31T16:52:19Z)
Use All The Labels: A Hierarchical Multi-Label Contrastive Learning Framework [75.79736930414715]
We present a hierarchical multi-label representation learning framework that can leverage all available labels and preserve the hierarchical relationship between classes. We introduce novel hierarchy preserving losses, which jointly apply a hierarchical penalty to the contrastive loss, and enforce the hierarchy constraint.
arXiv Detail & Related papers (2022-04-27T21:41:44Z)
SIRe-Networks: Skip Connections over Interlaced Multi-Task Learning and Residual Connections for Structure Preserving Object Classification [28.02302915971059]
In this paper, we introduce an interlaced multi-task learning strategy, defined SIRe, to reduce the vanishing gradient in relation to the object classification task. The presented methodology directly improves a convolutional neural network (CNN) by enforcing the input image structure preservation through auto-encoders. To validate the presented methodology, a simple CNN and various implementations of famous networks are extended via the SIRe strategy and extensively tested on the CIFAR100 dataset.
arXiv Detail & Related papers (2021-10-06T13:54:49Z)
Encoders and Ensembles for Task-Free Continual Learning [15.831773437720429]
We present an architecture that is effective for continual learning in an especially demanding setting, where task boundaries do not exist or are unknown. We show that models trained with the architecture are state-of-the-art for the task-free setting on standard image classification continual learning benchmarks. We also show that the architecture learns well in a fully incremental setting, where one class is learned at a time, and we demonstrate its effectiveness in this setting with up to 100 classes.
arXiv Detail & Related papers (2021-05-27T17:34:31Z)
Multi-Perspective LSTM for Joint Visual Representation Learning [81.21490913108835]
We present a novel LSTM cell architecture capable of learning both intra- and inter-perspective relationships available in visual sequences captured from multiple perspectives. Our architecture adopts a novel recurrent joint learning strategy that uses additional gates and memories at the cell level. We show that by using the proposed cell to create a network, more effective and richer visual representations are learned for recognition tasks.
arXiv Detail & Related papers (2021-05-06T16:44:40Z)
Continual Learning in Low-rank Orthogonal Subspaces [86.36417214618575]
In continual learning (CL), a learner is faced with a sequence of tasks, arriving one after the other, and the goal is to remember all the tasks once the learning experience is finished. The prior art in CL uses episodic memory, parameter regularization or network structures to reduce interference among tasks, but in the end, all the approaches learn different tasks in a joint vector space. We propose to learn tasks in different (low-rank) vector subspaces that are kept orthogonal to each other in order to minimize interference.
arXiv Detail & Related papers (2020-10-22T12:07:43Z)
Region Comparison Network for Interpretable Few-shot Image Classification [97.97902360117368]
Few-shot image classification has been proposed to effectively use only a limited number of labeled examples to train models for new classes. We propose a metric learning based method named Region Comparison Network (RCN), which is able to reveal how few-shot learning works. We also present a new way to generalize the interpretability from the level of tasks to categories.
arXiv Detail & Related papers (2020-09-08T07:29:05Z)
Neuromodulated Neural Architectures with Local Error Signals for Memory-Constrained Online Continual Learning [4.2903672492917755]
We develop a biologically inspired, lightweight neural network architecture that incorporates local learning and neuromodulation. We demonstrate the efficacy of our approach in both single-task and continual learning settings.
arXiv Detail & Related papers (2020-07-16T07:41:23Z)
Adversarial Continual Learning [99.56738010842301]
We propose a hybrid continual learning framework that learns a disjoint representation for task-invariant and task-specific features. Our model combines architecture growth to prevent forgetting of task-specific skills and an experience replay approach to preserve shared skills.
arXiv Detail & Related papers (2020-03-21T02:08:17Z)

This list is automatically generated from the titles and abstracts of the papers on this site.