Related papers: Ada-QPacknet -- adaptive pruning with bit width reduction as an efficient continual learning method without forgetting

Ada-QPacknet -- adaptive pruning with bit width reduction as an efficient continual learning method without forgetting

URL: http://arxiv.org/abs/2308.07939v2
Date: Sun, 1 Oct 2023 15:43:45 GMT
Title: Ada-QPacknet -- adaptive pruning with bit width reduction as an efficient continual learning method without forgetting
Authors: Marcin Pietro\'n, Dominik \.Zurek, Kamil Faber, Roberto Corizzo
Abstract summary: In this work new architecture based approach Ada-QPacknet is described. It incorporates the pruning for extracting the sub-network for each task. Results show that proposed approach outperforms most of the CL strategies in task and class incremental scenarios.
Score: 0.8681331155356999
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Continual Learning (CL) is a process in which there is still huge gap between human and deep learning model efficiency. Recently, many CL algorithms were designed. Most of them have many problems with learning in dynamic and complex environments. In this work new architecture based approach Ada-QPacknet is described. It incorporates the pruning for extracting the sub-network for each task. The crucial aspect in architecture based CL methods is theirs capacity. In presented method the size of the model is reduced by efficient linear and nonlinear quantisation approach. The method reduces the bit-width of the weights format. The presented results shows that low bit quantisation achieves similar accuracy as floating-point sub-network on a well-know CL scenarios. To our knowledge it is the first CL strategy which incorporates both compression techniques pruning and quantisation for generating task sub-networks. The presented algorithm was tested on well-known episode combinations and compared with most popular algorithms. Results show that proposed approach outperforms most of the CL strategies in task and class incremental scenarios.

Related papers

A mini-batch training strategy for deep subspace clustering networks [6.517972913340111]
Mini-batch training is a cornerstone of modern deep learning, offering computational efficiency and scalability for training complex architectures.<n>In this work, we introduce a mini-batch training strategy for deep subspace clustering by integrating a memory bank that preserves global feature representations.<n>Our approach enables scalable training of deep architectures for subspace clustering with high-resolution images, overcoming previous limitations.
arXiv Detail & Related papers (2025-07-26T11:44:39Z)
CLAQ: Pushing the Limits of Low-Bit Post-Training Quantization for LLMs [44.03692512352445]
Column-Level Adaptive weight Quantization (CLAQ) is a novel and effective framework for Large Language Models (LLMs) quantization. In this paper, we present a novel and effective CLAQ framework by introducing three different types of adaptive strategies for LLM quantization. Experiments on various mainstream open source LLMs including LLaMA-1, LLaMA-2 and Yi demonstrate that our methods achieve the state-of-the-art results across different bit settings.
arXiv Detail & Related papers (2024-05-27T14:49:39Z)
Hyperparameters in Continual Learning: A Reality Check [53.30082523545212]
Continual learning (CL) aims to train a model on a sequence of tasks while balancing the trade-off between plasticity (learning new tasks) and stability (retaining prior knowledge)
arXiv Detail & Related papers (2024-03-14T03:13:01Z)
A Hard-to-Beat Baseline for Training-free CLIP-based Adaptation [121.0693322732454]
Contrastive Language-Image Pretraining (CLIP) has gained popularity for its remarkable zero-shot capacity. Recent research has focused on developing efficient fine-tuning methods to enhance CLIP's performance in downstream tasks. We revisit a classical algorithm, Gaussian Discriminant Analysis (GDA), and apply it to the downstream classification of CLIP.
arXiv Detail & Related papers (2024-02-06T15:45:27Z)
Fast and Scalable Network Slicing by Integrating Deep Learning with Lagrangian Methods [8.72339110741777]
Network slicing is a key technique in 5G and beyond for efficiently supporting diverse services. Deep learning models suffer limited generalization and adaptability to dynamic slicing configurations. We propose a novel framework that integrates constrained optimization methods and deep learning models.
arXiv Detail & Related papers (2024-01-22T07:19:16Z)
Online Network Source Optimization with Graph-Kernel MAB [62.6067511147939]
We propose Grab-UCB, a graph- kernel multi-arms bandit algorithm to learn online the optimal source placement in large scale networks. We describe the network processes with an adaptive graph dictionary model, which typically leads to sparse spectral representations. We derive the performance guarantees that depend on network parameters, which further influence the learning curve of the sequential decision strategy.
arXiv Detail & Related papers (2023-07-07T15:03:42Z)
Unifying Synergies between Self-supervised Learning and Dynamic Computation [53.66628188936682]
We present a novel perspective on the interplay between SSL and DC paradigms. We show that it is feasible to simultaneously learn a dense and gated sub-network from scratch in a SSL setting. The co-evolution during pre-training of both dense and gated encoder offers a good accuracy-efficiency trade-off.
arXiv Detail & Related papers (2023-01-22T17:12:58Z)
Pushing the Efficiency Limit Using Structured Sparse Convolutions [82.31130122200578]
We propose Structured Sparse Convolution (SSC), which leverages the inherent structure in images to reduce the parameters in the convolutional filter. We show that SSC is a generalization of commonly used layers (depthwise, groupwise and pointwise convolution) in efficient architectures'' Architectures based on SSC achieve state-of-the-art performance compared to baselines on CIFAR-10, CIFAR-100, Tiny-ImageNet, and ImageNet classification benchmarks.
arXiv Detail & Related papers (2022-10-23T18:37:22Z)
Tricks and Plugins to GBM on Images and Sequences [18.939336393665553]
We propose a new algorithm for boosting Deep Convolutional Neural Networks (BoostCNN) to combine the merits of dynamic feature selection and BoostCNN. We also propose a set of algorithms to incorporate boosting weights into a deep learning architecture based on a least squares objective function. Experiments show that the proposed methods outperform benchmarks on several fine-grained classification tasks.
arXiv Detail & Related papers (2022-03-01T21:59:00Z)
Manifold Regularized Dynamic Network Pruning [102.24146031250034]
This paper proposes a new paradigm that dynamically removes redundant filters by embedding the manifold information of all instances into the space of pruned networks. The effectiveness of the proposed method is verified on several benchmarks, which shows better performance in terms of both accuracy and computational cost.
arXiv Detail & Related papers (2021-03-10T03:59:03Z)
Online Sequential Extreme Learning Machines: Features Combined From Hundreds of Midlayers [0.0]
In this paper, we develop an algorithm called hierarchal online sequential learning algorithm (H-OS-ELM) The algorithm can learn chunk by chunk with fixed or varying block size.
arXiv Detail & Related papers (2020-06-12T00:50:04Z)
Fitting the Search Space of Weight-sharing NAS with Graph Convolutional Networks [100.14670789581811]
We train a graph convolutional network to fit the performance of sampled sub-networks. With this strategy, we achieve a higher rank correlation coefficient in the selected set of candidates.
arXiv Detail & Related papers (2020-04-17T19:12:39Z)

This list is automatically generated from the titles and abstracts of the papers in this site.