Ada-QPacknet -- adaptive pruning with bit width reduction as an
efficient continual learning method without forgetting
- URL: http://arxiv.org/abs/2308.07939v2
- Date: Sun, 1 Oct 2023 15:43:45 GMT
- Title: Ada-QPacknet -- adaptive pruning with bit width reduction as an
efficient continual learning method without forgetting
- Authors: Marcin Pietroń, Dominik Żurek, Kamil Faber, Roberto Corizzo
- Abstract summary: In this work, a new architecture-based approach, Ada-QPacknet, is described.
It incorporates pruning to extract a sub-network for each task.
Results show that the proposed approach outperforms most CL strategies in task- and class-incremental scenarios.
- Score: 0.8681331155356999
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Continual Learning (CL) is a setting in which there is still a huge gap between
human and deep learning model efficiency. Recently, many CL algorithms have been
designed, but most of them struggle to learn in dynamic and complex
environments. In this work, a new architecture-based approach, Ada-QPacknet, is
described. It incorporates pruning to extract a sub-network for each
task. The crucial aspect of architecture-based CL methods is their capacity.
In the presented method, the size of the model is reduced by an efficient linear and
nonlinear quantisation approach, which reduces the bit-width of the
weight format. The presented results show that low-bit quantisation achieves
accuracy similar to that of a floating-point sub-network on well-known CL scenarios. To
our knowledge, it is the first CL strategy that incorporates both compression
techniques, pruning and quantisation, for generating task sub-networks. The
presented algorithm was tested on well-known episode combinations and compared
with the most popular algorithms. Results show that the proposed approach outperforms
most CL strategies in task- and class-incremental scenarios.
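The two compression steps named in the abstract (a pruned sub-network per task, then a reduced bit-width for its weights) can be illustrated with a minimal sketch. This is not the authors' algorithm: it assumes plain magnitude pruning with a fixed keep ratio and uniform linear quantisation, and it omits the adaptive pruning, nonlinear quantisation, and retraining between tasks that the paper describes. The helper names `prune_mask`, `quantize_linear`, and `dequantize` are hypothetical.

```python
import numpy as np


def prune_mask(weights, free, keep_ratio):
    """Keep the largest-magnitude `keep_ratio` share of the still-free weights."""
    n_keep = int(round(keep_ratio * int(free.sum())))
    flat = np.zeros(weights.size, dtype=bool)
    if n_keep > 0:
        # Weights already claimed by earlier tasks get a score of -inf.
        scores = np.where(free, np.abs(weights), -np.inf).ravel()
        flat[np.argpartition(scores, -n_keep)[-n_keep:]] = True
    return flat.reshape(weights.shape)


def quantize_linear(values, bits):
    """Uniform quantisation to 2**bits levels; returns codes, step size, offset."""
    lo, hi = float(values.min()), float(values.max())
    scale = (hi - lo) / (2 ** bits - 1) if hi > lo else 1.0
    codes = np.round((values - lo) / scale).astype(np.int64)
    return codes, scale, lo


def dequantize(codes, scale, lo):
    """Map integer codes back to approximate floating-point weights."""
    return codes * scale + lo


rng = np.random.default_rng(0)
weights = rng.normal(size=(8, 8))          # one shared layer (toy size)
free = np.ones(weights.shape, dtype=bool)  # weights not yet owned by a task

task_masks = []
for task in range(2):
    mask = prune_mask(weights, free, keep_ratio=0.5)
    free &= ~mask                          # freeze this task's sub-network
    task_masks.append(mask)

# Store the first task's sub-network with 4-bit codes (assumed bit-width).
codes, scale, lo = quantize_linear(weights[task_masks[0]], bits=4)
restored = dequantize(codes, scale, lo)
```

In this toy setting the task masks are disjoint, so training a later task cannot overwrite weights owned by an earlier one, and the rounding error of each restored weight is at most half a quantisation step.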
Related papers
- CLAQ: Pushing the Limits of Low-Bit Post-Training Quantization for LLMs [44.03692512352445]
Column-Level Adaptive weight Quantization (CLAQ) is a novel and effective framework for Large Language Models (LLMs) quantization.
In this paper, we present a novel and effective CLAQ framework by introducing three different types of adaptive strategies for LLM quantization.
Experiments on various mainstream open-source LLMs, including LLaMA-1, LLaMA-2 and Yi, demonstrate that our methods achieve state-of-the-art results across different bit settings.
arXiv Detail & Related papers (2024-05-27T14:49:39Z)
- A Hard-to-Beat Baseline for Training-free CLIP-based Adaptation [121.0693322732454]
Contrastive Language-Image Pretraining (CLIP) has gained popularity for its remarkable zero-shot capacity.
Recent research has focused on developing efficient fine-tuning methods to enhance CLIP's performance in downstream tasks.
We revisit a classical algorithm, Gaussian Discriminant Analysis (GDA), and apply it to the downstream classification of CLIP.
arXiv Detail & Related papers (2024-02-06T15:45:27Z)
- Fast and Scalable Network Slicing by Integrating Deep Learning with Lagrangian Methods [8.72339110741777]
Network slicing is a key technique in 5G and beyond for efficiently supporting diverse services.
Deep learning models suffer from limited generalization and adaptability to dynamic slicing configurations.
We propose a novel framework that integrates constrained optimization methods and deep learning models.
arXiv Detail & Related papers (2024-01-22T07:19:16Z)
- Online Network Source Optimization with Graph-Kernel MAB [62.6067511147939]
We propose Grab-UCB, a graph-kernel multi-armed bandit algorithm that learns online the optimal source placement in large-scale networks.
We describe the network processes with an adaptive graph dictionary model, which typically leads to sparse spectral representations.
We derive the performance guarantees that depend on network parameters, which further influence the learning curve of the sequential decision strategy.
arXiv Detail & Related papers (2023-07-07T15:03:42Z)
- Unifying Synergies between Self-supervised Learning and Dynamic Computation [53.66628188936682]
We present a novel perspective on the interplay between SSL and DC paradigms.
We show that it is feasible to simultaneously learn a dense and a gated sub-network from scratch in an SSL setting.
The co-evolution during pre-training of both dense and gated encoder offers a good accuracy-efficiency trade-off.
arXiv Detail & Related papers (2023-01-22T17:12:58Z)
- Pushing the Efficiency Limit Using Structured Sparse Convolutions [82.31130122200578]
We propose Structured Sparse Convolution (SSC), which leverages the inherent structure in images to reduce the parameters in the convolutional filter.
We show that SSC is a generalization of commonly used layers (depthwise, groupwise and pointwise convolution) in efficient architectures.
Architectures based on SSC achieve state-of-the-art performance compared to baselines on CIFAR-10, CIFAR-100, Tiny-ImageNet, and ImageNet classification benchmarks.
arXiv Detail & Related papers (2022-10-23T18:37:22Z)
- Tricks and Plugins to GBM on Images and Sequences [18.939336393665553]
We propose a new algorithm for boosting Deep Convolutional Neural Networks (BoostCNN) to combine the merits of dynamic feature selection and BoostCNN.
We also propose a set of algorithms to incorporate boosting weights into a deep learning architecture based on a least squares objective function.
Experiments show that the proposed methods outperform benchmarks on several fine-grained classification tasks.
arXiv Detail & Related papers (2022-03-01T21:59:00Z)
- Manifold Regularized Dynamic Network Pruning [102.24146031250034]
This paper proposes a new paradigm that dynamically removes redundant filters by embedding the manifold information of all instances into the space of pruned networks.
The effectiveness of the proposed method is verified on several benchmarks, which shows better performance in terms of both accuracy and computational cost.
arXiv Detail & Related papers (2021-03-10T03:59:03Z)
- Online Sequential Extreme Learning Machines: Features Combined From Hundreds of Midlayers [0.0]
In this paper, we develop an algorithm called the hierarchical online sequential learning algorithm (H-OS-ELM).
The algorithm can learn chunk by chunk with fixed or varying block size.
arXiv Detail & Related papers (2020-06-12T00:50:04Z)
- Fitting the Search Space of Weight-sharing NAS with Graph Convolutional Networks [100.14670789581811]
We train a graph convolutional network to fit the performance of sampled sub-networks.
With this strategy, we achieve a higher rank correlation coefficient in the selected set of candidates.
arXiv Detail & Related papers (2020-04-17T19:12:39Z)
This list is automatically generated from the titles and abstracts of the papers on this site.