Discovering Long-Term Effects on Parameter Efficient Fine-tuning
- URL: http://arxiv.org/abs/2409.06706v1
- Date: Sat, 24 Aug 2024 03:27:29 GMT
- Title: Discovering Long-Term Effects on Parameter Efficient Fine-tuning
- Authors: Gaole Dai, Yiming Tang, Chunkai Fan, Qizhe Zhang, Zhi Zhang, Yulu Gan, Chengqing Zeng, Shanghang Zhang, Tiejun Huang,
- Abstract summary: Pre-trained Artificial Neural Networks (Annns) exhibit robust pattern recognition capabilities.
Annns and BNNs share extensive similarities with the human brain, specifically Biological Neural Networks (BNNs)
Annns can acquire new knowledge through fine-tuning.
- Score: 36.83255498301937
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Pre-trained Artificial Neural Networks (ANNs) exhibit robust pattern recognition capabilities and share extensive similarities with the human brain, specifically Biological Neural Networks (BNNs). We are particularly intrigued by these models' ability to acquire new knowledge through fine-tuning. In this regard, Parameter-efficient Fine-tuning (PEFT) has gained widespread adoption as a substitute for full fine-tuning due to its cost reduction in training and mitigation of over-fitting risks by limiting the number of trainable parameters during adaptation. Since both ANNs and BNNs propagate information layer-by-layer, a common analogy can be drawn: weights in ANNs represent synapses in BNNs, while features (also known as latent variables or logits) in ANNs represent neurotransmitters released by neurons in BNNs. Mainstream PEFT methods aim to adjust feature or parameter values using only a limited number of trainable parameters (usually less than 1% of the total parameters), yet achieve surprisingly good results. Building upon this clue, we delve deeper into exploring the connections between feature adjustment and parameter adjustment, resulting in our proposed method Synapses & Neurons (SAN) that learns scaling matrices for features and propagates their effects towards posterior weight matrices. Our approach draws strong inspiration from well-known neuroscience phenomena - Long-term Potentiation (LTP) and Long-term Depression (LTD), which also reveal the relationship between synapse development and neurotransmitter release levels. We conducted extensive comparisons of PEFT on 26 datasets using attention-based networks as well as convolution-based networks, leading to significant improvements compared to other tuning methods (+8.5% over fully-finetune, +7% over Visual Prompt Tuning, and +3.2% over LoRA). The codes would be released.
Related papers
- Dendrites endow artificial neural networks with accurate, robust and parameter-efficient learning [0.0]
We show that a new ANN architecture incorporates the structured connectivity and restricted sampling properties of biological dendrites.
We find that dendritic ANNs are more robust to overfitting and outperform traditional ANNs on several image classification tasks.
arXiv Detail & Related papers (2024-04-04T11:22:58Z) - Co-learning synaptic delays, weights and adaptation in spiking neural
networks [0.0]
Spiking neural networks (SNN) distinguish themselves from artificial neural networks (ANN) because of their inherent temporal processing and spike-based computations.
We show that data processing with spiking neurons can be enhanced by co-learning the connection weights with two other biologically inspired neuronal features.
arXiv Detail & Related papers (2023-09-12T09:13:26Z) - The Expressive Leaky Memory Neuron: an Efficient and Expressive Phenomenological Neuron Model Can Solve Long-Horizon Tasks [64.08042492426992]
We introduce the Expressive Memory (ELM) neuron model, a biologically inspired model of a cortical neuron.
Our ELM neuron can accurately match the aforementioned input-output relationship with under ten thousand trainable parameters.
We evaluate it on various tasks with demanding temporal structures, including the Long Range Arena (LRA) datasets.
arXiv Detail & Related papers (2023-06-14T13:34:13Z) - MSAT: Biologically Inspired Multi-Stage Adaptive Threshold for
Conversion of Spiking Neural Networks [11.392893261073594]
Spiking Neural Networks (SNNs) can do inference with low power consumption due to their spike sparsity.
ANN-SNN conversion is an efficient way to achieve deep SNNs by converting well-trained Artificial Neural Networks (ANNs)
Existing methods commonly use constant threshold for conversion, which prevents neurons from rapidly delivering spikes to deeper layers.
arXiv Detail & Related papers (2023-03-23T07:18:08Z) - Knowledge Enhanced Neural Networks for relational domains [83.9217787335878]
We focus on a specific method, KENN, a Neural-Symbolic architecture that injects prior logical knowledge into a neural network.
In this paper, we propose an extension of KENN for relational data.
arXiv Detail & Related papers (2022-05-31T13:00:34Z) - Training High-Performance Low-Latency Spiking Neural Networks by
Differentiation on Spike Representation [70.75043144299168]
Spiking Neural Network (SNN) is a promising energy-efficient AI model when implemented on neuromorphic hardware.
It is a challenge to efficiently train SNNs due to their non-differentiability.
We propose the Differentiation on Spike Representation (DSR) method, which could achieve high performance.
arXiv Detail & Related papers (2022-05-01T12:44:49Z) - Training Feedback Spiking Neural Networks by Implicit Differentiation on
the Equilibrium State [66.2457134675891]
Spiking neural networks (SNNs) are brain-inspired models that enable energy-efficient implementation on neuromorphic hardware.
Most existing methods imitate the backpropagation framework and feedforward architectures for artificial neural networks.
We propose a novel training method that does not rely on the exact reverse of the forward computation.
arXiv Detail & Related papers (2021-09-29T07:46:54Z) - BackEISNN: A Deep Spiking Neural Network with Adaptive Self-Feedback and
Balanced Excitatory-Inhibitory Neurons [8.956708722109415]
Spiking neural networks (SNNs) transmit information through discrete spikes, which performs well in processing spatial-temporal information.
We propose a deep spiking neural network with adaptive self-feedback and balanced excitatory and inhibitory neurons (BackEISNN)
For the MNIST, FashionMNIST, and N-MNIST datasets, our model has achieved state-of-the-art performance.
arXiv Detail & Related papers (2021-05-27T08:38:31Z) - Skip-Connected Self-Recurrent Spiking Neural Networks with Joint
Intrinsic Parameter and Synaptic Weight Training [14.992756670960008]
We propose a new type of RSNN called Skip-Connected Self-Recurrent SNNs (ScSr-SNNs)
ScSr-SNNs can boost performance by up to 2.55% compared with other types of RSNNs trained by state-of-the-art BP methods.
arXiv Detail & Related papers (2020-10-23T22:27:13Z) - Progressive Tandem Learning for Pattern Recognition with Deep Spiking
Neural Networks [80.15411508088522]
Spiking neural networks (SNNs) have shown advantages over traditional artificial neural networks (ANNs) for low latency and high computational efficiency.
We propose a novel ANN-to-SNN conversion and layer-wise learning framework for rapid and efficient pattern recognition.
arXiv Detail & Related papers (2020-07-02T15:38:44Z) - Rectified Linear Postsynaptic Potential Function for Backpropagation in
Deep Spiking Neural Networks [55.0627904986664]
Spiking Neural Networks (SNNs) usetemporal spike patterns to represent and transmit information, which is not only biologically realistic but also suitable for ultra-low-power event-driven neuromorphic implementation.
This paper investigates the contribution of spike timing dynamics to information encoding, synaptic plasticity and decision making, providing a new perspective to design of future DeepSNNs and neuromorphic hardware systems.
arXiv Detail & Related papers (2020-03-26T11:13:07Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.