Fixed-Weight Difference Target Propagation
- URL: http://arxiv.org/abs/2212.10352v1
- Date: Mon, 19 Dec 2022 13:34:36 GMT
- Title: Fixed-Weight Difference Target Propagation
- Authors: Tatsukichi Shibuya, Nakamasa Inoue, Rei Kawakami, Ikuro Sato
- Abstract summary: We present Fixed-Weight Difference Target Propagation (FW-DTP) that keeps the feedback weights constant during training.
FW-DTP consistently achieves higher test performance than a baseline, the Difference Target Propagation (DTP), on four classification datasets.
We also present a novel propagation architecture that explains the exact form of the feedback function of DTP to analyze FW-DTP.
- Score: 12.559727665706687
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Target Propagation (TP) is a biologically more plausible algorithm than the
error backpropagation (BP) to train deep networks, and improving practicality
of TP is an open issue. TP methods require the feedforward and feedback
networks to form layer-wise autoencoders for propagating the target values
generated at the output layer. However, this causes certain drawbacks; e.g.,
careful hyperparameter tuning is required to synchronize the feedforward and
feedback training, and the feedback path usually requires more frequent updates
than the feedforward path. Learning the feedforward and feedback networks is
sufficient for TP methods to train, but is having these layer-wise autoencoders
a necessary condition for TP to work? We
answer this question by presenting Fixed-Weight Difference Target Propagation
(FW-DTP) that keeps the feedback weights constant during training. We confirmed
that this simple method, which naturally resolves the abovementioned problems
of TP, can still deliver informative target values to hidden layers for a given
task; indeed, FW-DTP consistently achieves higher test performance than a
baseline, the Difference Target Propagation (DTP), on four classification
datasets. We also present a novel propagation architecture that explains the
exact form of the feedback function of DTP to analyze FW-DTP.
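As a concrete illustration of the difference-correction rule with a frozen feedback path, the following is a minimal NumPy sketch of one FW-DTP-style target pass and local weight update. The layer sizes, tanh activations, step sizes, and random initialization are illustrative assumptions, not the paper's architecture or hyperparameters.

```python
import numpy as np

rng = np.random.default_rng(0)
sizes = [8, 16, 16, 4]                      # input, two hidden layers, output
W = [rng.normal(0, 0.5, (m, n)) for n, m in zip(sizes[:-1], sizes[1:])]
# Fixed feedback weights: G[l] maps layer (l+2) activations back to layer (l+1)
# and is never updated, which is the defining change of FW-DTP over DTP.
G = [rng.normal(0, 0.5, (n, m)) for n, m in zip(sizes[1:-1], sizes[2:])]

def forward(x):
    hs = [x]
    for Wl in W:
        hs.append(np.tanh(Wl @ hs[-1]))
    return hs

def feedback(l, h):
    """Fixed feedback function for the (l+1, l+2) layer pair."""
    return np.tanh(G[l] @ h)

x, y = rng.normal(size=sizes[0]), rng.normal(size=sizes[-1])
hs = forward(x)

# Output target: nudge the output activation down the squared-error gradient.
eta = 0.1
targets = {len(W): hs[-1] - eta * (hs[-1] - y)}

# Difference correction: t_k = g(t_{k+1}) + h_k - g(h_{k+1}) for hidden layers.
for l in range(len(G) - 1, -1, -1):
    targets[l + 1] = (feedback(l, targets[l + 2])
                      + hs[l + 1] - feedback(l, hs[l + 2]))

# Each feedforward layer takes a local step toward its own target.
lr = 0.01
for l in range(len(W)):
    pre = W[l] @ hs[l]
    err = (np.tanh(pre) - targets[l + 1]) * (1.0 - np.tanh(pre) ** 2)
    W[l] -= lr * np.outer(err, hs[l])
```

Standard DTP would additionally train each G[l] (typically with a layer-wise reconstruction loss, and often more frequently than the forward weights); FW-DTP simply omits that feedback training step.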
Related papers
- Sparse is Enough in Fine-tuning Pre-trained Large Language Models [98.46493578509039]
We propose a gradient-based sparse fine-tuning algorithm, named Sparse Increment Fine-Tuning (SIFT).
We validate its effectiveness on a range of tasks including the GLUE Benchmark and Instruction-tuning.
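The summary above only names the idea, so here is a rough sketch of what a gradient-based sparse update can look like: only the small fraction of parameter entries with the largest gradient magnitudes are changed, and the rest stay frozen. The top-magnitude selection rule and all names below are our illustrative assumptions, not SIFT's actual procedure.

```python
import numpy as np

def sparse_update(param, grad, lr=1e-3, density=0.01):
    """Apply a gradient step only to the `density` fraction of entries
    with the largest gradient magnitude; all other entries stay frozen."""
    k = max(1, int(density * grad.size))
    threshold = np.partition(np.abs(grad).ravel(), -k)[-k]
    mask = np.abs(grad) >= threshold
    return param - lr * grad * mask

rng = np.random.default_rng(0)
weight = rng.normal(size=(512, 512))   # stand-in for a pretrained weight matrix
grad = rng.normal(size=(512, 512))     # stand-in for a fine-tuning gradient
weight = sparse_update(weight, grad)   # roughly 1% of the entries change
```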
arXiv Detail & Related papers (2023-12-19T06:06:30Z) - Fast Trainable Projection for Robust Fine-Tuning [36.51660287722338]
Robust fine-tuning aims to achieve competitive in-distribution (ID) performance while preserving out-of-distribution (OOD) robustness.
Projection-based fine-tuning has been successfully used in robust fine-tuning.
Fast Trainable Projection is a new projection-based fine-tuning algorithm.
arXiv Detail & Related papers (2023-10-29T22:52:43Z) - Layer-wise Feedback Propagation [53.00944147633484]
We present Layer-wise Feedback Propagation (LFP), a novel training approach for neural-network-like predictors.
LFP assigns rewards to individual connections based on their respective contributions to solving a given task.
We demonstrate its effectiveness in achieving comparable performance to gradient descent on various models and datasets.
arXiv Detail & Related papers (2023-08-23T10:48:28Z) - Boosting Distributed Machine Learning Training Through Loss-tolerant
Transmission Protocol [11.161913989794257]
Distributed Machine Learning (DML) systems are used to speed up model training in data centers (DCs) and at edge nodes.
The Parameter Server (PS) communication architecture suffers from severe long-tail latency caused by many-to-one "incast" traffic patterns, which hurts training throughput.
The Loss-tolerant Transmission Protocol (LTP) allows partial loss of gradients during synchronization to avoid unneeded retransmissions.
Early Close adjusts the loss-tolerant threshold based on network conditions.
arXiv Detail & Related papers (2023-05-07T14:01:52Z) - Position-guided Text Prompt for Vision-Language Pre-training [121.15494549650548]
We propose a novel Position-guided Text Prompt (PTP) paradigm to enhance the visual grounding ability of cross-modal models trained with Vision-Language Pre-Training.
PTP reformulates visual grounding as a fill-in-the-blank problem over a position-guided text prompt, encouraging the model to predict the objects in given blocks or to regress the blocks of a given object.
PTP achieves results comparable to object-detector-based methods with much faster inference, since PTP discards the object detector at inference time while the latter cannot.
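As a rough illustration of the fill-in-the-blank formulation, the toy snippet below builds prompts of the form "The block P has a O" from objects detected in image blocks, masking either the object or the block index. The template wording, grid indexing, and helper names are our assumptions for illustration.

```python
def ptp_prompts(block_objects, mask_token="[MASK]"):
    """Build position-guided fill-in-the-blank prompts from a mapping of
    image-block index -> detected object name (toy example)."""
    prompts = []
    for block_id, obj in block_objects.items():
        # Predict the object that appears in a given block ...
        prompts.append(f"The block {block_id} has a {mask_token}.")
        # ... or regress the block that contains a given object.
        prompts.append(f"The block {mask_token} has a {obj}.")
    return prompts

# Objects a detector might have found in a 3 x 3 grid during pre-training.
for p in ptp_prompts({0: "dog", 4: "frisbee", 8: "tree"}):
    print(p)
```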
arXiv Detail & Related papers (2022-12-19T18:55:43Z) - Trainability Preserving Neural Structured Pruning [64.65659982877891]
We present trainability preserving pruning (TPP), a regularization-based structured pruning method that can effectively maintain trainability during sparsification.
TPP can compete with the ground-truth dynamical isometry recovery method on linear networks.
It delivers encouraging performance in comparison to many top-performing filter pruning methods.
arXiv Detail & Related papers (2022-07-25T21:15:47Z) - DeepTPI: Test Point Insertion with Deep Reinforcement Learning [6.357061090668433]
Test point insertion (TPI) is a widely used technique for testability enhancement.
We propose a novel TPI approach based on deep reinforcement learning (DRL), named DeepTPI.
We show that DeepTPI significantly improves test coverage compared to the commercial DFT tool.
arXiv Detail & Related papers (2022-06-07T14:13:42Z) - Towards Scaling Difference Target Propagation by Learning Backprop
Targets [64.90165892557776]
Difference Target Propagation is a biologically plausible learning algorithm with a close relation to Gauss-Newton (GN) optimization.
We propose a novel feedback weight training scheme that ensures both that DTP approximates BP and that layer-wise feedback weight training can be restored.
We report the best performance ever achieved by DTP on CIFAR-10 and ImageNet.
arXiv Detail & Related papers (2022-01-31T18:20:43Z) - Target Propagation via Regularized Inversion [4.289574109162585]
We present a simple version of target propagation based on regularized inversion of network layers, easily implementable in a differentiable programming framework.
We show how our TP can be used to train recurrent neural networks with long sequences on various sequence modeling problems.
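A minimal sketch, under our own assumptions, of how a hidden-layer target could be obtained by regularized inversion of a single tanh layer: solve a small least-squares problem that trades off inverting the layer against staying close to the current activation. The gradient-descent inner solver, regularization weight, and step counts are illustrative, not the paper's choices.

```python
import numpy as np

rng = np.random.default_rng(1)
W = rng.normal(0, 0.5, (4, 6))          # one feedforward layer f(h) = tanh(W h)

def inverted_target(t_next, h_prev, lam=0.1, steps=200, lr=0.05):
    """Approximately solve
    argmin_h ||tanh(W h) - t_next||^2 + lam * ||h - h_prev||^2
    by gradient descent, starting from the current activation h_prev."""
    h = h_prev.copy()
    for _ in range(steps):
        a = np.tanh(W @ h)
        grad = 2 * W.T @ ((a - t_next) * (1 - a ** 2)) + 2 * lam * (h - h_prev)
        h -= lr * grad
    return h

h_prev = rng.normal(size=6)             # current activation entering the layer
t_next = np.tanh(W @ h_prev) - 0.1      # an example (nudged) target for its output
print(inverted_target(t_next, h_prev))  # regularized-inverse target for h_prev
```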
arXiv Detail & Related papers (2021-12-02T17:49:25Z) - A Theoretical Framework for Target Propagation [75.52598682467817]
We analyze target propagation (TP), a popular but not yet fully understood alternative to backpropagation (BP).
Our theory shows that TP is closely related to Gauss-Newton optimization and thus substantially differs from BP.
We provide a first solution to this problem through a novel reconstruction loss that improves feedback weight training.
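To make the stated Gauss-Newton connection slightly more concrete, one common way to summarize it is sketched below in our own notation (not the paper's): with idealized, exactly inverting feedback, the hidden-layer target increment applies a pseudoinverse of the forward Jacobian to the output error, whereas backpropagation applies its transpose.

```latex
% Our shorthand, not the paper's notation: J_{\ell,L} is the Jacobian of the
% map from hidden layer \ell to the output, and e is the output error vector.
\[
  \Delta h_\ell \;\propto\; J_{\ell,L}^{\dagger}\, e
  \quad \text{(TP with ideal inverses: Gauss-Newton-like)}
  \qquad
  \Delta h_\ell \;\propto\; J_{\ell,L}^{\top}\, e
  \quad \text{(BP: gradient)}
\]
```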
arXiv Detail & Related papers (2020-06-25T12:07:06Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences of its use.