Unidirectional Thin Adapter for Efficient Adaptation of Deep Neural
Networks
- URL: http://arxiv.org/abs/2203.10463v2
- Date: Wed, 23 Mar 2022 11:45:37 GMT
- Title: Unidirectional Thin Adapter for Efficient Adaptation of Deep Neural
Networks
- Authors: Han Gyel Sun, Hyunjae Ahn, HyunGyu Lee, Injung Kim
- Abstract summary: We propose a new adapter network for adapting a pre-trained deep neural network to a target domain with minimal computation.
The proposed model, unidirectional thin adapter (UDTA), helps the classifier adapt to new data by providing auxiliary features that complement the backbone network.
In experiments on five fine-grained classification datasets, UDTA significantly reduced computation and training time required for backpropagation.
- Score: 5.995023738151625
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: In this paper, we propose a new adapter network for adapting a pre-trained
deep neural network to a target domain with minimal computation. The proposed
model, unidirectional thin adapter (UDTA), helps the classifier adapt to new
data by providing auxiliary features that complement the backbone network. UDTA
takes outputs from multiple layers of the backbone as input features but does
not transmit any feature to the backbone. As a result, UDTA can learn without
computing the gradient of the backbone, which significantly reduces the
computation required for training. In addition, since UDTA learns the target
task without modifying the backbone, a single backbone can adapt to multiple
tasks by training only a separate UDTA for each task. In experiments on five fine-grained classification datasets
consisting of a small number of samples, UDTA significantly reduced computation
and training time required for backpropagation while showing comparable or even
improved accuracy compared with conventional adapter models.
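The abstract describes the mechanism concretely enough to sketch. Below is a minimal, hypothetical PyTorch sketch of the idea: intermediate backbone features are tapped, detached so no backbone gradient is ever computed, passed through thin adapter branches, and concatenated with the backbone feature before the classifier. The ResNet-50 backbone, the tapped layers, the adapter width, and concatenation as the fusion method are illustrative assumptions, not the authors' exact design.
```python
import torch
import torch.nn as nn
import torch.nn.functional as F
from torchvision.models import resnet50

class UDTASketch(nn.Module):
    """Thin adapter reading detached intermediate backbone features (sketch)."""

    def __init__(self, num_classes, adapter_dim=64):
        super().__init__()
        self.backbone = resnet50(weights="IMAGENET1K_V1")  # pre-trained, frozen
        self.backbone.requires_grad_(False)

        # Capture intermediate features via forward hooks (assumed tap points).
        self._taps = {}
        self._tap_names = ["layer2", "layer3", "layer4"]
        for name in self._tap_names:
            getattr(self.backbone, name).register_forward_hook(self._make_hook(name))

        tap_channels = [512, 1024, 2048]  # ResNet-50 stage output widths
        self.adapters = nn.ModuleList([
            nn.Sequential(nn.AdaptiveAvgPool2d(1), nn.Flatten(),
                          nn.Linear(c, adapter_dim), nn.ReLU())
            for c in tap_channels])
        self.classifier = nn.Linear(2048 + adapter_dim * len(tap_channels), num_classes)

    def _make_hook(self, key):
        def hook(module, inputs, output):
            self._taps[key] = output.detach()   # no gradient flows to the backbone
        return hook

    def train(self, mode=True):
        super().train(mode)
        self.backbone.eval()                    # keep frozen BN statistics
        return self

    def forward(self, x):
        with torch.no_grad():                   # backbone gradients never computed
            self.backbone(x)
        feats = [self._taps[k] for k in self._tap_names]
        pooled = F.adaptive_avg_pool2d(feats[-1], 1).flatten(1)   # backbone feature
        aux = torch.cat([adapt(f) for adapt, f in zip(self.adapters, feats)], dim=1)
        return self.classifier(torch.cat([pooled, aux], dim=1))
```
In this sketch only the adapter and classifier parameters are handed to the optimizer, e.g. `torch.optim.Adam(list(model.adapters.parameters()) + list(model.classifier.parameters()), lr=1e-3)`, so backpropagation never touches the backbone.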
Related papers
- PRUNE: A Patching Based Repair Framework for Certifiable Unlearning of Neural Networks [3.2845881871629095]
It is desirable to remove (a.k.a. unlearn) a specific part of the training data from a trained neural network model.
Existing unlearning methods involve training alternative models with the remaining data.
We propose a novel unlearning approach that imposes a carefully crafted "patch" on the original neural network to achieve targeted "forgetting" of the data requested for deletion.
arXiv Detail & Related papers (2025-05-10T05:35:08Z) - Adaptive Adapter Routing for Long-Tailed Class-Incremental Learning [55.384428765798496]
New data, such as e-commerce platform reviews, exhibits a long-tailed distribution.
This necessitates continual model learning on imbalanced data without forgetting.
We introduce AdaPtive Adapter RouTing (APART) as an exemplar-free solution for long-tailed class-incremental learning (LTCIL).
arXiv Detail & Related papers (2024-09-11T17:52:00Z) - Just How Flexible are Neural Networks in Practice? [89.80474583606242]
It is widely believed that a neural network can fit a training set containing at least as many samples as it has parameters.
In practice, however, we only find solutions accessible via our training procedure, including its optimizer and regularizers, which limits flexibility.
arXiv Detail & Related papers (2024-06-17T12:24:45Z) - KAKURENBO: Adaptively Hiding Samples in Deep Neural Network Training [2.8804804517897935]
We propose a method for hiding the least-important samples during the training of deep neural networks.
We adaptively find samples to exclude in a given epoch based on their contribution to the overall learning process.
Our method can reduce total training time by up to 22% while impacting accuracy by only 0.4% compared with the baseline.
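The entry above describes the mechanism only at a high level; the sketch below illustrates one way such sample hiding could look in a training loop. Using the last observed per-sample loss as the "contribution" score and a fixed hide fraction are assumptions for illustration, not the paper's exact criterion or schedule (device handling is omitted).
```python
import torch
from torch.utils.data import DataLoader, Dataset, Subset

class IndexedDataset(Dataset):
    """Wraps a dataset so each item also carries its global index."""
    def __init__(self, base):
        self.base = base
    def __len__(self):
        return len(self.base)
    def __getitem__(self, i):
        x, y = self.base[i]
        return x, y, i

def train_hiding_samples(model, dataset, optimizer, epochs=10,
                         hide_fraction=0.2, batch_size=64):
    data = IndexedDataset(dataset)
    n = len(data)
    scores = torch.full((n,), float("inf"))    # unseen samples are never hidden
    for epoch in range(epochs):
        # Hide the lowest-scoring samples this epoch (keep everything in epoch 0).
        k = 0 if epoch == 0 else int(hide_fraction * n)
        hidden = set(torch.argsort(scores)[:k].tolist())
        visible = [i for i in range(n) if i not in hidden]
        loader = DataLoader(Subset(data, visible), batch_size=batch_size, shuffle=True)

        model.train()
        for x, y, idx in loader:
            optimizer.zero_grad()
            per_sample = torch.nn.functional.cross_entropy(model(x), y, reduction="none")
            per_sample.mean().backward()
            optimizer.step()
            scores[idx] = per_sample.detach().cpu()   # refresh per-sample scores
```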
arXiv Detail & Related papers (2023-10-16T06:19:29Z) - Parameter-Efficient Sparse Retrievers and Rerankers using Adapters [4.9545244468634655]
We study adapters for SPLADE, a sparse retriever, showing that adapters retain the efficiency and effectiveness otherwise achieved by fine-tuning.
We also address domain adaptation of neural retrieval via adapters, evaluating on cross-domain BEIR datasets and TripClick.
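For context on the adapter modules referenced in this and other entries, here is a generic bottleneck adapter of the kind commonly inserted into a frozen transformer layer so that only the adapter is trained. It is not SPLADE-specific; the bottleneck size and the zero-initialised up-projection are illustrative choices, not taken from the paper.
```python
import torch
import torch.nn as nn

class BottleneckAdapter(nn.Module):
    """Residual down/up projection trained while the host layer stays frozen."""
    def __init__(self, dim, bottleneck=64):
        super().__init__()
        self.down = nn.Linear(dim, bottleneck)
        self.up = nn.Linear(bottleneck, dim)
        nn.init.zeros_(self.up.weight)   # start as an identity mapping
        nn.init.zeros_(self.up.bias)

    def forward(self, hidden):
        return hidden + self.up(torch.relu(self.down(hidden)))
```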
arXiv Detail & Related papers (2023-03-23T12:34:30Z) - Prompt Tuning for Parameter-efficient Medical Image Segmentation [79.09285179181225]
We propose and investigate several contributions to achieve parameter-efficient yet effective adaptation for semantic segmentation on two medical imaging datasets.
We pre-train this architecture with a dedicated dense self-supervision scheme based on assignments to online generated prototypes.
We demonstrate that the resulting neural network model is able to attenuate the gap between fully fine-tuned and parameter-efficiently adapted models.
arXiv Detail & Related papers (2022-11-16T21:55:05Z) - Neural Implicit Dictionary via Mixture-of-Expert Training [111.08941206369508]
We present a generic INR framework that achieves both data and training efficiency by learning a Neural Implicit Dictionary (NID)
Our NID assembles a group of coordinate-based implicit networks that are tuned to span the desired function space.
Our experiments show that NID can reconstruct 2D images or 3D scenes up to two orders of magnitude faster with up to 98% less input data.
arXiv Detail & Related papers (2022-07-08T05:07:19Z) - Contextual HyperNetworks for Novel Feature Adaptation [43.49619456740745]
A Contextual HyperNetwork (CHN) generates the parameters needed to extend the base model to a new feature.
At prediction time, the CHN requires only a single forward pass through a neural network, yielding a significant speed-up.
We show that this system obtains improved few-shot learning performance for novel features over existing imputation and meta-learning baselines.
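The hypernetwork idea named in the entry above can be sketched briefly: a small network maps a context vector describing a new feature to the parameters of an extension module (here a single linear layer), so the base model can be extended in one forward pass rather than retrained. The context encoding and the generated module are assumptions for illustration, not the paper's exact architecture.
```python
import torch
import torch.nn as nn

class HyperNetForNewFeature(nn.Module):
    def __init__(self, context_dim, in_dim, out_dim, hidden=128):
        super().__init__()
        self.in_dim, self.out_dim = in_dim, out_dim
        # Maps a context vector to a flat (weight, bias) vector.
        self.net = nn.Sequential(
            nn.Linear(context_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, in_dim * out_dim + out_dim))

    def forward(self, context, x):
        params = self.net(context)                            # (in*out + out,)
        w = params[: self.in_dim * self.out_dim].view(self.out_dim, self.in_dim)
        b = params[self.in_dim * self.out_dim:]
        return torch.nn.functional.linear(x, w, b)            # generated layer

# Usage: generate a head for a previously unseen feature from its context vector.
hyper = HyperNetForNewFeature(context_dim=8, in_dim=16, out_dim=1)
out = hyper(torch.randn(8), torch.randn(4, 16))               # shape (4, 1)
```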
arXiv Detail & Related papers (2021-04-12T23:19:49Z) - A Hybrid Method for Training Convolutional Neural Networks [3.172761915061083]
We propose a hybrid method that uses both backpropagation and evolutionary strategies to train Convolutional Neural Networks.
We show that the proposed hybrid method is capable of improving upon regular training in the task of image classification.
arXiv Detail & Related papers (2020-04-15T17:52:48Z) - Large-Scale Gradient-Free Deep Learning with Recursive Local
Representation Alignment [84.57874289554839]
Training deep neural networks on large-scale datasets requires significant hardware resources.
Backpropagation, the workhorse for training these networks, is an inherently sequential process that is difficult to parallelize.
We propose a neuro-biologically-plausible alternative to backprop that can be used to train deep networks.
arXiv Detail & Related papers (2020-02-10T16:20:02Z) - Side-Tuning: A Baseline for Network Adaptation via Additive Side
Networks [95.51368472949308]
Adaptation can be useful in cases when training data is scarce, or when one wishes to encode priors in the network.
In this paper, we propose a straightforward alternative: side-tuning.
arXiv Detail & Related papers (2019-12-31T18:52:32Z)
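The Side-Tuning entry above names an additive formulation that is simple enough to sketch: a frozen base network's features are blended with a small trainable side network before the task head. The learnable sigmoid-gated blend and the toy module sizes in the usage example are common, illustrative choices, not a reproduction of the paper's exact models.
```python
import torch
import torch.nn as nn

class SideTuned(nn.Module):
    def __init__(self, base: nn.Module, side: nn.Module, head: nn.Module):
        super().__init__()
        self.base = base.requires_grad_(False)    # pre-trained, frozen
        self.side = side                          # small, trainable side network
        self.head = head                          # task head
        self.alpha = nn.Parameter(torch.zeros(1)) # blending weight (via sigmoid)

    def forward(self, x):
        with torch.no_grad():
            base_feat = self.base(x)
        a = torch.sigmoid(self.alpha)
        fused = a * base_feat + (1 - a) * self.side(x)
        return self.head(fused)

# Usage with toy MLPs (base and side must produce matching feature dimensions):
model = SideTuned(base=nn.Sequential(nn.Linear(32, 16), nn.ReLU()),
                  side=nn.Sequential(nn.Linear(32, 16), nn.ReLU()),
                  head=nn.Linear(16, 10))
logits = model(torch.randn(4, 32))
```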