CST: Calibration Side-Tuning for Parameter and Memory Efficient Transfer Learning
- URL: http://arxiv.org/abs/2402.12736v1
- Date: Tue, 20 Feb 2024 06:01:31 GMT
- Title: CST: Calibration Side-Tuning for Parameter and Memory Efficient Transfer Learning
- Authors: Feng Chen
- Abstract summary: This paper introduces a lightweight fine-tuning strategy called Calibration side tuning.
It incorporates aspects of adapter tuning and side tuning to adapt techniques that have proven successful in transformers for use with ResNet.
The paper also analyzes multiple fine-tuning strategies and implements them within ResNet.
- Score: 4.776619551860301
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Achieving universally high accuracy in object detection is quite challenging, and the mainstream focus in the industry currently lies on detecting specific classes of objects. However, deploying one or multiple object detection networks requires a certain amount of GPU memory for training and storage capacity for inference. This raises the question of how to effectively coordinate multiple object detection tasks under resource-constrained conditions. This paper introduces a lightweight fine-tuning strategy called Calibration side tuning, which integrates aspects of adapter tuning and side tuning to adapt the successful techniques employed in transformers for use with ResNet. The Calibration side tuning architecture incorporates maximal transition calibration, using a small number of additional parameters to enhance network performance while maintaining a smooth training process. Furthermore, this paper analyzes multiple fine-tuning strategies and implements them within ResNet, thereby expanding the research on fine-tuning strategies for object detection networks. In addition, extensive experiments were carried out on five benchmark datasets. The experimental results demonstrate that this method outperforms other state-of-the-art techniques and achieves a better balance between the complexity and performance of fine-tuning schemes.
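The abstract describes the method only at a high level: a frozen ResNet backbone, a lightweight trainable side path in the spirit of adapter tuning and side tuning, and a calibration step that blends backbone and side features. No implementation details are given here, so the following is a minimal sketch of that general pattern, not the authors' code: the SideBlock design, the per-stage gates alpha, and all other names are illustrative assumptions.

```python
# Minimal side-tuning sketch for a frozen ResNet backbone (an assumption-laden
# illustration of the general adapter/side-tuning pattern, NOT the paper's
# exact Calibration Side-Tuning or its "maximal transition calibration").
import torch
import torch.nn as nn
from torchvision.models import resnet50

class SideBlock(nn.Module):
    """Lightweight adapter-style block: 1x1 bottleneck with a residual path."""
    def __init__(self, channels: int, reduction: int = 8):
        super().__init__()
        hidden = max(channels // reduction, 8)
        self.body = nn.Sequential(
            nn.Conv2d(channels, hidden, kernel_size=1, bias=False),
            nn.BatchNorm2d(hidden),
            nn.ReLU(inplace=True),
            nn.Conv2d(hidden, channels, kernel_size=1, bias=False),
            nn.BatchNorm2d(channels),
        )

    def forward(self, x):
        return x + self.body(x)

class SideTunedResNet(nn.Module):
    """Frozen ResNet-50 trunk plus a trainable side path, blended per stage."""
    def __init__(self, num_classes: int):
        super().__init__()
        self.backbone = resnet50(weights="IMAGENET1K_V2")
        for p in self.backbone.parameters():
            p.requires_grad = False  # only side blocks, gates, and head train

        stage_channels = [256, 512, 1024, 2048]  # ResNet-50 stage widths
        self.side = nn.ModuleList(SideBlock(c) for c in stage_channels)
        # One learnable blending gate per stage; sigmoid keeps it in (0, 1).
        self.alpha = nn.Parameter(torch.zeros(len(stage_channels)))
        self.head = nn.Linear(stage_channels[-1], num_classes)

    def forward(self, x):
        b = self.backbone
        x = b.maxpool(b.relu(b.bn1(b.conv1(x))))
        for i, stage in enumerate([b.layer1, b.layer2, b.layer3, b.layer4]):
            x = stage(x)
            gate = torch.sigmoid(self.alpha[i])
            x = gate * x + (1 - gate) * self.side[i](x)  # blend trunk and side
        x = torch.flatten(b.avgpool(x), 1)
        return self.head(x)

model = SideTunedResNet(num_classes=20)
trainable = sum(p.numel() for p in model.parameters() if p.requires_grad)
total = sum(p.numel() for p in model.parameters())
print(f"trainable: {trainable:,} of {total:,} params ({trainable / total:.1%})")
```

Under this setup only the side blocks, the gates, and the classification head receive gradients and optimizer state, which is where the parameter and memory savings the abstract claims would come from.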
Related papers
- Task-Oriented Real-time Visual Inference for IoVT Systems: A Co-design Framework of Neural Networks and Edge Deployment [61.20689382879937]
  Task-oriented edge computing addresses resource constraints by shifting data analysis to the edge.
  Existing methods struggle to balance high model performance with low resource consumption.
  We propose a novel co-design framework to optimize neural network architecture.
  arXiv Detail & Related papers (2024-10-29T19:02:54Z)
- A survey on efficient vision transformers: algorithms, techniques, and performance benchmarking [19.65897437342896]
  Vision Transformer (ViT) architectures are becoming increasingly popular and widely employed to tackle computer vision applications.
  This paper mathematically defines the strategies used to make Vision Transformers efficient, describes and discusses state-of-the-art methodologies, and analyzes their performance over different application scenarios.
  arXiv Detail & Related papers (2023-09-05T08:21:16Z)
- Transferability Metrics for Object Detection [0.0]
  Transfer learning aims to make the most of existing pre-trained models to achieve better performance on a new task in limited-data scenarios.
  We extend transferability metrics to object detection using ROI-Align and TLogME.
  We show that TLogME provides a robust correlation with transfer performance and outperforms other transferability metrics on local and global level features.
  arXiv Detail & Related papers (2023-06-27T08:49:31Z)
- Pointerformer: Deep Reinforced Multi-Pointer Transformer for the Traveling Salesman Problem [67.32731657297377]
  The Traveling Salesman Problem (TSP) is a classic routing optimization problem originally arising in the domain of transportation and logistics.
  Recently, Deep Reinforcement Learning (DRL) has been increasingly employed to solve TSP due to its high inference efficiency.
  We propose a novel end-to-end DRL approach, referred to as Pointerformer, based on a multi-pointer Transformer.
  arXiv Detail & Related papers (2023-04-19T03:48:32Z)
- Energy-efficient Task Adaptation for NLP Edge Inference Leveraging Heterogeneous Memory Architectures [68.91874045918112]
  The proposed adapter-ALBERT is an efficient model optimization that enables maximal data reuse across different tasks.
  We demonstrate the advantage of mapping the model to a heterogeneous on-chip memory architecture by performing simulations on a validated NLP edge accelerator.
  arXiv Detail & Related papers (2023-03-25T14:40:59Z)
- Resource-Efficient Invariant Networks: Exponential Gains by Unrolled Optimization [8.37077056358265]
  We propose a new computational primitive for building invariant networks, based instead on optimization.
  We provide empirical and theoretical corroboration of the efficiency gains and soundness of our proposed method.
  We demonstrate its utility in constructing an efficient invariant network for a simple hierarchical object detection task.
  arXiv Detail & Related papers (2022-03-09T19:04:08Z)
- Fully Quantized Image Super-Resolution Networks [81.75002888152159]
  We propose a Fully Quantized image Super-Resolution framework (FQSR) to jointly optimize efficiency and accuracy.
  We apply our quantization scheme to multiple mainstream super-resolution architectures, including SRResNet, SRGAN and EDSR.
  Our FQSR using low-bit quantization can achieve performance on par with its full-precision counterparts on five benchmark datasets.
  arXiv Detail & Related papers (2020-11-29T03:53:49Z)
- One-Shot Object Detection without Fine-Tuning [62.39210447209698]
  We introduce a two-stage model consisting of a first-stage Matching-FCOS network and a second-stage Structure-Aware Relation Module.
  We also propose novel training strategies that effectively improve detection performance.
  Our method exceeds state-of-the-art one-shot performance consistently on multiple datasets.
  arXiv Detail & Related papers (2020-05-08T01:59:23Z)
- Intra Order-preserving Functions for Calibration of Multi-Class Neural Networks [54.23874144090228]
  A common approach is to learn a post-hoc calibration function that transforms the output of the original network into calibrated confidence scores (a generic example of such a function is sketched after this list).
  Previous post-hoc calibration techniques work only with simple calibration functions.
  We propose a new neural network architecture that represents a class of intra order-preserving functions.
  arXiv Detail & Related papers (2020-03-15T12:57:21Z)
- Experimental adaptive Bayesian estimation of multiple phases with limited data [0.0]
  Adaptive protocols, exploiting additional control parameters, provide a tool to optimize the performance of a quantum sensor in such a limited-data regime.
  Finding the optimal strategies to tune the control parameters during the estimation process is a non-trivial problem, and machine learning techniques are a natural solution to address such a task.
  We employ a compact and flexible integrated photonic circuit, fabricated by femtosecond laser writing, which allows different strategies to be implemented with a high degree of control.
  arXiv Detail & Related papers (2020-02-04T11:32:32Z)
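The calibration entry above refers to post-hoc calibration functions. As a concrete point of reference for that entry, here is the standard simple baseline, temperature scaling; it is not the intra order-preserving construction that paper proposes, and the names below are illustrative.

```python
# Temperature scaling: the simplest post-hoc calibration function. It learns
# one scalar T > 0 on held-out logits; dividing logits by T never reorders
# them, so accuracy is unchanged and only confidence scores move.
import torch
import torch.nn as nn

class TemperatureScaler(nn.Module):
    def __init__(self):
        super().__init__()
        self.log_t = nn.Parameter(torch.zeros(1))  # T = exp(log_t) > 0

    def forward(self, logits):
        return logits / torch.exp(self.log_t)

def fit_temperature(logits, labels, steps=200, lr=0.05):
    """Fit T on validation logits by minimizing NLL; the base network
    itself is left untouched."""
    scaler = TemperatureScaler()
    opt = torch.optim.Adam(scaler.parameters(), lr=lr)
    loss_fn = nn.CrossEntropyLoss()
    for _ in range(steps):
        opt.zero_grad()
        loss = loss_fn(scaler(logits), labels)
        loss.backward()
        opt.step()
    return scaler

# Toy usage with random stand-in logits (real use: validation-set outputs).
logits = torch.randn(512, 10) * 3.0
labels = torch.randint(0, 10, (512,))
scaler = fit_temperature(logits, labels)
print("learned temperature:", torch.exp(scaler.log_t).item())
```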