Related papers: Transducer Tuning: Efficient Model Adaptation for Software Tasks Using Code Property Graphs

Transducer Tuning: Efficient Model Adaptation for Software Tasks Using Code Property Graphs

URL: http://arxiv.org/abs/2412.13467v1
Date: Wed, 18 Dec 2024 03:25:17 GMT
Title: Transducer Tuning: Efficient Model Adaptation for Software Tasks Using Code Property Graphs
Authors: Imam Nur Bani Yusuf, Lingxiao Jiang,
Abstract summary: approach is a technique to adapt large models for downstream code tasks using Code Property Graphs (CPGs)<n>Our approach introduces a modular component called transducer that enriches code embeddings with structural and dependency information from CPGs.<n>Our results demonstrate competitive performance compared to full parameter fine-tuning while reducing up to 99% trainable parameters to save memory.
Score: 8.26418657158164
License: http://creativecommons.org/licenses/by-sa/4.0/
Abstract: Large language models have demonstrated promising performance across various software engineering tasks. While fine-tuning is a common practice to adapt these models for downstream tasks, it becomes challenging in resource-constrained environments due to increased memory requirements from growing trainable parameters in increasingly large language models. We introduce \approach, a technique to adapt large models for downstream code tasks using Code Property Graphs (CPGs). Our approach introduces a modular component called \transducer that enriches code embeddings with structural and dependency information from CPGs. The Transducer comprises two key components: Graph Vectorization Engine (GVE) and Attention-Based Fusion Layer (ABFL). GVE extracts CPGs from input source code and transforms them into graph feature vectors. ABFL then fuses those graphs feature vectors with initial code embeddings from a large language model. By optimizing these transducers for different downstream tasks, our approach enhances the models without the need to fine-tune them for specific tasks. We have evaluated \approach on three downstream tasks: code summarization, assert generation, and code translation. Our results demonstrate competitive performance compared to full parameter fine-tuning while reducing up to 99\% trainable parameters to save memory. \approach also remains competitive against other fine-tuning approaches (e.g., LoRA, Prompt-Tuning, Prefix-Tuning) while using only 1.5\%-80\% of their trainable parameters. Our findings show that integrating structural and dependency information through Transducer Tuning enables more efficient model adaptation, making it easier for users to adapt large models in resource-constrained settings.

Related papers

VecFormer: Towards Efficient and Generalizable Graph Transformer with Graph Token Attention [61.96837866507746]
VecFormer is an efficient and highly generalizable model for node classification.<n>VecFormer outperforms the existing Graph Transformer in both performance and speed.
arXiv Detail & Related papers (2026-02-23T09:10:39Z)
Controllable-LPMoE: Adapting to Challenging Object Segmentation via Dynamic Local Priors from Mixture-of-Experts [16.21786310193235]
We propose a novel dynamic priors-based fine-tuning paradigm with fewer trainable parameters, dubbed Controllable-LPMoE.<n>We construct a lightweight dynamic mixed local priors extractor that captures diverse local priors from input images through heterogeneous convolutions.<n>We also design a bi-directional interaction adapter that employs cosine-aligned deformable attention and channel-oriented adaptive scale enhancement.
arXiv Detail & Related papers (2025-10-24T03:03:59Z)
Optimal Sparsity of Mixture-of-Experts Language Models for Reasoning Tasks [17.067788440109137]
Mixture-of-Experts (MoE) models are now standard in state-of-the-art systems.<n>We investigate how MoE sparsity influences two distinct capability regimes: memorization skills and reasoning skills.
arXiv Detail & Related papers (2025-08-26T04:31:28Z)
Contextual Graph Transformer: A Small Language Model for Enhanced Engineering Document Information Extraction [0.0]
Contextual Graph Transformer (CGT) is a hybrid neural architecture that combines Graph Neural Networks (GNNs) and Transformers for domain-specific question answering.<n>CGT constructs a dynamic graph over input tokens using sequential, skip-gram, and semantic similarity edges.<n>It outperforms baselines like GPT-2 and BERT, achieving 24.7% higher accuracy than GPT-2 with 62.4% fewer parameters.
arXiv Detail & Related papers (2025-08-04T15:41:35Z)
Hyper Compressed Fine-Tuning of Large Foundation Models with Quantum Inspired Adapters [0.0]
emphQuantum-Inspired Adapters, a PEFT approach inspired by Hamming-weight quantum circuits from quantum machine learning literature. We test our proposed adapters by adapting large language models and large vision transformers on benchmark datasets.
arXiv Detail & Related papers (2025-02-10T13:06:56Z)
Reference Trustable Decoding: A Training-Free Augmentation Paradigm for Large Language Models [79.41139393080736]
Large language models (LLMs) have rapidly advanced and demonstrated impressive capabilities. In-Context Learning (ICL) and. Efficient Fine-Tuning (PEFT) are currently two mainstream methods for augmenting. LLMs to downstream tasks. We propose Reference Trustable Decoding (RTD), a paradigm that allows models to quickly adapt to new tasks without fine-tuning.
arXiv Detail & Related papers (2024-09-30T10:48:20Z)
Adapter-X: A Novel General Parameter-Efficient Fine-Tuning Framework for Vision [52.80792724919329]
We introduce a novel framework named Adapter-X to improve fine-tuning in 2D image and 3D point cloud modalities. It is the first to outperform full fine-tuning in both 2D image and 3D point cloud modalities with significantly fewer parameters, i.e., only 0.20% and 1.88% of original trainable parameters for 2D and 3D classification tasks.
arXiv Detail & Related papers (2024-06-05T08:26:44Z)
Sparse-Tuning: Adapting Vision Transformers with Efficient Fine-tuning and Inference [14.030836300221756]
textbfSparse-Tuning is a novel PEFT method that accounts for the information redundancy in images and videos. Sparse-Tuning minimizes the quantity of tokens processed at each layer, leading to a quadratic reduction in computational and memory overhead. Our results show that our Sparse-Tuning reduces GFLOPs to textbf62%-70% of the original ViT-B while achieving state-of-the-art performance.
arXiv Detail & Related papers (2024-05-23T15:34:53Z)
Dynamic Adapter Meets Prompt Tuning: Parameter-Efficient Transfer Learning for Point Cloud Analysis [51.14136878142034]
Point cloud analysis has achieved outstanding performance by transferring point cloud pre-trained models. Existing methods for model adaptation usually update all model parameters, which is inefficient as it relies on high computational costs. In this paper, we aim to study parameter-efficient transfer learning for point cloud analysis with an ideal trade-off between task performance and parameter efficiency.
arXiv Detail & Related papers (2024-03-03T08:25:04Z)
Generative Parameter-Efficient Fine-Tuning [8.481707805559589]
GIFT learns to generate the fine-tuned weights for a layer directly from its pretrained weights. We show this formulation bridges parameter-efficient fine-tuning and representation fine-tuning.
arXiv Detail & Related papers (2023-12-01T16:33:57Z)
G-Adapter: Towards Structure-Aware Parameter-Efficient Transfer Learning for Graph Transformer Networks [0.7118812771905295]
We show that it is sub-optimal to directly transfer existing PEFTs to graph-based tasks due to the issue of feature distribution shift. We propose a novel structure-aware PEFT approach, named G-Adapter, to guide the updating process. Extensive experiments demonstrate that G-Adapter obtains the state-of-the-art performance compared to the counterparts on nine graph benchmark datasets.
arXiv Detail & Related papers (2023-05-17T16:10:36Z)
Consolidator: Mergeable Adapter with Grouped Connections for Visual Adaptation [53.835365470800916]
We show how to efficiently and effectively transfer knowledge in a vision transformer. We propose consolidator to modify the pre-trained model with the addition of a small set of tunable parameters. Our consolidator can reach up to 7.56 better accuracy than full fine-tuning with merely 0.35% parameters.
arXiv Detail & Related papers (2023-04-30T23:59:02Z)
Energy-efficient Task Adaptation for NLP Edge Inference Leveraging Heterogeneous Memory Architectures [68.91874045918112]
adapter-ALBERT is an efficient model optimization for maximal data reuse across different tasks. We demonstrate the advantage of mapping the model to a heterogeneous on-chip memory architecture by performing simulations on a validated NLP edge accelerator.
arXiv Detail & Related papers (2023-03-25T14:40:59Z)
AdaMix: Mixture-of-Adapter for Parameter-efficient Tuning of Large Language Models [119.7093605087114]
Fine-tuning large-scale pre-trained language models to downstream tasks require updating hundreds of millions of parameters. This not only increases the serving cost to store a large copy of the model weights for every task, but also exhibits instability during few-shot task adaptation. We introduce a new mechanism to improve adapter capacity without increasing parameters or computational cost by two key techniques.
arXiv Detail & Related papers (2022-05-24T23:41:22Z)
DSEE: Dually Sparsity-embedded Efficient Tuning of Pre-trained Language Models [152.29364079385635]
As pre-trained models grow bigger, the fine-tuning process can be time-consuming and computationally expensive. We propose a framework for resource- and parameter-efficient fine-tuning by leveraging the sparsity prior in both weight updates and the final model weights. Our proposed framework, dubbed Dually Sparsity-Embedded Efficient Tuning (DSEE), aims to achieve two key objectives: (i) parameter efficient fine-tuning and (ii) resource-efficient inference.
arXiv Detail & Related papers (2021-10-30T03:29:47Z)

This list is automatically generated from the titles and abstracts of the papers in this site.