Implant Global and Local Hierarchy Information to Sequence based Code
Representation Models
- URL: http://arxiv.org/abs/2303.07826v1
- Date: Tue, 14 Mar 2023 12:01:39 GMT
- Title: Implant Global and Local Hierarchy Information to Sequence based Code
Representation Models
- Authors: Kechi Zhang, Zhuo Li, Zhi Jin, Ge Li
- Abstract summary: We analyze how the complete hierarchical structure influences the tokens in code sequences and abstract this influence as a property of code tokens called hierarchical embedding.
We propose the Hierarchy Transformer (HiT), a simple but effective sequence model to incorporate the complete hierarchical embeddings of source code into a Transformer model.
- Score: 25.776540440893257
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Source code representation with deep learning techniques is an important
research field. There have been many studies that learn sequential or
structural information for code representation. However, sequence-based and
non-sequence models both have limitations. Researchers have attempted to
incorporate structural information into sequence-based models, but these
attempts mine only part of the token-level hierarchical structure information.
In this paper, we
analyze how the complete hierarchical structure influences the tokens in code
sequences and abstract this influence as a property of code tokens called
hierarchical embedding. The hierarchical embedding is further divided into
statement-level global hierarchy and token-level local hierarchy. Furthermore,
we propose the Hierarchy Transformer (HiT), a simple but effective sequence
model to incorporate the complete hierarchical embeddings of source code into a
Transformer model. We demonstrate the effectiveness of hierarchical embedding
in learning code structure with an experiment on the variable scope detection
task. Further evaluation shows that HiT outperforms SOTA baseline models and
exhibits stable training efficiency on three source code-related tasks,
covering both classification and generation, across 8 different datasets.
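The abstract describes the model only at a high level, but the central idea of combining statement-level global and token-level local hierarchy information with ordinary token embeddings can be sketched roughly as follows. This is a minimal PyTorch illustration under assumed names (HierarchyTransformerSketch, global_depth, local_depth, max_depth); it is not the authors' actual HiT implementation, which may compute and inject the hierarchical embeddings differently.
```python
import torch
import torch.nn as nn

class HierarchyTransformerSketch(nn.Module):
    """Rough sketch (not the authors' implementation): token embeddings are
    summed with a statement-level (global) and a token-level (local)
    hierarchy-depth embedding before a standard Transformer encoder."""

    def __init__(self, vocab_size, d_model=256, max_depth=64, n_layers=4, n_heads=8):
        super().__init__()
        self.tok_emb = nn.Embedding(vocab_size, d_model)
        self.global_emb = nn.Embedding(max_depth, d_model)  # statement-level hierarchy depth
        self.local_emb = nn.Embedding(max_depth, d_model)   # token-level hierarchy depth
        layer = nn.TransformerEncoderLayer(d_model, n_heads, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=n_layers)

    def forward(self, token_ids, global_depth, local_depth):
        # All inputs: (batch, seq_len) integer tensors; the depth tensors would
        # be derived from each token's position in the statement / AST hierarchy.
        x = self.tok_emb(token_ids) + self.global_emb(global_depth) + self.local_emb(local_depth)
        return self.encoder(x)

# Tiny smoke test with random inputs.
model = HierarchyTransformerSketch(vocab_size=1000)
ids = torch.randint(0, 1000, (2, 16))
depths = torch.randint(0, 64, (2, 16))
out = model(ids, depths, depths)  # shape: (2, 16, 256)
```
In such a setup each code token carries two extra integer features, a statement-level depth and a token-level depth, which would be computed from the program's hierarchical structure (e.g., its AST) before training.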
Related papers
- From Logits to Hierarchies: Hierarchical Clustering made Simple [16.132657141993548]
We show that a lightweight procedure implemented on top of pre-trained non-hierarchical clustering models outperforms models designed specifically for hierarchical clustering.
Our proposed approach is computationally efficient and applicable to any pre-trained clustering model that outputs logits, without requiring any fine-tuning.
arXiv Detail & Related papers (2024-10-10T12:27:45Z)
- How transformers learn structured data: insights from hierarchical filtering [2.7784685368355744]
We introduce a hierarchical filtering procedure for generative models of sequences on trees.
We provide evidence that vanilla encoder-only transformer architectures can implement the optimal belief propagation algorithm.
We analyze how the transformer layers succeed by focusing on attention maps from models trained with varying degrees of filtering.
arXiv Detail & Related papers (2024-08-27T15:23:09Z)
- Generating Hierarchical Structures for Improved Time Series Classification Using Stochastic Splitting Functions [0.0]
This study introduces a novel hierarchical divisive clustering approach with stochastic splitting functions (SSFs) to enhance classification performance on multi-class datasets through hierarchical classification (HC).
The method has the unique capability of generating a hierarchy without requiring explicit information, making it suitable for datasets lacking prior knowledge of the hierarchy.
arXiv Detail & Related papers (2023-09-21T10:34:50Z)
- How Deep Neural Networks Learn Compositional Data: The Random Hierarchy Model [47.617093812158366]
We introduce the Random Hierarchy Model: a family of synthetic tasks inspired by the hierarchical structure of language and images.
We find that deep networks learn the task by developing internal representations invariant to exchanging equivalent groups.
Our results indicate how deep networks overcome the curse of dimensionality by building invariant representations.
arXiv Detail & Related papers (2023-07-05T09:11:09Z)
- Use All The Labels: A Hierarchical Multi-Label Contrastive Learning Framework [75.79736930414715]
We present a hierarchical multi-label representation learning framework that can leverage all available labels and preserve the hierarchical relationship between classes.
We introduce novel hierarchy-preserving losses, which jointly apply a hierarchical penalty to the contrastive loss and enforce the hierarchy constraint.
arXiv Detail & Related papers (2022-04-27T21:41:44Z)
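The summary above does not spell out the losses. As a generic, hedged illustration of the idea of applying a hierarchical penalty to a contrastive loss (not the paper's actual formulation), the sketch below weights each positive pair by the depth at which the two samples' label paths agree:
```python
import torch
import torch.nn.functional as F

def shared_depth(path_a, path_b):
    """Length of the common prefix of two label paths, e.g.
    ('animal', 'dog', 'corgi') vs ('animal', 'dog', 'husky') -> 2."""
    depth = 0
    for a, b in zip(path_a, path_b):
        if a != b:
            break
        depth += 1
    return depth

def hierarchy_weighted_contrastive_loss(z, label_paths, temperature=0.1):
    """Toy, O(N^2) illustration (not the paper's losses): a supervised
    contrastive loss in which each positive pair is weighted by the depth of
    the deepest hierarchy level at which the two samples' labels agree."""
    z = F.normalize(z, dim=1)
    sim = z @ z.t() / temperature
    sim = sim - sim.max(dim=1, keepdim=True).values    # numerical stability
    exp_sim = sim.exp()

    n = z.size(0)
    total, weight_sum = z.new_zeros(()), 0
    for i in range(n):
        denom = exp_sim[i].sum() - exp_sim[i, i]        # all candidates except self
        for j in range(n):
            if i == j:
                continue
            w = shared_depth(label_paths[i], label_paths[j])
            if w == 0:
                continue                                # no shared ancestor: not a positive
            total = total - w * torch.log(exp_sim[i, j] / denom)
            weight_sum += w
    return total / max(weight_sum, 1)

# Example: four embeddings with a two-level label hierarchy.
z = torch.randn(4, 8)
paths = [("animal", "dog"), ("animal", "dog"), ("animal", "cat"), ("vehicle", "car")]
print(hierarchy_weighted_contrastive_loss(z, paths))
```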
- HiStruct+: Improving Extractive Text Summarization with Hierarchical Structure Information [0.6443952406204634]
We propose a novel approach to formulate, extract, encode and inject hierarchical structure information explicitly into an extractive summarization model.
Across various experimental settings on three datasets (CNN/DailyMail, PubMed and arXiv), our HiStruct+ model collectively outperforms a strong baseline.
arXiv Detail & Related papers (2022-03-17T21:49:26Z)
- HRKD: Hierarchical Relational Knowledge Distillation for Cross-domain Language Model Compression [53.90578309960526]
Large pre-trained language models (PLMs) have shown overwhelming performance compared with traditional neural network methods.
We propose a hierarchical relational knowledge distillation (HRKD) method to capture both hierarchical and domain relational information.
arXiv Detail & Related papers (2021-10-16T11:23:02Z)
- Few-Shot Named Entity Recognition: A Comprehensive Study [92.40991050806544]
We investigate three schemes to improve the model generalization ability for few-shot settings.
We perform empirical comparisons on 10 public NER datasets with various proportions of labeled data.
We create new state-of-the-art results on both few-shot and training-free settings.
arXiv Detail & Related papers (2020-12-29T23:43:16Z)
- GraphCodeBERT: Pre-training Code Representations with Data Flow [97.00641522327699]
We present GraphCodeBERT, a pre-trained model for programming languages that considers the inherent structure of code.
We use data flow in the pre-training stage, which is a semantic-level structure of code that encodes the relation of "where-the-value-comes-from" between variables.
We evaluate our model on four tasks, including code search, clone detection, code translation, and code refinement.
arXiv Detail & Related papers (2020-09-17T15:25:56Z)
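To make the "where-the-value-comes-from" relation in the entry above concrete, here is a toy illustration that links each variable read in straight-line Python code to the most recent assignment of that name. It is only a simplification for intuition; the paper's actual data-flow extraction is more general and is not reproduced here.
```python
import ast

def data_flow_edges(source: str):
    """Toy sketch of the "where-the-value-comes-from" relation for
    straight-line code: each variable read is linked to the most recent
    assignment of that name. Branches, loops and functions are ignored."""
    tree = ast.parse(source)
    last_def = {}            # variable name -> line of its latest assignment
    edges = []               # (use_line, def_line, variable)
    for stmt in tree.body:   # top-level statements in source order
        # First record reads (they refer to earlier definitions) ...
        for node in ast.walk(stmt):
            if isinstance(node, ast.Name) and isinstance(node.ctx, ast.Load):
                if node.id in last_def:
                    edges.append((node.lineno, last_def[node.id], node.id))
        # ... then record the writes made by this statement.
        for node in ast.walk(stmt):
            if isinstance(node, ast.Name) and isinstance(node.ctx, ast.Store):
                last_def[node.id] = node.lineno
    return edges

print(data_flow_edges("x = 1\ny = x + 2\nz = x + y\n"))
# [(2, 1, 'x'), (3, 1, 'x'), (3, 2, 'y')]
```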
"Hierarchical Accumulation" encodes parse tree structures into self-attention at constant time complexity.
Our approach outperforms SOTA methods in four IWSLT translation tasks and the WMT'14 English-German translation task.
arXiv Detail & Related papers (2020-02-19T08:17:00Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this list (including all information) and is not responsible for any consequences.