Extracting Meaningful Attention on Source Code: An Empirical Study of
Developer and Neural Model Code Exploration
- URL: http://arxiv.org/abs/2210.05506v1
- Date: Tue, 11 Oct 2022 14:58:58 GMT
- Authors: Matteo Paltenghi, Rahul Pandita, Austin Z. Henley, Albert Ziegler
- Abstract summary: This work compares multiple approaches to post-process these valuable attention weights for supporting code exploration.
Specifically, we compare to what extent the transformed attention signal of CodeGen, a large and publicly available pre-trained neural model, agrees with how developers look at and explore code.
We also introduce a novel practical application of the attention signal of pre-trained models with completely analytical solutions.
- Score: 4.644827993583995
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The high effectiveness of neural models of code, such as OpenAI Codex and
AlphaCode, suggests coding capabilities of models that are at least comparable
to those of humans. However, previous work has only used these models for their
raw completions, ignoring how the model's reasoning, in the form of attention
weights, can be used for other downstream tasks. Disregarding the attention
weights means discarding a considerable portion of what those models compute
when queried. To profit more from the knowledge embedded in these large
pre-trained models, this work compares multiple approaches to post-process
these valuable attention weights for supporting code exploration. Specifically,
we compare to what extent the transformed attention signal of CodeGen, a large
and publicly available pre-trained neural model, agrees with how developers look
at and explore code when answering the same sense-making questions about
code. At the core of our experimental evaluation, we collect, manually
annotate, and open-source a novel eye-tracking dataset comprising 25 developers
answering sense-making questions on code over 92 sessions. We empirically
evaluate five attention-agnostic heuristics and ten attention-based
post-processing approaches of the attention signal against our ground truth of
developers exploring code, including the novel concept of follow-up attention
which exhibits the highest agreement. Beyond the dataset contribution and the
empirical study, we also introduce a novel practical application of the
attention signal of pre-trained models with completely analytical solutions,
going beyond how neural models' attention mechanisms have traditionally been
used.
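As a generic illustration of what post-processing raw attention weights into a code-exploration signal can look like, the sketch below averages a multi-layer, multi-head self-attention tensor into one importance score per token. The tensor here is random stand-in data rather than real CodeGen output, and the aggregation is a simple mean-over-heads-and-layers baseline, not the paper's specific transformations (such as follow-up attention).

```python
import numpy as np

def aggregate_attention(attn, query_idx):
    """Collapse a (layers, heads, tokens, tokens) self-attention tensor
    into one importance score per token, as seen from `query_idx`.

    Simple mean over heads and layers; purely an illustrative baseline,
    not the paper's post-processing."""
    # Average the attention maps over all layers and heads.
    mean_map = attn.mean(axis=(0, 1))          # shape: (tokens, tokens)
    # Row `query_idx`: how much the query token attends to every token.
    scores = mean_map[query_idx]
    # Renormalize so the scores form a proper distribution.
    return scores / scores.sum()

# Stand-in for model output: random, row-normalized attention weights.
rng = np.random.default_rng(0)
raw = rng.random((4, 8, 16, 16))               # 4 layers, 8 heads, 16 tokens
attn = raw / raw.sum(axis=-1, keepdims=True)   # each row sums to 1

scores = aggregate_attention(attn, query_idx=15)
print(scores.shape)
```

In practice the tensor would come from a real pre-trained model's attention outputs, and the interesting design space, which the paper evaluates empirically, is in which layers, heads, and directions of the attention map to keep.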
Related papers
- Toward Exploring the Code Understanding Capabilities of Pre-trained Code Generation Models [12.959392500354223]
We pioneer the transfer of knowledge from pre-trained code generation models to code understanding tasks.
We introduce CL4D, a contrastive learning method designed to enhance the representation capabilities of decoder-only models.
arXiv Detail & Related papers (2024-06-18T06:52:14Z)
- Automatic Discovery of Visual Circuits [66.99553804855931]
We explore scalable methods for extracting the subgraph of a vision model's computational graph that underlies recognition of a specific visual concept.
We find that our approach extracts circuits that causally affect model output, and that editing these circuits can defend large pretrained models from adversarial attacks.
arXiv Detail & Related papers (2024-04-22T17:00:57Z) - Naturalness of Attention: Revisiting Attention in Code Language Models [3.756550107432323]
Language models for code such as CodeBERT offer the capability to learn advanced source code representation, but their opacity poses barriers to understanding of captured properties.
This study aims to shed some light on the previously ignored factors of the attention mechanism beyond the attention weights.
arXiv Detail & Related papers (2023-11-22T16:34:12Z) - MENTOR: Human Perception-Guided Pretraining for Increased Generalization [5.596752018167751]
We introduce MENTOR (huMan pErceptioN-guided preTraining fOr increased geneRalization)
We train an autoencoder to learn human saliency maps given an input image, without class labels.
We remove the decoder part, add a classification layer on top of the encoder, and fine-tune this new model conventionally.
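The three steps above can be sketched structurally as follows. This is a minimal, untrained toy in plain NumPy; the actual MENTOR work uses deep convolutional autoencoders trained on real human saliency maps, so every layer size and class count here is a hypothetical placeholder, and the training loops are elided.

```python
import numpy as np

rng = np.random.default_rng(42)

# Step 1: an autoencoder (encoder + decoder) that would be trained to
# predict human saliency maps from input images (training elided here).
W_enc = rng.standard_normal((64, 16)) * 0.1  # encoder: 64-dim input -> 16-dim code
W_dec = rng.standard_normal((16, 64)) * 0.1  # decoder: code -> 64-dim saliency map

def encoder(x):
    return np.tanh(x @ W_enc)

def decoder(code):
    return code @ W_dec

# Step 2: discard the decoder and keep only the pretrained encoder.
# Step 3: add a classification layer on top and fine-tune conventionally.
W_cls = rng.standard_normal((16, 10)) * 0.1  # 10 hypothetical classes

def classifier(x):
    logits = encoder(x) @ W_cls
    # Softmax over class logits.
    e = np.exp(logits - logits.max())
    return e / e.sum()

probs = classifier(rng.standard_normal(64))
print(probs.shape)
```

The point of the structure is that the encoder's representation is shaped by human perception (saliency) before any class labels are seen, and only then reused for classification.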
arXiv Detail & Related papers (2023-10-30T13:50:44Z) - Towards Modeling Human Attention from Eye Movements for Neural Source
Code Summarization [6.435578628605734]
We use eye-tracking data to create a model of human attention.
The model predicts which words in source code are the most important for code summarization.
We observe that the augmented approach improves prediction performance, in line with other bio-inspired neural models.
arXiv Detail & Related papers (2023-05-16T19:56:45Z) - Deep networks for system identification: a Survey [56.34005280792013]
System identification learns mathematical descriptions of dynamic systems from input-output data.
The main aim of the identified model is to predict new data from previous observations.
We discuss architectures commonly adopted in the literature, like feedforward, convolutional, and recurrent networks.
arXiv Detail & Related papers (2023-01-30T12:38:31Z) - Discrete Key-Value Bottleneck [95.61236311369821]
Deep neural networks perform well on classification tasks where data streams are i.i.d. and labeled data is abundant.
One powerful approach that has addressed this challenge involves pre-training of large encoders on volumes of readily available data, followed by task-specific tuning.
Given a new task, however, updating the weights of these encoders is challenging as a large number of weights needs to be fine-tuned, and as a result, they forget information about the previous tasks.
We propose a model architecture to address this issue, building upon a discrete bottleneck containing pairs of separate and learnable key-value codes.
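A stripped-down sketch of such a key-value bottleneck follows. It is purely illustrative (the paper's architecture uses multiple codebooks and learned encoders, and all sizes here are placeholders): an input representation is snapped to its nearest discrete key, and the paired learnable value, not the input itself, is passed to the downstream head.

```python
import numpy as np

rng = np.random.default_rng(1)

num_codes, dim = 32, 8
keys = rng.standard_normal((num_codes, dim))    # discrete codes, frozen after init
values = rng.standard_normal((num_codes, dim))  # paired values, task-tunable

def bottleneck(z):
    """Map an encoder output z to the value paired with its nearest key."""
    # Squared Euclidean distance from z to every key.
    dists = ((keys - z) ** 2).sum(axis=1)
    idx = int(dists.argmin())                   # discrete code selection
    return values[idx], idx

z = rng.standard_normal(dim)                    # stand-in encoder output
v, idx = bottleneck(z)
print(idx, v.shape)
```

Because only the selected value entries receive gradient updates on a new task, the bulk of the model's weights stay fixed, which is what limits forgetting of previous tasks.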
arXiv Detail & Related papers (2022-07-22T17:52:30Z) - What Makes Good Contrastive Learning on Small-Scale Wearable-based
Tasks? [59.51457877578138]
We study contrastive learning on the wearable-based activity recognition task.
This paper presents an open-source PyTorch library, CL-HAR, which can serve as a practical tool for researchers.
arXiv Detail & Related papers (2022-02-12T06:10:15Z) - Data-Driven and SE-assisted AI Model Signal-Awareness Enhancement and
Introspection [61.571331422347875]
We propose a data-driven approach to enhance models' signal-awareness.
We combine the SE concept of code complexity with the AI technique of curriculum learning.
We achieve up to 4.8x improvement in model signal awareness.
arXiv Detail & Related papers (2021-11-10T17:58:18Z) - Demystifying Code Summarization Models [5.608277537412537]
We evaluate four prominent code summarization models: extreme summarizer, code2vec, code2seq, and sequence GNN.
Results show that all models base their predictions on syntactic and lexical properties, with little to no semantic implication.
We present a novel approach to explaining the predictions of code summarization models through the lens of training data.
arXiv Detail & Related papers (2021-02-09T03:17:46Z) - Rethinking Generalization of Neural Models: A Named Entity Recognition
Case Study [81.11161697133095]
We take the NER task as a testbed to analyze the generalization behavior of existing models from different perspectives.
Experiments with in-depth analyses diagnose the bottleneck of existing neural NER models.
As a by-product of this paper, we have open-sourced a project that involves a comprehensive summary of recent NER papers.
arXiv Detail & Related papers (2020-01-12T04:33:53Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information provided and is not responsible for any consequences of its use.