EyeTrans: Merging Human and Machine Attention for Neural Code
Summarization
- URL: http://arxiv.org/abs/2402.14096v3
- Date: Thu, 29 Feb 2024 13:33:58 GMT
- Title: EyeTrans: Merging Human and Machine Attention for Neural Code
Summarization
- Authors: Yifan Zhang, Jiliang Li, Zachary Karas, Aakash Bansal, Toby Jia-Jun
Li, Collin McMillan, Kevin Leach, Yu Huang
- Abstract summary: We develop a method for incorporating human attention into machine attention to enhance neural code summarization.
We conduct comprehensive experiments on two code summarization tasks to demonstrate the effectiveness of incorporating human attention into Transformers.
- Score: 16.694601606682046
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Neural code summarization leverages deep learning models to automatically
generate brief natural language summaries of code snippets. The development of
Transformer models has led to extensive use of attention during model design.
While existing work has primarily and almost exclusively focused on static
properties of source code and related structural representations like the
Abstract Syntax Tree (AST), few studies have considered human attention, that
is, where programmers focus while examining and comprehending code. In this
paper, we develop a method for incorporating human attention into machine
attention to enhance neural code summarization. To facilitate this
incorporation and validate this hypothesis, we introduce EyeTrans, which
consists of three steps: (1) we conduct an extensive eye-tracking human study
to collect and pre-analyze data for model training, (2) we devise a
data-centric approach to integrate human attention with machine attention in
the Transformer architecture, and (3) we conduct comprehensive experiments on
two code summarization tasks to demonstrate the effectiveness of incorporating
human attention into Transformers. Integrating human attention leads to an
improvement of up to 29.91% in Functional Summarization and up to 6.39% in
General Code Summarization performance, demonstrating the substantial benefits
of this combination. We further explore performance in terms of robustness and
efficiency by creating challenging summarization scenarios in which EyeTrans
exhibits interesting properties. We also visualize attention maps to show that
incorporating human attention simplifies the machine attention patterns in the
Transformer. This work has the potential to propel AI research in software
engineering by introducing more human-centered approaches and data.
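As a rough illustration of step (2), the sketch below folds per-token human attention (e.g., normalized fixation durations from eye tracking) into Transformer self-attention as an additive bias. The function name, bias formulation, and alpha weight are illustrative assumptions; the paper's actual integration is data-centric and differs in detail.

    # Minimal sketch (PyTorch): biasing self-attention scores with human
    # eye-tracking weights. A hypothetical simplification, not the exact
    # EyeTrans integration.
    import torch
    import torch.nn.functional as F

    def human_biased_attention(q, k, v, human_attn, alpha=0.5):
        # q, k, v: (batch, seq, dim); human_attn: (batch, seq), non-negative.
        scores = q @ k.transpose(-2, -1) / q.size(-1) ** 0.5
        # Bias every query's scores toward tokens programmers fixated on.
        scores = scores + alpha * human_attn.unsqueeze(1)
        return F.softmax(scores, dim=-1) @ v

    q = k = v = torch.randn(2, 8, 64)   # stand-ins for code-token embeddings
    human_attn = torch.rand(2, 8)       # e.g., normalized fixation durations
    print(human_biased_attention(q, k, v, human_attn).shape)  # (2, 8, 64)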
Related papers
- DAPE V2: Process Attention Score as Feature Map for Length Extrapolation [63.87956583202729]
We conceptualize attention as a feature map and apply the convolution operator to mimic the processing methods in computer vision.
The novel insight, which can be adapted to various attention-related models, reveals that the current Transformer architecture has the potential for further evolution.
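A minimal sketch of that idea: treat the per-head (seq x seq) attention-score matrix as an image-like feature map and refine it with a small convolution before the softmax. The kernel size and placement here are illustrative assumptions, not the exact DAPE V2 design.

    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    heads, seq, dim = 4, 16, 32
    q = torch.randn(1, heads, seq, dim)
    k = torch.randn(1, heads, seq, dim)
    scores = q @ k.transpose(-2, -1) / dim ** 0.5   # (1, heads, seq, seq)
    # Heads act as channels; a 3x3 conv mixes neighboring score entries.
    conv = nn.Conv2d(heads, heads, kernel_size=3, padding=1)
    attn = F.softmax(conv(scores), dim=-1)
    print(attn.shape)  # torch.Size([1, 4, 16, 16])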
arXiv Detail & Related papers (2024-10-07T07:21:49Z)
- Neural-Logic Human-Object Interaction Detection [67.4993347702353]
We present LogicHOI, a new HOI detector that leverages neural-logic reasoning and a Transformer to infer feasible interactions between entities.
Specifically, we modify the self-attention mechanism in the vanilla Transformer, enabling it to reason over the ⟨human, action, object⟩ triplet and constitute novel interactions.
We formulate these two properties in first-order logic and ground them into continuous space to constrain the learning process of our approach, leading to improved performance and zero-shot generalization capabilities.
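As a toy illustration of attention over the triplet, the sketch below runs plain multi-head self-attention across three entity embeddings; the logic-grounded constraints that LogicHOI adds on top are omitted, and all shapes are hypothetical.

    import torch
    import torch.nn as nn

    dim = 32
    human, action, obj = torch.randn(3, 1, dim).unbind(0)  # toy entity embeddings
    triplet = torch.stack([human, action, obj], dim=1)     # (1, 3, dim)
    mha = nn.MultiheadAttention(dim, num_heads=4, batch_first=True)
    fused, weights = mha(triplet, triplet, triplet)        # attend across the triplet
    print(fused.shape, weights.shape)  # (1, 3, 32) (1, 3, 3)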
arXiv Detail & Related papers (2023-11-16T11:47:53Z)
- Enhancing Robot Learning through Learned Human-Attention Feature Maps [6.724036710994883]
We hypothesize that embedding auxiliary information about human focus points into robot learning would enhance the efficiency and robustness of the learning process.
In this paper, we propose a novel approach to model and emulate the human attention with an approximate prediction model.
We test our approach on two learning tasks - object detection and imitation learning.
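One plausible realization, sketched below, is to predict a saliency-style attention map from the image and feed it to the downstream learner as an extra input channel; both networks here are hypothetical stand-ins, not the paper's architecture.

    import torch
    import torch.nn as nn

    # Hypothetical attention predictor: RGB in, single-channel saliency out.
    attn_net = nn.Sequential(
        nn.Conv2d(3, 8, 3, padding=1), nn.ReLU(),
        nn.Conv2d(8, 1, 3, padding=1), nn.Sigmoid())
    # Downstream learner consumes RGB plus the predicted attention channel.
    policy = nn.Sequential(
        nn.Conv2d(4, 16, 3, padding=1), nn.ReLU(),
        nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(16, 6))

    img = torch.randn(1, 3, 64, 64)
    attn_map = attn_net(img)                            # (1, 1, 64, 64)
    print(policy(torch.cat([img, attn_map], 1)).shape)  # (1, 6)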
arXiv Detail & Related papers (2023-08-29T14:23:44Z)
- FLatten Transformer: Vision Transformer using Focused Linear Attention [80.61335173752146]
Linear attention offers a much more efficient alternative with its linear complexity.
Current linear attention approaches either suffer from significant performance degradation or introduce additional computation overhead.
We propose a novel Focused Linear Attention module to achieve both high efficiency and expressiveness.
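The sketch below shows linear attention with a sharpening feature map in the spirit of the focused mapping: phi(K)^T V is computed once, so cost scales linearly in sequence length. The power-based mapping here is a simplification, not the paper's exact focused function.

    import torch
    import torch.nn.functional as F

    def focused_map(x, p=3, eps=1e-6):
        # Sharpen features elementwise while preserving their norm
        # (simplified stand-in for a focused feature map).
        x = F.relu(x) + eps
        xp = x ** p
        return xp * (x.norm(dim=-1, keepdim=True) / xp.norm(dim=-1, keepdim=True))

    def linear_attention(q, k, v):
        q, k = focused_map(q), focused_map(k)
        kv = k.transpose(-2, -1) @ v     # (dim, dim) per batch: O(n * d^2)
        z = q @ k.sum(dim=-2, keepdim=True).transpose(-2, -1)  # normalizer
        return (q @ kv) / (z + 1e-6)

    q = k = v = torch.randn(2, 128, 64)
    print(linear_attention(q, k, v).shape)  # torch.Size([2, 128, 64])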
arXiv Detail & Related papers (2023-08-01T10:37:12Z)
- AttentionViz: A Global View of Transformer Attention [60.82904477362676]
We present a new visualization technique designed to help researchers understand the self-attention mechanism in transformers.
The main idea behind our method is to visualize a joint embedding of the query and key vectors used by transformer models to compute attention.
We create an interactive visualization tool, AttentionViz, based on these joint query-key embeddings.
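The joint embedding can be sketched in a few lines: stack the query and key vectors of one head and project them into a shared low-dimensional space (PCA via SVD here), so that nearby query-key pairs correspond to high attention. Data and dimensions below are toy assumptions.

    import numpy as np

    rng = np.random.default_rng(0)
    queries = rng.normal(size=(50, 64))   # toy per-token query vectors
    keys = rng.normal(size=(50, 64))      # toy per-token key vectors

    joint = np.vstack([queries, keys])    # embed queries and keys together
    joint = joint - joint.mean(axis=0)
    _, _, vt = np.linalg.svd(joint, full_matrices=False)
    coords = joint @ vt[:2].T             # (100, 2): rows 0-49 queries, 50-99 keys
    print(coords.shape)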
arXiv Detail & Related papers (2023-05-04T23:46:49Z)
- Gaze-based Attention Recognition for Human-Robot Collaboration [0.0]
We present an assembly scenario where a human operator and a cobot collaborate equally to piece together a gearbox.
As a first step, we recognize the areas in the workspace that the human operator is paying attention to.
We propose a novel deep-learning approach to develop an attention recognition model.
arXiv Detail & Related papers (2023-03-30T11:55:38Z)
- Multimodal Vision Transformers with Forced Attention for Behavior Analysis [0.0]
We introduce the Forced Attention (FAt) Transformer, which utilizes forced attention with a modified backbone for input encoding and makes use of additional inputs.
FAt Transformers are applied to two downstream tasks: personality recognition and body language recognition.
We achieve state-of-the-art results for Udiva v0.5, First Impressions v2 and MPII Group Interaction datasets.
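One way to approximate forced attention, sketched below, is an additive log-mask that suppresses attention outside annotated regions (e.g., face crops); the shapes and masking scheme are assumptions, not the paper's exact backbone modification.

    import torch
    import torch.nn.functional as F

    seq = 10
    q = k = v = torch.randn(1, seq, 32)
    region = torch.zeros(1, seq)      # 1 where a token lies in the forced region
    region[:, 2:5] = 1.0

    scores = q @ k.transpose(-2, -1) / 32 ** 0.5
    scores = scores + torch.log(region + 1e-4).unsqueeze(1)  # penalize outside tokens
    attn = F.softmax(scores, dim=-1)
    print(attn[0, 0, 2:5].sum())  # nearly all attention mass inside the region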
arXiv Detail & Related papers (2022-12-07T21:56:50Z)
- Focused Decoding Enables 3D Anatomical Detection by Transformers [64.36530874341666]
We propose a novel Detection Transformer for 3D anatomical structure detection, dubbed Focused Decoder.
Focused Decoder leverages information from an anatomical region atlas to simultaneously deploy query anchors and restrict the cross-attention's field of view.
We evaluate our proposed approach on two publicly available CT datasets and demonstrate that Focused Decoder not only provides strong detection results and thus alleviates the need for a vast amount of annotated data but also exhibits exceptional and highly intuitive explainability of results via attention weights.
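Restricting the cross-attention field of view can be sketched as a boolean mask over encoder tokens that keeps only those inside an atlas-derived region; the grid, region, and shapes below are hypothetical.

    import torch
    import torch.nn.functional as F

    n_query, n_tokens, dim = 4, 64, 32
    q = torch.randn(1, n_query, dim)         # decoder query anchors
    feats = torch.randn(1, n_tokens, dim)    # encoder tokens from a feature grid
    in_view = torch.zeros(1, n_tokens, dtype=torch.bool)
    in_view[:, 20:36] = True                 # hypothetical atlas region

    scores = q @ feats.transpose(-2, -1) / dim ** 0.5
    scores = scores.masked_fill(~in_view.unsqueeze(1), float("-inf"))
    out = F.softmax(scores, dim=-1) @ feats  # cross-attention limited to the region
    print(out.shape)  # torch.Size([1, 4, 32])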
arXiv Detail & Related papers (2022-07-21T22:17:21Z)
- Multi-View Fusion Transformer for Sensor-Based Human Activity Recognition [15.845205542668472]
Sensor-based human activity recognition (HAR) aims to recognize human activities based on the availability of rich time-series data collected from multi-modal sensors such as accelerometers and gyroscopes.
Recent deep learning methods focus on one view of the data, i.e., the temporal view, while shallow methods tend to utilize hand-crafted features for recognition, e.g., the statistical view.
We propose a novel method, namely multi-view fusion transformer (MVFT) along with a novel attention mechanism.
arXiv Detail & Related papers (2022-02-16T07:15:22Z)
- Cost-effective Interactive Attention Learning with Neural Attention Processes [79.8115563067513]
We propose a novel interactive learning framework which we refer to as Interactive Attention Learning (IAL).
However, such interactive learning is prone to overfitting due to the scarcity of human annotations and requires costly retraining.
We tackle these challenges by proposing a sample-efficient attention mechanism and a cost-effective reranking algorithm for instances and features.
arXiv Detail & Related papers (2020-06-09T17:36:41Z)
- Human brain activity for machine attention [8.673635963837532]
We are the first to exploit neuroscientific data, namely electroencephalography (EEG), to inform a neural attention model about language processing of the human brain.
We devise a method for finding such EEG features to supervise machine attention through combining theoretically motivated cropping with random forest tree splits.
We apply these features to regularise attention on relation classification and show that EEG is more informative than strong baselines.
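Supervising machine attention with EEG can be sketched as an auxiliary regularization term that pulls the model's attention distribution toward EEG-derived token weights; the shapes, loss form, and 0.1 weight below are assumptions.

    import torch
    import torch.nn.functional as F

    model_attn = torch.rand(1, 12)   # model attention over 12 tokens
    model_attn = model_attn / model_attn.sum(-1, keepdim=True)
    eeg_attn = torch.rand(1, 12)     # EEG-derived target weights
    eeg_attn = eeg_attn / eeg_attn.sum(-1, keepdim=True)

    task_loss = torch.tensor(0.7)    # stand-in for a classification loss
    # Regularize: keep machine attention close to the EEG signal.
    loss = task_loss + 0.1 * F.mse_loss(model_attn, eeg_attn)
    print(float(loss))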
arXiv Detail & Related papers (2020-06-09T08:39:07Z)