HAConvGNN: Hierarchical Attention Based Convolutional Graph Neural
Network for Code Documentation Generation in Jupyter Notebooks
- URL: http://arxiv.org/abs/2104.01002v1
- Date: Wed, 31 Mar 2021 22:36:41 GMT
- Title: HAConvGNN: Hierarchical Attention Based Convolutional Graph Neural
Network for Code Documentation Generation in Jupyter Notebooks
- Authors: Xuye Liu, Dakuo Wang, April Wang, Lingfei Wu
- Abstract summary: We propose a hierarchical attention-based ConvGNN component to augment the Seq2Seq network.
We build a dataset with publicly available Kaggle notebooks and evaluate our model (HAConvGNN) against baseline models.
- Score: 33.37494243822309
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Many data scientists use Jupyter notebooks to experiment with code,
visualize results, and document rationales or interpretations. The code
documentation generation (CDG) task in notebooks is related to but different
from the code summarization task in software engineering, as one documentation
unit (markdown cell) may consist of text (an informative summary or indicative
rationale) for multiple code cells. Our work aims to solve the CDG task by
encoding the multiple code cells as separate AST graph structures, for which we
propose a hierarchical attention-based ConvGNN component to augment the Seq2Seq network.
We build a dataset with publicly available Kaggle notebooks and evaluate our
model (HAConvGNN) against baseline models (e.g., Code2Seq or Graph2Seq).
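The sketch below illustrates the core idea as described in the abstract, not the authors' released implementation: each code cell's AST is encoded as its own graph, and a two-level (node-level, then cell-level) attention summarizes the cells into a context vector that can augment a Seq2Seq decoder. The single-step graph convolution and layer sizes are simplifying assumptions.

```python
# Minimal sketch of hierarchical attention over per-cell AST graph encodings.
import torch
import torch.nn as nn
import torch.nn.functional as F

class CellGraphEncoder(nn.Module):
    def __init__(self, dim):
        super().__init__()
        self.lin = nn.Linear(dim, dim)

    def forward(self, node_feats, adj):
        # one simple graph-convolution step: aggregate neighbours, then transform
        return F.relu(self.lin(adj @ node_feats))            # (num_nodes, dim)

class HierarchicalAttention(nn.Module):
    def __init__(self, dim):
        super().__init__()
        self.node_attn = nn.Linear(dim, 1)
        self.cell_attn = nn.Linear(dim, 1)

    def forward(self, cell_node_states):
        # cell_node_states: list of (num_nodes_i, dim) tensors, one per code cell
        cell_vecs = []
        for h in cell_node_states:
            a = torch.softmax(self.node_attn(h), dim=0)      # node-level attention
            cell_vecs.append((a * h).sum(0))
        cells = torch.stack(cell_vecs)                       # (num_cells, dim)
        b = torch.softmax(self.cell_attn(cells), dim=0)      # cell-level attention
        return (b * cells).sum(0)                            # context vector
        # this context vector can then be concatenated with the Seq2Seq
        # decoder state at each generation step
```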
Related papers
- Contextualized Data-Wrangling Code Generation in Computational Notebooks [131.26365849822932]
We propose an automated approach, CoCoMine, to mine data-wrangling code generation examples with clear multi-modal contextual dependency.
We construct CoCoNote, a dataset containing 58,221 examples for Contextualized Data-wrangling Code generation in Notebooks.
Experiment results demonstrate the significance of incorporating data context in data-wrangling code generation.
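A hypothetical sketch of the mining idea (CoCoMine's actual heuristics and context selection are not reproduced here): pair each data-wrangling code cell with its preceding markdown and code cells as multi-modal context; the keyword filter below is an illustrative stand-in.

```python
# Toy miner: collect (context cells, target code cell) pairs from a notebook.
import json

def mine_examples(notebook_path, context_size=3,
                  keywords=("pd.", "df", "merge", "groupby")):
    with open(notebook_path, encoding="utf-8") as f:
        cells = json.load(f)["cells"]
    examples = []
    for i, cell in enumerate(cells):
        src = "".join(cell.get("source", []))
        if cell["cell_type"] == "code" and any(k in src for k in keywords):
            context = [("".join(c.get("source", [])), c["cell_type"])
                       for c in cells[max(0, i - context_size):i]]
            examples.append({"context": context, "target_code": src})
    return examples
```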
arXiv Detail & Related papers (2024-09-20T14:49:51Z)
- Typhon: Automatic Recommendation of Relevant Code Cells in Jupyter Notebooks [0.3122672716129843]
This paper proposes Typhon, an approach to automatically recommend relevant code cells in Jupyter notebooks.
Typhon tokenizes developers' markdown description cells and looks for the most similar code cells from the database.
We evaluated the Typhon tool on Jupyter notebooks from Kaggle competitions and found that the approach can recommend code cells with moderate accuracy.
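An illustrative sketch of the retrieval step under simple assumptions (Typhon's actual tokenization and similarity measure may differ): rank database code cells against a markdown description by TF-IDF cosine similarity.

```python
# Rank candidate code cells by similarity to a markdown description.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

def recommend(markdown_text, code_cells, top_k=3):
    vectorizer = TfidfVectorizer(token_pattern=r"[A-Za-z_]+")
    matrix = vectorizer.fit_transform(code_cells + [markdown_text])
    scores = cosine_similarity(matrix[-1], matrix[:-1]).ravel()
    ranked = scores.argsort()[::-1][:top_k]
    return [(code_cells[i], float(scores[i])) for i in ranked]

print(recommend("plot a histogram of ages",
                ["df['age'].hist()", "model.fit(X, y)", "plt.hist(data['age'])"]))
```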
arXiv Detail & Related papers (2024-05-15T03:59:59Z)
- CodeExp: Explanatory Code Document Generation [94.43677536210465]
Existing code-to-text generation models produce only high-level summaries of code.
We conduct a human study to identify the criteria for high-quality explanatory docstrings for code.
We present a multi-stage fine-tuning strategy and baseline models for the task.
arXiv Detail & Related papers (2022-11-25T18:05:44Z)
- Doc2Graph: a Task Agnostic Document Understanding Framework based on Graph Neural Networks [0.965964228590342]
We propose Doc2Graph, a task-agnostic document understanding framework based on a GNN model.
We evaluate our approach on two challenging datasets for key information extraction in form understanding, invoice layout analysis and table detection.
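A minimal sketch of the document-as-graph idea, not the Doc2Graph model itself: text regions become nodes, nearby regions are connected, and a graph layer classifies each node into an entity type; the feature dimensions and toy adjacency are assumptions.

```python
# Classify document regions (nodes) connected by spatial proximity (edges).
import torch
import torch.nn as nn
import torch.nn.functional as F

class DocGraphClassifier(nn.Module):
    def __init__(self, in_dim, hidden, num_classes):
        super().__init__()
        self.gc = nn.Linear(in_dim, hidden)
        self.out = nn.Linear(hidden, num_classes)

    def forward(self, node_feats, adj):
        h = F.relu(self.gc(adj @ node_feats))   # aggregate neighbouring regions
        return self.out(h)                      # per-node entity logits

# toy usage: 4 regions, 8-dim features, 4 entity classes
feats = torch.randn(4, 8)
adj = torch.eye(4) + torch.tensor([[0, 1, 0, 0], [1, 0, 1, 0],
                                   [0, 1, 0, 1], [0, 0, 1, 0]], dtype=torch.float)
logits = DocGraphClassifier(8, 16, 4)(feats, adj)
```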
arXiv Detail & Related papers (2022-08-23T19:48:10Z)
- Graph Spring Network and Informative Anchor Selection for Session-based Recommendation [2.6524289609910654]
Session-based recommendation (SBR) aims at predicting the next item for an ongoing anonymous session.
The major challenge of SBR is capturing richer relations between items and learning ID-based item embeddings that reflect those relations.
We propose a new graph neural network, called Graph Spring Network (GSN), for learning ID-based item embedding on an item graph.
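A rough sketch of the setting only (the actual GSN update and its informative anchor selection are not shown here): build an item-transition graph from sessions and refine ID-based item embeddings by propagating them over that graph.

```python
# Build an item co-occurrence graph from sessions and propagate embeddings.
import torch
import torch.nn.functional as F

def build_item_graph(sessions, num_items):
    adj = torch.zeros(num_items, num_items)
    for s in sessions:
        for a, b in zip(s, s[1:]):      # consecutive items in a session
            adj[a, b] += 1.0
            adj[b, a] += 1.0
    return adj / adj.sum(dim=1, keepdim=True).clamp(min=1.0)

num_items, dim = 5, 8
embeddings = torch.nn.Embedding(num_items, dim)
adj = build_item_graph([[0, 1, 2], [1, 3, 4, 1]], num_items)
refined = F.normalize(adj @ embeddings.weight + embeddings.weight, dim=-1)
```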
arXiv Detail & Related papers (2022-02-19T02:47:44Z)
- Text Classification for Task-based Source Code Related Questions [0.0]
StackOverflow provides solutions as small code snippets that give a complete answer to whatever task the developer wants to code.
We develop a two-fold deep learning model: Seq2Seq and a binary classifier that takes in the intent (which is in natural language) and code snippets in Python.
We find that the hidden state layer's embeddings perform slightly better than regular standard embeddings from a constructed vocabulary.
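A hedged sketch of the binary-classifier half of such a pipeline: encode the natural-language intent and a candidate Python snippet, then predict whether they match. The GRU encoders and sizes are illustrative choices, not the paper's configuration.

```python
# Predict whether a code snippet answers a natural-language intent.
import torch
import torch.nn as nn

class IntentCodeMatcher(nn.Module):
    def __init__(self, vocab_size, dim=64):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, dim)
        self.intent_enc = nn.GRU(dim, dim, batch_first=True)
        self.code_enc = nn.GRU(dim, dim, batch_first=True)
        self.classifier = nn.Linear(2 * dim, 1)

    def forward(self, intent_ids, code_ids):
        _, h_i = self.intent_enc(self.embed(intent_ids))   # final hidden states
        _, h_c = self.code_enc(self.embed(code_ids))
        pair = torch.cat([h_i[-1], h_c[-1]], dim=-1)
        return torch.sigmoid(self.classifier(pair))        # match probability

model = IntentCodeMatcher(vocab_size=1000)
prob = model(torch.randint(0, 1000, (2, 12)), torch.randint(0, 1000, (2, 20)))
```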
arXiv Detail & Related papers (2021-10-31T20:10:21Z)
- Deep Graph Matching and Searching for Semantic Code Retrieval [76.51445515611469]
We propose an end-to-end deep graph matching and searching model based on graph neural networks.
We first represent both natural language query texts and programming language code snippets with the unified graph-structured data.
In particular, DGMS not only captures more structural information for individual query texts or code snippets but also learns the fine-grained similarity between them.
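A simplified sketch of cross-graph matching between a query-text graph and a code graph (not the full DGMS architecture): attend across graphs and pool node states into a single similarity score.

```python
# Cross-graph attention followed by pooling into a similarity score.
import torch
import torch.nn.functional as F

def cross_graph_similarity(query_nodes, code_nodes):
    # query_nodes: (m, d), code_nodes: (n, d)
    attn = torch.softmax(query_nodes @ code_nodes.T, dim=-1)   # (m, n)
    matched = attn @ code_nodes          # query nodes re-expressed by code nodes
    q = F.normalize(query_nodes.mean(0), dim=-1)
    c = F.normalize(matched.mean(0), dim=-1)
    return torch.dot(q, c)               # cosine-style similarity

score = cross_graph_similarity(torch.randn(6, 32), torch.randn(10, 32))
```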
arXiv Detail & Related papers (2020-10-24T14:16:50Z)
- GraphCodeBERT: Pre-training Code Representations with Data Flow [97.00641522327699]
We present GraphCodeBERT, a pre-trained model for programming language that considers the inherent structure of code.
We use data flow in the pre-training stage, which is a semantic-level structure of code that encodes the relation of "where-the-value-comes-from" between variables.
We evaluate our model on four tasks, including code search, clone detection, code translation, and code refinement.
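For illustration only, a toy extraction of "where-the-value-comes-from" edges (each variable use linked to its most recent assignment) from a Python snippet, the kind of data-flow signal described above; GraphCodeBERT's real data-flow graph construction is more complete and language-independent.

```python
# Toy data-flow extraction: link variable reads to their latest assignment line.
import ast

def value_flow_edges(code):
    tree = ast.parse(code)
    # visit Name nodes in source order (line, column)
    names = sorted((n for n in ast.walk(tree) if isinstance(n, ast.Name)),
                   key=lambda n: (n.lineno, n.col_offset))
    last_def, edges = {}, []
    for n in names:
        if isinstance(n.ctx, ast.Store):
            last_def[n.id] = n.lineno
        elif isinstance(n.ctx, ast.Load) and n.id in last_def:
            edges.append((n.id, last_def[n.id], n.lineno))
    return edges  # (variable, defined-at-line, used-at-line)

print(value_flow_edges("x = 1\ny = x + 2\nz = y * x\n"))
```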
arXiv Detail & Related papers (2020-09-17T15:25:56Z)
- Learning to map source code to software vulnerability using code-as-a-graph [67.62847721118142]
We explore the applicability of Graph Neural Networks in learning the nuances of source code from a security perspective.
We show that a code-as-graph encoding is more meaningful for vulnerability detection than existing code-as-photo and linear sequence encoding approaches.
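A hedged sketch of a code-as-graph encoding: turn a function's AST into labeled nodes and parent-child edges that a graph classifier could consume; the paper's exact graph construction and model are not reproduced here.

```python
# Convert source code into a (node labels, edge list) graph via its AST.
import ast

def code_to_graph(source):
    tree = ast.parse(source)
    nodes = list(ast.walk(tree))
    index = {id(n): i for i, n in enumerate(nodes)}
    labels = [type(n).__name__ for n in nodes]          # node label = AST type
    edges = [(index[id(p)], index[id(c)])               # parent -> child edges
             for p in nodes for c in ast.iter_child_nodes(p)]
    return labels, edges

labels, edges = code_to_graph("def copy(buf, data):\n    buf[:len(data)] = data\n")
```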
arXiv Detail & Related papers (2020-06-15T16:05:27Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of this list (including all information) and is not responsible for any consequences of its use.