Software Language Comprehension using a Program-Derived Semantics Graph
- URL: http://arxiv.org/abs/2004.00768v3
- Date: Fri, 11 Dec 2020 17:48:56 GMT
- Title: Software Language Comprehension using a Program-Derived Semantics Graph
- Authors: Roshni G. Iyer, Yizhou Sun, Wei Wang, Justin Gottschlich
- Abstract summary: We present the program-derived semantics graph (PSG), a new structure for capturing the semantics of code.
The PSG is designed to provide a single structure for capturing program semantics at multiple levels of abstraction.
Although our exploration into the PSG is in its infancy, our early results and architectural analysis indicate it is a promising new research direction to automatically extract program semantics.
- Score: 29.098303489400394
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Traditional code transformation structures, such as abstract syntax trees (ASTs), conteXtual flow graphs (XFGs), and, more generally, compiler intermediate representations (IRs), may have limitations in extracting higher-order semantics from code. While work has already begun on higher-order semantics lifting (e.g., Aroma's simplified parse tree (SPT), verified lifting's lambda calculi, and Halide's intentional domain-specific language (DSL)), research in this area is still immature. To continue to advance this research, we present the program-derived semantics graph (PSG), a new graphical structure to capture the semantics of code. The PSG is designed to provide a single structure for capturing program semantics at multiple levels of abstraction. The PSG may be in a class of emerging structural representations that cannot be built from a traditional set of predefined rules and instead must be learned. In this paper, we describe the PSG and its fundamental structural differences compared to state-of-the-art structures. Although our exploration into the PSG is in its infancy, our early results and architectural analysis indicate it is a promising new research direction for automatically extracting program semantics.
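As a concrete illustration of the kind of traditional structure the abstract refers to, the following minimal sketch (ours, not the paper's) uses Python's built-in `ast` module to parse two semantically equivalent programs. Their ASTs differ, which is exactly the gap between syntax and higher-order semantics that the PSG targets.

```python
import ast

# Two semantically equivalent programs: both compute the sum of xs.
src_loop = "total = 0\nfor x in xs:\n    total += x\n"
src_call = "total = sum(xs)\n"

tree_loop = ast.parse(src_loop)
tree_call = ast.parse(src_call)

# The syntactic trees differ even though the programs agree in meaning,
# so comparing ASTs fails to capture the shared higher-order semantics.
print(ast.dump(tree_loop, indent=2))
print(ast.dump(tree_call, indent=2))
print(ast.dump(tree_loop) == ast.dump(tree_call))  # False
```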
Related papers
- Physics of Language Models: Part 1, Learning Hierarchical Language Structures [51.68385617116854]
Transformer-based language models are effective but complex, and understanding their inner workings is a significant challenge.
We introduce a family of synthetic CFGs with hierarchical rules, capable of generating lengthy sentences.
We demonstrate that generative models like GPT can accurately learn this CFG language and generate sentences based on it.
arXiv Detail & Related papers (2023-05-23T04:28:16Z)
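To make the idea of sampling from a synthetic CFG concrete, here is a toy sketch; the grammar below is our invention for illustration, not the paper's.

```python
import random

# Nonterminals map to lists of possible right-hand sides; any string
# not present as a key is treated as a terminal token.
GRAMMAR = {
    "S":   [["NP", "VP"]],
    "NP":  [["the", "N"], ["the", "ADJ", "N"]],
    "VP":  [["V", "NP"]],
    "N":   [["graph"], ["parser"]],
    "ADJ": [["small"]],
    "V":   [["builds"], ["reads"]],
}

def sample(symbol="S"):
    """Recursively expand a symbol into a list of terminal tokens."""
    if symbol not in GRAMMAR:
        return [symbol]
    rhs = random.choice(GRAMMAR[symbol])
    return [tok for sym in rhs for tok in sample(sym)]

print(" ".join(sample()))  # e.g. "the small graph builds the parser"
```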
- Outline, Then Details: Syntactically Guided Coarse-To-Fine Code Generation [61.50286000143233]
ChainCoder is a program synthesis language model that generates Python code progressively.
A tailored transformer architecture is leveraged to jointly encode the natural language descriptions and syntactically aligned I/O data samples.
arXiv Detail & Related papers (2023-04-28T01:47:09Z)
- LAGr: Label Aligned Graphs for Better Systematic Generalization in Semantic Parsing [7.2484012208081205]
We show that better systematic generalization can be achieved by producing the meaning representation directly as a graph rather than as a sequence.
We propose LAGr, a general framework that produces semantic parses by independently predicting node and edge labels for a complete multi-layer input-aligned graph.
Experiments demonstrate that LAGr achieves significant improvements in systematic generalization over seq2seq baselines in both strongly and weakly supervised settings.
arXiv Detail & Related papers (2022-05-19T15:01:37Z)
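A schematic sketch of what "independently predicting node and edge labels" over an input-aligned graph could look like, based only on our reading of the abstract; the scoring functions stand in for the learned model and are invented for illustration.

```python
from itertools import permutations

tokens = ["the", "cat", "sleeps"]
NODE_LABELS = ["entity", "predicate", "null"]
EDGE_LABELS = ["agent", "none"]

def node_score(token, label):
    # Stand-in for a learned per-token scorer.
    return 1.0 if (label == "predicate") == (token == "sleeps") else 0.0

def edge_score(i, j, label):
    # Stand-in for a learned scorer over ordered token pairs.
    return 1.0 if (label == "agent") == ((i, j) == (2, 1)) else 0.0

# Each node label and each edge label is chosen independently (argmax),
# producing a graph aligned one-to-one with the input tokens.
nodes = [max(NODE_LABELS, key=lambda l: node_score(t, l)) for t in tokens]
edges = {(i, j): max(EDGE_LABELS, key=lambda l: edge_score(i, j, l))
         for i, j in permutations(range(len(tokens)), 2)}
print(nodes)          # ['entity', 'entity', 'predicate']
print(edges[(2, 1)])  # 'agent': the predicate's agent is token 1 ('cat')
```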
- Graph Adaptive Semantic Transfer for Cross-domain Sentiment Classification [68.06496970320595]
Cross-domain sentiment classification (CDSC) aims to use the transferable semantics learned from the source domain to predict the sentiment of reviews in the unlabeled target domain.
We present the Graph Adaptive Semantic Transfer (GAST) model, an adaptive syntactic graph embedding method that learns domain-invariant semantics from both word sequences and syntactic graphs.
arXiv Detail & Related papers (2022-05-18T07:47:01Z)
- Incorporating Constituent Syntax for Coreference Resolution [50.71868417008133]
We propose a graph-based method to incorporate constituent syntactic structures.
We also explore utilising higher-order neighbourhood information to encode rich structures in constituent trees.
Experiments on the English and Chinese portions of the OntoNotes 5.0 benchmark show that our proposed model either beats a strong baseline or achieves new state-of-the-art performance.
arXiv Detail & Related papers (2022-02-22T07:40:42Z)
- LAGr: Labeling Aligned Graphs for Improving Systematic Generalization in Semantic Parsing [6.846638912020957]
We show that better systematic generalization can be achieved by producing the meaning representation directly as a graph rather than as a sequence.
We propose LAGr, the Labeling Aligned Graphs algorithm, which produces semantic parses by predicting node and edge labels for a complete multi-layer input-aligned graph.
arXiv Detail & Related papers (2021-10-14T17:37:04Z)
- Hierarchical Poset Decoding for Compositional Generalization in Language [52.13611501363484]
We formalize human language understanding as a structured prediction task where the output is a partially ordered set (poset).
Current encoder-decoder architectures do not properly take the poset structure of semantics into account.
We propose a novel hierarchical poset decoding paradigm for compositional generalization in language.
arXiv Detail & Related papers (2020-10-15T14:34:26Z)
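The poset idea can be made concrete with a small sketch (ours, using hypothetical query clauses): represent the output as a DAG whose topological orders are all valid linearizations, rather than committing to one arbitrary sequence.

```python
from graphlib import TopologicalSorter

# "SELECT name" and "SELECT age" are unordered siblings; both depend on
# the shared "FROM users" clause, giving a partial (not total) order.
poset = {
    "SELECT name": {"FROM users"},
    "SELECT age": {"FROM users"},
    "FROM users": set(),
}
print(list(TopologicalSorter(poset).static_order()))
# e.g. ['FROM users', 'SELECT name', 'SELECT age']; sibling order is free
```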
- Graph-Structured Referring Expression Reasoning in The Wild [105.95488002374158]
Grounding referring expressions aims to locate in an image an object referred to by a natural language expression.
We propose a scene graph guided modular network (SGMN) to perform reasoning over a semantic graph and a scene graph.
We also propose Ref-Reasoning, a large-scale real-world dataset for structured referring expression reasoning.
arXiv Detail & Related papers (2020-04-19T11:00:30Z)
- Context-Aware Parse Trees [18.77504064534521]
We present a new tree structure, heavily influenced by Aroma's SPT, called a context-aware parse tree (CAPT).
CAPT enhances SPT by providing a richer level of semantic representation.
Our research quantitatively demonstrates the value of our proposed semantically-salient features, enabling a specific CAPT configuration to be 39% more accurate than SPT across the 48,610 programs we analyzed.
arXiv Detail & Related papers (2020-03-24T21:19:14Z)
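As an illustration of what "semantically-salient features" might mean in practice, here is a minimal sketch in the spirit of CAPT (our reading, not the authors' implementation): one simplification pass that anonymizes identifiers in a Python AST, discarding a syntactic detail that rarely changes program semantics.

```python
import ast

class AnonymizeNames(ast.NodeTransformer):
    """Drop identifier spelling, a detail with little semantic salience."""
    def visit_Name(self, node):
        # Replace every identifier with a placeholder token.
        return ast.copy_location(ast.Name(id="VAR", ctx=node.ctx), node)

src = "total = 0\nfor x in xs:\n    total += x\n"
tree = ast.fix_missing_locations(AnonymizeNames().visit(ast.parse(src)))
print(ast.unparse(tree))  # VAR = 0 / for VAR in VAR: VAR += VAR
```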
- Vector symbolic architectures for context-free grammars [0.5862282909017474]
Vector symbolic architectures (VSA) are a viable approach for the hyperdimensional representation of symbolic data.
We present a rigorous framework for the representation of phrase structure trees and parse trees of context-free grammars (CFG) in Fock space.
Our approach could leverage the development of VSA for explainable artificial intelligence (XAI) by means of hyperdimensional deep neural computation.
arXiv Detail & Related papers (2020-03-11T09:07:02Z)
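A rough sketch of the core VSA operation: bind role and filler hypervectors and superpose the bindings, so a constituent can be approximately recovered by unbinding. We use elementwise multiplication of random ±1 vectors for simplicity, which is one common VSA choice, not necessarily the paper's Fock-space construction.

```python
import numpy as np

rng = np.random.default_rng(0)
D = 10_000                     # hypervector dimensionality

def vec():
    """Random +/-1 hypervector; elementwise product acts as binding."""
    return rng.choice([-1.0, 1.0], size=D)

left, right, NP, VP = vec(), vec(), vec(), vec()

# Encode a tiny tree S -> (NP, VP) as a superposition of role-filler
# bindings; binding with +/-1 vectors is its own inverse (v * v = 1).
S = left * NP + right * VP

# Unbinding the 'left' role recovers a noisy copy of NP.
recovered = left * S
print(recovered @ NP / D)  # ~1.0: high similarity to NP
print(recovered @ VP / D)  # ~0.0: near-orthogonal to VP
```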
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed content (including all information) and is not responsible for any consequences of its use.