Transformer with Tree-order Encoding for Neural Program Generation
- URL: http://arxiv.org/abs/2206.13354v1
- Date: Mon, 30 May 2022 12:27:48 GMT
- Title: Transformer with Tree-order Encoding for Neural Program Generation
- Authors: Klaudia-Doris Thellmann, Bernhard Stadler, Ricardo Usbeck, Jens
Lehmann
- Abstract summary: We introduce a tree-based positional encoding and a shared natural-language subword vocabulary for Transformers.
Our findings suggest that employing a tree-based positional encoding in combination with a shared natural-language subword vocabulary improves generation performance over sequential positional encodings.
- Score: 8.173517923612426
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: While a considerable number of semantic parsing approaches have
employed RNN architectures for code generation tasks, there have been only a few
attempts to investigate the applicability of Transformers to this task. Including
hierarchical information of the underlying programming language syntax has
proven to be effective for code generation. Since the positional encoding of
the Transformer can only represent positions in a flat sequence, we have
extended the encoding scheme to allow the attention mechanism to also attend
over hierarchical positions in the input. Furthermore, we have realized a
decoder based on a restrictive grammar graph model to improve the generation
accuracy and ensure the well-formedness of the generated code. While we did not
surpass the state of the art, our findings suggest that employing a tree-based
positional encoding in combination with a shared natural-language subword
vocabulary improves generation performance over sequential positional
encodings.
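The abstract only sketches the mechanism, so below is a minimal illustrative sketch (not the authors' implementation) of one common way to realize a tree-based positional encoding: each AST node is identified by its root-to-node path of child indices, every (depth, child-index) step receives a learned embedding, and the summed path embeddings stand in for the flat sequential position. Names such as TreePositionalEncoding, max_depth, and max_children are assumptions for illustration.

```python
# Minimal sketch (assumed, not the authors' code): a path-based tree positional
# encoding in PyTorch. Each AST node is described by its root-to-node path of
# child indices; each (depth, child-index) pair has a learned embedding, and the
# embeddings along the path are summed into the node's positional vector.
import torch
import torch.nn as nn


class TreePositionalEncoding(nn.Module):
    def __init__(self, d_model: int, max_depth: int = 32, max_children: int = 64):
        super().__init__()
        # One embedding table per tree level; index 0 is padding for nodes
        # whose path is shorter than max_depth.
        self.level_embeddings = nn.ModuleList(
            [nn.Embedding(max_children + 1, d_model, padding_idx=0)
             for _ in range(max_depth)]
        )
        self.d_model = d_model

    def forward(self, paths: torch.Tensor) -> torch.Tensor:
        """paths: (batch, seq_len, max_depth) of 1-based child indices, 0-padded."""
        encoding = torch.zeros(*paths.shape[:2], self.d_model, device=paths.device)
        for level, embedding in enumerate(self.level_embeddings):
            encoding = encoding + embedding(paths[..., level])
        return encoding


# Usage: token_embeddings + tree_pe(paths) feeds the Transformer in place of
# (or alongside) the flat sequential positional encoding.
```

The well-formedness guarantee mentioned in the abstract is a separate mechanism: grammar-constrained decoders typically mask out, at each generation step, any production or token that the programming-language grammar does not permit in the current state.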
Related papers
- Theoretical Analysis of Hierarchical Language Recognition and Generation by Transformers without Positional Encoding [32.01426831450348]
We show that causal masking and a starting token enable Transformers to compute positional information and depth within hierarchical structures.
We demonstrate that Transformers without positional encoding can generate hierarchical languages.
arXiv Detail & Related papers (2024-10-16T09:56:01Z)
- Improving Transformers using Faithful Positional Encoding [55.30212768657544]
We propose a new positional encoding method for a neural network architecture called the Transformer.
Unlike the standard sinusoidal positional encoding, our approach has a guarantee of not losing information about the positional order of the input sequence.
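For reference (added context, not part of the summary), the standard sinusoidal positional encoding contrasted here is the fixed encoding from the original Transformer, which maps a flat sequence position pos and dimension index i to:

```latex
% Standard sinusoidal positional encoding (Vaswani et al., 2017)
PE_{(pos,\,2i)}   = \sin\!\bigl(pos / 10000^{2i/d_{\mathrm{model}}}\bigr), \qquad
PE_{(pos,\,2i+1)} = \cos\!\bigl(pos / 10000^{2i/d_{\mathrm{model}}}\bigr)
```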
arXiv Detail & Related papers (2024-05-15T03:17:30Z)
- Decoder-Only or Encoder-Decoder? Interpreting Language Model as a Regularized Encoder-Decoder [75.03283861464365]
The seq2seq task aims at generating the target sequence based on the given input source sequence.
Traditionally, most seq2seq tasks are solved with an encoder that encodes the source sequence and a decoder that generates the target text.
Recently, a number of new approaches have emerged that apply decoder-only language models directly to the seq2seq task.
arXiv Detail & Related papers (2023-04-08T15:44:29Z)
- Planning with Large Language Models for Code Generation [100.07232672883897]
Planning-Guided Transformer Decoding (PG-TD) uses a planning algorithm to do lookahead search and guide the Transformer to generate better programs.
We empirically evaluate our framework with several large language models as backbones on public coding challenge benchmarks.
arXiv Detail & Related papers (2023-03-09T18:59:47Z)
- Towards More Efficient Insertion Transformer with Fractional Positional Encoding [44.45401243989363]
Auto-regressive neural sequence models have been shown to be effective across text generation tasks.
However, their left-to-right decoding order prevents generation from being parallelized.
Insertion Transformer is an attractive alternative that allows outputting multiple tokens in a single generation step.
arXiv Detail & Related papers (2021-12-12T18:38:27Z)
- Less is More: Pre-training a Strong Siamese Encoder Using a Weak Decoder [75.84152924972462]
Many real-world applications use Siamese networks to efficiently match text sequences at scale.
This paper pre-trains language models dedicated to sequence matching in Siamese architectures.
arXiv Detail & Related papers (2021-02-18T08:08:17Z)
- On Efficient Training, Controllability and Compositional Generalization of Insertion-based Language Generators [18.98725770517241]
InsNet is an insertion-based sequence model that can be trained as efficiently as transformer decoders.
We evaluate InsNet on story generation and CleVR-CoGENT captioning.
arXiv Detail & Related papers (2021-02-12T11:05:02Z)
- Cross-Thought for Sentence Encoder Pre-training [89.32270059777025]
Cross-Thought is a novel approach to pre-training a sequence encoder.
We train a Transformer-based sequence encoder over a large set of short sequences.
Experiments on question answering and textual entailment tasks demonstrate that our pre-trained encoder can outperform state-of-the-art encoders.
arXiv Detail & Related papers (2020-10-07T21:02:41Z)
- Bi-Decoder Augmented Network for Neural Machine Translation [108.3931242633331]
We propose a novel Bi-Decoder Augmented Network (BiDAN) for the neural machine translation task.
Since each decoder transforms the representations of the input text into its corresponding language, jointly training with two target ends gives the shared encoder the potential to produce a language-independent semantic space.
arXiv Detail & Related papers (2020-01-14T02:05:14Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information provided and is not responsible for any consequences of its use.