Self-Supervised Learning to Prove Equivalence Between Straight-Line
Programs via Rewrite Rules
- URL: http://arxiv.org/abs/2109.10476v4
- Date: Sun, 9 Jul 2023 02:22:20 GMT
- Title: Self-Supervised Learning to Prove Equivalence Between Straight-Line
Programs via Rewrite Rules
- Authors: Steve Kommrusch, Martin Monperrus and Louis-Noël Pouchet
- Abstract summary: Two programs are equivalent if there exists a sequence of applications of rewrite rules that rewrites one program into the other.
We propose a neural network architecture based on a transformer model to generate proofs of equivalence between program pairs.
Our system, S4Eq, achieves 97% proof success on a curated dataset of 10,000 pairs of equivalent programs.
- Score: 9.1570563482476
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We target the problem of automatically synthesizing proofs of semantic
equivalence between two programs made of sequences of statements. We represent
programs using abstract syntax trees (ASTs), where a given set of
semantics-preserving rewrite rules can be applied on a specific AST pattern to
generate a transformed and semantically equivalent program. In our system, two
programs are equivalent if there exists a sequence of applications of these
rewrite rules that rewrites one program into the other. We propose a
neural network architecture based on a transformer model to generate proofs of
equivalence between program pairs. The system outputs a sequence of rewrites,
and the validity of the sequence is simply checked by verifying it can be
applied. If no valid sequence is produced by the neural network, the system
reports the programs as non-equivalent, ensuring by design that no programs are
incorrectly reported as equivalent. Our system is fully implemented for a
single grammar which can represent straight-line programs with function calls
and multiple types. To efficiently train the system to generate such sequences,
we develop an original incremental training technique, named self-supervised
sample selection. We extensively study the effectiveness of this novel training
approach on proofs of increasing complexity and length. Our system, S4Eq,
achieves 97% proof success on a curated dataset of 10,000 pairs of equivalent
programs.
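The verification step and the self-supervised sample selection loop described above can be illustrated with a short sketch. This is a minimal illustration under assumptions, not the authors' implementation: the names Rule, verify_proof, self_supervised_sample_selection, model.propose_proofs, and train_step are hypothetical, and programs are stubbed as strings rather than ASTs.

```python
# Minimal sketch (not the S4Eq implementation) of two ideas from the abstract:
# (1) a proof of equivalence is a sequence of rewrite rules; it is valid only
#     if every rewrite applies in order and the result equals the target;
# (2) self-supervised sample selection keeps only model-generated proofs that
#     verify, and feeds them back as new training examples.
# Rule, model.propose_proofs, and train_step are hypothetical names.

from dataclasses import dataclass
from typing import Callable, Optional, Sequence

Program = str  # placeholder: a string stands in for an AST in this sketch


@dataclass
class Rule:
    name: str
    # Returns the rewritten program, or None if the rule's pattern does not match.
    apply: Callable[[Program], Optional[Program]]


def verify_proof(source: Program, target: Program, proof: Sequence[Rule]) -> bool:
    """Replay a candidate proof: every rule must apply, and the final
    program must equal the target. A proof that fails either check is
    rejected, so inequivalent programs are never reported as equivalent."""
    current = source
    for rule in proof:
        nxt = rule.apply(current)
        if nxt is None:  # pattern did not match -> invalid proof
            return False
        current = nxt
    return current == target


def self_supervised_sample_selection(model, pairs, train_step, rounds=3, k=8):
    """Incremental training loop (hypothetical API): sample k candidate
    proofs per program pair, keep only those that verify, and use them
    as additional supervised targets for the next round."""
    for _ in range(rounds):
        selected = []
        for src, tgt in pairs:
            for proof in model.propose_proofs(src, tgt, num_samples=k):
                if verify_proof(src, tgt, proof):
                    selected.append((src, tgt, proof))
                    break  # one verified proof per pair is enough
        train_step(model, selected)
```

Because a proof is accepted only when every rewrite applies and the final program matches the target, this kind of verifier yields the zero-false-positive property claimed in the abstract: an invalid or incomplete rewrite sequence simply causes the pair to be reported as non-equivalent.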
Related papers
- GPT is becoming a Turing machine: Here are some ways to program it [16.169056235216576]
We show that GPT-3 models can be triggered to execute programs that involve loops.
We show that prompts that may not even cover one full task example can trigger algorithmic behaviour.
arXiv Detail & Related papers (2023-03-25T00:43:41Z) - Improved Tree Search for Automatic Program Synthesis [91.3755431537592]
A key element is being able to perform an efficient search in the space of valid programs.
Here, we suggest a variant of MCTS that leads to state of the art results on two vastly different DSLs.
arXiv Detail & Related papers (2023-03-13T15:09:52Z) - Hierarchical Phrase-based Sequence-to-Sequence Learning [94.10257313923478]
We describe a neural transducer that maintains the flexibility of standard sequence-to-sequence (seq2seq) models while incorporating hierarchical phrases as a source of inductive bias during training and as explicit constraints during inference.
Our approach trains two models: a discriminative parser based on a bracketing grammar whose derivation tree hierarchically aligns source and target phrases, and a neural seq2seq model that learns to translate the aligned phrases one-by-one.
arXiv Detail & Related papers (2022-11-15T05:22:40Z) - Learning from Self-Sampled Correct and Partially-Correct Programs [96.66452896657991]
We propose to let the model perform sampling during training and learn from both self-sampled fully-correct programs and partially-correct programs.
We show that our use of self-sampled correct and partially-correct programs can benefit learning and help guide the sampling process.
Our proposed method improves the pass@k performance by 3.1% to 12.3% compared to learning from a single reference program with MLE.
arXiv Detail & Related papers (2022-05-28T03:31:07Z) - Discovering Non-monotonic Autoregressive Orderings with Variational
Inference [67.27561153666211]
We develop an unsupervised parallelizable learner that discovers high-quality generation orders purely from training data.
We implement the encoder as a Transformer with non-causal attention that outputs permutations in one forward pass.
Empirical results in language modeling tasks demonstrate that our method is context-aware and discovers orderings that are competitive with or even better than fixed orders.
arXiv Detail & Related papers (2021-10-27T16:08:09Z) - Searching for More Efficient Dynamic Programs [61.79535031840558]
We describe a set of program transformations, a simple metric for assessing the efficiency of a transformed program, and a search procedure to improve this metric.
We show that in practice, automated search can find substantial improvements to the initial program.
arXiv Detail & Related papers (2021-09-14T20:52:55Z) - Structured Reordering for Modeling Latent Alignments in Sequence
Transduction [86.94309120789396]
We present an efficient dynamic programming algorithm performing exact marginal inference of separable permutations.
The resulting seq2seq model exhibits better systematic generalization than standard models on synthetic problems and NLP tasks.
arXiv Detail & Related papers (2021-06-06T21:53:54Z) - Proving Equivalence Between Complex Expressions Using Graph-to-Sequence
Neural Models [0.0]
We develop a graph-to-sequence neural network system for program equivalence.
We extensively evaluate our system on a rich multi-type linear algebra expression language.
Our machine learning system guarantees that every reported equivalence is correct, ensuring zero false positives by design.
arXiv Detail & Related papers (2021-06-01T20:45:42Z) - Neural Program Synthesis with a Differentiable Fixer [44.48509453344902]
We present a new program synthesis approach that combines an encoder-decoder based synthesis architecture with a differentiable program fixer.
We train our architecture end-to-end on the RobustFill domain, and show that the addition of the fixer module leads to a significant improvement on synthesis accuracy.
arXiv Detail & Related papers (2020-06-19T01:49:46Z) - Equivalence of Dataflow Graphs via Rewrite Rules Using a
Graph-to-Sequence Neural Model [0.0]
We formalize the problem of equivalence between two programs as finding a sequence of semantics-preserving rewrite rules that rewrites one program into the other.
We then develop the first graph-to-sequence neural network system for program equivalence.
arXiv Detail & Related papers (2020-02-17T06:43:00Z)
This list is automatically generated from the titles and abstracts of the papers in this site.