Learning to Find Proofs and Theorems by Learning to Refine Search Strategies
- URL: http://arxiv.org/abs/2205.14229v1
- Date: Fri, 27 May 2022 20:48:40 GMT
- Title: Learning to Find Proofs and Theorems by Learning to Refine Search Strategies
- Authors: Jonathan Laurent and André Platzer
- Abstract summary: An AlphaZero-style agent is self-training to refine a high-level expert strategy expressed as a nondeterministic program.
An analogous teacher agent is self-training to generate tasks of suitable relevance and difficulty for the learner.
- Score: 0.9137554315375919
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We propose a new approach to automated theorem proving and deductive program
synthesis where an AlphaZero-style agent is self-training to refine a
high-level expert strategy expressed as a nondeterministic program. An
analogous teacher agent is self-training to generate tasks of suitable
relevance and difficulty for the learner. This allows leveraging minimal
amounts of domain knowledge to tackle problems for which training data is
unavailable or hard to synthesize. We illustrate our approach on the problem of
loop invariant synthesis for imperative programs and using neural networks to
refine both the teacher and solver strategies.
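The core loop is easiest to see in code. Below is a minimal sketch under our own toy assumptions, not the authors' implementation: the expert strategy is a nondeterministic program that exposes its choice points, and a learned policy (here a hand-written stand-in called `policy_score`) decides which branch the search tries first. All names are illustrative.

```python
# Minimal sketch (our illustration, not the paper's code): the expert
# strategy is a nondeterministic program; a learned policy resolves its
# choice points by ranking the candidate branches.

def search(state, strategy, policy_score, depth=0, max_depth=8):
    """Depth-first search that tries the branches of each choice point
    in the order preferred by the (stand-in) learned policy."""
    if depth > max_depth:
        return None
    outcome = strategy(state)
    if outcome == "success":
        return []
    if outcome == "failure":
        return None
    # `outcome` is a choice point: a list of (action, next_state) pairs.
    for action, next_state in sorted(outcome, key=lambda c: -policy_score(state, c[0])):
        rest = search(next_state, strategy, policy_score, depth + 1, max_depth)
        if rest is not None:
            return [action] + rest
    return None

# Toy stand-in for a proof search: reach a target integer from 1 using
# the "rules" +1 and *2; the "policy" prefers actions that reduce the
# distance to the target.
TARGET = 10

def toy_strategy(n):
    if n == TARGET:
        return "success"
    if n > TARGET:
        return "failure"
    return [("+1", n + 1), ("*2", n * 2)]

def toy_policy(n, action):
    nxt = n + 1 if action == "+1" else n * 2
    return -abs(TARGET - nxt)

print(search(1, toy_strategy, toy_policy))  # ['+1', '*2', '*2', '+1', '+1']
```

In the paper's setting, self-play training would replace `toy_policy` with a neural network, and the teacher agent would generate the tasks on which it is trained.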
Related papers
- Learning to Steer Markovian Agents under Model Uncertainty [23.603487812521657]
We study how to design additional rewards to steer multi-agent systems towards desired policies.
Motivated by the limitations of existing work, we consider a new category of learning dynamics called Markovian agents.
We learn a history-dependent steering strategy to handle the inherent model uncertainty about the agents' learning dynamics.
arXiv Detail & Related papers (2024-07-14T14:01:38Z)
- Contractual Reinforcement Learning: Pulling Arms with Invisible Hands [68.77645200579181]
We propose a theoretical framework for aligning economic interests of different stakeholders in the online learning problems through contract design.
For the planning problem, we design an efficient dynamic programming algorithm to determine the optimal contracts against the far-sighted agent.
For the learning problem, we introduce a generic design of no-regret learning algorithms to untangle the challenges from robust design of contracts to the balance of exploration and exploitation.
arXiv Detail & Related papers (2024-07-01T16:53:00Z)
- Decentralized Learning Strategies for Estimation Error Minimization with Graph Neural Networks [94.2860766709971]
We address the challenge of sampling and remote estimation for autoregressive Markovian processes in a wireless network with statistically-identical agents.
Our goal is to minimize time-average estimation error and/or age of information with decentralized scalable sampling and transmission policies.
arXiv Detail & Related papers (2024-04-04T06:24:11Z)
- RLIF: Interactive Imitation Learning as Reinforcement Learning [56.997263135104504]
We show how off-policy reinforcement learning can enable improved performance under assumptions that are similar but potentially even more practical than those of interactive imitation learning.
Our proposed method uses reinforcement learning with user intervention signals themselves as rewards.
This relaxes the assumption that intervening experts in interactive imitation learning should be near-optimal and enables the algorithm to learn behaviors that improve over the potential suboptimal human expert.
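As a concrete illustration of the intervention-as-reward idea, here is a much-simplified tabular sketch (ours; the actual method uses off-policy deep RL rather than tabular Q-learning, and all names below are illustrative): the only reward is the intervention signal itself.

```python
# Simplified tabular sketch (ours, not the authors' implementation):
# the intervention signal is the only reward, so the learner is pushed
# away from states that make the expert step in.
from collections import defaultdict

def rlif_update(Q, s, a, s_next, intervened, actions=(0, 1), alpha=0.1, gamma=0.99):
    reward = -1.0 if intervened else 0.0  # the intervention itself is the reward
    target = reward + gamma * max(Q[(s_next, b)] for b in actions)
    Q[(s, a)] += alpha * (target - Q[(s, a)])

Q = defaultdict(float)
# One illustrative transition: the expert intervened after action 1 in state 3.
rlif_update(Q, s=3, a=1, s_next=4, intervened=True)
print(Q[(3, 1)])  # -0.1: the intervened-on action becomes less attractive
```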
arXiv Detail & Related papers (2023-11-21T21:05:21Z)
- Stochastic Unrolled Federated Learning [85.6993263983062]
We introduce UnRolled Federated learning (SURF), a method that expands algorithm unrolling to federated learning.
Our proposed method tackles two challenges of this expansion, namely the need to feed whole datasets to the unrolled optimizers and the decentralized nature of federated learning.
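For readers unfamiliar with the base technique SURF extends, algorithm unrolling treats the iterations of an optimizer as layers of a network with learnable parameters. The toy sketch below (ours; plain unrolling on a quadratic loss, not SURF's federated architecture) shows the idea, with per-step step sizes playing the role of the learnable parameters.

```python
# Toy illustration of algorithm unrolling (ours, not SURF itself):
# K gradient steps become K "layers" whose step sizes could be learned.
import numpy as np

def unrolled_gd(w0, X, y, step_sizes):
    w = w0
    for eta in step_sizes:                 # one unrolled step = one layer
        grad = X.T @ (X @ w - y) / len(y)  # least-squares gradient
        w = w - eta * grad
    return w

rng = np.random.default_rng(0)
X, w_true = rng.normal(size=(32, 3)), np.array([1.0, -2.0, 0.5])
y = X @ w_true
w = unrolled_gd(np.zeros(3), X, y, step_sizes=[0.3] * 20)
print(np.round(w, 2))  # close to w_true; depth and step sizes control accuracy
```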
arXiv Detail & Related papers (2023-05-24T17:26:22Z)
- Hierarchically Structured Task-Agnostic Continual Learning [0.0]
We take a task-agnostic view of continual learning and develop a hierarchical information-theoretic optimality principle.
We propose a neural network layer, called the Mixture-of-Variational-Experts layer, that alleviates forgetting by creating a set of information processing paths.
Our approach can operate in a task-agnostic way, i.e., it does not require task-specific knowledge, unlike many existing continual learning algorithms.
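To make the "set of information processing paths" concrete, here is a schematic gated-paths layer in plain NumPy (our simplification; the paper's Mixture-of-Variational-Experts layer is variational and considerably more involved): a gate softly routes each input through a set of expert sub-networks, so different tasks can occupy different paths and overwrite each other less.

```python
# Schematic gated-paths layer (our simplification, not the paper's layer).
import numpy as np

def softmax(z):
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

class GatedPathsLayer:
    def __init__(self, dim_in, dim_out, n_experts, rng):
        self.experts = [rng.normal(scale=0.1, size=(dim_in, dim_out))
                        for _ in range(n_experts)]
        self.gate = rng.normal(scale=0.1, size=(dim_in, n_experts))

    def __call__(self, x):
        weights = softmax(x @ self.gate)                        # (batch, n_experts)
        outputs = np.stack([x @ W for W in self.experts], axis=1)
        return (weights[..., None] * outputs).sum(axis=1)      # gated mixture

layer = GatedPathsLayer(8, 4, n_experts=3, rng=np.random.default_rng(0))
print(layer(np.random.default_rng(1).normal(size=(5, 8))).shape)  # (5, 4)
```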
arXiv Detail & Related papers (2022-11-14T19:53:15Z)
- Outcome-Driven Reinforcement Learning via Variational Inference [95.82770132618862]
We discuss a new perspective on reinforcement learning, recasting it as the problem of inferring actions that achieve desired outcomes, rather than a problem of maximizing rewards.
To solve the resulting outcome-directed inference problem, we establish a novel variational inference formulation that allows us to derive a well-shaped reward function.
We empirically demonstrate that this method eliminates the need to design reward functions and leads to effective goal-directed behaviors.
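A minimal worked example of a reward derived from outcome likelihood (our heavy simplification of the variational formulation; function and variable names are ours): the reward for a transition is the log-likelihood that the desired outcome is reached next, under a model standing in for the learned dynamics.

```python
# Reward-from-inference sketch (ours): reward = log N(goal | mu, sigma^2),
# where (mu, sigma) is the model's prediction of the next state.
import numpy as np

def outcome_reward(next_state_mean, next_state_std, goal):
    var = next_state_std ** 2
    log_p = -0.5 * (np.log(2 * np.pi * var) + (goal - next_state_mean) ** 2 / var)
    return log_p.sum()  # summed over state dimensions

mu = np.array([0.9, 0.1])     # predicted next state for some (s, a)
sigma = np.array([0.2, 0.2])
goal = np.array([1.0, 0.0])
print(outcome_reward(mu, sigma, goal))  # smooth in distance to the goal
```

This is what "well-shaped" means above: the reward varies smoothly with how close and how confidently the model predicts the goal, instead of being a hand-designed sparse signal.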
arXiv Detail & Related papers (2021-04-20T18:16:21Z)
- Program Synthesis Guided Reinforcement Learning [34.342362868490525]
A key challenge for reinforcement learning is solving long-horizon planning and control problems.
Recent work has proposed leveraging programs to help guide the learning algorithm in these settings.
We propose an approach that leverages program synthesis to automatically generate the guiding program.
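As a schematic of what a "guiding program" means here (an illustrative stand-in, not the paper's synthesizer or environment): a small synthesized program maps the agent's state to a recommended action or mode, and the reward is shaped to encourage following it.

```python
# Illustrative stand-in (ours) for a synthesized guide program and the
# reward shaping that makes the RL agent follow it.

def guide_program(state):
    """Pretend-synthesized guide: head right to the key, pick it up, go up."""
    x, y, has_key = state
    if not has_key and x < 3:
        return "right"
    if not has_key:
        return "pickup"
    return "up"

def shaped_reward(env_reward, action, state, bonus=0.1):
    return env_reward + (bonus if action == guide_program(state) else 0.0)

print(shaped_reward(0.0, "right", (1, 0, False)))  # 0.1: action matches guidance
```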
arXiv Detail & Related papers (2021-02-22T16:05:32Z)
- BUSTLE: Bottom-Up Program Synthesis Through Learning-Guided Exploration [72.88493072196094]
We present a new synthesis approach that leverages learning to guide a bottom-up search over programs.
In particular, we train a model to prioritize compositions of intermediate values during search conditioned on a set of input-output examples.
We show that the combination of learning and bottom-up search is remarkably effective, even with simple supervised learning approaches.
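The flavor of learning-guided bottom-up search fits in a few lines. In this sketch (ours, heavily simplified from BUSTLE), `score` is a hand-written stand-in for the trained model: it judges how promising each intermediate value looks given the input-output examples, and the priority queue expands the most promising values first.

```python
# Learning-guided bottom-up synthesis sketch (ours, not BUSTLE itself).
import heapq

def synthesize(inputs, outputs, max_steps=2000):
    def score(values):
        # Stand-in for the learned model: prefer values near the outputs.
        return sum(abs(v - o) for v, o in zip(values, outputs))

    atoms = [("x", tuple(inputs)), ("1", tuple(1 for _ in inputs))]
    heap = [(score(v), e, v) for e, v in atoms]
    seen = {v for _, v in atoms}
    heapq.heapify(heap)
    for _ in range(max_steps):
        if not heap:
            break
        _, expr, vals = heapq.heappop(heap)
        if list(vals) == outputs:
            return expr
        # For brevity, new expressions only combine the popped value with atoms.
        for op, f in (("+", lambda a, b: a + b), ("*", lambda a, b: a * b)):
            for expr2, vals2 in atoms:
                new_vals = tuple(f(a, b) for a, b in zip(vals, vals2))
                if new_vals not in seen:
                    seen.add(new_vals)
                    heapq.heappush(heap, (score(new_vals), f"({expr} {op} {expr2})", new_vals))
    return None

print(synthesize(inputs=[1, 2, 3], outputs=[3, 5, 7]))  # ((x + x) + 1)
```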
arXiv Detail & Related papers (2020-07-28T17:46:18Z)
- Reinforcement Learning for Variable Selection in a Branch and Bound Algorithm [0.10499611180329801]
We leverage patterns in real-world instances to learn from scratch a new branching strategy optimised for a given problem.
We propose FMSTS, a novel Reinforcement Learning approach specifically designed for this task.
arXiv Detail & Related papers (2020-05-20T13:15:48Z)
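To show where a learned branching strategy plugs in, here is a toy branch-and-bound for 0/1 knapsack (ours, not FMSTS) in which variable selection is delegated to a scoring function; a trained policy would take the place of the density heuristic used below.

```python
# Toy branch-and-bound (ours): `branch_score` is the slot a learned
# branching policy would fill.

def branch_and_bound(values, weights, capacity, branch_score):
    best = [0]

    def bound(value, free):
        # Loose optimistic bound: pretend every remaining item still fits.
        return value + sum(values[i] for i in free)

    def solve(chosen, free, value, weight):
        best[0] = max(best[0], value)
        if not free or bound(value, free) <= best[0]:
            return  # prune: this subtree cannot beat the incumbent
        # Learned-policy slot: the scoring function picks the branching variable.
        i = max(free, key=lambda j: branch_score(j, chosen, free))
        rest = [j for j in free if j != i]
        if weight + weights[i] <= capacity:
            solve(chosen + [i], rest, value + values[i], weight + weights[i])
        solve(chosen, rest, value, weight)  # branch that excludes item i

    solve([], list(range(len(values))), 0, 0)
    return best[0]

values, weights = [6, 10, 12], [1, 2, 3]
# Heuristic stand-in for a trained branching policy: densest item first.
score = lambda j, chosen, free: values[j] / weights[j]
print(branch_and_bound(values, weights, capacity=5, branch_score=score))  # 22
```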
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.