Proof Flow: Preliminary Study on Generative Flow Network Language Model Tuning for Formal Reasoning
- URL: http://arxiv.org/abs/2410.13224v1
- Date: Thu, 17 Oct 2024 05:10:12 GMT
- Title: Proof Flow: Preliminary Study on Generative Flow Network Language Model Tuning for Formal Reasoning
- Authors: Matthew Ho, Vincent Zhu, Xiaoyin Chen, Moksh Jain, Nikolay Malkin, Edwin Zhang,
- Abstract summary: We present a proof of concept in the domain of formal reasoning, specifically in the Neural Theorem Proving setting.
Unlike classical reward-maximization reinforcement learning, GFlowNets have emerged as a promising approach for sampling compositional objects.
Our early results demonstrate GFlowNet fine-tuning's potential for enhancing model performance in a search setting.
- Score: 11.268313729426627
- License:
- Abstract: Reasoning is a fundamental substrate for solving novel and complex problems. Deliberate efforts in learning and developing frameworks around System 2 reasoning have made great strides, yet problems of sufficient complexity remain largely out of reach for open models. To address this gap, we examine the potential of Generative Flow Networks as a fine-tuning method for LLMs to unlock advanced reasoning capabilities. In this paper, we present a proof of concept in the domain of formal reasoning, specifically in the Neural Theorem Proving (NTP) setting, where proofs specified in a formal language such as Lean can be deterministically and objectively verified. Unlike classical reward-maximization reinforcement learning, which frequently over-exploits high-reward actions and fails to effectively explore the state space, GFlowNets have emerged as a promising approach for sampling compositional objects, improving generalization, and enabling models to maintain diverse hypotheses. Our early results demonstrate GFlowNet fine-tuning's potential for enhancing model performance in a search setting, which is especially relevant given the paradigm shift towards inference time compute scaling and "thinking slowly."
Related papers
- Causality can systematically address the monsters under the bench(marks) [64.36592889550431]
Benchmarks are plagued by various biases, artifacts, or leakage.
Models may behave unreliably due to poorly explored failure modes.
causality offers an ideal framework to systematically address these challenges.
arXiv Detail & Related papers (2025-02-07T17:01:37Z) - BRiTE: Bootstrapping Reinforced Thinking Process to Enhance Language Model Reasoning [78.63421517563056]
Large Language Models (LLMs) have demonstrated remarkable capabilities in complex reasoning tasks.
We present a unified probabilistic framework that formalizes LLM reasoning through a novel graphical model.
We introduce the Bootstrapping Reinforced Thinking Process (BRiTE) algorithm, which works in two steps.
arXiv Detail & Related papers (2025-01-31T02:39:07Z) - Learning to Generate Research Idea with Dynamic Control [21.30777644522451]
Large language models (LLMs) have shown promise in generating hypotheses and research ideas.
We introduce a novel framework that employs a two-stage approach combiningSupervised Fine-Tuning (SFT) and controllable Reinforcement Learning (RL)
Our framework provides a balanced approach to research ideation, achieving high-quality outcomes by dynamically navigating the trade-offs among novelty, feasibility, and effectiveness.
arXiv Detail & Related papers (2024-12-19T08:28:18Z) - FFHFlow: A Flow-based Variational Approach for Learning Diverse Dexterous Grasps with Shape-Aware Introspection [19.308304984645684]
We introduce a novel model that can generate diverse grasps for a multi-fingered hand.
The proposed idea gains superior performance and higher run-time efficiency against strong baselines.
We also demonstrate substantial benefits of greater diversity for grasping objects in clutter and a confined workspace in the real world.
arXiv Detail & Related papers (2024-07-21T13:33:08Z) - Verbalized Probabilistic Graphical Modeling with Large Language Models [8.961720262676195]
This work introduces a novel Bayesian prompting approach that facilitates training-free Bayesian inference with large language models.
Our results indicate that the model effectively enhances confidence elicitation and text generation quality, demonstrating its potential to improve AI language understanding systems.
arXiv Detail & Related papers (2024-06-08T16:35:31Z) - Learning Planning-based Reasoning by Trajectories Collection and Process Reward Synthesizing [61.98556945939045]
We propose a framework to learn planning-based reasoning through Direct Preference Optimization (DPO) on collected trajectories.
Our results on challenging logical reasoning benchmarks demonstrate the effectiveness of our learning framework.
arXiv Detail & Related papers (2024-02-01T15:18:33Z) - Faithful Explanations of Black-box NLP Models Using LLM-generated
Counterfactuals [67.64770842323966]
Causal explanations of predictions of NLP systems are essential to ensure safety and establish trust.
Existing methods often fall short of explaining model predictions effectively or efficiently.
We propose two approaches for counterfactual (CF) approximation.
arXiv Detail & Related papers (2023-10-01T07:31:04Z) - Learn to Accumulate Evidence from All Training Samples: Theory and
Practice [7.257751371276488]
Evidential deep learning offers a principled and computationally efficient way to turn a deterministic neural network uncertainty-aware.
Existing evidential activation functions create zero evidence regions, which prevent the model to learn from training samples falling into such regions.
A deeper analysis of evidential activation functions based on our theoretical underpinning inspires the design of a novel regularizer.
arXiv Detail & Related papers (2023-06-19T18:27:12Z) - Validation Diagnostics for SBI algorithms based on Normalizing Flows [55.41644538483948]
This work proposes easy to interpret validation diagnostics for multi-dimensional conditional (posterior) density estimators based on NF.
It also offers theoretical guarantees based on results of local consistency.
This work should help the design of better specified models or drive the development of novel SBI-algorithms.
arXiv Detail & Related papers (2022-11-17T15:48:06Z) - Toward Certified Robustness Against Real-World Distribution Shifts [65.66374339500025]
We train a generative model to learn perturbations from data and define specifications with respect to the output of the learned model.
A unique challenge arising from this setting is that existing verifiers cannot tightly approximate sigmoid activations.
We propose a general meta-algorithm for handling sigmoid activations which leverages classical notions of counter-example-guided abstraction refinement.
arXiv Detail & Related papers (2022-06-08T04:09:13Z) - Prediction-Centric Learning of Independent Cascade Dynamics from Partial
Observations [13.680949377743392]
We address the problem of learning of a spreading model such that the predictions generated from this model are accurate.
We introduce a computationally efficient algorithm, based on a scalable dynamic message-passing approach.
We show that tractable inference from the learned model generates a better prediction of marginal probabilities compared to the original model.
arXiv Detail & Related papers (2020-07-13T17:58:21Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.