Large Language Model for Science: A Study on P vs. NP
- URL: http://arxiv.org/abs/2309.05689v1
- Date: Mon, 11 Sep 2023 17:49:27 GMT
- Title: Large Language Model for Science: A Study on P vs. NP
- Authors: Qingxiu Dong, Li Dong, Ke Xu, Guangyan Zhou, Yaru Hao, Zhifang Sui,
Furu Wei
- Abstract summary: We use large language models (LLMs) to augment and accelerate research on the P versus NP problem.
Specifically, we propose Socratic reasoning, a general framework that promotes in-depth thinking with LLMs for complex problem-solving.
Our pilot study on the P vs. NP problem shows that GPT-4 successfully produces a proof schema and engages in rigorous reasoning throughout 97 dialogue turns.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In this work, we use large language models (LLMs) to augment and accelerate
research on the P versus NP problem, one of the most important open problems in
theoretical computer science and mathematics. Specifically, we propose Socratic
reasoning, a general framework that promotes in-depth thinking with LLMs for
complex problem-solving. Socratic reasoning encourages LLMs to recursively
discover, solve, and integrate problems while facilitating self-evaluation and
refinement. Our pilot study on the P vs. NP problem shows that GPT-4
successfully produces a proof schema and engages in rigorous reasoning
throughout 97 dialogue turns, concluding "P $\neq$ NP", which is in alignment
with (Xu and Zhou, 2023). The investigation uncovers novel insights within the
extensive solution space of LLMs, shedding light on LLM for Science.
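Viewed as a dialogue protocol, Socratic reasoning alternates between discovering sub-problems, solving them, integrating the partial answers, and asking the model to evaluate and refine the result. The sketch below only illustrates that loop under assumed prompt templates and a caller-supplied `ask_llm` function; it is not the authors' released implementation.

```python
from typing import Callable, List

def socratic_solve(problem: str, ask_llm: Callable[[str], str], depth: int = 2) -> str:
    """Illustrative Socratic-style loop: recursively discover, solve, and integrate
    sub-problems, then self-evaluate and refine. Prompt wording and the fixed
    recursion depth are assumptions for illustration."""
    if depth == 0:
        # Base case: answer the question directly.
        return ask_llm(f"Solve directly and concisely: {problem}")

    # Discover: ask the model to break the problem into smaller questions.
    sub_questions = ask_llm(
        f"List the key sub-problems, one per line, for: {problem}"
    ).splitlines()

    # Solve: recurse on each non-empty sub-problem.
    sub_answers: List[str] = [
        socratic_solve(q.strip(), ask_llm, depth - 1)
        for q in sub_questions
        if q.strip()
    ]

    # Integrate: combine the partial answers into one candidate solution.
    candidate = ask_llm(
        f"Integrate these partial results into a single answer for '{problem}':\n"
        + "\n".join(sub_answers)
    )

    # Self-evaluate and refine: ask the model to critique, then improve, its answer.
    critique = ask_llm(f"Point out flaws or gaps in this answer:\n{candidate}")
    return ask_llm(
        f"Refine the answer to '{problem}' using this critique:\n{critique}\n\n"
        f"Current answer:\n{candidate}"
    )
```

Any chat-completion backend can stand in for `ask_llm`; the sketch only aims to show the discover-solve-integrate-evaluate-refine structure described in the abstract, not the specific prompts used in the 97-turn dialogue.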
Related papers
- Can Large Language Models Reason? A Characterization via 3-SAT (2024-08-13)
  Large Language Models (LLMs) have been touted as AI models possessing advanced reasoning abilities. Recent works have shown that LLMs often bypass true reasoning using shortcuts, sparking skepticism. The paper proposes an experimental protocol centered on 3-SAT, the NP-complete problem lying at the core of logical reasoning and constraint satisfaction tasks.
- Reasoning with Large Language Models, a Survey (2024-07-16)
  This paper reviews the rapidly expanding field of prompt-based reasoning with LLMs. Its taxonomy identifies different ways to generate, evaluate, and control multi-step reasoning, and it finds that self-improvement, self-reflection, and some meta abilities of the reasoning process are possible through the judicious use of prompts.
- Reinforcement Learning Problem Solving with Large Language Models (2024-04-29)
  Large Language Models (LLMs) hold an extensive amount of world knowledge, which has enabled their application across domains to improve the performance of natural language processing (NLP) tasks. This has also facilitated a more accessible paradigm of conversation-based interaction between humans and AI systems for solving intended problems. The paper shows the practicality of the approach through two detailed case studies, "Research Scientist" and "Legal Matter Intake".
- Distilling Algorithmic Reasoning from LLMs via Explaining Solution Programs (2024-04-11)
  Distilling explicit chain-of-thought reasoning paths has emerged as an effective method for improving the reasoning abilities of large language models. The paper proposes a novel approach to distill reasoning abilities from LLMs by leveraging their capacity to explain solutions. Experiments demonstrate that learning from explanations enables the Reasoner to more effectively guide program implementation by a Coder.
- Logic Query of Thoughts: Guiding Large Language Models to Answer Complex Logic Queries with Knowledge Graphs (2024-03-17)
  Logic-Query-of-Thoughts (LGOT) is the first of its kind to combine knowledge graph reasoning and large language models. Experimental findings demonstrate substantial performance gains, with up to 20% improvement over ChatGPT.
- Self-Discover: Large Language Models Self-Compose Reasoning Structures (2024-02-06)
  SELF-DISCOVER is a framework for self-discovering task-intrinsic reasoning structures. It substantially improves GPT-4 and PaLM 2's performance on challenging reasoning benchmarks, and the self-discovered reasoning structures are shown to be universally applicable across model families.
- Do Language Models Exhibit the Same Cognitive Biases in Problem Solving as Human Learners? (2024-01-31)
  The paper studies the biases of large language models (LLMs) in relation to those known in children when solving arithmetic word problems. It generates a novel set of word problems for each of these tests, using a neuro-symbolic approach that enables fine-grained control over the problem features.
- Thought Propagation: An Analogical Approach to Complex Reasoning with Large Language Models (2023-10-06)
  Thought Propagation (TP) enhances the complex reasoning ability of large language models. TP first prompts LLMs to propose and solve a set of analogous problems related to the input one, then reuses the results of the analogous problems to directly yield a new solution, or to derive a knowledge-intensive plan for execution that amends the initial solution obtained from scratch.
- NLPBench: Evaluating Large Language Models on Solving NLP Problems (2023-09-27)
  Large language models (LLMs) have shown promise in enhancing the capabilities of natural language processing (NLP). The paper presents a benchmarking dataset, NLPBench, comprising 378 college-level NLP questions spanning various NLP topics, sourced from Yale University's prior final exams. The evaluation, centered on LLMs such as GPT-3.5/4, PaLM-2, and LLaMA-2, incorporates advanced prompting strategies such as chain-of-thought (CoT) and tree-of-thought (ToT).
- PAL: Program-aided Language Models (2022-11-18)
  Program-Aided Language models (PAL) understand natural language problems but offload the solution step to a programmatic runtime such as a Python interpreter, setting new state-of-the-art results on all 12 benchmarks. A minimal sketch of this offloading pattern appears after this list.
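To make the program-aided pattern concrete, here is a small illustrative sketch of offloading a word problem's solution step to the Python runtime. The `generate_program` stub, the example problem, and the `exec`-based runner are assumptions for illustration, not the PAL authors' implementation.

```python
# Sketch of program-aided solving: the language model is asked to emit Python
# that computes the answer, and the Python runtime (not the model) performs the
# actual calculation. `generate_program` stands in for a real LLM call and
# returns a hard-coded program for one example problem.

def generate_program(problem: str) -> str:
    """Placeholder for an LLM call that turns a word problem into Python code.

    A real system would prompt a model to write code ending with `answer = ...`.
    Here a fixed program is returned for illustration.
    """
    return (
        "pencils_per_box = 12\n"
        "boxes = 5\n"
        "given_away = 7\n"
        "answer = pencils_per_box * boxes - given_away\n"
    )

def solve(problem: str) -> object:
    """Offload the solution step to the Python interpreter."""
    program = generate_program(problem)
    namespace: dict = {}
    exec(program, namespace)  # the runtime, not the LLM, does the arithmetic
    return namespace["answer"]

if __name__ == "__main__":
    question = "A shop has 5 boxes of 12 pencils and gives away 7. How many remain?"
    print(solve(question))  # 53
```

The key design choice illustrated here is that the model only generates the program; the interpreter carries out the computation, which removes arithmetic errors from the model's side of the pipeline.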
This list is automatically generated from the titles and abstracts of the papers on this site.