ThinkSum: Probabilistic reasoning over sets using large language models
- URL: http://arxiv.org/abs/2210.01293v2
- Date: Fri, 2 Jun 2023 17:25:19 GMT
- Title: ThinkSum: Probabilistic reasoning over sets using large language models
- Authors: Batu Ozturkler, Nikolay Malkin, Zhen Wang, Nebojsa Jojic
- Abstract summary: We propose a two-stage probabilistic inference paradigm, ThinkSum, which reasons over sets of objects or facts in a structured manner.
We demonstrate the possibilities and advantages of ThinkSum on the BIG-bench suite of LLM evaluation tasks.
- Score: 18.123895485602244
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Large language models (LLMs) have a substantial capacity for high-level
analogical reasoning: reproducing patterns in linear text that occur in their
training data (zero-shot evaluation) or in the provided context (few-shot
in-context learning). However, recent studies show that even the most advanced
LLMs fail in scenarios that require reasoning over multiple objects or facts
and making sequences of logical deductions. We propose a two-stage
probabilistic inference paradigm, ThinkSum, which reasons over sets of objects
or facts in a structured manner. In the first stage (Think - retrieval of
associations), an LLM is queried in parallel over a set of phrases extracted
from the prompt or an auxiliary model call. In the second stage (Sum -
probabilistic inference or reasoning), the results of these queries are
aggregated to make the final prediction. We demonstrate the possibilities and
advantages of ThinkSum on the BIG-bench suite of LLM evaluation tasks,
achieving improvements over the state of the art using GPT-family models on
thirteen difficult tasks, often with far smaller model variants. We also
compare and contrast ThinkSum with other proposed modifications to direct
prompting of LLMs, such as variants of chain-of-thought prompting. Our results
suggest that because the probabilistic inference in ThinkSum is performed
outside of calls to the LLM, ThinkSum is less sensitive to prompt design,
yields more interpretable predictions, and can be flexibly combined with latent
variable models to extract structured knowledge from LLMs. Overall, our
proposed paradigm represents a promising approach for enhancing the reasoning
capabilities of LLMs.
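To make the two-stage paradigm concrete, below is a minimal sketch of Think and Sum in Python, based only on the description in the abstract. The LLM scoring call (llm_logprob) and the prompt template are hypothetical placeholders rather than the paper's released code, and averaging probabilities over the extracted set is just one simple aggregation rule; the actual method uses task-specific variants.

```python
import math
from typing import Callable, Dict, List


def think(llm_logprob: Callable[[str, str], float],
          facts: List[str],
          answers: List[str],
          template: str) -> Dict[str, List[float]]:
    # Think stage: query the LLM independently (and, in practice, in parallel)
    # for every extracted fact/phrase paired with every candidate answer.
    return {
        answer: [llm_logprob(template.format(fact=fact), answer) for fact in facts]
        for answer in answers
    }


def sum_stage(scores: Dict[str, List[float]]) -> str:
    # Sum stage: aggregate the per-fact likelihoods outside the LLM.
    # Here: average the probabilities over the set and return the argmax answer.
    def aggregate(logps: List[float]) -> float:
        return sum(math.exp(lp) for lp in logps) / len(logps)
    return max(scores, key=lambda a: aggregate(scores[a]))


# Toy usage with a stubbed scorer; a real run would call an LLM API that
# returns log P(completion | prompt). The stub below is purely illustrative.
def fake_llm_logprob(prompt: str, completion: str) -> float:
    return -1.0 if ("sparrow" in prompt and completion == "can fly") else -3.0


facts = ["A sparrow is a bird.", "A penguin is a bird."]
choice = sum_stage(think(fake_llm_logprob, facts,
                         ["can fly", "cannot fly"],
                         "{fact} Therefore it"))
print(choice)  # -> "can fly"
```

Because the aggregation happens outside the model, the per-fact scores remain inspectable, which is the basis of the interpretability claim in the abstract.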
Related papers
- Does Reasoning Emerge? Examining the Probabilities of Causation in Large Language Models [6.922021128239465]
Recent advances in AI have been driven by the capabilities of large language models (LLMs).
This paper introduces a framework that is both theoretical and practical, aimed at assessing how effectively LLMs are able to replicate real-world reasoning mechanisms.
arXiv Detail & Related papers (2024-08-15T15:19:11Z)
- LAMPO: Large Language Models as Preference Machines for Few-shot Ordinal Classification [34.9210323553677]
We introduce LAMPO, a novel paradigm that leverages Large Language Models (LLMs) for solving few-shot multi-class ordinal classification tasks.
Extensive experiments on seven public datasets demonstrate LAMPO's remarkably competitive performance across a diverse spectrum of applications.
arXiv Detail & Related papers (2024-08-06T15:55:05Z)
- Cycles of Thought: Measuring LLM Confidence through Stable Explanations [53.15438489398938]
Large language models (LLMs) can reach and even surpass human-level accuracy on a variety of benchmarks, but their overconfidence in incorrect responses is still a well-documented failure mode.
We propose a framework for measuring an LLM's uncertainty with respect to the distribution of generated explanations for an answer.
arXiv Detail & Related papers (2024-06-05T16:35:30Z)
- Probabilistic Reasoning in Generative Large Language Models [18.983753573277596]
This paper considers the challenges Large Language Models (LLMs) face when reasoning over text that includes information involving uncertainty explicitly quantified via probability values.
We introduce the Bayesian Linguistic Inference dataset (BLInD), a new dataset designed to test the probabilistic reasoning capabilities of LLMs.
We present several prompting strategies that map the problem to different formal representations, including Python code, probabilistic algorithms, and probabilistic logical programming.
arXiv Detail & Related papers (2024-02-14T23:05:44Z)
- A Hypothesis-Driven Framework for the Analysis of Self-Rationalising Models [0.8702432681310401]
We use a Bayesian network to implement a hypothesis about how a task is solved.
The resulting models do not exhibit a strong similarity to GPT-3.5.
We discuss the implications of this as well as the framework's potential to approximate LLM decisions better in future work.
arXiv Detail & Related papers (2024-02-07T12:26:12Z)
- Learning to Generate Explainable Stock Predictions using Self-Reflective Large Language Models [54.21695754082441]
We propose a framework to teach Large Language Models (LLMs) to generate explainable stock predictions.
A reflective agent learns how to explain past stock movements through self-reasoning, while the PPO trainer trains the model to generate the most likely explanations.
Our framework can outperform both traditional deep-learning and LLM methods in prediction accuracy and Matthews correlation coefficient.
arXiv Detail & Related papers (2024-02-06T03:18:58Z)
- Rethinking Interpretability in the Era of Large Language Models [76.1947554386879]
Large language models (LLMs) have demonstrated remarkable capabilities across a wide array of tasks.
The capability to explain in natural language allows LLMs to expand the scale and complexity of patterns that can be conveyed to a human.
These new capabilities raise new challenges, such as hallucinated explanations and immense computational costs.
arXiv Detail & Related papers (2024-01-30T17:38:54Z)
- Explanation-aware Soft Ensemble Empowers Large Language Model In-context Learning [50.00090601424348]
Large language models (LLMs) have shown remarkable capabilities in various natural language understanding tasks.
We propose EASE, an Explanation-Aware Soft Ensemble framework to empower in-context learning with LLMs.
arXiv Detail & Related papers (2023-11-13T06:13:38Z)
- Evaluating and Explaining Large Language Models for Code Using Syntactic Structures [74.93762031957883]
This paper introduces ASTxplainer, an explainability method specific to Large Language Models for code.
At its core, ASTxplainer provides an automated method for aligning token predictions with AST nodes.
We perform an empirical evaluation on 12 popular LLMs for code using a curated dataset of the most popular GitHub projects.
arXiv Detail & Related papers (2023-08-07T18:50:57Z)
- Explaining Emergent In-Context Learning as Kernel Regression [61.57151500616111]
Large language models (LLMs) have initiated a paradigm shift in transfer learning.
In this paper, we investigate the reason why a transformer-based language model can accomplish in-context learning after pre-training.
We find that during ICL, the attention and hidden features in LLMs match the behaviors of a kernel regression.
arXiv Detail & Related papers (2023-05-22T06:45:02Z)