Beyond Words: A Mathematical Framework for Interpreting Large Language Models
- URL: http://arxiv.org/abs/2311.03033v1
- Date: Mon, 6 Nov 2023 11:13:17 GMT
- Title: Beyond Words: A Mathematical Framework for Interpreting Large Language Models
- Authors: Javier González and Aditya V. Nori
- Abstract summary: Large language models (LLMs) are powerful AI tools that can generate and comprehend natural language text and other complex information.
We propose Hex, a framework that clarifies key terms and concepts in LLM research, such as hallucinations, alignment, self-verification and chain-of-thought reasoning.
We argue that our formal definitions and results are crucial for advancing the discussion on how to build generative AI systems.
- Score: 8.534513717370434
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Large language models (LLMs) are powerful AI tools that can generate and
comprehend natural language text and other complex information. However, the
field lacks a mathematical framework to systematically describe, compare and
improve LLMs. We propose Hex, a framework that clarifies key terms and concepts
in LLM research, such as hallucinations, alignment, self-verification and
chain-of-thought reasoning. The Hex framework offers a precise and consistent
way to characterize LLMs, identify their strengths and weaknesses, and
integrate new findings. Using Hex, we differentiate chain-of-thought reasoning
from chain-of-thought prompting and establish the conditions under which they
are equivalent. This distinction clarifies the basic assumptions behind
chain-of-thought prompting and its implications for methods that use it, such
as self-verification and prompt programming.
Our goal is to provide a formal framework for LLMs that can help both
researchers and practitioners explore new possibilities for generative AI. We
do not claim to have a definitive solution, but rather a tool for opening up
new research avenues. We argue that our formal definitions and results are
crucial for advancing the discussion on how to build generative AI systems that
are safe, reliable, fair and robust, especially in domains like healthcare and
software engineering.
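The formal definitions themselves live in the paper; as a rough sketch of the kind of formalization involved (the notation below is ours, not necessarily Hex's), an LLM can be treated as a conditional distribution over token sequences:

```latex
% Sketch in our own notation, not necessarily Hex's definitions.
% An LLM is a conditional distribution p_\theta(y \mid x) over
% outputs y given a prompt x.

% Chain-of-thought prompting: append an instruction c (e.g. "think
% step by step") and sample an explicit chain z before the answer y:
\[
  (z, y) \sim p_\theta(\,\cdot \mid x \oplus c\,)
\]

% Chain-of-thought reasoning: the answer marginalizes over latent
% intermediate steps z, whether or not they are verbalized:
\[
  p(y \mid x) = \sum_{z} p(z \mid x)\, p(y \mid x, z)
\]

% The two coincide only under assumptions such as
% p_\theta(z \mid x \oplus c) \approx p(z \mid x), i.e. the prompted
% chains are faithful samples of the latent reasoning steps.
```

On this reading, the equivalence conditions the abstract mentions are constraints under which prompted chains track the latent ones.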
Related papers
- Formalizing Complex Mathematical Statements with LLMs: A Study on Mathematical Definitions [8.135142928659546]
We introduce two novel resources for autoformalization, collecting definitions from Wikipedia (Def_Wiki) and arXiv papers (Def_ArXiv).
We evaluate a range of LLMs, analyzing their ability to formalize definitions into Isabelle/HOL.
Our findings reveal that definitions present a greater challenge compared to existing benchmarks, such as miniF2F.
arXiv Detail & Related papers (2025-02-17T17:34:48Z)
- Reasoning-as-Logic-Units: Scaling Test-Time Reasoning in Large Language Models Through Logic Unit Alignment [21.12989936864145]
Chain-of-Thought (CoT) prompting has shown promise in enhancing the reasoning capabilities of large language models (LLMs).
We propose Reasoning-as-Logic-Units (RaLU), which constructs a more reliable reasoning path by aligning logical units between the generated program and their corresponding NL descriptions.
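The summary leaves the alignment step abstract; the toy sketch below pairs top-level program statements with the natural-language steps they are meant to implement (the function and pairing scheme are our illustration, not RaLU's method):

```python
import ast

# Toy illustration of "logic unit" alignment: pair each top-level
# statement of a generated program with the NL step it should
# implement. RaLU's actual alignment is more sophisticated.
def align_units(nl_steps: list[str], program: str) -> list[tuple[str, str]]:
    units = [ast.unparse(node) for node in ast.parse(program).body]
    if len(units) != len(nl_steps):
        raise ValueError("step/unit count mismatch; alignment failed")
    return list(zip(nl_steps, units))

steps = ["sum the inputs", "divide by the count to get the mean"]
program = "total = sum(xs)\nmean = total / len(xs)"
for nl, unit in align_units(steps, program):
    print(f"{nl:40} <-> {unit}")
```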
arXiv Detail & Related papers (2025-02-05T08:23:18Z)
- Proof of Thought: Neurosymbolic Program Synthesis allows Robust and Interpretable Reasoning [1.3003982724617653]
Large Language Models (LLMs) have revolutionized natural language processing, yet they struggle with inconsistent reasoning.
This research introduces Proof of Thought, a framework that enhances the reliability and transparency of LLM outputs.
Key contributions include a robust type system with sort management for enhanced logical integrity, and an explicit representation of rules that clearly separates factual from inferential knowledge.
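The sort-managed type system is only named in this summary; a hypothetical toy version of the fact/rule separation it describes (classes, names, and the signature check are our own, not Proof of Thought's API):

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Term:
    name: str
    sort: str  # e.g. "Drug", "Condition"

@dataclass(frozen=True)
class Fact:  # factual knowledge: asserted ground atoms
    predicate: str
    args: tuple[Term, ...]

@dataclass(frozen=True)
class Rule:  # inferential knowledge: premises entail a conclusion
    premises: tuple[Fact, ...]
    conclusion: Fact

def well_sorted(fact: Fact, signature: dict[str, tuple[str, ...]]) -> bool:
    """Reject atoms whose argument sorts do not match the predicate's signature."""
    expected = signature.get(fact.predicate)
    return expected == tuple(t.sort for t in fact.args)

signature = {"treats": ("Drug", "Condition")}
aspirin, fever = Term("aspirin", "Drug"), Term("fever", "Condition")
print(well_sorted(Fact("treats", (aspirin, fever)), signature))  # True
print(well_sorted(Fact("treats", (fever, aspirin)), signature))  # False
```

Keeping Fact and Rule as distinct types is one way to make the factual/inferential distinction checkable rather than merely conventional.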
arXiv Detail & Related papers (2024-09-25T18:35:45Z)
- Misinforming LLMs: vulnerabilities, challenges and opportunities [4.54019093815234]
Large Language Models (LLMs) have made significant advances in natural language processing, but their underlying mechanisms are often misunderstood.
This paper argues that current LLM architectures are inherently untrustworthy due to their reliance on correlations of sequential patterns of word embedding vectors.
Research into combining generative transformer-based models with fact bases and logic programming languages may lead to the development of trustworthy LLMs.
arXiv Detail & Related papers (2024-08-02T10:35:49Z)
- Should We Fear Large Language Models? A Structural Analysis of the Human Reasoning System for Elucidating LLM Capabilities and Risks Through the Lens of Heidegger's Philosophy [0.0]
This study investigates the capabilities and risks of Large Language Models (LLMs).
It draws innovative parallels between the statistical patterns of word relationships within LLMs and Martin Heidegger's concepts of "ready-to-hand" and "present-at-hand".
Our findings reveal that while LLMs possess the capability for Direct Explicative Reasoning and Pseudo Rational Reasoning, they fall short in authentic rational reasoning and have no creative reasoning capabilities.
arXiv Detail & Related papers (2024-03-05T19:40:53Z)
- FAC$^2$E: Better Understanding Large Language Model Capabilities by Dissociating Language and Cognition [56.76951887823882]
Large language models (LLMs) are primarily evaluated by overall performance on various text understanding and generation tasks.
We present FAC$^2$E, a framework for Fine-grAined and Cognition-grounded LLMs' Capability Evaluation.
arXiv Detail & Related papers (2024-02-29T21:05:37Z)
- Efficient Tool Use with Chain-of-Abstraction Reasoning [63.08202389132155]
Large language models (LLMs) need to ground their reasoning in real-world knowledge.
Challenges remain in fine-tuning LLM agents to invoke tools in multi-step reasoning problems.
We propose a new method for LLMs to better leverage tools in multi-step reasoning.
arXiv Detail & Related papers (2024-01-30T21:53:30Z)
- CLOMO: Counterfactual Logical Modification with Large Language Models [109.60793869938534]
We introduce a novel task, Counterfactual Logical Modification (CLOMO), and a high-quality human-annotated benchmark.
In this task, LLMs must adeptly alter a given argumentative text to uphold a predetermined logical relationship.
We propose an innovative evaluation metric, the Self-Evaluation Score (SES), to directly evaluate the natural language output of LLMs.
arXiv Detail & Related papers (2023-11-29T08:29:54Z)
- When Do Program-of-Thoughts Work for Reasoning? [51.2699797837818]
We propose the complexity-impacted reasoning score (CIRS) to measure the correlation between code and reasoning abilities.
Specifically, we use the abstract syntax tree to encode the structural information and calculate logical complexity.
Code will be integrated into the EasyInstruct framework at https://github.com/zjunlp/EasyInstruct.
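The exact CIRS formula is defined in the paper; the sketch below only illustrates AST-based structural scoring, with made-up node weights standing in for the paper's logical-complexity calculation:

```python
import ast

# Illustrative node weights; CIRS defines its own formula.
WEIGHTS = {ast.If: 2.0, ast.For: 2.0, ast.While: 2.0,
           ast.FunctionDef: 1.5, ast.Call: 0.5}

def structural_complexity(code: str) -> float:
    """Sum weights over all AST nodes of the given source code."""
    return sum(WEIGHTS.get(type(node), 0.0)
               for node in ast.walk(ast.parse(code)))

snippet = """
def gcd(a, b):
    while b:
        a, b = b, a % b
    return a
"""
print(structural_complexity(snippet))  # 3.5 = 1.5 (def) + 2.0 (while)
```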
arXiv Detail & Related papers (2023-08-29T17:22:39Z)
- From Word Models to World Models: Translating from Natural Language to the Probabilistic Language of Thought [124.40905824051079]
We propose rational meaning construction, a computational framework for language-informed thinking.
We frame linguistic meaning as a context-sensitive mapping from natural language into a probabilistic language of thought.
We show that LLMs can generate context-sensitive translations that capture pragmatically-appropriate linguistic meanings.
We extend our framework to integrate cognitively-motivated symbolic modules.
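Stated compactly in our own notation (not the paper's), the core mapping might be written as:

```latex
% Sketch: meaning as a context-sensitive translation T_C from
% natural-language utterances u into a probabilistic language of
% thought (PLoT), with inference over the translated expression.
\[
  T_C : \mathrm{NL} \to \mathrm{PLoT}, \qquad
  \Pr(w \mid u, C) \;\propto\; \Pr(w)\,
    \Pr\big(T_C(u)\ \text{holds in}\ w\big)
\]
% The LLM supplies the translation T_C; symbolic modules evaluate
% T_C(u) against candidate worlds w under the world model.
```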
arXiv Detail & Related papers (2023-06-22T05:14:00Z)
- Large Language Models are In-Context Semantic Reasoners rather than Symbolic Reasoners [75.85554779782048]
Large Language Models (LLMs) have excited the natural language and machine learning community over recent years.
Despite numerous successful applications, the underlying mechanism of such in-context capabilities remains unclear.
In this work, we hypothesize that the learned semantics of language tokens do most of the heavy lifting during the reasoning process.
arXiv Detail & Related papers (2023-05-24T07:33:34Z)