Language Models Encode Numbers Using Digit Representations in Base 10
- URL: http://arxiv.org/abs/2410.11781v2
- Date: Sun, 02 Feb 2025 20:27:54 GMT
- Title: Language Models Encode Numbers Using Digit Representations in Base 10
- Authors: Amit Arnold Levy, Mor Geva
- Abstract summary: We show that large language models (LLMs) make errors when handling simple numerical problems.
LLMs internally represent numbers with a separate circular representation per digit in base 10.
This digit-wise representation sheds light on models' error patterns on tasks involving numerical reasoning.
- Score: 12.913172023910203
- Abstract: Large language models (LLMs) frequently make errors when handling even simple numerical problems, such as comparing two small numbers. A natural hypothesis is that these errors stem from how LLMs represent numbers, and specifically, whether their representations of numbers capture their numeric values. We tackle this question from the observation that LLM errors on numerical tasks are often distributed across the digits of the answer rather than normally around its numeric value. Through a series of probing experiments and causal interventions, we show that LLMs internally represent numbers with individual circular representations per-digit in base 10. This digit-wise representation, as opposed to a value representation, sheds light on the error patterns of models on tasks involving numerical reasoning and could serve as a basis for future studies on analyzing numerical mechanisms in LLMs.
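To make the claim concrete, here is a minimal sketch (not the authors' code; the function name and feature layout are assumptions) of what a circular per-digit code in base 10 looks like: each digit is mapped to a (cos, sin) point on the unit circle, which is the kind of target a linear probe could be trained to recover from a model's hidden states.

```python
import numpy as np

def circular_digit_features(n: int, num_digits: int = 4) -> np.ndarray:
    """Map each base-10 digit of n onto a point on the unit circle.

    Illustrative only: one (cos, sin) pair per digit, so digits that wrap
    around (e.g. 9 and 0) stay close on the circle.
    """
    digits = [(n // 10**k) % 10 for k in range(num_digits)]  # least-significant digit first
    angles = 2 * np.pi * np.array(digits) / 10.0
    return np.stack([np.cos(angles), np.sin(angles)], axis=-1)  # shape (num_digits, 2)

# Example: 472 and 473 differ only in the units digit's position on the circle.
print(circular_digit_features(472))
print(circular_digit_features(473))
```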
Related papers
- FoNE: Precise Single-Token Number Embeddings via Fourier Features [51.17846016593835]
We propose a novel method that maps numbers into the embedding space with their Fourier features.
FoNE encodes each number as a single token with only two embedding dimensions per digit, effectively capturing numerical values without fragmentation.
On 6-digit decimal addition, FoNE requires 64$\times$ less data than subword and digit-wise embeddings to achieve 99% accuracy.
FoNE is the only method that yields 100% accuracy on over 100,000 test examples for addition, subtraction, and multiplication.
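Based only on the description above, a rough sketch of a Fourier-style number embedding with two dimensions (cos, sin) per digit might look as follows; the actual FoNE construction may differ in detail.

```python
import numpy as np

def fourier_number_embedding(x: float, num_digits: int = 6) -> np.ndarray:
    """Rough sketch of a Fourier-style number embedding.

    Two dimensions (cos, sin) per digit position, with periods
    10, 100, ..., 10**num_digits: the k-th pair completes one full cycle
    every 10**k units, so together the pairs pin down the number's digits.
    Based only on the abstract; the real FoNE implementation may differ.
    """
    periods = 10.0 ** np.arange(1, num_digits + 1)
    angles = 2 * np.pi * x / periods
    return np.concatenate([np.cos(angles), np.sin(angles)])  # shape (2 * num_digits,)

print(fourier_number_embedding(123456.0).round(3))
```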
arXiv Detail & Related papers (2025-02-13T19:54:59Z)
- Mathematical Reasoning in Large Language Models: Assessing Logical and Arithmetic Errors across Wide Numerical Ranges [0.0]
We introduce GSM-Ranges, a dataset generator that systematically perturbs numerical values in math problems to assess model robustness across varying numerical scales.
We also propose a novel grading methodology that distinguishes between logical and non-logical errors, offering a more precise evaluation of reasoning processes beyond computational accuracy.
arXiv Detail & Related papers (2025-02-12T09:53:10Z)
- Why Do Large Language Models (LLMs) Struggle to Count Letters? [2.8367942280334493]
Large Language Models (LLMs) have achieved unprecedented performance on many complex tasks.
Yet they struggle with seemingly simple tasks, such as counting the occurrences of letters in a word.
arXiv Detail & Related papers (2024-12-19T22:47:08Z)
- How Numerical Precision Affects Mathematical Reasoning Capabilities of LLMs [69.55103380185612]
We identify numerical precision as a key factor that influences Transformer-based Large Language Models' effectiveness in mathematical tasks.
Our results show that Transformers operating with low numerical precision fail to address arithmetic tasks, such as iterated addition and integer multiplication.
In contrast, Transformers with standard numerical precision can efficiently handle these tasks with significantly smaller model sizes.
arXiv Detail & Related papers (2024-10-17T17:59:35Z)
- NumeroLogic: Number Encoding for Enhanced LLMs' Numerical Reasoning [27.584258258635945]
Language models struggle with handling numerical data and performing arithmetic operations.
We propose a simple adjustment to how numbers are represented by including the count of digits before each number.
Requiring the model to consider the number of digits first enhances its reasoning before it generates the actual number.
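A minimal sketch of the digit-count idea is shown below; the `{count:number}` tag format here is illustrative and may not match the paper's exact encoding.

```python
import re

def add_digit_counts(text: str) -> str:
    """Prefix every integer in the text with its digit count.

    Illustrative tag format only; the point is that the model sees how
    many digits a number has before the digits themselves.
    """
    return re.sub(r"\d+", lambda m: f"{{{len(m.group())}:{m.group()}}}", text)

print(add_digit_counts("Add 128 and 74."))  # -> "Add {3:128} and {2:74}."
```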
arXiv Detail & Related papers (2024-03-30T19:46:59Z)
- GSM-Plus: A Comprehensive Benchmark for Evaluating the Robustness of LLMs as Mathematical Problem Solvers [68.77382332826167]
Large language models (LLMs) have achieved impressive performance across various mathematical reasoning benchmarks.
One essential and frequently observed piece of evidence is that LLMs can behave incorrectly when the math questions are slightly changed.
This motivates us to evaluate the robustness of LLMs' math reasoning capability by testing a wide range of question variations.
arXiv Detail & Related papers (2024-02-29T15:26:14Z)
- Language Models Encode the Value of Numbers Linearly [28.88044346200171]
We study how language models encode the value of numbers, a basic element in math.
Experimental results support the existence of encoded number values in large language models.
Our research provides evidence that LLMs encode the value of numbers linearly.
arXiv Detail & Related papers (2024-01-08T08:54:22Z)
- Positional Description Matters for Transformers Arithmetic [58.4739272381373]
Transformers often falter on arithmetic tasks despite their vast capabilities.
We propose several ways to fix the issue, either by modifying the positional encoding directly, or by modifying the representation of the arithmetic task to leverage standard positional encoding differently.
arXiv Detail & Related papers (2023-11-22T00:31:01Z)
- Human Behavioral Benchmarking: Numeric Magnitude Comparison Effects in Large Language Models [4.412336603162406]
Large Language Models (LLMs) do not differentially represent numbers, which are pervasive in text.
In this work, we investigate how well popular LLMs capture the magnitudes of numbers from a behavioral lens.
arXiv Detail & Related papers (2023-05-18T07:50:44Z)
- PAL: Program-aided Language Models [112.94785609781503]
We present Program-Aided Language models (PAL), which use an LLM to read natural language problems and generate programs as intermediate reasoning steps.
PAL offloads the solution step to a programmatic runtime such as a Python interpreter.
We set new state-of-the-art results on all 12 benchmarks.
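A minimal sketch of the program-aided pattern follows; `call_llm` is a hypothetical placeholder for an actual model call, not PAL's real prompting setup.

```python
# The language model writes the reasoning as Python; the interpreter
# (not the model) computes the final answer.

def call_llm(prompt: str) -> str:
    # Placeholder: in PAL this would be a few-shot prompted LLM returning code.
    return "apples = 23\neaten = 9\nanswer = apples - eaten"

def program_aided_answer(question: str):
    program = call_llm(f"Write Python that solves:\n{question}\n")
    scope: dict = {}
    exec(program, scope)  # offload the solution step to the interpreter
    return scope.get("answer")

print(program_aided_answer("There are 23 apples and 9 are eaten. How many are left?"))  # 14
```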
arXiv Detail & Related papers (2022-11-18T18:56:13Z)
- NumGPT: Improving Numeracy Ability of Generative Pre-trained Models [59.931394234642816]
We propose NumGPT, a generative pre-trained model that explicitly models the numerical properties of numbers in texts.
Specifically, it leverages a prototype-based numeral embedding to encode the mantissa of the number and an individual embedding to encode the exponent of the number.
A numeral-aware loss function is designed to integrate numerals into the pre-training objective of NumGPT.
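The mantissa/exponent decomposition that such an embedding operates on can be sketched as follows (illustrative only; NumGPT's prototype-based embeddings themselves are not reproduced here).

```python
import math

def mantissa_exponent(x: float) -> tuple[float, int]:
    """Split a number into a base-10 mantissa in [1, 10) and an integer exponent.

    In a NumGPT-style setup the mantissa would feed one embedding and the
    exponent a separate embedding table; this shows only the decomposition.
    """
    if x == 0:
        return 0.0, 0
    exponent = math.floor(math.log10(abs(x)))
    mantissa = x / 10**exponent
    return mantissa, exponent

print(mantissa_exponent(6371.0))  # roughly (6.371, 3)
print(mantissa_exponent(0.025))   # roughly (2.5, -2)
```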
arXiv Detail & Related papers (2021-09-07T15:06:12Z)