Related papers: Learning Beyond Pattern Matching? Assaying Mathematical Understanding in LLMs

URL: http://arxiv.org/abs/2405.15485v1
Date: Fri, 24 May 2024 12:04:54 GMT
Title: Learning Beyond Pattern Matching? Assaying Mathematical Understanding in LLMs
Authors: Siyuan Guo, Aniket Didolkar, Nan Rosemary Ke, Anirudh Goyal, Ferenc Huszár, Bernhard Schölkopf
Abstract summary: This paper assesses the domain knowledge of LLMs through its understanding of different mathematical skills required to solve problems. Motivated by the use of LLMs as a general scientific assistant, we propose NTKEval to assess changes in LLM's probability distribution. Our systematic analysis finds evidence of domain understanding during in-context learning. Certain instruction-tuning leads to similar performance changes irrespective of training on different data, suggesting a lack of domain understanding across different skills.
Score: 58.09253149867228
License: http://creativecommons.org/licenses/by/4.0/
Abstract: We are beginning to see progress in language model assisted scientific discovery. Motivated by the use of LLMs as a general scientific assistant, this paper assesses the domain knowledge of LLMs through its understanding of different mathematical skills required to solve problems. In particular, we look at not just what the pre-trained model already knows, but how it learned to learn from information during in-context learning or instruction-tuning through exploiting the complex knowledge structure within mathematics. Motivated by the Neural Tangent Kernel (NTK), we propose NTKEval to assess changes in LLM's probability distribution via training on different kinds of math data. Our systematic analysis finds evidence of domain understanding during in-context learning. By contrast, certain instruction-tuning leads to similar performance changes irrespective of training on different data, suggesting a lack of domain understanding across different skills.

Related papers

KnowledgeSmith: Uncovering Knowledge Updating in LLMs with Model Editing and Unlearning [23.5611669268224]
Knowledge editing and machine unlearning are popular approaches for large language models (LLMs) to stay up-to-date. This paper proposes KnowledgeSmith, a unified framework to systematically understand the updating mechanism of LLMs.
arXiv Detail & Related papers (2025-10-01T00:15:25Z)
Unveiling the Learning Mind of Language Models: A Cognitive Framework and Empirical Study [50.065744358362345]
Large language models (LLMs) have shown impressive capabilities across tasks such as mathematics, coding, and reasoning. Yet their learning ability, which is crucial for adapting to dynamic environments and acquiring new knowledge, remains underexplored.
arXiv Detail & Related papers (2025-06-16T13:24:50Z)
Teaching LLMs How to Learn with Contextual Fine-Tuning [9.26781270726841]
We study a novel generalization of instruction tuning, called contextual fine-tuning, to fine-tune LLMs. We empirically demonstrate that this simple yet effective modification improves the ability of LLMs to be fine-tuned rapidly on new datasets.
arXiv Detail & Related papers (2025-03-12T03:45:53Z)
Effective LLM Knowledge Learning via Model Generalization [73.16975077770765]
Large language models (LLMs) are trained on enormous documents that contain extensive world knowledge. It is still not well-understood how knowledge is acquired via autoregressive pre-training. In this paper, we focus on understanding and improving LLM knowledge learning.
arXiv Detail & Related papers (2025-03-05T17:56:20Z)
Sparse Autoencoders Reveal Temporal Difference Learning in Large Language Models [7.115323364355489]
In-context learning, the ability to adapt based on a few examples in the input prompt, is a ubiquitous feature of large language models (LLMs). We first show that Llama 3 70B can solve simple RL problems in-context. We then analyze the residual stream of Llama using Sparse Autoencoders (SAEs) and find representations that closely match temporal difference (TD) errors.
arXiv Detail & Related papers (2024-10-02T06:51:12Z)
Dynamics of Supervised and Reinforcement Learning in the Non-Linear Perceptron [3.069335774032178]
We use a dataset-process approach to derive flow equations describing learning. We characterize the effects of the learning rule (supervised or reinforcement learning, SL/RL) and input-data distribution on the perceptron's learning curve. This approach points a way toward analyzing learning dynamics for more-complex circuit architectures.
arXiv Detail & Related papers (2024-09-05T17:58:28Z)
From Next-Token to Mathematics: The Learning Dynamics of Mathematical Reasoning in Language Models [38.71041354422434]
Large Language Models (LLMs) solely trained on next-token prediction learn to solve a wide range of problems involving mathematical reasoning. We show the first analysis of how mathematical reasoning abilities of several open-weight LLMs develop during pre-training and post-training.
arXiv Detail & Related papers (2024-07-01T01:56:28Z)
Chain-of-Knowledge: Integrating Knowledge Reasoning into Large Language Models by Learning from Knowledge Graphs [55.317267269115845]
Chain-of-Knowledge (CoK) is a comprehensive framework for knowledge reasoning. CoK includes methodologies for both dataset construction and model learning. We conduct extensive experiments with KnowReason.
arXiv Detail & Related papers (2024-06-30T10:49:32Z)
Large Language Models are Limited in Out-of-Context Knowledge Reasoning [65.72847298578071]
Large Language Models (LLMs) possess extensive knowledge and strong capabilities in performing in-context reasoning. This paper focuses on a significant aspect of out-of-context reasoning: Out-of-Context Knowledge Reasoning (OCKR), which is to combine multiple knowledge to infer new knowledge.
arXiv Detail & Related papers (2024-06-11T15:58:59Z)
Towards Reliable Latent Knowledge Estimation in LLMs: In-Context Learning vs. Prompting Based Factual Knowledge Extraction [15.534647327246239]
We propose an approach for estimating the latent knowledge embedded inside large language models (LLMs). We leverage the in-context learning abilities of LLMs to estimate the extent to which an LLM knows the facts stored in a knowledge base.
arXiv Detail & Related papers (2024-04-19T15:40:39Z)
A Comprehensive Study of Knowledge Editing for Large Language Models [82.65729336401027]
Large Language Models (LLMs) have shown extraordinary capabilities in understanding and generating text that closely mirrors human communication. This paper defines the knowledge editing problem and provides a comprehensive review of cutting-edge approaches. We introduce a new benchmark, KnowEdit, for a comprehensive empirical evaluation of representative knowledge editing approaches.
arXiv Detail & Related papers (2024-01-02T16:54:58Z)
MechGPT, a language-based strategy for mechanics and materials modeling that connects knowledge across scales, disciplines and modalities [0.0]
We use a Large Language Model (LLM) to distill question-answer pairs from raw sources followed by fine-tuning. The resulting MechGPT LLM foundation model is used in a series of computational experiments to explore its capacity for knowledge retrieval, various language tasks, hypothesis generation, and connecting knowledge across disparate areas.
arXiv Detail & Related papers (2023-10-16T14:29:35Z)
Can LMs Learn New Entities from Descriptions? Challenges in Propagating Injected Knowledge [72.63368052592004]
We study LMs' abilities to make inferences based on injected facts (or propagate those facts). We find that existing methods for updating knowledge show little propagation of injected knowledge. Yet, prepending entity definitions in an LM's context improves performance across all settings.
arXiv Detail & Related papers (2023-05-02T17:59:46Z)
Anti-Retroactive Interference for Lifelong Learning [65.50683752919089]
We design a paradigm for lifelong learning based on meta-learning and associative mechanism of the brain. It tackles the problem from two aspects: extracting knowledge and memorizing knowledge. It is theoretically analyzed that the proposed learning paradigm can make the models of different tasks converge to the same optimum.
arXiv Detail & Related papers (2022-08-27T09:27:36Z)

This list is automatically generated from the titles and abstracts of the papers on this site.