How does GPT-2 compute greater-than?: Interpreting mathematical
abilities in a pre-trained language model
- URL: http://arxiv.org/abs/2305.00586v5
- Date: Thu, 2 Nov 2023 10:55:18 GMT
- Title: How does GPT-2 compute greater-than?: Interpreting mathematical
abilities in a pre-trained language model
- Authors: Michael Hanna, Ollie Liu and Alexandre Variengien
- Abstract summary: We use mechanistic interpretability techniques to explain the mathematical abilities of GPT-2 small.
We show that GPT-2 small's final multi-layer perceptrons boost the probability of end years greater than the start year.
Our results suggest that GPT-2 small computes greater-than using a complex but general mechanism.
- Score: 52.92472140375308
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Pre-trained language models can be surprisingly adept at tasks they were not
explicitly trained on, but how they implement these capabilities is poorly
understood. In this paper, we investigate the basic mathematical abilities
often acquired by pre-trained language models. Concretely, we use mechanistic
interpretability techniques to explain the (limited) mathematical abilities of
GPT-2 small. As a case study, we examine its ability to take in sentences such
as "The war lasted from the year 1732 to the year 17", and predict valid
two-digit end years (years > 32). We first identify a circuit, a small subset
of GPT-2 small's computational graph that computes this task's output. Then, we
explain the role of each circuit component, showing that GPT-2 small's final
multi-layer perceptrons boost the probability of end years greater than the
start year. Finally, we find related tasks that activate our circuit. Our
results suggest that GPT-2 small computes greater-than using a complex but
general mechanism that activates across diverse contexts.
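To make the task and measurement concrete, the following is a minimal sketch (not the authors' released code; the prompt wording follows the abstract, while the probability-difference measure over two-digit year tokens and the single-token filtering are illustrative assumptions) that probes GPT-2 small's greater-than behaviour with the Hugging Face transformers library:

```python
# Minimal sketch (not the authors' code): probe GPT-2 small's greater-than behaviour.
# The prompt format follows the abstract; the probability-difference measure and the
# single-token filtering are assumptions made for illustration.
import torch
from transformers import GPT2LMHeadModel, GPT2TokenizerFast

tok = GPT2TokenizerFast.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2").eval()

def prob_difference(start_yy: int, century: int = 17) -> float:
    """Probability mass on end years > start year minus mass on end years <= it."""
    prompt = f"The war lasted from the year {century}{start_yy:02d} to the year {century}"
    input_ids = tok(prompt, return_tensors="pt").input_ids
    with torch.no_grad():
        logits = model(input_ids).logits[0, -1]      # logits for the next token
    probs = torch.softmax(logits, dim=-1)

    greater, not_greater = 0.0, 0.0
    for yy in range(100):                            # candidate two-digit end years 00..99
        pieces = tok.encode(f"{yy:02d}")
        if len(pieces) != 1:                         # skip years that are not a single BPE token
            continue
        p = probs[pieces[0]].item()
        if yy > start_yy:
            greater += p
        else:
            not_greater += p
    return greater - not_greater

print(prob_difference(start_yy=32))                  # positive => "greater-than" behaviour
```

A clearly positive return value means the model concentrates next-token probability on end years greater than the start year, which is the behaviour the identified circuit and its final MLPs are shown to implement.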
Related papers
- Learning to grok: Emergence of in-context learning and skill composition in modular arithmetic tasks [5.358878931933351]
We study the emergence of in-context learning and skill composition in a collection of modular arithmetic tasks.
Specifically, we consider a finite collection of linear modular functions $z = a\,x + b\,y \;\mathrm{mod}\; p$, labeled by the vector $(a, b) \in \mathbb{Z}_p^2$ (a minimal task-sampling sketch appears after this list).
arXiv Detail & Related papers (2024-06-04T17:59:36Z) - WizardMath: Empowering Mathematical Reasoning for Large Language Models
via Reinforced Evol-Instruct [128.89645483139236]
We present WizardMath, which enhances the mathematical reasoning abilities of Llama-2, by applying our proposed Reinforcement Learning from Evol-Instruct Feedback (RLEIF) method to the domain of math.
Our model even surpasses ChatGPT-3.5, Claude Instant-1, PaLM-2 and Minerva on GSM8k, and simultaneously surpasses Text-davinci, PaLM-1 and GPT-3 on MATH.
arXiv Detail & Related papers (2023-08-18T14:23:21Z) - SimTeG: A Frustratingly Simple Approach Improves Textual Graph Learning [131.04781590452308]
We present SimTeG, a frustratingly Simple approach for Textual Graph learning.
We first perform supervised parameter-efficient fine-tuning (PEFT) of a pre-trained LM on the downstream task.
We then generate node embeddings using the last hidden states of the finetuned LM.
arXiv Detail & Related papers (2023-08-03T07:00:04Z) - Towards Automated Circuit Discovery for Mechanistic Interpretability [7.605075513099429]
This paper systematizes the mechanistic interpretability process that prior researchers have followed.
By varying the dataset, metric, and units under investigation, researchers can understand the functionality of each component.
We propose several algorithms and reproduce previous interpretability results to validate them.
arXiv Detail & Related papers (2023-04-28T17:36:53Z) - Classification of integers based on residue classes via modern deep
learning algorithms [3.6396223542930772]
We tested multiple deep learning architectures and feature engineering approaches on classifying integers based on their residues when divided by small prime numbers.
We also evaluated Automated Machine Learning platforms from Amazon, Google and Microsoft, and found that they failed on this task without appropriately engineered features.
In conclusion, feature engineering remains an important task to improve performance and increase interpretability of machine-learning models.
arXiv Detail & Related papers (2023-04-03T19:53:31Z) - Mathematical Capabilities of ChatGPT [35.71603158908465]
We release two new datasets: GHOSTS and miniGHOSTS.
These are the first natural-language datasets curated by working researchers in mathematics.
We benchmark the models on a range of fine-grained performance metrics.
arXiv Detail & Related papers (2023-01-31T18:59:03Z) - The Unreliability of Explanations in Few-Shot In-Context Learning [50.77996380021221]
We focus on two NLP tasks that involve reasoning over text, namely question answering and natural language inference.
We show that explanations judged as good by humans, i.e. those that are logically consistent with the input, usually indicate more accurate predictions.
We present a framework for calibrating model predictions based on the reliability of the explanations.
arXiv Detail & Related papers (2022-05-06T17:57:58Z) - GLaM: Efficient Scaling of Language Models with Mixture-of-Experts [84.33607245023049]
We propose and develop a family of language models named GLaM (Generalist Language Model).
GLaM uses a sparsely activated mixture-of-experts architecture to scale the model capacity while also incurring substantially less training cost compared to dense variants.
It consumes only 1/3 of the energy used to train GPT-3 and requires half the FLOPs for inference, while still achieving better overall zero-shot and one-shot performance across 29 NLP tasks.
arXiv Detail & Related papers (2021-12-13T18:58:19Z) - Kronecker Decomposition for GPT Compression [8.60086973058282]
GPT is an auto-regressive Transformer-based pre-trained language model which has attracted a lot of attention in the natural language processing (NLP) domain.
Despite its superior performance, GPT can be prohibitively expensive to deploy on devices with limited computational power or memory.
In this work, we use Kronecker decomposition to compress the linear mappings of the GPT-2 model.
arXiv Detail & Related papers (2021-10-15T15:28:39Z) - MC-BERT: Efficient Language Pre-Training via a Meta Controller [96.68140474547602]
Large-scale pre-training is computationally expensive.
ELECTRA, an early attempt to accelerate pre-training, trains a discriminative model that predicts whether each input token was replaced by a generator.
We propose a novel meta-learning framework, MC-BERT, to achieve better efficiency and effectiveness.
arXiv Detail & Related papers (2020-06-10T09:22:19Z)
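As referenced above, here is a minimal sketch (an illustration under assumed conventions, not the code of the "Learning to grok" paper) of sampling one task from the modular arithmetic family $z = a\,x + b\,y \;\mathrm{mod}\; p$:

```python
# Minimal sketch: sample one task from the family z = (a*x + b*y) mod p,
# labelled by the vector (a, b) in Z_p^2, together with a few in-context examples.
# The prime p and the number of examples are illustrative assumptions.
import random

def sample_task(p: int = 97, n_examples: int = 4):
    a, b = random.randrange(p), random.randrange(p)   # hidden task label (a, b)
    examples = []
    for _ in range(n_examples):
        x, y = random.randrange(p), random.randrange(p)
        z = (a * x + b * y) % p                       # linear modular function
        examples.append((x, y, z))
    return (a, b), examples

print(sample_task())
```

Each task is identified by its hidden label $(a, b)$; a model learning in context must infer that label from the example triples.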