Generative Models as a Complex Systems Science: How can we make sense of large language model behavior?
- URL: http://arxiv.org/abs/2308.00189v1
- Date: Mon, 31 Jul 2023 22:58:41 GMT
- Title: Generative Models as a Complex Systems Science: How can we make sense of large language model behavior?
- Authors: Ari Holtzman, Peter West, Luke Zettlemoyer
- Abstract summary: Coaxing desired behaviors out of pretrained models, while avoiding undesirable ones, has redefined NLP.
We argue for a systematic effort to decompose language model behavior into categories that explain cross-task performance.
- Score: 75.79305790453654
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Coaxing desired behaviors out of pretrained models, while avoiding undesirable ones, has redefined NLP and is reshaping how we interact with computers. What was once an engineering discipline, in which building blocks are stacked one on top of the other, is arguably already a complex systems science, in which emergent behaviors are sought out to support previously unimagined use cases.
Despite the ever-increasing number of benchmarks that measure task performance, we lack explanations of the behaviors language models exhibit that allow them to complete these tasks in the first place. We argue for a systematic effort to decompose language model behavior into categories that explain cross-task performance, to guide mechanistic explanations and help future-proof analytic research.
Related papers
- A Pattern Language for Machine Learning Tasks [0.0]
We view objective functions as constraints on the behaviour of learners.
We develop a formal graphical language that allows us to separate the core tasks of a behaviour from its implementation details.
As a proof of concept, we design a novel task that enables converting classifiers into generative models, which we call "manipulators".
arXiv Detail & Related papers (2024-07-02T16:50:27Z)
- Anchor function: a type of benchmark functions for studying language models [18.005251277048178]
We propose the concept of an anchor function to study language models on learning tasks that follow an "anchor-key" pattern (a toy data-generation sketch appears after this list).
The anchor function plays a role analogous to that of mice in diabetes research, making it particularly suitable for academic research.
arXiv Detail & Related papers (2024-01-16T12:10:49Z)
- Large Language Models as Analogical Reasoners [155.9617224350088]
Chain-of-thought (CoT) prompting for language models demonstrates impressive performance across reasoning tasks.
We introduce a new prompting approach, analogical prompting, designed to automatically guide the reasoning process of large language models (an illustrative prompt template appears after this list).
arXiv Detail & Related papers (2023-10-03T00:57:26Z)
- Robust Graph Representation Learning via Predictive Coding [46.22695915912123]
Predictive coding is a message-passing framework initially developed to model information processing in the brain.
In this work, we build models that rely on the message-passing rule of predictive coding (a simplified update sketch appears after this list).
We show that the proposed models are comparable to standard ones in terms of performance on both inductive and transductive tasks.
arXiv Detail & Related papers (2022-12-09T03:58:22Z)
- A Causal Framework to Quantify the Robustness of Mathematical Reasoning with Language Models [81.15974174627785]
We study the behavior of language models in terms of robustness and sensitivity to direct interventions in the input space (a sketch of both intervention types appears after this list).
Our analysis shows that robustness does not improve steadily with model size, but the GPT-3 Davinci models (175B) achieve a dramatic improvement in both robustness and sensitivity compared to all other GPT variants.
arXiv Detail & Related papers (2022-10-21T15:12:37Z)
- Learning to Reason With Relational Abstractions [65.89553417442049]
We study how to build stronger reasoning capability in language models using the idea of relational abstractions.
We find that models supplied with sequences of relational abstractions as prompts can solve tasks with significantly higher accuracy.
arXiv Detail & Related papers (2022-10-06T00:27:50Z)
- Beyond the Imitation Game: Quantifying and extrapolating the capabilities of language models [648.3665819567409]
Language models demonstrate both quantitative improvement and new qualitative capabilities with increasing scale.
BIG-bench consists of 204 tasks, contributed by 450 authors across 132 institutions.
We evaluate the behavior of OpenAI's GPT models, Google-internal dense transformer architectures, and Switch-style sparse transformers on BIG-bench.
arXiv Detail & Related papers (2022-06-09T17:05:34Z)
- A Mathematical Exploration of Why Language Models Help Solve Downstream Tasks [35.046596668631615]
Autoregressive language models, pretrained on large text corpora to do well at next-word prediction, have been successful at solving many downstream tasks.
This paper initiates a mathematical study of this phenomenon for the downstream task of text classification (a toy reduction from classification to next-word prediction appears after this list).
arXiv Detail & Related papers (2020-10-07T20:56:40Z)
- Text Modular Networks: Learning to Decompose Tasks in the Language of Existing Models [61.480085460269514]
We propose a framework for building interpretable systems that learn to solve complex tasks by decomposing them into simpler ones solvable by existing models.
We use this framework to build ModularQA, a system that answers multi-hop reasoning questions by decomposing them into sub-questions answerable by a neural factoid single-span QA model and a symbolic calculator (a stubbed decomposition loop appears after this list).
arXiv Detail & Related papers (2020-09-01T23:45:42Z)
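For the anchor-function entry above, here is a minimal sketch of what "anchor-key" data might look like, assuming each anchor token selects a simple arithmetic function applied to a key. The tokens and functions below are invented for illustration and are not taken from the paper.

```python
import random

# Hypothetical anchor-key data: each anchor token selects a simple function
# that is applied to the key; a model must learn the anchor -> function map.
ANCHORS = {
    "A": lambda k: k + 1,   # anchor "A" means "increment the key"
    "B": lambda k: k * 2,   # anchor "B" means "double the key"
    "C": lambda k: k - 3,   # anchor "C" means "subtract three"
}

def make_example(rng: random.Random) -> tuple[str, int]:
    """Return a prompt like 'B 7 =' and its target value (14)."""
    anchor = rng.choice(list(ANCHORS))
    key = rng.randint(0, 20)
    return f"{anchor} {key} =", ANCHORS[anchor](key)

rng = random.Random(0)
for prompt, target in (make_example(rng) for _ in range(5)):
    print(prompt, target)
```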
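For the analogical-prompting entry, the approach asks a model to self-generate related exemplars before solving the target problem. A sketch of what such a prompt might look like follows; the wording is illustrative, not the paper's exact template.

```python
def analogical_prompt(problem: str, n_exemplars: int = 3) -> str:
    """Build a single prompt that asks the model to recall related problems
    (self-generated exemplars) before solving the target problem."""
    return (
        f"Problem: {problem}\n\n"
        f"First, recall {n_exemplars} relevant problems you have seen, and "
        "solve each one briefly.\n"
        "Then, using those solutions as analogies, solve the original "
        "problem step by step.\n"
    )

print(analogical_prompt("What is the area of a square with perimeter 36?"))
```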
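For the predictive-coding entry, a heavily simplified sketch of a predictive-coding-style relaxation on a graph: each node's state is iteratively nudged toward a prediction computed from its neighbors, shrinking the prediction errors. The update rule and all names here are an assumption for illustration, not the paper's actual model.

```python
import numpy as np

rng = np.random.default_rng(0)
n, d = 6, 4                          # 6 nodes, 4-dim latent state per node
A = (rng.random((n, n)) < 0.4).astype(float)
np.fill_diagonal(A, 0)               # no self-loops
A = np.maximum(A, A.T)               # undirected graph
deg = A.sum(1, keepdims=True) + 1e-8

W = rng.normal(scale=0.3, size=(d, d))   # shared prediction weights
x = rng.normal(size=(n, d))              # node states

for step in range(50):
    mu = ((A @ x) / deg) @ W.T       # prediction: mean neighbor state, mapped by W
    err = x - mu                     # per-node prediction error
    x -= 0.1 * err                   # relax states toward their predictions
    if step % 10 == 0:
        print(step, float((err ** 2).sum()))   # total error shrinks
```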
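For the causal-framework entry, a sketch of the two kinds of input intervention the abstract describes: an operand intervention, to which a good model should be sensitive, and a surface-form intervention, to which it should be robust. The `solve` function is a stub standing in for a language model so the harness runs.

```python
import re

def solve(problem: str) -> int:
    """Stub 'model' for illustration: actually computes the answer."""
    a, b = map(int, re.findall(r"\d+", problem))
    return a + b

base = "Sam has 3 apples and buys 4 more. How many apples does Sam have?"

# Sensitivity: intervene on an operand; the answer should change to the
# new ground truth.
operand_shift = base.replace("4", "9")
print(solve(base), solve(operand_shift))     # 7 then 12

# Robustness: intervene on irrelevant surface form; the answer should
# not change.
paraphrase = base.replace("Sam", "Alex").replace("apples", "pears")
print(solve(base) == solve(paraphrase))      # True for a robust model
```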
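For the mathematical-exploration entry, one common reduction from text classification to next-word prediction (a hedged illustration of the general idea, not necessarily the paper's construction) scores each label by the probability the language model assigns to a label word as the continuation. `next_token_prob` is a hypothetical interface, stubbed here so the sketch runs.

```python
def next_token_prob(prefix: str, token: str) -> float:
    """Stub LM interface: pretend positive reviews make 'great' more likely."""
    positive_hint = "loved" in prefix or "wonderful" in prefix
    if token == "great":
        return 0.8 if positive_hint else 0.2
    return 0.2 if positive_hint else 0.8    # token == "terrible"

def classify(review: str) -> str:
    # Reformulate classification as sentence completion and compare the
    # probabilities of the two label words.
    prefix = f"Review: {review} Overall, the movie was"
    p_pos = next_token_prob(prefix, "great")
    p_neg = next_token_prob(prefix, "terrible")
    return "positive" if p_pos > p_neg else "negative"

print(classify("I loved every minute of it."))   # positive
print(classify("A dull, plodding mess."))        # negative
```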
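For the Text Modular Networks entry, a sketch of a ModularQA-style control loop: a decomposer emits sub-questions tagged with the sub-model that should answer them, and answers feed back into later sub-questions. The decomposer, QA model, and question here are stubs invented for illustration; only the decomposition idea comes from the abstract.

```python
def qa_model(question: str) -> str:
    """Stub neural factoid QA model with canned knowledge."""
    facts = {
        "What year was the Eiffel Tower built?": "1889",
        "What year was the Brooklyn Bridge built?": "1883",
    }
    return facts[question]

def calculator(expression: str) -> str:
    """Stub symbolic calculator for 'a - b' / 'a + b' expressions."""
    a, op, b = expression.split()
    return str(int(a) - int(b)) if op == "-" else str(int(a) + int(b))

def decompose(question: str, answers: list[str]):
    """Stub decomposer: yields (module, sub-question) pairs in order."""
    if len(answers) == 0:
        return "qa", "What year was the Eiffel Tower built?"
    if len(answers) == 1:
        return "qa", "What year was the Brooklyn Bridge built?"
    if len(answers) == 2:
        return "calc", f"{answers[0]} - {answers[1]}"
    return None  # done

question = "How many years after the Brooklyn Bridge was the Eiffel Tower built?"
answers: list[str] = []
while (step := decompose(question, answers)) is not None:
    module, sub_q = step
    answers.append(qa_model(sub_q) if module == "qa" else calculator(sub_q))
print(answers[-1])  # 6
```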