A Precis of Language Models are not Models of Language
- URL: http://arxiv.org/abs/2205.07634v1
- Date: Mon, 16 May 2022 12:50:58 GMT
- Title: A Precis of Language Models are not Models of Language
- Authors: Csaba Veres
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Natural Language Processing is one of the leading application areas in the current resurgence of Artificial Intelligence, spearheaded by Artificial Neural Networks. We show that despite their many successes at performing linguistic tasks, Large Neural Language Models are ill-suited as comprehensive models of natural language. The wider implication is that, in spite of the often overbearing optimism about AI, modern neural models do not represent a revolution in our understanding of cognition.
Related papers
- Modeling language contact with the Iterated Learning Model [0.0]
Iterated learning models are agent-based models of language change.
A recently introduced variant, the Semi-Supervised ILM, is used to simulate language contact; a minimal sketch of the basic iterated learning loop follows this entry.
arXiv Detail & Related papers (2024-06-11T01:43:23Z)
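Since the iterated learning framework is described only at a high level above, here is a minimal, self-contained sketch of the generational loop it refers to; the toy meaning space, syllable inventory, and bottleneck size are illustrative assumptions, not details from the paper.

```python
import random

# Toy iterated learning loop: each generation learns a meaning->signal
# mapping from a limited sample of the previous generation's output.

MEANINGS = [(a, b) for a in range(4) for b in range(4)]
SYLLABLES = ["ka", "mo", "ti", "su"]

def random_language():
    """An initial (holistic) language: an arbitrary signal per meaning."""
    return {m: random.choice(SYLLABLES) + random.choice(SYLLABLES)
            for m in MEANINGS}

def learn(observations):
    """Learner memorizes observed pairs and invents signals for the rest."""
    language = dict(observations)
    for m in MEANINGS:
        if m not in language:
            language[m] = random.choice(SYLLABLES) + random.choice(SYLLABLES)
    return language

def iterate(generations=10, bottleneck=8):
    language = random_language()
    for _ in range(generations):
        # Transmission bottleneck: the learner only sees a subset.
        shown = random.sample(MEANINGS, bottleneck)
        language = learn([(m, language[m]) for m in shown])
    return language

print(iterate())
```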
- Carpe Diem: On the Evaluation of World Knowledge in Lifelong Language Models [74.81091933317882]
We introduce EvolvingQA, a temporally evolving question-answering benchmark designed for training and evaluating LMs on an evolving Wikipedia database.
We find that existing continual learning baselines struggle to update and remove outdated knowledge.
Our work aims to model the dynamic nature of real-world information and to support faithful evaluation of how well language models adapt to evolving knowledge.
arXiv Detail & Related papers (2023-11-14T12:12:02Z)
- Formal Aspects of Language Modeling [74.16212987886013]
Large language models have become one of the most commonly deployed NLP inventions.
These notes accompany the theoretical portion of the ETH Zürich course on large language models.
arXiv Detail & Related papers (2023-11-07T20:21:42Z)
- Diffusion Language Models Can Perform Many Tasks with Scaling and Instruction-Finetuning [56.03057119008865]
We show that scaling diffusion language models can effectively make them strong language learners.
We build competent diffusion language models at scale by first acquiring knowledge from massive data.
Experiments show that scaling diffusion language models consistently improves performance across downstream language tasks.
arXiv Detail & Related papers (2023-08-23T16:01:12Z)
- Commonsense Knowledge Transfer for Pre-trained Language Models [83.01121484432801]
We introduce commonsense knowledge transfer, a framework to transfer the commonsense knowledge stored in a neural commonsense knowledge model to a general-purpose pre-trained language model.
It first exploits general texts to form queries for extracting commonsense knowledge from the neural commonsense knowledge model.
It then refines the language model with two self-supervised objectives: commonsense mask infilling and commonsense relation prediction (a sketch of the mask-infilling objective follows this entry).
arXiv Detail & Related papers (2023-06-04T15:44:51Z)
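The summary names two objectives but does not spell them out; below is a minimal sketch of generic masked-token infilling with an off-the-shelf masked language model. The roberta-base checkpoint and the example sentence are placeholders, not the authors' setup, and the relation-prediction objective is omitted.

```python
import torch
from transformers import AutoModelForMaskedLM, AutoTokenizer

tok = AutoTokenizer.from_pretrained("roberta-base")
model = AutoModelForMaskedLM.from_pretrained("roberta-base")

# An illustrative commonsense-style statement (invented example).
text = "You would plant a tree because you want shade in the summer."
enc = tok(text, return_tensors="pt")
labels = enc["input_ids"].clone()

# Choose ~15% of non-special tokens to mask.
special = torch.tensor(tok.get_special_tokens_mask(
    enc["input_ids"][0].tolist(), already_has_special_tokens=True)).bool()
mask = (torch.rand(special.shape) < 0.15) & ~special
if not mask.any():                       # ensure at least one masked token
    mask[(~special).nonzero()[0]] = True

enc["input_ids"][0, mask] = tok.mask_token_id
labels[0, ~mask] = -100                  # compute loss only on masked positions

loss = model(**enc, labels=labels).loss  # the infilling objective
loss.backward()                          # one gradient step of refinement
```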
- Beyond the limitations of any imaginable mechanism: large language models and psycholinguistics [0.0]
Large language models provide a model for language.
They are useful practically, as a tool; comparatively, as an illustration; and philosophically, as a basis for recasting the relationship between language and thought.
arXiv Detail & Related papers (2023-02-28T20:49:38Z)
- Deep Learning Models to Study Sentence Comprehension in the Human Brain [0.1503974529275767]
Recent artificial neural networks that process natural language achieve unprecedented performance in tasks requiring sentence-level understanding.
We review works that compare these artificial language models with human brain activity and we assess the extent to which this approach has improved our understanding of the neural processes involved in natural language comprehension.
arXiv Detail & Related papers (2023-01-16T10:31:25Z)
- Overcoming Barriers to Skill Injection in Language Modeling: Case Study in Arithmetic [14.618731441943847]
We develop a novel framework that enables language models to be mathematically proficient while retaining their linguistic prowess.
Specifically, we offer information-theoretic interventions to overcome the catastrophic forgetting of linguistic skills that occurs when non-linguistic skills are injected into language models; a generic illustration of penalizing such forgetting appears after this entry.
arXiv Detail & Related papers (2022-11-03T18:53:30Z)
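The summary does not specify the information-theoretic interventions themselves. As a generic illustration of the catastrophic-forgetting problem being addressed, here is a sketch of a classic alternative mitigation, an elastic-weight-consolidation-style penalty that anchors parameters important to the old skill while fine-tuning on a new one; the tiny stand-in model and stubbed importances are assumptions, not the paper's method.

```python
import torch
import torch.nn as nn

model = nn.Linear(8, 8)  # stand-in for a pre-trained language model

# Snapshot taken after training the original (linguistic) skill.
old_params = {n: p.detach().clone() for n, p in model.named_parameters()}
# Parameter importances; a real run would estimate these from squared
# gradients on the old task (the Fisher information). Stubbed as ones.
fisher = {n: torch.ones_like(p) for n, p in model.named_parameters()}

def ewc_penalty(model, lam=1.0):
    """Penalize drift in parameters that mattered for the old skill."""
    return lam * sum((fisher[n] * (p - old_params[n]) ** 2).sum()
                     for n, p in model.named_parameters())

# During fine-tuning on the new skill (e.g., arithmetic):
#   loss = task_loss(batch) + ewc_penalty(model)
```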
- Dependency-based Mixture Language Models [53.152011258252315]
We introduce the Dependency-based Mixture Language Models.
In detail, we first train neural language models with a novel dependency modeling objective.
We then form the next-token probability by mixing the dependency modeling distribution with the standard self-attention distribution; a toy illustration of mixing two next-token distributions follows this entry.
arXiv Detail & Related papers (2022-03-19T06:28:30Z)
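Going only by the summary, the final prediction mixes a dependency-informed distribution with a self-attention one. A toy sketch of such a mixture follows; the fixed weight `lam` and the random placeholder distributions are invented for illustration and are not the paper's parameterization.

```python
import torch

vocab = 5
# Placeholder next-token distributions over a 5-word vocabulary:
p_dep = torch.softmax(torch.randn(vocab), dim=0)   # dependency-based
p_attn = torch.softmax(torch.randn(vocab), dim=0)  # self-attention-based

lam = 0.3  # mixture weight; in practice this could be learned or contextual
p_next = lam * p_dep + (1 - lam) * p_attn

# A convex combination of two distributions is itself a distribution.
assert torch.isclose(p_next.sum(), torch.tensor(1.0))
next_token = torch.argmax(p_next).item()
```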
- Language Models are not Models of Language [0.0]
Transfer learning has enabled large deep neural networks trained on the language modeling task to vastly improve performance across a wide range of tasks.
We argue that the term language model is misleading because deep learning models are not theoretical models of language.
arXiv Detail & Related papers (2021-12-13T22:39:46Z)
- Towards Zero-shot Language Modeling [90.80124496312274]
We construct a neural model that is inductively biased towards learning human languages.
This bias takes the form of a prior distribution over model weights, which we infer from a sample of typologically diverse training languages (a toy sketch of such a prior follows this entry).
We harness additional language-specific side information as distant supervision for held-out languages.
arXiv Detail & Related papers (2021-08-06T23:49:18Z)
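As a rough illustration of the idea described above, one can picture the inductive bias as a Gaussian prior over model weights estimated from per-language models and used to regularize a model for a held-out language; the weight dimension, the number of languages, and the diagonal-Gaussian form are assumptions made for the sketch.

```python
import torch

# Stand-ins for parameter vectors of models fitted on typologically
# diverse training languages (real models would have far more weights).
per_language_weights = [torch.randn(16) for _ in range(20)]

W = torch.stack(per_language_weights)
prior_mean, prior_var = W.mean(0), W.var(0)

def prior_log_penalty(w):
    """Negative log of a diagonal Gaussian prior over weights, pulling a
    held-out language's model toward typologically plausible settings."""
    return 0.5 * (((w - prior_mean) ** 2) / prior_var).sum()

# MAP objective for the held-out language:
#   loss = nll_on_heldout_data(w) + prior_log_penalty(w)
```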