Efficient Induction of Language Models Via Probabilistic Concept
Formation
- URL: http://arxiv.org/abs/2212.11937v1
- Date: Thu, 22 Dec 2022 18:16:58 GMT
- Title: Efficient Induction of Language Models Via Probabilistic Concept
Formation
- Authors: Christopher J. MacLellan, Peter Matsakis, Pat Langley
- Abstract summary: We present a novel approach to the acquisition of language models from corpora.
The framework builds on Cobweb, an early system for constructing taxonomic hierarchies of probabilistic concepts.
We explore three new extensions to Cobweb -- the Word, Leaf, and Path variants.
- Score: 13.632454840363916
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: This paper presents a novel approach to the acquisition of language models
from corpora. The framework builds on Cobweb, an early system for constructing
taxonomic hierarchies of probabilistic concepts that used a tabular,
attribute-value encoding of training cases and concepts, making it unsuitable
for sequential input like language. In response, we explore three new
extensions to Cobweb -- the Word, Leaf, and Path variants. These systems encode
each training case as an anchor word and surrounding context words, and they
store probabilistic descriptions of concepts as distributions over anchor and
context information. As in the original Cobweb, a performance element sorts a
new instance downward through the hierarchy and uses the final node to predict
missing features. Learning is interleaved with performance, updating concept
probabilities and hierarchy structure as classification occurs. Thus, the new
approaches process training cases in an incremental, online manner that is very
different from most methods for statistical language learning. We examine how
well the three variants place synonyms together and keep homonyms apart, their
ability to recall synonyms as a function of training set size, and their
training efficiency. Finally, we discuss related work on incremental learning
and directions for further research.
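To make the described mechanism concrete, the sketch below is a minimal, illustrative Python rendering of a Cobweb-style language learner: each training case is an anchor word plus its context words, each concept stores count-based distributions over both, learning is interleaved with classification, and prediction sorts a context to its best-matching concept and reads off the most probable anchor. It is not the authors' implementation: it assumes a flat set of leaf concepts under one root, a made-up context-overlap score, and a fixed match threshold, whereas the actual Word, Leaf, and Path variants build a full taxonomy and choose among Cobweb's restructuring operations using category utility. Names such as CobwebLanguageSketch, ingest, and predict_anchor are hypothetical.

```python
from collections import Counter


class ConceptNode:
    """One probabilistic concept: counts over anchor words and context words."""
    def __init__(self):
        self.count = 0
        self.anchor = Counter()    # distribution over anchor words seen here
        self.context = Counter()   # distribution over surrounding context words


class CobwebLanguageSketch:
    """Simplified stand-in: a root plus a flat set of leaf concepts.
    (The real systems grow a full hierarchy guided by category utility.)"""
    def __init__(self, match_threshold=0.3):
        self.root = ConceptNode()
        self.leaves = []
        self.match_threshold = match_threshold

    @staticmethod
    def _update(node, anchor_word, context_words):
        node.count += 1
        node.anchor[anchor_word] += 1
        node.context.update(context_words)

    @staticmethod
    def _fit(node, context_words):
        """Crude fit score: fraction of the context words already seen in this concept."""
        if not context_words:
            return 0.0
        return sum(1 for w in context_words if node.context[w] > 0) / len(context_words)

    def ingest(self, anchor_word, context_words):
        """Learning interleaved with performance: sort the case to the
        best-fitting concept, creating a new one if nothing fits well."""
        self._update(self.root, anchor_word, context_words)
        best = max(self.leaves, key=lambda n: self._fit(n, context_words), default=None)
        if best is None or self._fit(best, context_words) < self.match_threshold:
            best = ConceptNode()
            self.leaves.append(best)
        self._update(best, anchor_word, context_words)

    def predict_anchor(self, context_words):
        """Performance element: sort the context to its concept and predict
        the missing anchor word from that concept's anchor distribution."""
        best = max(self.leaves, key=lambda n: self._fit(n, context_words), default=self.root)
        return best.anchor.most_common(1)[0][0] if best.anchor else None


# Toy usage on (anchor, context) windows from a tiny corpus.
model = CobwebLanguageSketch()
for anchor, ctx in [("bank", ["river", "water"]),
                    ("bank", ["money", "loan"]),
                    ("stream", ["river", "water"])]:
    model.ingest(anchor, ctx)
print(model.predict_anchor(["river", "water"]))  # predicts an anchor seen with this context
```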
Related papers
- TRESTLE: A Model of Concept Formation in Structured Domains [4.399333421690168]
We present TRESTLE, an incremental account of probabilistic concept formation in structured domains.
We evaluate TRESTLE's performance on a supervised learning task and an unsupervised clustering task.
arXiv Detail & Related papers (2024-10-14T15:00:43Z)
- Incremental and Data-Efficient Concept Formation to Support Masked Word Prediction [0.7260176762955546]
This paper introduces Cobweb4L, a novel approach for efficient language model learning that supports masked word prediction.
We show that Cobweb4L learns rapidly and achieves performance comparable to and even superior to Word2Vec.
arXiv Detail & Related papers (2024-09-19T03:48:31Z)
- CoLLEGe: Concept Embedding Generation for Large Language Models [12.812113254812028]
CoLLEGe is a meta-learning framework capable of generating flexible embeddings for new concepts.
We design a series of tasks to test new concept learning in challenging real-world scenarios.
arXiv Detail & Related papers (2024-03-22T17:26:05Z)
- A Recursive Bateson-Inspired Model for the Generation of Semantic Formal Concepts from Spatial Sensory Data [77.34726150561087]
This paper presents a new symbolic-only method for the generation of hierarchical concept structures from complex sensory data.
The approach is based on Bateson's notion of difference as the key to the genesis of an idea or a concept.
The model is able to produce fairly rich yet human-readable conceptual representations without training.
arXiv Detail & Related papers (2023-07-16T15:59:13Z)
- Joint Language Semantic and Structure Embedding for Knowledge Graph Completion [66.15933600765835]
We propose to jointly embed the semantics in the natural language description of the knowledge triplets with their structure information.
Our method embeds knowledge graphs for the completion task via fine-tuning pre-trained language models.
Our experiments on a variety of knowledge graph benchmarks have demonstrated the state-of-the-art performance of our method.
arXiv Detail & Related papers (2022-09-19T02:41:02Z)
- Dynamic Inference with Neural Interpreters [72.90231306252007]
We present Neural Interpreters, an architecture that factorizes inference in a self-attention network as a system of modules.
Inputs to the model are routed through a sequence of functions in a way that is learned end to end.
We show that Neural Interpreters perform on par with the vision transformer while using fewer parameters, and transfer to a new task in a sample-efficient manner.
arXiv Detail & Related papers (2021-10-12T23:22:45Z)
- A Minimalist Dataset for Systematic Generalization of Perception, Syntax, and Semantics [131.93113552146195]
We present a new dataset, Handwritten arithmetic with INTegers (HINT), to examine machines' capability of learning generalizable concepts.
In HINT, machines are tasked with learning how concepts are perceived from raw signals such as images.
We undertake extensive experiments with various sequence-to-sequence models, including RNNs, Transformers, and GPT-3.
arXiv Detail & Related papers (2021-03-02T01:32:54Z)
- Grounded Compositional Outputs for Adaptive Language Modeling [59.02706635250856]
A language model's vocabulary, typically selected before training and permanently fixed later, affects its size.
We propose a fully compositional output embedding layer for language models.
To our knowledge, the result is the first word-level language model with a size that does not depend on the training vocabulary.
arXiv Detail & Related papers (2020-09-24T07:21:14Z)
- Concept Learners for Few-Shot Learning [76.08585517480807]
We propose COMET, a meta-learning method that improves generalization ability by learning to learn along human-interpretable concept dimensions.
We evaluate our model on few-shot tasks from diverse domains, including fine-grained image classification, document categorization and cell type annotation.
arXiv Detail & Related papers (2020-07-14T22:04:17Z)
- On the Learnability of Concepts: With Applications to Comparing Word Embedding Algorithms [0.0]
We introduce the notion of "concept" as a list of words that have shared semantic content.
We first use this notion to measure the learnability of concepts on pretrained word embeddings.
We then develop a statistical analysis of concept learnability, based on hypothesis testing and ROC curves, in order to compare the relative merits of various embedding algorithms.
arXiv Detail & Related papers (2020-06-17T14:25:36Z)
- How Can We Accelerate Progress Towards Human-like Linguistic Generalization? [22.810889064523167]
The paper describes and critiques the Pretraining-Agnostic Identically Distributed (PAID) evaluation paradigm.
This paradigm consists of three stages: (1) pre-training of a word prediction model on a corpus of arbitrary size; (2) fine-tuning (transfer learning) on a training set representing a classification task; (3) evaluation on a test set drawn from the same distribution as that training set.
arXiv Detail & Related papers (2020-05-03T00:31:15Z)