Incremental and Data-Efficient Concept Formation to Support Masked Word Prediction
- URL: http://arxiv.org/abs/2409.12440v1
- Date: Thu, 19 Sep 2024 03:48:31 GMT
- Title: Incremental and Data-Efficient Concept Formation to Support Masked Word Prediction
- Authors: Xin Lian, Nishant Baglodi, Christopher J. MacLellan
- Abstract summary: This paper introduces Cobweb4L, a novel approach for efficient language model learning that supports masked word prediction.
We show that Cobweb4L learns rapidly and achieves performance comparable to and even superior to Word2Vec.
- Score: 0.7260176762955546
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: This paper introduces Cobweb4L, a novel approach for efficient language model learning that supports masked word prediction. The approach builds on Cobweb, an incremental system that learns a hierarchy of probabilistic concepts. Each concept stores the frequencies of words that appear in instances tagged with that concept label. The system utilizes an attribute value representation to encode words and their surrounding context into instances. Cobweb4L uses the information theoretic variant of category utility and a new performance mechanism that leverages multiple concepts to generate predictions. We demonstrate that with these extensions it significantly outperforms prior Cobweb performance mechanisms that use only a single node to generate predictions. Further, we demonstrate that Cobweb4L learns rapidly and achieves performance comparable to and even superior to Word2Vec. Next, we show that Cobweb4L and Word2Vec outperform BERT in the same task with less training data. Finally, we discuss future work to make our conclusions more robust and inclusive.
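The abstract describes concepts that store word frequencies for instances tagged with them, and a performance mechanism that mixes predictions from multiple concepts rather than a single node. The following is a minimal illustrative sketch of that idea, not the paper's actual algorithm: the `Concept` class, the context-overlap `score` heuristic, and the depth-based weighting are all invented here for illustration.

```python
from collections import Counter

class Concept:
    """A node in a Cobweb-style hierarchy: stores word frequency counts
    for the instances categorized under it, plus child concepts."""
    def __init__(self):
        self.counts = Counter()   # target-word frequencies at this concept
        self.context = Counter()  # context-word frequencies at this concept
        self.n = 0                # number of instances absorbed
        self.children = []

    def absorb(self, context_words, target_word):
        """Incrementally update this concept with one instance."""
        self.n += 1
        self.counts[target_word] += 1
        self.context.update(context_words)

    def score(self, context_words):
        """Crude match score: overlap between the instance's context
        and this concept's stored context distribution."""
        total = sum(self.context.values()) or 1
        return sum(self.context[w] / total for w in context_words)

def predict_masked(root, context_words):
    """Descend from the root along the best-matching path, mixing the
    word distributions of every concept visited (a stand-in for the
    paper's multi-concept performance mechanism)."""
    mixed = Counter()
    node, weight = root, 1.0
    while node is not None:
        total = node.n or 1
        for word, count in node.counts.items():
            mixed[word] += weight * count / total
        node = max(node.children,
                   key=lambda child: child.score(context_words),
                   default=None)
        weight *= 2.0  # deeper, more specific concepts get more weight
    return mixed.most_common(1)[0][0] if mixed else None
```

In this sketch, training routes each (context, masked word) instance into the tree and updates counts along the way; prediction mixes distributions from the whole path, so general and specific concepts both contribute.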
Related papers
- Spatio-Temporal Side Tuning Pre-trained Foundation Models for Video-based Pedestrian Attribute Recognition [58.79807861739438]
Existing pedestrian attribute recognition (PAR) algorithms are mainly developed based on static images.
We propose to understand human attributes using video frames that can fully use temporal information.
arXiv Detail & Related papers (2024-04-27T14:43:32Z)
- Exploring Category Structure with Contextual Language Models and Lexical Semantic Networks [0.0]
We test a wider array of methods for probing CLMs for predicting typicality scores.
Our experiments, using BERT, show the importance of using the right type of CLM probes.
Results highlight the importance of polysemy in this task.
arXiv Detail & Related papers (2023-02-14T09:57:23Z)
- Efficient Induction of Language Models Via Probabilistic Concept Formation [13.632454840363916]
We present a novel approach to the acquisition of language models from corpora.
The framework builds on Cobweb, an early system for constructing taxonomic hierarchies of probabilistic concepts.
We explore three new extensions to Cobweb -- the Word, Leaf, and Path variants.
arXiv Detail & Related papers (2022-12-22T18:16:58Z)
- Progressive Tree-Structured Prototype Network for End-to-End Image Captioning [74.8547752611337]
We propose a novel Progressive Tree-Structured prototype Network (dubbed PTSN).
PTSN is the first attempt to narrow down the scope of prediction words with appropriate semantics by modeling the hierarchical textual semantics.
Our method achieves a new state-of-the-art performance with 144.2% (single model) and 146.5% (ensemble of 4 models) CIDEr scores on the Karpathy split and 141.4% (c5) and 143.9% (c40) CIDEr scores on the official online test server.
arXiv Detail & Related papers (2022-11-17T11:04:00Z)
- DetCLIP: Dictionary-Enriched Visual-Concept Paralleled Pre-training for Open-world Detection [118.36746273425354]
This paper presents a paralleled visual-concept pre-training method for open-world detection by resorting to knowledge enrichment from a designed concept dictionary.
By enriching the concepts with their descriptions, we explicitly build the relationships among various concepts to facilitate the open-domain learning.
The proposed framework demonstrates strong zero-shot detection performances, e.g., on the LVIS dataset, our DetCLIP-T outperforms GLIP-T by 9.9% mAP and obtains a 13.5% improvement on rare categories.
arXiv Detail & Related papers (2022-09-20T02:01:01Z)
- Better Language Model with Hypernym Class Prediction [101.8517004687825]
Class-based language models (LMs) have long been devised to address context sparsity in n-gram LMs.
In this study, we revisit this approach in the context of neural LMs.
arXiv Detail & Related papers (2022-03-21T01:16:44Z)
- Meta-Learning with Variational Semantic Memory for Word Sense Disambiguation [56.830395467247016]
We propose a model of semantic memory for WSD in a meta-learning setting.
Our model is based on hierarchical variational inference and incorporates an adaptive memory update rule via a hypernetwork.
We show that our model advances the state of the art in few-shot WSD and supports effective learning in extremely data-scarce scenarios.
arXiv Detail & Related papers (2021-06-05T20:40:01Z)
- SpanNer: Named Entity Re-/Recognition as Span Prediction [62.66148736099347]
A span prediction model is used for named entity recognition.
We experimentally implement 154 systems on 11 datasets, covering three languages.
Our model has been deployed into the ExplainaBoard platform.
arXiv Detail & Related papers (2021-06-01T17:11:42Z)
- Predictive Representation Learning for Language Modeling [33.08232449211759]
Correlates of secondary information appear in LSTM representations even though they are not part of an explicitly supervised prediction task.
We propose Predictive Representation Learning (PRL), which explicitly constrains LSTMs to encode specific predictions.
arXiv Detail & Related papers (2021-05-29T05:03:47Z)
- Attention Word Embedding [23.997145283950346]
We introduce the Attention Word Embedding (AWE) model, which integrates the attention mechanism into the CBOW model.
We also propose AWE-S, which incorporates subword information.
We demonstrate that AWE and AWE-S outperform state-of-the-art word embedding models on a variety of word similarity datasets.
arXiv Detail & Related papers (2020-06-01T14:47:48Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.