Could the Road to Grounded, Neuro-symbolic AI be Paved with Words-as-Classifiers?
- URL: http://arxiv.org/abs/2507.06335v1
- Date: Tue, 08 Jul 2025 18:44:34 GMT
- Title: Could the Road to Grounded, Neuro-symbolic AI be Paved with Words-as-Classifiers?
- Authors: Casey Kennington, David Schlangen
- Abstract summary: We make the case that one potential path forward in unifying all three semantic fields is paved with the words-as-classifiers model. We review that literature, motivate the words-as-classifiers model with an appeal to recent work in cognitive science, and describe a small experiment.
- Score: 16.12935949253909
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Formal, Distributional, and Grounded theories of computational semantics each have their uses and their drawbacks. There has been a shift toward grounding models of language by adding visual knowledge, and a call to enrich models of language with symbolic methods so as to combine the benefits of formal, distributional, and grounded theories. In this paper, we make the case that one potential path forward in unifying all three semantic fields is paved with the words-as-classifiers model, a model of word-level grounded semantics that has been incorporated into formalisms and distributional language models in the literature and has been well-tested in interactive dialogue settings. We review that literature, motivate the words-as-classifiers model with an appeal to recent work in cognitive science, and describe a small experiment. Finally, we sketch a model of semantics unified through words-as-classifiers.
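To make the central idea concrete: in the words-as-classifiers (WAC) model, each word is paired with its own classifier over perceptual features of candidate referents, and reference resolution composes the per-word classifier outputs. Below is a minimal, illustrative Python sketch of that shape; the logistic-regression choice, the mean-composition rule, and all names are assumptions for exposition, not the authors' implementation.

```python
# Minimal sketch of the words-as-classifiers (WAC) idea: one binary
# classifier per word, applied to perceptual features of candidate referents.
import numpy as np
from sklearn.linear_model import LogisticRegression

class WACModel:
    def __init__(self):
        self.classifiers = {}  # word -> fitted per-word classifier

    def train_word(self, word, pos_feats, neg_feats):
        """Fit one binary classifier for `word` from feature vectors of
        objects the word does (pos) and does not (neg) apply to."""
        X = np.vstack([pos_feats, neg_feats])
        y = np.concatenate([np.ones(len(pos_feats)), np.zeros(len(neg_feats))])
        self.classifiers[word] = LogisticRegression(max_iter=1000).fit(X, y)

    def word_fit(self, word, feats):
        """P(word applies | object features) for one candidate object."""
        return self.classifiers[word].predict_proba(feats.reshape(1, -1))[0, 1]

    def resolve(self, utterance_words, candidate_feats):
        """Pick the referent whose features best fit the utterance's words
        (mean per-word probability; composition rules vary in the literature)."""
        known = [w for w in utterance_words if w in self.classifiers]
        scores = [np.mean([self.word_fit(w, f) for w in known])
                  for f in candidate_feats]
        return int(np.argmax(scores))

# Toy usage: "red" trained on two color-like features, then resolving
# "red" between two candidate objects (expected output: 1).
wac = WACModel()
wac.train_word("red", pos_feats=np.array([[1.0, 0.1], [0.9, 0.2]]),
               neg_feats=np.array([[0.1, 0.9], [0.2, 0.8]]))
print(wac.resolve(["red"], [np.array([0.15, 0.85]), np.array([0.95, 0.1])]))
```

In the WAC literature, per-word classifiers are typically trained from interaction data pairing referring expressions with the (visual) features of their referents; the mean-probability composition above is just one of several rules that have been explored.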
Related papers
- A Distributional Perspective on Word Learning in Neural Language Models [57.41607944290822]
There are no widely agreed-upon metrics for word learning in language models. We argue that distributional signatures studied in prior work fail to capture key distributional information. We obtain learning trajectories for a selection of small language models we train from scratch.
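One common way to obtain such learning trajectories, offered here only as an illustrative sketch (the checkpoint names and probe sentence are placeholders, not the paper's setup), is to track a word's surprisal in a probe context across training checkpoints:

```python
# Hedged sketch: trace a word's "learning trajectory" as its surprisal in a
# probe sentence, measured at successive training checkpoints.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

def word_surprisal(model, tokenizer, prefix, word):
    """-log P(word | prefix), summed over the word's subword tokens.
    The leading space is a GPT-2-style tokenizer heuristic."""
    prefix_ids = tokenizer(prefix, return_tensors="pt").input_ids
    word_ids = tokenizer(" " + word, add_special_tokens=False).input_ids
    ids = torch.cat([prefix_ids, torch.tensor([word_ids])], dim=1)
    with torch.no_grad():
        logp = model(ids).logits.log_softmax(-1)
    start = prefix_ids.shape[1]
    # the logit at position p predicts the token at position p + 1
    return -sum(logp[0, start - 1 + i, t].item()
                for i, t in enumerate(word_ids))

checkpoints = ["my-lm-step1000", "my-lm-step10000"]  # hypothetical names
for ckpt in checkpoints:
    tok = AutoTokenizer.from_pretrained(ckpt)
    lm = AutoModelForCausalLM.from_pretrained(ckpt)
    print(ckpt, word_surprisal(lm, tok, "The cat sat on the", "mat"))
```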
arXiv Detail & Related papers (2025-02-09T13:15:59Z)
- A Grounded Typology of Word Classes [7.201565960962933]
Inspired by information theory, we define "groundedness", an empirical measure of semantic contentfulness. Our measure captures the contentfulness asymmetry between functional (grammatical) and lexical (content) classes across languages. We release a dataset of groundedness scores for 30 languages.
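The summary does not spell the measure out; purely as a hypothetical illustration of what an information-theoretic "groundedness" could look like (an assumption, not the paper's definition), one might normalize the mutual information between words and their grounded contexts:

```latex
% Hypothetical formalization, NOT necessarily the paper's definition:
% groundedness of a word class C as the mutual information between its
% word tokens W_C and their grounded (e.g., visual) contexts V,
% normalized by the entropy of the words themselves.
\[
  \mathrm{ground}(C) \;=\; \frac{I(W_C;\, V)}{H(W_C)}
\]
```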
arXiv Detail & Related papers (2024-12-13T18:58:48Z)
- Collapsed Language Models Promote Fairness [88.48232731113306]
We find that debiased language models exhibit collapsed alignment between token representations and word embeddings. We design a principled fine-tuning method that can effectively improve fairness in a wide range of debiasing methods.
arXiv Detail & Related papers (2024-10-06T13:09:48Z)
- Pixel Sentence Representation Learning [67.4775296225521]
In this work, we conceptualize the learning of sentence-level textual semantics as a visual representation learning process.
We employ visually-grounded text perturbation methods, such as typos and word-order shuffling, that resonate with human cognitive patterns and enable perturbations to be perceived as continuous.
Our approach is further bolstered by large-scale unsupervised topical alignment training and natural language inference supervision.
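A hedged sketch of the kind of perturbations the abstract names (typos and word-order shuffling); the rates and helper names are illustrative, not the paper's exact procedure:

```python
# Illustrative text perturbations: adjacent-character swaps ("typos")
# and random word-order shuffling.
import random

def inject_typos(text, rate=0.05, rng=random):
    """Swap adjacent alphabetic characters at a small rate to simulate typos."""
    chars = list(text)
    for i in range(len(chars) - 1):
        if chars[i].isalpha() and chars[i + 1].isalpha() and rng.random() < rate:
            chars[i], chars[i + 1] = chars[i + 1], chars[i]
    return "".join(chars)

def shuffle_words(text, rng=random):
    """Return the sentence with its word order randomly permuted."""
    words = text.split()
    rng.shuffle(words)
    return " ".join(words)

print(inject_typos("pixel sentence representation learning"))
print(shuffle_words("pixel sentence representation learning"))
```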
arXiv Detail & Related papers (2024-02-13T02:46:45Z)
- Meaning Representations from Trajectories in Autoregressive Models [106.63181745054571]
We propose to extract meaning representations from autoregressive language models by considering the distribution of all possible trajectories extending an input text.
This strategy is prompt-free, does not require fine-tuning, and is applicable to any pre-trained autoregressive model.
We empirically show that the representations obtained from large models align well with human annotations, outperform other zero-shot and prompt-free methods on semantic similarity tasks, and can be used to solve more complex entailment and containment tasks that standard embeddings cannot handle.
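As a rough illustration of the trajectory idea (the model choice, sample counts, and the symmetrized log-likelihood score below are simplifying assumptions, not the paper's exact method), one can sample continuations of each text and score them under the other:

```python
# Hedged sketch: represent a text by the continuations an autoregressive LM
# assigns it, and compare two texts by cross-scoring sampled continuations.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tok = AutoTokenizer.from_pretrained("gpt2")
lm = AutoModelForCausalLM.from_pretrained("gpt2")

def sample_trajectories(prompt, n=8, length=20):
    """Sample n continuations ("trajectories") of the prompt."""
    ids = tok(prompt, return_tensors="pt").input_ids
    out = lm.generate(ids, do_sample=True, max_new_tokens=length,
                      num_return_sequences=n, pad_token_id=tok.eos_token_id)
    return [o[ids.shape[1]:] for o in out]  # continuation tokens only

def log_prob(prompt, continuation):
    """log P(continuation | prompt) under the LM."""
    ids = tok(prompt, return_tensors="pt").input_ids
    full = torch.cat([ids, continuation.unsqueeze(0)], dim=1)
    with torch.no_grad():
        logp = lm(full).logits.log_softmax(-1)
    start = ids.shape[1]
    return sum(logp[0, start - 1 + i, t].item()
               for i, t in enumerate(continuation))

def similarity(a, b):
    """Symmetrized score: how well each text explains the other's continuations."""
    return 0.5 * (sum(log_prob(b, c) for c in sample_trajectories(a)) +
                  sum(log_prob(a, c) for c in sample_trajectories(b)))
```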
arXiv Detail & Related papers (2023-10-23T04:35:58Z)
- Topics in Contextualised Attention Embeddings [7.6650522284905565]
Recent work has demonstrated that clustering the word-level contextual representations from a language model yields word clusters resembling the latent topics discovered by Latent Dirichlet Allocation.
An open question is how such topical word clusters form under clustering in a language model that has not been explicitly designed to model latent topics.
Using BERT and DistilBERT, we find that the attention framework plays a key role in modelling such word topic clusters.
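A minimal sketch of the basic setup, using k-means over BERT's last-layer token representations on a toy corpus (the layer and clustering choices are illustrative defaults, not the paper's exact configuration):

```python
# Cluster word-level contextual representations from BERT and inspect
# the resulting clusters as emergent "topics".
import torch
from sklearn.cluster import KMeans
from transformers import AutoModel, AutoTokenizer

tok = AutoTokenizer.from_pretrained("bert-base-uncased")
bert = AutoModel.from_pretrained("bert-base-uncased")

sentences = ["the bank raised interest rates",
             "the river bank was muddy"]  # toy corpus

vecs, words = [], []
for s in sentences:
    enc = tok(s, return_tensors="pt")
    with torch.no_grad():
        hidden = bert(**enc).last_hidden_state[0]  # (tokens, 768)
    toks = tok.convert_ids_to_tokens(enc.input_ids[0])
    for t, h in zip(toks, hidden):
        if t not in ("[CLS]", "[SEP]"):
            words.append(t)
            vecs.append(h.numpy())

labels = KMeans(n_clusters=2, n_init=10).fit_predict(vecs)
for k in range(2):
    print(k, [w for w, l in zip(words, labels) if l == k])
```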
arXiv Detail & Related papers (2023-01-11T07:26:19Z)
- Entailment Semantics Can Be Extracted from an Ideal Language Model [32.5500309433108]
We prove that entailment judgments between sentences can be extracted from an ideal language model, assuming the training sentences are generated by Gricean agents, i.e., speakers who follow cooperative pragmatic principles.
We also show that entailment judgments can be decoded from the predictions of a language model trained on such Gricean data.
arXiv Detail & Related papers (2022-09-26T04:16:02Z)
- Language Acquisition is Embodied, Interactive, Emotive: a Research Proposal [2.639737913330821]
We review the literature on the role of embodiment and emotion in the interactive setting of spoken dialogue as necessary prerequisites for language learning in human children.
We sketch a model of semantics that leverages current transformer-based models and a word-level grounded model, then explain the robot-dialogue system that will make use of our semantic model.
arXiv Detail & Related papers (2021-05-10T19:40:17Z)
- Provable Limitations of Acquiring Meaning from Ungrounded Form: What will Future Language Models Understand? [87.20342701232869]
We investigate the abilities of ungrounded systems to acquire meaning.
We study whether assertions enable a system to emulate representations preserving semantic relations like equivalence.
We find that assertions enable semantic emulation if all expressions in the language are referentially transparent.
However, if the language uses non-transparent patterns like variable binding, we show that emulation can become an uncomputable problem.
arXiv Detail & Related papers (2021-04-22T01:00:17Z)
- Lexical semantic change for Ancient Greek and Latin [61.69697586178796]
Associating a word's correct meaning in its historical context is a central challenge in diachronic research.
We build on a recent computational approach to semantic change based on a dynamic Bayesian mixture model.
We provide a systematic comparison of dynamic Bayesian mixture models for semantic change with state-of-the-art embedding-based models.
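This is not the paper's dynamic Bayesian mixture model, but a much simpler sketch of the underlying idea: cluster a target word's contextual embeddings into senses and track per-period sense proportions, whose shifts suggest semantic change. All data below is synthetic.

```python
# Simplified stand-in for a semantic-change mixture model: shared sense
# clusters over a word's token embeddings, with per-period sense proportions.
import numpy as np
from sklearn.cluster import KMeans

def sense_trajectory(embeddings_by_period, n_senses=2):
    """embeddings_by_period: {period: (n_i, d) array of a word's token
    embeddings}. Fit shared sense clusters, return per-period proportions."""
    all_vecs = np.vstack(list(embeddings_by_period.values()))
    km = KMeans(n_clusters=n_senses, n_init=10).fit(all_vecs)
    trajectory = {}
    for period, vecs in embeddings_by_period.items():
        labels = km.predict(vecs)
        trajectory[period] = np.bincount(labels, minlength=n_senses) / len(labels)
    return trajectory  # shifts in these proportions suggest semantic change

rng = np.random.default_rng(0)
toy = {"300BC": rng.normal(0, 1, (50, 8)), "100AD": rng.normal(1, 1, (50, 8))}
print(sense_trajectory(toy))
```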
arXiv Detail & Related papers (2021-01-22T12:04:08Z)
- A Neural Generative Model for Joint Learning Topics and Topic-Specific Word Embeddings [42.87769996249732]
We propose a novel generative model to explore both local and global context for joint learning topics and topic-specific word embeddings.
The trained model maps words to topic-dependent embeddings, which naturally addresses the issue of word polysemy.
arXiv Detail & Related papers (2020-08-11T13:54:11Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences of its use.