Should Semantic Vector Composition be Explicit? Can it be Linear?
- URL: http://arxiv.org/abs/2104.06555v1
- Date: Tue, 13 Apr 2021 23:58:26 GMT
- Title: Should Semantic Vector Composition be Explicit? Can it be Linear?
- Authors: Dominic Widdows, Kristen Howell, Trevor Cohen
- Abstract summary: Vector representations have become a central element in semantic language modelling.
How should the concept `wet fish' be represented?
This paper surveys this question from two points of view.
- Score: 5.6031349532829955
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Vector representations have become a central element in semantic language
modelling, leading to mathematical overlaps with many fields including quantum
theory. Compositionality is a core goal for such representations: given
representations for `wet' and `fish', how should the concept `wet fish' be
represented?
This position paper surveys this question from two points of view. The first
considers the question of whether an explicit mathematical representation can
be successful using only tools from within linear algebra, or whether other
mathematical tools are needed. The second considers whether semantic vector
composition should be explicitly described mathematically, or whether it can be
a model-internal side-effect of training a neural network.
This paper is intended as a survey and motivation for discussion, and does
not claim to give definitive answers to the questions posed. We speculate that
these questions are related, and that the nonlinear operators used in
implicitly compositional language models may inform explicit compositional
modelling.
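The composition question can be made concrete with a toy sketch. The snippet below, using random stand-in vectors rather than trained embeddings (and not the paper's own proposal), contrasts three explicit operators for combining `wet' and `fish': linear addition, the elementwise (Hadamard) product, and circular convolution, the binding operator of Holographic Reduced Representations, all of which stay within linear-algebraic machinery:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-ins for learned word embeddings (not real trained vectors).
wet = rng.normal(size=8)
fish = rng.normal(size=8)

# Linear composition: vector addition, the simplest explicit operator.
added = wet + fish

# Bilinear composition: elementwise (Hadamard) product.
hadamard = wet * fish

# Circular convolution, the binding operator of Holographic Reduced
# Representations; computed efficiently via the FFT.
convolved = np.real(np.fft.ifft(np.fft.fft(wet) * np.fft.fft(fish)))

print(added.shape, hadamard.shape, convolved.shape)
```

All three operators keep the composite in the same space as its parts, which is what makes an explicit, inspectable account of `wet fish' possible at all; the paper's question is whether such operators suffice.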
Related papers
- A Complexity-Based Theory of Compositionality [53.025566128892066]
In AI, compositional representations can enable a powerful form of out-of-distribution generalization.
Here, we propose a formal definition of compositionality that accounts for and extends our intuitions about compositionality.
The definition is conceptually simple, quantitative, grounded in algorithmic information theory, and applicable to any representation.
arXiv Detail & Related papers (2024-10-18T18:37:27Z)
- Inference via Interpolation: Contrastive Representations Provably Enable Planning and Inference [110.47649327040392]
Given time series data, how can we answer questions like "what will happen in the future?" and "how did we get here?"
We show how these questions can have compact, closed form solutions in terms of learned representations.
arXiv Detail & Related papers (2024-03-06T22:27:30Z)
- The Linear Representation Hypothesis and the Geometry of Large Language Models [12.387530469788738]
Informally, the 'linear representation hypothesis' is the idea that high-level concepts are represented linearly as directions in some representation space.
This paper addresses two closely related questions: What does "linear representation" actually mean?
We show how to unify all notions of linear representation using counterfactual pairs.
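The counterfactual-pair idea can be illustrated with a toy sketch (hand-made 2-D vectors standing in for real learned embeddings, not the paper's construction): a concept direction is estimated by averaging difference vectors over pairs that differ only in that concept, and projecting onto it separates the two groups.

```python
import numpy as np

# Hand-made toy "embeddings" chosen to illustrate the idea; a real model
# would supply high-dimensional learned vectors.
emb = {
    "king":  np.array([1.0, 1.0]),
    "queen": np.array([1.0, 2.0]),
    "man":   np.array([0.0, 1.0]),
    "woman": np.array([0.0, 2.0]),
}

# Counterfactual pairs that differ only in the target concept
# (here, grammatical gender).
pairs = [("king", "queen"), ("man", "woman")]
diffs = np.stack([emb[b] - emb[a] for a, b in pairs])
direction = diffs.mean(axis=0)
direction /= np.linalg.norm(direction)

# Projections onto the direction separate the two groups.
for word in emb:
    print(word, float(emb[word] @ direction))
```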
arXiv Detail & Related papers (2023-11-07T01:59:11Z)
- Meaning Representations from Trajectories in Autoregressive Models [106.63181745054571]
We propose to extract meaning representations from autoregressive language models by considering the distribution of all possible trajectories extending an input text.
This strategy is prompt-free, does not require fine-tuning, and is applicable to any pre-trained autoregressive model.
We empirically show that the representations obtained from large models align well with human annotations, outperform other zero-shot and prompt-free methods on semantic similarity tasks, and can be used to solve more complex entailment and containment tasks that standard embeddings cannot handle.
arXiv Detail & Related papers (2023-10-23T04:35:58Z)
- A Categorical Framework of General Intelligence [12.134564449202708]
Since Alan Turing asked whether machines can think in 1950, no one has been able to give a direct answer.
We introduce a categorical framework towards this goal, with two main results.
arXiv Detail & Related papers (2023-03-08T13:37:01Z)
- On the Complexity of Representation Learning in Contextual Linear Bandits [110.84649234726442]
We show that representation learning is fundamentally more complex than linear bandits.
In particular, learning with a given set of representations is never simpler than learning with the worst realizable representation in the set.
arXiv Detail & Related papers (2022-12-19T13:08:58Z)
- The Many-Worlds Calculus [0.0]
We propose a colored PROP to model computation in this framework.
The model can support regular tests, probabilistic and non-deterministic branching, as well as quantum branching.
We prove the language to be universal, and the equational theory to be complete with respect to this semantics.
arXiv Detail & Related papers (2022-06-21T10:10:26Z)
- Fair Interpretable Learning via Correction Vectors [68.29997072804537]
We propose a new framework for fair representation learning centered around the learning of "correction vectors".
The corrections are then simply added to the original features, and can therefore be analyzed as an explicit penalty or bonus on each feature.
We show experimentally that a fair representation learning problem constrained in such a way does not impact performance.
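The additive structure that makes the corrections interpretable can be sketched as follows. This is a minimal toy, assuming a correction model that is just a fixed linear map; the actual framework learns the correction network under fairness constraints.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy feature matrix: 4 samples, 3 features.
X = rng.normal(size=(4, 3))

# Hypothetical correction model: a small fixed linear map standing in for
# the network that would be trained for fairness.
W = rng.normal(size=(3, 3)) * 0.1

def correction(x):
    # One correction vector per sample, same dimensionality as the features.
    return x @ W

C = correction(X)
Z = X + C  # corrected representation: original features plus corrections

# Because the correction is added feature-wise, each entry of C can be
# read directly as a penalty (negative) or bonus (positive) on a feature.
print(Z - X)  # equals C by construction
```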
arXiv Detail & Related papers (2022-01-17T10:59:33Z)
- A Mathematical Exploration of Why Language Models Help Solve Downstream Tasks [35.046596668631615]
Autoregressive language models, pretrained using large text corpora to do well on next word prediction, have been successful at solving many downstream tasks.
This paper initiates a mathematical study of this phenomenon for the downstream task of text classification.
arXiv Detail & Related papers (2020-10-07T20:56:40Z)
- Interpreting Graph Neural Networks for NLP With Differentiable Edge Masking [63.49779304362376]
Graph neural networks (GNNs) have become a popular approach to integrating structural inductive biases into NLP models.
We introduce a post-hoc method for interpreting the predictions of GNNs which identifies unnecessary edges.
We show that we can drop a large proportion of edges without deteriorating the performance of the model.
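The idea of identifying unnecessary edges post hoc can be illustrated with a toy sketch. The "GNN" here is a single untrained message-passing step and the edges are masked by brute force (the actual method learns a differentiable mask); the point is only that some edges can be dropped without changing a given prediction.

```python
import numpy as np

# Toy undirected path graph on 4 nodes; a 1 in A marks an edge.
A = np.array([[0, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)
X = np.array([[1.0], [2.0], [3.0], [4.0]])  # one feature per node

def node0_pred(adj):
    # Prediction for node 0 after one round of message passing
    # (a stand-in for a trained GNN layer).
    return float((adj @ X)[0, 0])

base = node0_pred(A)

# Post-hoc edge masking: drop each edge and measure the prediction change.
# Edges outside node 0's one-hop neighbourhood leave it untouched.
for i, j in [(0, 1), (1, 2), (2, 3)]:
    masked = A.copy()
    masked[i, j] = masked[j, i] = 0.0
    print((i, j), abs(node0_pred(masked) - base))
```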
arXiv Detail & Related papers (2020-10-01T17:51:19Z)
- Context-theoretic Semantics for Natural Language: an Algebraic Framework [0.0]
We present a framework for natural language semantics in which words, phrases and sentences are all represented as vectors.
We show that the vector representations of words can be considered as elements of an algebra over a field.
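What "elements of an algebra over a field" buys can be checked numerically on a toy example. The elementwise product below is merely one illustrative choice of bilinear product (not the framework's specific construction); the algebra axioms, distributivity over addition and associativity, are what a compositional semantics can then rely on.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy word vectors; with the elementwise product, R^n becomes a
# commutative algebra over the reals.
wet, fish, salmon = (rng.normal(size=5) for _ in range(3))

def mul(a, b):
    # An illustrative bilinear, associative product on vectors.
    return a * b

# Distributivity: wet * (fish + salmon) == wet * fish + wet * salmon
lhs = mul(wet, fish + salmon)
rhs = mul(wet, fish) + mul(wet, salmon)
print(np.allclose(lhs, rhs))  # True

# Associativity: (wet * fish) * salmon == wet * (fish * salmon)
print(np.allclose(mul(mul(wet, fish), salmon), mul(wet, mul(fish, salmon))))
```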
arXiv Detail & Related papers (2020-09-22T13:31:37Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.