Language Modeling with Reduced Densities
- URL: http://arxiv.org/abs/2007.03834v4
- Date: Sat, 27 Nov 2021 15:41:31 GMT
- Title: Language Modeling with Reduced Densities
- Authors: Tai-Danae Bradley and Yiannis Vlassopoulos
- Abstract summary: We show that sequences of symbols from a finite alphabet, such as those found in a corpus of text, form a category enriched over probabilities.
We then address a second fundamental question: How can this information be stored and modeled in a way that preserves the categorical structure?
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: This work originates from the observation that today's state-of-the-art
statistical language models are impressive not only for their performance, but
also - and quite crucially - because they are built entirely from correlations
in unstructured text data. The latter observation prompts a fundamental
question that lies at the heart of this paper: What mathematical structure
exists in unstructured text data? We put forth enriched category theory as a
natural answer. We show that sequences of symbols from a finite alphabet, such
as those found in a corpus of text, form a category enriched over
probabilities. We then address a second fundamental question: How can this
information be stored and modeled in a way that preserves the categorical
structure? We answer this by constructing a functor from our enriched category
of text to a particular enriched category of reduced density operators. The
latter leverages the Loewner order on positive semidefinite operators, which
can further be interpreted as a toy example of entailment.
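To make the two constructions above concrete: roughly speaking, the enriched category has strings as objects, and the hom-object from a string s to a string t is the conditional probability of t given s when t extends s (and 0 otherwise); the paper then assigns to each string a reduced density operator (roughly, a partial trace of a rank-one density built from square roots of corpus probabilities) and compares such operators in the Loewner order. The sketch below is a minimal, purely classical toy under stated assumptions, not the paper's exact construction: it estimates continuation probabilities for prefixes of a tiny hand-made corpus, packages them as diagonal positive semidefinite matrices, and checks a Loewner comparison. The corpus, alphabet, and function names are illustrative assumptions, and the diagonal matrices only convey the flavour of the paper's reduced densities.
```python
import itertools
import numpy as np

# Toy corpus: a probability distribution over short strings
# (an illustrative assumption; the paper works with ordinary text corpora).
corpus = {"ab": 0.4, "ac": 0.3, "abc": 0.2, "b": 0.1}
alphabet = sorted(set(itertools.chain.from_iterable(corpus)))

def prefix_prob(s):
    """Probability that a sampled string begins with the prefix s."""
    return sum(p for x, p in corpus.items() if x.startswith(s))

def continuation_operator(s):
    """Diagonal PSD matrix for prefix s: the entry for symbol c is the
    conditional probability that s is continued by c.  A purely classical
    simplification of the paper's reduced density operators."""
    ps = prefix_prob(s)
    diag = [prefix_prob(s + c) / ps if ps > 0 else 0.0 for c in alphabet]
    return np.diag(diag)

def loewner_leq(a, b, tol=1e-12):
    """Loewner order: a <= b iff b - a is positive semidefinite."""
    return bool(np.all(np.linalg.eigvalsh(b - a) >= -tol))

rho_a = continuation_operator("a")    # continuations of "a": 'b' and 'c'
rho_ab = continuation_operator("ab")  # continuations of "ab": only 'c'
print(loewner_leq(rho_ab, rho_a))     # True in this toy corpus
print(loewner_leq(rho_a, rho_ab))     # False: 'b' can follow "a" but not "ab"
```
In this particular toy corpus the operator for the longer prefix "ab" lies below the one for "a" in the Loewner order, matching the intuition that whatever "ab" can lead to, "a" can lead to as well; this is the flavour of entailment the abstract mentions, though the paper's actual operators and ordering are richer.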
Related papers
- Analyzing Text Representations by Measuring Task Alignment [2.198430261120653]
We develop a task alignment score based on hierarchical clustering that measures alignment at different levels of granularity.
Our experiments on text classification validate our hypothesis by showing that task alignment can explain the classification performance of a given representation.
arXiv Detail & Related papers (2023-05-31T11:20:48Z) - How Do Transformers Learn Topic Structure: Towards a Mechanistic
Understanding [56.222097640468306]
We provide a mechanistic understanding of how transformers learn "semantic structure".
We show, through a combination of mathematical analysis and experiments on Wikipedia data, that the embedding layer and the self-attention layer encode the topical structure.
arXiv Detail & Related papers (2023-03-07T21:42:17Z) - A Multi-Grained Self-Interpretable Symbolic-Neural Model For
Single/Multi-Labeled Text Classification [29.075766631810595]
We propose a Symbolic-Neural model that can learn to explicitly predict class labels of text spans from a constituency tree.
As the structured language model learns to predict constituency trees in a self-supervised manner, only raw texts and sentence-level labels are required as training data.
Our experiments demonstrate that our approach achieves good prediction accuracy in downstream tasks.
arXiv Detail & Related papers (2023-03-06T03:25:43Z) - Linear Spaces of Meanings: Compositional Structures in Vision-Language
Models [110.00434385712786]
We investigate compositional structures in data embeddings from pre-trained vision-language models (VLMs).
We first present a framework for understanding compositional structures from a geometric perspective.
We then explain what these structures entail probabilistically in the case of VLM embeddings, providing intuitions for why they arise in practice.
arXiv Detail & Related papers (2023-02-28T08:11:56Z) - What Are You Token About? Dense Retrieval as Distributions Over the
Vocabulary [68.77983831618685]
We propose to interpret the vector representations produced by dual encoders by projecting them into the model's vocabulary space.
We show that the resulting projections contain rich semantic information, and we draw a connection between them and sparse retrieval; see the sketch after this list.
arXiv Detail & Related papers (2022-12-20T16:03:25Z) - Geometry-Aware Supertagging with Heterogeneous Dynamic Convolutions [0.7868449549351486]
We revisit constructive supertagging from a graph-theoretic perspective.
We propose a framework based on heterogeneous dynamic graph convolutions.
We test our approach on a number of categorial grammar datasets spanning different languages.
arXiv Detail & Related papers (2022-03-23T07:07:11Z) - An enriched category theory of language: from syntax to semantics [0.0]
We model probability distributions on texts as a category enriched over the unit interval.
We then pass to the enriched category of unit interval-valued copresheaves on this syntactical category to find semantic information.
arXiv Detail & Related papers (2021-06-15T05:40:51Z) - Sentiment analysis in tweets: an assessment study from classical to
modern text representation models [59.107260266206445]
Short texts published on Twitter have earned significant attention as a rich source of information.
Their inherent characteristics, such as an informal and noisy linguistic style, remain challenging for many natural language processing (NLP) tasks.
This study assesses existing language models in distinguishing the sentiment expressed in tweets, using a rich collection of 22 datasets.
arXiv Detail & Related papers (2021-05-29T21:05:28Z) - A Comparative Study on Structural and Semantic Properties of Sentence
Embeddings [77.34726150561087]
We propose a set of experiments using a widely-used large-scale data set for relation extraction.
We show that different embedding spaces have different degrees of strength for the structural and semantic properties.
These results provide useful information for developing embedding-based relation extraction methods.
arXiv Detail & Related papers (2020-09-23T15:45:32Z) - Context-theoretic Semantics for Natural Language: an Algebraic Framework [0.0]
We present a framework for natural language semantics in which words, phrases and sentences are all represented as vectors.
We show that the vector representations of words can be considered as elements of an algebra over a field.
arXiv Detail & Related papers (2020-09-22T13:31:37Z) - Don't Judge an Object by Its Context: Learning to Overcome Contextual
Bias [113.44471186752018]
Existing models often leverage co-occurrences between objects and their context to improve recognition accuracy.
This work focuses on addressing such contextual biases to improve the robustness of the learnt feature representations.
arXiv Detail & Related papers (2020-01-09T18:31:55Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of this information and is not responsible for any consequences of its use.