Principal Components of the Meaning
- URL: http://arxiv.org/abs/2009.08859v1
- Date: Fri, 18 Sep 2020 14:28:32 GMT
- Title: Principal Components of the Meaning
- Authors: Neslihan Suzen, Alexander Gorban, Jeremy Levesley, and Evgeny Mirkes
- Abstract summary: We argue that (lexical) meaning in science can be represented in a 13 dimension Meaning Space.
This space is constructed using principal component analysis (singular decomposition) on the matrix of word category relative information gains.
- Score: 58.720142291102135
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: In this paper we argue that (lexical) meaning in science can be represented
in a 13 dimension Meaning Space. This space is constructed using principal
component analysis (singular decomposition) on the matrix of word category
relative information gains, where the categories are those used by the Web of
Science, and the words are taken from a reduced word set from texts in the Web
of Science. We show that this reduced word set plausibly represents all texts
in the corpus, so that the principal component analysis has some objective
meaning with respect to the corpus. We argue that 13 dimensions is adequate to
describe the meaning of scientific texts, and hypothesise about the qualitative
meaning of the principal components.
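The construction described in the abstract, principal component analysis via singular value decomposition of a word-by-category matrix, can be sketched as follows. This is only an illustration under stated assumptions: the matrix here is random stand-in data rather than real relative information gains, the function name `pca_svd` and the 500-word vocabulary size are hypothetical, and the 252 columns simply mirror the number of categories mentioned in the related-papers list. It is not the authors' pipeline.

```python
import numpy as np

def pca_svd(X, k):
    """Project the rows of X onto its first k principal components using SVD."""
    Xc = X - X.mean(axis=0)                      # centre each column (category)
    U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    components = Vt[:k]                          # shape (k, n_categories)
    scores = Xc @ components.T                   # shape (n_words, k)
    explained = s**2 / np.sum(s**2)              # variance fraction per component
    return scores, components, explained[:k]

# Hypothetical stand-in for a word-by-category RIG matrix (synthetic data).
rng = np.random.default_rng(0)
rig = rng.random((500, 252))

scores, components, explained = pca_svd(rig, k=13)
print(scores.shape)        # one 13-dimensional vector per word
print(explained.sum())     # fraction of total variance kept by 13 components
```

Each row of `scores` is then a word's coordinate vector in the reduced space. How many components to keep is a separate modelling decision; the value 13 is taken from the abstract, not derived by this sketch.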
Related papers
- Measuring Meaning Composition in the Human Brain with Composition Scores from Large Language Models [53.840982361119565]
The Composition Score is a novel model-based metric designed to quantify the degree of meaning composition during sentence comprehension.
Experimental findings show that this metric correlates with brain clusters associated with word frequency, structural processing, and general sensitivity to words.
arXiv Detail & Related papers (2024-03-07T08:44:42Z)
- Neighboring Words Affect Human Interpretation of Saliency Explanations [65.29015910991261]
Word-level saliency explanations are often used to communicate feature-attribution in text-based models.
Recent studies found that superficial factors such as word length can distort human interpretation of the communicated saliency scores.
We investigate how the marking of a word's neighboring words affects the explainee's perception of the word's importance in the context of a saliency explanation.
arXiv Detail & Related papers (2023-05-04T09:50:25Z)
- An Informational Space Based Semantic Analysis for Scientific Texts [62.997667081978825]
This paper introduces computational methods for semantic analysis and for quantifying the meaning of short scientific texts.
The representation of scientific-specific meaning is standardised by replacing the situation representations, rather than psychological properties.
The research in this paper lays the foundation for the geometric representation of the meaning of texts.
arXiv Detail & Related papers (2022-05-31T11:19:32Z)
- Semantic Analysis for Automated Evaluation of the Potential Impact of Research Articles [62.997667081978825]
This paper presents a novel method for vector representation of text meaning based on information theory.
We show how this informational semantics is used for text classification on the basis of the Leicester Scientific Corpus.
We show that an informational approach to representing the meaning of a text has offered a way to effectively predict the scientific impact of research papers.
arXiv Detail & Related papers (2021-04-26T20:37:13Z)
- Context-theoretic Semantics for Natural Language: an Algebraic Framework [0.0]
We present a framework for natural language semantics in which words, phrases and sentences are all represented as vectors.
We show that the vector representations of words can be considered as elements of an algebra over a field.
arXiv Detail & Related papers (2020-09-22T13:31:37Z)
- Informational Space of Meaning for Scientific Texts [68.8204255655161]
We introduce the Meaning Space, in which the meaning of a word is represented by a vector of Relative Information Gain (RIG) about the subject categories that the text belongs to.
This new approach is applied to construct the Meaning Space based on the Leicester Scientific Corpus (LSC) and the Leicester Scientific Dictionary-Core (LScDC).
The most informative words are presented for 252 categories. The proposed RIG-based model is shown to be able to single out topic-specific words in categories.
arXiv Detail & Related papers (2020-04-28T14:26:12Z)
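As a rough illustration of the Relative Information Gain (RIG) mentioned in the last entry, the sketch below computes it for one word and one category. The summaries here do not give the exact definition, so this assumes a common form: the information gain about a binary "text is in the category" indicator from a binary "text contains the word" indicator, divided by the entropy of the category indicator. The function names and the toy data are hypothetical.

```python
import numpy as np

def bernoulli_entropy(p):
    """Shannon entropy in bits of a binary variable with P(1) = p."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return float(-(p * np.log2(p) + (1 - p) * np.log2(1 - p)))

def relative_information_gain(in_category, has_word):
    """Assumed RIG: (H(C) - H(C | W)) / H(C) over a collection of texts."""
    in_category = np.asarray(in_category, dtype=bool)
    has_word = np.asarray(has_word, dtype=bool)
    h_c = bernoulli_entropy(in_category.mean())
    if h_c == 0.0:
        return 0.0                      # category is trivial: nothing to gain
    h_cond = 0.0
    for value in (True, False):         # condition on word present / absent
        mask = has_word == value
        if mask.any():
            h_cond += mask.mean() * bernoulli_entropy(in_category[mask].mean())
    return (h_c - h_cond) / h_c

# Toy data: four texts, one category, two candidate words.
category = [1, 1, 0, 0]
predictive_word = [1, 1, 0, 0]      # appears exactly in the category's texts
uninformative_word = [1, 0, 1, 0]   # appears equally inside and outside

rig_predictive = relative_information_gain(category, predictive_word)
rig_uninformative = relative_information_gain(category, uninformative_word)
print(rig_predictive, rig_uninformative)
```

Under this assumed definition, a word's meaning vector would be the list of its RIG values across all categories, which is the kind of word-by-category matrix the main paper decomposes.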