An Informational Space Based Semantic Analysis for Scientific Texts
- URL: http://arxiv.org/abs/2205.15696v1
- Date: Tue, 31 May 2022 11:19:32 GMT
- Title: An Informational Space Based Semantic Analysis for Scientific Texts
- Authors: Neslihan Suzen, Alexander N. Gorban, Jeremy Levesley and Evgeny M. Mirkes
- Abstract summary: This paper introduces computational methods for semantic analysis and for quantifying the meaning of short scientific texts.
The representation of science-specific meaning is standardised by replacing the situation representations, rather than psychological properties, with vectors of attributes: the list of scientific subject categories that the text belongs to.
The research in this paper lays the foundation for a geometric representation of the meaning of texts.
- Score: 62.997667081978825
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: One major problem in Natural Language Processing is the automatic analysis
and representation of human language. Human language is ambiguous, and a deeper
understanding of semantics and creating human-to-machine interaction have
required an effort in creating the schemes for acts of communication and
building common-sense knowledge bases for the 'meaning' in texts. This paper
introduces computational methods for semantic analysis and for quantifying the
meaning of short scientific texts. Computational methods extracting semantic
features are used to analyse the relations between texts of messages and
'representations of situations' for a newly created large collection of
scientific texts, Leicester Scientific Corpus. The representation of
scientific-specific meaning is standardised by replacing the situation
representations, rather than psychological properties, with the vectors of some
attributes: a list of scientific subject categories that the text belongs to.
First, this paper introduces 'Meaning Space' in which the informational
representation of the meaning is extracted from the occurrence of the word in
texts across the scientific categories, i.e., the meaning of a word is
represented by a vector of Relative Information Gain about the subject
categories. Then, the meaning space is statistically analysed for Leicester
Scientific Dictionary-Core and we investigate 'Principal Components of the
Meaning' to describe the adequate dimensions of the meaning. The research in
this paper lays the foundation for the geometric representation of the meaning of
texts.
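The abstract's description of a word's meaning as a vector of Relative Information Gain (RIG) over subject categories can be illustrated with a minimal sketch. It assumes the common definition RIG(c, w) = (H(c) - H(c | w)) / H(c), where c is a binary "text belongs to category" indicator and w is a binary "word occurs in text" indicator; the paper's exact formulation may differ. The function names and the four-text toy corpus below are hypothetical and are not taken from the Leicester Scientific Corpus.

```python
import math
from typing import List, Set, Tuple

# A text is modelled as (set of words, set of subject categories).
Text = Tuple[Set[str], Set[str]]


def binary_entropy(p: float) -> float:
    """Shannon entropy (in bits) of a binary variable with P(1) = p."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -p * math.log2(p) - (1.0 - p) * math.log2(1.0 - p)


def rig(texts: List[Text], word: str, category: str) -> float:
    """Relative Information Gain about `category` from the presence of `word`.

    Assumed definition: (H(c) - H(c | w)) / H(c), with both c and w binary.
    Returns 0.0 when the category carries no uncertainty (H(c) = 0).
    """
    n = len(texts)
    in_cat = [category in cats for _, cats in texts]
    has_word = [word in words for words, _ in texts]

    h_c = binary_entropy(sum(in_cat) / n)
    if h_c == 0.0:
        return 0.0

    # Conditional entropy: average the category entropy over the two
    # groups of texts (word present, word absent), weighted by group size.
    h_c_given_w = 0.0
    for present in (True, False):
        group = [c for c, w in zip(in_cat, has_word) if w == present]
        if group:
            h_c_given_w += len(group) / n * binary_entropy(sum(group) / len(group))

    return (h_c - h_c_given_w) / h_c


def meaning_vector(texts: List[Text], word: str, categories: List[str]) -> List[float]:
    """Represent a word as its vector of RIG values, one per category."""
    return [rig(texts, word, c) for c in categories]


# Hypothetical toy corpus, for illustration only.
toy_texts: List[Text] = [
    ({"quantum", "energy"}, {"Physics"}),
    ({"quantum", "field"}, {"Physics"}),
    ({"protein", "energy"}, {"Biology"}),
    ({"protein", "cell"}, {"Biology"}),
]
toy_categories = ["Physics", "Biology"]

quantum_vec = meaning_vector(toy_texts, "quantum", toy_categories)
energy_vec = meaning_vector(toy_texts, "energy", toy_categories)
```

In this toy corpus, "quantum" occurs only in Physics texts, so knowing whether it is present removes all uncertainty about both categories, whereas "energy" is spread evenly across them and carries no information under this definition.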
Related papers
- Textual Entailment Recognition with Semantic Features from Empirical Text Representation [60.31047947815282]
A text entails a hypothesis if and only if the truth value of the hypothesis follows from the text.
In this paper, we propose a novel approach to identifying the textual entailment relationship between text and hypothesis.
We employ an element-wise Manhattan distance vector-based feature that can identify the semantic entailment relationship between the text-hypothesis pair.
arXiv Detail & Related papers (2022-10-18T10:03:51Z)
- Semantic Analysis for Automated Evaluation of the Potential Impact of Research Articles [62.997667081978825]
This paper presents a novel method for vector representation of text meaning based on information theory.
We show how this informational semantics is used for text classification on the basis of the Leicester Scientific Corpus.
We show that an informational approach to representing the meaning of a text has offered a way to effectively predict the scientific impact of research papers.
arXiv Detail & Related papers (2021-04-26T20:37:13Z)
- Semantic maps and metrics for science Semantic maps and metrics for science using deep transformer encoders [1.599072005190786]
Recent advances in natural language understanding driven by deep transformer networks offer new possibilities for mapping science.
Transformer embedding models capture shades of association and connotation that vary across different linguistic contexts.
We report a procedure for encoding scientific documents with these tools, measuring their improvement over static word embeddings.
arXiv Detail & Related papers (2021-04-13T04:12:20Z)
- Principal Components of the Meaning [58.720142291102135]
We argue that (lexical) meaning in science can be represented in a 13-dimensional Meaning Space.
This space is constructed using principal component analysis (singular value decomposition) on the matrix of word-category relative information gains.
arXiv Detail & Related papers (2020-09-18T14:28:32Z)
- A computational model implementing subjectivity with the 'Room Theory'. The case of detecting Emotion from Text [68.8204255655161]
This work introduces a new method to consider subjectivity and general context dependency in text analysis.
By using a similarity measure between words, we are able to extract the relative relevance of the elements in the benchmark.
This method could be applied to all the cases where evaluating subjectivity is relevant to understand the relative value or meaning of a text.
arXiv Detail & Related papers (2020-05-12T21:26:04Z)
- Informational Space of Meaning for Scientific Texts [68.8204255655161]
We introduce the Meaning Space, in which the meaning of a word is represented by a vector of Relative Information Gain (RIG) about the subject categories that the text belongs to.
This new approach is applied to construct the Meaning Space based on the Leicester Scientific Corpus (LSC) and the Leicester Scientific Dictionary-Core (LScDC).
The most informative words are presented for 252 categories. The proposed RIG-based model is shown to be able to make topic-specific words stand out within categories.
arXiv Detail & Related papers (2020-04-28T14:26:12Z)
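The "Principal Components of the Meaning" entry in the list above describes principal component analysis, via singular value decomposition, on the matrix of word-category relative information gains. A minimal sketch of that step follows, assuming rows are words and columns are categories, and assuming NumPy is available. The function names and the matrix values are hypothetical placeholders, not results from the papers, and the sketch does not reproduce how those papers choose the number of retained dimensions.

```python
import numpy as np


def principal_components(rig_matrix):
    """PCA of a (words x categories) matrix via SVD of the centred data.

    Returns (components, explained_variance_ratio): components has one
    principal direction per row, ordered by decreasing explained variance.
    """
    x = np.asarray(rig_matrix, dtype=float)
    centred = x - x.mean(axis=0)  # centre each category column
    _, singular_values, vt = np.linalg.svd(centred, full_matrices=False)
    variances = singular_values ** 2
    total = float(variances.sum())
    ratio = variances / total if total > 0 else np.zeros_like(variances)
    return vt, ratio


def project(rig_matrix, components, k):
    """Coordinates of each word on the first k principal components."""
    x = np.asarray(rig_matrix, dtype=float)
    return (x - x.mean(axis=0)) @ components[:k].T


# Hypothetical RIG values for 4 words over 3 categories (illustration only).
toy_rig = [
    [0.9, 0.1, 0.0],
    [0.8, 0.2, 0.0],
    [0.1, 0.9, 0.0],
    [0.2, 0.8, 0.0],
]

components, explained = principal_components(toy_rig)
coords = project(toy_rig, components, k=1)
```

Here the toy words vary along a single direction (more of the first category means less of the second), so the first component accounts for essentially all of the variance. On a real word-category matrix the explained-variance ratios would be inspected to decide how many components to keep.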
This list is automatically generated from the titles and abstracts of the papers in this site.