Accessibility and Trajectory-Based Text Characterization
- URL: http://arxiv.org/abs/2201.06665v1
- Date: Mon, 17 Jan 2022 23:33:11 GMT
- Title: Accessibility and Trajectory-Based Text Characterization
- Authors: Bárbara C. e Souza and Filipi N. Silva and Henrique F. de Arruda and
Luciano da F. Costa and Diego R. Amancio
- Abstract summary: In particular, texts are characterized by a hierarchical structure that can be approached by using multi-scale concepts and methods.
We adopt an extension to the mesoscopic approach to represent text narratives, in which only the recurrent relationships among tagged parts of speech are considered.
The characterization of the texts was then achieved by considering scale-dependent complementary methods: accessibility, symmetry and recurrence signatures.
- Score: 0.6912244027050454
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Several complex systems are characterized by presenting intricate
characteristics extending along many scales. These characterizations are used
in various applications, including text classification, better understanding of
diseases, and comparison between cities, among others. In particular, texts are
also characterized by a hierarchical structure that can be approached by using
multi-scale concepts and methods. The present work aims at developing these
possibilities while focusing on mesoscopic representations of networks. More
specifically, we adopt an extension to the mesoscopic approach to represent
text narratives, in which only the recurrent relationships among tagged parts
of speech are considered to establish connections among sequential pieces of
text (e.g., paragraphs). The characterization of the texts was then achieved by
considering scale-dependent complementary methods: accessibility, symmetry and
recurrence signatures. In order to evaluate the potential of these concepts and
methods, we approached the problem of distinguishing between literary genres
(fiction and non-fiction). A set of 300 books organized into the two genres was
considered, and the books were compared using the aforementioned approaches. All the
methods were capable of differentiating to some extent between the two genres.
The accessibility and symmetry reflected the narrative asymmetries, while the
recurrence signature provided a more direct indication of the non-sequential
semantic connections taking place along the narrative.
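A minimal sketch of the kind of pipeline the abstract describes, under loud assumptions: paragraphs become nodes, edges are weighted by shared words, and accessibility is computed as the exponential of the Shannon entropy of the h-step random-walk transition probabilities (a common definition of the measure). The function names, the whitespace tokenizer, and the shared-word weighting are hypothetical stand-ins; the paper restricts links to recurrent tagged parts of speech, which this sketch does not attempt.

```python
import math


def build_paragraph_network(paragraphs):
    """Return a symmetric weight matrix; weight = number of distinct shared words.

    Assumption: plain lowercase whitespace tokens stand in for the tagged
    parts of speech used in the paper.
    """
    bags = [set(p.lower().split()) for p in paragraphs]
    n = len(bags)
    weights = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            shared = float(len(bags[i] & bags[j]))
            weights[i][j] = weights[j][i] = shared
    return weights


def transition_matrix(weights):
    """Row-normalize weights into random-walk transition probabilities."""
    rows = []
    for row in weights:
        total = sum(row)
        rows.append([w / total if total else 0.0 for w in row])
    return rows


def matmul(a, b):
    """Multiply two square matrices given as lists of lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def accessibility(weights, steps=1):
    """Accessibility of each node after `steps` random-walk steps.

    Computed as exp(entropy) of the row of the `steps`-th power of the
    transition matrix. Isolated nodes get the value 1.0 in this sketch.
    """
    p = transition_matrix(weights)
    ph = p
    for _ in range(steps - 1):
        ph = matmul(ph, p)
    result = []
    for row in ph:
        entropy = -sum(x * math.log(x) for x in row if x > 0)
        result.append(math.exp(entropy))
    return result
```

Calling `accessibility(build_paragraph_network(paragraphs), steps=2)` on a list of paragraph strings returns one value per paragraph; larger values mean the walk can reach more paragraphs with comparable probability at that scale.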
Related papers
- Estimating the Influence of Sequentially Correlated Literary Properties in Textual Classification: A Data-Centric Hypothesis-Testing Approach [4.161155428666988]
Stylometry aims to distinguish authors by analyzing literary traits assumed to reflect semi-conscious choices distinct from elements like genre or theme.
While some literary properties, such as thematic content, are likely to manifest as correlations between adjacent text units, others, like authorial style, may be independent thereof.
We introduce a hypothesis-testing approach to evaluate the influence of sequentially correlated literary properties on text classification.
arXiv Detail & Related papers (2024-11-07T18:28:40Z) - Conjuring Semantic Similarity [59.18714889874088]
The semantic similarity between two textual expressions measures the distance between their latent 'meaning'.
We propose a novel approach whereby the semantic similarity among textual expressions is based not on other expressions they can be rephrased as, but rather based on the imagery they evoke.
Our method contributes a novel perspective on semantic similarity that not only aligns with human-annotated scores, but also opens up new avenues for the evaluation of text-conditioned generative models.
arXiv Detail & Related papers (2024-10-21T18:51:34Z) - Complex systems approach to natural language [0.0]
This review summarizes the main methodological concepts used in studying natural language from the perspective of complexity science.
Three main complexity-related research trends in quantitative linguistics are covered.
arXiv Detail & Related papers (2024-01-05T12:01:26Z) - How Well Do Text Embedding Models Understand Syntax? [50.440590035493074]
The ability of text embedding models to generalize across a wide range of syntactic contexts remains under-explored.
Our findings reveal that existing text embedding models have not sufficiently addressed these syntactic understanding challenges.
We propose strategies to augment the generalization ability of text embedding models in diverse syntactic scenarios.
arXiv Detail & Related papers (2023-11-14T08:51:00Z) - A Comparative Study of Sentence Embedding Models for Assessing Semantic Variation [0.0]
We compare several recent sentence embedding methods via time-series of semantic similarity between successive sentences and matrices of pairwise sentence similarity for multiple books of literature.
We find that most of the sentence embedding methods considered do infer highly correlated patterns of semantic similarity in a given document, but show interesting differences.
arXiv Detail & Related papers (2023-08-08T23:31:10Z) - Tragic and Comical Networks. Clustering Dramatic Genres According to Structural Properties [0.0]
A growing tradition in the joint field of network studies and drama history produces interpretations from the character networks of the plays.
Our aim is to create a method that is able to cluster texts with similar structures on the basis of the play's well-interpretable and simple properties.
Finding these features is the most important part of our research, as well as establishing the appropriate statistical procedure to calculate the similarities between the texts.
arXiv Detail & Related papers (2023-02-16T12:36:16Z) - Contextualized Semantic Distance between Highly Overlapped Texts [85.1541170468617]
Overlapping frequently occurs in paired texts in natural language processing tasks like text editing and semantic similarity evaluation.
This paper aims to address the issue with a mask-and-predict strategy.
We take the words in the longest common sequence as neighboring words and use masked language modeling (MLM) to predict the distributions on their positions.
Experiments on Semantic Textual Similarity show NDD to be more sensitive to various semantic differences, especially on highly overlapped paired texts.
arXiv Detail & Related papers (2021-10-04T03:59:15Z) - Relation Clustering in Narrative Knowledge Graphs [71.98234178455398]
Relational sentences in the original text are embedded (with SBERT) and clustered in order to merge together semantically similar relations.
Preliminary tests show that such clustering might successfully detect similar relations, and provide a valuable preprocessing for semi-supervised approaches.
arXiv Detail & Related papers (2020-11-27T10:43:04Z) - Contextual Modulation for Relation-Level Metaphor Identification [3.2619536457181075]
We introduce a novel architecture for identifying relation-level metaphoric expressions of certain grammatical relations.
In a methodology inspired by works in visual reasoning, our approach is based on conditioning the neural network computation on the deep contextualised features.
We demonstrate that the proposed architecture achieves state-of-the-art results on benchmark datasets.
arXiv Detail & Related papers (2020-10-12T12:07:02Z) - A Comparative Study on Structural and Semantic Properties of Sentence Embeddings [77.34726150561087]
We propose a set of experiments using a widely-used large-scale data set for relation extraction.
We show that different embedding spaces have different degrees of strength for the structural and semantic properties.
These results provide useful information for developing embedding-based relation extraction methods.
arXiv Detail & Related papers (2020-09-23T15:45:32Z) - Temporal Embeddings and Transformer Models for Narrative Text Understanding [72.88083067388155]
We present two approaches to narrative text understanding for character relationship modelling.
The temporal evolution of these relations is described by dynamic word embeddings, that are designed to learn semantic changes over time.
A supervised learning approach based on the state-of-the-art transformer model BERT is used instead to detect static relations between characters.
arXiv Detail & Related papers (2020-03-19T14:23:12Z)