Modelling Intertextuality with N-gram Embeddings
- URL: http://arxiv.org/abs/2509.06637v2
- Date: Tue, 09 Sep 2025 07:20:13 GMT
- Title: Modelling Intertextuality with N-gram Embeddings
- Authors: Yi Xing
- Abstract summary: This paper proposes a new quantitative model of intertextuality to enable scalable analysis and network-based insights. Validation on four texts with known degrees of intertextuality, alongside a scalability test on 267 diverse texts, demonstrates the method's effectiveness and efficiency.
- Score: 0.8731440790248101
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Intertextuality is a central tenet in literary studies. It refers to the intricate links between literary texts that are created by various types of references. This paper proposes a new quantitative model of intertextuality to enable scalable analysis and network-based insights: perform pairwise comparisons of the embeddings of n-grams from two texts and average their results as the overall intertextuality. Validation on four texts with known degrees of intertextuality, alongside a scalability test on 267 diverse texts, demonstrates the method's effectiveness and efficiency. Network analysis further reveals centrality and community structures, affirming the approach's success in capturing and quantifying intertextual relationships.
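The scoring procedure described in the abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: the `embed` function is a hypothetical stand-in (a deterministic hash-seeded unit vector) for a real n-gram embedding model, and the n-gram size and dimensionality are arbitrary choices.

```python
import numpy as np

def embed(ngram: str, dim: int = 64) -> np.ndarray:
    # Hypothetical stand-in for a real embedding model: a deterministic,
    # hash-seeded random unit vector, so dot products equal cosine similarity.
    rng = np.random.default_rng(abs(hash(ngram)) % (2**32))
    v = rng.standard_normal(dim)
    return v / np.linalg.norm(v)

def ngrams(text: str, n: int = 3) -> list[str]:
    # Sliding word n-grams over a whitespace-tokenized text.
    tokens = text.split()
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

def intertextuality(text_a: str, text_b: str, n: int = 3) -> float:
    # Pairwise cosine similarities between every n-gram embedding of one
    # text and every n-gram embedding of the other, averaged into a
    # single intertextuality score.
    A = np.stack([embed(g) for g in ngrams(text_a, n)])
    B = np.stack([embed(g) for g in ngrams(text_b, n)])
    return float((A @ B.T).mean())
```

With a real embedding model the same averaging step applies; scores over many text pairs can then feed a similarity network for the centrality and community analyses the abstract mentions.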
Related papers
- Quantitative Intertextuality from the Digital Humanities Perspective: A Survey [0.46026199514486105]
The connection between texts is referred to as intertextuality in literary theory. Over the past decade, advancements in natural language processing have ushered intertextuality studies into the quantitative age. This paper provides a roadmap for quantitative intertextuality studies, summarizing their data, methods, and applications.
arXiv Detail & Related papers (2025-10-30T23:19:20Z)
- Investigating Expert-in-the-Loop LLM Discourse Patterns for Ancient Intertextual Analysis [0.0]
The study demonstrates that large language models can detect direct quotations, allusions, and echoes between texts.
The model struggles with long query passages and the inclusion of false intertextual dependencies.
The expert-in-the-loop methodology presented offers a scalable approach for intertextual research.
arXiv Detail & Related papers (2024-09-03T13:23:11Z)
- BBScore: A Brownian Bridge Based Metric for Assessing Text Coherence [18.77248934443666]
Coherent texts inherently manifest a sequential and cohesive interplay among sentences. BBScore is a reference-free metric grounded in Brownian bridge theory for assessing text coherence.
arXiv Detail & Related papers (2023-12-28T08:34:17Z)
- Rule-Guided Joint Embedding Learning over Knowledge Graphs [2.797512394739081]
We propose a novel model that integrates both contextual and textual signals into entity and relation embeddings. To better utilize context, we introduce two metrics: confidence, computed via a rule-based method, and relatedness, derived from textual representations.
arXiv Detail & Related papers (2023-12-01T19:58:31Z)
- How Well Do Text Embedding Models Understand Syntax? [50.440590035493074]
The ability of text embedding models to generalize across a wide range of syntactic contexts remains under-explored.
Our findings reveal that existing text embedding models have not sufficiently addressed these syntactic understanding challenges.
We propose strategies to augment the generalization ability of text embedding models in diverse syntactic scenarios.
arXiv Detail & Related papers (2023-11-14T08:51:00Z)
- An Inclusive Notion of Text [69.36678873492373]
We argue that clarity on the notion of text is crucial for reproducible and generalizable NLP.
We introduce a two-tier taxonomy of linguistic and non-linguistic elements that are available in textual sources and can be used in NLP modeling.
arXiv Detail & Related papers (2022-11-10T14:26:43Z)
- Textual Entailment Recognition with Semantic Features from Empirical Text Representation [60.31047947815282]
A text entails a hypothesis if and only if the truth of the hypothesis follows from the text.
In this paper, we propose a novel approach to identifying the textual entailment relationship between text and hypothesis.
We employ an element-wise Manhattan distance vector-based feature that can identify the semantic entailment relationship between the text-hypothesis pair.
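The element-wise feature described above can be sketched as follows; the function name and inputs are illustrative, not the paper's code, and the embeddings would come from whatever sentence encoder the pipeline uses.

```python
import numpy as np

def manhattan_feature(text_emb, hyp_emb) -> np.ndarray:
    # Element-wise absolute difference between the text and hypothesis
    # embedding vectors; a downstream classifier consumes this vector
    # to decide whether the entailment relationship holds.
    return np.abs(np.asarray(text_emb, dtype=float) -
                  np.asarray(hyp_emb, dtype=float))
```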
arXiv Detail & Related papers (2022-10-18T10:03:51Z)
- TeKo: Text-Rich Graph Neural Networks with External Knowledge [75.91477450060808]
We propose a novel text-rich graph neural network with external knowledge (TeKo).
We first present a flexible heterogeneous semantic network that incorporates high-quality entities.
We then introduce two types of external knowledge: structured triplets and unstructured entity descriptions.
arXiv Detail & Related papers (2022-06-15T02:33:10Z)
- Revise and Resubmit: An Intertextual Model of Text-based Collaboration in Peer Review [52.359007622096684]
Peer review is a key component of the publishing process in most fields of science.
Existing NLP studies focus on the analysis of individual texts.
Editorial assistance, however, often requires modeling interactions between pairs of texts.
arXiv Detail & Related papers (2022-04-22T16:39:38Z)
- TextEssence: A Tool for Interactive Analysis of Semantic Shifts Between Corpora [14.844685568451833]
We introduce TextEssence, an interactive system designed to enable comparative analysis of corpora using embeddings.
TextEssence includes visual, neighbor-based, and similarity-based modes of embedding analysis in a lightweight, web-based interface.
arXiv Detail & Related papers (2021-03-19T21:26:28Z)
- Improving Machine Reading Comprehension with Contextualized Commonsense Knowledge [62.46091695615262]
We aim to extract commonsense knowledge to improve machine reading comprehension.
We propose to represent relations implicitly by situating structured knowledge in a context.
We employ a teacher-student paradigm to inject multiple types of contextualized knowledge into a student machine reader.
arXiv Detail & Related papers (2020-09-12T17:20:01Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed content (including all information) and is not responsible for any consequences of its use.