Quantitative Intertextuality from the Digital Humanities Perspective: A Survey
- URL: http://arxiv.org/abs/2510.27045v1
- Date: Thu, 30 Oct 2025 23:19:20 GMT
- Title: Quantitative Intertextuality from the Digital Humanities Perspective: A Survey
- Authors: Siyu Duan
- Abstract summary: The connection between texts is referred to as intertextuality in literary theory. Over the past decade, advancements in natural language processing have ushered intertextuality studies into the quantitative age. This paper provides a roadmap for quantitative intertextuality studies, summarizing their data, methods, and applications.
- Score: 0.46026199514486105
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The connection between texts is referred to as intertextuality in literary theory, which served as an important theoretical basis in many digital humanities studies. Over the past decade, advancements in natural language processing have ushered intertextuality studies into the quantitative age. Large-scale intertextuality research based on cutting-edge methods has continuously emerged. This paper provides a roadmap for quantitative intertextuality studies, summarizing their data, methods, and applications. Drawing on data from multiple languages and topics, this survey reviews methods from statistics to deep learning. It also summarizes their applications in humanities and social sciences research and the associated platform tools. Driven by advances in computer technology, more precise, diverse, and large-scale intertext studies can be anticipated. Intertextuality holds promise for broader application in interdisciplinary research bridging AI and the humanities.
Related papers
- Large-Scale Multidimensional Knowledge Profiling of Scientific Literature [46.15403461273178]
We compile a unified corpus of more than 100,000 papers from 22 major conferences between 2020 and 2025. Our analysis highlights several notable shifts, including the growth of safety, multimodal reasoning, and agent-oriented studies. These findings provide an evidence-based view of how AI research is evolving and offer a resource for understanding broader trends and identifying emerging directions.
arXiv Detail & Related papers (2026-01-21T16:47:05Z)
- Documents Are People and Words Are Items: A Psychometric Approach to Textual Data with Contextual Embeddings [2.1494179586067537]
This research introduces a novel psychometric method for analyzing textual data using large language models. By leveraging contextual embeddings, we transform textual data into response data suitable for psychometric analysis.
arXiv Detail & Related papers (2025-09-10T18:31:37Z)
- Modelling Intertextuality with N-gram Embeddings [0.8731440790248101]
This paper proposes a new quantitative model of intertextuality to enable scalable analysis and network-based insights. Validation on four texts with known degrees of intertextuality, alongside a scalability test on 267 diverse texts, demonstrates the method's effectiveness and efficiency.
arXiv Detail & Related papers (2025-09-08T12:54:38Z)
- A Survey of Text Representation Methods and Their Genealogy [0.0]
In recent years, with the advent of highly scalable artificial-neural-network-based text representation methods, the field of natural language processing has seen unprecedented growth and sophistication.
We provide a survey of current approaches, by arranging them in a genealogy, and by conceptualizing a taxonomy of text representation methods to examine and explain the state-of-the-art.
arXiv Detail & Related papers (2022-11-26T15:22:01Z)
- An Inclusive Notion of Text [69.36678873492373]
We argue that clarity on the notion of text is crucial for reproducible and generalizable NLP.
We introduce a two-tier taxonomy of linguistic and non-linguistic elements that are available in textual sources and can be used in NLP modeling.
arXiv Detail & Related papers (2022-11-10T14:26:43Z)
- Vision+X: A Survey on Multimodal Learning in the Light of Data [64.03266872103835]
Multimodal machine learning that incorporates data from various sources has become an increasingly popular research area.
We analyze the commonness and uniqueness of each data format mainly ranging from vision, audio, text, and motions.
We investigate the existing literature on multimodal learning from both the representation learning and downstream application levels.
arXiv Detail & Related papers (2022-10-05T13:14:57Z)
- Faithfulness in Natural Language Generation: A Systematic Survey of Analysis, Evaluation and Optimization Methods [48.47413103662829]
Natural Language Generation (NLG) has made great progress in recent years due to the development of deep learning techniques such as pre-trained language models.
However, the faithfulness problem that the generated text usually contains unfaithful or non-factual information has become the biggest challenge.
arXiv Detail & Related papers (2022-03-10T08:28:32Z)
- Software-Based Dialogue Systems: Survey, Taxonomy and Challenges [4.2763155274587366]
This paper reports a survey of the current state of research of conversational agents through a systematic literature review of secondary studies.
As a result, this research proposes a holistic taxonomy of the different dimensions involved in the conversational agents' field.
arXiv Detail & Related papers (2021-06-21T07:41:44Z)
- Positioning yourself in the maze of Neural Text Generation: A Task-Agnostic Survey [54.34370423151014]
This paper surveys the components of modeling approaches relaying task impacts across various generation tasks such as storytelling, summarization, translation etc.
We present an abstraction of the imperative techniques with respect to learning paradigms, pretraining, modeling approaches, decoding and the key challenges outstanding in the field in each of them.
arXiv Detail & Related papers (2020-10-14T17:54:42Z)
- A Survey on Text Classification: From Shallow to Deep Learning [83.47804123133719]
The last decade has seen a surge of research in this area due to the unprecedented success of deep learning.
This paper fills the gap by reviewing the state-of-the-art approaches from 1961 to 2021.
We create a taxonomy for text classification according to the text involved and the models used for feature extraction and classification.
arXiv Detail & Related papers (2020-08-02T00:09:03Z)
This list is automatically generated from the titles and abstracts of the papers in this site.