Unveiling Temporal Trends in 19th Century Literature: An Information Retrieval Approach
- URL: http://arxiv.org/abs/2501.06833v1
- Date: Sun, 12 Jan 2025 15:00:10 GMT
- Title: Unveiling Temporal Trends in 19th Century Literature: An Information Retrieval Approach
- Authors: Suchana Datta, Dwaipayan Roy, Derek Greene, Gerardine Meaney,
- Abstract summary: In English literature, the 19th century witnessed a significant transition in styles, themes, and genres.
This paper explores the evolution of term usage in 19th century English novels through the lens of information retrieval.
- Abstract: In English literature, the 19th century witnessed a significant transition in styles, themes, and genres. Consequently, the novels from this period display remarkable diversity. This paper explores these variations by examining the evolution of term usage in 19th century English novels through the lens of information retrieval. By applying a query expansion-based approach to a decade-segmented collection of fiction from the British Library, we examine how related terms vary over time. Our analysis employs multiple standard metrics including Kendall's tau, Jaccard similarity, and Jensen-Shannon divergence to assess overlaps and shifts in expanded query term sets. Our results indicate a significant degree of divergence in the related terms across decades as selected by the query expansion technique, suggesting substantial linguistic and conceptual changes throughout the 19th century novels.
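The three metrics named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the Kendall's tau variant (tau-a over the terms shared by both lists), the base-2 logarithm for Jensen-Shannon divergence, and the example expansion terms and weights are all assumptions made for illustration only.

```python
import math
from itertools import combinations


def jaccard(terms_a, terms_b):
    """Jaccard similarity between two sets of expansion terms."""
    a, b = set(terms_a), set(terms_b)
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def js_divergence(weights_p, weights_q):
    """Jensen-Shannon divergence (base 2, so in [0, 1]) between two
    term -> weight dicts; weights are normalised to sum to 1 first."""
    total_p = sum(weights_p.values())
    total_q = sum(weights_q.values())
    jsd = 0.0
    for term in set(weights_p) | set(weights_q):
        p = weights_p.get(term, 0.0) / total_p
        q = weights_q.get(term, 0.0) / total_q
        m = (p + q) / 2
        if p > 0:
            jsd += 0.5 * p * math.log2(p / m)
        if q > 0:
            jsd += 0.5 * q * math.log2(q / m)
    return jsd


def kendall_tau(ranking_a, ranking_b):
    """Kendall's tau-a restricted to terms present in both ranked lists.
    Returns None when fewer than two terms are shared."""
    pos_b = {term: i for i, term in enumerate(ranking_b)}
    common = [term for term in ranking_a if term in pos_b]
    if len(common) < 2:
        return None
    concordant = discordant = 0
    for x, y in combinations(common, 2):
        # x precedes y in ranking_a by construction.
        if pos_b[x] < pos_b[y]:
            concordant += 1
        else:
            discordant += 1
    n_pairs = len(common) * (len(common) - 1) / 2
    return (concordant - discordant) / n_pairs


# Hypothetical expansion terms and weights for one query in two decades.
decade_early = {"coach": 0.4, "road": 0.3, "horse": 0.2, "inn": 0.1}
decade_late = {"train": 0.4, "station": 0.3, "road": 0.2, "coach": 0.1}

# Rank terms by descending weight for the rank-correlation measure.
rank_early = sorted(decade_early, key=decade_early.get, reverse=True)
rank_late = sorted(decade_late, key=decade_late.get, reverse=True)

print("Jaccard:", jaccard(decade_early, decade_late))
print("Kendall tau:", kendall_tau(rank_early, rank_late))
print("JS divergence:", js_divergence(decade_early, decade_late))
```

In this toy case the two decades share only two of six distinct terms, and those two appear in reversed order, so the overlap is low and the rank correlation is negative. Library routines such as those in SciPy could replace the hand-written functions; the standard-library version is used here only to keep the sketch self-contained.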
Related papers
- LFED: A Literary Fiction Evaluation Dataset for Large Language Models [58.85989777743013]
We collect 95 literary fictions that are either originally written in Chinese or translated into Chinese, covering a wide range of topics across several centuries.
We define a question taxonomy with 8 question categories to guide the creation of 1,304 questions.
We conduct an in-depth analysis to ascertain how specific attributes of literary fictions (e.g., novel types, character numbers, the year of publication) impact LLM performance in evaluations.
arXiv Detail & Related papers (2024-05-16T15:02:24Z)
- Understanding Cross-Lingual Alignment -- A Survey [52.572071017877704]
Cross-lingual alignment is the meaningful similarity of representations across languages in multilingual language models.
We survey the literature of techniques to improve cross-lingual alignment, providing a taxonomy of methods and summarising insights from throughout the field.
arXiv Detail & Related papers (2024-04-09T11:39:53Z)
- Syntactic Language Change in English and German: Metrics, Parsers, and Convergences [56.47832275431858]
The current paper looks at diachronic trends in syntactic language change in both English and German, using corpora of parliamentary debates from the last c. 160 years.
We base our observations on five dependency parsers, including the widely used Stanford CoreNLP as well as 4 newer alternatives.
We show that changes in syntactic measures seem to be more frequent at the tails of sentence length distributions.
arXiv Detail & Related papers (2024-02-18T11:46:16Z)
- A Novel Method for Analysing Racial Bias: Collection of Person Level References [6.345851712811529]
We propose a novel method to analyze the differences in representation between two groups.
We examine the representation of African Americans and White Americans in books from 1850 to 2000 with the Google Books dataset.
arXiv Detail & Related papers (2023-10-24T14:00:01Z)
- Query Expansion Using Contextual Clue Sampling with Language Models [69.51976926838232]
We propose a combination of an effective filtering strategy and fusion of the retrieved documents based on the generation probability of each context.
Our lexical matching based approach achieves a similar top-5/top-20 retrieval accuracy and higher top-100 accuracy compared with the well-established dense retrieval model DPR.
For end-to-end QA, the reader model also benefits from our method and achieves the highest Exact-Match score against several competitive baselines.
arXiv Detail & Related papers (2022-10-13T15:18:04Z)
- Temporal Analysis on Topics Using Word2Vec [0.0]
The present study proposes a novel method of trend detection and visualization - more specifically, modeling the change in a topic over time.
The methodology was tested on a group of articles from various media houses present in the 20 Newsgroups dataset.
arXiv Detail & Related papers (2022-09-23T16:51:29Z)
- A decomposition of book structure through ousiometric fluctuations in cumulative word-time [1.181206257787103]
We look at how words change over the course of a book as a function of the number of words, rather than the fraction of the book.
We find that shorter books exhibit only a general trend, while longer books have fluctuations in addition to the general trend.
Our findings suggest that, in the ousiometric sense, longer books are not expanded versions of shorter books, but are more similar in structure to a concatenation of shorter texts.
arXiv Detail & Related papers (2022-08-19T18:17:27Z)
- Textual Stylistic Variation: Choices, Genres and Individuals [0.8057441774248633]
This chapter argues for more informed target metrics for the statistical processing of stylistic variation in text collections.
This chapter discusses variation given by genre, and contrasts it to variation occasioned by individual choice.
arXiv Detail & Related papers (2022-05-01T16:39:49Z)
- Semantics of European poetry is shaped by conservative forces: The relationship between poetic meter and meaning in accentual-syllabic verse [0.0]
We provide the first large-scale formal evidence of the persistent association between poetic meter and semantics in 18th-19th century European literatures.
Our study traces this association through a series of clustering experiments using the abstracted semantic features of 150,000 poems.
arXiv Detail & Related papers (2021-09-15T08:20:01Z)
- Lexical semantic change for Ancient Greek and Latin [61.69697586178796]
Associating a word's correct meaning in its historical context is a central challenge in diachronic research.
We build on a recent computational approach to semantic change based on a dynamic Bayesian mixture model.
We provide a systematic comparison of dynamic Bayesian mixture models for semantic change with state-of-the-art embedding-based models.
arXiv Detail & Related papers (2021-01-22T12:04:08Z)
- A Survey on Text Classification: From Shallow to Deep Learning [83.47804123133719]
The last decade has seen a surge of research in this area due to the unprecedented success of deep learning.
This paper fills the gap by reviewing the state-of-the-art approaches from 1961 to 2021.
We create a taxonomy for text classification according to the text involved and the models used for feature extraction and classification.
arXiv Detail & Related papers (2020-08-02T00:09:03Z)