From exemplar to copy: the scribal appropriation of a Hadewijch
manuscript computationally explored
- URL: http://arxiv.org/abs/2210.14061v4
- Date: Thu, 6 Apr 2023 15:28:58 GMT
- Title: From exemplar to copy: the scribal appropriation of a Hadewijch
manuscript computationally explored
- Authors: Wouter Haverals, Mike Kestemont
- Abstract summary: This study is devoted to two of the oldest known manuscripts in which the oeuvre of the medieval mystical author Hadewijch has been preserved.
It is assumed that the scribe who produced B used A as an exemplar.
We use machine learning to identify the most distinctive features that separate manuscript A from B.
- Score: 1.9798034349981157
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: This study is devoted to two of the oldest known manuscripts in which the
oeuvre of the medieval mystical author Hadewijch has been preserved: Brussels,
KBR, 2879-2880 (ms. A) and Brussels, KBR, 2877-2878 (ms. B). On the basis of
codicological and contextual arguments, it is assumed that the scribe who
produced B used A as an exemplar. While the similarities in both layout and
content between the two manuscripts are striking, the present article seeks to
identify the differences. After all, regardless of the intention to produce a
copy that closely follows the exemplar, subtle linguistic variation is
apparent. Divergences relate to spelling conventions, but also to the way in
which words are abbreviated (and the extent to which abbreviations occur). The
present study investigates the spelling profiles of the scribes who produced
mss. A and B in a computational way. In the first part of this study, we will
present both manuscripts in more detail, after which we will consider prior
research carried out on scribal profiling. The current study both builds and
expands on Kestemont (2015). Next, we outline the methodology used to analyse
and measure the degree of scribal appropriation that took place when ms. B was
copied off the exemplar ms. A. After this, we will discuss the results
obtained, focusing on the scribal variation that can be found both at the level
of individual words and n-grams. To this end, we use machine learning to
identify the most distinctive features that separate manuscript A from B.
Finally, we look at possible diachronic trends in the appropriation by B's
scribe of his exemplar. We argue that scribal takeovers in the exemplar impact
the practice of the copying scribe, while transitions to a different content
matter cause little to no effect.
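As a rough illustration of what "identifying the most distinctive features" at the n-gram level can mean, the sketch below ranks character n-grams by how much their relative frequency differs between two transcriptions. It is a simple frequency contrast and not the machine-learning method the authors use; the function names (`char_ngrams`, `relative_freqs`, `distinctive_ngrams`) and the parameters `n` and `top` are hypothetical and chosen only for this example.

```python
from collections import Counter


def char_ngrams(text, n):
    """Return all overlapping character n-grams of a string."""
    return [text[i:i + n] for i in range(len(text) - n + 1)]


def relative_freqs(text, n):
    """Map each character n-gram to its share of all n-grams in the text."""
    grams = char_ngrams(text, n)
    counts = Counter(grams)
    total = len(grams)
    return {gram: count / total for gram, count in counts.items()}


def distinctive_ngrams(text_a, text_b, n=2, top=5):
    """Rank n-grams by the difference in relative frequency (A minus B).

    A positive score means the n-gram is relatively more frequent in
    text_a, a negative score that it is more frequent in text_b.
    This is a hypothetical stand-in for a trained classifier's feature
    weights, not the procedure described in the paper.
    """
    freqs_a = relative_freqs(text_a, n)
    freqs_b = relative_freqs(text_b, n)
    diffs = {
        gram: freqs_a.get(gram, 0.0) - freqs_b.get(gram, 0.0)
        for gram in set(freqs_a) | set(freqs_b)
    }
    # Largest absolute difference first; ties broken alphabetically.
    ranked = sorted(diffs.items(), key=lambda item: (-abs(item[1]), item[0]))
    return ranked[:top]
```

Calling `distinctive_ngrams(transcription_a, transcription_b, n=3)` on two aligned transcriptions would return the trigrams whose usage differs most between them. A real study would normally also control for text length and content, which this sketch does not attempt.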
Related papers
- CopyBench: Measuring Literal and Non-Literal Reproduction of Copyright-Protected Text in Language Model Generation [132.00910067533982]
We introduce CopyBench, a benchmark designed to measure both literal and non-literal copying in LM generations.
We find that, although literal copying is relatively rare, two types of non-literal copying -- event copying and character copying -- occur even in models as small as 7B parameters.
arXiv Detail & Related papers (2024-07-09T17:58:18Z) - Do Pretrained Contextual Language Models Distinguish between Hebrew Homograph Analyses? [12.631897904322676]
We study the extent to which Hebrew homographs can be disambiguated and analyzed using pre-trained language models.
We show that contemporary Hebrew contextualized embeddings outperform non-contextualized embeddings.
We also show that these embeddings are equally effective for homographs of both balanced and skewed distributions.
arXiv Detail & Related papers (2024-05-11T21:50:56Z) - FRACAS: A FRench Annotated Corpus of Attribution relations in newS [0.0]
We present a manually annotated corpus of 1676 newswire texts in French for quotation extraction and source attribution.
We first describe the composition of our corpus and the choices that were made in selecting the data.
We then detail our inter-annotator agreement between the 8 annotators who worked on manual labelling.
arXiv Detail & Related papers (2023-09-19T13:19:54Z) - Same or Different? Diff-Vectors for Authorship Analysis [78.83284164605473]
In "classic" authorship analysis, a feature vector represents a document, the value of a feature represents (an increasing function of) the relative frequency of the feature in the document, and the class label represents the author of the document.
Our experiments tackle same-author verification, authorship verification, and closed-set authorship attribution; while DVs are naturally geared for solving the 1st, we also provide two novel methods for solving the 2nd and 3rd.
arXiv Detail & Related papers (2023-01-24T08:48:12Z) - PART: Pre-trained Authorship Representation Transformer [64.78260098263489]
Authors writing documents imprint identifying information within their texts: vocabulary, registry, punctuation, misspellings, or even emoji usage.
Previous works use hand-crafted features or classification tasks to train their authorship models, leading to poor performance on out-of-domain authors.
We propose a contrastively trained model fit to learn authorship embeddings instead of semantics.
arXiv Detail & Related papers (2022-09-30T11:08:39Z) - Reinforcing Semantic-Symmetry for Document Summarization [15.113768658584979]
Document summarization condenses a long document into a short version with salient information and accurate semantic descriptions.
This paper introduces a new reinforcing semantic-symmetry learning model for document summarization.
A series of experiments have been conducted on two widely used benchmark datasets, CNN/Daily Mail and BigPatent.
arXiv Detail & Related papers (2021-12-14T17:41:37Z) - Image Collation: Matching illustrations in manuscripts [76.21388548732284]
We introduce the task of illustration collation and a large annotated public dataset to evaluate solutions.
We analyze state of the art similarity measures for this task and show that they succeed in simple cases but struggle for large manuscripts.
We show clear evidence that significant performance boosts can be expected by exploiting cycle-consistent correspondences.
arXiv Detail & Related papers (2021-08-18T12:12:14Z) - Revisiting Rashomon: A Comment on "The Two Cultures" [95.81740983484471]
Breiman coined the term "Rashomon Effect" for the situation in which many models satisfy predictive accuracy criteria equally well but process information in substantially different ways.
This phenomenon can make it difficult to draw conclusions or automate decisions based on a model fit to data.
I make connections to recent work in the Machine Learning literature that explores the implications of this issue.
arXiv Detail & Related papers (2021-04-05T20:51:58Z) - Artificial intelligence based writer identification generates new
evidence for the unknown scribes of the Dead Sea Scrolls exemplified by the
Great Isaiah Scroll (1QIsaa) [5.285396202883411]
We use pattern recognition and artificial intelligence techniques to innovate the palaeography of the scrolls regarding writer identification.
Although many scholars believe that 1QIsaa was written by one scribe, we report new evidence for a breaking point in the series of columns in this scroll.
This study sheds new light on the Bible's ancient scribal culture by providing new, tangible evidence that ancient biblical texts were not copied by a single scribe only but that multiple scribes could closely collaborate on one particular manuscript.
arXiv Detail & Related papers (2020-10-27T17:36:18Z) - The Extraordinary Failure of Complement Coercion Crowdsourcing [50.599433903377374]
Crowdsourcing has eased and scaled up the collection of linguistic annotation in recent years.
We aim to collect annotated data for complement coercion by reducing it to either of two known tasks: Explicit Completion and Natural Language Inference.
In both cases, crowdsourcing resulted in low agreement scores, even though we followed the same methodologies as in previous work.
arXiv Detail & Related papers (2020-10-12T19:04:04Z)
This list is automatically generated from the titles and abstracts of the papers on this site.