Hidden Knowledge: Mathematical Methods for the Extraction of the
Fingerprint of Medieval Paper from Digital Images
- URL: http://arxiv.org/abs/2303.03794v1
- Date: Tue, 7 Mar 2023 11:01:19 GMT
- Authors: Tamara G. Grossmann, Carola-Bibiane Schönlieb, Orietta Da Rold
- Abstract summary: Medieval paper is made with a mould which leaves an indelible imprint on the sheet of paper.
This imprint includes chain lines, laid lines and watermarks which are often visible on the sheet.
Extracting these features allows the identification of paper stock and gives information about chronology, localisation and movement of books and people.
- Score: 1.2891210250935146
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Medieval paper, a handmade product, is made with a mould which leaves an
indelible imprint on the sheet of paper. This imprint includes chain lines,
laid lines and watermarks which are often visible on the sheet. Extracting
these features allows the identification of paper stock and gives information
about chronology, localisation and movement of books and people. Most
computational work for feature extraction of paper analysis has so far focused
on radiography or transmitted light images. While these imaging methods provide
clear visualisation for the features of interest, they are expensive and
time-consuming in their acquisition and not feasible for smaller institutions.
However, reflected light images of medieval paper manuscripts are abundant and
possibly cheaper in their acquisition. In this paper, we propose algorithms to
detect and extract the laid and chain lines from reflected light images. We
tackle the main drawback of reflected light images, that is, the low contrast
attenuation of lines and intensity jumps due to noise and degradation, by
employing the spectral total variation decomposition and developing methods for
subsequent line extraction. Our results clearly demonstrate the feasibility of
using reflected light images in paper analysis. This work enables the feature
extraction for paper manuscripts that have otherwise not been analysed due to a
lack of appropriate images. We also open the door for paper stock
identification at scale.
Related papers
- MagicLens: Self-Supervised Image Retrieval with Open-Ended Instructions [64.89284104414865]
We introduce MagicLens, a series of self-supervised image retrieval models that support open-ended instructions.
MagicLens is built on a key novel insight: image pairs that naturally occur on the same web pages contain a wide range of implicit relations.
MagicLens achieves results comparable with or better than prior best on eight benchmarks of various image retrieval tasks.
arXiv Detail & Related papers (2024-03-28T17:59:20Z)
- Efficient Annotation of Medieval Charters [2.6214349237099173]
Diplomatics, the analysis of medieval charters, is a major field of research in which paleography is applied.
We propose an effective and efficient annotation approach for charter segmentation, essentially reducing it to object detection.
We further annotate the data with the physical length in pixels and train regression neural networks to predict it from image patches.
arXiv Detail & Related papers (2023-06-24T22:55:55Z)
- Tree-Ring Watermarks: Fingerprints for Diffusion Images that are Invisible and Robust [55.91987293510401]
Watermarking the outputs of generative models is a crucial technique for tracing copyright and preventing potential harm from AI-generated content.
We introduce a novel technique called Tree-Ring Watermarking that robustly fingerprints diffusion model outputs.
Our watermark is semantically hidden in the image space and is far more robust than watermarking alternatives that are currently deployed.
arXiv Detail & Related papers (2023-05-31T17:00:31Z)
- Deep Image Matting: A Comprehensive Survey [85.77905619102802]
This paper presents a review of recent advancements in image matting in the era of deep learning.
We focus on two fundamental sub-tasks: auxiliary input-based image matting and automatic image matting.
We discuss relevant applications of image matting and highlight existing challenges and potential opportunities for future research.
arXiv Detail & Related papers (2023-04-10T15:48:55Z)
- Augraphy: A Data Augmentation Library for Document Images [59.457999432618614]
Augraphy is a Python library for constructing data augmentation pipelines.
It provides strategies to produce augmented versions of clean document images that appear to have been altered by standard office operations.
arXiv Detail & Related papers (2022-08-30T22:36:19Z)
- Designing An Illumination-Aware Network for Deep Image Relighting [69.750906769976]
We present an Illumination-Aware Network (IAN) which follows the guidance from hierarchical sampling to progressively relight a scene from a single image.
In addition, an Illumination-Aware Residual Block (IARB) is designed to approximate the physical rendering process.
Experimental results show that our proposed method produces better quantitative and qualitative relighting results than previous state-of-the-art methods.
arXiv Detail & Related papers (2022-07-21T16:21:24Z)
- Image-based material analysis of ancient historical documents [5.285396202883411]
This study uses images of a famous historical collection, the Dead Sea Scrolls, to propose a novel method to classify the materials of the manuscripts.
A binary classification system employing the transform with a majority voting process is shown to be effective for this classification task.
This pilot study shows a successful classification percentage of up to 97% for a confined amount of manuscripts produced from either parchment or papyrus material.
arXiv Detail & Related papers (2022-03-02T11:39:22Z)
- Neural Content Extraction for Poster Generation of Scientific Papers [84.30128728027375]
The problem of poster generation for scientific papers is under-investigated.
Previous studies focus mainly on poster layout and panel composition, while neglecting the importance of content extraction.
To get both textual and visual elements of a poster panel, a neural extractive model is proposed to extract text, figures and tables of a paper section simultaneously.
arXiv Detail & Related papers (2021-12-16T01:19:37Z)
- A Survey on Deep learning based Document Image Enhancement [5.279475826661643]
Digitized documents such as scientific articles, tax forms, invoices, contract papers, and historic texts are widely used nowadays.
These images could be degraded or damaged due to various reasons including poor lighting conditions when capturing the image, shadow while scanning them, distortion like noise and blur, aging, ink stain, bleed through, watermark, stamp, etc.
With recent advances in deep learning, many methods are proposed to enhance the quality of these document images.
arXiv Detail & Related papers (2021-12-06T00:24:50Z)
- VIPPrint: A Large Scale Dataset of Printed and Scanned Images for Synthetic Face Images Detection and Source Linking [26.02960434287235]
We present a new dataset composed of a large number of synthetic and natural printed face images.
We verify that state-of-the-art methods to distinguish natural and synthetic face images fail when applied to print and scanned images.
arXiv Detail & Related papers (2021-02-01T13:00:29Z)
- Dissecting Image Crops [22.482090207522358]
The elementary operation of cropping underpins nearly every computer vision system.
This paper investigates the subtle traces introduced by this operation.
We study how to detect these traces, and investigate the impact that cropping has on the image distribution.
arXiv Detail & Related papers (2020-11-24T01:33:47Z)