Historical Printed Ornaments: Dataset and Tasks
- URL: http://arxiv.org/abs/2408.08633v1
- Date: Fri, 16 Aug 2024 09:54:12 GMT
- Title: Historical Printed Ornaments: Dataset and Tasks
- Authors: Sayan Kumar Chaki, Zeynep Sonat Baltaci, Elliot Vincent, Remi Emonet, Fabienne Vial-Bonacci, Christelle Bahier-Porte, Mathieu Aubry, Thierry Fournel,
- Abstract summary: This paper aims to develop the study of historical printed ornaments with modern unsupervised computer vision.
We highlight three complex tasks that are of critical interest to book historians: clustering, element discovery, and unsupervised change localization.
Our Rey's Ornaments dataset is designed to be a representative example of a set of ornaments historians would be interested in.
- Score: 12.309062405756686
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: This paper aims to develop the study of historical printed ornaments with modern unsupervised computer vision. We highlight three complex tasks that are of critical interest to book historians: clustering, element discovery, and unsupervised change localization. For each of these tasks, we introduce an evaluation benchmark, and we adapt and evaluate state-of-the-art models. Our Rey's Ornaments dataset is designed to be a representative example of a set of ornaments historians would be interested in. It focuses on an XVIIIth century bookseller, Marc-Michel Rey, providing a consistent set of ornaments with a wide diversity and representative challenges. Our results highlight the limitations of state-of-the-art models when faced with real data and show simple baselines such as k-means or congealing can outperform more sophisticated approaches on such data. Our dataset and code can be found at https://printed-ornaments.github.io/.
Related papers
- Unlocking Comics: The AI4VA Dataset for Visual Understanding [62.345344799258804]
This paper presents a novel dataset comprising Franco-Belgian comics from the 1950s annotated for tasks including depth estimation, semantic segmentation, saliency detection, and character identification.
It consists of two distinct and consistent styles and incorporates object concepts and labels taken from natural images.
By including such diverse information across styles, this dataset not only holds promise for computational creativity but also offers avenues for the digitization of art and storytelling innovation.
arXiv Detail & Related papers (2024-10-27T14:27:05Z) - CoMix: A Comprehensive Benchmark for Multi-Task Comic Understanding [14.22900011952181]
We introduce a novel benchmark, CoMix, designed to evaluate the multi-task capabilities of models in comic analysis.
Our benchmark comprises three existing datasets with expanded annotations to support multi-task evaluation.
To mitigate the over-representation of manga-style data, we have incorporated a new dataset of carefully selected American comic-style books.
arXiv Detail & Related papers (2024-07-04T00:07:50Z) - Contrastive Entity Coreference and Disambiguation for Historical Texts [2.446672595462589]
Existing entity disambiguation methods often fall short in accuracy for historical documents, which are replete with individuals not remembered in contemporary knowledgebases.
This study makes three key contributions to improve cross-document coreference resolution and disambiguation in historical texts.
arXiv Detail & Related papers (2024-06-21T18:22:14Z) - Prompt me a Dataset: An investigation of text-image prompting for
historical image dataset creation using foundation models [0.9065034043031668]
We present a pipeline for image extraction from historical documents using foundation models.
We evaluate text-image prompts and their effectiveness on humanities datasets of varying levels of complexity.
arXiv Detail & Related papers (2023-09-04T15:37:03Z) - Fairness meets Cross-Domain Learning: a new perspective on Models and
Metrics [80.07271410743806]
We study the relationship between cross-domain learning (CD) and model fairness.
We introduce a benchmark on face and medical images spanning several demographic groups as well as classification and localization tasks.
Our study covers 14 CD approaches alongside three state-of-the-art fairness algorithms and shows how the former can outperform the latter.
arXiv Detail & Related papers (2023-03-25T09:34:05Z) - Unsupervised Neural Stylistic Text Generation using Transfer learning
and Adapters [66.17039929803933]
We propose a novel transfer learning framework which updates only $0.3%$ of model parameters to learn style specific attributes for response generation.
We learn style specific attributes from the PERSONALITY-CAPTIONS dataset.
arXiv Detail & Related papers (2022-10-07T00:09:22Z) - A Generic Image Retrieval Method for Date Estimation of Historical
Document Collections [0.4588028371034407]
This paper presents a robust date estimation system based in a retrieval approach that generalizes well in front of heterogeneous collections.
We use a ranking loss function named smooth-nDCG to train a Convolutional Neural Network that learns an ordination of documents for each problem.
arXiv Detail & Related papers (2022-04-08T12:30:39Z) - Unsupervised Part Discovery from Contrastive Reconstruction [90.88501867321573]
The goal of self-supervised visual representation learning is to learn strong, transferable image representations.
We propose an unsupervised approach to object part discovery and segmentation.
Our method yields semantic parts consistent across fine-grained but visually distinct categories.
arXiv Detail & Related papers (2021-11-11T17:59:42Z) - Learning Co-segmentation by Segment Swapping for Retrieval and Discovery [67.6609943904996]
The goal of this work is to efficiently identify visually similar patterns from a pair of images.
We generate synthetic training pairs by selecting object segments in an image and copy-pasting them into another image.
We show our approach provides clear improvements for artwork details retrieval on the Brueghel dataset.
arXiv Detail & Related papers (2021-10-29T16:51:16Z) - Unsupervised Learning of 3D Object Categories from Videos in the Wild [75.09720013151247]
We focus on learning a model from multiple views of a large collection of object instances.
We propose a new neural network design, called warp-conditioned ray embedding (WCR), which significantly improves reconstruction.
Our evaluation demonstrates performance improvements over several deep monocular reconstruction baselines on existing benchmarks.
arXiv Detail & Related papers (2021-03-30T17:57:01Z) - KaoKore: A Pre-modern Japanese Art Facial Expression Dataset [8.987910033541239]
We propose a new dataset KaoKore which consists of faces extracted from pre-modern Japanese artwork.
We demonstrate its value as both a dataset for image classification as well as a creative and artistic dataset, which we explore using generative models.
arXiv Detail & Related papers (2020-02-20T07:22:13Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.