Image Collation: Matching illustrations in manuscripts
- URL: http://arxiv.org/abs/2108.08109v1
- Date: Wed, 18 Aug 2021 12:12:14 GMT
- Title: Image Collation: Matching illustrations in manuscripts
- Authors: Ryad Kaoua, Xi Shen, Alexandra Durr, Stavros Lazaris, David Picard, Mathieu Aubry
- Abstract summary: We introduce the task of illustration collation and a large annotated public dataset to evaluate solutions.
We analyze state-of-the-art similarity measures for this task and show that they succeed in simple cases but struggle for large manuscripts.
We show clear evidence that significant performance boosts can be expected by exploiting cycle-consistent correspondences.
- Score: 76.21388548732284
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Illustrations are an essential transmission instrument. For a historian, the
first step in studying their evolution in a corpus of similar manuscripts is to
identify which ones correspond to each other. This image collation task is
daunting for manuscripts that are separated by many lost copies, spread over
centuries, may have been completely reorganized and greatly modified
to adapt to novel knowledge or beliefs, and include hundreds of illustrations.
Our contributions in this paper are threefold. First, we introduce the task of
illustration collation and a large annotated public dataset to evaluate
solutions, including 6 manuscripts of 2 different texts with more than 2,000
illustrations and 1,200 annotated correspondences. Second, we analyze
state-of-the-art similarity measures for this task and show that they succeed in simple
cases but struggle for large manuscripts when the illustrations have undergone
very significant changes and are discriminated only by fine details. Finally,
we show clear evidence that significant performance boosts can be expected by
exploiting cycle-consistent correspondences. Our code and data are available at
http://imagine.enpc.fr/~shenx/ImageCollation.
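The abstract does not spell out how cycle-consistent correspondences are computed. As a rough illustration only, and not the authors' method, the simplest form of cycle consistency between two manuscripts A and B keeps a pair only when the best match from A to B and the best match from B back to A agree. The function name, the similarity matrix, and its values below are hypothetical.

```python
import numpy as np

def cycle_consistent_matches(sim_ab):
    """Keep pairs (i, j) where j is the best match of i in B
    and i is the best match of j in A (a two-step cycle A -> B -> A)."""
    best_b_for_a = sim_ab.argmax(axis=1)  # best column for each row
    best_a_for_b = sim_ab.argmax(axis=0)  # best row for each column
    return [
        (i, int(j))
        for i, j in enumerate(best_b_for_a)
        if best_a_for_b[j] == i
    ]

# Hypothetical similarity scores: rows are illustrations of manuscript A,
# columns are illustrations of manuscript B.
sim_ab = np.array([
    [0.9, 0.2, 0.1],
    [0.3, 0.8, 0.4],
    [0.7, 0.1, 0.2],
])

matches = cycle_consistent_matches(sim_ab)
print(matches)  # [(0, 0), (1, 1)]
```

Illustration 2 of A prefers illustration 0 of B, but that illustration prefers illustration 0 of A, so the cycle does not close and the pair is discarded. Any similarity measure could supply the scores; the filter only uses their ranking.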
Related papers
- Counterfactual Image Editing [54.21104691749547]
Counterfactual image editing is an important task in generative AI, which asks how an image would look if certain features were different.
We formalize the counterfactual image editing task using formal language, modeling the causal relationships between latent generative factors and images.
We develop an efficient algorithm to generate counterfactual images by leveraging neural causal models.
arXiv Detail & Related papers (2024-02-07T20:55:39Z)
- Cones 2: Customizable Image Synthesis with Multiple Subjects [50.54010141032032]
We study how to efficiently represent a particular subject as well as how to appropriately compose different subjects.
By rectifying the activations in the cross-attention map, the layout appoints and separates the location of different subjects in the image.
arXiv Detail & Related papers (2023-05-30T18:00:06Z)
- NewsStories: Illustrating articles with visual summaries [49.924916589209374]
We introduce a large-scale multimodal dataset containing over 31M articles, 22M images and 1M videos.
We show that state-of-the-art image-text alignment methods are not robust to longer narratives with multiple images.
We introduce an intuitive baseline that outperforms these methods on zero-shot image-set retrieval by 10% on the GoodNews dataset.
arXiv Detail & Related papers (2022-07-26T17:34:11Z)
- Shrinking the Semantic Gap: Spatial Pooling of Local Moment Invariants for Copy-Move Forgery Detection [7.460203098159187]
Copy-move forgery is a manipulation of copying and pasting specific patches from and to an image, with potentially illegal or unethical uses.
Recent advances in the forensic methods for copy-move forgery have shown increasing success in detection accuracy and robustness.
For images with high self-similarity or strong signal corruption, the existing algorithms often exhibit inefficient processes and unreliable results.
arXiv Detail & Related papers (2022-07-19T09:11:43Z)
- Neural Graph Matching for Modification Similarity Applied to Electronic Document Comparison [0.0]
Document comparison is a common task in the legal and financial industries.
In this paper, we present a novel neural graph matching approach applied to document comparison.
arXiv Detail & Related papers (2022-04-12T02:37:54Z)
- Learning Co-segmentation by Segment Swapping for Retrieval and Discovery [67.6609943904996]
The goal of this work is to efficiently identify visually similar patterns from a pair of images.
We generate synthetic training pairs by selecting object segments in an image and copy-pasting them into another image.
We show our approach provides clear improvements for artwork details retrieval on the Brueghel dataset.
arXiv Detail & Related papers (2021-10-29T16:51:16Z)
- Spectral Graph-based Features for Recognition of Handwritten Characters: A Case Study on Handwritten Devanagari Numerals [0.0]
We propose an approach that exploits the robust graph representation and spectral graph embedding concept to represent handwritten characters.
To corroborate the efficacy of the proposed method, extensive experiments were carried out on the standard handwritten numeral dataset of the Computer Vision and Pattern Recognition Unit, Indian Statistical Institute, Kolkata.
arXiv Detail & Related papers (2020-07-07T08:40:08Z)
- IMRAM: Iterative Matching with Recurrent Attention Memory for Cross-Modal Image-Text Retrieval [105.77562776008459]
Existing methods leverage the attention mechanism to explore such correspondence in a fine-grained manner.
It may be difficult to optimally capture such sophisticated correspondences in existing methods.
We propose an Iterative Matching with Recurrent Attention Memory (IMRAM) method, in which correspondences are captured with multiple steps of alignments.
arXiv Detail & Related papers (2020-03-08T12:24:41Z)
- Deep Multimodal Image-Text Embeddings for Automatic Cross-Media Retrieval [0.0]
We introduce an end-to-end deep multimodal convolutional-recurrent network for learning both vision and language representations simultaneously.
The model learns which pairs are a match (positive) and which ones are a mismatch (negative) using a hinge-based triplet ranking.
arXiv Detail & Related papers (2020-02-23T23:58:04Z)
This list is automatically generated from the titles and abstracts of the papers on this site.