Feature Mixing for Writer Retrieval and Identification on Papyri Fragments
- URL: http://arxiv.org/abs/2306.12939v1
- Date: Thu, 22 Jun 2023 14:55:01 GMT
- Title: Feature Mixing for Writer Retrieval and Identification on Papyri Fragments
- Authors: Marco Peer and Robert Sablatnig
- Abstract summary: This paper proposes a deep-learning-based approach to writer retrieval and identification for papyri.
We present a novel neural network architecture that combines a residual backbone with a feature mixing stage to improve retrieval performance.
- Score: 0.7614628596146599
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: This paper proposes a deep-learning-based approach to writer retrieval and
identification for papyri, with a focus on identifying fragments associated
with a specific writer and those corresponding to the same image. We present a
novel neural network architecture that combines a residual backbone with a
feature mixing stage to improve retrieval performance, and the final descriptor
is derived from a projection layer. The methodology is evaluated on two
benchmarks: PapyRow, where we achieve a mAP of 26.6 % and 24.9 % on writer and
page retrieval, and HisFragIR20, showing state-of-the-art performance (44.0 %
and 29.3 % mAP). Furthermore, our network has an accuracy of 28.7 % for writer
identification. Additionally, we conduct experiments on the influence of two
binarization techniques on fragments and show that binarizing does not enhance
performance. Our code and models are available to the community.
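The abstract reports retrieval quality as mean average precision (mAP). As a minimal sketch of how such a score is commonly computed for writer or page retrieval, the pure-Python example below ranks every other descriptor by cosine distance to each query and averages the per-query average precision. The leave-one-out protocol, the cosine distance, and all function names here are illustrative assumptions, not the authors' evaluation code.

```python
import math


def cosine_distance(a, b):
    """Cosine distance between two non-zero descriptor vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


def average_precision(ranked_labels, query_label):
    """Average precision of one ranked list; None if nothing is relevant."""
    hits = 0
    precision_sum = 0.0
    for rank, label in enumerate(ranked_labels, start=1):
        if label == query_label:
            hits += 1
            precision_sum += hits / rank  # precision at this relevant rank
    return precision_sum / hits if hits else None


def mean_average_precision(descriptors, labels):
    """Leave-one-out mAP (assumed protocol): each item queries all others."""
    aps = []
    for q in range(len(descriptors)):
        others = [i for i in range(len(descriptors)) if i != q]
        others.sort(key=lambda i: cosine_distance(descriptors[q], descriptors[i]))
        ap = average_precision([labels[i] for i in others], labels[q])
        if ap is not None:  # skip queries with no other relevant item
            aps.append(ap)
    return sum(aps) / len(aps)


# Toy example: four hypothetical fragment descriptors from two writers.
toy_descriptors = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
toy_writers = ["A", "A", "B", "B"]
toy_map = mean_average_precision(toy_descriptors, toy_writers)
```

The same function can score page retrieval by passing page identifiers instead of writer identifiers as `labels`.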
Related papers
- An Efficient MLP-based Point-guided Segmentation Network for Ore Images
with Ambiguous Boundary [12.258442550351178]
This paper proposes a lightweight framework based on Multi-Layer Perceptron (MLP), which focuses on solving the problem of edge burring.
Our approach achieves a remarkable processing speed of over 27 frames per second with a model size of only 73 MB.
Our method delivers a consistently high level of accuracy, with impressive performance scores of 60.4 and 48.9 in $AP_{50}^{box}$ and $AP_{50}^{mask}$, respectively.
arXiv Detail & Related papers (2024-02-27T10:09:29Z) - Detecting and recognizing characters in Greek papyri with YOLOv8, DeiT
and SimCLR [9.7902367664742]
This paper discusses our submission to the ICDAR 2023 Competition on Detection and Recognition of Greek Letters on Papyri.
We used an ensemble of YOLOv8 models to detect and classify individual characters and employed two different approaches for refining the character predictions.
Our submission won the recognition challenge with a mean average precision (mAP) of 42.2%, and was runner-up in the detection challenge with a mAP of 51.4%.
arXiv Detail & Related papers (2024-01-23T06:08:00Z) - Towards Writer Retrieval for Historical Datasets [0.6445605125467572]
The paper presents an unsupervised approach for writer retrieval based on clustering SIFT descriptors detected at keypoint locations.
It uses a residual network followed by the proposed NetRVLAD, an encoding layer with reduced complexity.
We show that our approach achieves comparable performance on a modern dataset as well.
arXiv Detail & Related papers (2023-05-09T11:44:44Z) - Pattern Spotting and Image Retrieval in Historical Documents using Deep
Hashing [60.67014034968582]
This paper presents a deep learning approach for image retrieval and pattern spotting in digital collections of historical documents.
Deep learning models are used for feature extraction, considering two distinct variants, which provide either real-valued or binary code representations.
The proposed approach also reduces the search time by up to 200x and the storage cost up to 6,000x when compared to related works.
arXiv Detail & Related papers (2022-08-04T01:39:37Z) - Spatio-temporal Relation Modeling for Few-shot Action Recognition [100.3999454780478]
We propose a few-shot action recognition framework, STRM, which enhances class-specific feature discriminability while simultaneously learning higher-order temporal representations.
Our approach achieves an absolute gain of 3.5% in classification accuracy, as compared to the best existing method in the literature.
arXiv Detail & Related papers (2021-12-09T18:59:14Z) - Face Trees for Expression Recognition [13.099925083569333]
We propose an end-to-end architecture for facial expression recognition.
The proposed architecture incorporates two main streams, one focusing on landmark positions to learn the structure of the face, the other focuses on patches around the landmarks to learn texture information.
We conduct extensive experiments on two large-scale publicly available facial expression datasets, AffectNet and FER2013, to evaluate the efficacy of our approach.
arXiv Detail & Related papers (2021-12-05T06:35:12Z) - G-DetKD: Towards General Distillation Framework for Object Detectors via
Contrastive and Semantic-guided Feature Imitation [49.421099172544196]
We propose a novel semantic-guided feature imitation technique, which automatically performs soft matching between feature pairs across all pyramid levels.
We also introduce contrastive distillation to effectively capture the information encoded in the relationship between different feature regions.
Our method consistently outperforms the existing detection KD techniques, and works when components in the framework are used separately and in conjunction.
arXiv Detail & Related papers (2021-08-17T07:44:27Z) - Cross-domain Speech Recognition with Unsupervised Character-level
Distribution Matching [60.8427677151492]
We propose CMatch, a Character-level distribution matching method to perform fine-grained adaptation between each character in two domains.
Experiments on the Libri-Adapt dataset show that our proposed approach achieves 14.39% and 16.50% relative Word Error Rate (WER) reduction on both cross-device and cross-environment ASR.
arXiv Detail & Related papers (2021-04-15T14:36:54Z) - A Replication Study of Dense Passage Retriever [32.192420072129636]
We study the dense passage retriever (DPR) technique proposed by Karpukhin et al. (2020) for end-to-end open-domain question answering.
We present a replication study of this work, starting with model checkpoints provided by the authors.
We are able to improve end-to-end question answering effectiveness using exactly the same models as in the original work.
arXiv Detail & Related papers (2021-04-12T18:10:39Z) - Corner Proposal Network for Anchor-free, Two-stage Object Detection [174.59360147041673]
The goal of object detection is to determine the class and location of objects in an image.
This paper proposes a novel anchor-free, two-stage framework which first extracts a number of object proposals.
We demonstrate that these two stages are effective solutions for improving recall and precision.
arXiv Detail & Related papers (2020-07-27T19:04:57Z) - Weakly-Supervised Semantic Segmentation by Iterative Affinity Learning [86.45526827323954]
Weakly-supervised semantic segmentation is a challenging task as no pixel-wise label information is provided for training.
We propose an iterative algorithm to learn such pairwise relations.
We show that the proposed algorithm performs favorably against the state-of-the-art methods.
arXiv Detail & Related papers (2020-02-19T10:32:03Z)
This list is automatically generated from the titles and abstracts of the papers on this site.