Related papers: Writer Identification and Writer Retrieval Based on NetVLAD with Re-ranking

URL: http://arxiv.org/abs/2012.06186v3
Date: Mon, 22 Feb 2021 18:27:50 GMT
Title: Writer Identification and Writer Retrieval Based on NetVLAD with Re-ranking
Authors: Shervin Rasoulzadeh, Bagher Babaali
Abstract summary: Writer identification and writer retrieval are considered challenging problems in the document analysis and recognition field. A novel pipeline is proposed that employs a unified neural network architecture with ResNet-20 as the feature extractor. A novel re-ranking strategy based on $k$-reciprocal nearest neighbors is introduced for identification and retrieval.
Score: 0.0
License: http://creativecommons.org/licenses/by/4.0/
Abstract: This paper addresses writer identification and writer retrieval, which are considered challenging problems in the document analysis and recognition field. A novel pipeline is proposed that employs a unified neural network architecture consisting of ResNet-20 as a feature extractor and an integrated NetVLAD layer, inspired by the vector of locally aggregated descriptors (VLAD), at the head of the network. With this architecture defined, the triplet semi-hard loss function is used to directly learn an embedding for individual input image patches. Subsequently, generalized max-pooling is employed to aggregate the embedded descriptors of each handwritten image. In addition, a novel re-ranking strategy based on $k$-reciprocal nearest neighbors is introduced for identification and retrieval, and the pipeline is shown to benefit substantially from this step. Experimental evaluation is carried out on three publicly available datasets: ICDAR 2013, CVL, and KHATT. Results indicate that the pipeline performs comparably to the state of the art on KHATT and achieves superior performance on ICDAR 2013 and CVL in terms of mAP.
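The triplet semi-hard loss mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `semi_hard_triplet_loss`, the use of plain Euclidean distance, the default margin of 1.0, and the choice to skip anchor-positive pairs that have no semi-hard negative are all assumptions made for illustration. A negative is treated as semi-hard when it lies farther from the anchor than the positive, but still within the margin.

```python
import math


def semi_hard_triplet_loss(embeddings, labels, margin=1.0):
    """Mean triplet loss over anchor-positive pairs with a semi-hard negative.

    Assumption: pairs without any semi-hard negative are skipped, which is a
    simplification compared with common library implementations.
    """
    n = len(embeddings)
    # Pairwise Euclidean distances between all embedded patches.
    dist = [[math.dist(embeddings[i], embeddings[j]) for j in range(n)]
            for i in range(n)]
    losses = []
    for a in range(n):
        for p in range(n):
            if p == a or labels[p] != labels[a]:
                continue
            d_ap = dist[a][p]
            # Semi-hard negatives: d(a, p) < d(a, n) < d(a, p) + margin.
            candidates = [dist[a][j] for j in range(n)
                          if labels[j] != labels[a]
                          and d_ap < dist[a][j] < d_ap + margin]
            if not candidates:
                continue
            d_an = min(candidates)  # closest semi-hard negative
            losses.append(max(0.0, d_ap - d_an + margin))
    return sum(losses) / len(losses) if losses else 0.0


# Toy example: two writers with two one-dimensional patch embeddings each.
toy_embeddings = [(0.0,), (1.0,), (1.5,), (5.0,)]
toy_labels = [0, 0, 1, 1]
toy_loss = semi_hard_triplet_loss(toy_embeddings, toy_labels, margin=1.0)
```

In the toy example only two anchor-positive pairs have a semi-hard negative, so the mean is taken over those two triplets.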

Related papers

RoIPoly: Vectorized Building Outline Extraction Using Vertex and Logit Embeddings [5.093758132026397]
We propose a novel query-based approach for extracting building outlines from aerial or satellite imagery. We formulate each polygon as a query and constrain the query attention on the most relevant regions of a potential building. We evaluate our method on the vectorized building outline extraction dataset CrowdAI and the 2D floorplan reconstruction dataset Structured3D.
arXiv Detail & Related papers (2024-07-20T16:12:51Z)
Offline Writer Identification Using Convolutional Neural Network Activation Features [6.589323210821262]
Convolutional neural networks (CNNs) have recently become the state-of-the-art tool for large-scale image classification. In this work we propose the use of activation features from CNNs as local descriptors for writer identification. We evaluate our method on two publicly available datasets: the ICDAR 2013 benchmark database and the CVL dataset.
arXiv Detail & Related papers (2024-02-26T21:16:14Z)
PointHR: Exploring High-Resolution Architectures for 3D Point Cloud Segmentation [77.44144260601182]
We explore high-resolution architectures for 3D point cloud segmentation. We propose a unified pipeline named PointHR, which includes a knn-based sequence operator for feature extraction and a differential resampling operator. To evaluate these architectures for dense point cloud analysis, we conduct thorough experiments using S3DIS and ScanNetV2 datasets.
arXiv Detail & Related papers (2023-10-11T09:29:17Z)
CoVR-2: Automatic Data Construction for Composed Video Retrieval [59.854331104466254]
Composed Image Retrieval (CoIR) has recently gained popularity as a task that considers both text and image queries together. We propose a scalable automatic dataset creation methodology that generates triplets given video-caption pairs. We also expand the scope of the task to include composed video retrieval (CoVR).
arXiv Detail & Related papers (2023-08-28T17:55:33Z)
Efficient Match Pair Retrieval for Large-scale UAV Images via Graph Indexed Global Descriptor [9.402103660431791]
This paper proposes an efficient match pair retrieval method and implements an integrated workflow for parallel SfM reconstruction. The proposed solution has been verified using three large-scale datasets.
arXiv Detail & Related papers (2023-07-10T12:41:55Z)
Towards Writer Retrieval for Historical Datasets [0.6445605125467572]
An unsupervised approach for writer retrieval based on clustering SIFT descriptors detected at keypoint locations. A residual network is followed by the proposed NetRVLAD, an encoding layer with reduced complexity. We show that our approach achieves comparable performance on a modern dataset as well.
arXiv Detail & Related papers (2023-05-09T11:44:44Z)
Learning Local Displacements for Point Cloud Completion [93.54286830844134]
We propose a novel approach aimed at object and semantic scene completion from a partial scan represented as a 3D point cloud. Our architecture relies on three novel layers that are used successively within an encoder-decoder structure. We evaluate both architectures on object and indoor scene completion tasks, achieving state-of-the-art performance.
arXiv Detail & Related papers (2022-03-30T18:31:37Z)
Open-Set Recognition: A Good Closed-Set Classifier is All You Need [146.6814176602689]
We show that the ability of a classifier to make the 'none-of-above' decision is highly correlated with its accuracy on the closed-set classes. We use this correlation to boost the performance of the cross-entropy OSR 'baseline' by improving its closed-set accuracy. We also construct new benchmarks which better respect the task of detecting semantic novelty.
arXiv Detail & Related papers (2021-10-12T17:58:59Z)
Deep Structured Instance Graph for Distilling Object Detectors [82.16270736573176]
We present a simple knowledge structure to exploit and encode information inside the detection system to facilitate detector knowledge distillation. We achieve new state-of-the-art results on the challenging COCO object detection task with diverse student-teacher pairs on both one- and two-stage detectors.
arXiv Detail & Related papers (2021-09-27T08:26:00Z)
MD-CSDNetwork: Multi-Domain Cross Stitched Network for Deepfake Detection [80.83725644958633]
Current deepfake generation methods leave discriminative artifacts in the frequency spectrum of fake images and videos. We present a novel approach, termed MD-CSDNetwork, for combining the features in the spatial and frequency domains to mine a shared discriminative representation.
arXiv Detail & Related papers (2021-09-15T14:11:53Z)
Re-ranking for Writer Identification and Writer Retrieval [8.53463698903858]
We show that a re-ranking step based on k-reciprocal nearest neighbor relationships is advantageous for writer identification. We use these reciprocal relationships in two ways: encode them into new vectors, as originally proposed, or integrate them in terms of query-expansion.
arXiv Detail & Related papers (2020-07-14T15:21:17Z)
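
The k-reciprocal nearest neighbor relationship that the last entry above builds on can be sketched as follows. This is an illustrative sketch only: the function names `k_nearest` and `k_reciprocal_neighbors` are hypothetical, Euclidean distance is an assumption, and the code computes only the reciprocal neighbor sets, not the re-ranking or query-expansion steps of any of the listed papers. Two items are k-reciprocal neighbors when each appears among the other's k nearest neighbors.

```python
import math


def k_nearest(dist_row, i, k):
    """Indices of the k items closest to item i, excluding i itself."""
    others = [j for j in range(len(dist_row)) if j != i]
    return sorted(others, key=lambda j: dist_row[j])[:k]


def k_reciprocal_neighbors(descriptors, k):
    """For each descriptor, list the items that are mutual k-nearest neighbors."""
    n = len(descriptors)
    dist = [[math.dist(a, b) for b in descriptors] for a in descriptors]
    knn = [set(k_nearest(dist[i], i, k)) for i in range(n)]
    # j is kept for i only if i is also among the k nearest neighbors of j.
    return [sorted(j for j in knn[i] if i in knn[j]) for i in range(n)]


# Toy example with one-dimensional descriptors.
toy_descriptors = [(0.0,), (1.0,), (3.0,), (10.0,)]
reciprocal = k_reciprocal_neighbors(toy_descriptors, k=1)
```

In the toy example, items 0 and 1 are each other's nearest neighbor, while items 2 and 3 have no reciprocal neighbor for k = 1.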

This list is automatically generated from the titles and abstracts of the papers on this site.