Writer Identification and Writer Retrieval Based on NetVLAD with Re-ranking
- URL: http://arxiv.org/abs/2012.06186v3
- Date: Mon, 22 Feb 2021 18:27:50 GMT
- Authors: Shervin Rasoulzadeh, Bagher Babaali
- Abstract summary: Writer identification and writer retrieval are considered challenging problems in the document analysis and recognition field.
A novel pipeline is proposed that employs a unified neural network architecture with ResNet-20 as a feature extractor and an integrated NetVLAD layer.
A novel re-ranking strategy is introduced for the task of identification and retrieval based on $k$-reciprocal nearest neighbors.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: This paper addresses writer identification and writer retrieval, which
are considered challenging problems in the document analysis and recognition
field. A novel pipeline is proposed that employs a unified neural network
architecture consisting of ResNet-20 as a feature extractor and an integrated
NetVLAD layer, inspired by the vector of locally aggregated descriptors (VLAD),
at the head of the feature extractor. With this architecture defined, the
triplet semi-hard loss function is used to directly learn an embedding for
individual input image patches. Generalized max-pooling is then employed to
aggregate the embedded descriptors of each handwritten image. In addition, a
novel re-ranking strategy based on $k$-reciprocal nearest neighbors is
introduced for identification and retrieval, and it is shown that the pipeline
benefits considerably from this step. The pipeline is evaluated on three
publicly available datasets: ICDAR 2013, CVL, and KHATT. Results indicate that
while we perform comparably to the state of the art on KHATT, our writer
identification and writer retrieval pipeline achieves superior performance on
the ICDAR 2013 and CVL datasets in terms of mAP.
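The $k$-reciprocal idea mentioned in the abstract can be illustrated with a small sketch. This is not the authors' re-ranking strategy, whose details the abstract does not give. It only shows the general notion that two items are $k$-reciprocal neighbors when each appears in the other's $k$-nearest-neighbor list. The function names, the Jaccard blend, and the `weight` parameter are assumptions made for illustration.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def knn(index, vectors, k):
    """Indices of the k nearest neighbours of vectors[index], excluding itself."""
    others = [j for j in range(len(vectors)) if j != index]
    others.sort(key=lambda j: euclidean(vectors[index], vectors[j]))
    return others[:k]


def k_reciprocal(index, vectors, k):
    """Neighbours j of `index` for which `index` is also among j's k nearest."""
    return {j for j in knn(index, vectors, k) if index in knn(j, vectors, k)}


def jaccard_distance(a, b):
    """1 - |a & b| / |a | b|; defined as 1.0 when both sets are empty."""
    union = a | b
    if not union:
        return 1.0
    return 1.0 - len(a & b) / len(union)


def rerank(query, vectors, k=2, weight=0.5):
    """Rank all other items for `query`.

    Hypothetical scoring (an assumption, not the paper's formula): a blend of
    the Jaccard distance between k-reciprocal sets and the Euclidean distance
    normalised by the largest distance from the query.
    """
    q_set = k_reciprocal(query, vectors, k) | {query}
    candidates = [j for j in range(len(vectors)) if j != query]
    max_d = max(euclidean(vectors[query], vectors[j]) for j in candidates) or 1.0

    def score(j):
        j_set = k_reciprocal(j, vectors, k) | {j}
        d = euclidean(vectors[query], vectors[j]) / max_d
        return weight * jaccard_distance(q_set, j_set) + (1 - weight) * d

    return sorted(candidates, key=score)


# Toy descriptors: two well-separated groups standing in for two writers.
descriptors = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11)]
print(k_reciprocal(0, descriptors, 2))
print(rerank(0, descriptors))
```

In this toy setup, items that share most of their reciprocal neighbors with the query are pulled toward the top of the ranking even when their raw distances are similar. A practical system would precompute the neighbor lists rather than recomputing them for every comparison as this sketch does.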
Related papers
- RoIPoly: Vectorized Building Outline Extraction Using Vertex and Logit Embeddings [5.093758132026397]
We propose a novel query-based approach for extracting building outlines from aerial or satellite imagery.
We formulate each polygon as a query and constrain the query attention on the most relevant regions of a potential building.
We evaluate our method on the vectorized building outline extraction dataset CrowdAI and the 2D floorplan reconstruction dataset Structured3D.
arXiv Detail & Related papers (2024-07-20T16:12:51Z)
- SegVG: Transferring Object Bounding Box to Segmentation for Visual Grounding [56.079013202051094]
We present SegVG, a novel method that transfers box-level annotations as signals to provide additional pixel-level supervision for Visual Grounding.
This approach allows us to iteratively exploit the annotation as signals for both box-level regression and pixel-level segmentation.
arXiv Detail & Related papers (2024-07-03T15:30:45Z)
- Offline Writer Identification Using Convolutional Neural Network Activation Features [6.589323210821262]
Convolutional neural networks (CNNs) have recently become the state-of-the-art tool for large-scale image classification.
In this work we propose the use of activation features from CNNs as local descriptors for writer identification.
We evaluate our method on two publicly available datasets: the ICDAR 2013 benchmark database and the CVL dataset.
arXiv Detail & Related papers (2024-02-26T21:16:14Z)
- PointHR: Exploring High-Resolution Architectures for 3D Point Cloud Segmentation [77.44144260601182]
We explore high-resolution architectures for 3D point cloud segmentation.
We propose a unified pipeline named PointHR, which includes a knn-based sequence operator for feature extraction and a differential resampling operator.
To evaluate these architectures for dense point cloud analysis, we conduct thorough experiments using S3DIS and ScanNetV2 datasets.
arXiv Detail & Related papers (2023-10-11T09:29:17Z)
- Efficient Match Pair Retrieval for Large-scale UAV Images via Graph Indexed Global Descriptor [9.402103660431791]
This paper proposes an efficient match pair retrieval method and implements an integrated workflow for parallel SfM reconstruction.
The proposed solution has been verified using three large-scale datasets.
arXiv Detail & Related papers (2023-07-10T12:41:55Z)
- Towards Writer Retrieval for Historical Datasets [0.6445605125467572]
An unsupervised approach for writer retrieval is presented, based on clustering SIFT descriptors detected at keypoint locations.
The method uses a residual network followed by our proposed NetRVLAD, an encoding layer with reduced complexity.
We show that our approach achieves comparable performance on a modern dataset as well.
arXiv Detail & Related papers (2023-05-09T11:44:44Z)
- Learning Local Displacements for Point Cloud Completion [93.54286830844134]
We propose a novel approach aimed at object and semantic scene completion from a partial scan represented as a 3D point cloud.
Our architecture relies on three novel layers that are used successively within an encoder-decoder structure.
We evaluate both architectures on object and indoor scene completion tasks, achieving state-of-the-art performance.
arXiv Detail & Related papers (2022-03-30T18:31:37Z)
- Open-Set Recognition: A Good Closed-Set Classifier is All You Need [146.6814176602689]
We show that the ability of a classifier to make the 'none-of-above' decision is highly correlated with its accuracy on the closed-set classes.
We use this correlation to boost the performance of the cross-entropy OSR 'baseline' by improving its closed-set accuracy.
We also construct new benchmarks which better respect the task of detecting semantic novelty.
arXiv Detail & Related papers (2021-10-12T17:58:59Z)
- Deep Structured Instance Graph for Distilling Object Detectors [82.16270736573176]
We present a simple knowledge structure to exploit and encode information inside the detection system to facilitate detector knowledge distillation.
We achieve new state-of-the-art results on the challenging COCO object detection task with diverse student-teacher pairs on both one- and two-stage detectors.
arXiv Detail & Related papers (2021-09-27T08:26:00Z)
- MD-CSDNetwork: Multi-Domain Cross Stitched Network for Deepfake Detection [80.83725644958633]
Current deepfake generation methods leave discriminative artifacts in the frequency spectrum of fake images and videos.
We present a novel approach, termed MD-CSDNetwork, that combines features in the spatial and frequency domains to mine a shared discriminative representation.
arXiv Detail & Related papers (2021-09-15T14:11:53Z)
- Re-ranking for Writer Identification and Writer Retrieval [8.53463698903858]
We show that a re-ranking step based on k-reciprocal nearest neighbor relationships is advantageous for writer identification.
We use these reciprocal relationships in two ways: encode them into new vectors, as originally proposed, or integrate them in terms of query-expansion.
arXiv Detail & Related papers (2020-07-14T15:21:17Z)
This list is automatically generated from the titles and abstracts of the papers on this site.