Exploiting Multi-Scale Fusion, Spatial Attention and Patch Interaction
Techniques for Text-Independent Writer Identification
- URL: http://arxiv.org/abs/2111.10605v1
- Date: Sat, 20 Nov 2021 14:41:36 GMT
- Title: Exploiting Multi-Scale Fusion, Spatial Attention and Patch Interaction
Techniques for Text-Independent Writer Identification
- Authors: Abhishek Srivastava, Sukalpa Chanda, Umapada Pal
- Abstract summary: This paper proposes three deep learning techniques (a spatial attention mechanism, multi-scale feature fusion, and a patch-based CNN) to capture the differences between writers' handwriting.
The proposed methods outperform various state-of-the-art methods for word-level and page-level writer identification on three publicly available datasets.
- Score: 15.010153819096056
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Text-independent writer identification is a challenging problem that
differentiates between different handwriting styles to decide the author of the
handwritten text. Earlier writer identification relied on handcrafted features
to reveal differences between writers. With the advent of convolutional
neural networks, deep learning-based methods have evolved. In this
paper, three different deep learning techniques - spatial attention mechanism,
multi-scale feature fusion and patch-based CNN - are proposed to effectively
capture the difference between each writer's handwriting. Our methods are based
on the hypothesis that handwritten text images have specific spatial regions
which are more unique to a writer's style, multi-scale features propagate
characteristic features with respect to individual writers and patch-based
features give more general and robust representations that help to
discriminate handwriting from different writers. The proposed methods
outperform various state-of-the-art methodologies for word-level and page-level
writer identification on three publicly available datasets - CVL,
Firemaker and CERUG-EN - and give comparable performance on the IAM
dataset.
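The three ideas named in the abstract can be illustrated with a toy sketch in plain Python. This is not the authors' architecture: the function names (`extract_patches`, `spatial_attention`, `multi_scale_fusion`) are hypothetical, a single-channel grid of numbers stands in for a CNN feature map, the attention scores are taken directly from the activations instead of being learned, and average pooling at two window sizes stands in for features from different network depths.

```python
import math


def extract_patches(image, size, stride):
    """Cut a 2D grid (list of rows) into square patches of side `size`."""
    rows, cols = len(image), len(image[0])
    return [
        [row[left:left + size] for row in image[top:top + size]]
        for top in range(0, rows - size + 1, stride)
        for left in range(0, cols - size + 1, stride)
    ]


def spatial_attention(feature_map):
    """Softmax over all spatial positions; returns (weight grid, pooled value).

    Assumption: raw activations act as attention scores. A trained model
    would compute the scores with learned layers instead.
    """
    flat = [value for row in feature_map for value in row]
    peak = max(flat)  # subtract the maximum for numerical stability
    exps = [math.exp(value - peak) for value in flat]
    total = sum(exps)
    weights = [e / total for e in exps]
    pooled = sum(w * v for w, v in zip(weights, flat))
    cols = len(feature_map[0])
    grid = [weights[i:i + cols] for i in range(0, len(weights), cols)]
    return grid, pooled


def average_pool(feature_map, window):
    """Non-overlapping average pooling with a square window."""
    rows, cols = len(feature_map), len(feature_map[0])
    pooled = []
    for top in range(0, rows - window + 1, window):
        pooled_row = []
        for left in range(0, cols - window + 1, window):
            block = [
                feature_map[r][c]
                for r in range(top, top + window)
                for c in range(left, left + window)
            ]
            pooled_row.append(sum(block) / len(block))
        pooled.append(pooled_row)
    return pooled


def multi_scale_fusion(feature_map, windows=(1, 2)):
    """Concatenate the flattened pooled maps from every scale into one vector."""
    fused = []
    for window in windows:
        for row in average_pool(feature_map, window):
            fused.extend(row)
    return fused


# Toy 4x4 "image" with values 0..15.
image = [[float(r * 4 + c) for c in range(4)] for r in range(4)]
patches = extract_patches(image, size=2, stride=2)
weights, pooled = spatial_attention(patches[0])
fused = multi_scale_fusion(image, windows=(1, 2))
print(len(patches), len(fused))  # prints: 4 20
```

In this sketch the attention weights concentrate on the largest activation in a patch, which mirrors the hypothesis that some spatial regions are more writer-specific than others, while the fused vector keeps both the fine (window 1) and coarse (window 2) views of the same map.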
Related papers
- Writer Retrieval and Writer Identification in Greek Papyri [4.44566870214758]
Writer identification refers to the classification of known writers while writer retrieval seeks to find the writer by means of image similarity in a dataset of images.
While automatic writer identification/retrieval methods already provide promising results for many historical document types, papyri data is very challenging due to the fiber structures and severe artifacts.
We investigate several methods and show that a good binarization is key to an improved writer identification in papyri writings.
arXiv Detail & Related papers (2022-12-15T08:42:25Z)
- PART: Pre-trained Authorship Representation Transformer [64.78260098263489]
Authors writing documents imprint identifying information within their texts: vocabulary, registry, punctuation, misspellings, or even emoji usage.
Previous works use hand-crafted features or classification tasks to train their authorship models, leading to poor performance on out-of-domain authors.
We propose a contrastively trained model fit to learn authorship embeddings instead of semantics.
arXiv Detail & Related papers (2022-09-30T11:08:39Z)
- Toward Understanding WordArt: Corner-Guided Transformer for Scene Text Recognition [63.6608759501803]
We propose to recognize artistic text at three levels.
Firstly, corner points are applied to guide the extraction of local features inside characters, considering the robustness of corner structures to appearance and shape.
Secondly, we design a character contrastive loss to model the character-level feature, improving the feature representation for character classification.
Thirdly, we utilize Transformer to learn the global feature on image-level and model the global relationship of the corner points.
arXiv Detail & Related papers (2022-07-31T14:11:05Z)
- Letter-level Online Writer Identification [86.13203975836556]
We focus on a novel problem, letter-level online writer-id, which requires only a few trajectories of written letters as identification cues.
A main challenge is that a person often writes a letter in different styles from time to time.
We refer to this problem as the variance of online writing styles (Var-O-Styles).
arXiv Detail & Related papers (2021-12-06T07:21:53Z)
- CRIS: CLIP-Driven Referring Image Segmentation [71.56466057776086]
We propose an end-to-end CLIP-Driven Referring Image Segmentation framework (CRIS).
CRIS resorts to vision-language decoding and contrastive learning for achieving the text-to-pixel alignment.
Our proposed framework significantly outperforms the state-of-the-art performance without any post-processing.
arXiv Detail & Related papers (2021-11-30T07:29:08Z)
- SmartPatch: Improving Handwritten Word Imitation with Patch Discriminators [67.54204685189255]
We propose SmartPatch, a new technique increasing the performance of current state-of-the-art methods.
We combine the well-known patch loss with information gathered from the parallel trained handwritten text recognition system.
This leads to a more enhanced local discriminator and results in more realistic and higher-quality generated handwritten words.
arXiv Detail & Related papers (2021-05-21T18:34:21Z)
- Neural Text Generation with Part-of-Speech Guided Softmax [82.63394952538292]
We propose using linguistic annotation, i.e., part-of-speech (POS), to guide the text generation.
We show that our proposed methods can generate more diverse text while maintaining comparable quality.
arXiv Detail & Related papers (2021-05-08T08:53:16Z)
- MultiGBS: A multi-layer graph approach to biomedical summarization [6.11737116137921]
We propose a domain-specific method that models a document as a multi-layer graph to enable multiple features of the text to be processed at the same time.
The unsupervised method selects sentences from the multi-layer graph based on the MultiRank algorithm and the number of concepts.
The proposed MultiGBS algorithm employs UMLS and extracts the concepts and relationships using different tools such as SemRep, MetaMap, and OGER.
arXiv Detail & Related papers (2020-08-27T04:22:37Z)
- A Skip-connected Multi-column Network for Isolated Handwritten Bangla Character and Digit recognition [12.551285203114723]
We have proposed a non-explicit feature extraction method using a multi-scale multi-column skip convolutional neural network.
Our method is evaluated on four publicly available datasets of isolated handwritten Bangla characters and digits.
arXiv Detail & Related papers (2020-04-27T13:18:58Z)
This list is automatically generated from the titles and abstracts of the papers in this site.