Text-independent writer identification using convolutional neural
network
- URL: http://arxiv.org/abs/2009.04877v1
- Date: Thu, 10 Sep 2020 14:18:03 GMT
- Title: Text-independent writer identification using convolutional neural
network
- Authors: Hung Tuan Nguyen, Cuong Tuan Nguyen, Takeya Ino, Bipin Indurkhya,
Masaki Nakagawa
- Abstract summary: We propose an end-to-end deep-learning method for text-independent writer identification.
Our method achieved over 91.81% accuracy in classifying writers.
- Score: 8.526559246026162
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The text-independent approach to writer identification does not require the
writer to write some predetermined text. Previous research on text-independent
writer identification has been based on identifying writer-specific features
designed by experts. However, in the last decade, deep learning methods have
been successfully applied to learn features from data automatically. We propose
here an end-to-end deep-learning method for text-independent writer
identification that does not require prior identification of features. A
Convolutional Neural Network (CNN) is trained initially to extract local
features, which represent characteristics of individual handwriting in the
whole character images and their sub-regions. Randomly sampled tuples of images
from the training set are used to train the CNN and aggregate the extracted
local features of images from the tuples to form global features. For every
training epoch, the process of randomly sampling tuples is repeated, which is
equivalent to a large number of training patterns being prepared for training
the CNN for text-independent writer identification. We conducted experiments on
the JEITA-HP database of offline handwritten Japanese character patterns. With
200 characters, our method achieved an accuracy of 99.97% in classifying 100
writers. Even when using 50 characters for 100 writers or 100 characters for
400 writers, our method achieved accuracy levels of 92.80% and 93.82%,
respectively. We conducted further experiments on the Firemaker and IAM
databases of offline handwritten English text. Using only one page per writer
to train, our method achieved over 91.81% accuracy in classifying 900 writers.
Overall, we achieved a better performance than the previously published best
result based on handcrafted features and clustering algorithms, which
demonstrates the effectiveness of our method for handwritten English text as well.
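The tuple sampling and feature aggregation described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names (`sample_tuples`, `local_features`, `aggregate`), the toy data, and the choice of element-wise averaging as the aggregation rule are all assumptions made for illustration, and the CNN is replaced by a placeholder function. The sketch shows only why resampling tuples every epoch yields many distinct training patterns from a small set of images.

```python
import random
from math import comb


def sample_tuples(images_by_writer, tuple_size, tuples_per_writer, rng):
    """Draw fresh random tuples of images for every writer (one epoch)."""
    epoch = []
    for writer, images in images_by_writer.items():
        for _ in range(tuples_per_writer):
            # Sample without replacement so each tuple holds distinct images.
            epoch.append((writer, rng.sample(images, tuple_size)))
    rng.shuffle(epoch)
    return epoch


def local_features(image):
    """Hypothetical stand-in for the CNN: maps an image to a feature vector."""
    # In this toy example an "image" is already a short list of floats.
    return [float(v) for v in image]


def aggregate(local_vectors):
    """Average local feature vectors element-wise (assumed aggregation rule)."""
    n = len(local_vectors)
    return [sum(column) / n for column in zip(*local_vectors)]


# Toy data: 2 writers, 10 "images" each, 4 values per image.
rng = random.Random(0)
images_by_writer = {
    writer: [[rng.random() for _ in range(4)] for _ in range(10)]
    for writer in ("writer_a", "writer_b")
}

# Calling sample_tuples again for the next epoch gives a different set of tuples.
epoch = sample_tuples(images_by_writer, tuple_size=3, tuples_per_writer=5, rng=rng)
writer, images = epoch[0]
global_feature = aggregate([local_features(img) for img in images])

# Number of distinct unordered 3-image tuples available per writer.
distinct_tuples = comb(10, 3)
```

In this toy setting, 10 images per writer already allow 120 distinct 3-image tuples, which illustrates why repeating the random sampling at every epoch behaves like training on a much larger set of patterns.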
Related papers
- Copy Is All You Need [66.00852205068327]
We formulate text generation as progressively copying text segments from an existing text collection.
Our approach achieves better generation quality according to both automatic and human evaluations.
Our approach attains additional performance gains by simply scaling up to larger text collections.
arXiv Detail & Related papers (2023-07-13T05:03:26Z)
- Kurdish Handwritten Character Recognition using Deep Learning Techniques [26.23274417985375]
This paper attempts to design and develop a model that can recognize handwritten characters for Kurdish alphabets using deep learning techniques.
A comprehensive dataset was created for handwritten Kurdish characters, which contains more than 40 thousand images.
Testing yielded a 96% accuracy rate, and training yielded a 97% accuracy rate.
arXiv Detail & Related papers (2022-10-18T16:48:28Z)
- PART: Pre-trained Authorship Representation Transformer [64.78260098263489]
Authors writing documents imprint identifying information within their texts: vocabulary, registry, punctuation, misspellings, or even emoji usage.
Previous works use hand-crafted features or classification tasks to train their authorship models, leading to poor performance on out-of-domain authors.
We propose a contrastively trained model fit to learn authorship embeddings instead of semantics.
arXiv Detail & Related papers (2022-09-30T11:08:39Z)
- Writer Recognition Using Off-line Handwritten Single Block Characters [59.17685450892182]
We use personal identity numbers consisting of the six digits of the date of birth, DoB.
We evaluate two recognition approaches, one based on handcrafted features that compute directional measurements, and another based on deep features from a ResNet50 model.
Results show the presence of identity-related information in a piece of handwritten information as small as six digits with the DoB.
arXiv Detail & Related papers (2022-01-25T23:04:10Z)
- Handwriting recognition and automatic scoring for descriptive answers in
Japanese language tests [7.489722641968594]
This paper presents an experiment of automatically scoring handwritten descriptive answers in the trial tests for the new Japanese university entrance examination.
Although all answers have been scored by human examiners, handwritten characters are not labeled.
We present our attempt to adapt deep neural network-based handwriting recognizers trained on a labeled handwriting dataset into this unlabeled answer set.
arXiv Detail & Related papers (2022-01-10T08:47:52Z)
- Letter-level Online Writer Identification [86.13203975836556]
We focus on a novel problem, letter-level online writer-id, which requires only a few trajectories of written letters as identification cues.
A main challenge is that a person often writes a letter in different styles from time to time.
We refer to this problem as the variance of online writing styles (Var-O-Styles).
arXiv Detail & Related papers (2021-12-06T07:21:53Z)
- Exploiting Multi-Scale Fusion, Spatial Attention and Patch Interaction
Techniques for Text-Independent Writer Identification [15.010153819096056]
In this paper, three deep learning techniques (a spatial attention mechanism, multi-scale feature fusion, and a patch-based CNN) are proposed to capture the differences between writers' handwriting.
The proposed methods outperform various state-of-the-art methods on word-level and page-level writer identification on three publicly available datasets.
arXiv Detail & Related papers (2021-11-20T14:41:36Z)
- Writer Identification Using Microblogging Texts for Social Media
Forensics [53.180678723280145]
We evaluate popular stylometric features, widely used in literary analysis, and specific Twitter features like URLs, hashtags, replies or quotes.
We test varying sized author sets and varying amounts of training/test texts per author.
arXiv Detail & Related papers (2020-07-31T00:23:18Z)
- Offline Handwritten Chinese Text Recognition with Convolutional Neural
Networks [5.984124397831814]
In this paper, we build the models using only the convolutional neural networks and use CTC as the loss function.
We achieve 6.81% character error rate (CER) on the ICDAR 2013 competition set, which is the best published result without language model correction.
arXiv Detail & Related papers (2020-06-28T14:34:38Z)
- Forensic Authorship Analysis of Microblogging Texts Using N-Grams and
Stylometric Features [63.48764893706088]
This work aims at identifying authors of tweet messages, which are limited to 280 characters.
Our experiments use a self-captured database of 40 users, with 120 to 200 tweets per user.
Results using this small set are promising, with the different features providing a classification accuracy between 92% and 98.5%.
arXiv Detail & Related papers (2020-03-24T19:32:11Z)