Self-Supervised Representation Learning for Online Handwriting Text Classification
- URL: http://arxiv.org/abs/2310.06645v1
- Date: Tue, 10 Oct 2023 14:07:49 GMT
- Title: Self-Supervised Representation Learning for Online Handwriting Text Classification
- Authors: Pouya Mehralian, Bagher BabaAli, Ashena Gorgan Mohammadi
- Abstract summary: We propose the novel Part of Stroke Masking (POSM) as a pretext task for pretraining models to extract informative representations from the online handwriting of individuals in English and Chinese languages.
To evaluate the quality of the extracted representations, we use both intrinsic and extrinsic evaluation methods.
The pretrained models are fine-tuned to achieve state-of-the-art results in tasks such as writer identification, gender classification, and handedness classification.
- Score: 0.8594140167290099
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Self-supervised learning offers an efficient way of extracting rich
representations from various types of unlabeled data while avoiding the cost of
annotating large-scale datasets. This is achievable by designing a pretext task
to form pseudo labels with respect to the modality and domain of the data.
Given the evolving applications of online handwritten texts, in this study, we
propose the novel Part of Stroke Masking (POSM) as a pretext task for
pretraining models to extract informative representations from the online
handwriting of individuals in English and Chinese languages, along with two
suggested pipelines for fine-tuning the pretrained models. To evaluate the
quality of the extracted representations, we use both intrinsic and extrinsic
evaluation methods. The pretrained models are fine-tuned to achieve
state-of-the-art results in tasks such as writer identification, gender
classification, and handedness classification, also highlighting the
superiority of utilizing the pretrained models over the models trained from
scratch.
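The listing does not include code; the following is a minimal, hypothetical sketch of a masking-style pretext task in the spirit of POSM, in which a contiguous span of an online-handwriting point sequence is hidden and a sequence encoder is trained to reconstruct it. The architecture, masking ratio, and all names (StrokeEncoder, mask_stroke_span) are illustrative assumptions, not the authors' implementation.

```python
# Minimal, hypothetical sketch of a POSM-style masking pretext task.
# Online handwriting is treated as a sequence of points (dx, dy, pen-state);
# one contiguous span is hidden and the encoder learns to reconstruct it.
# Architecture, masking ratio, and names are illustrative, not the paper's.
import torch
import torch.nn as nn

class StrokeEncoder(nn.Module):
    def __init__(self, in_dim=3, hidden=128):
        super().__init__()
        self.rnn = nn.GRU(in_dim, hidden, batch_first=True, bidirectional=True)
        self.head = nn.Linear(2 * hidden, in_dim)  # reconstruct masked points

    def forward(self, x):
        out, _ = self.rnn(x)
        return self.head(out)

def mask_stroke_span(points, mask_ratio=0.15):
    """Zero out one contiguous span of points; return masked input and the span mask."""
    T = points.size(1)
    span = max(1, int(T * mask_ratio))
    start = torch.randint(0, T - span + 1, (1,)).item()
    mask = torch.zeros(points.shape[:2], dtype=torch.bool)
    mask[:, start:start + span] = True
    masked = points.clone()
    masked[mask] = 0.0
    return masked, mask

# One pretraining step: MSE reconstruction loss on the masked span only.
model = StrokeEncoder()
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
batch = torch.randn(8, 200, 3)            # (batch, time, [dx, dy, pen-state])
masked, mask = mask_stroke_span(batch)
recon = model(masked)
loss = ((recon - batch)[mask] ** 2).mean()
opt.zero_grad()
loss.backward()
opt.step()
```

In this sketch the encoder sees the corrupted sequence and is penalized only on the hidden span, which is the general shape of a masking pretext objective; the paper's actual masking scheme and model may differ.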
Related papers
- Self-Alignment with Instruction Backtranslation [162.02529653768096]
We present a method to build a high quality instruction following language model by automatically labelling human-written text with corresponding instructions.
Our approach, named instruction backtranslation, starts with a language model finetuned on a small amount of seed data, and a given web corpus.
arXiv Detail & Related papers (2023-08-11T17:47:54Z)
- Zero-Shot Text Classification via Self-Supervised Tuning [46.9902502503747]
We propose a new paradigm based on self-supervised learning to solve zero-shot text classification tasks.
The key idea is to tune the language models with unlabeled data, a process called self-supervised tuning.
Our model outperforms the state-of-the-art baselines on 7 out of 10 tasks.
arXiv Detail & Related papers (2023-05-19T05:47:33Z)
- How to Choose Pretrained Handwriting Recognition Models for Single Writer Fine-Tuning [23.274139396706264]
Recent advancements in Deep Learning-based Handwritten Text Recognition (HTR) have led to models with remarkable performance on modern and historical manuscripts.
Those models struggle to obtain the same performance when applied to manuscripts with peculiar characteristics, such as language, paper support, ink, and author handwriting.
In this paper, we take into account large, real benchmark datasets and synthetic ones obtained with a styled Handwritten Text Generation model.
We give a quantitative indication of the most relevant characteristics of such data for obtaining an HTR model able to effectively transcribe manuscripts in small collections with as few as five real fine-tuning lines.
arXiv Detail & Related papers (2023-05-04T07:00:28Z)
- PART: Pre-trained Authorship Representation Transformer [64.78260098263489]
Authors writing documents imprint identifying information within their texts: vocabulary, registry, punctuation, misspellings, or even emoji usage.
Previous works use hand-crafted features or classification tasks to train their authorship models, leading to poor performance on out-of-domain authors.
We propose a contrastively trained model fit to learn authorship embeddings instead of semantics.
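As a rough illustration (not the PART implementation), the snippet below sketches a supervised-contrastive objective in which documents by the same author are treated as positives; the encoder, batch construction, and exact loss form are assumptions.

```python
# Hypothetical supervised-contrastive sketch: documents by the same author are
# pulled together, documents by different authors pushed apart. The encoder,
# batching, and exact loss form are assumptions, not the PART implementation.
import torch
import torch.nn.functional as F

def authorship_contrastive_loss(embeddings, author_ids, temperature=0.07):
    """embeddings: (N, D) L2-normalized document embeddings; author_ids: (N,) labels."""
    sims = embeddings @ embeddings.T / temperature                  # pairwise similarities
    self_mask = torch.eye(len(author_ids), dtype=torch.bool)
    sims = sims.masked_fill(self_mask, float("-inf"))               # exclude self-pairs
    pos = (author_ids.unsqueeze(0) == author_ids.unsqueeze(1)) & ~self_mask
    log_prob = sims - sims.logsumexp(dim=1, keepdim=True)
    pos_log_prob = log_prob.masked_fill(~pos, 0.0).sum(dim=1)       # sum over positives
    has_pos = pos.any(dim=1)                                        # anchors with a positive
    return -(pos_log_prob[has_pos] / pos.sum(dim=1)[has_pos]).mean()

emb = F.normalize(torch.randn(6, 128), dim=1)                       # toy batch of 6 documents
loss = authorship_contrastive_loss(emb, torch.tensor([0, 0, 1, 1, 2, 2]))
```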
arXiv Detail & Related papers (2022-09-30T11:08:39Z)
- Self-Training of Handwritten Word Recognition for Synthetic-to-Real Adaptation [4.111899441919165]
We propose a self-training approach to train a Handwritten Text Recognition model.
The proposed training scheme uses an initial model trained on synthetic data to make predictions for the unlabeled target dataset.
We evaluate the proposed method on four widely used benchmark datasets and show its effectiveness in closing the gap to a model trained in a fully-supervised manner.
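A minimal sketch of the self-training scheme summarized above is given below, assuming a simple pseudo-labelling loop with an optional confidence filter; the function names and the filtering step are illustrative, not the paper's exact procedure.

```python
# Hypothetical sketch of synthetic-to-real self-training; all names and the
# confidence filter are assumptions, not the paper's exact procedure.
from typing import Callable, List, Tuple

def self_train(
    train: Callable[[List[Tuple[object, str]]], Callable[[object], str]],
    confidence: Callable[[object, str], float],
    synthetic_data: List[Tuple[object, str]],   # (image, transcript) pairs
    unlabeled_real: List[object],               # target-domain images only
    threshold: float = 0.8,
    rounds: int = 3,
) -> Callable[[object], str]:
    """Train on synthetic data, then iteratively retrain on confident pseudo-labels."""
    predict = train(synthetic_data)
    for _ in range(rounds):
        pseudo = []
        for img in unlabeled_real:
            label = predict(img)
            if confidence(img, label) >= threshold:   # keep confident predictions only
                pseudo.append((img, label))
        predict = train(synthetic_data + pseudo)      # retrain on combined data
    return predict
```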
arXiv Detail & Related papers (2022-06-07T09:43:25Z)
- Curriculum-Based Self-Training Makes Better Few-Shot Learners for Data-to-Text Generation [56.98033565736974]
We propose Curriculum-Based Self-Training (CBST) to leverage unlabeled data in a rearranged order determined by the difficulty of text generation.
Our method can outperform fine-tuning and task-adaptive pre-training methods, and achieve state-of-the-art performance in the few-shot setting of data-to-text generation.
arXiv Detail & Related papers (2022-06-06T16:11:58Z)
- Lacuna Reconstruction: Self-supervised Pre-training for Low-Resource Historical Document Transcription [25.76860672652937]
We show a meaningful improvement in recognition accuracy over the same supervised model trained from scratch with as few as 30 line image transcriptions for training.
Our masked language model-style pre-training strategy, where the model is trained to identify the true masked visual representation among distractors sampled from within the same line, encourages learning robust contextualized language representations.
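The following is a hedged sketch of a masked-prediction-with-distractors objective of the kind described above, written as an InfoNCE-style loss in which the contextual prediction at a masked position must pick out the true frame representation from distractors drawn from the same line; shapes, names, and the temperature value are assumptions.

```python
# Hypothetical InfoNCE-style loss: the contextual prediction at each masked
# position must identify the true (masked-out) frame among K distractors
# sampled from the same line. Shapes, names, and temperature are assumptions.
import torch
import torch.nn.functional as F

def masked_contrastive_loss(context_pred, true_targets, distractors, temperature=0.1):
    """
    context_pred : (M, D)    contextual predictions at the M masked positions
    true_targets : (M, D)    representations of the true masked frames
    distractors  : (M, K, D) K distractor frames per position, same line
    """
    candidates = torch.cat([true_targets.unsqueeze(1), distractors], dim=1)     # (M, 1+K, D)
    sims = F.cosine_similarity(context_pred.unsqueeze(1), candidates, dim=-1)   # (M, 1+K)
    labels = torch.zeros(sims.size(0), dtype=torch.long)  # true frame sits at index 0
    return F.cross_entropy(sims / temperature, labels)

# Example with random tensors standing in for encoder outputs.
loss = masked_contrastive_loss(torch.randn(4, 64), torch.randn(4, 64), torch.randn(4, 10, 64))
```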
arXiv Detail & Related papers (2021-12-16T08:28:26Z)
- Generating More Pertinent Captions by Leveraging Semantics and Style on Multi-Source Datasets [56.018551958004814]
This paper addresses the task of generating fluent descriptions by training on a non-uniform combination of data sources.
Large-scale datasets with noisy image-text pairs provide a sub-optimal source of supervision.
We propose to leverage and separate semantics and descriptive style through the incorporation of a style token and keywords extracted through a retrieval component.
arXiv Detail & Related papers (2021-11-24T19:00:05Z)
- Is BERT a Cross-Disciplinary Knowledge Learner? A Surprising Finding of Pre-trained Models' Transferability [74.11825654535895]
We investigate whether the power of the models pre-trained on text data, such as BERT, can be transferred to general token sequence classification applications.
We find that even on non-text data, the models pre-trained on text converge faster than randomly initialized models.
arXiv Detail & Related papers (2021-03-12T09:19:14Z)
- Unsupervised Paraphrasing with Pretrained Language Models [85.03373221588707]
We propose a training pipeline that enables pre-trained language models to generate high-quality paraphrases in an unsupervised setting.
Our recipe consists of task-adaptation, self-supervision, and a novel decoding algorithm named Dynamic Blocking.
We show with automatic and human evaluations that our approach achieves state-of-the-art performance on both the Quora Question Pair and the ParaNMT datasets.
arXiv Detail & Related papers (2020-10-24T11:55:28Z)