Bootstrapping Weakly Supervised Segmentation-free Word Spotting through HMM-based Alignment
- URL: http://arxiv.org/abs/2003.11087v1
- Date: Tue, 24 Mar 2020 19:41:18 GMT
- Title: Bootstrapping Weakly Supervised Segmentation-free Word Spotting through HMM-based Alignment
- Authors: Tomas Wilkinson and Carl Nettelblad
- Abstract summary: We propose an approach that utilises transcripts without bounding box annotations to train word spotting models.
This is done through a training-free alignment procedure based on hidden Markov models.
We believe that this will be a significant advance towards a more general use of word spotting, since digital transcription data will already exist for parts of many collections of interest.
- Score: 0.5076419064097732
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Recent work in word spotting in handwritten documents has yielded impressive
results. This progress has largely been made by supervised learning systems,
which are dependent on manually annotated data, making deployment to new
collections a significant effort. In this paper, we propose an approach that
utilises transcripts without bounding box annotations to train
segmentation-free query-by-string word spotting models, given a partially
trained model. This is done through a training-free alignment procedure based
on hidden Markov models. This procedure creates a tentative mapping between
word region proposals and the transcriptions to automatically create additional
weakly annotated training data, without choosing any single alignment
possibility as the correct one. When only using between 1% and 7% of the fully
annotated training sets for partial convergence, we automatically annotate the
remaining training data and successfully train using it. On all our datasets,
our final trained model then comes within a few mAP% of the performance from a
model trained with the full training set used as ground truth. We believe that
this will be a significant advance towards a more general use of word spotting,
since digital transcription data will already exist for parts of many
collections of interest.
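
The abstract describes the alignment only at a high level; the sketch below shows the kind of forward-backward pass over a transcript-ordered HMM that yields soft proposal-to-word posteriors instead of one hard alignment. The function name, the emission scores, and the transition structure are illustrative assumptions, not the authors' implementation.

```python
import numpy as np
from scipy.special import logsumexp

def hmm_soft_alignment(log_emit, log_trans, log_start):
    """Posteriors gamma[t, j] = P(proposal t matches transcript word j | all proposals).

    log_emit:  (T, S) log score of region proposal t under transcript word j,
               e.g. a similarity from the partially trained spotting model.
    log_trans: (S, S) log transition matrix; a monotone left-to-right structure
               encodes the transcript's reading order.
    log_start: (S,) log initial state distribution.
    """
    T, S = log_emit.shape
    log_a = np.empty((T, S))                 # forward messages
    log_b = np.empty((T, S))                 # backward messages
    log_a[0] = log_start + log_emit[0]
    for t in range(1, T):
        log_a[t] = log_emit[t] + logsumexp(log_a[t - 1][:, None] + log_trans, axis=0)
    log_b[-1] = 0.0
    for t in range(T - 2, -1, -1):
        log_b[t] = logsumexp(log_trans + log_emit[t + 1] + log_b[t + 1], axis=1)
    gamma = log_a + log_b
    gamma -= logsumexp(gamma, axis=1, keepdims=True)
    return np.exp(gamma)                     # one soft alignment row per proposal
```

Used this way, every (proposal, transcript word) pair can enter training weighted by its posterior, which is what lets the procedure avoid committing to any single alignment possibility.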
Related papers
- Towards Efficient Active Learning in NLP via Pretrained Representations [1.90365714903665]
Fine-tuning Large Language Models (LLMs) is now a common approach for text classification in a wide range of applications.
We drastically expedite this process by using pretrained representations of LLMs within the active learning loop.
Our strategy yields similar performance to fine-tuning all the way through the active learning loop but is orders of magnitude less computationally expensive.
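The summary states the idea but not the recipe; a minimal sketch under stated assumptions (entropy-based uncertainty sampling, a scikit-learn head over frozen LLM embeddings, and a labelling oracle standing in for the human) could look like this:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

def active_learning_loop(emb, oracle, n_init=32, n_query=16, rounds=10):
    """Uncertainty sampling over fixed pretrained embeddings.

    emb:    (N, D) sentence vectors, computed once by a frozen LLM encoder.
    oracle: callable index -> label (stands in for a human annotator).
    """
    rng = np.random.default_rng(0)
    labeled = list(rng.choice(len(emb), n_init, replace=False))
    pool = [i for i in range(len(emb)) if i not in set(labeled)]
    y = {i: oracle(i) for i in labeled}
    for _ in range(rounds):
        clf = LogisticRegression(max_iter=1000).fit(emb[labeled], [y[i] for i in labeled])
        probs = clf.predict_proba(emb[pool])
        entropy = -(probs * np.log(probs + 1e-12)).sum(axis=1)
        picks = [pool[i] for i in np.argsort(-entropy)[:n_query]]
        y.update({i: oracle(i) for i in picks})
        labeled += picks
        pool = [i for i in pool if i not in set(picks)]
    return clf  # the expensive LLM is never fine-tuned inside the loop
```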
arXiv Detail & Related papers (2024-02-23T21:28:59Z)
- One-bit Supervision for Image Classification: Problem, Solution, and Beyond [114.95815360508395]
This paper presents one-bit supervision, a novel setting of learning with fewer labels, for image classification.
We propose a multi-stage training paradigm and incorporate negative label suppression into an off-the-shelf semi-supervised learning algorithm.
In multiple benchmarks, the learning efficiency of the proposed approach surpasses that of full-bit semi-supervised supervision.
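One-bit supervision is easiest to see as a guess-and-verify loop: the model predicts a class, the annotator answers yes or no, and rejected guesses become negative labels. The following toy round is a hypothetical reconstruction of the setting, not the authors' multi-stage training code:

```python
import numpy as np

def one_bit_round(probs, answer_is_yes, negatives):
    """One query round of one-bit supervision.

    probs:         (N, C) current model predictions on unlabeled samples.
    answer_is_yes: callable (sample index, guessed class) -> bool; one bit.
    negatives:     dict sample index -> set of classes already ruled out.
    """
    positives = {}
    for i, p in enumerate(probs):
        p = p.copy()
        p[list(negatives.get(i, set()))] = 0.0       # never re-ask rejected classes
        guess = int(np.argmax(p))
        if answer_is_yes(i, guess):
            positives[i] = guess                     # promoted to a full label
        else:
            negatives.setdefault(i, set()).add(guess)  # becomes a negative label
    return positives, negatives
```

Confirmed samples join the labeled pool, while the accumulated negative labels can suppress rejected classes in the semi-supervised loss, which is roughly what the summary's "negative label suppression" refers to.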
arXiv Detail & Related papers (2023-11-26T07:39:00Z)
- WC-SBERT: Zero-Shot Text Classification via SBERT with Self-Training for Wikipedia Categories [5.652290685410878]
Our research focuses on solving the zero-shot text classification problem in NLP.
We propose a novel self-training strategy that uses labels rather than text for training.
Our method achieves state-of-the-art results on both the Yahoo Topic and AG News datasets.
arXiv Detail & Related papers (2023-07-28T04:17:41Z)
- Active Self-Training for Weakly Supervised 3D Scene Semantic Segmentation [17.27850877649498]
We introduce a method for weakly supervised segmentation of 3D scenes that combines self-training and active learning.
We demonstrate that our approach leads to an effective method that provides improvements in scene segmentation over previous works and baselines.
arXiv Detail & Related papers (2022-09-15T06:00:25Z)
- Assisted Text Annotation Using Active Learning to Achieve High Quality with Little Effort [9.379650501033465]
We propose a tool that enables researchers to create large, high-quality, annotated datasets with only a few manual annotations.
We combine an active learning (AL) approach with a pre-trained language model to semi-automatically identify annotation categories.
Our preliminary results show that employing AL strongly reduces the number of annotations for correct classification of even complex and subtle frames.
arXiv Detail & Related papers (2021-12-15T13:14:58Z)
- Revisiting Self-Training for Few-Shot Learning of Language Model [61.173976954360334]
Unlabeled data carry rich task-relevant information and have proven useful for few-shot learning of language models.
In this work, we revisit the self-training technique for language model fine-tuning and present a state-of-the-art prompt-based few-shot learner, SFLM.
arXiv Detail & Related papers (2021-10-04T08:51:36Z)
- Jigsaw Clustering for Unsupervised Visual Representation Learning [68.09280490213399]
We propose a new jigsaw clustering pretext task in this paper.
Our method makes use of information both within and between images.
It is even comparable to contrastive learning methods when only half of the training batches are used.
arXiv Detail & Related papers (2021-04-01T08:09:26Z)
- Unsupervised Paraphrasing with Pretrained Language Models [85.03373221588707]
We propose a training pipeline that enables pre-trained language models to generate high-quality paraphrases in an unsupervised setting.
Our recipe consists of task-adaptation, self-supervision, and a novel decoding algorithm named Dynamic Blocking.
We show with automatic and human evaluations that our approach achieves state-of-the-art performance on both the Quora Question Pair and the ParaNMT datasets.
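The summary names Dynamic Blocking without describing it; one plausible reading, sketched below with illustrative names, is that whenever the last generated token also appears in the source, the token that follows it in the source is blocked at the next decoding step, steering the decoder away from copying (the paper's exact, probabilistic rule may differ):

```python
import torch

def dynamic_blocking_mask(source_ids, generated_ids, vocab_size):
    """Boolean mask of vocabulary entries to forbid at the next step."""
    blocked = torch.zeros(vocab_size, dtype=torch.bool)
    if generated_ids:
        last = generated_ids[-1]
        for j in range(len(source_ids) - 1):
            if source_ids[j] == last:              # we just echoed a source token,
                blocked[source_ids[j + 1]] = True  # so forbid its source successor
    return blocked  # apply as: logits[blocked] = float("-inf") before sampling
```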
arXiv Detail & Related papers (2020-10-24T11:55:28Z)
- Uncertainty-aware Self-training for Text Classification with Few Labels [54.13279574908808]
We study self-training as one of the earliest semi-supervised learning approaches to reduce the annotation bottleneck.
We propose an approach to improve self-training by incorporating uncertainty estimates of the underlying neural network.
We show that our methods, leveraging only 20-30 labeled samples per class per task for training and validation, can perform within 3% of fully supervised pre-trained language models.
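A generic proxy for such an uncertainty estimate is Monte Carlo dropout: run several stochastic forward passes and keep only the pseudo-labels with low predictive variance. The sketch below ranks by variance, whereas the paper's acquisition is more refined; the names and the one-tensor loader format are assumptions:

```python
import torch

def select_pseudo_labels(model, unlabeled_loader, passes=10, keep=1000):
    """Rank unlabeled samples by MC-dropout variance; keep the most certain."""
    model.train()                                 # leave dropout active on purpose
    xs, means, uncert = [], [], []
    with torch.no_grad():
        for (x,) in unlabeled_loader:             # loader yields 1-tuples of inputs
            probs = torch.stack([model(x).softmax(-1) for _ in range(passes)])
            xs.append(x)
            means.append(probs.mean(0))
            uncert.append(probs.var(0).sum(-1))   # total predictive variance
    x, mean, u = torch.cat(xs), torch.cat(means), torch.cat(uncert)
    idx = torch.argsort(u)[:keep]                 # least uncertain first
    return x[idx], mean[idx].argmax(-1)           # inputs plus hard pseudo-labels
```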
arXiv Detail & Related papers (2020-06-27T08:13:58Z)
- Improving Semantic Segmentation via Self-Training [75.07114899941095]
We show that we can obtain state-of-the-art results using a semi-supervised approach, specifically a self-training paradigm.
We first train a teacher model on labeled data, and then generate pseudo labels on a large set of unlabeled data.
Our robust training framework can digest human-annotated and pseudo labels jointly and achieve top performances on Cityscapes, CamVid and KITTI datasets.
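That teacher-student loop is the standard pseudo-labeling recipe; a classification-shaped sketch follows (the paper works per pixel and adds augmentation and sampling tricks omitted here, and the confidence threshold is an assumption):

```python
import torch
import torch.nn.functional as F

def self_training_step(teacher, student, labeled, unlabeled, opt, thresh=0.9):
    """One joint step over human-annotated and teacher-pseudo-labeled data."""
    x_l, y_l = labeled                    # human-annotated batch
    (x_u,) = unlabeled                    # unlabeled batch
    with torch.no_grad():
        probs = teacher(x_u).softmax(dim=1)
        conf, pseudo = probs.max(dim=1)
        mask = conf > thresh              # keep only confident teacher guesses
    loss = F.cross_entropy(student(x_l), y_l)
    if mask.any():                        # digest real and pseudo labels jointly
        loss = loss + F.cross_entropy(student(x_u)[mask], pseudo[mask])
    opt.zero_grad(); loss.backward(); opt.step()
    return float(loss)
```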
arXiv Detail & Related papers (2020-04-30T17:09:17Z)
- Annotation-free Learning of Deep Representations for Word Spotting using Synthetic Data and Self Labeling [4.111899441919165]
We present an annotation-free method that still employs machine learning techniques.
We achieve state-of-the-art query-by-example performances.
Our method makes it possible to perform query-by-string, which is usually not the case for other annotation-free methods.
arXiv Detail & Related papers (2020-03-04T10:46:25Z)
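
The annotation-free trick in that last entry is that labels come for free when word images are rendered synthetically; a model trained on such renders can then self-label real word images. A minimal Pillow rendering sketch, with the padding, font handling, and font path chosen arbitrarily for illustration:

```python
from PIL import Image, ImageDraw, ImageFont

def render_word(word, font_path, px=48):
    """Render one synthetic word image; its annotation is simply the string."""
    font = ImageFont.truetype(font_path, px)
    left, top, right, bottom = font.getbbox(word)
    img = Image.new("L", (right - left + 16, bottom - top + 16), color=255)
    ImageDraw.Draw(img).text((8 - left, 8 - top), word, fill=0, font=font)
    return img

# Hypothetical usage (the font path is a placeholder, not from the paper):
# img = render_word("spotting", "/usr/share/fonts/some_handwriting_font.ttf")
```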