Bootstrapping Weakly Supervised Segmentation-free Word Spotting through HMM-based Alignment
- URL: http://arxiv.org/abs/2003.11087v1
- Date: Tue, 24 Mar 2020 19:41:18 GMT
- Title: Bootstrapping Weakly Supervised Segmentation-free Word Spotting through HMM-based Alignment
- Authors: Tomas Wilkinson and Carl Nettelblad
- Abstract summary: We propose an approach that utilises transcripts without bounding box annotations to train word spotting models.
This is done through a training-free alignment procedure based on hidden Markov models.
We believe that this will be a significant advance towards a more general use of word spotting, since digital transcription data will already exist for parts of many collections of interest.
- Score: 0.5076419064097732
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Recent work in word spotting in handwritten documents has yielded impressive
results. This progress has largely been made by supervised learning systems,
which are dependent on manually annotated data, making deployment to new
collections a significant effort. In this paper, we propose an approach that
utilises transcripts without bounding box annotations to train
segmentation-free query-by-string word spotting models, given a partially
trained model. This is done through a training-free alignment procedure based
on hidden Markov models. This procedure creates a tentative mapping between
word region proposals and the transcriptions to automatically create additional
weakly annotated training data, without choosing any single alignment
possibility as the correct one. When only using between 1% and 7% of the fully
annotated training sets for partial convergence, we automatically annotate the
remaining training data and successfully train using it. On all our datasets,
our final trained model then comes within a few mAP% of the performance from a
model trained with the full training set used as ground truth. We believe that
this will be a significant advance towards a more general use of word spotting,
since digital transcription data will already exist for parts of many
collections of interest.
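
The abstract describes the alignment only at a high level; the sketch below shows the kind of forward-backward pass over a transcript-ordered HMM that yields soft proposal-to-word posteriors instead of one hard alignment. The function name, the emission scores, and the transition structure are illustrative assumptions, not the authors' implementation.

```python
import numpy as np
from scipy.special import logsumexp

def hmm_soft_alignment(log_emit, log_trans, log_start):
    """Posteriors gamma[t, j] = P(proposal t matches transcript word j | all proposals).

    log_emit:  (T, S) log score of region proposal t under transcript word j,
               e.g. a similarity from the partially trained spotting model.
    log_trans: (S, S) log transition matrix; a monotone left-to-right structure
               encodes the transcript's reading order.
    log_start: (S,) log initial state distribution.
    """
    T, S = log_emit.shape
    log_a = np.empty((T, S))                 # forward messages
    log_b = np.empty((T, S))                 # backward messages
    log_a[0] = log_start + log_emit[0]
    for t in range(1, T):
        log_a[t] = log_emit[t] + logsumexp(log_a[t - 1][:, None] + log_trans, axis=0)
    log_b[-1] = 0.0
    for t in range(T - 2, -1, -1):
        log_b[t] = logsumexp(log_trans + log_emit[t + 1] + log_b[t + 1], axis=1)
    gamma = log_a + log_b
    gamma -= logsumexp(gamma, axis=1, keepdims=True)
    return np.exp(gamma)                     # one soft alignment row per proposal
```

Used this way, every (proposal, transcript word) pair can enter training weighted by its posterior, which is what lets the procedure avoid committing to any single alignment possibility.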
Related papers
- Towards Efficient Active Learning in NLP via Pretrained Representations [1.90365714903665]
Fine-tuning Large Language Models (LLMs) is now a common approach for text classification in a wide range of applications.
We drastically expedite this process by using pretrained representations of LLMs within the active learning loop.
Our strategy yields similar performance to fine-tuning all the way through the active learning loop but is orders of magnitude less computationally expensive.
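The summary states the idea but not the recipe; a minimal sketch under stated assumptions (entropy-based uncertainty sampling, a scikit-learn head over frozen LLM embeddings, and a labelling oracle standing in for the human) could look like this:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

def active_learning_loop(emb, oracle, n_init=32, n_query=16, rounds=10):
    """Uncertainty sampling over fixed pretrained embeddings.

    emb:    (N, D) sentence vectors, computed once by a frozen LLM encoder.
    oracle: callable index -> label (stands in for a human annotator).
    """
    rng = np.random.default_rng(0)
    labeled = list(rng.choice(len(emb), n_init, replace=False))
    pool = [i for i in range(len(emb)) if i not in set(labeled)]
    y = {i: oracle(i) for i in labeled}
    for _ in range(rounds):
        clf = LogisticRegression(max_iter=1000).fit(emb[labeled], [y[i] for i in labeled])
        probs = clf.predict_proba(emb[pool])
        entropy = -(probs * np.log(probs + 1e-12)).sum(axis=1)
        picks = [pool[i] for i in np.argsort(-entropy)[:n_query]]
        y.update({i: oracle(i) for i in picks})
        labeled += picks
        pool = [i for i in pool if i not in set(picks)]
    return clf  # the expensive LLM is never fine-tuned inside the loop
```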
arXiv Detail & Related papers (2024-02-23T21:28:59Z)
- One-bit Supervision for Image Classification: Problem, Solution, and Beyond [114.95815360508395]
This paper presents one-bit supervision, a novel setting of learning with fewer labels, for image classification.
We propose a multi-stage training paradigm and incorporate negative label suppression into an off-the-shelf semi-supervised learning algorithm.
In multiple benchmarks, the learning efficiency of the proposed approach surpasses that of full-bit semi-supervised supervision.
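One-bit supervision is easiest to see as a guess-and-verify loop: the model predicts a class, the annotator answers yes or no, and rejected guesses become negative labels. The following toy round is a hypothetical reconstruction of the setting, not the authors' multi-stage training code:

```python
import numpy as np

def one_bit_round(probs, answer_is_yes, negatives):
    """One query round of one-bit supervision.

    probs:         (N, C) current model predictions on unlabeled samples.
    answer_is_yes: callable (sample index, guessed class) -> bool; one bit.
    negatives:     dict sample index -> set of classes already ruled out.
    """
    positives = {}
    for i, p in enumerate(probs):
        p = p.copy()
        p[list(negatives.get(i, set()))] = 0.0       # never re-ask rejected classes
        guess = int(np.argmax(p))
        if answer_is_yes(i, guess):
            positives[i] = guess                     # promoted to a full label
        else:
            negatives.setdefault(i, set()).add(guess)  # becomes a negative label
    return positives, negatives
```

Confirmed samples join the labeled pool, while the accumulated negative labels can suppress rejected classes in the semi-supervised loss, which is roughly what the summary's "negative label suppression" refers to.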
arXiv Detail & Related papers (2023-11-26T07:39:00Z)
- WC-SBERT: Zero-Shot Text Classification via SBERT with Self-Training for Wikipedia Categories [5.652290685410878]
Our research focuses on solving the zero-shot text classification problem in NLP.
We propose a novel self-training strategy that uses labels rather than text for training.
Our method achieves state-of-the-art results on both the Yahoo Topic and AG News datasets.
arXiv Detail & Related papers (2023-07-28T04:17:41Z)
- Active Self-Training for Weakly Supervised 3D Scene Semantic Segmentation [17.27850877649498]
We introduce a method for weakly supervised segmentation of 3D scenes that combines self-training and active learning.
We demonstrate that our approach leads to an effective method that provides improvements in scene segmentation over previous works and baselines.
arXiv Detail & Related papers (2022-09-15T06:00:25Z)
- Assisted Text Annotation Using Active Learning to Achieve High Quality with Little Effort [9.379650501033465]
We propose a tool that enables researchers to create large, high-quality, annotated datasets with only a few manual annotations.
We combine an active learning (AL) approach with a pre-trained language model to semi-automatically identify annotation categories.
Our preliminary results show that employing AL strongly reduces the number of annotations for correct classification of even complex and subtle frames.
arXiv Detail & Related papers (2021-12-15T13:14:58Z)
- Revisiting Self-Training for Few-Shot Learning of Language Model [61.173976954360334]
Unlabeled data carry rich task-relevant information and have proven useful for few-shot learning of language models.
In this work, we revisit the self-training technique for language model fine-tuning and present a state-of-the-art prompt-based few-shot learner, SFLM.
arXiv Detail & Related papers (2021-10-04T08:51:36Z)
- Jigsaw Clustering for Unsupervised Visual Representation Learning [68.09280490213399]
We propose a new jigsaw clustering pretext task in this paper.
Our method makes use of information both within and between images.
It is even comparable to contrastive learning methods when only half of the training batches are used.
arXiv Detail & Related papers (2021-04-01T08:09:26Z)
- Unsupervised Paraphrasing with Pretrained Language Models [85.03373221588707]
We propose a training pipeline that enables pre-trained language models to generate high-quality paraphrases in an unsupervised setting.
Our recipe consists of task-adaptation, self-supervision, and a novel decoding algorithm named Dynamic Blocking.
We show with automatic and human evaluations that our approach achieves state-of-the-art performance on both the Quora Question Pair and the ParaNMT datasets.
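The summary names Dynamic Blocking without describing it; one plausible reading, sketched below with illustrative names, is that whenever the last generated token also appears in the source, the token that follows it in the source is blocked at the next decoding step, steering the decoder away from copying (the paper's exact, probabilistic rule may differ):

```python
import torch

def dynamic_blocking_mask(source_ids, generated_ids, vocab_size):
    """Boolean mask of vocabulary entries to forbid at the next step."""
    blocked = torch.zeros(vocab_size, dtype=torch.bool)
    if generated_ids:
        last = generated_ids[-1]
        for j in range(len(source_ids) - 1):
            if source_ids[j] == last:              # we just echoed a source token,
                blocked[source_ids[j + 1]] = True  # so forbid its source successor
    return blocked  # apply as: logits[blocked] = float("-inf") before sampling
```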
arXiv Detail & Related papers (2020-10-24T11:55:28Z)
- Uncertainty-aware Self-training for Text Classification with Few Labels [54.13279574908808]
We study self-training as one of the earliest semi-supervised learning approaches to reduce the annotation bottleneck.
We propose an approach to improve self-training by incorporating uncertainty estimates of the underlying neural network.
We show that our methods, leveraging only 20-30 labeled samples per class per task for training and validation, can perform within 3% of fully supervised pre-trained language models.
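A generic proxy for such an uncertainty estimate is Monte Carlo dropout: run several stochastic forward passes and keep only the pseudo-labels with low predictive variance. The sketch below ranks by variance, whereas the paper's acquisition is more refined; the names and the one-tensor loader format are assumptions:

```python
import torch

def select_pseudo_labels(model, unlabeled_loader, passes=10, keep=1000):
    """Rank unlabeled samples by MC-dropout variance; keep the most certain."""
    model.train()                                 # leave dropout active on purpose
    xs, means, uncert = [], [], []
    with torch.no_grad():
        for (x,) in unlabeled_loader:             # loader yields 1-tuples of inputs
            probs = torch.stack([model(x).softmax(-1) for _ in range(passes)])
            xs.append(x)
            means.append(probs.mean(0))
            uncert.append(probs.var(0).sum(-1))   # total predictive variance
    x, mean, u = torch.cat(xs), torch.cat(means), torch.cat(uncert)
    idx = torch.argsort(u)[:keep]                 # least uncertain first
    return x[idx], mean[idx].argmax(-1)           # inputs plus hard pseudo-labels
```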
arXiv Detail & Related papers (2020-06-27T08:13:58Z)
- Improving Semantic Segmentation via Self-Training [75.07114899941095]
We show that we can obtain state-of-the-art results using a semi-supervised approach, specifically a self-training paradigm.
We first train a teacher model on labeled data, and then generate pseudo labels on a large set of unlabeled data.
Our robust training framework can digest human-annotated and pseudo labels jointly and achieve top performances on Cityscapes, CamVid and KITTI datasets.
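That teacher-student loop is the standard pseudo-labeling recipe; a classification-shaped sketch follows (the paper works per pixel and adds augmentation and sampling tricks omitted here, and the confidence threshold is an assumption):

```python
import torch
import torch.nn.functional as F

def self_training_step(teacher, student, labeled, unlabeled, opt, thresh=0.9):
    """One joint step over human-annotated and teacher-pseudo-labeled data."""
    x_l, y_l = labeled                    # human-annotated batch
    (x_u,) = unlabeled                    # unlabeled batch
    with torch.no_grad():
        probs = teacher(x_u).softmax(dim=1)
        conf, pseudo = probs.max(dim=1)
        mask = conf > thresh              # keep only confident teacher guesses
    loss = F.cross_entropy(student(x_l), y_l)
    if mask.any():                        # digest real and pseudo labels jointly
        loss = loss + F.cross_entropy(student(x_u)[mask], pseudo[mask])
    opt.zero_grad(); loss.backward(); opt.step()
    return float(loss)
```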
arXiv Detail & Related papers (2020-04-30T17:09:17Z)
- Annotation-free Learning of Deep Representations for Word Spotting using Synthetic Data and Self Labeling [4.111899441919165]
We present an annotation-free method that still employs machine learning techniques.
We achieve state-of-the-art query-by-example performances.
Our method makes it possible to perform query-by-string, which is usually not the case for other annotation-free methods.
arXiv Detail & Related papers (2020-03-04T10:46:25Z)
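
The annotation-free trick in that last entry is that labels come for free when word images are rendered synthetically; a model trained on such renders can then self-label real word images. A minimal Pillow rendering sketch, with the padding, font handling, and font path chosen arbitrarily for illustration:

```python
from PIL import Image, ImageDraw, ImageFont

def render_word(word, font_path, px=48):
    """Render one synthetic word image; its annotation is simply the string."""
    font = ImageFont.truetype(font_path, px)
    left, top, right, bottom = font.getbbox(word)
    img = Image.new("L", (right - left + 16, bottom - top + 16), color=255)
    ImageDraw.Draw(img).text((8 - left, 8 - top), word, fill=0, font=font)
    return img

# Hypothetical usage (the font path is a placeholder, not from the paper):
# img = render_word("spotting", "/usr/share/fonts/some_handwriting_font.ttf")
```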