CSSL-MHTR: Continual Self-Supervised Learning for Scalable Multi-script Handwritten Text Recognition
- URL: http://arxiv.org/abs/2303.09347v2
- Date: Fri, 26 Apr 2024 19:19:22 GMT
- Title: CSSL-MHTR: Continual Self-Supervised Learning for Scalable Multi-script Handwritten Text Recognition
- Authors: Marwa Dhiaf, Mohamed Ali Souibgui, Kai Wang, Yuyang Liu, Yousri Kessentini, Alicia Fornés, Ahmed Cheikh Rouhou
- Abstract summary: We explore the potential of continual self-supervised learning to alleviate the catastrophic forgetting problem in handwritten text recognition.
Our method consists of adding intermediate layers, called adapters, for each task and efficiently distilling knowledge from the previous model while learning the current task.
We attain state-of-the-art performance on English, Italian and Russian scripts, whilst adding only a few parameters per task.
- Score: 16.987008461171065
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Self-supervised learning has recently emerged as a strong alternative in document analysis. These approaches are now capable of learning high-quality image representations and overcoming the limitations of supervised methods, which require a large amount of labeled data. However, these methods are unable to capture new knowledge in an incremental fashion, where data is presented to the model sequentially, which is closer to a realistic scenario. In this paper, we explore the potential of continual self-supervised learning to alleviate the catastrophic forgetting problem in handwritten text recognition, as an example of sequence recognition. Our method consists of adding intermediate layers, called adapters, for each task and efficiently distilling knowledge from the previous model while learning the current task. Our proposed framework is efficient in both computation and memory complexity. To demonstrate its effectiveness, we evaluate our method by transferring the learned model to diverse text recognition downstream tasks, including Latin and non-Latin scripts. To the best of our knowledge, this is the first application of continual self-supervised learning to handwritten text recognition. We attain state-of-the-art performance on English, Italian and Russian scripts, whilst adding only a few parameters per task. The code and trained models will be publicly available.
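For illustration, a minimal sketch of the two ingredients named in the abstract, per-task adapters and knowledge distillation from the previous model, is given below. It is not the authors' released implementation; the residual bottleneck adapter design, the MSE distillation target, and names such as `Adapter` and `bottleneck_dim` are standard choices assumed here for the example.

```python
# Hedged sketch of per-task adapters plus feature distillation (not the authors' code).
# The bottleneck adapter design and the MSE distillation target are assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F

class Adapter(nn.Module):
    """Small residual bottleneck added per task; the shared backbone can stay frozen."""
    def __init__(self, dim: int, bottleneck_dim: int = 64):
        super().__init__()
        self.down = nn.Linear(dim, bottleneck_dim)
        self.up = nn.Linear(bottleneck_dim, dim)

    def forward(self, x):
        return x + self.up(F.relu(self.down(x)))  # residual connection

class AdaptedEncoderLayer(nn.Module):
    """Transformer encoder layer followed by a task-specific adapter."""
    def __init__(self, dim: int = 256, heads: int = 4):
        super().__init__()
        self.layer = nn.TransformerEncoderLayer(dim, heads, batch_first=True)
        self.adapter = Adapter(dim)

    def forward(self, x):
        return self.adapter(self.layer(x))

def distillation_loss(student_feats, teacher_feats, weight: float = 1.0):
    """Keep current-task features close to those of the frozen previous-task model."""
    return weight * F.mse_loss(student_feats, teacher_feats.detach())

# Usage: the frozen previous model provides the distillation target.
x = torch.randn(2, 32, 256)                  # (batch, sequence length, feature dim)
current = AdaptedEncoderLayer()
previous = AdaptedEncoderLayer().eval()      # stands in for the model from the last task
loss = distillation_loss(current(x), previous(x))
```

In such a setup only the adapter parameters would be trained for each new task, which is consistent with the abstract's claim of adding only a few parameters per task.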
Related papers
- DeTeCtive: Detecting AI-generated Text via Multi-Level Contrastive Learning [24.99797253885887]
We argue that the key to accomplishing this task lies in distinguishing writing styles of different authors.
We propose DeTeCtive, a multi-task auxiliary, multi-level contrastive learning framework.
Our method is compatible with a range of text encoders.
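As a rough illustration of the contrastive building block behind such frameworks (DeTeCtive's multi-task and multi-level specifics are not reproduced, and all shapes below are arbitrary examples):

```python
# InfoNCE-style contrastive loss sketch: pull an anchor text embedding towards a
# positive (e.g., same writing style) and away from negatives. Illustrative only.
import torch
import torch.nn.functional as F

def contrastive_loss(anchor, positive, negatives, temperature: float = 0.1):
    anchor = F.normalize(anchor, dim=-1)                                   # (1, d)
    candidates = F.normalize(torch.cat([positive.unsqueeze(0), negatives]), dim=-1)
    logits = anchor @ candidates.T / temperature                           # (1, 1 + num_negatives)
    target = torch.zeros(1, dtype=torch.long)                              # positive sits at index 0
    return F.cross_entropy(logits, target)

loss = contrastive_loss(torch.randn(1, 128), torch.randn(128), torch.randn(8, 128))
```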
arXiv Detail & Related papers (2024-10-28T12:34:49Z)
- LEGO: Self-Supervised Representation Learning for Scene Text Images [32.21085469233465]
We propose a Local Explicit and Global Order-aware self-supervised representation learning method for scene text images.
Inspired by the human cognitive process of learning words, we propose three novel pretext tasks for LEGO to model sequential, semantic, and structural features.
The LEGO recognizer achieves performance superior or comparable to state-of-the-art scene text recognition methods on six benchmarks.
arXiv Detail & Related papers (2024-08-04T14:07:14Z)
- Exploiting the Semantic Knowledge of Pre-trained Text-Encoders for Continual Learning [70.64617500380287]
Continual learning allows models to learn from new data while retaining previously learned knowledge.
The label information of the images offers semantic knowledge that can be related to previously acquired knowledge of semantic classes.
We propose integrating semantic guidance within and across tasks by capturing semantic similarity using text embeddings.
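A toy sketch of this idea follows; the `embed_labels` placeholder stands in for any pre-trained text encoder and returns random vectors here, so this is an assumed interface rather than the paper's code.

```python
# Illustrative only: compare text embeddings of new and old class labels so that
# semantically similar old classes can guide learning of the new task.
import torch
import torch.nn.functional as F

def embed_labels(labels):
    # Placeholder: a real system would call a pre-trained text encoder here.
    torch.manual_seed(0)
    return F.normalize(torch.randn(len(labels), 512), dim=-1)

old_classes = ["dog", "cat", "horse"]
new_classes = ["wolf", "truck"]

similarity = embed_labels(new_classes) @ embed_labels(old_classes).T
print(similarity.shape)  # (new classes, old classes); high values mark related classes
```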
arXiv Detail & Related papers (2024-08-02T07:51:44Z)
- An end-to-end, interactive Deep Learning based Annotation system for cursive and print English handwritten text [0.0]
We present an innovative, complete end-to-end pipeline, that annotates offline handwritten manuscripts written in both print and cursive English.
This novel method involves an architectural combination of a detection system built upon a state-of-the-art text detection model, and a custom made Deep Learning model for the recognition system.
arXiv Detail & Related papers (2023-04-18T00:24:07Z)
- Self-Supervised Speech Representation Learning: A Review [105.1545308184483]
Self-supervised representation learning methods promise a single universal model that would benefit a wide variety of tasks and domains.
Speech representation learning is experiencing similar progress in three main categories: generative, contrastive, and predictive methods.
This review presents approaches for self-supervised speech representation learning and their connection to other research areas.
arXiv Detail & Related papers (2022-05-21T16:52:57Z)
- Lexically Aware Semi-Supervised Learning for OCR Post-Correction [90.54336622024299]
Much of the existing linguistic data in many languages of the world is locked away in non-digitized books and documents.
Previous work has demonstrated the utility of neural post-correction methods on recognition of less-well-resourced languages.
We present a semi-supervised learning method that makes it possible to utilize raw images to improve performance.
arXiv Detail & Related papers (2021-11-04T04:39:02Z)
- Revisiting Self-Training for Few-Shot Learning of Language Model [61.173976954360334]
Unlabeled data carry rich task-relevant information and have proven useful for few-shot learning of language models.
In this work, we revisit the self-training technique for language model fine-tuning and present a state-of-the-art prompt-based few-shot learner, SFLM.
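The generic self-training recipe that this line of work revisits can be sketched as follows; a small scikit-learn classifier stands in for the fine-tuned language model, and SFLM's prompt-based specifics are not reproduced.

```python
# Hedged illustration of self-training: pseudo-label confident unlabeled examples
# with the model's own predictions, then retrain on labeled + pseudo-labeled data.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
X_lab, y_lab = rng.normal(size=(20, 5)), np.array([0, 1] * 10)   # tiny labeled set
X_unlab = rng.normal(size=(100, 5))                              # unlabeled pool

model = LogisticRegression().fit(X_lab, y_lab)                   # supervised step
proba = model.predict_proba(X_unlab)
confident = proba.max(axis=1) >= 0.9                             # confidence threshold
X_aug = np.vstack([X_lab, X_unlab[confident]])
y_aug = np.concatenate([y_lab, proba[confident].argmax(axis=1)])
model = LogisticRegression().fit(X_aug, y_aug)                   # self-training step
```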
arXiv Detail & Related papers (2021-10-04T08:51:36Z)
- Pay Attention to What You Read: Non-recurrent Handwritten Text-Line Recognition [4.301658883577544]
We introduce a non-recurrent approach to recognizing handwritten text using transformer models.
We are able to tackle character recognition as well as to learn language-related dependencies of the character sequences to be decoded.
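A minimal sketch of the non-recurrent idea is given below; it is not the paper's architecture, and details such as the patch size, positional encodings, and the causal target mask are simplified or omitted.

```python
# Illustrative only: a text-line image is sliced into column patches, encoded by a
# Transformer, and decoded into characters with no recurrent layers.
import torch
import torch.nn as nn

class TransformerHTR(nn.Module):
    def __init__(self, vocab_size: int = 100, dim: int = 256):
        super().__init__()
        self.patch_proj = nn.Linear(32 * 8, dim)            # flatten 32x8 image slices
        self.char_emb = nn.Embedding(vocab_size, dim)
        self.transformer = nn.Transformer(dim, nhead=4, batch_first=True)
        self.out = nn.Linear(dim, vocab_size)

    def forward(self, line_image, target_chars):
        # line_image: (batch, 32, width); cut the line into 8-pixel-wide slices
        b, h, w = line_image.shape
        patches = line_image.unfold(2, 8, 8).permute(0, 2, 1, 3).reshape(b, -1, h * 8)
        memory_in = self.patch_proj(patches)
        decoded = self.transformer(memory_in, self.char_emb(target_chars))
        return self.out(decoded)                             # per-position character logits

model = TransformerHTR()
logits = model(torch.randn(2, 32, 128), torch.randint(0, 100, (2, 20)))
print(logits.shape)  # (2, 20, 100)
```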
arXiv Detail & Related papers (2020-05-26T21:15:20Z)
- Exploiting Structured Knowledge in Text via Graph-Guided Representation Learning [73.0598186896953]
We present two self-supervised tasks that learn over raw text with guidance from knowledge graphs.
Building upon entity-level masked language models, our first contribution is an entity masking scheme.
In contrast to existing paradigms, our approach uses knowledge graphs implicitly, only during pre-training.
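A toy sketch of an entity-level masking scheme follows; the span indices would come from knowledge-graph-linked entity mentions, and none of this is the paper's actual code.

```python
# Illustrative only: mask whole entity mentions (rather than random subwords) so the
# model must recover the entities from their textual context.
import random

def mask_entities(tokens, entity_spans, mask_token="[MASK]", prob=0.5):
    """entity_spans: list of (start, end) token indices of entity mentions."""
    tokens = list(tokens)
    for start, end in entity_spans:
        if random.random() < prob:
            tokens[start:end] = [mask_token] * (end - start)
    return tokens

sentence = "Marie Curie won the Nobel Prize in 1911".split()
spans = [(0, 2), (4, 6)]                      # "Marie Curie", "Nobel Prize"
print(mask_entities(sentence, spans, prob=1.0))  # mask every entity for the demo
```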
arXiv Detail & Related papers (2020-04-29T14:22:42Z)
- Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer [64.22926988297685]
Transfer learning, where a model is first pre-trained on a data-rich task before being fine-tuned on a downstream task, has emerged as a powerful technique in natural language processing (NLP).
In this paper, we explore the landscape of transfer learning techniques for NLP by introducing a unified framework that converts all text-based language problems into a text-to-text format.
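The text-to-text formulation can be illustrated with a small helper; the task prefixes mirror the examples popularized by T5, but the exact strings and the `to_text_to_text` helper itself are illustrative, not an API of the paper.

```python
# Illustrative only: every task becomes "input text -> target text", so a single
# sequence-to-sequence model and loss can cover translation, summarization, etc.
def to_text_to_text(task, **fields):
    if task == "translation":
        return (f"translate English to German: {fields['source']}", fields["target"])
    if task == "summarization":
        return (f"summarize: {fields['document']}", fields["summary"])
    if task == "classification":
        return (f"sst2 sentence: {fields['sentence']}", fields["label"])
    raise ValueError(f"unknown task: {task}")

print(to_text_to_text("translation", source="That is good.", target="Das ist gut."))
print(to_text_to_text("classification", sentence="A great movie!", label="positive"))
```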
arXiv Detail & Related papers (2019-10-23T17:37:36Z)