Related papers: Online Writer Retrieval with Chinese Handwritten Phrases: A Synergistic Temporal-Frequency Representation Learning Approach

URL: http://arxiv.org/abs/2412.11668v1
Date: Mon, 16 Dec 2024 11:19:22 GMT
Title: Online Writer Retrieval with Chinese Handwritten Phrases: A Synergistic Temporal-Frequency Representation Learning Approach
Authors: Peirong Zhang, Lianwen Jin
Abstract summary: We propose DOLPHIN, a novel retrieval model designed to enhance handwriting representations through synergistic temporal-frequency analysis. We introduce OLIWER, a large-scale online writer retrieval dataset encompassing over 670,000 Chinese handwritten phrases from 1,731 individuals. Our findings emphasize the significance of point sampling frequency and pressure features in improving handwriting representation quality.
Score: 35.50318959678818
License: http://creativecommons.org/licenses/by-nc-nd/4.0/
Abstract: Currently, the prevalence of online handwriting has spurred a critical need for effective retrieval systems to accurately search relevant handwriting instances from specific writers, known as online writer retrieval. Despite the growing demand, this field suffers from a scarcity of well-established methodologies and public large-scale datasets. This paper tackles these challenges with a focus on Chinese handwritten phrases. First, we propose DOLPHIN, a novel retrieval model designed to enhance handwriting representations through synergistic temporal-frequency analysis. For frequency feature learning, we propose the HFGA block, which performs gated cross-attention between the vanilla temporal handwriting sequence and its high-frequency sub-bands to amplify salient writing details. For temporal feature learning, we propose the CAIR block, tailored to promote channel interaction and reduce channel redundancy. Second, to address data deficit, we introduce OLIWER, a large-scale online writer retrieval dataset encompassing over 670,000 Chinese handwritten phrases from 1,731 individuals. Through extensive evaluations, we demonstrate the superior performance of DOLPHIN over existing methods. In addition, we explore cross-domain writer retrieval and reveal the pivotal role of increasing feature alignment in bridging the distributional gap between different handwriting data. Our findings emphasize the significance of point sampling frequency and pressure features in improving handwriting representation quality and retrieval performance. Code and dataset are available at https://github.com/SCUT-DLVCLab/DOLPHIN.

Related papers

PLATTER: A Page-Level Handwritten Text Recognition System for Indic Scripts [20.394597266150534]
We present an end-to-end framework for Page-Level hAndwriTTen TExt Recognition (PLATTER). Secondly, we demonstrate the usage of PLATTER to measure the performance of our language-agnostic HTD model. Finally, we release a Corpus of Handwritten Indic Scripts (CHIPS), a meticulously curated, page-level Indic handwritten OCR dataset.
arXiv Detail & Related papers (2025-02-10T05:50:26Z)
Detecting Document-level Paraphrased Machine Generated Content: Mimicking Human Writing Style and Involving Discourse Features [57.34477506004105]
Machine-generated content poses challenges such as academic plagiarism and the spread of misinformation. We introduce novel methodologies and datasets to overcome these challenges. We propose MhBART, an encoder-decoder model designed to emulate human writing style. We also propose DTransformer, a model that integrates discourse analysis through PDTB preprocessing to encode structural features.
arXiv Detail & Related papers (2024-12-17T08:47:41Z)
Learning Robust Named Entity Recognizers From Noisy Data With Retrieval Augmentation [67.89838237013078]
Named entity recognition (NER) models often struggle with noisy inputs. We propose a more realistic setting in which only noisy text and its NER labels are available. We employ a multi-view training framework that improves robust NER without retrieving text during inference.
arXiv Detail & Related papers (2024-07-26T07:30:41Z)
Cross-lingual Contextualized Phrase Retrieval [63.80154430930898]
We propose a new task formulation of dense retrieval, cross-lingual contextualized phrase retrieval. We train our Cross-lingual Contextualized Phrase Retriever (CCPR) using contrastive learning. On the phrase retrieval task, CCPR surpasses baselines by a significant margin, achieving a top-1 accuracy that is at least 13 points higher.
arXiv Detail & Related papers (2024-03-25T14:46:51Z)
Offline Detection of Misspelled Handwritten Words by Convolving Recognition Model Features with Text Labels [0.0]
We introduce the task of comparing a handwriting image to text. Our model's classification head is trained entirely on synthetic data created using a state-of-the-art generative adversarial network. Such massive performance gains can lead to significant productivity increases in applications utilizing human-in-the-loop automation.
arXiv Detail & Related papers (2023-09-18T21:13:42Z)
Uncovering the Handwritten Text in the Margins: End-to-end Handwritten Text Detection and Recognition [0.840835093659811]
This work presents an end-to-end framework for automatic detection and recognition of handwritten marginalia. It uses data augmentation and transfer learning to overcome training data scarcity. The effectiveness of the proposed framework has been empirically evaluated on the data from early book collections found in the Uppsala University Library in Sweden.
arXiv Detail & Related papers (2023-03-10T14:00:53Z)
A Few Shot Multi-Representation Approach for N-gram Spotting in Historical Manuscripts [1.2930503923129213]
We propose a few-shot learning paradigm for spotting sequences of a few characters (N-gram). We show that recognition of important n-grams could reduce the system's dependency on vocabulary.
arXiv Detail & Related papers (2022-09-21T15:35:02Z)
Letter-level Online Writer Identification [86.13203975836556]
We focus on a novel problem, letter-level online writer-id, which requires only a few trajectories of written letters as identification cues. A main challenge is that a person often writes a letter in different styles from time to time. We refer to this problem as the variance of online writing styles (Var-O-Styles).
arXiv Detail & Related papers (2021-12-06T07:21:53Z)
Sentence-level Online Handwritten Chinese Character Recognition [36.57575120082676]
Single online handwritten Chinese character recognition (single OLHCCR) has achieved prominent performance. In real application scenarios, users always write multiple Chinese characters to form one complete sentence. We propose a simple and straightforward end-to-end network, namely the vanilla compositional network (VCN), to tackle sentence-level OLHCCR. We also propose a novel deep spatial-temporal fusion network (DSTFN) to improve the robustness of sentence-level OLHCCR.
arXiv Detail & Related papers (2021-07-04T14:26:06Z)
Towards an IMU-based Pen Online Handwriting Recognizer [2.6707647984082357]
We present an online handwriting recognition system for word recognition based on inertial measurement units (IMUs). This is obtained by means of a sensor-equipped pen that provides acceleration, angular velocity, and magnetic forces streamed via Bluetooth. Our model combines convolutional and bidirectional LSTM networks, and is trained with the Connectionist Temporal Classification loss.
arXiv Detail & Related papers (2021-05-26T09:47:19Z)
SmartPatch: Improving Handwritten Word Imitation with Patch Discriminators [67.54204685189255]
We propose SmartPatch, a new technique increasing the performance of current state-of-the-art methods. We combine the well-known patch loss with information gathered from the parallel trained handwritten text recognition system. This leads to a more enhanced local discriminator and results in more realistic and higher-quality generated handwritten words.
arXiv Detail & Related papers (2021-05-21T18:34:21Z)
PGNet: Real-time Arbitrarily-Shaped Text Spotting with Point Gathering Network [54.03560668182197]
We propose a novel fully convolutional Point Gathering Network (PGNet) for reading arbitrarily-shaped text in real-time. With a PG-CTC decoder, we gather high-level character classification vectors from two-dimensional space and decode them into text symbols without NMS and RoI operations. Experiments prove that the proposed method achieves competitive accuracy, meanwhile significantly improving the running speed.
arXiv Detail & Related papers (2021-04-12T13:27:34Z)

This list is automatically generated from the titles and abstracts of the papers on this site.