Syntax-Aware Network for Handwritten Mathematical Expression Recognition
- URL: http://arxiv.org/abs/2203.01601v2
- Date: Sat, 5 Mar 2022 07:38:15 GMT
- Title: Syntax-Aware Network for Handwritten Mathematical Expression Recognition
- Authors: Ye Yuan, Xiao Liu, Wondimu Dikubab, Hui Liu, Zhilong Ji, Zhongqin Wu,
Xiang Bai
- Abstract summary: Handwritten mathematical expression recognition (HMER) is a challenging task that has many potential applications.
Recent methods for HMER have achieved outstanding performance with an encoder-decoder architecture.
We propose a simple and efficient method for HMER, which is the first to incorporate syntax information into an encoder-decoder network.
- Score: 53.130826547287626
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Handwritten mathematical expression recognition (HMER) is a challenging task
that has many potential applications. Recent methods for HMER have achieved
outstanding performance with an encoder-decoder architecture. However, these
methods adhere to the paradigm that the prediction is made "from one character
to another", which inevitably yields prediction errors due to the complicated
structures of mathematical expressions or crabbed handwritings. In this paper,
we propose a simple and efficient method for HMER, which is the first to
incorporate syntax information into an encoder-decoder network. Specifically,
we present a set of grammar rules for converting the LaTeX markup sequence of
each expression into a parsing tree; then, we model the markup sequence
prediction as a tree traverse process with a deep neural network. In this way,
the proposed method can effectively describe the syntax context of expressions,
avoiding the structure prediction errors of HMER. Experiments on two benchmark
datasets demonstrate that our method achieves significantly better recognition
performance than prior arts. To further validate the effectiveness of our
method, we create a large-scale dataset consisting of 100k handwritten
mathematical expression images acquired from ten thousand writers. The source
code, new dataset, and pre-trained models of this work will be publicly
available.
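As a rough illustration of the abstract's central idea, the sketch below converts a toy LaTeX token sequence into a parsing tree and then re-emits the markup by traversing that tree. The grammar here (covering only \frac, scripts, and braces) and all function names are assumptions made for illustration; they are not the grammar rules or the neural decoder described in the paper.

# Minimal sketch (illustrative assumption, not the paper's grammar rules):
# parse a toy LaTeX token sequence into a tree, then re-emit the markup by
# traversing the tree depth-first.

class Node:
    def __init__(self, symbol, children=None):
        self.symbol = symbol
        self.children = children or []

def parse(tokens, i=0):
    """Parse tokens[i:] into a list of sibling nodes; stops at '}' or end."""
    nodes = []
    while i < len(tokens) and tokens[i] != "}":
        tok = tokens[i]
        if tok == "\\frac":                  # \frac {numerator} {denominator}
            num, i = parse_group(tokens, i + 1)
            den, i = parse_group(tokens, i)
            nodes.append(Node("\\frac", [num, den]))
        elif tok in ("^", "_"):              # script attaches to the preceding symbol
            arg, i = parse_group(tokens, i + 1)
            nodes[-1].children.append(Node(tok, [arg]))
        else:                                # plain symbol
            nodes.append(Node(tok))
            i += 1
    return nodes, i

def parse_group(tokens, i):
    """Parse a '{ ... }' group (or a single token) starting at position i."""
    if tokens[i] == "{":
        children, i = parse(tokens, i + 1)
        return Node("{}", children), i + 1   # skip the closing '}'
    return Node(tokens[i]), i + 1

def emit(node):
    """Depth-first traversal that re-emits the LaTeX token sequence."""
    if node.symbol == "{}":
        return ["{"] + [t for c in node.children for t in emit(c)] + ["}"]
    if node.symbol == "\\frac":
        return ["\\frac"] + [t for c in node.children for t in emit(c)]
    out = [node.symbol]                      # symbol plus any attached scripts
    for script in node.children:
        out += [script.symbol] + [t for c in script.children for t in emit(c)]
    return out

# \frac{x^{2}}{y} as a token sequence
tokens = ["\\frac", "{", "x", "^", "{", "2", "}", "}", "{", "y", "}"]
tree, _ = parse(tokens)
assert [t for n in tree for t in emit(n)] == tokens

In the paper's setting, the traversal order would be produced step by step by a neural decoder rather than read back from a finished tree; the sketch only shows how a markup sequence and a parse-tree traversal can correspond to each other.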
Related papers
- On Eliciting Syntax from Language Models via Hashing [19.872554909401316]
Unsupervised parsing aims to infer syntactic structure from raw text.
In this paper, we explore the possibility of leveraging pre-trained language models to deduce parsing trees from raw text.
We show that our method is effective and efficient enough to acquire high-quality parsing trees from pre-trained language models at a low cost.
arXiv Detail & Related papers (2024-10-05T08:06:19Z)
- NAMER: Non-Autoregressive Modeling for Handwritten Mathematical Expression Recognition [80.22784377150465]
Handwritten Mathematical Expression Recognition (HMER) has gained considerable attention in pattern recognition for its diverse applications in document understanding.
This paper makes the first attempt to build a novel bottom-up Non-AutoRegressive Modeling approach for HMER, called NAMER.
NAMER comprises a Visual Aware Tokenizer (VAT) and a Parallel Graph Decoder (PGD).
arXiv Detail & Related papers (2024-07-16T04:52:39Z)
- PosFormer: Recognizing Complex Handwritten Mathematical Expression with Position Forest Transformer [51.260384040953326]
Handwritten Mathematical Expression Recognition (HMER) has wide applications in human-machine interaction scenarios.
We propose a position forest transformer (PosFormer) for HMER, which jointly optimizes two tasks: expression recognition and position recognition.
PosFormer consistently outperforms state-of-the-art methods, with reported gains of 2.03%/1.22%/2, 1.83%, and 4.62% on benchmark datasets.
arXiv Detail & Related papers (2024-07-10T15:42:58Z)
- ICAL: Implicit Character-Aided Learning for Enhanced Handwritten Mathematical Expression Recognition [9.389169879626428]
This paper introduces a novel approach, Implicit Character-Aided Learning (ICAL), to mine the global expression information.
By modeling and utilizing implicit character information, ICAL achieves a more accurate and context-aware interpretation of handwritten mathematical expressions.
arXiv Detail & Related papers (2024-05-15T02:03:44Z)
- Spatial Attention and Syntax Rule Enhanced Tree Decoder for Offline Handwritten Mathematical Expression Recognition [12.656673677551778]
We propose a novel model called Spatial Attention and Syntax Rule Enhanced Tree Decoder (SS-TD)
Our model can effectively describe tree structure and increase the accuracy of output expression.
Experiments show that SS-TD achieves better recognition performance than prior models on CROHME 14/16/19 datasets.
arXiv Detail & Related papers (2023-03-13T12:59:53Z)
- A Transformer Architecture for Online Gesture Recognition of Mathematical Expressions [0.0]
A Transformer architecture is shown to provide an end-to-end model for building expression trees from online handwritten gestures corresponding to glyph strokes.
The attention mechanism was successfully used to encode, learn and enforce the underlying syntax of expressions.
For the first time, the encoder is fed with unseen online-temporal data tokens potentially forming an infinitely large vocabulary.
arXiv Detail & Related papers (2022-11-04T17:55:55Z)
- When Counting Meets HMER: Counting-Aware Network for Handwritten Mathematical Expression Recognition [57.51793420986745]
We propose an unconventional network for handwritten mathematical expression recognition (HMER) named Counting-Aware Network (CAN).
We design a weakly-supervised counting module that predicts the number of each symbol class without symbol-level position annotations; a minimal sketch of this kind of count-based weak supervision is given after this list.
Experiments on the benchmark datasets for HMER validate that both joint optimization and counting results are beneficial for correcting the prediction errors of encoder-decoder models.
arXiv Detail & Related papers (2022-07-23T08:39:32Z)
- Fine-Grained Visual Entailment [51.66881737644983]
We propose an extension of this task, where the goal is to predict the logical relationship of fine-grained knowledge elements within a piece of text to an image.
Unlike prior work, our method is inherently explainable and makes logical predictions at different levels of granularity.
We evaluate our method on a new dataset of manually annotated knowledge elements and show that our method achieves 68.18% accuracy at this challenging task.
arXiv Detail & Related papers (2022-03-29T16:09:38Z)
- Lexically Aware Semi-Supervised Learning for OCR Post-Correction [90.54336622024299]
Much of the existing linguistic data in many languages of the world is locked away in non-digitized books and documents.
Previous work has demonstrated the utility of neural post-correction methods on recognition of less-well-resourced languages.
We present a semi-supervised learning method that makes it possible to utilize raw images to improve performance.
arXiv Detail & Related papers (2021-11-04T04:39:02Z)
- Unsupervised Training Data Generation of Handwritten Formulas using Generative Adversarial Networks with Self-Attention [3.785514121306353]
We introduce a system that creates a large set of synthesized training examples of mathematical expressions which are derived from documents.
For this purpose, we propose a novel attention-based generative adversarial network to translate rendered equations to handwritten formulas.
The datasets generated by this approach contain hundreds of thousands of formulas, making it ideal for pretraining or the design of more complex models.
arXiv Detail & Related papers (2021-06-17T12:27:18Z)
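As referenced in the Counting-Aware Network entry above, count-based weak supervision can be sketched as follows. The vocabulary, label sequence, and variable names are illustrative assumptions rather than CAN's actual configuration; the point is only that a counting target can be derived from the LaTeX label alone, without symbol-level position annotations.

# Minimal sketch of count-based weak supervision (illustrative assumption,
# not CAN's exact formulation): the counting target is obtained by counting
# symbols in the LaTeX label sequence, so no position annotations are needed.
from collections import Counter

vocab = ["x", "y", "2", "+", "^", "\\frac", "{", "}"]            # toy symbol set
label = ["\\frac", "{", "x", "^", "{", "2", "}", "}", "{", "y", "}"]

counts = Counter(label)
# Dense count vector over the vocabulary, usable as the regression target of
# an auxiliary counting head trained jointly with the recognition decoder.
count_target = [float(counts[s]) for s in vocab]
print(count_target)   # [1.0, 1.0, 1.0, 0.0, 1.0, 1.0, 3.0, 3.0]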