A Transformer-based Math Language Model for Handwritten Math Expression
Recognition
- URL: http://arxiv.org/abs/2108.05002v1
- Date: Wed, 11 Aug 2021 03:03:48 GMT
- Title: A Transformer-based Math Language Model for Handwritten Math Expression
Recognition
- Authors: Huy Quang Ung, Cuong Tuan Nguyen, Hung Tuan Nguyen, Thanh-Nghia Truong
and Masaki Nakagawa
- Abstract summary: Several math symbols are very similar in writing style, such as the dot and comma, or 0, O, and o.
This paper presents a Transformer-based Math Language Model (TMLM).
TMLM achieved a perplexity of 4.42, outperforming previous math language models.
- Score: 7.202733269706245
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Handwritten mathematical expressions (HMEs) contain ambiguities in their
interpretations, sometimes even for humans. Several math symbols are very
similar in writing style, such as the dot and comma, or 0, O, and o, which is a
challenge for HME recognition systems to handle without contextual
information. To address this problem, this paper presents a Transformer-based
Math Language Model (TMLM). Based on the self-attention mechanism, the
high-level representation of an input token in a sequence of tokens is computed
from how it relates to the previous tokens. Thus, TMLM can capture long-range
dependencies and correlations among symbols and relations in a mathematical
expression (ME). We trained the proposed language model on a corpus of
approximately 70,000 LaTeX sequences provided in CROHME 2016. TMLM achieved a
perplexity of 4.42, outperforming the previous math language models, i.e.,
N-gram and recurrent neural network-based language models. In addition, we
combined TMLM with a stochastic context-free grammar-based HME recognition
system, using a weighting parameter to re-rank the top-10 candidates. The
expression recognition rates on the test sets of CROHME 2016 and CROHME 2019
improved by 2.97 and 0.83 percentage points, respectively.
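As a rough illustration of the perplexity figure quoted above, here is a minimal Python sketch, not the authors' code: a causal language model predicts each token from the tokens before it, and perplexity is the exponentiated average negative log-likelihood. The `next_token_prob` function is a hypothetical stand-in for the Transformer's softmax output.

```python
import math

def next_token_prob(prefix, token):
    """Placeholder for p(token | prefix) from a trained causal LM.
    A real TMLM would run masked self-attention over `prefix` here."""
    vocab = ["\\frac", "{", "}", "x", "2", "<eos>"]
    return 1.0 / len(vocab) if token in vocab else 1e-9  # toy uniform model

def perplexity(tokens):
    """PPL = exp(-(1/N) * sum_i log p(x_i | x_<i)); lower is better."""
    total = sum(math.log(next_token_prob(tokens[:i], t))
                for i, t in enumerate(tokens))
    return math.exp(-total / len(tokens))

# A LaTeX sequence tokenized the way CROHME-style corpora are:
print(perplexity(["\\frac", "{", "x", "}", "{", "2", "}", "<eos>"]))  # 6.0
```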
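The integration step can be sketched in the same spirit. The following is a hedged illustration, assuming a linear combination and names (`LAMBDA`, `lm_log_prob`) that are not from the paper; the abstract only states that a weighting parameter is used to re-rank the recognizer's top-10 candidates.

```python
LAMBDA = 0.7  # hypothetical weight; in practice tuned on a validation set

def lm_log_prob(tokens):
    """Placeholder for TMLM's sequence log-probability (see sketch above)."""
    return -1.5 * len(tokens)  # toy constant-rate score so the example runs

def rerank(candidates):
    """candidates: list of (token_list, recognition_log_score) pairs
    from the SCFG-based recognizer; returns the best combined candidate."""
    return max(
        candidates,
        key=lambda c: LAMBDA * c[1] + (1.0 - LAMBDA) * lm_log_prob(c[0]),
    )

top10 = [(["x", "+", "y"], -2.1), (["x", "+", "v"], -2.0)]
print(rerank(top10))  # the weighted sum decides between close candidates
```

Blending the two scores lets contextual likelihood from the language model break ties between visually confusable symbols such as 0, O, and o.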
Related papers
- NAMER: Non-Autoregressive Modeling for Handwritten Mathematical Expression Recognition [80.22784377150465] (2024-07-16)
Handwritten Mathematical Expression Recognition (HMER) has gained considerable attention in pattern recognition for its diverse applications in document understanding.
This paper makes the first attempt to build a novel bottom-up Non-AutoRegressive Modeling approach for HMER, called NAMER.
NAMER comprises a Visual Aware Tokenizer (VAT) and a Parallel Graph Decoder (PGD).
- PosFormer: Recognizing Complex Handwritten Mathematical Expression with Position Forest Transformer [51.260384040953326] (2024-07-10)
Handwritten Mathematical Expression Recognition (HMER) has wide applications in human-machine interaction scenarios.
We propose a position forest transformer (PosFormer) for HMER, which jointly optimizes two tasks: expression recognition and position recognition.
PosFormer consistently outperforms state-of-the-art methods, with gains of 2.03%/1.22%/2.00%, 1.83%, and 4.62% across the evaluated datasets.
- CosyVoice: A Scalable Multilingual Zero-shot Text-to-speech Synthesizer based on Supervised Semantic Tokens [49.569695524535454] (2024-07-07)
We propose to represent speech with supervised semantic tokens, which are derived from a multilingual speech recognition model by inserting vector quantization into the encoder.
Based on the tokens, we further propose a scalable zero-shot TTS synthesizer, CosyVoice, which consists of an LLM for text-to-token generation and a conditional flow matching model for token-to-speech synthesis.
- ICAL: Implicit Character-Aided Learning for Enhanced Handwritten Mathematical Expression Recognition [9.389169879626428] (2024-05-15)
This paper introduces a novel approach, Implicit Character-Aided Learning (ICAL), to mine the global expression information.
By modeling and utilizing implicit character information, ICAL achieves a more accurate and context-aware interpretation of handwritten mathematical expressions.
- Which Syntactic Capabilities Are Statistically Learned by Masked Language Models for Code? [51.29970742152668] (2024-01-03)
We highlight that relying on accuracy-based measurements may lead to an overestimation of models' capabilities.
To address these issues, we introduce a technique called SyntaxEval for probing the syntactic capabilities of such models.
- Exploring Equation as a Better Intermediate Meaning Representation for Numerical Reasoning [53.2491163874712] (2023-08-21)
We use equations as IMRs to solve the numerical reasoning task.
We present a method called Boosting Numerical Reasoning by Decomposing the Generation of Equations (Bridge).
Our method improves performance by 2.2%, 0.9%, and 1.7% on the GSM8K, SVAMP, and Algebra datasets.
- Offline Handwritten Mathematical Recognition using Adversarial Learning and Transformers [3.9220281834178463] (2022-08-20)
Offline HMER is often viewed as a much harder problem than online HMER.
In this paper, we propose an encoder-decoder model that uses paired adversarial learning.
We improve the latest CROHME 2019 test set results by approximately 4%.
- Syntax-Aware Network for Handwritten Mathematical Expression Recognition [53.130826547287626] (2022-03-03)
Handwritten mathematical expression recognition (HMER) is a challenging task that has many potential applications.
Recent methods for HMER have achieved outstanding performance with an encoder-decoder architecture.
We propose a simple and efficient method for HMER, which is the first to incorporate syntax information into an encoder-decoder network.
- Mathematical Word Problem Generation from Commonsense Knowledge Graph and Equations [27.063577644162358] (2020-10-13)
We develop an end-to-end neural model to generate diverse math word problems (MWPs) in real-world scenarios from a commonsense knowledge graph and equations.
The proposed model learns both representations from edge-enhanced Levi graphs of symbolic equations and commonsense knowledge.
Experiments on an educational gold-standard set and a large-scale generated MWP set show that our approach is superior on the MWP generation task.
This list is automatically generated from the titles and abstracts of the papers on this site.