Learning symbol relation tree for online mathematical expression
recognition
- URL: http://arxiv.org/abs/2105.06084v1
- Date: Thu, 13 May 2021 05:18:17 GMT
- Title: Learning symbol relation tree for online mathematical expression
recognition
- Authors: Thanh-Nghia Truong, Hung Tuan Nguyen, Cuong Tuan Nguyen and Masaki
Nakagawa
- Abstract summary: This paper proposes a method for recognizing online handwritten mathematical expressions (OnHME) by building a symbol relation tree (SRT) directly from a sequence of strokes.
A bidirectional recurrent neural network learns from multiple derived paths of SRT to predict both symbols and spatial relations between symbols using global context.
The recognition system achieves 44.12% and 41.76% expression recognition rates on the Competition on Recognition of Online Handwritten Mathematical Expressions (CROHME) 2014 and 2016 testing sets.
- Score: 7.868468656324007
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: This paper proposes a method for recognizing online handwritten mathematical
expressions (OnHME) by building a symbol relation tree (SRT) directly from a
sequence of strokes. A bidirectional recurrent neural network learns from
multiple derived paths of SRT to predict both symbols and spatial relations
between symbols using global context. The recognition system has two parts: a
temporal classifier and a tree connector. The temporal classifier produces an
SRT by recognizing an OnHME pattern. The tree connector splits the SRT into
several sub-SRTs. The final SRT is formed by looking up the best combination
among those sub-SRTs. In addition, we adopt a tree sorting method to deal with
various stroke orders. Recognition experiments indicate that the proposed OnHME
recognition system is competitive with other methods. The recognition system
achieves 44.12% and 41.76% expression recognition rates on the Competition on
Recognition of Online Handwritten Mathematical Expressions (CROHME) 2014 and
2016 testing sets.
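
To make the central data structure concrete, here is a minimal Python sketch of a symbol relation tree and of reading it back as a LaTeX string. It is an illustration under stated assumptions, not the authors' implementation: the names `SRTNode` and `to_latex` are hypothetical, the relation labels `Right`, `Sup`, and `Sub` are assumed, and other relations are deliberately left unsupported.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class SRTNode:
    """One recognized symbol plus its labeled edges to child symbols.

    Hypothetical structure for illustration; the relation labels used
    below ("Right", "Sup", "Sub") are assumptions of this sketch.
    """
    symbol: str
    children: List[Tuple[str, "SRTNode"]] = field(default_factory=list)

    def add(self, relation: str, child: "SRTNode") -> "SRTNode":
        """Attach a child under the given spatial relation and return it."""
        self.children.append((relation, child))
        return child


def to_latex(node: SRTNode) -> str:
    """Serialize the tree depth-first into a LaTeX string.

    Superscripts and subscripts are emitted before the symbol that
    follows on the baseline ("Right"), so scripts bind to their base.
    """
    out = node.symbol
    right = None
    for relation, child in node.children:
        if relation == "Sup":
            out += "^{" + to_latex(child) + "}"
        elif relation == "Sub":
            out += "_{" + to_latex(child) + "}"
        elif relation == "Right":
            right = child
        else:
            # Other relations are outside the scope of this sketch.
            raise ValueError(f"unsupported relation: {relation}")
    if right is not None:
        out += to_latex(right)
    return out


# Build the tree for x^2 + y_i by hand.
root = SRTNode("x")
root.add("Sup", SRTNode("2"))
plus = root.add("Right", SRTNode("+"))
y = plus.add("Right", SRTNode("y"))
y.add("Sub", SRTNode("i"))

print(to_latex(root))  # x^{2}+y_{i}
```

In the system described in the abstract, such a tree is predicted from the stroke sequence by the temporal classifier and tree connector. Here it is built by hand only to show what the output structure encodes.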
Related papers
- Complex Event Recognition with Symbolic Register Transducers: Extended Technical Report [51.86861492527722]
We present a system for Complex Event Recognition based on automata.
Our system is based on an automaton model which is a combination of symbolic and register automata.
We show how SRT can be used in CER in order to detect patterns upon streams of events.
arXiv Detail & Related papers (2024-07-03T07:59:13Z) - Scope-enhanced Compositional Semantic Parsing for DRT [52.657454970993086]
We introduce the AMS parser, a compositional, neurosymbolic semantic parser for Discourse Representation Theory (DRT).
We show that the AMS reliably produces well-formed outputs and performs well on DRT parsing, especially on complex sentences.
arXiv Detail & Related papers (2024-07-02T02:50:15Z) - Dual Branch Network Towards Accurate Printed Mathematical Expression
Recognition [27.428642277844972]
A Dual Branch transformer-based Network (DBN) is proposed to learn both local and global context information for accurate Printed Mathematical Expression Recognition.
Our experimental results have demonstrated that DBN can accurately recognize mathematical expressions and has achieved state-of-the-art performance.
arXiv Detail & Related papers (2023-12-14T15:30:34Z) - Semantic Graph Representation Learning for Handwritten Mathematical
Expression Recognition [57.60390958736775]
We propose a simple but efficient method to enhance semantic interaction learning (SIL).
We first construct a semantic graph based on the statistical symbol co-occurrence probabilities.
Then we design a semantic aware module (SAM), which projects the visual and classification feature into semantic space.
Our method achieves better recognition performance than prior art on both CROHME and HME100K datasets.
arXiv Detail & Related papers (2023-08-21T06:23:41Z) - Graph Neural Networks for Contextual ASR with the Tree-Constrained
Pointer Generator [9.053645441056256]
This paper proposes an innovative method for achieving end-to-end contextual ASR using graph neural network (GNN) encodings.
GNN encodings facilitate lookahead for future word pieces in the process of ASR decoding at each tree node.
The performance of the systems was evaluated using the Librispeech and AMI corpus, following the visual-grounded contextual ASR pipeline.
arXiv Detail & Related papers (2023-05-30T08:20:58Z) - Global Context for improving recognition of Online Handwritten
Mathematical Expressions [7.868468656324007]
We present a temporal classification method for online handwritten mathematical expressions (HMEs).
The method benefits from global context of a deep bidirectional Long Short-term Memory network.
To recognize an online HME, a symbol-level parse tree with Context-Free Grammar is constructed.
arXiv Detail & Related papers (2021-05-21T06:39:47Z) - Knowledge Distillation By Sparse Representation Matching [107.87219371697063]
We propose Sparse Representation Matching (SRM) to transfer intermediate knowledge from one Convolutional Neural Network (CNN) to another by utilizing sparse representation.
We formulate SRM as a neural processing block, which can be efficiently optimized using gradient descent and integrated into any CNN in a plug-and-play manner.
Our experiments demonstrate that SRM is robust to architectural differences between the teacher and student networks, and outperforms other KD techniques across several datasets.
arXiv Detail & Related papers (2021-03-31T11:47:47Z) - Syntactic representation learning for neural network based TTS with
syntactic parse tree traversal [49.05471750563229]
We propose a syntactic representation learning method based on syntactic parse tree to automatically utilize the syntactic structure information.
Experimental results demonstrate the effectiveness of our proposed approach.
For sentences with multiple syntactic parse trees, prosodic differences can be clearly perceived in the synthesized speech.
arXiv Detail & Related papers (2020-12-13T05:52:07Z) - Emotion recognition by fusing time synchronous and time asynchronous
representations [17.26466867595571]
A novel two-branch neural network model structure is proposed for multimodal emotion recognition.
It consists of a time synchronous branch (TSB) and a time asynchronous branch (TAB).
The two-branch structure achieves state-of-the-art results in 4-way classification with all common test setups.
arXiv Detail & Related papers (2020-10-27T07:14:31Z) - MetricUNet: Synergistic Image- and Voxel-Level Learning for Precise CT
Prostate Segmentation via Online Sampling [66.01558025094333]
We propose a two-stage framework, with the first stage to quickly localize the prostate region and the second stage to precisely segment the prostate.
We introduce a novel online metric learning module through voxel-wise sampling in the multi-task network.
Our method can effectively learn more representative voxel-level features compared with the conventional learning methods with cross-entropy or Dice loss.
arXiv Detail & Related papers (2020-05-15T10:37:02Z)