Dual Branch Network Towards Accurate Printed Mathematical Expression
Recognition
- URL: http://arxiv.org/abs/2312.09030v1
- Date: Thu, 14 Dec 2023 15:30:34 GMT
- Title: Dual Branch Network Towards Accurate Printed Mathematical Expression
Recognition
- Authors: Yuqing Wang, Zhenyu Weng, Zhaokun Zhou, Shuaijian Ji, Zhongjie Ye,
Yuesheng Zhu
- Abstract summary: A Dual Branch transformer-based Network (DBN) is proposed to learn both local and global context information for accurate Printed Mathematical Expression Recognition.
Our experimental results have demonstrated that DBN can accurately recognize mathematical expressions and has achieved state-of-the-art performance.
- Score: 27.428642277844972
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Over the past years, Printed Mathematical Expression Recognition (PMER) has
progressed rapidly. However, due to the insufficient context information
captured by Convolutional Neural Networks, some mathematical symbols might be
incorrectly recognized or missed. To tackle this problem, in this paper, a Dual
Branch transformer-based Network (DBN) is proposed to learn both local and
global context information for accurate PMER. In our DBN, local and global
features are extracted simultaneously, and a Context Coupling Module (CCM) is
developed to complement the features between the global and local contexts. CCM
adopts an interactive manner so that the coupled context clues are highly
correlated to each expression symbol. Additionally, we design a Dynamic Soft
Target (DST) strategy to utilize the similarities among symbol categories for
reasonable label generation. Our experimental results have demonstrated that
DBN can accurately recognize mathematical expressions and has achieved
state-of-the-art performance.
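The abstract says the DST strategy uses similarities among symbol categories to generate labels, but it does not give the formula. As a rough illustration of the general idea only, and not the authors' method, the sketch below builds a soft target by moving a small share of probability mass from the true symbol to other symbols in proportion to a similarity score. The symbol list, the hand-written `SIMILARITY` matrix, the `smoothing` parameter, and all function names are hypothetical. The "dynamic" aspect of DST is not reproduced here, since the abstract gives no detail on it.

```python
import math

# Hypothetical three-symbol vocabulary; "x" and "\times" look alike in print.
SYMBOLS = ["x", "\\times", "+"]

# Hand-written, assumed similarity scores (not taken from the paper).
SIMILARITY = [
    [1.0, 0.8, 0.1],
    [0.8, 1.0, 0.1],
    [0.1, 0.1, 1.0],
]


def soft_targets(true_idx, similarity, smoothing=0.1):
    """Return a probability vector that keeps 1 - smoothing on the true class
    and spreads `smoothing` over the other classes by similarity."""
    n = len(similarity)
    if n < 2:
        raise ValueError("need at least two classes")
    weights = [s if j != true_idx else 0.0
               for j, s in enumerate(similarity[true_idx])]
    total = sum(weights)
    if total == 0:
        # No similar class: fall back to uniform label smoothing.
        weights = [1.0 if j != true_idx else 0.0 for j in range(n)]
        total = float(n - 1)
    target = [smoothing * w / total for w in weights]
    target[true_idx] = 1.0 - smoothing
    return target


def log_softmax(logits):
    """Numerically stable log-softmax over a list of floats."""
    m = max(logits)
    lse = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - lse for x in logits]


def cross_entropy(log_probs, target):
    """Cross-entropy between a soft target and predicted log-probabilities."""
    return -sum(t * lp for t, lp in zip(target, log_probs))


# Soft label for the true symbol "x": more mass goes to "\times" than to "+".
target = soft_targets(0, SIMILARITY, smoothing=0.1)
loss = cross_entropy(log_softmax([2.0, 1.0, -1.0]), target)
```

With such a target, confusing "x" with the visually similar "\times" is penalized less than confusing it with "+", which is one plausible reading of "reasonable label generation".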
Related papers
- Compound Expression Recognition via Multi Model Ensemble for the ABAW7 Challenge [6.26485278174662]
Compound Expression Recognition (CER) is vital for effective interpersonal interactions.
In this paper, we propose an ensemble learning-based solution to address this complexity.
Our method demonstrates high accuracy on the RAF-DB datasets and is capable of recognizing expressions in certain portions of the C-EXPR-DB through zero-shot learning.
(2024-07-17T01:59:34Z)
- Compound Expression Recognition via Multi Model Ensemble [8.529105068848828]
Compound Expression Recognition plays a crucial role in interpersonal interactions.
We propose a solution based on ensemble learning methods for Compound Expression Recognition.
Our method achieves high accuracy on RAF-DB and is able to recognize expressions through zero-shot on certain portions of C-EXPR-DB.
(2024-03-19T09:30:56Z)
- BCLNet: Bilateral Consensus Learning for Two-View Correspondence Pruning [26.400567961735234]
Correspondence pruning aims to establish reliable correspondences between two related images.
Existing approaches often employ a progressive strategy to handle the local and global contexts.
We propose a parallel context learning strategy that involves acquiring bilateral consensus for the two-view correspondence pruning task.
(2024-01-07T11:38:15Z)
- Bidirectional Trained Tree-Structured Decoder for Handwritten Mathematical Expression Recognition [51.66383337087724]
The Handwritten Mathematical Expression Recognition (HMER) task is a critical branch in the field of OCR.
Recent studies have demonstrated that incorporating bidirectional context information significantly improves the performance of HMER models.
We propose the Mirror-Flipped Symbol Layout Tree (MF-SLT) and Bidirectional Asynchronous Training (BAT) structure.
(2023-12-31T09:24:21Z)
- USER: Unified Semantic Enhancement with Momentum Contrast for Image-Text Retrieval [115.28586222748478]
Image-Text Retrieval (ITR) aims at searching for the target instances that are semantically relevant to the given query from the other modality.
Existing approaches typically suffer from two major limitations.
(2023-01-17T12:42:58Z)
- Adaptive Local-Component-aware Graph Convolutional Network for One-shot Skeleton-based Action Recognition [54.23513799338309]
We present an Adaptive Local-Component-aware Graph Convolutional Network for skeleton-based action recognition.
Our method provides a stronger representation than the global embedding and helps our model reach state-of-the-art.
(2022-09-21T02:33:07Z)
- DenseGAP: Graph-Structured Dense Correspondence Learning with Anchor Points [15.953570826460869]
Establishing dense correspondence between two images is a fundamental computer vision problem.
We introduce DenseGAP, a new solution for efficient Dense correspondence learning with a Graph-structured neural network conditioned on Anchor Points.
Our method advances the state-of-the-art of correspondence learning on most benchmarks.
(2021-12-13T18:59:30Z)
- A Unified Architecture of Semantic Segmentation and Hierarchical Generative Adversarial Networks for Expression Manipulation [52.911307452212256]
We develop a unified architecture of semantic segmentation and hierarchical GANs.
A unique advantage of our framework is that on forward pass the semantic segmentation network conditions the generative model.
We evaluate our method on two challenging facial expression translation benchmarks, AffectNet and RaFD, and a semantic segmentation benchmark, CelebAMask-HQ.
(2021-12-08T22:06:31Z)
- Multi-Level Graph Convolutional Network with Automatic Graph Learning for Hyperspectral Image Classification [63.56018768401328]
We propose a Multi-level Graph Convolutional Network (GCN) with Automatic Graph Learning method (MGCN-AGL) for HSI classification.
By employing attention mechanism to characterize the importance among spatially neighboring regions, the most relevant information can be adaptively incorporated to make decisions.
Our MGCN-AGL encodes the long range dependencies among image regions based on the expressive representations that have been produced at local level.
(2020-09-19T09:26:20Z)
- Disentangled Non-Local Neural Networks [68.92293183542131]
We study the non-local block in depth, where we find that its attention can be split into two terms.
We present the disentangled non-local block, where the two terms are decoupled to facilitate learning for both terms.
(2020-06-11T17:59:22Z)
- ENIGMA Anonymous: Symbol-Independent Inference Guiding Machine (system description) [0.4893345190925177]
We describe an implementation of gradient boosting and neural guidance of saturation-style automated theorem provers.
For the gradient-boosting guidance, we manually create abstracted features by considering arity-based encodings of formulas.
For the neural guidance, we use symbol-independent graph neural networks (GNNs) and their embedding of the terms and clauses.
(2020-02-13T09:44:38Z)