A Symbolic Character-Aware Model for Solving Geometry Problems
- URL: http://arxiv.org/abs/2308.02823v1
- Date: Sat, 5 Aug 2023 08:56:55 GMT
- Title: A Symbolic Character-Aware Model for Solving Geometry Problems
- Authors: Maizhen Ning, Qiu-Feng Wang, Kaizhu Huang, Xiaowei Huang
- Abstract summary: In the text description, symbolic characters such as "$\triangle$ABC" often serve as a bridge to connect the corresponding diagram.
We develop a symbolic character-aware model to fully explore the role of these characters in both text and diagram understanding.
- Score: 18.68829580108664
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: AI has made significant progress in solving math problems, but geometry
problems remain challenging due to their reliance on both text and diagrams. In
the text description, symbolic characters such as "$\triangle$ABC" often serve
as a bridge to connect the corresponding diagram. However, by simply tokenizing
symbolic characters into individual letters (e.g., 'A', 'B' and 'C'), existing
works fail to study them explicitly and thus lose the semantic relationship
with the diagram. In this paper, we develop a symbolic character-aware model to
fully explore the role of these characters in both text and diagram
understanding and optimize the model under a multi-modal reasoning framework.
In the text encoder, we propose merging individual symbolic characters to form
one semantic unit along with geometric information from the corresponding
diagram. For the diagram encoder, we pre-train it under a multi-label
classification framework with the symbolic characters as labels. In addition,
we enhance the geometry diagram understanding ability via a self-supervised
learning method under the masked image modeling auxiliary task. By integrating
the proposed model into a general encoder-decoder pipeline for solving geometry
problems, we demonstrate its superiority on two benchmark datasets, including
GeoQA and Geometry3K, with extensive experiments. Specifically, on GeoQA, the
question-solving accuracy is increased from 60.0\% to 64.1\%, achieving a new
state-of-the-art accuracy; on Geometry3K, we reduce the average number of
solving steps per question from 6.9 down to 6.0 with marginally higher solving accuracy.
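The tokenization issue described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the regular expression, the function names (`split_letters`, `merge_symbols`, `point_labels`), and the 26-letter label space are all assumptions made for illustration. It covers only the merging step and one plausible way to derive multi-label targets from the text, not the fusion with geometric information from the diagram.

```python
import re

# Assumption: a symbolic character group is a run of uppercase letters
# (e.g. "ABC"); LaTeX markers such as \triangle are kept as single tokens.
TOKEN = re.compile(r"\\[a-z]+|[A-Za-z][a-z]+|[A-Z]+|\d+(?:\.\d+)?|\S")


def split_letters(text):
    """Baseline behaviour: uppercase runs are broken into individual letters."""
    out = []
    for tok in TOKEN.findall(text):
        if tok.isalpha() and tok.isupper():
            out.extend(tok)  # "ABC" -> "A", "B", "C"
        else:
            out.append(tok)
    return out


def merge_symbols(text):
    """Sketch of merging: each uppercase run stays one semantic unit."""
    return TOKEN.findall(text)


def point_labels(text):
    """Hypothetical multi-label target: which point letters A-Z occur."""
    present = {
        ch
        for tok in TOKEN.findall(text)
        if tok.isalpha() and tok.isupper()
        for ch in tok
    }
    return [1 if chr(ord("A") + i) in present else 0 for i in range(26)]


if __name__ == "__main__":
    text = r"In \triangle ABC, AB = 5"
    print(split_letters(text))
    print(merge_symbols(text))
    print(point_labels(text)[:4])
```

With merging, "ABC" and "AB" each remain one token that can be tied to a region of the diagram, whereas the letter-level baseline yields five unrelated tokens "A", "B", "C", "A", "B".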
Related papers
- Diagram Formalization Enhanced Multi-Modal Geometry Problem Solver [11.69164802295844]
We introduce a new framework that integrates visual features, geometric formal language, and natural language representations.
We propose a novel synthetic data approach and create a large-scale geometric dataset, SynthGeo228K, annotated with both formal and natural language captions.
Our framework improves MLLMs' ability to process geometric diagrams and extends their application to open-ended tasks on the formalgeo7k dataset.
arXiv Detail & Related papers (2024-09-06T12:11:06Z)
- Fuse, Reason and Verify: Geometry Problem Solving with Parsed Clauses from Diagram [78.79651421493058]
We propose a neural-symbolic model for plane geometry problem solving (PGPS) with three key steps: modal fusion, reasoning process and knowledge verification.
For reasoning, we design an explicable solution program to describe the geometric reasoning process, and employ a self-limited decoder to generate solution program autoregressively.
We also construct a large-scale geometry problem dataset called PGPS9K, containing fine-grained annotations of textual clauses, solution program and involved knowledge solvers.
arXiv Detail & Related papers (2024-07-10T02:45:22Z)
- DiagrammerGPT: Generating Open-Domain, Open-Platform Diagrams via LLM Planning [62.51232333352754]
Text-to-image (T2I) generation has seen significant growth over the past few years.
Despite this, there has been little work on generating diagrams with T2I models.
We present DiagrammerGPT, a novel two-stage text-to-diagram generation framework.
We show that our framework produces more accurate diagrams, outperforming existing T2I models.
arXiv Detail & Related papers (2023-10-18T17:37:10Z)
- Heterogeneous Line Graph Transformer for Math Word Problems [21.4761673982334]
This paper describes the design and implementation of a new machine learning model for online learning systems.
We aim at improving the intelligent level of the systems by enabling an automated math word problem solver.
arXiv Detail & Related papers (2022-08-11T05:27:05Z)
- GAT-CADNet: Graph Attention Network for Panoptic Symbol Spotting in CAD Drawings [0.0]
Spotting graphical symbols from the computer-aided design (CAD) drawings is essential to many industrial applications.
By treating each CAD drawing as a graph, we propose a novel graph attention network GAT-CADNet.
The proposed GAT-CADNet is intuitive yet effective and manages to solve the panoptic symbol spotting problem in one consolidated network.
arXiv Detail & Related papers (2022-01-03T13:08:28Z)
- IconQA: A New Benchmark for Abstract Diagram Understanding and Visual Language Reasoning [132.49090098391258]
We introduce a new challenge of Icon Question Answering (IconQA) with the goal of answering a question in an icon image context.
We release IconQA, a large-scale dataset that consists of 107,439 questions and three sub-tasks: multi-image-choice, multi-text-choice, and filling-in-the-blank.
We further release an icon dataset Icon645 which contains 645,687 colored icons on 377 classes.
arXiv Detail & Related papers (2021-10-25T18:52:26Z)
- GeoQA: A Geometric Question Answering Benchmark Towards Multimodal Numerical Reasoning [172.36214872466707]
We focus on solving geometric problems, which requires a comprehensive understanding of textual descriptions, visual diagrams, and theorem knowledge.
We propose a Geometric Question Answering dataset GeoQA, containing 5,010 geometric problems with corresponding annotated programs.
arXiv Detail & Related papers (2021-05-30T12:34:17Z)
- Inter-GPS: Interpretable Geometry Problem Solving with Formal Language and Symbolic Reasoning [123.06420835072225]
We construct a new large-scale benchmark, Geometry3K, consisting of 3,002 geometry problems with dense annotation in formal language.
We propose a novel geometry solving approach with formal language and symbolic reasoning, called Interpretable Geometry Problem Solver (Inter-GPS).
Inter-GPS incorporates theorem knowledge as conditional rules and performs symbolic reasoning step by step.
arXiv Detail & Related papers (2021-05-10T07:46:55Z)
- TextRay: Contour-based Geometric Modeling for Arbitrary-shaped Scene Text Detection [20.34326396800748]
We propose an arbitrary-shaped text detection method, namely TextRay, which conducts top-down contour-based geometric modeling and geometric parameter learning.
Experiments on several benchmark datasets demonstrate the effectiveness of the proposed approach.
arXiv Detail & Related papers (2020-08-11T16:52:10Z)
- Machine Number Sense: A Dataset of Visual Arithmetic Problems for Abstract and Relational Reasoning [95.18337034090648]
We propose a dataset, Machine Number Sense (MNS), consisting of visual arithmetic problems automatically generated using a grammar model, the And-Or Graph (AOG).
These visual arithmetic problems are in the form of geometric figures.
We benchmark the MNS dataset using four predominant neural network models as baselines in this visual reasoning task.
arXiv Detail & Related papers (2020-04-25T17:14:58Z)
This list is automatically generated from the titles and abstracts of the papers in this site.