GeoQA: A Geometric Question Answering Benchmark Towards Multimodal
Numerical Reasoning
- URL: http://arxiv.org/abs/2105.14517v1
- Date: Sun, 30 May 2021 12:34:17 GMT
- Title: GeoQA: A Geometric Question Answering Benchmark Towards Multimodal
Numerical Reasoning
- Authors: Jiaqi Chen, Jianheng Tang, Jinghui Qin, Xiaodan Liang, Lingbo Liu,
Eric P. Xing, Liang Lin
- Abstract summary: We focus on solving geometric problems, which requires a comprehensive understanding of textual descriptions, visual diagrams, and theorem knowledge.
We propose a Geometric Question Answering dataset GeoQA, containing 5,010 geometric problems with corresponding annotated programs.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Automatic math problem solving has recently attracted increasing attention as
a long-standing AI benchmark. In this paper, we focus on solving geometric
problems, which requires a comprehensive understanding of textual descriptions,
visual diagrams, and theorem knowledge. However, the existing methods were
highly dependent on handcrafted rules and were evaluated only on small-scale
datasets. Therefore, we propose a Geometric Question Answering dataset GeoQA,
containing 5,010 geometric problems with corresponding annotated programs,
which illustrate the solving process of the given problems. Compared with
another publicly available dataset GeoS, GeoQA is 25 times larger, in which the
program annotations can provide a practical testbed for future research on
explicit and explainable numerical reasoning. Moreover, we introduce a Neural
Geometric Solver (NGS) to address geometric problems by comprehensively parsing
multimodal information and generating interpretable programs. We further add
multiple self-supervised auxiliary tasks on NGS to enhance cross-modal semantic
representation. Extensive experiments on GeoQA validate the effectiveness of
our proposed NGS and auxiliary tasks. However, the results are still
significantly lower than human performance, which leaves large room for future
research. Our benchmark and code are released at
https://github.com/chen-judge/GeoQA .
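To make the idea of an annotated program that "illustrates the solving process" concrete, here is a minimal sketch of how such a program might be executed. It is not taken from the GeoQA release: the operator names, the `N<i>`/`V<i>` argument convention, and the `run_program` helper are all assumptions made up for illustration.

```python
import math

# Hypothetical operator table. These names are illustrative only and are
# NOT claimed to match the operator set used in the GeoQA annotations.
OPS = {
    "add": lambda a, b: a + b,
    "minus": lambda a, b: a - b,
    "multiply": lambda a, b: a * b,
    "divide": lambda a, b: a / b,
    "pythagoras_hyp": lambda a, b: math.hypot(a, b),
}


def run_program(program, numbers):
    """Execute a list of (op, arg, arg) steps and return the last result.

    Assumed argument convention (hypothetical):
      "N<i>" -> i-th number extracted from the problem text
      "V<i>" -> result of the i-th earlier step
      int/float -> literal constant
    """
    results = []

    def resolve(token):
        if isinstance(token, (int, float)):
            return float(token)
        kind, index = token[0], int(token[1:])
        if kind == "N":
            return numbers[index]
        if kind == "V":
            return results[index]
        raise ValueError(f"unknown argument token: {token!r}")

    for op, *args in program:
        results.append(OPS[op](*(resolve(a) for a in args)))
    return results[-1]


# Toy problem: two angles of a triangle are 50 and 60 degrees; find the third.
program = [("add", "N0", "N1"), ("minus", 180, "V0")]
answer = run_program(program, [50.0, 60.0])
print(answer)
```

Because every intermediate value is stored per step, a program of this shape can be inspected line by line, which is the sense in which program annotations support explainable numerical reasoning.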
Related papers
- Fuse, Reason and Verify: Geometry Problem Solving with Parsed Clauses from Diagram (2024-07-10T02:45:22Z)
We propose a neural-symbolic model for plane geometry problem solving (PGPS) with three key steps: modal fusion, reasoning process and knowledge verification.
For reasoning, we design an explicable solution program to describe the geometric reasoning process, and employ a self-limited decoder to generate solution program autoregressively.
We also construct a large-scale geometry problem dataset called PGPS9K, containing fine-grained annotations of textual clauses, solution program and involved knowledge solvers.
- A Survey of Geometric Graph Neural Networks: Data Structures, Models and Applications (2024-03-01T12:13:04Z)
This paper presents a survey of data structures, models, and applications related to geometric GNNs.
We provide a unified view of existing models from the geometric message passing perspective.
We also summarize the applications as well as the related datasets to facilitate later research for methodology development and experimental evaluation.
- FGeo-TP: A Language Model-Enhanced Solver for Geometry Problems (2024-02-14T09:44:28Z)
We introduce FGeo-TP (Theorem Predictor), which utilizes the language model to predict theorem sequences for solving geometry problems.
Our results demonstrate a significant increase in the problem-solving rate of the language model-enhanced FGeo-TP on the FormalGeo7k dataset.
- GeomVerse: A Systematic Evaluation of Large Models for Geometric Reasoning (2023-12-19T15:25:39Z)
We evaluate vision language models (VLMs) along various axes through the lens of geometry problems.
We procedurally create a synthetic dataset of geometry questions with controllable difficulty levels along multiple axes.
The empirical results obtained using our benchmark for state-of-the-art VLMs indicate that these models are not as capable in subjects like geometry.
- G-LLaVA: Solving Geometric Problem with Multi-Modal Large Language Model (2023-12-18T17:36:20Z)
Large language models (LLMs) have shown remarkable proficiency in human-level reasoning and generation capabilities.
G-LLaVA demonstrates exceptional performance in solving geometric problems, significantly outperforming GPT-4-V on the MathVista benchmark with only 7B parameters.
- UniGeo: Unifying Geometry Logical Reasoning via Reformulating Mathematical Expression (2022-12-06T04:37:51Z)
Two main geometry problems: calculation and proving, are usually treated as two specific tasks.
We construct a large-scale Unified Geometry problem benchmark, UniGeo, which contains 4,998 calculation problems and 9,543 proving problems.
We also present a unified multi-task Geometric Transformer framework, Geoformer, to tackle calculation and proving problems simultaneously.
- Inter-GPS: Interpretable Geometry Problem Solving with Formal Language and Symbolic Reasoning (2021-05-10T07:46:55Z)
We construct a new large-scale benchmark, Geometry3K, consisting of 3,002 geometry problems with dense annotation in formal language.
We propose a novel geometry solving approach with formal language and symbolic reasoning, called Interpretable Geometry Problem solver (Inter-GPS)
Inter-GPS incorporates theorem knowledge as conditional rules and performs symbolic reasoning step by step.