Related papers: FormalGeo: An Extensible Formalized Framework for Olympiad Geometric Problem Solving

URL: http://arxiv.org/abs/2310.18021v6
Date: Thu, 15 Feb 2024 04:59:55 GMT
Title: FormalGeo: An Extensible Formalized Framework for Olympiad Geometric Problem Solving
Authors: Xiaokai Zhang, Na Zhu, Yiming He, Jia Zou, Qike Huang, Xiaoxiao Jin, Yanjun Guo, Chenyang Mao, Yang Li, Zhe Zhu, Dengfeng Yue, Fangzhen Zhu, Yifan Wang, Yiwen Huang, Runan Wang, Cheng Qin, Zhenbing Zeng, Shaorong Xie, Xiangfeng Luo, Tuo Leng
Abstract summary: This is the first paper in a series of work we have accomplished over the past three years. In this paper, we have constructed a consistent formal plane geometry system. This will serve as a crucial bridge between IMO-level plane geometry challenges and readable AI automated reasoning.
Score: 9.73597821684857
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: This is the first paper in a series of work we have accomplished over the past three years. In this paper, we have constructed a consistent formal plane geometry system. This will serve as a crucial bridge between IMO-level plane geometry challenges and readable AI automated reasoning. Within this formal framework, we have been able to seamlessly integrate modern AI models with our formal system. AI is now capable of providing deductive reasoning solutions to IMO-level plane geometry problems, just like handling other natural languages, and these proofs are readable, traceable, and verifiable. We propose the geometry formalization theory (GFT) to guide the development of the geometry formal system. Based on the GFT, we have established FormalGeo, which consists of 88 geometric predicates and 196 theorems. It can represent, validate, and solve IMO-level geometry problems. We have also crafted FGPS (formal geometry problem solver) in Python. It serves as both an interactive assistant for verifying problem-solving processes and an automated problem solver. We have annotated the formalgeo7k and formalgeo-imo datasets. The former contains 6,981 geometry problems (expanded to 133,818 through data augmentation), while the latter includes 18 challenging IMO-level geometry problems (expanded to 2,627 and continuously increasing). All annotated problems include detailed formal language descriptions and solutions. Implementation of the formal system and experiments validate the correctness and utility of the GFT. The backward depth-first search method yields a problem-solving failure rate of only 2.42%, and we can incorporate deep learning techniques to achieve a lower one. The source code of FGPS and the datasets are available at https://github.com/BitSecret/FGPS.

Related papers

TrustGeoGen: Scalable and Formal-Verified Data Engine for Trustworthy Multi-modal Geometric Problem Solving [66.0201510984171]
We propose a scalable data engine called TrustGeoGen for problem generation. By formal verification, TrustGeoGen produces GeoTrust-200K dataset with guaranteed modality integrity. Experiments reveal the state-of-the-art models achieve only 49.17% accuracy on GeoTrust-test.
arXiv Detail & Related papers (2025-04-22T10:45:23Z)
Geo-LLaVA: A Large Multi-Modal Model for Solving Geometry Math Problems with Meta In-Context Learning [4.4615747404424395]
Geometry mathematics problems pose significant challenges for large language models (LLMs). We collect a geometry question-answer dataset, referred to as GeoMath, by sourcing geometric data from Chinese high school education websites. We propose a Large Multi-modal Model (LMM) framework named Geo-LLaVA, which incorporates retrieval augmentation with supervised fine-tuning (SFT) in the training stage, called meta-training, and employs in-context learning (ICL) during inference to improve performance.
arXiv Detail & Related papers (2024-12-12T07:34:09Z)
Fuse, Reason and Verify: Geometry Problem Solving with Parsed Clauses from Diagram [78.79651421493058]
We propose a neural-symbolic model for plane geometry problem solving (PGPS) with three key steps: modal fusion, reasoning process and knowledge verification. For reasoning, we design an explicable solution program to describe the geometric reasoning process, and employ a self-limited decoder to generate solution program autoregressively. We also construct a large-scale geometry problem dataset called PGPS9K, containing fine-grained annotations of textual clauses, solution program and involved knowledge solvers.
arXiv Detail & Related papers (2024-07-10T02:45:22Z)
Autoformalizing Euclidean Geometry [74.72212706513318]
We introduce a neuro-symbolic framework for autoformalizing Euclidean geometry. One challenge is that informal proofs rely on diagrams, leaving gaps in texts that are hard to formalize. We provide automatic semantic evaluation for autoformalized theorem statements.
arXiv Detail & Related papers (2024-05-27T14:35:10Z)
FGeo-DRL: Deductive Reasoning for Geometric Problems through Deep Reinforcement Learning [1.137457877869062]
We build a neural-symbolic system, called FGeoDRL, to automatically perform human-like geometric deductive reasoning. The neural part is an AI agent based on reinforcement learning, capable of autonomously learning problem-solving methods. Experiments conducted on the formalgeo7k dataset have achieved a problem-solving success rate of 86.40%.
arXiv Detail & Related papers (2024-02-14T09:48:39Z)
FGeo-TP: A Language Model-Enhanced Solver for Geometry Problems [1.137457877869062]
We introduce FGeo-TP (Theorem Predictor), which utilizes the language model to predict theorem sequences for solving geometry problems. Our results demonstrate a significant increase in the problem-solving rate of the language model-enhanced FGeo-TP on the FormalGeo7k dataset.
arXiv Detail & Related papers (2024-02-14T09:44:28Z)
G-LLaVA: Solving Geometric Problem with Multi-Modal Large Language Model [124.68242155098189]
Large language models (LLMs) have shown remarkable proficiency in human-level reasoning and generation capabilities. G-LLaVA demonstrates exceptional performance in solving geometric problems, significantly outperforming GPT-4-V on the MathVista benchmark with only 7B parameters.
arXiv Detail & Related papers (2023-12-18T17:36:20Z)
UniGeo: Unifying Geometry Logical Reasoning via Reformulating Mathematical Expression [127.68780714438103]
The two main types of geometry problems, calculation and proving, are usually treated as two specific tasks. We construct a large-scale Unified Geometry problem benchmark, UniGeo, which contains 4,998 calculation problems and 9,543 proving problems. We also present a unified multi-task Geometric Transformer framework, Geoformer, to tackle calculation and proving problems simultaneously.
arXiv Detail & Related papers (2022-12-06T04:37:51Z)
GeoQA: A Geometric Question Answering Benchmark Towards Multimodal Numerical Reasoning [172.36214872466707]
We focus on solving geometric problems, which requires a comprehensive understanding of textual descriptions, visual diagrams, and theorem knowledge. We propose a Geometric Question Answering dataset GeoQA, containing 5,010 geometric problems with corresponding annotated programs.
arXiv Detail & Related papers (2021-05-30T12:34:17Z)
Inter-GPS: Interpretable Geometry Problem Solving with Formal Language and Symbolic Reasoning [123.06420835072225]
We construct a new large-scale benchmark, Geometry3K, consisting of 3,002 geometry problems with dense annotation in formal language. We propose a novel geometry solving approach with formal language and symbolic reasoning, called Interpretable Geometry Problem Solver (Inter-GPS). Inter-GPS incorporates theorem knowledge as conditional rules and performs symbolic reasoning step by step.
arXiv Detail & Related papers (2021-05-10T07:46:55Z)

This list is automatically generated from the titles and abstracts of the papers in this site.
