Proposing and solving olympiad geometry with guided tree search
- URL: http://arxiv.org/abs/2412.10673v1
- Date: Sat, 14 Dec 2024 04:20:47 GMT
- Title: Proposing and solving olympiad geometry with guided tree search
- Authors: Chi Zhang, Jiajun Song, Siyu Li, Yitao Liang, Yuxi Ma, Wei Wang, Yixin Zhu, Song-Chun Zhu,
- Abstract summary: We introduce TongGeometry, a Euclidean geometry system supporting tree-search-based guided problem proposing and solving.
TongGeometry discovers 6.7 billion geometry theorems requiring auxiliary constructions, including 4.1 billion exhibiting geometric symmetry.
TongGeometry solved all International Mathematical Olympiad geometry in IMO-AG-30, outperforming gold medalists for the first time.
- Score: 63.824930029019995
- License:
- Abstract: Mathematics olympiads are prestigious competitions, with problem proposing and solving highly honored. Building artificial intelligence that proposes and solves olympiads presents an unresolved challenge in automated theorem discovery and proving, especially in geometry for its combination of numerical and spatial elements. We introduce TongGeometry, a Euclidean geometry system supporting tree-search-based guided problem proposing and solving. The efficient geometry system establishes the most extensive repository of geometry theorems to date: within the same computational budget as the existing state-of-the-art, TongGeometry discovers 6.7 billion geometry theorems requiring auxiliary constructions, including 4.1 billion exhibiting geometric symmetry. Among them, 10 theorems were proposed to regional mathematical olympiads with 3 of TongGeometry's proposals selected in real competitions, earning spots in a national team qualifying exam or a top civil olympiad in China and the US. Guided by fine-tuned large language models, TongGeometry solved all International Mathematical Olympiad geometry in IMO-AG-30, outperforming gold medalists for the first time. It also surpasses the existing state-of-the-art across a broader spectrum of olympiad-level problems. The full capabilities of the system can be utilized on a consumer-grade machine, making the model more accessible and fostering widespread democratization of its use. By analogy, unlike existing systems that merely solve problems like students, TongGeometry acts like a geometry coach, discovering, presenting, and proving theorems.
Related papers
- Geo-LLaVA: A Large Multi-Modal Model for Solving Geometry Math Problems with Meta In-Context Learning [4.4615747404424395]
Geometry mathematics problems pose significant challenges for large language models (LLMs)
We collect a geometry question-answer dataset by sourcing geometric data from Chinese high school education websites, referred to as GeoMath.
We propose a Large Multi-modal Model (LMM) framework named Geo-LLaVA, which incorporates retrieval augmentation with supervised fine-tuning (SFT) in the training stage, called meta-training, and employs in-context learning (ICL) during inference to improve performance.
arXiv Detail & Related papers (2024-12-12T07:34:09Z) - Fuse, Reason and Verify: Geometry Problem Solving with Parsed Clauses from Diagram [78.79651421493058]
We propose a neural-symbolic model for plane geometry problem solving (PGPS) with three key steps: modal fusion, reasoning process and knowledge verification.
For reasoning, we design an explicable solution program to describe the geometric reasoning process, and employ a self-limited decoder to generate solution program autoregressively.
We also construct a large-scale geometry problem dataset called PGPS9K, containing fine-grained annotations of textual clauses, solution program and involved knowledge solvers.
arXiv Detail & Related papers (2024-07-10T02:45:22Z) - Proving Olympiad Algebraic Inequalities without Human Demonstrations [3.3466865213133836]
We propose AIPS, an Algebraic Inequality Proving System capable of autonomously generating complex inequality theorems.
On a test set of 20 Olympiad-level inequality problems, AIPS successfully solved 10, outperforming state-of-the-art methods.
One theorem was selected as a competition problem in a major city 2024 Mathematical Olympiad.
arXiv Detail & Related papers (2024-06-20T11:37:53Z) - FGeo-TP: A Language Model-Enhanced Solver for Geometry Problems [1.137457877869062]
We introduce FGeo-TP (Theorem Predictor), which utilizes the language model to predict theorem sequences for solving geometry problems.
Our results demonstrate a significant increase in the problem-solving rate of the language model-enhanced FGeo-TP on the FormalGeo7k dataset.
arXiv Detail & Related papers (2024-02-14T09:44:28Z) - GAPS: Geometry-Aware Problem Solver [7.9345421580482185]
Geometry problem solving presents a formidable challenge within the NLP community.
Existing approaches often rely on models designed for solving math word problems, neglecting the unique characteristics of geometry math problems.
In this study, we propose the Geometry-Aware Problem Solver (GAPS) model.
GAPS is specifically designed to generate solution programs for geometry math problems of various types.
arXiv Detail & Related papers (2024-01-29T16:48:34Z) - G-LLaVA: Solving Geometric Problem with Multi-Modal Large Language Model [124.68242155098189]
Large language models (LLMs) have shown remarkable proficiency in human-level reasoning and generation capabilities.
G-LLaVA demonstrates exceptional performance in solving geometric problems, significantly outperforming GPT-4-V on the MathVista benchmark with only 7B parameters.
arXiv Detail & Related papers (2023-12-18T17:36:20Z) - A Rule Based Theorem Prover: an Introduction to Proofs in Secondary
Schools [0.0]
The introduction of automated deduction systems in secondary schools face several bottlenecks.
The dissonance between the outcomes of the geometry automated theorem provers and the normal practice of conjecturing and proving in schools is a major barrier to a wider use of such tools in an educational environment.
A lesson plan is presented, its goal is the introduction of formal demonstration of proving geometric theorems, trying to motivate students to that goal.
arXiv Detail & Related papers (2023-03-10T11:36:10Z) - UniGeo: Unifying Geometry Logical Reasoning via Reformulating
Mathematical Expression [127.68780714438103]
Two main geometry problems: calculation and proving, are usually treated as two specific tasks.
We construct a large-scale Unified Geometry problem benchmark, UniGeo, which contains 4,998 calculation problems and 9,543 proving problems.
We also present a unified multi-task Geometric Transformer framework, Geoformer, to tackle calculation and proving problems simultaneously.
arXiv Detail & Related papers (2022-12-06T04:37:51Z) - GeoQA: A Geometric Question Answering Benchmark Towards Multimodal
Numerical Reasoning [172.36214872466707]
We focus on solving geometric problems, which requires a comprehensive understanding of textual descriptions, visual diagrams, and theorem knowledge.
We propose a Geometric Question Answering dataset GeoQA, containing 5,010 geometric problems with corresponding annotated programs.
arXiv Detail & Related papers (2021-05-30T12:34:17Z) - Inter-GPS: Interpretable Geometry Problem Solving with Formal Language
and Symbolic Reasoning [123.06420835072225]
We construct a new large-scale benchmark, Geometry3K, consisting of 3,002 geometry problems with dense annotation in formal language.
We propose a novel geometry solving approach with formal language and symbolic reasoning, called Interpretable Geometry Problem solver (Inter-GPS)
Inter-GPS incorporates theorem knowledge as conditional rules and performs symbolic reasoning step by step.
arXiv Detail & Related papers (2021-05-10T07:46:55Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.