Gold-medalist Performance in Solving Olympiad Geometry with AlphaGeometry2
- URL: http://arxiv.org/abs/2502.03544v2
- Date: Fri, 28 Feb 2025 23:59:22 GMT
- Title: Gold-medalist Performance in Solving Olympiad Geometry with AlphaGeometry2
- Authors: Yuri Chervonyi, Trieu H. Trinh, Miroslav Olšák, Xiaomeng Yang, Hoang Nguyen, Marcelo Menegali, Junehyuk Jung, Vikas Verma, Quoc V. Le, Thang Luong
- Abstract summary: We present AlphaGeometry2, a significantly improved version of AlphaGeometry introduced in Trinh et al. (2024). To achieve this, we first extend the original AlphaGeometry language to tackle harder problems involving movements of objects. This has markedly improved the coverage rate of the AlphaGeometry language on International Math Olympiads (IMO) 2000-2024 geometry problems from 66% to 88%.
- Score: 43.92309838336044
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We present AlphaGeometry2, a significantly improved version of AlphaGeometry introduced in Trinh et al. (2024), which has now surpassed an average gold medalist in solving Olympiad geometry problems. To achieve this, we first extend the original AlphaGeometry language to tackle harder problems involving movements of objects, and problems containing linear equations of angles, ratios, and distances. This, together with support for non-constructive problems, has markedly improved the coverage rate of the AlphaGeometry language on International Math Olympiads (IMO) 2000-2024 geometry problems from 66% to 88%. The search process of AlphaGeometry2 has also been greatly improved through the use of Gemini architecture for better language modeling, and a novel knowledge-sharing mechanism that enables effective communication between search trees. Together with further enhancements to the symbolic engine and synthetic data generation, we have significantly boosted the overall solving rate of AlphaGeometry2 to 84% for $\textit{all}$ geometry problems over the last 25 years, compared to 54% previously. AlphaGeometry2 was also part of the system that achieved silver-medal standard at IMO 2024 https://dpmd.ai/imo-silver. Last but not least, we report progress towards using AlphaGeometry2 as a part of a fully automated system that reliably solves geometry problems directly from natural language input.
Related papers
- Achieving Olympia-Level Geometry Large Language Model Agent via Complexity Boosting Reinforcement Learning [66.79506488139707]
Large language model (LLM) agents exhibit strong mathematical problem-solving abilities. In this work, we make the first attempt to build a medalist-level LLM agent for geometry and present InternGeometry. InternGeometry overcomes the limitations in geometry by iteratively proposing propositions and auxiliary constructions, verifying them with a symbolic engine. Built on InternThinker-32B, InternGeometry solves 44 of 50 IMO geometry problems, exceeding the average gold medalist score (40.9), using only 13K training examples.
arXiv Detail & Related papers (2025-12-11T11:05:04Z)
- Gold-Medal-Level Olympiad Geometry Solving with Efficient Heuristic Auxiliary Constructions [129.877899436804]
We present a highly efficient method for geometry theorem proving that runs entirely on CPUs without relying on neural network-based inference. Our initial study shows that a simple random strategy for adding auxiliary points can achieve silver-medal level human performance on International Mathematical Olympiad (IMO). We further construct HAGeo-409, a benchmark consisting of 409 geometry problems with human-assessed difficulty levels.
arXiv Detail & Related papers (2025-11-27T01:05:00Z)
- Proposing and solving olympiad geometry with guided tree search [63.824930029019995]
We introduce TongGeometry, a Euclidean geometry system supporting tree-search-based guided problem proposing and solving. TongGeometry discovers 6.7 billion geometry theorems requiring auxiliary constructions, including 4.1 billion exhibiting geometric symmetry. TongGeometry solved all International Mathematical Olympiad geometry in IMO-AG-30, outperforming gold medalists for the first time.
arXiv Detail & Related papers (2024-12-14T04:20:47Z)
- Geo-LLaVA: A Large Multi-Modal Model for Solving Geometry Math Problems with Meta In-Context Learning [4.4615747404424395]
Geometry mathematics problems pose significant challenges for large language models (LLMs).
We collect a geometry question-answer dataset by sourcing geometric data from Chinese high school education websites, referred to as GeoMath.
We propose a Large Multi-modal Model (LMM) framework named Geo-LLaVA, which incorporates retrieval augmentation with supervised fine-tuning (SFT) in the training stage, called meta-training, and employs in-context learning (ICL) during inference to improve performance.
arXiv Detail & Related papers (2024-12-12T07:34:09Z)
- Diagram Formalization Enhanced Multi-Modal Geometry Problem Solver [11.69164802295844]
We introduce a new framework that integrates visual features, geometric formal language, and natural language representations.
We propose a novel synthetic data approach and create a large-scale geometric dataset, SynthGeo228K, annotated with both formal and natural language captions.
Our framework improves MLLMs' ability to process geometric diagrams and extends their application to open-ended tasks on the formalgeo7k dataset.
arXiv Detail & Related papers (2024-09-06T12:11:06Z)
- GOLD: Geometry Problem Solver with Natural Language Description [7.9345421580482185]
We present the Geometry problem sOlver with natural Language Description (GOLD) model.
GOLD enhances the extraction of geometric relations by separately processing symbols and geometric primitives within the diagram.
It converts the extracted relations into natural language descriptions, efficiently utilizing large language models to solve geometry math problems.
arXiv Detail & Related papers (2024-05-01T13:00:51Z)
- Wu's Method can Boost Symbolic AI to Rival Silver Medalists and AlphaGeometry to Outperform Gold Medalists at IMO Geometry [16.41436428888792]
We revisit the IMO-AG-30 Challenge introduced with AlphaGeometry and find that Wu's method is surprisingly strong.
Wu's method alone can solve 15 problems, and some of them are not solved by any of the other methods.
We set a new state-of-the-art for automated theorem proving on IMO-AG-30, solving 27 out of 30 problems, the first AI method which outperforms an IMO gold medalist.
arXiv Detail & Related papers (2024-04-09T15:54:00Z)
- G-LLaVA: Solving Geometric Problem with Multi-Modal Large Language Model [124.68242155098189]
Large language models (LLMs) have shown remarkable proficiency in human-level reasoning and generation capabilities.
G-LLaVA demonstrates exceptional performance in solving geometric problems, significantly outperforming GPT-4-V on the MathVista benchmark with only 7B parameters.
arXiv Detail & Related papers (2023-12-18T17:36:20Z)
- FormalGeo: An Extensible Formalized Framework for Olympiad Geometric Problem Solving [9.73597821684857]
This is the first paper in a series of work we have accomplished over the past three years.
In this paper, we have constructed a consistent formal plane geometry system.
This will serve as a crucial bridge between IMO-level plane geometry challenges and readable AI automated reasoning.
arXiv Detail & Related papers (2023-10-27T09:55:12Z)
- UniGeo: Unifying Geometry Logical Reasoning via Reformulating Mathematical Expression [127.68780714438103]
The two main types of geometry problems, calculation and proving, are usually treated as two separate tasks.
We construct a large-scale Unified Geometry problem benchmark, UniGeo, which contains 4,998 calculation problems and 9,543 proving problems.
We also present a unified multi-task Geometric Transformer framework, Geoformer, to tackle calculation and proving problems simultaneously.
arXiv Detail & Related papers (2022-12-06T04:37:51Z)
- Inter-GPS: Interpretable Geometry Problem Solving with Formal Language and Symbolic Reasoning [123.06420835072225]
We construct a new large-scale benchmark, Geometry3K, consisting of 3,002 geometry problems with dense annotation in formal language.
We propose a novel geometry solving approach with formal language and symbolic reasoning, called Interpretable Geometry Problem solver (Inter-GPS).
Inter-GPS incorporates theorem knowledge as conditional rules and performs symbolic reasoning step by step.
arXiv Detail & Related papers (2021-05-10T07:46:55Z)
- DSG-Net: Learning Disentangled Structure and Geometry for 3D Shape Generation [98.96086261213578]
We introduce DSG-Net, a deep neural network that learns a disentangled structured and geometric mesh representation for 3D shapes.
This supports a range of novel shape generation applications with disentangled control, such as interpolation of structure (geometry) while keeping geometry (structure) unchanged.
Our method not only supports controllable generation applications but also produces high-quality synthesized shapes, outperforming state-of-the-art methods.
arXiv Detail & Related papers (2020-08-12T17:06:51Z)