AutoGeo: Automating Geometric Image Dataset Creation for Enhanced Geometry Understanding
- URL: http://arxiv.org/abs/2409.09039v1
- Date: Wed, 28 Aug 2024 14:49:26 GMT
- Title: AutoGeo: Automating Geometric Image Dataset Creation for Enhanced Geometry Understanding
- Authors: Zihan Huang, Tao Wu, Wang Lin, Shengyu Zhang, Jingyuan Chen, Fei Wu,
- Abstract summary: This paper introduces AutoGeo, a novel approach for automatically generating mathematical geometric images.
By leveraging precisely defined geometric clauses, AutoGeo-100k contains a wide variety of geometric shapes.
Experimental results indicate significant improvements in the model's ability in handling geometric images.
- Score: 18.223835101407637
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: With the rapid advancement of large language models, there has been a growing interest in their capabilities in mathematical reasoning. However, existing research has primarily focused on text-based algebra problems, neglecting the study of geometry due to the lack of high-quality geometric datasets. To address this gap, this paper introduces AutoGeo, a novel approach for automatically generating mathematical geometric images to fulfill the demand for large-scale and diverse geometric datasets. AutoGeo facilitates the creation of AutoGeo-100k, an extensive repository comprising 100k high-quality geometry image-text pairs. By leveraging precisely defined geometric clauses, AutoGeo-100k contains a wide variety of geometric shapes, including lines, polygons, circles, and complex spatial relationships, etc. Furthermore, this paper demonstrates the efficacy of AutoGeo-100k in enhancing the performance of multimodal large language models through fine-tuning. Experimental results indicate significant improvements in the model's ability in handling geometric images, as evidenced by enhanced accuracy in tasks such as geometric captioning and mathematical reasoning. This research not only fills a critical gap in the availability of geometric datasets but also paves the way for the advancement of sophisticated AI-driven tools in education and research. Project page: https://autogeo-official.github.io/.
Related papers
- MagicGeo: Training-Free Text-Guided Geometric Diagram Generation [39.30134393001854]
This paper presents MagicGeo, a training-free framework for generating geometric diagrams from textual descriptions.
MagicGeo formulates the diagram generation process as a coordinate optimization problem, ensuring geometric correctness through a formal language solver, and then employs coordinate-aware generation.
We introduce MagicGeoBench, a benchmark dataset of 220 geometric diagram descriptions, and demonstrate that MagicGeo outperforms current methods in both qualitative and quantitative evaluations.
arXiv Detail & Related papers (2025-02-19T16:20:14Z) - GeoX: Geometric Problem Solving Through Unified Formalized Vision-Language Pre-training [45.42400674977197]
GeoX is a multi-modal large model focusing on geometric understanding and reasoning tasks.
We introduce unimodal pre-training to develop a diagram encoder and symbol decoder, enhancing the understanding of geometric images and corpora.
We propose a Generator-And-Sampler Transformer (GS-Former) to generate discriminative queries and eliminate uninformative representations from unevenly distributed geometric signals.
arXiv Detail & Related papers (2024-12-16T15:20:03Z) - Geo-LLaVA: A Large Multi-Modal Model for Solving Geometry Math Problems with Meta In-Context Learning [4.4615747404424395]
Geometry mathematics problems pose significant challenges for large language models (LLMs)
We collect a geometry question-answer dataset by sourcing geometric data from Chinese high school education websites, referred to as GeoMath.
We propose a Large Multi-modal Model (LMM) framework named Geo-LLaVA, which incorporates retrieval augmentation with supervised fine-tuning (SFT) in the training stage, called meta-training, and employs in-context learning (ICL) during inference to improve performance.
arXiv Detail & Related papers (2024-12-12T07:34:09Z) - Geometry-Aware Generative Autoencoders for Warped Riemannian Metric Learning and Generative Modeling on Data Manifolds [18.156807299614503]
We introduce Geometry-Aware Generative Autoencoder (GAGA), a novel framework that combines manifold learning with generative modeling.
GAGA shows competitive performance in simulated and real-world datasets, including a 30% improvement over the state-of-the-art methods in single-cell population-level trajectory inference.
arXiv Detail & Related papers (2024-10-16T17:53:26Z) - GOLD: Geometry Problem Solver with Natural Language Description [7.9345421580482185]
We present the Geometry problem sOlver with natural Language Description (GOLD) model.
GOLD enhances the extraction of geometric relations by separately processing symbols and geometric primitives within the diagram.
It converts the extracted relations into natural language descriptions, efficiently utilizing large language models to solve geometry math problems.
arXiv Detail & Related papers (2024-05-01T13:00:51Z) - A Survey of Geometric Graph Neural Networks: Data Structures, Models and
Applications [67.33002207179923]
This paper presents a survey of data structures, models, and applications related to geometric GNNs.
We provide a unified view of existing models from the geometric message passing perspective.
We also summarize the applications as well as the related datasets to facilitate later research for methodology development and experimental evaluation.
arXiv Detail & Related papers (2024-03-01T12:13:04Z) - Adaptive Surface Normal Constraint for Geometric Estimation from Monocular Images [56.86175251327466]
We introduce a novel approach to learn geometries such as depth and surface normal from images while incorporating geometric context.
Our approach extracts geometric context that encodes the geometric variations present in the input image and correlates depth estimation with geometric constraints.
Our method unifies depth and surface normal estimations within a cohesive framework, which enables the generation of high-quality 3D geometry from images.
arXiv Detail & Related papers (2024-02-08T17:57:59Z) - G-LLaVA: Solving Geometric Problem with Multi-Modal Large Language Model [124.68242155098189]
Large language models (LLMs) have shown remarkable proficiency in human-level reasoning and generation capabilities.
G-LLaVA demonstrates exceptional performance in solving geometric problems, significantly outperforming GPT-4-V on the MathVista benchmark with only 7B parameters.
arXiv Detail & Related papers (2023-12-18T17:36:20Z) - Exploring Data Geometry for Continual Learning [64.4358878435983]
We study continual learning from a novel perspective by exploring data geometry for the non-stationary stream of data.
Our method dynamically expands the geometry of the underlying space to match growing geometric structures induced by new data.
Experiments show that our method achieves better performance than baseline methods designed in Euclidean space.
arXiv Detail & Related papers (2023-04-08T06:35:25Z) - GeoQA: A Geometric Question Answering Benchmark Towards Multimodal
Numerical Reasoning [172.36214872466707]
We focus on solving geometric problems, which requires a comprehensive understanding of textual descriptions, visual diagrams, and theorem knowledge.
We propose a Geometric Question Answering dataset GeoQA, containing 5,010 geometric problems with corresponding annotated programs.
arXiv Detail & Related papers (2021-05-30T12:34:17Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.