GeoLoom: High-quality Geometric Diagram Generation from Textual Input
- URL: http://arxiv.org/abs/2512.08180v1
- Date: Tue, 09 Dec 2025 02:22:23 GMT
- Title: GeoLoom: High-quality Geometric Diagram Generation from Textual Input
- Authors: Xiaojing Wei, Ting Zhang, Wei He, Jingdong Wang, Hua Huang,
- Abstract summary: We propose GeoLoom, a novel framework for text-to-diagram generation in geometric domains.<n>GeoLoom comprises two core components: an autoformalization module that translates natural language into a generation-oriented formal language GeoLingua.<n>To support this framework, we introduce GeoNF, a dataset aligning natural language geometric descriptions with formal GeoLingua descriptions.
- Score: 41.9055060542649
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: High-quality geometric diagram generation presents both a challenge and an opportunity: it demands strict spatial accuracy while offering well-defined constraints to guide generation. Inspired by recent advances in geometry problem solving that employ formal languages and symbolic solvers for enhanced correctness and interpretability, we propose GeoLoom, a novel framework for text-to-diagram generation in geometric domains. GeoLoom comprises two core components: an autoformalization module that translates natural language into a specifically designed generation-oriented formal language GeoLingua, and a coordinate solver that maps formal constraints to precise coordinates using the efficient Monte Carlo optimization. To support this framework, we introduce GeoNF, a dataset aligning natural language geometric descriptions with formal GeoLingua descriptions. We further propose a constraint-based evaluation metric that quantifies structural deviation, offering mathematically grounded supervision for iterative refinement. Empirical results demonstrate that GeoLoom significantly outperforms state-of-the-art baselines in structural fidelity, providing a principled foundation for interpretable and scalable diagram generation.
Related papers
- Synthesizing Multimodal Geometry Datasets from Scratch and Enabling Visual Alignment via Plotting Code [27.26235987246201]
Multimodal geometry reasoning requires models to jointly understand visual diagrams and perform structured symbolic inference.<n>We propose a pipeline for complex multimodal geometry problems from scratch and construct a dataset named textbfGeoCode, which decouples problem generation into symbolic seed construction.<n>We further introduce code prediction as an explicit alignment objective, transforming visual understanding into a supervised structured prediction task.
arXiv Detail & Related papers (2026-02-21T07:53:48Z) - GeoFocus: Blending Efficient Global-to-Local Perception for Multimodal Geometry Problem-Solving [55.14836667214487]
GeoFocus is a novel framework comprising two core modules.<n>GeoFocus achieves a 4.7% accuracy improvement over leading specialized models.<n>It demonstrates superior robustness in MATHVERSE under diverse visual conditions.
arXiv Detail & Related papers (2026-02-09T11:15:01Z) - GeoVLMath: Enhancing Geometry Reasoning in Vision-Language Models via Cross-Modal Reward for Auxiliary Line Creation [54.53486231309254]
We present GeoVLMath, an open-source LVLM tailored to auxiliary-line reasoning in solid geometry.<n>We generate textual descriptions of auxiliary-line constructions to better align with the representational strengths of LVLMs.<n>Built on this reward, we present GeoVLMath, an open-source LVLM tailored to auxiliary-line reasoning in solid geometry.
arXiv Detail & Related papers (2025-10-13T05:33:51Z) - GeoRef: Referring Expressions in Geometry via Task Formulation, Synthetic Supervision, and Reinforced MLLM-based Solutions [45.70578816057097]
We introduce the task of Referring Expression (REC) for geometric problems.<n>REC evaluates whether models can localize points, shapes, and spatial relations in diagrams in response to textual prompts.<n>We generate a large-scale synthetic training dataset using a structured geometric formal language.
arXiv Detail & Related papers (2025-09-25T12:00:52Z) - GeoGramBench: Benchmarking the Geometric Program Reasoning in Modern LLMs [7.605833826892782]
We present a benchmark of 500 carefully refined problems organized by a tailored three-level taxonomy that considers geometric complexity rather than traditional mathematical reasoning complexity.<n>Our comprehensive evaluation of 17 frontier LLMs reveals consistent and pronounced deficiencies.<n>These results highlight the unique challenges posed by program-driven spatial reasoning and establish GeoGramBench as a valuable resource for advancing research in symbolic-to-spatial geometric reasoning.
arXiv Detail & Related papers (2025-05-23T09:17:07Z) - MagicGeo: Training-Free Text-Guided Geometric Diagram Generation [39.30134393001854]
This paper presents MagicGeo, a training-free framework for generating geometric diagrams from textual descriptions.<n>MagicGeo formulates the diagram generation process as a coordinate optimization problem, ensuring geometric correctness through a formal language solver, and then employs coordinate-aware generation.<n>We introduce MagicGeoBench, a benchmark dataset of 220 geometric diagram descriptions, and demonstrate that MagicGeo outperforms current methods in both qualitative and quantitative evaluations.
arXiv Detail & Related papers (2025-02-19T16:20:14Z) - GeoX: Geometric Problem Solving Through Unified Formalized Vision-Language Pre-training [45.42400674977197]
GeoX is a multi-modal large model focusing on geometric understanding and reasoning tasks.<n>We introduce unimodal pre-training to develop a diagram encoder and symbol decoder, enhancing the understanding of geometric images and corpora.<n>We propose a Generator-And-Sampler Transformer (GS-Former) to generate discriminative queries and eliminate uninformative representations from unevenly distributed geometric signals.
arXiv Detail & Related papers (2024-12-16T15:20:03Z) - A Survey of Geometric Graph Neural Networks: Data Structures, Models and Applications [71.809127869349]
This paper formalizes geometric graph as the data structure, on top of which we provide a unified view of existing models from the geometric message passing perspective.<n>We also summarize the applications as well as the related datasets to facilitate later research for methodology development and experimental evaluation.
arXiv Detail & Related papers (2024-03-01T12:13:04Z) - Inter-GPS: Interpretable Geometry Problem Solving with Formal Language
and Symbolic Reasoning [123.06420835072225]
We construct a new large-scale benchmark, Geometry3K, consisting of 3,002 geometry problems with dense annotation in formal language.
We propose a novel geometry solving approach with formal language and symbolic reasoning, called Interpretable Geometry Problem solver (Inter-GPS)
Inter-GPS incorporates theorem knowledge as conditional rules and performs symbolic reasoning step by step.
arXiv Detail & Related papers (2021-05-10T07:46:55Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.