Thinking Outside the Template with Modular GP-GOMEA
- URL: http://arxiv.org/abs/2505.01262v1
- Date: Fri, 02 May 2025 13:29:55 GMT
- Title: Thinking Outside the Template with Modular GP-GOMEA
- Authors: Joe Harrison, Peter A. N. Bosman, Tanja Alderliesten
- Abstract summary: This paper presents a modular representation for GP-GOMEA that allows multiple trees to be evolved simultaneously and used as (functional) subexpressions. We find that our proposed approach generally outperforms single-template GP-GOMEA and, moreover, uncovers ground-truth expressions underlying synthetic datasets with modular subexpressions at a faster rate than GP-GOMEA without modular subexpressions.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The goal in Symbolic Regression (SR) is to discover expressions that accurately map input to output data. Because the intent is often to understand these expressions, there is a trade-off between accuracy and interpretability. GP-GOMEA excels at producing small, highly accurate SR expressions (increasing the potential for interpretability), but requires a fixed tree template, which limits the types of expressions that can be evolved. This paper presents a modular representation for GP-GOMEA that allows multiple trees to be evolved simultaneously and used as (functional) subexpressions. While each tree individually is constrained to a (small) fixed template, the final expression, if expanded, can exhibit a much larger structure. Furthermore, the use of subexpressions decomposes the original regression problem and opens the possibility of enhanced interpretability through the piece-wise understanding of small subexpressions. We compare the performance of GP-GOMEA with and without modular templates on a variety of datasets. We find that our proposed approach generally outperforms single-template GP-GOMEA and, moreover, uncovers ground-truth expressions underlying synthetic datasets with modular subexpressions at a faster rate than GP-GOMEA without them.
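The core mechanism described in the abstract can be illustrated with a small, hypothetical sketch (not the paper's implementation): a main tree contains references such as `F1` to separately evolved subexpression trees, and the final expression is obtained by recursively expanding those references. The node format, the `F1` naming, and the `a0`/`a1` placeholder convention are all assumptions made for illustration.

```python
# Hypothetical sketch of the modular-template idea: trees are (op, children)
# tuples; a node whose op appears in `modules` is a call to an evolved
# subexpression, whose body uses placeholders a0, a1, ... for its arguments.

def expand(node, modules):
    """Recursively expand a tree, substituting module references."""
    op, children = node
    args = [expand(c, modules) for c in children]
    if op in modules:
        # Reference to a subexpression tree: splice the expanded
        # arguments into the module body's placeholders.
        expr = modules[op]
        for i, a in enumerate(args):
            expr = expr.replace(f"a{i}", a)
        return expr
    if not args:
        return op                       # terminal: variable or constant
    return f"({args[0]} {op} {args[1]})"  # binary operators only, in this sketch

# Main tree F1(x) + y with module F1(a0) = a0 * a0 expands to a larger
# structure than the main template alone could express:
main = ("+", [("F1", [("x", [])]), ("y", [])])
modules = {"F1": "(a0 * a0)"}
print(expand(main, modules))            # ((x * x) + y)
```

As the abstract notes, each individual tree stays within a small fixed template, while the expanded expression can be much larger, and the module bodies can be inspected piece-wise.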
Related papers
- A Better Multi-Objective GP-GOMEA -- But do we Need it? [0.0]
In Symbolic Regression (SR), achieving a proper balance between accuracy and interpretability remains a key challenge. The Genetic Programming variant of the Gene-pool Optimal Mixing Evolutionary Algorithm (GP-GOMEA) is of particular interest, as it achieves state-of-the-art performance using a template that limits the size of expressions. A recently introduced expansion, modular GP-GOMEA, is capable of decomposing expressions using multiple subexpressions, further increasing the chances of interpretability. A multi-objective variant of GP-GOMEA also exists, which can be used, for instance, to optimize for size and accuracy simultaneously.
arXiv Detail & Related papers (2025-07-04T18:54:27Z)
- Beyond Message Passing: Neural Graph Pattern Machine [50.78679002846741]
We introduce the Neural Graph Pattern Machine (GPM), a novel framework that bypasses message passing by learning directly from graph substructures. GPM efficiently extracts, encodes, and prioritizes task-relevant graph patterns, offering greater expressivity and improved ability to capture long-range dependencies.
arXiv Detail & Related papers (2025-01-30T20:37:47Z)
- Improving Genetic Programming for Symbolic Regression with Equality Graphs [0.0]
We exploit the equality graph to store expressions and their equivalent forms. We adapt the subtree operators to reduce the chances of revisiting expressions. Results show that, for small expressions, this approach improves the performance of a simple GP algorithm to compete with PySR and Operon.
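The idea of recognizing equivalent forms so the search does not revisit them can be sketched in miniature (this is a toy canonicalization, not the paper's equality-graph data structure): commutative operands are sorted into a canonical key, so syntactically different but equivalent trees hash to the same visited entry. The tree format and operator set are assumptions for illustration.

```python
# Toy sketch of equivalence-aware deduplication: "x + y" and "y + x"
# produce the same canonical key and are counted as one expression.

def canonical(node):
    """Build a canonical string key; sorts operands of commutative ops."""
    op, children = node
    if not children:
        return op
    keys = [canonical(c) for c in children]
    if op in ("+", "*"):        # commutative: operand order is irrelevant
        keys.sort()
    return "(" + op + " " + " ".join(keys) + ")"

visited = set()

def is_new(tree):
    """Return True only the first time an equivalence class is seen."""
    key = canonical(tree)
    if key in visited:
        return False
    visited.add(key)
    return True
```

A real e-graph tracks far richer equivalences (e.g. via rewrite rules), but even this cheap canonical form shows how the search can skip already-explored expressions.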
arXiv Detail & Related papers (2025-01-29T18:49:34Z)
- Improving the efficiency of GP-GOMEA for higher-arity operators [0.0]
Genetic Programming (GP) can offer a way to evolve inherently interpretable expressions.
GP-GOMEA is a form of GP that has been found particularly effective at evolving expressions that are accurate yet of limited size and, thus, promote interpretability.
This paper proposes two enhancements to GP-GOMEA: semantic subtree inheritance and greedy child selection.
arXiv Detail & Related papers (2024-02-15T10:20:40Z)
- Structure-Unified M-Tree Coding Solver for Math Word Problem [57.825176412485504]
In previous work, models designed by taking into account the properties of the binary tree structure of mathematical expressions at the output side have achieved better performance.
In this paper, we propose the Structure-Unified M-Tree Coding Solver (SUMC-Solver), which applies a tree with any M branches (M-tree) to unify the output structures.
Experimental results on the widely used MAWPS and Math23K datasets demonstrate that SUMC-Solver not only outperforms several state-of-the-art models but also performs much better under low-resource conditions.
arXiv Detail & Related papers (2022-10-22T12:20:36Z)
- GraphQ IR: Unifying Semantic Parsing of Graph Query Language with Intermediate Representation [91.27083732371453]
We propose a unified intermediate representation (IR) for graph query languages, namely GraphQ IR.
With the IR's natural-language-like representation that bridges the semantic gap and its formally defined syntax that maintains the graph structure, neural semantic parsing can more effectively convert user queries into GraphQ IR.
Our approach can consistently achieve state-of-the-art performance on KQA Pro, Overnight and MetaQA.
arXiv Detail & Related papers (2022-05-24T13:59:53Z)
- Entailment Tree Explanations via Iterative Retrieval-Generation Reasoner [56.08919422452905]
We propose an architecture called the Iterative Retrieval-Generation Reasoner (IRGR).
Our model is able to explain a given hypothesis by systematically generating a step-by-step explanation from textual premises.
We outperform existing benchmarks on premise retrieval and entailment tree generation, with around 300% gain in overall correctness.
arXiv Detail & Related papers (2022-05-18T21:52:11Z)
- METGEN: A Module-Based Entailment Tree Generation Framework for Answer Explanation [59.33241627273023]
We propose METGEN, a Module-based Entailment Tree GENeration framework with multiple modules and a reasoning controller.
Given a question, METGEN can iteratively generate the entailment tree by conducting single-step entailment with separate modules and selecting the reasoning flow with the controller.
Experiment results show that METGEN can outperform previous state-of-the-art models with only 9% of the parameters.
arXiv Detail & Related papers (2022-05-05T12:06:02Z)
- Fast Interpretable Greedy-Tree Sums [8.268938983372452]
Fast Interpretable Greedy-Tree Sums (FIGS) generalizes the CART algorithm to grow a flexible number of trees in summation.
G-FIGS derives CDIs that reflect domain knowledge and enjoy improved specificity (by up to 20% over CART) without sacrificing sensitivity or interpretability.
Bagging-FIGS enjoys competitive performance with random forests and XGBoost on real-world datasets.
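The "trees in summation" idea behind FIGS can be sketched briefly: a prediction is the sum of the outputs of several shallow trees, rather than the output of one deep tree. The tuple-based tree encoding below is an assumption for illustration, not the actual `imodels` API.

```python
# Toy sketch of an additive tree-sum prediction, as in FIGS-style models.
# An internal node is (feature_index, threshold, left, right); a leaf is
# a plain numeric value.

def predict_tree(tree, x):
    """Route x down one shallow tree and return its leaf value."""
    while isinstance(tree, tuple):
        feat, thr, left, right = tree
        tree = left if x[feat] <= thr else right
    return tree

def predict_figs(trees, x):
    """The model's prediction is the sum over all trees."""
    return sum(predict_tree(t, x) for t in trees)

# Two depth-1 trees (stumps): the first splits on feature 0, the second
# on feature 1; their contributions are added.
trees = [(0, 0.5, 1.0, 2.0), (1, 0.0, -1.0, 0.5)]
print(predict_figs(trees, [0.3, 0.2]))   # 1.0 + 0.5 = 1.5
```

Because each tree stays shallow, each summand can be read on its own, which is the interpretability angle the summary refers to.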
arXiv Detail & Related papers (2022-01-28T04:50:37Z)
- Learning to Synthesize Data for Semantic Parsing [57.190817162674875]
We propose a generative model which models the composition of programs and maps a program to an utterance.
Due to the simplicity of PCFG and pre-trained BART, our generative model can be efficiently learned from existing data at hand.
We evaluate our method in both in-domain and out-of-domain settings of text-to-SQL parsing on the standard GeoQuery and Spider benchmarks.
arXiv Detail & Related papers (2021-04-12T21:24:02Z)
- Closed-Form Expressions for Global and Local Interpretation of Tsetlin Machines with Applications to Explaining High-Dimensional Data [7.05622249909585]
We propose closed-form expressions for understanding why a TM model makes a specific prediction (local interpretability)
We also introduce expressions for measuring the importance of feature value ranges for continuous features.
For both classification and regression, our evaluation shows correspondence with SHAP as well as competitive prediction accuracy in comparison with XGBoost, Explainable Boosting Machines, and Neural Additive Models.
arXiv Detail & Related papers (2020-07-27T21:47:24Z) - Tree-AMP: Compositional Inference with Tree Approximate Message Passing [23.509275850721778]
Tree-AMP is a Python package for compositional inference in high-dimensional tree-structured models.
The package provides a unifying framework to study several approximate message passing algorithms.
arXiv Detail & Related papers (2020-04-03T13:51:10Z)
This list is automatically generated from the titles and abstracts of the papers on this site.