Introns and Templates Matter: Rethinking Linkage in GP-GOMEA
- URL: http://arxiv.org/abs/2602.02311v1
- Date: Mon, 02 Feb 2026 16:42:30 GMT
- Title: Introns and Templates Matter: Rethinking Linkage in GP-GOMEA
- Authors: Johannes Koch, Tanja Alderliesten, Peter A. N. Bosman,
- Abstract summary: In GP-GOMEA, mutual information between node positions in GP trees has so far been used to learn linkage.<n>We propose two new measures for linkage learning, one that explicitly considers introns in mutual information estimates, and one that revisits linkage learning in GP-GOMEA from a grey-box perspective.<n>We find that the newly learned linkage structure closely reflects the template linkage structure, and that explicitly using the template structure yields the best performance overall.
- Score: 1.2234742322758418
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: GP-GOMEA is among the state-of-the-art for symbolic regression, especially when it comes to finding small and potentially interpretable solutions. A key mechanism employed in any GOMEA variant is the exploitation of linkage, the dependencies between variables, to ensure efficient evolution. In GP-GOMEA, mutual information between node positions in GP trees has so far been used to learn linkage. For this, a fixed expression template is used. This however leads to introns for expressions smaller than the full template. As introns have no impact on fitness, their occurrences are not directly linked to selection. Consequently, introns can adversely affect the extent to which mutual information captures dependencies between tree nodes. To overcome this, we propose two new measures for linkage learning, one that explicitly considers introns in mutual information estimates, and one that revisits linkage learning in GP-GOMEA from a grey-box perspective, yielding a measure that needs not to be learned from the population but is derived directly from the template. Across five standard symbolic regression problems, GP-GOMEA achieves substantial improvements using both measures. We also find that the newly learned linkage structure closely reflects the template linkage structure, and that explicitly using the template structure yields the best performance overall.
Related papers
- Aggregation-aware MLP: An Unsupervised Approach for Graph Message-passing [10.93155007218297]
"AMLP" is an unsupervised framework that shifts the paradigm from directly crafting aggregation functions to making adaptive aggregation.<n>Our approach consists of two key steps: First, we utilize a graph reconstruction that facilitates high-order grouping effects, and second, we employ a single-layer network to encode varying degrees of heterophily.
arXiv Detail & Related papers (2025-07-27T04:52:55Z) - Thinking Outside the Template with Modular GP-GOMEA [0.0]
This paper presents a modular representation for GP-GOMEA that allows multiple trees to be evolved simultaneously that can be used as (functional) subexpressions.<n>We find that our proposed approach generally outperforms single-template GP-GOMEA and can moreover uncover ground-truth expressions underlying synthetic datasets with modular subexpressions at a faster rate than GP-GOMEA without modular subexpressions.
arXiv Detail & Related papers (2025-05-02T13:29:55Z) - Beyond Message Passing: Neural Graph Pattern Machine [50.78679002846741]
We introduce the Neural Graph Pattern Machine (GPM), a novel framework that bypasses message passing by learning directly from graph substructures.<n>GPM efficiently extracts, encodes, and prioritizes task-relevant graph patterns, offering greater expressivity and improved ability to capture long-range dependencies.
arXiv Detail & Related papers (2025-01-30T20:37:47Z) - Improving the efficiency of GP-GOMEA for higher-arity operators [0.0]
Genetic Programming (GP) can offer a way to evolve inherently interpretable expressions.
GP-GOMEA is a form of GP that has been found particularly effective at evolving expressions that are accurate yet of limited size and, thus, promote interpretability.
This paper proposes two enhancements to GP-GOMEA: semantic subtree inheritance and greedy child selection.
arXiv Detail & Related papers (2024-02-15T10:20:40Z) - OrthoReg: Improving Graph-regularized MLPs via Orthogonality
Regularization [66.30021126251725]
Graph Neural Networks (GNNs) are currently dominating in modeling graphstructure data.
Graph-regularized networks (GR-MLPs) implicitly inject the graph structure information into model weights, while their performance can hardly match that of GNNs in most tasks.
We show that GR-MLPs suffer from dimensional collapse, a phenomenon in which the largest a few eigenvalues dominate the embedding space.
We propose OrthoReg, a novel GR-MLP model to mitigate the dimensional collapse issue.
arXiv Detail & Related papers (2023-01-31T21:20:48Z) - Graph Spectral Embedding using the Geodesic Betweeness Centrality [76.27138343125985]
We introduce the Graph Sylvester Embedding (GSE), an unsupervised graph representation of local similarity, connectivity, and global structure.
GSE uses the solution of the Sylvester equation to capture both network structure and neighborhood proximity in a single representation.
arXiv Detail & Related papers (2022-05-07T04:11:23Z) - RU-Net: Regularized Unrolling Network for Scene Graph Generation [92.95032610978511]
Scene graph generation (SGG) aims to detect objects and predict the relationships between each pair of objects.
Existing SGG methods usually suffer from several issues, including 1) ambiguous object representations, and 2) low diversity in relationship predictions.
We propose a regularized unrolling network (RU-Net) to address both problems.
arXiv Detail & Related papers (2022-05-03T04:21:15Z) - Incremental Ensemble Gaussian Processes [53.3291389385672]
We propose an incremental ensemble (IE-) GP framework, where an EGP meta-learner employs an it ensemble of GP learners, each having a unique kernel belonging to a prescribed kernel dictionary.
With each GP expert leveraging the random feature-based approximation to perform online prediction and model update with it scalability, the EGP meta-learner capitalizes on data-adaptive weights to synthesize the per-expert predictions.
The novel IE-GP is generalized to accommodate time-varying functions by modeling structured dynamics at the EGP meta-learner and within each GP learner.
arXiv Detail & Related papers (2021-10-13T15:11:25Z) - Structure-aware Interactive Graph Neural Networks for the Prediction of
Protein-Ligand Binding Affinity [52.67037774136973]
Drug discovery often relies on the successful prediction of protein-ligand binding affinity.
Recent advances have shown great promise in applying graph neural networks (GNNs) for better affinity prediction by learning the representations of protein-ligand complexes.
We propose a structure-aware interactive graph neural network (SIGN) which consists of two components: polar-inspired graph attention layers (PGAL) and pairwise interactive pooling (PiPool)
arXiv Detail & Related papers (2021-07-21T03:34:09Z) - Adaptive Universal Generalized PageRank Graph Neural Network [36.850433364139924]
Graph neural networks (GNNs) are designed to exploit both sources of evidence but they do not optimally trade-off their utility.
We introduce a new Generalized PageRank (GPR) GNN architecture that adaptively learns the GPR weights.
GPR-GNN offers significant performance improvement compared to existing techniques on both synthetic and benchmark data.
arXiv Detail & Related papers (2020-06-14T19:27:39Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.