Related papers: Genetic Programming for Explainable Manifold Learning

Genetic Programming for Explainable Manifold Learning

URL: http://arxiv.org/abs/2403.14139v1
Date: Thu, 21 Mar 2024 05:17:22 GMT
Title: Genetic Programming for Explainable Manifold Learning
Authors: Ben Cravens, Andrew Lensen, Paula Maddigan, Bing Xue,
Abstract summary: We introduce Genetic Programming for Explainable Manifold Learning (GP-EMaL), a novel approach that directly penalises tree complexity. Our new method is able to maintain high manifold quality while significantly enhancing explainability and also allows customisation of complexity measures.
Score: 2.370068482059863
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Manifold learning techniques play a pivotal role in machine learning by revealing lower-dimensional embeddings within high-dimensional data, thus enhancing both the efficiency and interpretability of data analysis by transforming the data into a lower-dimensional representation. However, a notable challenge with current manifold learning methods is their lack of explicit functional mappings, crucial for explainability in many real-world applications. Genetic programming, known for its interpretable functional tree-based models, has emerged as a promising approach to address this challenge. Previous research leveraged multi-objective GP to balance manifold quality against embedding dimensionality, producing functional mappings across a range of embedding sizes. Yet, these mapping trees often became complex, hindering explainability. In response, in this paper, we introduce Genetic Programming for Explainable Manifold Learning (GP-EMaL), a novel approach that directly penalises tree complexity. Our new method is able to maintain high manifold quality while significantly enhancing explainability and also allows customisation of complexity measures, such as symmetry balancing, scaling, and node complexity, catering to diverse application needs. Our experimental analysis demonstrates that GP-EMaL is able to match the performance of the existing approach in most cases, while using simpler, smaller, and more interpretable tree structures. This advancement marks a significant step towards achieving interpretable manifold learning.

Related papers

Understanding In-Context Learning on Structured Manifolds: Bridging Attention to Kernel Methods [48.038668788625465]
In-context learning (ICL) has achieved remarkable success in natural language and vision domains.<n>In this work, we initiate a theoretical study of ICL for regression of H"older functions on manifold.<n>Our findings provide foundational insights into the role of geometry in ICL and novels tools to study ICL of nonlinear models.
arXiv Detail & Related papers (2025-06-12T17:56:26Z)
Manifold Learning with Normalizing Flows: Towards Regularity, Expressivity and Iso-Riemannian Geometry [8.020732438595905]
This work focuses on addressing distortions and modeling errors that can arise in the multi-modal setting.<n>We showcase the effectiveness of the synergy of the proposed approaches in several numerical experiments with both synthetic and real data.
arXiv Detail & Related papers (2025-05-12T21:44:42Z)
Deep Learning Through A Telescoping Lens: A Simple Model Provides Empirical Insights On Grokking, Gradient Boosting & Beyond [61.18736646013446]
In pursuit of a deeper understanding of its surprising behaviors, we investigate the utility of a simple yet accurate model of a trained neural network. Across three case studies, we illustrate how it can be applied to derive new empirical insights on a diverse range of prominent phenomena.
arXiv Detail & Related papers (2024-10-31T22:54:34Z)
Scalable manifold learning by uniform landmark sampling and constrained locally linear embedding [0.6144680854063939]
We propose a scalable manifold learning (scML) method that can manipulate large-scale and high-dimensional data in an efficient manner. We empirically validated the effectiveness of scML on synthetic datasets and real-world benchmarks of different types. scML scales well with increasing data sizes and embedding dimensions, and exhibits promising performance in preserving the global structure.
arXiv Detail & Related papers (2024-01-02T08:43:06Z)
Balancing Explainability-Accuracy of Complex Models [8.402048778245165]
We introduce a new approach for complex models based on the co-relation impact. We propose approaches for both scenarios of independent features and dependent features. We provide an upper bound of the complexity of our proposed approach for the dependent features.
arXiv Detail & Related papers (2023-05-23T14:20:38Z)
Rethinking Complex Queries on Knowledge Graphs with Neural Link Predictors [58.340159346749964]
We propose a new neural-symbolic method to support end-to-end learning using complex queries with provable reasoning capability. We develop a new dataset containing ten new types of queries with features that have never been considered. Our method outperforms previous methods significantly in the new dataset and also surpasses previous methods in the existing dataset at the same time.
arXiv Detail & Related papers (2023-04-14T11:35:35Z)
Stabilizing Q-learning with Linear Architectures for Provably Efficient Learning [53.17258888552998]
This work proposes an exploration variant of the basic $Q$-learning protocol with linear function approximation. We show that the performance of the algorithm degrades very gracefully under a novel and more permissive notion of approximation error.
arXiv Detail & Related papers (2022-06-01T23:26:51Z)
Inducing Gaussian Process Networks [80.40892394020797]
We propose inducing Gaussian process networks (IGN), a simple framework for simultaneously learning the feature space as well as the inducing points. The inducing points, in particular, are learned directly in the feature space, enabling a seamless representation of complex structured domains. We report on experimental results for real-world data sets showing that IGNs provide significant advances over state-of-the-art methods.
arXiv Detail & Related papers (2022-04-21T05:27:09Z)
Dist2Cycle: A Simplicial Neural Network for Homology Localization [66.15805004725809]
Simplicial complexes can be viewed as high dimensional generalizations of graphs that explicitly encode multi-way ordered relations. We propose a graph convolutional model for learning functions parametrized by the $k$-homological features of simplicial complexes.
arXiv Detail & Related papers (2021-10-28T14:59:41Z)
Genetic Programming for Manifold Learning: Preserving Local Topology [5.226724669049025]
We propose a new approach to using genetic programming for manifold learning, which preserves local topology. This is expected to significantly improve performance on tasks where local neighbourhood structure (topology) is paramount.
arXiv Detail & Related papers (2021-08-23T03:48:48Z)
Scalable Gaussian Processes for Data-Driven Design using Big Data with Categorical Factors [14.337297795182181]
Gaussian processes (GP) have difficulties in accommodating big datasets, categorical inputs, and multiple responses. We propose a GP model that utilizes latent variables and functions obtained through variational inference to address the aforementioned challenges simultaneously. Our approach is demonstrated for machine learning of ternary oxide materials and topology optimization of a multiscale compliant mechanism.
arXiv Detail & Related papers (2021-06-26T02:17:23Z)
A Trainable Optimal Transport Embedding for Feature Aggregation and its Relationship to Attention [96.77554122595578]
We introduce a parametrized representation of fixed size, which embeds and then aggregates elements from a given input set according to the optimal transport plan between the set and a trainable reference. Our approach scales to large datasets and allows end-to-end training of the reference, while also providing a simple unsupervised learning mechanism with small computational cost.
arXiv Detail & Related papers (2020-06-22T08:35:58Z)
Multi-Objective Genetic Programming for Manifold Learning: Balancing Quality and Dimensionality [4.4181317696554325]
State-of-the-art manifold learning algorithms are opaque in how they perform this transformation. We introduce a multi-objective approach that automatically balances the competing objectives of manifold quality and dimensionality. Our proposed approach is competitive with a range of baseline and state-of-the-art manifold learning methods.
arXiv Detail & Related papers (2020-01-05T23:24:33Z)

This list is automatically generated from the titles and abstracts of the papers in this site.