Genetic Programming for Explainable Manifold Learning
- URL: http://arxiv.org/abs/2403.14139v1
- Date: Thu, 21 Mar 2024 05:17:22 GMT
- Title: Genetic Programming for Explainable Manifold Learning
- Authors: Ben Cravens, Andrew Lensen, Paula Maddigan, Bing Xue
- Abstract summary: We introduce Genetic Programming for Explainable Manifold Learning (GP-EMaL), a novel approach that directly penalises tree complexity.
Our new method is able to maintain high manifold quality while significantly enhancing explainability and also allows customisation of complexity measures.
- Score: 2.370068482059863
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Manifold learning techniques play a pivotal role in machine learning by revealing lower-dimensional embeddings within high-dimensional data, thus enhancing both the efficiency and interpretability of data analysis by transforming the data into a lower-dimensional representation. However, a notable challenge with current manifold learning methods is their lack of explicit functional mappings, crucial for explainability in many real-world applications. Genetic programming, known for its interpretable functional tree-based models, has emerged as a promising approach to address this challenge. Previous research leveraged multi-objective GP to balance manifold quality against embedding dimensionality, producing functional mappings across a range of embedding sizes. Yet, these mapping trees often became complex, hindering explainability. In response, in this paper, we introduce Genetic Programming for Explainable Manifold Learning (GP-EMaL), a novel approach that directly penalises tree complexity. Our new method is able to maintain high manifold quality while significantly enhancing explainability and also allows customisation of complexity measures, such as symmetry balancing, scaling, and node complexity, catering to diverse application needs. Our experimental analysis demonstrates that GP-EMaL is able to match the performance of the existing approach in most cases, while using simpler, smaller, and more interpretable tree structures. This advancement marks a significant step towards achieving interpretable manifold learning.
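As a rough illustration of what "directly penalising tree complexity" could look like, the Python sketch below scores an expression tree by summing per-operator costs. The `Node` class, the `OP_COST` weights, and the `complexity` function are hypothetical names and values chosen for illustration. They are not taken from GP-EMaL, whose actual complexity measure (including symmetry balancing and scaling) is not specified in the abstract.

```python
from dataclasses import dataclass, field
from typing import List

# Hypothetical per-operator costs, for illustration only (not from the paper).
# Leaves that read an input feature are cheap; nonlinear operators cost more.
OP_COST = {"feature": 0.5, "add": 1.0, "sub": 1.0, "mul": 2.0, "max": 3.0}


@dataclass
class Node:
    """One node of a GP expression tree: an operator and its child subtrees."""
    op: str
    children: List["Node"] = field(default_factory=list)


def complexity(node: Node) -> float:
    """Return the node's own cost plus the summed cost of all its subtrees."""
    return OP_COST[node.op] + sum(complexity(child) for child in node.children)


# Example tree representing: f0 + (f1 * f2)
tree = Node("add", [
    Node("feature"),
    Node("mul", [Node("feature"), Node("feature")]),
])

tree_cost = complexity(tree)
```

In a multi-objective setting, a score like `tree_cost` could serve as a second objective to minimise alongside manifold quality, so that simpler trees are preferred when embedding quality is comparable.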
Related papers
- Deep Learning Through A Telescoping Lens: A Simple Model Provides Empirical Insights On Grokking, Gradient Boosting & Beyond [61.18736646013446]
In pursuit of a deeper understanding of its surprising behaviors, we investigate the utility of a simple yet accurate model of a trained neural network.
Across three case studies, we illustrate how it can be applied to derive new empirical insights on a diverse range of prominent phenomena.
arXiv Detail & Related papers (2024-10-31T22:54:34Z)
- Scalable manifold learning by uniform landmark sampling and constrained locally linear embedding [0.6144680854063939]
We propose a scalable manifold learning (scML) method that can manipulate large-scale and high-dimensional data in an efficient manner.
We empirically validated the effectiveness of scML on synthetic datasets and real-world benchmarks of different types.
scML scales well with increasing data sizes and embedding dimensions, and exhibits promising performance in preserving the global structure.
arXiv Detail & Related papers (2024-01-02T08:43:06Z)
- Balancing Explainability-Accuracy of Complex Models [8.402048778245165]
We introduce a new approach for complex models based on the co-relation impact.
We propose approaches for both scenarios of independent features and dependent features.
We provide an upper bound of the complexity of our proposed approach for the dependent features.
arXiv Detail & Related papers (2023-05-23T14:20:38Z)
- Rethinking Complex Queries on Knowledge Graphs with Neural Link Predictors [58.340159346749964]
We propose a new neural-symbolic method to support end-to-end learning using complex queries with provable reasoning capability.
We develop a new dataset containing ten new types of queries with features that have never been considered.
Our method outperforms previous methods significantly in the new dataset and also surpasses previous methods in the existing dataset at the same time.
arXiv Detail & Related papers (2023-04-14T11:35:35Z)
- Stabilizing Q-learning with Linear Architectures for Provably Efficient Learning [53.17258888552998]
This work proposes an exploration variant of the basic $Q$-learning protocol with linear function approximation.
We show that the performance of the algorithm degrades very gracefully under a novel and more permissive notion of approximation error.
arXiv Detail & Related papers (2022-06-01T23:26:51Z)
- Inducing Gaussian Process Networks [80.40892394020797]
We propose inducing Gaussian process networks (IGN), a simple framework for simultaneously learning the feature space as well as the inducing points.
The inducing points, in particular, are learned directly in the feature space, enabling a seamless representation of complex structured domains.
We report on experimental results for real-world data sets showing that IGNs provide significant advances over state-of-the-art methods.
arXiv Detail & Related papers (2022-04-21T05:27:09Z)
- Dist2Cycle: A Simplicial Neural Network for Homology Localization [66.15805004725809]
Simplicial complexes can be viewed as high dimensional generalizations of graphs that explicitly encode multi-way ordered relations.
We propose a graph convolutional model for learning functions parametrized by the $k$-homological features of simplicial complexes.
arXiv Detail & Related papers (2021-10-28T14:59:41Z)
- Genetic Programming for Manifold Learning: Preserving Local Topology [5.226724669049025]
We propose a new approach to using genetic programming for manifold learning, which preserves local topology.
This is expected to significantly improve performance on tasks where local neighbourhood structure (topology) is paramount.
arXiv Detail & Related papers (2021-08-23T03:48:48Z)
- Scalable Gaussian Processes for Data-Driven Design using Big Data with Categorical Factors [14.337297795182181]
Gaussian processes (GP) have difficulties in accommodating big datasets, categorical inputs, and multiple responses.
We propose a GP model that utilizes latent variables and functions obtained through variational inference to address the aforementioned challenges simultaneously.
Our approach is demonstrated for machine learning of ternary oxide materials and topology optimization of a multiscale compliant mechanism.
arXiv Detail & Related papers (2021-06-26T02:17:23Z)
- A Trainable Optimal Transport Embedding for Feature Aggregation and its Relationship to Attention [96.77554122595578]
We introduce a parametrized representation of fixed size, which embeds and then aggregates elements from a given input set according to the optimal transport plan between the set and a trainable reference.
Our approach scales to large datasets and allows end-to-end training of the reference, while also providing a simple unsupervised learning mechanism with small computational cost.
arXiv Detail & Related papers (2020-06-22T08:35:58Z)
- Multi-Objective Genetic Programming for Manifold Learning: Balancing Quality and Dimensionality [4.4181317696554325]
State-of-the-art manifold learning algorithms are opaque in how they perform this transformation.
We introduce a multi-objective approach that automatically balances the competing objectives of manifold quality and dimensionality.
Our proposed approach is competitive with a range of baseline and state-of-the-art manifold learning methods.
arXiv Detail & Related papers (2020-01-05T23:24:33Z)
This list is automatically generated from the titles and abstracts of the papers in this site.