Leveraging Knowlegde Graphs for Interpretable Feature Generation
- URL: http://arxiv.org/abs/2406.00544v1
- Date: Sat, 1 Jun 2024 19:51:29 GMT
- Title: Leveraging Knowlegde Graphs for Interpretable Feature Generation
- Authors: Mohamed Bouadi, Arta Alavi, Salima Benbernou, Mourad Ouziri,
- Abstract summary: KRAFT is an AutoFE framework that leverages a knowledge graph to guide the generation of interpretable features.
Our hybrid AI approach combines a neural generator to transform raw features through a series of transformations and a knowledge-based reasoner to evaluate features interpretability.
The generator is trained through Deep Reinforcement Learning (DRL) to maximize the prediction accuracy and the interpretability of the generated features.
- Score: 0.0
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: The quality of Machine Learning (ML) models strongly depends on the input data, as such Feature Engineering (FE) is often required in ML. In addition, with the proliferation of ML-powered systems, especially in critical contexts, the need for interpretability and explainability becomes increasingly important. Since manual FE is time-consuming and requires case specific knowledge, we propose KRAFT, an AutoFE framework that leverages a knowledge graph to guide the generation of interpretable features. Our hybrid AI approach combines a neural generator to transform raw features through a series of transformations and a knowledge-based reasoner to evaluate features interpretability using Description Logics (DL). The generator is trained through Deep Reinforcement Learning (DRL) to maximize the prediction accuracy and the interpretability of the generated features. Extensive experiments on real datasets demonstrate that KRAFT significantly improves accuracy while ensuring a high level of interpretability.
Related papers
- Understanding the Role of Equivariance in Self-supervised Learning [51.56331245499712]
equivariant self-supervised learning (E-SSL) learns features to be augmentation-aware.
We identify a critical explaining-away effect in E-SSL that creates a synergy between the equivariant and classification tasks.
We reveal several principles for practical designs of E-SSL.
arXiv Detail & Related papers (2024-11-10T16:09:47Z) - Semantic-Guided RL for Interpretable Feature Engineering [0.0]
We introduce SMART, a hybrid approach that uses semantic technologies to guide the generation of interpretable features.
Our experiments on public datasets demonstrate that SMART significantly improves prediction accuracy while ensuring a high level of interpretability.
arXiv Detail & Related papers (2024-10-03T14:28:05Z) - Enhancing AI-based Generation of Software Exploits with Contextual Information [9.327315119028809]
The study employs a dataset comprising real shellcodes to evaluate the models across various scenarios.
The experiments are designed to assess the models' resilience against incomplete descriptions, their proficiency in leveraging context for enhanced accuracy, and their ability to discern irrelevant information.
The models demonstrate an ability to filter out unnecessary context, maintaining high levels of accuracy in the generation of offensive security code.
arXiv Detail & Related papers (2024-08-05T11:52:34Z) - Understanding Encoder-Decoder Structures in Machine Learning Using Information Measures [10.066310107046084]
We present new results to model and understand the role of encoder-decoder design in machine learning (ML)
We use two main information concepts, information sufficiency (IS) and mutual information loss (MIL), to represent predictive structures in machine learning.
arXiv Detail & Related papers (2024-05-30T19:58:01Z) - Prospector Heads: Generalized Feature Attribution for Large Models & Data [82.02696069543454]
We introduce prospector heads, an efficient and interpretable alternative to explanation-based attribution methods.
We demonstrate how prospector heads enable improved interpretation and discovery of class-specific patterns in input data.
arXiv Detail & Related papers (2024-02-18T23:01:28Z) - IB-UQ: Information bottleneck based uncertainty quantification for
neural function regression and neural operator learning [11.5992081385106]
We propose a novel framework for uncertainty quantification via information bottleneck (IB-UQ) for scientific machine learning tasks.
We incorporate the bottleneck by a confidence-aware encoder, which encodes inputs into latent representations according to the confidence of the input data.
We also propose a data augmentation based information bottleneck objective which can enhance the quality of the extrapolation uncertainty.
arXiv Detail & Related papers (2023-02-07T05:56:42Z) - Audacity of huge: overcoming challenges of data scarcity and data
quality for machine learning in computational materials discovery [1.0036312061637764]
Machine learning (ML)-accelerated discovery requires large amounts of high-fidelity data to reveal predictive structure-property relationships.
For many properties of interest in materials discovery, the challenging nature and high cost of data generation has resulted in a data landscape that is scarcely populated and of dubious quality.
In the absence of manual curation, increasingly sophisticated natural language processing and automated image analysis are making it possible to learn structure-property relationships from the literature.
arXiv Detail & Related papers (2021-11-02T21:43:58Z) - Generative Counterfactuals for Neural Networks via Attribute-Informed
Perturbation [51.29486247405601]
We design a framework to generate counterfactuals for raw data instances with the proposed Attribute-Informed Perturbation (AIP)
By utilizing generative models conditioned with different attributes, counterfactuals with desired labels can be obtained effectively and efficiently.
Experimental results on real-world texts and images demonstrate the effectiveness, sample quality as well as efficiency of our designed framework.
arXiv Detail & Related papers (2021-01-18T08:37:13Z) - Learning summary features of time series for likelihood free inference [93.08098361687722]
We present a data-driven strategy for automatically learning summary features from time series data.
Our results indicate that learning summary features from data can compete and even outperform LFI methods based on hand-crafted values.
arXiv Detail & Related papers (2020-12-04T19:21:37Z) - Learning to Learn Kernels with Variational Random Features [118.09565227041844]
We introduce kernels with random Fourier features in the meta-learning framework to leverage their strong few-shot learning ability.
We formulate the optimization of MetaVRF as a variational inference problem.
We show that MetaVRF delivers much better, or at least competitive, performance compared to existing meta-learning alternatives.
arXiv Detail & Related papers (2020-06-11T18:05:29Z) - Multilinear Compressive Learning with Prior Knowledge [106.12874293597754]
Multilinear Compressive Learning (MCL) framework combines Multilinear Compressive Sensing and Machine Learning into an end-to-end system.
Key idea behind MCL is the assumption of the existence of a tensor subspace which can capture the essential features from the signal for the downstream learning task.
In this paper, we propose a novel solution to address both of the aforementioned requirements, i.e., How to find those tensor subspaces in which the signals of interest are highly separable?
arXiv Detail & Related papers (2020-02-17T19:06:05Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.