Retrieval-Augmented Feature Generation for Domain-Specific Classification
- URL: http://arxiv.org/abs/2406.11177v2
- Date: Sat, 28 Dec 2024 21:16:38 GMT
- Title: Retrieval-Augmented Feature Generation for Domain-Specific Classification
- Authors: Xinhao Zhang, Jinghan Zhang, Fengran Mo, Yuzhong Chen, Kunpeng Liu,
- Abstract summary: This paper introduces a new method RAFG for generating reasonable and explainable features specific to domain classification tasks.
To generate new features with interpretability in domain knowledge, we perform information retrieval on existing features to identify potential feature associations.
We develop a Large Language Model (LLM)-based framework for feature generation with reasoning to verify and filter features during the generation process.
- Score: 7.445440204397416
- Abstract: Feature generation can significantly enhance learning outcomes, particularly for tasks with limited data. An effective way to improve feature generation is by expanding the current feature space using existing features and enriching the informational content. However, generating new, interpretable features in application fields often requires domain-specific knowledge about the existing features. This paper introduces a new method RAFG for generating reasonable and explainable features specific to domain classification tasks. To generate new features with interpretability in domain knowledge, we perform information retrieval on existing features to identify potential feature associations, and utilize these associations to generate meaningful features. Furthermore, we develop a Large Language Model (LLM)-based framework for feature generation with reasoning to verify and filter features during the generation process. Experiments across several datasets in medical, economic, and geographic domains show that our RAFG method produces high-quality, meaningful features and significantly improves classification performance compared with baseline methods.
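The retrieve, generate, and filter loop described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the toy documents, the rule-based `propose_feature` (a stand-in for the LLM generation step), and the threshold-accuracy filter (a stand-in for LLM-based verification) are all assumptions made only for illustration.

```python
from itertools import combinations

# Hypothetical toy knowledge base and dataset (not from the paper).
DOCS = [
    "weight divided by height reflects body build",
    "age is recorded at admission",
]
FEATURES = ["weight", "height", "age"]
ROWS = [
    {"weight": 50, "height": 1.5, "age": 30},
    {"weight": 66, "height": 1.5, "age": 40},
    {"weight": 70, "height": 2.0, "age": 50},
    {"weight": 90, "height": 2.0, "age": 60},
]
LABELS = [0, 1, 0, 1]


def retrieve_associations(features, docs):
    """Return feature pairs that co-occur in a document, with that document as evidence."""
    pairs = []
    for a, b in combinations(features, 2):
        support = [d for d in docs if a in d.split() and b in d.split()]
        if support:
            pairs.append((a, b, support[0]))
    return pairs


def propose_feature(a, b, evidence):
    """Stand-in for the LLM generation step: pick an operation from the evidence text."""
    op = "ratio" if "divided" in evidence else "product"
    return {"name": f"{a}_{op}_{b}", "inputs": (a, b), "op": op, "reason": evidence}


def compute(row, feat):
    """Evaluate a proposed feature on one data row."""
    x, y = row[feat["inputs"][0]], row[feat["inputs"][1]]
    return x / y if feat["op"] == "ratio" else x * y


def threshold_accuracy(values, labels):
    """Best accuracy of a single-feature threshold classifier, in either direction."""
    best = 0.0
    for t in values:
        preds = [v >= t for v in values]
        acc = sum(p == bool(l) for p, l in zip(preds, labels)) / len(labels)
        best = max(best, acc, 1 - acc)
    return best


def rafg_sketch(rows, labels, features, docs):
    """Retrieve associations, propose features, keep only those that beat the baseline."""
    baseline = max(threshold_accuracy([r[f] for r in rows], labels) for f in features)
    kept = []
    for a, b, evidence in retrieve_associations(features, docs):
        feat = propose_feature(a, b, evidence)
        score = threshold_accuracy([compute(r, feat) for r in rows], labels)
        if score > baseline:  # assumed filter: keep only features that help
            kept.append((feat["name"], score))
    return baseline, kept


baseline, kept = rafg_sketch(ROWS, LABELS, FEATURES, DOCS)
print(baseline, kept)
```

On this toy data no single original feature separates the classes, while the retrieved weight and height association yields a ratio feature that does, so it is the only candidate that passes the filter. In a real pipeline the retrieval corpus, the generation prompt, and the verification criterion would all be far richer than these placeholders.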
Related papers
- Augmented Functional Random Forests: Classifier Construction and Unbiased Functional Principal Components Importance through Ad-Hoc Conditional Permutations [0.0]
This paper introduces a novel supervised classification strategy that integrates functional data analysis with tree-based methods.
We propose augmented versions of functional classification trees and functional random forests, incorporating a new tool for assessing the importance of functional principal components.
arXiv Detail & Related papers (2024-08-23T15:58:41Z)
- Prospector Heads: Generalized Feature Attribution for Large Models & Data [82.02696069543454]
We introduce prospector heads, an efficient and interpretable alternative to explanation-based attribution methods.
We demonstrate how prospector heads enable improved interpretation and discovery of class-specific patterns in input data.
arXiv Detail & Related papers (2024-02-18T23:01:28Z)
- Enhancing Retrieval-Augmented Large Language Models with Iterative Retrieval-Generation Synergy [164.83371924650294]
We show that strong performance can be achieved by a method we call Iter-RetGen, which synergizes retrieval and generation in an iterative manner.
A model output shows what might be needed to finish a task, and thus provides an informative context for retrieving more relevant knowledge.
Iter-RetGen processes all retrieved knowledge as a whole and largely preserves the flexibility in generation without structural constraints.
arXiv Detail & Related papers (2023-05-24T16:17:36Z)
- Challenges of ELA-guided Function Evolution using Genetic Programming [0.0]
We show that a genetic programming approach guided by exploratory landscape analysis (ELA) properties is not always able to find satisfying functions.
Our results suggest that careful consideration of the weighting of landscape properties, as well as of the distance measure used, may be required to evolve functions that are sufficiently representative of the target landscape.
arXiv Detail & Related papers (2023-05-24T15:31:01Z)
- Feature construction using explanations of individual predictions [0.0]
We propose a novel approach for reducing the search space based on aggregation of instance-based explanations of predictive models.
We empirically show that reducing the search to these groups significantly reduces the time of feature construction.
We show significant improvements in classification accuracy for several classifiers and demonstrate the feasibility of the proposed feature construction even for large datasets.
arXiv Detail & Related papers (2023-01-23T18:59:01Z)
- Group-wise Reinforcement Feature Generation for Optimal and Explainable Representation Space Reconstruction [25.604176830832586]
We reformulate representation space reconstruction into an interactive process of nested feature generation and selection.
We design a group-wise generation strategy to cross a feature group, an operation, and another feature group to generate new features.
We present extensive experiments to demonstrate the effectiveness, efficiency, traceability, and explicitness of our system.
arXiv Detail & Related papers (2022-05-28T21:34:14Z)
- Attribute Prototype Network for Any-Shot Learning [113.50220968583353]
We argue that an image representation with integrated attribute localization ability would be beneficial for any-shot, i.e. zero-shot and few-shot, image classification tasks.
We propose a novel representation learning framework that jointly learns global and local features using only class-level attributes.
arXiv Detail & Related papers (2022-04-04T02:25:40Z)
- Towards Open-World Feature Extrapolation: An Inductive Graph Learning Approach [80.8446673089281]
We propose a new learning paradigm with graph representation and learning.
Our framework contains two modules: 1) a backbone network (e.g., a feedforward neural net) that serves as the lower model, taking features as input and outputting predicted labels; 2) a graph neural network that serves as the upper model, learning to extrapolate embeddings for new features via message passing over a feature-data graph built from observed data.
arXiv Detail & Related papers (2021-10-09T09:02:45Z)
- Structure-Aware Feature Generation for Zero-Shot Learning [108.76968151682621]
We introduce a novel structure-aware feature generation scheme, termed SA-GAN, to account for the topological structure in learning both the latent space and the generative networks.
Our method significantly enhances the generalization capability on unseen classes and consequently improves the classification performance.
arXiv Detail & Related papers (2021-08-16T11:52:08Z)
- Global Context-Aware Progressive Aggregation Network for Salient Object Detection [117.943116761278]
We propose a novel network named GCPANet to integrate low-level appearance features, high-level semantic features, and global context features.
We show that the proposed approach outperforms the state-of-the-art methods both quantitatively and qualitatively.
arXiv Detail & Related papers (2020-03-02T04:26:10Z)
This list is automatically generated from the titles and abstracts of the papers in this site.