Neural Feature Learning in Function Space
- URL: http://arxiv.org/abs/2309.10140v3
- Date: Sun, 26 May 2024 16:53:25 GMT
- Title: Neural Feature Learning in Function Space
- Authors: Xiangxiang Xu, Lizhong Zheng
- Abstract summary: We present a novel framework for learning system design with neural feature extractors.
We introduce the feature geometry, which unifies statistical dependence and feature representations in a function space equipped with inner products.
We propose a nesting technique, which provides systematic algorithm designs for learning the optimal features from data samples.
- Score: 5.807950618412389
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We present a novel framework for learning system design with neural feature extractors. First, we introduce the feature geometry, which unifies statistical dependence and feature representations in a function space equipped with inner products. This connection defines function-space concepts on statistical dependence, such as norms, orthogonal projection, and spectral decomposition, exhibiting clear operational meanings. In particular, we associate each learning setting with a dependence component and formulate learning tasks as finding corresponding feature approximations. We propose a nesting technique, which provides systematic algorithm designs for learning the optimal features from data samples with off-the-shelf network architectures and optimizers. We further demonstrate multivariate learning applications, including conditional inference and multimodal learning, where we present the optimal features and reveal their connections to classical approaches.
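For discrete bivariate data, the feature geometry admits a concrete form: the inner product between features is an expectation under the data distribution, and the optimal feature pairs are singular functions of the normalized joint-distribution matrix. Below is a minimal NumPy sketch of this view (our illustration, not the authors' code); the toy distribution and helper names are invented.

```python
# A minimal sketch (not the paper's implementation) of the "feature
# geometry" idea for discrete X, Y: features are functions on the
# alphabet, the inner product is an expectation under P_X, and optimal
# feature pairs come from the SVD of the normalized joint distribution.
import numpy as np

def inner(f, g, px):
    """Function-space inner product <f, g> = E_{P_X}[f(X) g(X)]."""
    return np.sum(px * f * g)

def top_features(pxy, k=2):
    """Top-k feature pairs via SVD of B(x,y) = P(x,y)/sqrt(P(x)P(y))."""
    px, py = pxy.sum(1), pxy.sum(0)
    B = pxy / np.sqrt(np.outer(px, py))
    U, s, Vt = np.linalg.svd(B)
    # Skip the trivial top singular pair (constant functions, sigma = 1);
    # divide out sqrt(P) to map singular vectors back to functions.
    f = U[:, 1:k+1] / np.sqrt(px)[:, None]
    g = Vt[1:k+1].T / np.sqrt(py)[:, None]
    return f, g, s[1:k+1]

pxy = np.array([[0.25, 0.05], [0.10, 0.60]])  # toy joint distribution
f, g, sigma = top_features(pxy, k=1)
print(sigma)                                  # maximal correlation of X, Y
print(inner(f[:, 0], f[:, 0], pxy.sum(1)))    # unit norm under P_X
```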
Related papers
- Hierarchical Ensemble-Based Feature Selection for Time Series Forecasting [0.0]
We introduce a novel ensemble approach for feature selection based on hierarchical stacking, designed for non-stationary data.
Our approach exploits the co-dependency between features using a hierarchical structure.
The effectiveness of the approach is demonstrated on synthetic and well-known real-life datasets.
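A rough sketch of how hierarchical stacking with feature co-dependence might look; the base scorers, the penalty weight, and the combination rule are all invented for illustration and do not reproduce the paper's method.

```python
# Level 1: simple per-feature scorers. Level 2: a stacked combination
# that down-weights features redundant with others (co-dependence).
import numpy as np

def base_scores(X, y):
    """Two toy base scorers: |correlation with y| and scaled variance."""
    corr = np.abs([np.corrcoef(X[:, j], y)[0, 1] for j in range(X.shape[1])])
    var = X.var(0) / (X.var(0).max() + 1e-12)
    return np.stack([corr, var])

def hierarchical_select(X, y, k=3):
    S = base_scores(X, y)                         # level 1: per-feature scores
    codep = np.abs(np.corrcoef(X, rowvar=False))  # feature co-dependence
    penalty = codep.mean(0)                       # redundant features score lower
    combined = S.mean(0) - 0.5 * penalty          # level 2: stacked combination
    return np.argsort(combined)[::-1][:k]

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 6))
y = X[:, 0] + 0.5 * X[:, 1] + 0.1 * rng.normal(size=200)
print(hierarchical_select(X, y))
```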
arXiv Detail & Related papers (2023-10-26T16:40:09Z)
- Offline Reinforcement Learning with Differentiable Function Approximation is Provably Efficient [65.08966446962845]
Offline reinforcement learning, which aims to optimize decision-making strategies using historical data, has been widely applied in real-life settings.
We take a step forward by considering offline reinforcement learning with differentiable function class approximation (DFA).
Most importantly, we show offline differentiable function approximation is provably efficient by analyzing the pessimistic fitted Q-learning algorithm.
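A minimal sketch of pessimistic fitted Q-iteration, the algorithm family the paper analyzes, specialized to a linear function class. The elliptical bonus is a stand-in for the paper's uncertainty quantifier, and the dataset format and toy features are assumptions.

```python
# Fitted Q-iteration with a pessimistic penalty: Q-values are lowered in
# state-action directions the offline data covers poorly.
import numpy as np

def pessimistic_fqi(data, phi, n_actions, gamma=0.99, iters=50, beta=1.0):
    """data: list of (s, a, r, s') tuples; phi(s, a) -> feature vector."""
    d = phi(data[0][0], data[0][1]).shape[0]
    w = np.zeros(d)
    Phi = np.array([phi(s, a) for s, a, _, _ in data])
    Linv = np.linalg.inv(Phi.T @ Phi + np.eye(d))  # regularized covariance
    for _ in range(iters):
        targets = []
        for s, a, r, s2 in data:
            # Pessimism: subtract an elliptical uncertainty bonus.
            qs = [phi(s2, a2) @ w
                  - beta * np.sqrt(phi(s2, a2) @ Linv @ phi(s2, a2))
                  for a2 in range(n_actions)]
            targets.append(r + gamma * max(qs))
        # Least-squares regression onto the targets (fitted Q step).
        w = Linv @ Phi.T @ np.array(targets)
    return w

phi = lambda s, a: np.array([s, float(a), 1.0])   # toy features
data = [(0.0, 0, 1.0, 1.0), (1.0, 1, 0.0, 0.0)]
print(pessimistic_fqi(data, phi, n_actions=2))
```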
arXiv Detail & Related papers (2022-10-03T07:59:42Z)
- Minimax Optimal Kernel Operator Learning via Multilevel Training [11.36492861074981]
We study the statistical limit of learning a Hilbert-Schmidt operator between two infinite-dimensional Sobolev reproducing kernel Hilbert spaces.
We develop a multilevel kernel operator learning algorithm that is optimal when learning linear operators between infinite-dimensional function spaces.
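The multilevel scheme is what achieves minimax optimality; the toy sketch below shows only a basic one-level kernel-ridge estimate of an operator (here, differentiation) from sampled input-output function pairs. The grid, kernel, and regularization are chosen purely for illustration.

```python
# Operator learning as kernel ridge regression over whole functions:
# each training example is a (function in, function out) pair, sampled
# on a shared grid, and a Gaussian kernel compares input functions.
import numpy as np

def fit_operator(U, V, lam=1e-3, gamma=1.0):
    """U, V: (n, m) arrays, rows are functions sampled on m grid points."""
    D = ((U[:, None, :] - U[None, :, :]) ** 2).sum(-1)
    K = np.exp(-gamma * D)                     # Gaussian kernel on inputs
    alpha = np.linalg.solve(K + lam * np.eye(len(U)), V)
    def apply(u_new):
        k = np.exp(-gamma * ((U - u_new) ** 2).sum(-1))
        return k @ alpha                       # predicted output function
    return apply

x = np.linspace(0, 1, 32)
U = np.stack([np.sin(f * x) for f in (1.0, 2.0, 3.0)])
V = np.stack([f * np.cos(f * x) for f in (1.0, 2.0, 3.0)])  # d/dx pairs
op = fit_operator(U, V)
print(np.abs(op(np.sin(2.5 * x)) - 2.5 * np.cos(2.5 * x)).max())  # toy error
```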
arXiv Detail & Related papers (2022-09-28T21:31:43Z)
- Stabilizing Q-learning with Linear Architectures for Provably Efficient Learning [53.17258888552998]
This work proposes an exploration variant of the basic $Q$-learning protocol with linear function approximation.
We show that the performance of the algorithm degrades very gracefully under a novel and more permissive notion of approximation error.
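A condensed sketch of one optimistic value-iteration step with linear function approximation: the exploration bonus is large in directions the data has not covered. This is the generic LSVI-UCB-style mechanism such exploration variants build on, not the paper's exact protocol; the buffer format is an assumption.

```python
# One step of optimistic least-squares value iteration: add an
# elliptical exploration bonus to Q-values, then regress onto targets.
import numpy as np

def lsvi_ucb_step(buffer, phi, w, n_actions, beta=1.0, gamma=0.99):
    """buffer: list of (s, a, r, s'); phi(s, a) -> feature vector; w: weights."""
    Phi = np.array([phi(s, a) for s, a, _, _ in buffer])
    Linv = np.linalg.inv(Phi.T @ Phi + np.eye(len(w)))
    targets = []
    for s, a, r, s2 in buffer:
        feats = np.array([phi(s2, a2) for a2 in range(n_actions)])
        # Optimism: bonus is wide where the covariance is poorly conditioned.
        bonus = beta * np.sqrt(np.einsum('ad,de,ae->a', feats, Linv, feats))
        targets.append(r + gamma * np.max(feats @ w + bonus))
    return Linv @ Phi.T @ np.array(targets)   # regularized least squares

phi = lambda s, a: np.array([s, float(a), 1.0])
buffer = [(0.0, 0, 1.0, 1.0), (1.0, 1, 0.0, 0.0)]
print(lsvi_ucb_step(buffer, phi, np.zeros(3), n_actions=2))
```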
arXiv Detail & Related papers (2022-06-01T23:26:51Z)
- Self-Attention Neural Bag-of-Features [103.70855797025689]
We build on the recently introduced 2D-Attention and reformulate the attention learning methodology.
We propose a joint feature-temporal attention mechanism that learns a joint 2D attention mask highlighting relevant information.
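A loose sketch of a joint feature-temporal mask: a single softmax over both axes of a (features x time) map re-weights every cell at once. The scoring rule here is a deliberately simple stand-in for the paper's learned 2D attention.

```python
# A joint 2D attention mask: one softmax over both the feature axis and
# the time axis, so relevance is allocated jointly rather than per-axis.
import numpy as np

def joint_2d_attention(X, W):
    """X: (d, T) feature-temporal map; W: (d, T) learned score weights."""
    scores = X * W                       # per-cell relevance scores
    A = np.exp(scores - scores.max())
    A = A / A.sum()                      # joint softmax over both axes
    return A * X                         # mask highlights relevant cells

rng = np.random.default_rng(0)
X = rng.normal(size=(4, 8))
W = rng.normal(size=(4, 8))              # stand-in for learned parameters
print(joint_2d_attention(X, W).shape)    # (4, 8)
```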
arXiv Detail & Related papers (2022-01-26T17:54:14Z) - Towards Open-World Feature Extrapolation: An Inductive Graph Learning
Approach [80.8446673089281]
We propose a new learning paradigm built on graph representations and graph learning.
Our framework contains two modules: 1) a backbone network (e.g., feedforward neural nets) as a lower model takes features as input and outputs predicted labels; 2) a graph neural network as an upper model learns to extrapolate embeddings for new features via message passing over a feature-data graph built from observed data.
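A simplified sketch of the message-passing half of this idea: mean-pooling over a bipartite feature-data graph fills in embeddings for features unseen during training. Dimensions and the zero initialization for new features are illustrative assumptions.

```python
# Message passing over a feature-data bipartite graph: data nodes gather
# from their observed features, feature nodes gather back from data
# nodes, so new (all-zero) feature embeddings get extrapolated.
import numpy as np

def extrapolate_embeddings(X, E_old, steps=2):
    """X: (n, f) 0/1 feature-data incidence; E_old: (f, d) embeddings."""
    E = E_old.copy()
    for _ in range(steps):
        deg_n = X.sum(1, keepdims=True) + 1e-12
        H = (X @ E) / deg_n                  # data nodes gather features
        deg_f = X.sum(0)[:, None] + 1e-12
        E = (X.T @ H) / deg_f                # feature nodes gather back
    return E

X = np.array([[1, 0, 1], [1, 1, 0]], float)   # 2 samples, 3 features
E = np.vstack([np.eye(2), np.zeros((1, 2))])  # third feature is new
print(extrapolate_embeddings(X, E))           # new feature gets an embedding
```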
arXiv Detail & Related papers (2021-10-09T09:02:45Z) - A Functional Perspective on Learning Symmetric Functions with Neural
Networks [48.80300074254758]
We study the learning and representation of neural networks defined on measures.
We establish approximation and generalization bounds under different choices of regularization.
The resulting models can be learned efficiently and enjoy generalization guarantees that extend across input sizes.
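A minimal sketch of such a model: a permutation-invariant network that acts on the empirical measure of its input set via per-element features and averaging, which is why it is well defined across input sizes. Weights are random placeholders; the paper studies models of this kind theoretically.

```python
# A symmetric (permutation-invariant) network on a set, viewed as a
# function of the empirical measure: per-element map, then averaging.
import numpy as np

rng = np.random.default_rng(0)
W1, W2 = rng.normal(size=(1, 16)), rng.normal(size=(16, 1))

def symmetric_net(xs):
    """xs: (n, 1) set of points; output is invariant to their order."""
    h = np.maximum(xs @ W1, 0)       # per-element features
    pooled = h.mean(0)               # integrate against empirical measure
    return pooled @ W2

xs = rng.normal(size=(5, 1))
print(symmetric_net(xs), symmetric_net(xs[::-1]))  # identical outputs
```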
arXiv Detail & Related papers (2020-08-16T16:34:33Z) - Feature Extraction Functions for Neural Logic Rule Learning [4.181432858358386]
We propose functions for integrating human knowledge abstracted as logic rules into the predictive behavior of a neural network.
Unlike other existing neural logic approaches, the programmatic nature of these functions implies that they do not require any kind of special mathematical encoding.
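A toy illustration of the programmatic point: the rule is ordinary code evaluated on the raw input and appended as an extra feature, with no special mathematical encoding. The rule and feature names are invented for this example.

```python
# A logic rule written as plain code becomes one more network input.
import numpy as np

def rule_feature(x):
    """Rule: IF length > 10 AND has_digit THEN likely positive."""
    return float(x["length"] > 10 and x["has_digit"])

def featurize(x):
    base = np.array([x["length"], x["has_digit"]], dtype=float)
    return np.append(base, rule_feature(x))   # network input with rule

print(featurize({"length": 12, "has_digit": 1}))  # [12., 1., 1.]
```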
arXiv Detail & Related papers (2020-08-14T12:35:07Z) - A Trainable Optimal Transport Embedding for Feature Aggregation and its
Relationship to Attention [96.77554122595578]
We introduce a parametrized representation of fixed size, which embeds and then aggregates elements from a given input set according to the optimal transport plan between the set and a trainable reference.
Our approach scales to large datasets and allows end-to-end training of the reference, while also providing a simple unsupervised learning mechanism with small computational cost.
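A condensed sketch of the mechanism: entropic optimal transport (Sinkhorn) computes a plan between the input set and a reference, and elements are aggregated according to that plan, producing a fixed-size output regardless of set size. In the paper the reference is trained end to end; here it is a random placeholder, and epsilon and iteration counts are illustrative.

```python
# OT-based aggregation: pool a variable-size set onto a fixed reference
# according to the entropic transport plan between the two.
import numpy as np

def sinkhorn_plan(C, eps=0.1, iters=100):
    """Entropic OT plan for cost C with uniform marginals."""
    n, m = C.shape
    K = np.exp(-C / eps)
    u, v = np.ones(n) / n, np.ones(m) / m
    for _ in range(iters):
        u = (1.0 / n) / (K @ v)
        v = (1.0 / m) / (K.T @ u)
    return u[:, None] * K * v[None, :]

def ot_embed(X, Z):
    """X: (n, d) input set; Z: (m, d) trainable reference."""
    C = ((X[:, None, :] - Z[None, :, :]) ** 2).sum(-1)
    P = sinkhorn_plan(C)                  # (n, m) transport plan
    return (P.T @ X) * P.shape[1]         # m pooled, fixed-size outputs

rng = np.random.default_rng(0)
X, Z = rng.normal(size=(7, 3)), rng.normal(size=(4, 3))
print(ot_embed(X, Z).shape)               # (4, 3), independent of n
```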
arXiv Detail & Related papers (2020-06-22T08:35:58Z)
- Similarity of Neural Networks with Gradients [8.804507286438781]
We propose to leverage both feature vectors and gradient vectors in designing the representation of a neural network.
We show that the proposed approach provides a state-of-the-art method for computing similarity of neural networks.
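A loose sketch of the ingredients: represent a network on a probe set by its outputs together with per-example parameter gradients, then compare two networks through these representations. The tiny quadratic "network" and the cosine similarity are stand-ins for the paper's actual choices.

```python
# Network representation = [outputs on probe set; per-example gradients],
# compared across networks with a simple cosine similarity.
import numpy as np

def representation(w, X):
    """Feature vector + per-example gradients for f(x) = (w . x)^2."""
    out = (X @ w) ** 2                       # features: network outputs
    grad = 2 * (X @ w)[:, None] * X          # d out / d w, per example
    return np.concatenate([out, grad.ravel()])

def similarity(w1, w2, X):
    r1, r2 = representation(w1, X), representation(w2, X)
    return r1 @ r2 / (np.linalg.norm(r1) * np.linalg.norm(r2) + 1e-12)

rng = np.random.default_rng(0)
X = rng.normal(size=(20, 5))                 # shared probe inputs
w = rng.normal(size=5)
print(similarity(w, w, X), similarity(w, rng.normal(size=5), X))
```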
arXiv Detail & Related papers (2020-03-25T17:04:10Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences of its use.