Supervised Learning for Non-Sequential Data: A Canonical Polyadic
Decomposition Approach
- URL: http://arxiv.org/abs/2001.10109v3
- Date: Tue, 30 Mar 2021 09:29:17 GMT
- Title: Supervised Learning for Non-Sequential Data: A Canonical Polyadic
Decomposition Approach
- Authors: Alexandros Haliassos, Kriton Konstantinidis, Danilo P. Mandic
- Abstract summary: Efficient modelling of feature interactions underpins supervised learning for non-sequential tasks.
- To alleviate the exponential cost of the brute-force approach, it has been proposed to implicitly represent the model parameters as a tensor.
For enhanced expressiveness, we generalize the framework to allow feature mapping to arbitrarily high-dimensional feature vectors.
- Score: 85.12934750565971
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Efficient modelling of feature interactions underpins supervised learning for
non-sequential tasks, characterized by a lack of inherent ordering of features
(variables). The brute force approach of learning a parameter for each
interaction of every order comes at an exponential computational and memory
cost (Curse of Dimensionality). To alleviate this issue, it has been proposed
to implicitly represent the model parameters as a tensor, the order of which is
equal to the number of features; for efficiency, it can be further factorized
into a compact Tensor Train (TT) format. However, both TT and other Tensor
Networks (TNs), such as Tensor Ring and Hierarchical Tucker, are sensitive to
the ordering of their indices (and hence to the features). To establish the
desired invariance to feature ordering, we propose to represent the weight
tensor through the Canonical Polyadic (CP) Decomposition (CPD), and introduce
the associated inference and learning algorithms, including suitable
regularization and initialization schemes. It is demonstrated that the proposed
CP-based predictor significantly outperforms other TN-based predictors on
sparse data while exhibiting comparable performance on dense non-sequential
tasks. Furthermore, for enhanced expressiveness, we generalize the framework to
allow feature mapping to arbitrarily high-dimensional feature vectors. In
conjunction with feature vector normalization, this is shown to yield dramatic
improvements in performance for dense non-sequential tasks, matching models
such as fully-connected neural networks.
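For concreteness, here is a minimal sketch of the CP-based inference rule described above. It is not the authors' reference implementation: it assumes the simple local feature map phi(x) = [1, x] (the paper generalizes this to arbitrary d-dimensional mappings with feature vector normalization), and all names are illustrative.

```python
import numpy as np

def cp_predict(X, factors):
    """Score f(x) = <W, phi(x_1) o ... o phi(x_N)> with W in CP format.

    X       : (batch, N) array of raw features.
    factors : list of N CP factor matrices, each of shape (2, R).
    With W = sum_r a_r^(1) o ... o a_r^(N), the inner product collapses
    to sum_r prod_n <a_r^(n), phi(x_n)>, costing O(N R d) per sample
    instead of the exponential O(d^N) of the dense weight tensor.
    """
    batch, N = X.shape
    R = factors[0].shape[1]
    out = np.ones((batch, R))          # running product over the N modes
    for n in range(N):
        phi = np.stack([np.ones(batch), X[:, n]], axis=1)  # phi(x) = [1, x]
        out *= phi @ factors[n]        # (batch, R): <a_r^(n), phi(x_n)> per r
    return out.sum(axis=1)             # sum the R rank-1 contributions

# The motivating property: permuting features together with their factor
# matrices leaves predictions unchanged, since the per-feature scalar
# products commute; a TT contraction (a matrix product) does not enjoy
# this symmetry with respect to index ordering.
rng = np.random.default_rng(0)
X = rng.normal(size=(4, 6))
factors = [0.5 * rng.normal(size=(2, 8)) for _ in range(6)]
perm = rng.permutation(6)
assert np.allclose(cp_predict(X, factors),
                   cp_predict(X[:, perm], [factors[p] for p in perm]))
```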
Related papers
- Tensor Polynomial Additive Model [40.30621617188693]
TPAM preserves the inherent interpretability of additive models, enabling transparent decision-making and the extraction of meaningful feature values.
It can improve accuracy by up to 30% and the compression rate by up to 5 times, while maintaining good interpretability.
arXiv Detail & Related papers (2024-06-05T06:23:11Z)
- Fine-grained Retrieval Prompt Tuning [149.9071858259279]
Fine-grained Retrieval Prompt Tuning (FRPT) steers a frozen pre-trained model to perform fine-grained retrieval through sample prompting and feature adaptation.
With fewer learnable parameters, FRPT achieves state-of-the-art performance on three widely used fine-grained datasets.
arXiv Detail & Related papers (2022-07-29T04:10:04Z)
- Orthogonal Stochastic Configuration Networks with Adaptive Construction Parameter for Data Analytics [6.940097162264939]
Randomness makes Stochastic Configuration Networks (SCNs) prone to generating approximately linearly correlated hidden nodes that are redundant and of low quality.
In light of a fundamental principle in machine learning, namely that a model with fewer parameters generalizes better, this paper proposes an orthogonal SCN, termed OSCN, to filter out low-quality hidden nodes and reduce the network structure.
arXiv Detail & Related papers (2022-05-26T07:07:26Z)
- Contrastive Conditional Neural Processes [45.70735205041254]
Conditional Neural Processes (CNPs) bridge neural networks with probabilistic inference to approximate functions of stochastic processes under meta-learning settings.
Two auxiliary contrastive branches are set up hierarchically, namely in-instantiation temporal contrastive learning (TCL) and cross-instantiation function contrastive learning (FCL).
We empirically show that TCL captures high-level abstractions of observations, whereas FCL helps identify underlying functions, which in turn provides more efficient representations.
arXiv Detail & Related papers (2022-03-08T10:08:45Z)
- Rank-R FNN: A Tensor-Based Learning Model for High-Order Data Classification [69.26747803963907]
Rank-R Feedforward Neural Network (FNN) is a tensor-based nonlinear learning model that imposes a Canonical Polyadic (CP) decomposition on its parameters.
It handles inputs as multilinear arrays, bypassing the need for vectorization, and can thus fully exploit the structural information along every data dimension; a minimal sketch follows.
We establish the universal approximation and learnability properties of Rank-R FNN, and validate its performance on real-world hyperspectral datasets.
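As a hedged illustration of this idea (not the paper's code), a single hidden unit of such a network could respond to a third-order input tensor through a weight tensor constrained to CP rank R, so the inner product reduces to R cheap mode-wise contractions; the tanh nonlinearity and all names below are assumptions.

```python
import numpy as np

def rank_r_hidden_unit(X, w1, w2, w3):
    """Hidden response g(<W, X>) with W = sum_r w_r^(1) o w_r^(2) o w_r^(3).

    X          : (I1, I2, I3) input kept as a multilinear array (no vectorization).
    w1, w2, w3 : (I1, R), (I2, R), (I3, R) CP factor matrices.
    """
    # One rank-1 term per column r: contract every mode with its factor vector.
    scores = np.einsum('ijk,ir,jr,kr->r', X, w1, w2, w3)
    return np.tanh(scores.sum())  # assumed nonlinearity g
```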
arXiv Detail & Related papers (2021-04-11T16:37:32Z)
- Tensor Representations for Action Recognition [54.710267354274194]
Human actions in sequences are characterized by the complex interplay between spatial features and their temporal dynamics.
We propose novel tensor representations for capturing higher-order relationships between visual features for the task of action recognition.
We use higher-order tensors and the so-called Eigenvalue Power Normalization (EPN), which has long been speculated to perform spectral detection of higher-order occurrences.
arXiv Detail & Related papers (2020-12-28T17:27:18Z)
- ACDC: Weight Sharing in Atom-Coefficient Decomposed Convolution [57.635467829558664]
We introduce a structural regularization across convolutional kernels in a CNN.
We show that CNNs maintain performance with a dramatic reduction in parameters and computations.
arXiv Detail & Related papers (2020-09-04T20:41:47Z)
- Adaptive Learning of Tensor Network Structures [6.407946291544721]
We leverage the TN formalism to develop a generic and efficient adaptive algorithm to learn the structure and the parameters of a TN from data.
Our algorithm can adaptively identify TN structures with a small number of parameters that effectively optimize any differentiable objective function.
arXiv Detail & Related papers (2020-08-12T16:41:56Z)
- Slice Sampling for General Completely Random Measures [74.24975039689893]
We present a novel Markov chain Monte Carlo algorithm for posterior inference that adaptively sets the truncation level using auxiliary slice variables.
The efficacy of the proposed algorithm is evaluated on several popular nonparametric models.
arXiv Detail & Related papers (2020-06-24T17:53:53Z)
This list is automatically generated from the titles and abstracts of the papers on this site.