Variational Autoencoder Kernel Interpretation and Selection for
Classification
- URL: http://arxiv.org/abs/2209.04715v1
- Date: Sat, 10 Sep 2022 17:22:53 GMT
- Title: Variational Autoencoder Kernel Interpretation and Selection for
Classification
- Authors: Fábio Mendonça, Sheikh Shanawaz Mostafa, Fernando Morgado-Dias, and Antonio G. Ravelo-García
- Abstract summary: This work proposed kernel selection approaches for probabilistic classifiers based on features produced by the convolutional encoder of a variational autoencoder.
In the proposed implementation, each latent variable was sampled from the distribution associated with a single kernel of the encoder's last convolution layer, as an individual distribution was created for each kernel.
Choosing relevant features on the sampled latent variables makes it possible to perform kernel selection, filtering out uninformative features and kernels.
- Score: 59.30734371401315
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: This work proposed kernel selection approaches for probabilistic classifiers
based on features produced by the convolutional encoder of a variational
autoencoder. In particular, the developed methodologies allow the selection of
the most relevant subset of latent variables. In the proposed implementation,
each latent variable was sampled from the distribution associated with a single
kernel of the encoder's last convolution layer, as an individual distribution
was created for each kernel. Therefore, choosing relevant features on the
sampled latent variables makes it possible to perform kernel selection,
filtering out uninformative features and kernels. This leads to a reduction in
the number of the model's parameters. Both wrapper and filter methods were
evaluated for feature selection. The latter was of particular relevance as it
is based only on the distributions of the kernels. It was assessed by measuring
the Kullback-Leibler divergence between all distributions, hypothesizing that
the kernels whose distributions are more similar can be discarded. This
hypothesis was confirmed since it was observed that the most similar kernels do
not convey relevant information and can be removed. As a result, the proposed
methodology is suitable for developing applications for resource-constrained
devices.
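For illustration only, below is a minimal sketch of the filter-style selection step described above, under the assumption that each kernel of the encoder's last convolution layer yields a univariate Gaussian latent distribution parameterized by a mean and a log-variance; the ranking by summed symmetric Kullback-Leibler divergence and the pruning fraction are illustrative choices, not the paper's exact procedure.

```python
import numpy as np

def gaussian_kl(mu1, var1, mu2, var2):
    # KL( N(mu1, var1) || N(mu2, var2) ) for univariate Gaussians.
    return 0.5 * (np.log(var2 / var1) + (var1 + (mu1 - mu2) ** 2) / var2 - 1.0)

def rank_kernels_by_redundancy(mu, logvar):
    """Rank kernels from most redundant to most informative.

    mu, logvar: 1-D arrays with one entry per kernel, e.g. the per-kernel
    Gaussian parameters produced by the encoder, averaged over a batch.
    A kernel whose distribution is very similar to the others (low total
    symmetric KL divergence) is hypothesized to carry little information.
    """
    var = np.exp(logvar)
    k = len(mu)
    total_divergence = np.zeros(k)
    for i in range(k):
        for j in range(k):
            if i != j:
                total_divergence[i] += (gaussian_kl(mu[i], var[i], mu[j], var[j])
                                        + gaussian_kl(mu[j], var[j], mu[i], var[i]))
    return np.argsort(total_divergence)  # ascending: most redundant first

# Hypothetical usage: discard the 4 most redundant of 16 kernels.
rng = np.random.default_rng(0)
mu, logvar = rng.normal(size=16), rng.normal(scale=0.1, size=16)
to_prune = rank_kernels_by_redundancy(mu, logvar)[:4]
```

In the wrapper alternative mentioned in the abstract, the ranking would instead be driven by the downstream classifier's performance on candidate subsets of kernels.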
Related papers
- Learning to Embed Distributions via Maximum Kernel Entropy [0.0]
Empirical data can often be considered as samples from a set of probability distributions.
Kernel methods have emerged as a natural approach for learning to classify these distributions.
We propose a novel objective for the unsupervised learning of a data-dependent distribution kernel.
arXiv Detail & Related papers (2024-08-01T13:34:19Z)
- Optimal Kernel Choice for Score Function-based Causal Discovery [92.65034439889872]
We propose a kernel selection method within the generalized score function that automatically selects the optimal kernel that best fits the data.
We conduct experiments on both synthetic data and real-world benchmarks, and the results demonstrate that our proposed method outperforms existing kernel selection methods.
arXiv Detail & Related papers (2024-07-14T09:32:20Z)
- Self-supervised learning with rotation-invariant kernels [4.059849656394191]
We propose a general kernel framework for designing a regularization loss that encourages the embedding distribution to be close to the uniform distribution on the hypersphere.
Our framework uses rotation-invariant kernels defined on the hypersphere, also known as dot-product kernels.
Our experiments demonstrate that using a truncated rotation-invariant kernel provides competitive results compared to state-of-the-art methods.
arXiv Detail & Related papers (2022-07-28T08:06:24Z)
- Generalized Reference Kernel for One-class Classification [100.53532594448048]
We formulate a new generalized reference kernel to improve the original base kernel using a set of reference vectors.
Our analysis and experimental results show that the new formulation provides approaches to regularize, adjust the rank, and incorporate additional information into the kernel itself.
arXiv Detail & Related papers (2022-05-01T18:36:55Z)
- S-Rocket: Selective Random Convolution Kernels for Time Series Classification [36.9596657353794]
Random convolution kernel transform (Rocket) is a fast, efficient, and novel approach for time series feature extraction.
Selecting the most important kernels and pruning the redundant and less important ones is necessary to reduce computational complexity and accelerate Rocket's inference.
A population-based approach is proposed for selecting the most important kernels.
arXiv Detail & Related papers (2022-03-07T15:02:12Z)
- A Note on Optimizing Distributions using Kernel Mean Embeddings [94.96262888797257]
Kernel mean embeddings represent probability measures by their infinite-dimensional mean embeddings in a reproducing kernel Hilbert space.
We show that when the kernel is characteristic, distributions with a kernel sum-of-squares density are dense.
We provide algorithms to optimize such distributions in the finite-sample setting.
arXiv Detail & Related papers (2021-06-18T08:33:45Z)
- Taming Nonconvexity in Kernel Feature Selection---Favorable Properties of the Laplace Kernel [77.73399781313893]
A key challenge is establishing the objective function for kernel-based feature selection.
The gradient-based algorithms available for nonconvex optimization can only guarantee convergence to local minima.
arXiv Detail & Related papers (2021-06-17T11:05:48Z)
- Towards Unbiased Random Features with Lower Variance For Stationary Indefinite Kernels [26.57122949130266]
Our algorithm achieves lower variance and approximation error compared with existing kernel approximation methods.
With better approximation to the originally selected kernels, improved classification accuracy and regression ability are obtained.
arXiv Detail & Related papers (2021-04-13T13:56:50Z)
- Learning Deep Kernels for Non-Parametric Two-Sample Tests [50.92621794426821]
We propose a class of kernel-based two-sample tests, which aim to determine whether two sets of samples are drawn from the same distribution.
Our tests are constructed from kernels parameterized by deep neural nets, trained to maximize test power.
arXiv Detail & Related papers (2020-02-21T03:54:23Z)
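As a rough illustration of the statistic such tests build on, the following sketch computes an unbiased estimate of the squared maximum mean discrepancy (MMD) with a fixed Gaussian kernel on feature vectors; in the paper above the kernel is additionally parameterized by a deep network trained to maximize test power, which is not reproduced here.

```python
import numpy as np

def gaussian_gram(x, y, sigma=1.0):
    # Pairwise Gaussian kernel values between rows of x (n, d) and y (m, d).
    sq_dists = ((x[:, None, :] - y[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-sq_dists / (2.0 * sigma ** 2))

def mmd2_unbiased(x, y, sigma=1.0):
    # Unbiased estimate of squared MMD between samples x ~ P and y ~ Q.
    n, m = len(x), len(y)
    kxx = gaussian_gram(x, x, sigma)
    kyy = gaussian_gram(y, y, sigma)
    kxy = gaussian_gram(x, y, sigma)
    np.fill_diagonal(kxx, 0.0)  # drop i == j terms for unbiasedness
    np.fill_diagonal(kyy, 0.0)
    return kxx.sum() / (n * (n - 1)) + kyy.sum() / (m * (m - 1)) - 2.0 * kxy.mean()

# Hypothetical usage on already extracted feature vectors.
rng = np.random.default_rng(0)
x = rng.normal(size=(100, 16))        # samples from P
y = rng.normal(size=(100, 16)) + 0.5  # samples from Q
print(mmd2_unbiased(x, y))
```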