Stochastic Mutual Information Gradient Estimation for Dimensionality
Reduction Networks
- URL: http://arxiv.org/abs/2105.00191v1
- Date: Sat, 1 May 2021 08:20:04 GMT
- Title: Stochastic Mutual Information Gradient Estimation for Dimensionality
Reduction Networks
- Authors: Ozan Ozdenizci, Deniz Erdogmus
- Abstract summary: We introduce emerging information theoretic feature transformation protocols as an end-to-end neural network training approach.
We present a dimensionality reduction network (MMINet) training procedure based on a stochastic estimate of the mutual information gradient.
We experimentally evaluate our method with applications to high-dimensional biological data sets, and relate it to conventional feature selection algorithms.
- Score: 11.634729459989996
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Feature ranking and selection is a widely used approach in various
applications of supervised dimensionality reduction in discriminative machine
learning. Nevertheless, there is significant evidence that feature ranking
and selection algorithms, whatever criterion they are based on, can lead to
potentially sub-optimal solutions for class separability. In that regard, we
introduce emerging
information theoretic feature transformation protocols as an end-to-end neural
network training approach. We present a dimensionality reduction network
(MMINet) training procedure based on the stochastic estimate of the mutual
information gradient. The network projects high-dimensional features onto an
output feature space where lower dimensional representations of features carry
maximum mutual information with their associated class labels. Furthermore, we
formulate the training objective to be estimated non-parametrically with no
distributional assumptions. We experimentally evaluate our method with
applications to high-dimensional biological data sets, and relate it to
conventional feature selection algorithms, which form a special case of our
approach.
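To make the procedure concrete, here is a minimal, hypothetical sketch of the idea rather than the authors' implementation: a linear projection is trained by stochastic gradient ascent on a Parzen-window (Gaussian-kernel) estimate of the quadratic mutual information between projections and class labels, which keeps the objective non-parametric; the dimensions, kernel width, and use of PyTorch are illustrative assumptions.

```python
# Hypothetical MMINet-style training sketch (not the authors' code). A
# Parzen-window estimate of the quadratic mutual information between the
# projected features z and labels y stands in for the paper's stochastic
# MI-gradient estimator; minibatching makes each gradient step stochastic.
import torch

def quadratic_mi(z, y, sigma=1.0):
    # Parzen estimate of I_Q(z; y) = V_in + V_all - 2 * V_btw, where each V is
    # an information potential built from pairwise Gaussian kernels. The kernel
    # normalization constant is dropped; it scales all three terms equally.
    n = z.shape[0]
    d2 = (z[:, None, :] - z[None, :, :]).pow(2).sum(-1)   # squared distances
    K = torch.exp(-d2 / (4 * sigma**2))                   # G(z_i - z_j, 2*sigma^2)
    same = (y[:, None] == y[None, :]).float()             # same-class mask
    priors = torch.bincount(y).float() / n                # P(c) = N_c / N
    v_in = (K * same).sum() / n**2                        # within-class potential
    v_all = (priors**2).sum() * K.sum() / n**2            # marginal potential
    v_btw = (priors[y][:, None] * K).sum() / n**2         # cross potential
    return v_in + v_all - 2 * v_btw

torch.manual_seed(0)
X = torch.randn(512, 20)                        # toy high-dimensional features
y = (X[:, 0] + 0.5 * X[:, 1] > 0).long()        # labels depend on two coordinates
proj = torch.nn.Linear(20, 2, bias=False)       # the dimensionality-reduction map
opt = torch.optim.Adam(proj.parameters(), lr=1e-2)
for step in range(200):
    idx = torch.randint(0, 512, (128,))         # minibatch -> stochastic MI gradient
    loss = -quadratic_mi(proj(X[idx]), y[idx])  # ascend the MI estimate
    opt.zero_grad(); loss.backward(); opt.step()
```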
Related papers
- Deep Metric Learning for Computer Vision: A Brief Overview [4.980117530293724]
Objective functions used to optimize deep neural networks play a vital role in creating enhanced feature representations of the input data.
Deep Metric Learning seeks to develop methods that measure the similarity between data samples.
We will provide an overview of recent progress in this area and discuss state-of-the-art Deep Metric Learning approaches.
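A minimal example of the kind of objective such methods optimize is the standard triplet loss, sketched below purely as an illustration (not tied to any specific approach in the survey):

```python
# Minimal triplet loss, a standard deep metric learning objective: pull the
# anchor toward the positive and push it away from the negative by a margin.
import torch
import torch.nn.functional as F

def triplet_loss(anchor, positive, negative, margin=0.2):
    d_pos = F.pairwise_distance(anchor, positive)   # anchor-positive distance
    d_neg = F.pairwise_distance(anchor, negative)   # anchor-negative distance
    return F.relu(d_pos - d_neg + margin).mean()    # hinge on the margin gap
```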
arXiv Detail & Related papers (2023-12-01T21:53:36Z)
- MARS: Meta-Learning as Score Matching in the Function Space [79.73213540203389]
We present a novel approach to extracting inductive biases from a set of related datasets.
We use functional Bayesian neural network inference, which views the prior as a stochastic process and performs inference in the function space.
Our approach can seamlessly acquire and represent complex prior knowledge by meta-learning the score function of the data-generating process.
arXiv Detail & Related papers (2022-10-24T15:14:26Z)
- Federated Representation Learning via Maximal Coding Rate Reduction [109.26332878050374]
We propose a methodology to learn low-dimensional representations from a dataset that is distributed among several clients.
Our proposed method, which we refer to as FLOW, uses maximal coding rate reduction (MCR2) as its objective, resulting in representations that are both between-class discriminative and within-class compressible.
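For context, the MCR2 objective maximizes the coding rate of the full representation minus the prior-weighted rates of its per-class parts; the sketch below follows my reading of the published MCR2 formula and is not FLOW's federated implementation:

```python
# MCR^2 sketch: reward representations whose classes are jointly discriminative
# (large overall rate) yet individually compressible (small per-class rates).
import torch

def coding_rate(Z, eps=0.5):
    # R(Z) = 1/2 * logdet(I + d / (n * eps^2) * Z Z^T), with Z of shape d x n.
    d, n = Z.shape
    return 0.5 * torch.logdet(torch.eye(d) + (d / (n * eps**2)) * (Z @ Z.T))

def mcr2(Z, y, eps=0.5):
    n = Z.shape[1]
    rate_all = coding_rate(Z, eps)
    rate_cls = sum((Z[:, y == c].shape[1] / n) * coding_rate(Z[:, y == c], eps)
                   for c in y.unique())
    return rate_all - rate_cls               # maximize this quantity
```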
arXiv Detail & Related papers (2022-10-01T15:43:51Z)
- Model-Based Deep Learning: On the Intersection of Deep Learning and Optimization [101.32332941117271]
Decision-making algorithms are used in a multitude of applications.
Deep learning approaches that use highly parametric architectures tuned from data without relying on mathematical models are becoming increasingly popular.
Model-based optimization and data-centric deep learning are often considered to be distinct disciplines.
arXiv Detail & Related papers (2022-05-05T13:40:08Z)
- Discriminative Attribution from Counterfactuals [64.94009515033984]
We present a method for neural network interpretability by combining feature attribution with counterfactual explanations.
We show that this method can be used to quantitatively evaluate the performance of feature attribution methods in an objective manner.
arXiv Detail & Related papers (2021-09-28T00:53:34Z)
- Learning Bias-Invariant Representation by Cross-Sample Mutual Information Minimization [77.8735802150511]
We propose a cross-sample adversarial debiasing (CSAD) method to remove the bias information misused by the target task.
The correlation measurement plays a critical role in adversarial debiasing and is conducted by a cross-sample neural mutual information estimator.
We conduct thorough experiments on publicly available datasets to validate the advantages of the proposed method over state-of-the-art approaches.
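The "neural mutual information estimator" mentioned above belongs to a family of learned MI bounds; a generic MINE-style lower bound (Belghazi et al., 2018) is sketched below as an illustration, with the caveat that CSAD's cross-sample estimator differs in its details:

```python
# Generic MINE-style neural MI lower bound (illustrative; CSAD's cross-sample
# estimator differs in detail). A critic network T is trained so that
# E_joint[T] - log E_marginal[exp(T)] approaches I(X; Y) from below.
import torch
import torch.nn as nn

class MINE(nn.Module):
    def __init__(self, dim_x, dim_y, hidden=64):
        super().__init__()
        self.T = nn.Sequential(
            nn.Linear(dim_x + dim_y, hidden), nn.ReLU(), nn.Linear(hidden, 1)
        )

    def mi_lower_bound(self, x, y):
        joint = self.T(torch.cat([x, y], dim=1)).mean()   # E over paired samples
        y_perm = y[torch.randperm(y.shape[0])]            # break pairing -> marginals
        t_marg = self.T(torch.cat([x, y_perm], dim=1)).squeeze(1)
        n = torch.tensor(float(x.shape[0]))
        return joint - (torch.logsumexp(t_marg, dim=0) - torch.log(n))
```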
arXiv Detail & Related papers (2021-08-11T21:17:02Z)
- Low-rank Dictionary Learning for Unsupervised Feature Selection [11.634317251468968]
We introduce a novel unsupervised feature selection approach by applying dictionary learning ideas in a low-rank representation.
A unified objective function for unsupervised feature selection is proposed, with row sparsity induced by $\ell_{2,1}$-norm regularization.
Our experimental findings reveal that the proposed method outperforms state-of-the-art algorithms.
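For reference, the $\ell_{2,1}$-norm of a matrix is the sum of the $\ell_2$ norms of its rows, so penalizing it drives whole rows, and hence whole features, to zero:

```python
import torch

def l21_norm(W):
    # ||W||_{2,1} = sum_i ||W[i, :]||_2 ; penalizing this zeroes entire rows,
    # which is what makes it useful for feature selection.
    return W.norm(dim=1).sum()
```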
arXiv Detail & Related papers (2021-06-21T13:39:10Z)
- Joint Dimensionality Reduction for Separable Embedding Estimation [43.22422640265388]
Low-dimensional embeddings for data from disparate sources play critical roles in machine learning, multimedia information retrieval, and bioinformatics.
We propose a supervised dimensionality reduction method that learns linear embeddings jointly for two feature vectors representing data of different modalities or data from distinct types of entities.
Our approach compares favorably against other dimensionality reduction methods, and against a state-of-the-art method of bilinear regression for predicting gene-disease associations.
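As a generic illustration of jointly learned linear embeddings for two modalities, the sketch below scores a pair through a bilinear form and fits both projections with a logistic loss on toy data; the paper's actual estimator and training procedure differ:

```python
# Two-modality linear embedding sketch (illustrative only): learn projections
# A, B so paired samples score higher than unpaired ones via the bilinear form
# <A x, B y> = x^T (A^T B) y, a rank-k association model.
import torch
import torch.nn.functional as F

dx, dy, k = 50, 30, 8                          # toy dimensions (assumptions)
A = torch.nn.Linear(dx, k, bias=False)
B = torch.nn.Linear(dy, k, bias=False)
opt = torch.optim.Adam([*A.parameters(), *B.parameters()], lr=1e-2)

x, y = torch.randn(256, dx), torch.randn(256, dy)
labels = torch.randint(0, 2, (256,)).float()   # 1 = associated pair (toy labels)
for _ in range(100):
    logits = (A(x) * B(y)).sum(dim=1)          # bilinear association score
    loss = F.binary_cross_entropy_with_logits(logits, labels)
    opt.zero_grad(); loss.backward(); opt.step()
```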
arXiv Detail & Related papers (2021-01-14T08:48:37Z)
- Divergence Regulated Encoder Network for Joint Dimensionality Reduction and Classification [2.989889278970106]
We investigate performing joint dimensionality reduction and classification using a novel histogram neural network.
Motivated by a popular dimensionality reduction approach, t-Distributed Stochastic Neighbor Embedding (t-SNE), our proposed method incorporates a classification loss computed on samples in a low-dimensional embedding space.
Our results show that the proposed approach maintains and/or improves classification performance and reveals characteristics of features produced by neural networks that may be helpful for other applications.
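The core construction, a classification loss applied directly in a low-dimensional embedding, can be sketched as follows; this is a minimal stand-in that omits the paper's histogram layer and t-SNE-motivated divergence term:

```python
# Minimal joint dimensionality-reduction + classification sketch: an encoder
# maps inputs to 2-D and a linear head classifies in that embedding space.
# (The paper additionally regularizes the embedding with a t-SNE-style term.)
import torch
import torch.nn as nn

encoder = nn.Sequential(nn.Linear(100, 32), nn.ReLU(), nn.Linear(32, 2))
head = nn.Linear(2, 10)                        # logits from the 2-D embedding
opt = torch.optim.Adam([*encoder.parameters(), *head.parameters()], lr=1e-3)
ce = nn.CrossEntropyLoss()

x, y = torch.randn(256, 100), torch.randint(0, 10, (256,))   # toy data
for _ in range(50):
    loss = ce(head(encoder(x)), y)             # classification loss on embeddings
    opt.zero_grad(); loss.backward(); opt.step()
```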
arXiv Detail & Related papers (2020-12-31T17:39:02Z)
- Feature space approximation for kernel-based supervised learning [2.653409741248232]
The goal is to reduce the size of the training data, resulting in lower storage consumption and computational complexity.
We demonstrate significant improvements over computing data-driven predictions with the full training data set.
The method is applied to classification and regression problems from different application areas such as image recognition, system identification, and oceanographic time series analysis.
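One standard way to shrink the effective training set of a kernel method is a landmark-based low-rank (Nyström) approximation; the sketch below illustrates that generic idea and is not necessarily the approximation scheme this paper uses:

```python
# Generic Nystrom sketch: approximate the full n x n Gaussian kernel matrix
# from m << n landmark points, cutting storage and compute.
import torch

def gauss_kernel(A, B, gamma=0.5):
    return torch.exp(-gamma * torch.cdist(A, B).pow(2))

X = torch.randn(1000, 5)                     # toy training data
L = X[torch.randperm(1000)[:50]]             # m = 50 landmark points
K_nm = gauss_kernel(X, L)                    # n x m cross-kernel
K_mm = gauss_kernel(L, L)                    # m x m landmark kernel
K_approx = K_nm @ torch.linalg.pinv(K_mm) @ K_nm.T   # rank-m approximation of K
```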
arXiv Detail & Related papers (2020-11-25T11:23:58Z)
- Deep Dimension Reduction for Supervised Representation Learning [51.10448064423656]
We propose a deep dimension reduction approach to learning representations with essential characteristics.
The proposed approach is a nonparametric generalization of the sufficient dimension reduction method.
We show that the estimated deep nonparametric representation is consistent in the sense that its excess risk converges to zero.
arXiv Detail & Related papers (2020-06-10T14:47:43Z)