From Two-Class Linear Discriminant Analysis to Interpretable Multilayer
Perceptron Design
- URL: http://arxiv.org/abs/2009.04442v1
- Date: Wed, 9 Sep 2020 17:43:39 GMT
- Title: From Two-Class Linear Discriminant Analysis to Interpretable Multilayer
Perceptron Design
- Authors: Ruiyuan Lin, Zhiruo Zhou, Suya You, Raghuveer Rao and C.-C. Jay Kuo
- Abstract summary: A closed-form solution exists in two-class linear discriminant analysis (LDA).
We interpret the multilayer perceptron (MLP) as a generalization of a two-class LDA system.
We present an automatic design that can specify the network architecture and all filter weights in a feedforward one-pass fashion.
- Score: 31.446335485087758
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: A closed-form solution exists in two-class linear discriminant analysis
(LDA), which discriminates two Gaussian-distributed classes in a
multi-dimensional feature space. In this work, we interpret the multilayer
perceptron (MLP) as a generalization of a two-class LDA system so that it can
handle an input composed of multiple Gaussian modalities belonging to multiple
classes. Besides input layer $l_{in}$ and output layer $l_{out}$, the MLP of
interest consists of two intermediate layers, $l_1$ and $l_2$. We propose a
feedforward design that has three stages: 1) from $l_{in}$ to $l_1$: half-space
partitionings accomplished by multiple parallel LDAs, 2) from $l_1$ to $l_2$:
subspace isolation where one Gaussian modality is represented by one neuron, 3)
from $l_2$ to $l_{out}$: class-wise subspace mergence, where each Gaussian
modality is connected to its target class. Through this process, we present an
automatic MLP design that can specify the network architecture (i.e., the layer
number and the neuron number at a layer) and all filter weights in a
feedforward one-pass fashion. This design can be generalized to an arbitrary
distribution by leveraging the Gaussian mixture model (GMM). Experiments are
conducted to compare the performance of the traditional backpropagation-based
MLP (BP-MLP) and the new feedforward MLP (FF-MLP).
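Since the abstract walks through a concrete three-stage construction, a rough sketch may help. The following minimal NumPy/scikit-learn sketch follows the stages as described (parallel two-class LDAs for half-space partitioning, one neuron per Gaussian modality, class-wise merging). The per-class GMM fitting, the pairwise choice of LDA hyperplanes between modalities of different classes, and the AND/OR weighting are assumptions made for illustration; the paper's actual weight and bias assignment rules may differ.

```python
# Hedged sketch of the three-stage FF-MLP construction described above.
# Assumptions for illustration: each class is modeled by its own GMM, one LDA
# hyperplane is built for every pair of Gaussian modalities from different
# classes, a modality neuron ANDs its half-space indicators, and class outputs
# sum their modality neurons. The paper's exact weight/bias rules may differ.
import numpy as np
from itertools import combinations
from sklearn.mixture import GaussianMixture

def build_ff_mlp(X, y, modalities_per_class=2, seed=0):
    classes = np.unique(y)
    # Model each class with a Gaussian mixture (the GMM generalization).
    means, covs, owners = [], [], []
    for c in classes:
        gmm = GaussianMixture(modalities_per_class, random_state=seed).fit(X[y == c])
        for mu, cov in zip(gmm.means_, gmm.covariances_):
            means.append(mu); covs.append(cov); owners.append(c)
    means = np.array(means)

    # Stage 1 (l_in -> l_1): parallel two-class LDAs, one per cross-class pair
    # of modalities; each yields a half-space indicator neuron.
    W1, b1, pairs = [], [], []
    for i, j in combinations(range(len(means)), 2):
        if owners[i] == owners[j]:
            continue
        pooled = 0.5 * (covs[i] + covs[j])
        w = np.linalg.solve(pooled, means[i] - means[j])   # LDA direction
        b = -0.5 * w @ (means[i] + means[j])               # midpoint bias
        W1.append(w); b1.append(b); pairs.append((i, j))
    W1, b1 = np.array(W1), np.array(b1)

    def predict(Xq):
        h1 = (Xq @ W1.T + b1 > 0).astype(float)            # l_1: half-space bits
        # Stage 2 (l_1 -> l_2): one neuron per modality; it fires only when the
        # input lies on that modality's side of every hyperplane involving it.
        h2 = np.zeros((len(Xq), len(means)))
        for m in range(len(means)):
            sides = [h1[:, k] if pr[0] == m else 1.0 - h1[:, k]
                     for k, pr in enumerate(pairs) if m in pr]
            h2[:, m] = np.prod(sides, axis=0)
        # Stage 3 (l_2 -> l_out): class-wise merging of modality neurons.
        logits = np.stack(
            [h2[:, [m for m in range(len(means)) if owners[m] == c]].sum(axis=1)
             for c in classes], axis=1)
        return classes[np.argmax(logits, axis=1)]

    return predict

# Example on synthetic data: two classes, two Gaussian blobs each (XOR layout).
rng = np.random.default_rng(0)
blobs = [(0, 0), (3, 3), (0, 3), (3, 0)]
X = np.vstack([rng.normal(mu, 0.3, size=(100, 2)) for mu in blobs])
y = np.array([0] * 200 + [1] * 200)
predict = build_ff_mlp(X, y)
print("training accuracy:", (predict(X) == y).mean())
```

On this XOR-style example the construction separates the four Gaussian blobs without any backpropagation, which is the comparison point the abstract draws between FF-MLP and BP-MLP.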
Related papers
- Projection by Convolution: Optimal Sample Complexity for Reinforcement Learning in Continuous-Space MDPs [56.237917407785545]
We consider the problem of learning an $\varepsilon$-optimal policy in a general class of continuous-space Markov decision processes (MDPs) having smooth Bellman operators.
Key to our solution is a novel projection technique based on ideas from harmonic analysis.
Our result bridges the gap between two popular but conflicting perspectives on continuous-space MDPs.
arXiv Detail & Related papers (2024-05-10T09:58:47Z) - Multilayer Correlation Clustering [12.492037397168579]
We establish Multilayer Correlation Clustering, a novel generalization of Correlation Clustering (Bansal et al., FOCS '02) to the multilayer setting.
In this paper, we are given a series of inputs of Correlation Clustering (called layers) over the common set $V$.
The goal is then to find a clustering of $V$ that minimizes the $\ell_p$-norm ($p \geq 1$) of the disagreements vector.
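As a toy illustration of this objective (and not of the paper's algorithm), the sketch below assumes each layer is encoded as +1/-1 labels on node pairs of $V$ and evaluates the $\ell_p$-norm of the per-layer disagreement vector for a candidate clustering.

```python
# Toy sketch of the multilayer objective above. Assumption for illustration:
# each layer is a dict mapping node pairs of V to +1 ("similar") or -1
# ("dissimilar"); a clustering maps each node to a cluster id.
import numpy as np

def layer_disagreements(layer, clustering):
    """Count the pairs that the clustering gets wrong in a single layer."""
    d = 0
    for (u, v), sign in layer.items():
        same = clustering[u] == clustering[v]
        d += (sign == +1 and not same) or (sign == -1 and same)
    return d

def multilayer_cost(layers, clustering, p=2):
    """l_p-norm (p >= 1) of the per-layer disagreement vector."""
    vec = np.array([layer_disagreements(lay, clustering) for lay in layers], float)
    return np.linalg.norm(vec, ord=p)

# Two layers over V = {0, 1, 2}; candidate clustering: {0, 1} together, {2} alone.
layers = [{(0, 1): +1, (0, 2): -1, (1, 2): -1},
          {(0, 1): -1, (0, 2): +1, (1, 2): -1}]
print(multilayer_cost(layers, {0: 0, 1: 0, 2: 1}, p=1))   # -> 2.0
```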
arXiv Detail & Related papers (2024-04-25T15:25:30Z) - BiMLP: Compact Binary Architectures for Vision Multi-Layer Perceptrons [37.28828605119602]
This paper studies the problem of designing compact binary architectures for vision multi-layer perceptrons (MLPs).
We find that previous binarization methods perform poorly due to the limited capacity of binary samplings.
We propose to improve the performance of the binary mixing and channel mixing (BiMLP) model by enriching the representation ability of binary FC layers.
arXiv Detail & Related papers (2022-12-29T02:43:41Z) - Minimax-Optimal Multi-Agent RL in Zero-Sum Markov Games With a
Generative Model [50.38446482252857]
Two-player zero-sum Markov games are arguably the most basic setting in multi-agent reinforcement learning.
We develop a learning algorithm that learns an $\varepsilon$-approximate Markov NE policy using $\widetilde{O}(\cdot)$ samples.
We derive a refined regret bound for FTRL that makes explicit the role of variance-type quantities.
arXiv Detail & Related papers (2022-08-22T17:24:55Z) - Subspace Learning Machine (SLM): Methodology and Performance [28.98486923400986]
Subspace learning machine (SLM) is a new classification model inspired by feedforward multilayer perceptron (FF-MLP), decision tree (DT) and extreme learning machine (ELM).
SLM first identifies a discriminant subspace, $S_0$, by examining the discriminant power of each input feature.
It uses probabilistic projections of features in $S_0$ to yield 1D subspaces and finds the optimal partition for each of them.
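A hedged illustration of these steps as summarized here: score each feature's discriminant power, keep the top-k features as $S_0$, draw random 1D projections over $S_0$, and search the best threshold split on each. The discriminant measure (a Fisher-style variance ratio) and the plain random projections are assumptions for readability; the paper's probabilistic selection scheme may differ.

```python
# Generic sketch of an SLM-like pipeline, not the paper's exact procedure.
import numpy as np

def fisher_scores(X, y):
    """Between-class vs. within-class variance ratio per feature (assumed
    proxy for 'discriminant power')."""
    classes = np.unique(y)
    mu = X.mean(axis=0)
    between = sum((y == c).mean() * (X[y == c].mean(axis=0) - mu) ** 2 for c in classes)
    within = sum((y == c).mean() * X[y == c].var(axis=0) for c in classes)
    return between / (within + 1e-12)

def best_1d_split(z, y):
    """Exhaustive threshold search on a 1D projection (binary 0/1 labels assumed)."""
    zs = np.sort(np.unique(z))
    best_t, best_err = None, np.inf
    for t in (zs[:-1] + zs[1:]) / 2:
        pred = (z > t).astype(int)
        err = min(np.mean(pred != y), np.mean(pred == y))  # allow either polarity
        if err < best_err:
            best_t, best_err = t, err
    return best_t, best_err

def slm_like_splits(X, y, k=3, n_proj=8, seed=0):
    rng = np.random.default_rng(seed)
    s0 = np.argsort(fisher_scores(X, y))[-k:]          # discriminant subspace S_0
    proj = rng.normal(size=(n_proj, k))                # random projection directions
    proj /= np.linalg.norm(proj, axis=1, keepdims=True)
    return [best_1d_split(X[:, s0] @ p, y) for p in proj]
```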
arXiv Detail & Related papers (2022-05-11T06:44:51Z) - A new perspective on probabilistic image modeling [92.89846887298852]
We present the Deep Convolutional Gaussian Mixture Model (DCGMM), a new probabilistic approach for image modeling capable of density estimation, sampling and tractable inference.
DCGMMs can be trained end-to-end by SGD from random initial conditions, much like CNNs.
We show that DCGMMs compare favorably to several recent PC and SPN models in terms of inference, classification and sampling.
arXiv Detail & Related papers (2022-03-21T14:53:57Z) - RepMLPNet: Hierarchical Vision MLP with Re-parameterized Locality [113.1414517605892]
We propose a methodology, Locality Injection, to incorporate local priors into an FC layer.
RepMLPNet is the first MLP that seamlessly transfers to Cityscapes semantic segmentation.
arXiv Detail & Related papers (2021-12-21T10:28:17Z) - $\alpha$-Stable convergence of heavy-tailed infinitely-wide neural
networks [8.880921123362294]
Infinitely-wide multi-layer perceptrons (MLPs) are limits of standard feed-forward neural networks.
We show that the vector of pre-activation values at all nodes of a given hidden layer converges in the limit.
arXiv Detail & Related papers (2021-06-18T01:36:41Z) - Large scale analysis of generalization error in learning using margin
based classification methods [2.436681150766912]
We derive the expression for the generalization error of a family of large-margin classifiers in the limit where both the sample size $n$ and the dimension $p$ grow to infinity.
For two-layer neural networks, we reproduce the recently developed 'double descent' phenomenology for several classification models.
arXiv Detail & Related papers (2020-07-16T20:31:26Z) - Model-Based Multi-Agent RL in Zero-Sum Markov Games with Near-Optimal
Sample Complexity [67.02490430380415]
We show that model-based MARL achieves a sample complexity of $\tilde{O}(|S||B|(1-\gamma)^{-3}\epsilon^{-2})$ for finding the Nash equilibrium (NE) value up to some $\epsilon$ error.
We also show that such a sample bound is minimax-optimal (up to logarithmic factors) if the algorithm is reward-agnostic, where the algorithm queries state transition samples without reward knowledge.
arXiv Detail & Related papers (2020-07-15T03:25:24Z) - Learning Halfspaces with Tsybakov Noise [50.659479930171585]
We study the learnability of halfspaces in the presence of Tsybakov noise.
We give an algorithm that achieves misclassification error $\epsilon$ with respect to the true halfspace.
arXiv Detail & Related papers (2020-06-11T14:25:02Z)
This list is automatically generated from the titles and abstracts of the papers in this site.