DNF-Net: A Neural Architecture for Tabular Data
- URL: http://arxiv.org/abs/2006.06465v1
- Date: Thu, 11 Jun 2020 14:21:45 GMT
- Title: DNF-Net: A Neural Architecture for Tabular Data
- Authors: Ami Abutbul, Gal Elidan, Liran Katzir, Ran El-Yaniv
- Abstract summary: We present DNF-Net, a novel generic neural architecture for tabular data.
DNF-Net elicits models whose structure corresponds to logical Boolean formulas in disjunctive normal form (DNF) over affine soft-threshold decision terms.
We show that DNF-Nets significantly and consistently outperform FCNs over tabular data.
- Score: 21.819029316477227
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: A challenging open question in deep learning is how to handle tabular data.
Unlike domains such as image and natural language processing, where deep
architectures prevail, there is still no widely accepted neural architecture
that dominates tabular data. As a step toward bridging this gap, we present
DNF-Net, a novel generic architecture whose inductive bias elicits models whose
structure corresponds to logical Boolean formulas in disjunctive normal form
(DNF) over affine soft-threshold decision terms. In addition, DNF-Net promotes
localized decisions that are taken over small subsets of the features. We
present an extensive empirical study showing that DNF-Nets significantly and
consistently outperform FCNs over tabular data. With relatively few
hyperparameters, DNF-Nets open the door to practical end-to-end handling of
tabular data using neural networks. We present ablation studies, which justify
the design choices of DNF-Net including the three inductive bias elements,
namely, Boolean formulation, locality, and feature selection.
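As a rough illustration of the ideas in the abstract, the following is a minimal PyTorch sketch of a soft-DNF block, assuming tanh-squashed affine terms as soft literals, shifted-sum soft AND/OR gates, and a fixed random feature mask per conjunction. The paper's exact gate definitions, mask construction, and output head may differ; names such as SoftDNFBlock are hypothetical.

```python
# Minimal sketch of a DNF-Net-style block (not the authors' reference code).
import torch
import torch.nn as nn


class SoftDNFBlock(nn.Module):
    def __init__(self, in_features: int, n_conjunctions: int = 16,
                 literals_per_conj: int = 4, feature_subset: int = 8):
        super().__init__()
        self.n_conj = n_conjunctions
        self.k = literals_per_conj
        # Affine soft-threshold decision terms: one weight vector per literal.
        self.weight = nn.Parameter(
            torch.randn(n_conjunctions, literals_per_conj, in_features) * 0.1)
        self.bias = nn.Parameter(torch.zeros(n_conjunctions, literals_per_conj))
        # Locality / feature selection: each conjunction only sees a small,
        # fixed random subset of the input features (an assumption here;
        # the paper structures/learns this selection differently).
        mask = torch.zeros(n_conjunctions, in_features)
        for c in range(n_conjunctions):
            idx = torch.randperm(in_features)[:feature_subset]
            mask[c, idx] = 1.0
        self.register_buffer("feature_mask", mask)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, in_features)
        # Apply the per-conjunction feature mask before the affine terms.
        masked_w = self.weight * self.feature_mask[:, None, :]
        # literals: (batch, n_conj, k) soft-thresholded affine decisions in (-1, 1).
        literals = torch.tanh(
            torch.einsum("bf,ckf->bck", x, masked_w) + self.bias)
        # Soft AND over the k literals of a conjunction: close to +1 only
        # when all literals are close to +1.
        conj = torch.tanh(literals.sum(dim=-1) - (self.k - 1.5))
        # Soft OR over conjunctions: close to +1 if any conjunction fires.
        dnf = torch.tanh(conj.sum(dim=-1) + (self.n_conj - 1.5))
        return dnf  # (batch,) soft-DNF score in (-1, 1)
```

A usage sketch: `SoftDNFBlock(in_features=20)(torch.randn(32, 20))` returns one soft-DNF score per row, which could feed a binary classification loss after rescaling to (0, 1).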
Related papers
- Tabular Data: Is Attention All You Need? [23.787352248749382]
We introduce a large-scale empirical study comparing neural networks against gradient-boosted decision trees on structured data.
In contrast to prior work, our empirical findings indicate that neural networks are competitive against decision trees.
arXiv Detail & Related papers (2024-02-06T12:59:02Z)
- Defining Neural Network Architecture through Polytope Structures of Dataset [53.512432492636236]
This paper defines upper and lower bounds for neural network widths, which are informed by the polytope structure of the dataset in question.
We develop an algorithm to investigate a converse situation where the polytope structure of a dataset can be inferred from its corresponding trained neural networks.
It is established that popular datasets such as MNIST, Fashion-MNIST, and CIFAR10 can be efficiently encapsulated using no more than two polytopes with a small number of faces.
arXiv Detail & Related papers (2024-02-04T08:57:42Z)
- Structured Neural Networks for Density Estimation and Causal Inference [15.63518195860946]
Injecting structure into neural networks enables learning functions that satisfy invariances with respect to subsets of inputs.
We propose the Structured Neural Network (StrNN), which injects structure through masking pathways in a neural network.
arXiv Detail & Related papers (2023-11-03T20:15:05Z)
- Interpretable Mesomorphic Networks for Tabular Data [25.76214343259399]
We propose a new class of interpretable neural networks that are both deep and linear at the same time.
We optimize deep hypernetworks to generate explainable linear models on a per-instance basis.
arXiv Detail & Related papers (2023-05-22T14:41:17Z)
- Neural Attentive Circuits [93.95502541529115]
We introduce a general-purpose, yet modular neural architecture called Neural Attentive Circuits (NACs).
NACs learn the parameterization and a sparse connectivity of neural modules without using domain knowledge.
NACs achieve an 8x speedup at inference time while losing less than 3% performance.
arXiv Detail & Related papers (2022-10-14T18:00:07Z)
- Extrapolation and Spectral Bias of Neural Nets with Hadamard Product: a Polynomial Net Study [55.12108376616355]
The study of the NTK has been devoted to typical neural network architectures, but it remains incomplete for neural networks with Hadamard products (NNs-Hp).
In this work, we derive the finite-width NTK formulation for a special class of NNs-Hp, i.e., polynomial neural networks.
We prove their equivalence to the kernel regression predictor with the associated NTK, which expands the application scope of NTK.
arXiv Detail & Related papers (2022-09-16T06:36:06Z)
- Transfer Learning with Deep Tabular Models [66.67017691983182]
We show that upstream data gives tabular neural networks a decisive advantage over GBDT models.
We propose a realistic medical diagnosis benchmark for tabular transfer learning.
We propose a pseudo-feature method for cases where the upstream and downstream feature sets differ.
arXiv Detail & Related papers (2022-06-30T14:24:32Z)
- A Robust Stacking Framework for Training Deep Graph Models with Multifaceted Node Features [61.92791503017341]
Graph Neural Networks (GNNs) with numerical node features and graph structure as inputs have demonstrated superior performance on various supervised learning tasks with graph data.
However, the models that perform best on such features in standard supervised learning settings with IID (non-graph) data are not easily incorporated into a GNN.
Here we propose a robust stacking framework that fuses graph-aware propagation with arbitrary models intended for IID data.
arXiv Detail & Related papers (2022-06-16T22:46:33Z)
- TabularNet: A Neural Network Architecture for Understanding Semantic Structures of Tabular Data [30.479822289380255]
We propose a novel neural network architecture, TabularNet, to simultaneously extract spatial and relational information from tables.
For relational information, we design a new graph construction method based on the WordNet tree and adopt a Graph Convolutional Network (GCN) based encoder.
Our neural network architecture can serve as a unified neural backbone for different understanding tasks and be utilized in a multitask scenario.
arXiv Detail & Related papers (2021-06-06T11:48:09Z)
- On the Empirical Neural Tangent Kernel of Standard Finite-Width Convolutional Neural Network Architectures [3.4698840925433765]
It remains an open question how well NTK theory models standard neural network architectures of widths common in practice.
We study this question empirically for two well-known convolutional neural network architectures, namely AlexNet and LeNet.
For wider versions of these networks, where the number of channels and the widths of fully-connected layers are increased, the deviation from NTK-theory predictions decreases.
arXiv Detail & Related papers (2020-06-24T11:40:36Z)
- EdgeNets:Edge Varying Graph Neural Networks [179.99395949679547]
This paper puts forth a general framework that unifies state-of-the-art graph neural networks (GNNs) through the concept of EdgeNet.
An EdgeNet is a GNN architecture that allows different nodes to use different parameters to weigh the information of different neighbors.
This is a general linear and local operation that a node can perform, and it encompasses under one formulation all existing graph convolutional neural networks (GCNNs) as well as graph attention networks (GATs).
arXiv Detail & Related papers (2020-01-21T15:51:17Z)
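To make the edge-varying idea above concrete, here is a minimal NumPy sketch of one such layer, assuming a single learnable scalar per directed edge and a shared feature transform; the EdgeNet formulation is more general (e.g., edge weights per filter tap), and all names and shapes here are illustrative.

```python
# Minimal sketch of an edge-varying graph filter in the spirit of EdgeNet
# (an illustration of the idea, not the authors' implementation).
import numpy as np


def edge_varying_layer(x, adj, edge_weights, w_feat):
    """One linear, local propagation step.

    x:            (n_nodes, in_dim) node features
    adj:          (n_nodes, n_nodes) binary adjacency with self-loops
    edge_weights: (n_nodes, n_nodes) learnable per-edge scalars (assumed
                  parameterization; a shared GCN-style layer is the special
                  case where all edges reuse the same normalized weights)
    w_feat:       (in_dim, out_dim) shared feature transform
    """
    # Keep weights only on existing edges, so the operation stays local.
    phi = edge_weights * adj
    # Node i aggregates each neighbor j with its own weight phi[i, j].
    aggregated = phi @ x
    return np.tanh(aggregated @ w_feat)


# Example: 4-node path graph with random features and parameters.
n, d_in, d_out = 4, 3, 2
rng = np.random.default_rng(0)
adj = np.eye(n)
for i in range(n - 1):
    adj[i, i + 1] = adj[i + 1, i] = 1
out = edge_varying_layer(rng.normal(size=(n, d_in)), adj,
                         rng.normal(size=(n, n)), rng.normal(size=(d_in, d_out)))
print(out.shape)  # (4, 2)
```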