Flexible, Non-parametric Modeling Using Regularized Neural Networks
- URL: http://arxiv.org/abs/2012.11369v2
- Date: Tue, 2 Feb 2021 09:36:55 GMT
- Title: Flexible, Non-parametric Modeling Using Regularized Neural Networks
- Authors: Oskar Allerbo, Rebecka Jörnsten
- Abstract summary: PrAda-net is a one-hidden-layer neural network, trained with proximal gradient descent and adaptive lasso.
It automatically adjusts the size and architecture of the neural network to capture the structure of the underlying data generative model.
We demonstrate PrAda-net on simulated data, where we compare its test error performance, variable importance and variable subset identification properties to other lasso-based approaches.
We also apply PrAda-net to the massive U.K. black smoke data set, to demonstrate its capability as an alternative to GAMs.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Non-parametric regression, such as generalized additive models (GAMs), is
able to capture complex data dependencies in a flexible, yet interpretable way.
However, choosing the format of the additive components often requires
non-trivial data exploration. Here, we propose an alternative to GAMs,
PrAda-net, which uses a one-hidden-layer neural network, trained with proximal
gradient descent and adaptive lasso. PrAda-net automatically adjusts the size
and architecture of the neural network to capture the complexity and structure
of the underlying data generative model. The compact network obtained by
PrAda-net can be translated to additive model components, making it suitable
for non-parametric statistical modelling with automatic model selection. We
demonstrate PrAda-net on simulated data, where we compare the test error
performance, variable importance and variable subset identification properties
of PrAda-net to other lasso-based approaches. We also apply PrAda-net to the
massive U.K. black smoke data set, to demonstrate the capability of using
PrAda-net as an alternative to GAMs. In contrast to GAMs, which often require
domain knowledge to select the functional forms of the additive components,
PrAda-net requires no such pre-selection while still resulting in interpretable
additive components.
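The training loop the abstract describes, a gradient step on the smooth data-fit loss followed by a proximal step for the adaptive lasso penalty, can be sketched as below. This is a minimal illustration only, assuming PyTorch, a squared-error loss, a tanh hidden layer, and a crude pilot estimate for the adaptive weights; the function and parameter names (fit_prox_adaptive_lasso, lam, gamma) are ours and this is not the authors' reference implementation.

```python
import torch

def soft_threshold(w, thresh):
    # Proximal operator of a (weighted) l1 penalty.
    return torch.sign(w) * torch.clamp(w.abs() - thresh, min=0.0)

def fit_prox_adaptive_lasso(X, y, n_hidden=50, lam=1e-2, gamma=1.0,
                            lr=1e-2, n_iter=5000):
    """One-hidden-layer network fit by proximal gradient descent with an
    adaptive lasso penalty on the input-to-hidden weights (a sketch)."""
    y = y.reshape(-1, 1)                               # (n, 1) targets
    n, p = X.shape
    W1 = torch.randn(p, n_hidden, requires_grad=True)  # input -> hidden
    b1 = torch.zeros(n_hidden, requires_grad=True)
    w2 = torch.randn(n_hidden, 1, requires_grad=True)  # hidden -> output

    # Adaptive lasso weights 1/|w_pilot|^gamma; a proper pilot estimate
    # (e.g., an unpenalized fit) would be computed first. For brevity we
    # use the random initialization as the pilot, which is a shortcut.
    adaptive = 1.0 / (W1.detach().abs() ** gamma + 1e-8)

    for _ in range(n_iter):
        loss = torch.mean((torch.tanh(X @ W1 + b1) @ w2 - y) ** 2)
        loss.backward()
        with torch.no_grad():
            for param in (W1, b1, w2):          # gradient step on the
                param -= lr * param.grad        # smooth (data-fit) part
                param.grad = None
            # Proximal step: adaptive soft-thresholding zeroes out weak
            # input-to-hidden connections, shrinking the architecture.
            W1.copy_(soft_threshold(W1, lr * lam * adaptive))
    return W1, b1, w2
```

After training, hidden units whose incoming weights are all zero can be dropped, and each surviving unit connects to only the few inputs that escape thresholding; grouping units by their active input set is one natural way to read off the additive, GAM-like components the abstract refers to.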
Related papers
- Instruction-Guided Autoregressive Neural Network Parameter Generation [49.800239140036496]
We propose IGPG, an autoregressive framework that unifies parameter synthesis across diverse tasks and architectures.
By autoregressively generating tokens of neural network weights, IGPG ensures inter-layer coherence and enables efficient adaptation across models and datasets.
Experiments on multiple datasets demonstrate that IGPG consolidates diverse pretrained models into a single, flexible generative framework.
arXiv Detail & Related papers (2025-04-02T05:50:19Z) - Steering Masked Discrete Diffusion Models via Discrete Denoising Posterior Prediction [88.65168366064061]
We introduce Discrete Denoising Posterior Prediction (DDPP), a novel framework that casts the task of steering pre-trained MDMs as a problem of probabilistic inference.
Our framework leads to a family of three novel objectives that are all simulation-free, and thus scalable.
We substantiate our designs via wet-lab validation, where we observe transient expression of reward-optimized protein sequences.
arXiv Detail & Related papers (2024-10-10T17:18:30Z) - Few-shot Online Anomaly Detection and Segmentation [29.693357653538474]
This paper focuses on addressing the challenging yet practical few-shot online anomaly detection and segmentation (FOADS) task.
Under the FOADS framework, models are trained on a few-shot normal dataset and then inspected and improved by leveraging unlabeled streaming data that contains both normal and abnormal samples.
In order to achieve improved performance with limited training samples, we employ multi-scale feature embedding extracted from a CNN pre-trained on ImageNet to obtain a robust representation.
arXiv Detail & Related papers (2024-03-27T02:24:00Z) - GCondNet: A Novel Method for Improving Neural Networks on Small High-Dimensional Tabular Data [14.124731264553889]
We propose GCondNet to enhance neural networks by leveraging implicit structures present in data.
GCondNet exploits the data's high-dimensionality, and thus improves the performance of an underlying predictor network.
We demonstrate GCondNet's effectiveness on 12 real-world datasets, where it outperforms 14 standard and state-of-the-art methods.
arXiv Detail & Related papers (2022-11-11T16:13:34Z) - Neural Attentive Circuits [93.95502541529115]
We introduce a general-purpose, yet modular neural architecture called Neural Attentive Circuits (NACs).
NACs learn the parameterization and a sparse connectivity of neural modules without using domain knowledge.
NACs achieve an 8x speedup at inference time while losing less than 3% performance.
arXiv Detail & Related papers (2022-10-14T18:00:07Z) - Learning to Learn with Generative Models of Neural Network Checkpoints [71.06722933442956]
We construct a dataset of neural network checkpoints and train a generative model on the parameters.
We find that our approach successfully generates parameters for a wide range of loss prompts.
We apply our method to different neural network architectures and tasks in supervised and reinforcement learning.
arXiv Detail & Related papers (2022-09-26T17:59:58Z) - ARM-Net: Adaptive Relation Modeling Network for Structured Data [29.94433633729326]
ARM-Net is an adaptive relation modeling network tailored for structured data; ARMOR is a lightweight framework for relational data based on ARM-Net.
We show that ARM-Net consistently outperforms existing models and provides more interpretable predictions for datasets.
arXiv Detail & Related papers (2021-07-05T07:37:24Z) - Firearm Detection via Convolutional Neural Networks: Comparing a Semantic Segmentation Model Against End-to-End Solutions [68.8204255655161]
Threat detection of weapons and aggressive behavior from live video can be used for rapid detection and prevention of potentially deadly incidents.
One way for achieving this is through the use of artificial intelligence and, in particular, machine learning for image analysis.
We compare a traditional monolithic end-to-end deep learning model with a previously proposed model, based on an ensemble of simpler neural networks, that detects firearms via semantic segmentation.
arXiv Detail & Related papers (2020-12-17T15:19:29Z) - Pre-Trained Models for Heterogeneous Information Networks [57.78194356302626]
We propose a self-supervised pre-training and fine-tuning framework, PF-HIN, to capture the features of a heterogeneous information network.
PF-HIN consistently and significantly outperforms state-of-the-art alternatives on each of these tasks, on four datasets.
arXiv Detail & Related papers (2020-07-07T03:36:28Z) - A Neural-embedded Choice Model: TasteNet-MNL Modeling Taste Heterogeneity with Flexibility and Interpretability [0.0]
Discrete choice models (DCMs) require a priori knowledge of the utility functions, especially how tastes vary across individuals.
In this paper, we utilize a neural network to learn taste representation.
We show that TasteNet-MNL reaches the ground-truth model's predictability and recovers the nonlinear taste functions on synthetic data.
arXiv Detail & Related papers (2020-02-03T18:03:54Z) - Model Fusion via Optimal Transport [64.13185244219353]
We present a layer-wise model fusion algorithm for neural networks.
We show that this can successfully yield "one-shot" knowledge transfer between neural networks trained on heterogeneous non-i.i.d. data.
arXiv Detail & Related papers (2019-10-12T22:07:15Z)