Related papers: Subject-specific Deep Neural Networks for Count Data with High-cardinality Categorical Features

Subject-specific Deep Neural Networks for Count Data with High-cardinality Categorical Features

URL: http://arxiv.org/abs/2310.11654v1
Date: Wed, 18 Oct 2023 01:54:48 GMT
Title: Subject-specific Deep Neural Networks for Count Data with High-cardinality Categorical Features
Authors: Hangbin Lee, Il Do Ha, Changha Hwang, Youngjo Lee
Abstract summary: We propose a novel hierarchical likelihood learning framework for introducing gamma random effects into a Poisson deep neural network. The proposed method simultaneously yields maximum likelihood estimators for fixed parameters and best unbiased predictors for random effects. State-of-the-art network architectures can be easily implemented into the proposed h-likelihood framework.
Score: 1.2289361708127877
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: There is a growing interest in subject-specific predictions using deep neural networks (DNNs) because real-world data often exhibit correlations, which have typically been overlooked in traditional DNN frameworks. In this paper, we propose a novel hierarchical likelihood learning framework for introducing gamma random effects into the Poisson DNN, so as to improve the prediction performance by capturing both nonlinear effects of input variables and subject-specific cluster effects. The proposed method simultaneously yields maximum likelihood estimators for fixed parameters and best unbiased predictors for random effects by optimizing a single objective function. This approach enables a fast end-to-end algorithm for handling clustered count data, which often involve high-cardinality categorical features. Furthermore, state-of-the-art network architectures can be easily implemented into the proposed h-likelihood framework. As an example, we introduce a multi-head attention layer and a sparsemax function, which allows feature selection in high-dimensional settings. To enhance practical performance and learning efficiency, we present an adjustment procedure for prediction of random parameters and a method-of-moments estimator for pretraining of the variance component. Various experimental studies and real data analyses confirm the advantages of our proposed methods.
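
Illustrative sketch (not the authors' code): the abstract describes a single objective that covers both the fixed parameters and the gamma random effects. The snippet below assumes the textbook Poisson-gamma h-likelihood on the v = log(u) scale, with u_i ~ Gamma(shape 1/lam, scale lam) so that E[u_i] = 1 and Var(u_i) = lam; the paper's exact objective may differ. The function names, the toy data, and the fixed `log_mu` values (standing in for a network output) are all hypothetical. Under this parameterization the random-effect predictor that maximizes the objective has the closed form u_i = (y_i+ + 1/lam) / (mu_i+ + 1/lam), where y_i+ and mu_i+ are the cluster sums of the counts and of the fitted means.

```python
import math
import numpy as np


def neg_h_likelihood(y, log_mu, cluster, v, lam):
    """Negative Poisson-gamma h-likelihood on the v = log(u) scale (assumed form)."""
    eta = log_mu + v[cluster]  # log of the conditional mean mu_ij * u_i
    log_fact = np.array([math.lgamma(k + 1.0) for k in y])
    # Conditional Poisson log-density of y given v.
    pois = y * eta - np.exp(eta) - log_fact
    # Log-density of v = log(u) for u ~ Gamma(shape=1/lam, scale=lam).
    gam = (v - np.exp(v)) / lam - math.log(lam) / lam - math.lgamma(1.0 / lam)
    return -(pois.sum() + gam.sum())


def closed_form_u(y, log_mu, cluster, n_clusters, lam):
    """Random effects that maximize the h-likelihood above for fixed log_mu and lam."""
    y_sum = np.bincount(cluster, weights=y, minlength=n_clusters)
    mu_sum = np.bincount(cluster, weights=np.exp(log_mu), minlength=n_clusters)
    return (y_sum + 1.0 / lam) / (mu_sum + 1.0 / lam)


# Toy clustered counts: two subjects with three and two observations.
y = np.array([3.0, 0.0, 2.0, 7.0, 5.0])
cluster = np.array([0, 0, 0, 1, 1])
log_mu = np.log(np.array([1.5, 0.5, 2.0, 3.0, 2.5]))  # placeholder for a DNN output
lam = 0.5  # assumed variance of the random effects

u_hat = closed_form_u(y, log_mu, cluster, n_clusters=2, lam=lam)
v_hat = np.log(u_hat)
print("predicted random effects:", u_hat)
print("negative h-likelihood:", neg_h_likelihood(y, log_mu, cluster, v_hat, lam))
```

In an end-to-end setting, `log_mu` would come from the network and the same scalar objective would be minimized over the network weights and `v` with a gradient-based optimizer; the closed form is only a convenient check for this assumed parameterization.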

Related papers

Factor Informed Double Deep Learning For Average Treatment Effect Estimation [8.767175335575386]
We present the use of a Double Deep Learning strategy for estimation. We use factor-augmented deep learning-based estimators, FAST-NN, for both the response functions and propensity scores. Our method achieves semiparametric efficiency under a very flexible family of propensity and outcome models.
arXiv Detail & Related papers (2025-08-23T20:26:33Z)
Detecting and Pruning Prominent but Detrimental Neurons in Large Language Models [68.57424628540907]
Large language models (LLMs) often develop learned mechanisms specialized to specific datasets. We introduce a fine-tuning approach designed to enhance generalization by identifying and pruning neurons associated with dataset-specific mechanisms. Our method employs Integrated Gradients to quantify each neuron's influence on high-confidence predictions, pinpointing those that disproportionately contribute to dataset-specific performance.
arXiv Detail & Related papers (2025-07-12T08:10:10Z)
Neural Conformal Control for Time Series Forecasting [54.96087475179419]
We introduce a neural network conformal prediction method for time series that enhances adaptivity in non-stationary environments. Our approach acts as a neural controller designed to achieve desired target coverage, leveraging auxiliary multi-view data with neural network encoders. We empirically demonstrate significant improvements in coverage and probabilistic accuracy, and find that our method is the only one that combines good calibration with consistency in prediction intervals.
arXiv Detail & Related papers (2024-12-24T03:56:25Z)
Nonparametric Filtering, Estimation and Classification using Neural Jump ODEs [3.437372707846067]
Neural Jump ODEs model the conditional expectation between observations by neural ODEs and jump at arrival of new observations. They have demonstrated effectiveness for fully data-driven online forecasting in settings with irregular and partial observations. This work extends the framework to input-output systems, enabling direct applications in online filtering and classification.
arXiv Detail & Related papers (2024-12-04T12:31:15Z)
Implicit Generative Prior for Bayesian Neural Networks [8.013264410621357]
We propose a novel neural adaptive empirical Bayes (NA-EB) framework for complex data structures. The proposed NA-EB framework combines variational inference with a gradient ascent algorithm. We demonstrate the practical applications of our framework through extensive evaluations on a variety of tasks.
arXiv Detail & Related papers (2024-04-27T21:00:38Z)
Embedded feature selection in LSTM networks with multi-objective evolutionary ensemble learning for time series forecasting [49.1574468325115]
We present a novel feature selection method embedded in Long Short-Term Memory networks. Our approach optimizes the weights and biases of the LSTM in a partitioned manner. Experimental evaluations on air quality time series data from Italy and southeast Spain demonstrate that our method substantially improves the generalization ability of conventional LSTMs.
arXiv Detail & Related papers (2023-12-29T08:42:10Z)
Function-Space Regularization in Neural Networks: A Probabilistic Perspective [51.133793272222874]
We show that we can derive a well-motivated regularization technique that allows explicitly encoding information about desired predictive functions into neural network training. We evaluate the utility of this regularization technique empirically and demonstrate that the proposed method leads to near-perfect semantic shift detection and highly-calibrated predictive uncertainty estimates.
arXiv Detail & Related papers (2023-12-28T17:50:56Z)
Deep Neural Networks for Semiparametric Frailty Models via H-likelihood [0.0]
We propose a new deep neural network-based frailty model (DNN-FM) for prediction of time-to-event data. Joint estimators of the new h-likelihood model provide maximum likelihood estimators for fixed parameters and best unbiased predictors for random frailties.
arXiv Detail & Related papers (2023-07-13T06:46:51Z)
Sparse-Input Neural Network using Group Concave Regularization [10.103025766129006]
Simultaneous feature selection and non-linear function estimation are challenging in neural networks. We propose a framework of sparse-input neural networks using group concave regularization for feature selection in both low-dimensional and high-dimensional settings.
arXiv Detail & Related papers (2023-07-01T13:47:09Z)
Deep networks for system identification: a Survey [56.34005280792013]
System identification learns mathematical descriptions of dynamic systems from input-output data. The main aim of the identified model is to predict new data from previous observations. We discuss architectures commonly adopted in the literature, like feedforward, convolutional, and recurrent networks.
arXiv Detail & Related papers (2023-01-30T12:38:31Z)
Efficient Graph Neural Network Inference at Large Scale [54.89457550773165]
Graph neural networks (GNNs) have demonstrated excellent performance in a wide range of applications. Existing scalable GNNs leverage linear propagation to preprocess the features and accelerate the training and inference procedure. We propose a novel adaptive propagation order approach that generates the personalized propagation order for each node based on its topological information.
arXiv Detail & Related papers (2022-11-01T14:38:18Z)
Deep-Ensemble-Based Uncertainty Quantification in Spatiotemporal Graph Neural Networks for Traffic Forecasting [2.088376060651494]
We focus on a diffusion convolutional recurrent neural network (DCRNN), a state-of-the-art method for short-term traffic forecasting. We develop a scalable deep ensemble approach to quantify uncertainties for DCRNN. We show that our generic and scalable approach outperforms the current state-of-the-art Bayesian and a number of other commonly used frequentist techniques.
arXiv Detail & Related papers (2022-04-04T16:10:55Z)
Rank-R FNN: A Tensor-Based Learning Model for High-Order Data Classification [69.26747803963907]
Rank-R Feedforward Neural Network (FNN) is a tensor-based nonlinear learning model that imposes Canonical/Polyadic decomposition on its parameters. First, it handles inputs as multilinear arrays, bypassing the need for vectorization, and can thus fully exploit the structural information along every data dimension. We establish the universal approximation and learnability properties of Rank-R FNN, and we validate its performance on real-world hyperspectral datasets.
arXiv Detail & Related papers (2021-04-11T16:37:32Z)
Adversarial Feature Augmentation and Normalization for Visual Recognition [109.6834687220478]
Recent advances in computer vision take advantage of adversarial data augmentation to ameliorate the generalization ability of classification models. Here, we present an effective and efficient alternative that advocates adversarial augmentation on intermediate feature embeddings. We validate the proposed approach across diverse visual recognition tasks with representative backbone networks.
arXiv Detail & Related papers (2021-03-22T20:36:34Z)
Ramifications of Approximate Posterior Inference for Bayesian Deep Learning in Adversarial and Out-of-Distribution Settings [7.476901945542385]
We show that Bayesian deep learning models on certain occasions marginally outperform conventional neural networks. Preliminary investigations indicate the potential inherent role of bias due to choices of initialisation, architecture or activation functions.
arXiv Detail & Related papers (2020-09-03T16:58:15Z)

This list is automatically generated from the titles and abstracts of the papers on this site.