Integrating Random Effects in Deep Neural Networks
- URL: http://arxiv.org/abs/2206.03314v1
- Date: Tue, 7 Jun 2022 14:02:24 GMT
- Title: Integrating Random Effects in Deep Neural Networks
- Authors: Giora Simchoni, Saharon Rosset
- Abstract summary: We propose to use the mixed models framework to handle correlated data in deep neural networks.
By treating the effects underlying the correlation structure as random effects, mixed models are able to avoid overfitted parameter estimates.
Our approach, which we call LMMNN, is demonstrated to improve performance over natural competitors in various correlation scenarios.
- Score: 4.860671253873579
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Modern approaches to supervised learning like deep neural networks (DNNs)
typically implicitly assume that observed responses are statistically
independent. In contrast, correlated data are prevalent in real-life
large-scale applications, with typical sources of correlation including
spatial, temporal and clustering structures. These correlations are either
ignored by DNNs, or ad-hoc solutions are developed for specific use cases. We
propose to use the mixed models framework to handle correlated data in DNNs. By
treating the effects underlying the correlation structure as random effects,
mixed models are able to avoid overfitted parameter estimates and ultimately
yield better predictive performance. The key to combining mixed models and DNNs
is using the Gaussian negative log-likelihood (NLL) as a natural loss function
that is minimized with DNN machinery including stochastic gradient descent
(SGD). Since NLL does not decompose like standard DNN loss functions, the use
of SGD with NLL presents some theoretical and implementation challenges, which
we address. Our approach, which we call LMMNN, is demonstrated to improve
performance over natural competitors in various correlation scenarios on
diverse simulated and real datasets. Our focus is on a regression setting and
tabular datasets, but we also show some results for classification. Our code is
available at https://github.com/gsimchoni/lmmnn.
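To make the loss concrete, below is a minimal sketch of the Gaussian NLL for a single-categorical random-intercepts model, y = f(X) + Zb + e with b ~ N(0, sigma_b^2 I_q) and e ~ N(0, sigma_e^2 I_n), so the marginal covariance is V = sigma_e^2 I_n + sigma_b^2 Z Z^T. This sketch is written in PyTorch for brevity and is not the authors' implementation; names such as GaussianNLL and f_net are hypothetical and do not come from the lmmnn repository.
```python
import math
import torch
import torch.nn as nn

class GaussianNLL(nn.Module):
    """Exact Gaussian NLL for y = f(X) + Z b + e, with V = sig2e*I + sig2b*Z Z^T.
    Illustrative sketch only, not the paper's implementation."""
    def __init__(self):
        super().__init__()
        # log-parameterize the variance components so they stay positive
        self.log_sig2e = nn.Parameter(torch.zeros(()))
        self.log_sig2b = nn.Parameter(torch.zeros(()))

    def forward(self, y, f_x, Z):
        n = y.shape[0]
        sig2e, sig2b = self.log_sig2e.exp(), self.log_sig2b.exp()
        V = sig2e * torch.eye(n) + sig2b * Z @ Z.T        # n x n marginal covariance
        resid = (y - f_x).unsqueeze(1)                    # residual of the DNN mean, shape (n, 1)
        L = torch.linalg.cholesky(V)
        alpha = torch.cholesky_solve(resid, L)            # V^{-1} (y - f(X))
        logdet = 2.0 * torch.log(torch.diagonal(L)).sum() # log|V| via the Cholesky factor
        quad = (resid.T @ alpha).squeeze()
        return 0.5 * (quad + logdet + n * math.log(2 * math.pi))

# Toy usage: n = 100 observations in q = 10 clusters, p = 5 features.
torch.manual_seed(0)
n, q, p = 100, 10, 5
X = torch.randn(n, p)
Z = torch.nn.functional.one_hot(torch.randint(q, (n,)), q).float()   # cluster design matrix
y = X.sum(dim=1) + Z @ torch.randn(q) + 0.1 * torch.randn(n)

f_net = nn.Sequential(nn.Linear(p, 16), nn.ReLU(), nn.Linear(16, 1)) # DNN fixed-effect part f(X)
loss_fn = GaussianNLL()
opt = torch.optim.Adam(list(f_net.parameters()) + list(loss_fn.parameters()), lr=1e-2)
for _ in range(200):                                                 # full-batch gradient steps
    opt.zero_grad()
    loss = loss_fn(y, f_net(X).squeeze(-1), Z)
    loss.backward()
    opt.step()
print(loss_fn.log_sig2e.exp().item(), loss_fn.log_sig2b.exp().item())
```
Because this NLL involves the full n x n covariance V, it does not decompose over observations like a standard pointwise loss; the toy loop above therefore takes full-batch gradient steps, whereas the paper addresses how mini-batch SGD can still be applied in this setting.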
Related papers
- Statistical Properties of Deep Neural Networks with Dependent Data [0.0]
This paper establishes statistical properties of deep neural network (DNN) estimators under dependent data.
The framework provided also offers potential for research into other DNN architectures and time-series applications.
arXiv Detail & Related papers (2024-10-14T21:46:57Z)
- Learning to Reweight for Graph Neural Network [63.978102332612906]
Graph Neural Networks (GNNs) show promising results for graph tasks.
The generalization ability of existing GNNs degrades when there are distribution shifts between training and testing graph data.
We propose a novel nonlinear graph decorrelation method, which can substantially improve the out-of-distribution generalization ability.
arXiv Detail & Related papers (2023-12-19T12:25:10Z)
- Benign Overfitting in Deep Neural Networks under Lazy Training [72.28294823115502]
We show that when the data distribution is well-separated, DNNs can achieve Bayes-optimal test error for classification.
Our results indicate that interpolating with smoother functions leads to better generalization.
arXiv Detail & Related papers (2023-05-30T19:37:44Z)
- Adaptive-saturated RNN: Remember more with less instability [2.191505742658975]
This work proposes Adaptive-Saturated RNNs (asRNN), a variant that dynamically adjusts its saturation level between the two approaches.
Our experiments show encouraging results of asRNN on challenging sequence learning benchmarks compared to several strong competitors.
arXiv Detail & Related papers (2023-04-24T02:28:03Z)
- Task-Synchronized Recurrent Neural Networks [0.0]
When data are sampled non-uniformly in time, Recurrent Neural Networks (RNNs) traditionally handle this by ignoring the irregularity, feeding the time differences as additional inputs, or resampling the data.
We propose an elegant straightforward alternative approach where instead the RNN is in effect resampled in time to match the time of the data or the task at hand.
We confirm empirically that our models can effectively compensate for the time-non-uniformity of the data and demonstrate that they compare favorably to data resampling, classical RNN methods, and alternative RNN models.
arXiv Detail & Related papers (2022-04-11T15:27:40Z)
- EIGNN: Efficient Infinite-Depth Graph Neural Networks [51.97361378423152]
Graph neural networks (GNNs) are widely used for modelling graph-structured data in numerous applications.
Motivated by the limited ability of finite-depth GNNs to capture long-range dependencies, we propose a GNN model with infinite depth, which we call Efficient Infinite-Depth Graph Neural Networks (EIGNN).
We show that EIGNN has a better ability to capture long-range dependencies than recent baselines, and consistently achieves state-of-the-art performance.
arXiv Detail & Related papers (2022-02-22T08:16:58Z)
- Generalizing Graph Neural Networks on Out-Of-Distribution Graphs [51.33152272781324]
Graph Neural Networks (GNNs) are typically proposed without considering the distribution shifts between training and testing graphs.
In such a setting, GNNs tend to exploit subtle statistical correlations in the training set for predictions, even though these are spurious correlations.
We propose a general causal representation framework, called StableGNN, to eliminate the impact of spurious correlations.
arXiv Detail & Related papers (2021-11-20T18:57:18Z)
- Shift-Robust GNNs: Overcoming the Limitations of Localized Graph Training data [52.771780951404565]
Shift-Robust GNN (SR-GNN) is designed to account for distributional differences between biased training data and the graph's true inference distribution.
We show that SR-GNN outperforms other GNN baselines in accuracy, eliminating at least 40% of the negative effects introduced by biased training data.
arXiv Detail & Related papers (2021-08-02T18:00:38Z)
- ASFGNN: Automated Separated-Federated Graph Neural Network [17.817867271722093]
We propose an automated Separated-Federated Graph Neural Network (ASFGNN) learning paradigm.
We conduct experiments on benchmark datasets and the results demonstrate that ASFGNN significantly outperforms the naive federated GNN.
arXiv Detail & Related papers (2020-11-06T09:21:34Z)
- Stochastic Graph Neural Networks [123.39024384275054]
Graph neural networks (GNNs) model nonlinear representations in graph data with applications in distributed agent coordination, control, and planning.
Current GNN architectures assume ideal scenarios and ignore link fluctuations that occur due to environment, human factors, or external attacks.
In these situations, the GNN fails to address its distributed task if the topological randomness is not considered accordingly.
arXiv Detail & Related papers (2020-06-04T08:00:00Z)
- Comparing SNNs and RNNs on Neuromorphic Vision Datasets: Similarities and Differences [36.82069150045153]
Spiking neural networks (SNNs) and recurrent neural networks (RNNs) are benchmarked on neuromorphic data.
In this work, we conduct a systematic study comparing SNNs and RNNs on neuromorphic data.
arXiv Detail & Related papers (2020-05-02T10:19:37Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences of its use.