Quadratic mutual information regularization in real-time deep CNN models
- URL: http://arxiv.org/abs/2108.11774v1
- Date: Thu, 26 Aug 2021 13:14:24 GMT
- Title: Quadratic mutual information regularization in real-time deep CNN models
- Authors: Maria Tzelepi and Anastasios Tefas
- Abstract summary: A regularization method motivated by Quadratic Mutual Information is proposed.
Experiments on various binary classification problems are performed, indicating the effectiveness of the proposed models.
- Score: 51.66271681532262
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: In this paper, regularized lightweight deep convolutional neural network
models are proposed, capable of operating effectively in real time on devices
with restricted computational power for high-resolution video input.
Furthermore, a novel regularization method motivated by the Quadratic Mutual
Information is proposed in order to improve the generalization ability of the
utilized models. Extensive experiments on various binary classification
problems involved in autonomous systems are performed, indicating the
effectiveness of the proposed models as well as of the proposed regularizer.
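The abstract does not spell out the estimator, but quadratic mutual information is commonly estimated with Parzen windows and "information potentials" (the ED-QMI formulation). As a hedged illustration only, the sketch below estimates ED-QMI between feature vectors and discrete class labels in NumPy; a training loss could subtract a weighted version of this quantity to encourage class-informative representations. The exact regularizer in the paper may differ.

```python
import numpy as np

def gaussian_gram(x, sigma=1.0):
    # Pairwise Gaussian kernel matrix G[i, j] = exp(-||x_i - x_j||^2 / (2 sigma^2)).
    sq = np.sum((x[:, None, :] - x[None, :, :]) ** 2, axis=-1)
    return np.exp(-sq / (2.0 * sigma ** 2))

def qmi_regularizer(features, labels, sigma=1.0):
    """Parzen-window estimate of quadratic mutual information (ED-QMI)
    between features and class labels. Higher values indicate more
    class-informative features."""
    labels = np.asarray(labels)
    n = labels.size
    G = gaussian_gram(np.asarray(features, dtype=float), sigma)
    classes = np.unique(labels)
    priors = np.array([(labels == c).mean() for c in classes])

    # Within-class, overall, and between information potentials.
    v_in = sum(G[np.ix_(labels == c, labels == c)].sum() for c in classes) / n**2
    v_all = (priors ** 2).sum() * G.sum() / n**2
    v_btw = sum(p * G[labels == c].sum() for c, p in zip(classes, priors)) / n**2
    return v_in + v_all - 2.0 * v_btw
```

With two well-separated clusters, correctly assigned labels yield a noticeably larger QMI value than interleaved labels, which is the behavior a regularizer would exploit.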
Related papers
- Enhancing Scalability in Recommender Systems through Lottery Ticket Hypothesis and Knowledge Distillation-based Neural Network Pruning [1.3654846342364308]
This study introduces an innovative approach aimed at the efficient pruning of neural networks, with a particular focus on their deployment on edge devices.
Our method involves the integration of the Lottery Ticket Hypothesis (LTH) with the Knowledge Distillation (KD) framework, resulting in the formulation of three distinct pruning models.
Gratifyingly, our approaches yielded a GPU computation-power reduction of up to 66.67%.
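The two ingredients this study combines, lottery-ticket-style magnitude pruning and knowledge distillation, can be sketched generically as follows. This is a minimal NumPy illustration of the standard techniques, not the paper's three specific pruning models.

```python
import numpy as np

def softmax(z, T=1.0):
    # Temperature-softened softmax, numerically stabilized.
    z = z / T
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def magnitude_mask(weights, sparsity):
    # One-shot magnitude pruning: zero out the smallest-magnitude
    # fraction of weights (the lottery-ticket pruning criterion).
    flat = np.abs(weights).ravel()
    k = int(sparsity * flat.size)
    threshold = np.sort(flat)[k] if k > 0 else -np.inf
    return (np.abs(weights) >= threshold).astype(weights.dtype)

def distillation_loss(student_logits, teacher_logits, T=2.0):
    # KL(teacher || student) on temperature-softened distributions,
    # scaled by T^2 as in standard knowledge distillation.
    p = softmax(teacher_logits, T)
    q = softmax(student_logits, T)
    return (T ** 2) * np.sum(p * (np.log(p) - np.log(q)), axis=-1).mean()
```

In a pruning-with-distillation loop, the mask would be applied to the student's weights after each training round while the distillation loss keeps the pruned student close to the dense teacher.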
arXiv Detail & Related papers (2024-01-19T04:17:50Z)
- A PAC-Bayesian Perspective on the Interpolating Information Criterion [54.548058449535155]
We show how a PAC-Bayes bound is obtained for a general class of models, characterizing factors which influence performance in the interpolating regime.
We quantify how the test error of overparameterized models that achieve effectively zero training error depends on the quality of the implicit regularization imposed by, e.g., the combination of model and parameter-initialization scheme.
arXiv Detail & Related papers (2023-11-13T01:48:08Z)
- An Explainable Framework for Machine learning-Based Reactive Power Optimization of Distribution Network [3.239871645288635]
An explainable machine-learning framework is proposed to optimize the reactive power in distribution networks.
A Shapley additive explanation framework is presented to measure the contribution of each input feature to the solution of reactive power optimizations.
A model-agnostic approximation method is developed to estimate Shapley values, so as to avoid the heavy computational burden.
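Monte Carlo permutation sampling is a standard model-agnostic way to approximate Shapley values without the exponential exact computation; the paper's own approximation method may differ. A minimal sketch:

```python
import numpy as np

def shapley_mc(model, x, baseline, n_perm=200, seed=0):
    """Monte Carlo estimate of Shapley values for a single input x.
    'baseline' supplies values for features treated as absent.
    Works with any black-box model mapping a feature vector to a scalar."""
    rng = np.random.default_rng(seed)
    d = x.shape[0]
    phi = np.zeros(d)
    for _ in range(n_perm):
        order = rng.permutation(d)   # random feature ordering
        z = baseline.copy()
        prev = model(z)
        for j in order:
            z[j] = x[j]              # add feature j to the coalition
            cur = model(z)
            phi[j] += cur - prev     # marginal contribution of j
            prev = cur
    return phi / n_perm
```

For a linear model the marginal contribution of each feature is the same under every ordering, so the estimate recovers the exact Shapley values; for nonlinear models the average over permutations converges to them.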
arXiv Detail & Related papers (2023-11-07T10:24:03Z)
- Neural Harmonium: An Interpretable Deep Structure for Nonlinear Dynamic System Identification with Application to Audio Processing [4.599180419117645]
Interpretability helps us understand a model's ability to generalize and reveal its limitations.
We introduce a causal interpretable deep structure for modeling dynamic systems.
Our proposed model makes use of the harmonic analysis by modeling the system in a time-frequency domain.
arXiv Detail & Related papers (2023-10-10T21:32:15Z)
- On Robust Numerical Solver for ODE via Self-Attention Mechanism [82.95493796476767]
We explore training efficient and robust AI-enhanced numerical solvers with a small data size by mitigating intrinsic noise disturbances.
We first analyze the ability of the self-attention mechanism to regulate noise in supervised learning, and then propose a simple yet effective numerical solver, Attr, which introduces an additive self-attention mechanism into the numerical solution of differential equations.
arXiv Detail & Related papers (2023-02-05T01:39:21Z)
- Maximum entropy exploration in contextual bandits with neural networks and energy based models [63.872634680339644]
We present two classes of models, one with neural networks as reward estimators, and the other with energy based models.
We show that both techniques outperform well-known standard algorithms, where energy based models have the best overall performance.
This provides practitioners with new techniques that perform well in static and dynamic settings, and are particularly well suited to non-linear scenarios with continuous action spaces.
arXiv Detail & Related papers (2022-10-12T15:09:45Z)
- On the adaptation of recurrent neural networks for system identification [2.5234156040689237]
This paper presents a transfer learning approach which enables fast and efficient adaptation of Recurrent Neural Network (RNN) models of dynamical systems.
The system dynamics are then assumed to change, leading to an unacceptable degradation of the nominal model performance on the perturbed system.
To cope with the mismatch, the model is augmented with an additive correction term trained on fresh data from the new dynamic regime.
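The additive-correction idea can be illustrated with a toy example: a frozen nominal model (standing in here for the pre-trained RNN) whose predictions are augmented by a small correction model fit only on residuals from fresh data of the new regime. The polynomial correction below is purely illustrative.

```python
import numpy as np

# Nominal model trained on the old regime (a fixed linear map,
# standing in for a frozen RNN).
def nominal(u):
    return 2.0 * u

# Perturbed dynamics: the true system has drifted from the nominal one.
def true_system(u):
    return 2.0 * u + 0.5 * u ** 2

# Fresh data collected from the new dynamic regime.
u = np.linspace(-1.0, 1.0, 50)
y = true_system(u)

# Residuals the frozen nominal model fails to explain.
residual = y - nominal(u)

# Additive correction term: fit a small model on the residuals only
# (a quadratic polynomial here, standing in for a correction network).
coeffs = np.polyfit(u, residual, deg=2)
corrected = nominal(u) + np.polyval(coeffs, u)
```

Because only the small correction term is trained, adaptation is fast and the nominal model remains untouched, which mirrors the transfer-learning setup described above.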
arXiv Detail & Related papers (2022-01-21T12:04:17Z)
- Deep Variational Models for Collaborative Filtering-based Recommender Systems [63.995130144110156]
Deep learning provides accurate collaborative filtering models to improve recommender system results.
Our proposed models apply the variational concept to inject noise into the latent space of the deep architecture.
Results show the superiority of the proposed approach in scenarios where the variational enrichment exceeds the injected noise effect.
arXiv Detail & Related papers (2021-07-27T08:59:39Z)
- Learning High-Dimensional Distributions with Latent Neural Fokker-Planck Kernels [67.81799703916563]
We introduce new techniques that formulate the problem as solving a Fokker-Planck equation in a lower-dimensional latent space.
Our proposed model consists of latent-distribution morphing, a generator and a parameterized Fokker-Planck kernel function.
arXiv Detail & Related papers (2021-05-10T17:42:01Z)
- Prediction-Centric Learning of Independent Cascade Dynamics from Partial Observations [13.680949377743392]
We address the problem of learning of a spreading model such that the predictions generated from this model are accurate.
We introduce a computationally efficient algorithm, based on a scalable dynamic message-passing approach.
We show that tractable inference from the learned model generates a better prediction of marginal probabilities compared to the original model.
arXiv Detail & Related papers (2020-07-13T17:58:21Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this list (including all information) and is not responsible for any consequences of its use.