Autoselection of the Ensemble of Convolutional Neural Networks with
Second-Order Cone Programming
- URL: http://arxiv.org/abs/2302.05950v1
- Date: Sun, 12 Feb 2023 16:18:06 GMT
- Title: Autoselection of the Ensemble of Convolutional Neural Networks with
Second-Order Cone Programming
- Authors: Buse Çisil Güldoğuş, Abdullah Nazhat Abdullah, Muhammad Ammar Ali,
Süreyya Özöğür-Akyüz
- Abstract summary: This study proposes a mathematical model that prunes an ensemble of Convolutional Neural Networks (CNNs).
The proposed model is tested on CIFAR-10, CIFAR-100 and MNIST data sets.
- Score: 0.8029049649310213
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Ensemble techniques are frequently encountered in machine learning and
engineering problems, since they combine different models to produce an optimal
predictive solution. The ensemble concept can be adapted to deep learning models
to provide robustness and reliability. Because deep learning models keep growing,
ensemble pruning is important for managing computational complexity. Hence, this
study proposes a mathematical model that prunes an ensemble of Convolutional
Neural Networks (CNNs) of different depths and layer configurations, maximizing
accuracy and diversity simultaneously via a sparse second-order cone optimization
model. The proposed model is tested on the CIFAR-10, CIFAR-100 and MNIST data
sets and gives promising results while significantly reducing model complexity.
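As a rough illustration of the kind of formulation the abstract describes, the sketch below poses ensemble pruning as a sparse second-order cone program in cvxpy. The fit term, diversity scores, penalty weights, and selection threshold are all assumptions made for this example; it is not the paper's actual model.

```python
# A minimal illustrative sketch, not the paper's exact model: select a sparse
# subset of CNNs by solving a second-order cone program that trades a
# prediction-fit term (accuracy proxy) against a linear diversity bonus and an
# l1-style sparsity penalty.  P, y, div, lam and mu are hypothetical placeholders.
import numpy as np
import cvxpy as cp

rng = np.random.default_rng(0)
n_samples, n_models = 200, 12
P = rng.uniform(0.0, 1.0, size=(n_samples, n_models))  # prob. assigned to the true class
y = np.ones(n_samples)                                  # ideal prediction = 1 everywhere
div = rng.uniform(0.05, 0.30, size=n_models)            # avg. pairwise disagreement per CNN

w = cp.Variable(n_models, nonneg=True)                  # ensemble weight per CNN
lam, mu = 0.5, 0.2

objective = cp.Minimize(cp.norm(P @ w - y, 2)           # accuracy proxy (fit)
                        - lam * (div @ w)               # reward diverse members
                        + mu * cp.sum(w))               # sparsity-inducing penalty
constraints = [cp.sum(w) <= 1.0]                        # total weight budget
cp.Problem(objective, constraints).solve()

kept = np.flatnonzero(w.value > 1e-3)                   # models kept in the pruned ensemble
print("models kept:", kept)
```

The unsquared norm keeps the problem a cone program, and the l1-style penalty on the nonnegative weights is what actually drives members out of the ensemble.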
Related papers
- Dynamic Post-Hoc Neural Ensemblers [55.15643209328513]
In this study, we explore employing neural networks as ensemble methods.
Motivated by the risk of learning low-diversity ensembles, we propose regularizing the model by randomly dropping base model predictions.
We demonstrate that this approach lower-bounds the diversity within the ensemble, reducing overfitting and improving generalization capabilities.
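A minimal sketch of the prediction-dropout idea described above, assuming a simple linear combiner over base-model softmax outputs; the shapes and drop rate are illustrative choices, not the authors' implementation.

```python
# Sketch of the idea only: a post-hoc neural ensembler that randomly drops whole
# base-model prediction vectors during training, so the combiner cannot rely on
# any single ensemble member.
import torch
import torch.nn as nn

class DropoutEnsembler(nn.Module):
    def __init__(self, n_models: int, n_classes: int, p_drop: float = 0.3):
        super().__init__()
        self.p_drop = p_drop
        self.combiner = nn.Linear(n_models * n_classes, n_classes)

    def forward(self, base_preds: torch.Tensor) -> torch.Tensor:
        # base_preds: (batch, n_models, n_classes) softmax outputs of the members
        if self.training:
            keep = (torch.rand(base_preds.shape[:2], device=base_preds.device)
                    > self.p_drop).float().unsqueeze(-1)
            base_preds = base_preds * keep / (1.0 - self.p_drop)  # inverted dropout
        return self.combiner(base_preds.flatten(start_dim=1))

ensembler = DropoutEnsembler(n_models=5, n_classes=10)
logits = ensembler(torch.softmax(torch.randn(8, 5, 10), dim=-1))
```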
arXiv Detail & Related papers (2024-10-06T15:25:39Z)
- Towards Efficient Pareto Set Approximation via Mixture of Experts Based Model Fusion [53.33473557562837]
Solving multi-objective optimization problems for large deep neural networks is a challenging task due to the complexity of the loss landscape and the expensive computational cost.
We propose a practical and scalable approach to solve this problem via mixture of experts (MoE) based model fusion.
By ensembling the weights of specialized single-task models, the MoE module can effectively capture the trade-offs between multiple objectives.
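A rough sketch of the weight-ensembling idea stated above: form a new model whose parameters are a preference-weighted average of specialized single-task experts. The architecture, the normalization of the preference vector, and the helper names are assumptions; this is not the paper's MoE module.

```python
# Illustrative weight-space fusion of single-task experts sharing one architecture.
import copy
import torch
import torch.nn as nn

def fuse_experts(experts, preference):
    """Fused weights = preference-weighted average of the experts' weights."""
    gate = preference / preference.sum()            # normalize the preference vector
    fused = copy.deepcopy(experts[0])
    state = {name: sum(g * e.state_dict()[name] for g, e in zip(gate, experts))
             for name in experts[0].state_dict()}
    fused.load_state_dict(state)
    return fused

# two hypothetical single-task experts with identical architecture
make_model = lambda: nn.Sequential(nn.Linear(16, 32), nn.ReLU(), nn.Linear(32, 1))
expert_a, expert_b = make_model(), make_model()
fused = fuse_experts([expert_a, expert_b], preference=torch.tensor([0.7, 0.3]))
```

Sweeping the preference vector traces out different trade-off points between the two tasks, which is the sense in which such a fusion can approximate a Pareto set.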
arXiv Detail & Related papers (2024-06-14T07:16:18Z)
- Learnable & Interpretable Model Combination in Dynamic Systems Modeling [0.0]
We discuss which types of models are usually combined and propose a model interface that is capable of expressing a variety of mixed equation based models.
We propose a new wildcard topology that can describe the generic connection between two combined models in an easy-to-interpret fashion.
The contributions of this paper are highlighted in a proof of concept: different connection topologies between two models are learned, interpreted and compared.
arXiv Detail & Related papers (2024-06-12T11:17:11Z)
- The Convex Landscape of Neural Networks: Characterizing Global Optima and Stationary Points via Lasso Models [75.33431791218302]
Deep Neural Network (DNN) training objectives are analyzed through equivalent convex programs.
In this paper we examine the use of convex neural recovery models.
We show that all stationary points of the non-convex training objective can be characterized as global optima of subsampled convex programs.
arXiv Detail & Related papers (2023-12-19T23:04:56Z)
- Transfer learning for ensembles: reducing computation time and keeping the diversity [12.220069569688714]
Transferring a deep neural network trained on one problem to another requires only a small amount of data and little additional computation time.
Transferring an ensemble of deep neural networks, however, demands relatively high computational expense.
Our approach for the transfer learning of ensembles consists of two steps: (a) shifting weights of encoders of all models in the ensemble by a single shift vector and (b) doing a tiny fine-tuning for each individual model afterwards.
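The two steps above can be sketched directly; the `encoder` attribute, the `fine_tune` hook, and the toy models below are placeholders for illustration, not the authors' code.

```python
# Sketch of the two-step recipe: (a) add one shared shift vector to the flattened
# encoder weights of every ensemble member, (b) briefly fine-tune each member.
import copy
import torch
import torch.nn as nn
from torch.nn.utils import parameters_to_vector, vector_to_parameters

def transfer_ensemble(ensemble, shift, fine_tune, target_loader):
    transferred = []
    for model in ensemble:
        model = copy.deepcopy(model)
        enc_params = list(model.encoder.parameters())
        vec = parameters_to_vector(enc_params) + shift     # (a) shared shift vector
        vector_to_parameters(vec, enc_params)
        fine_tune(model, target_loader, epochs=1)          # (b) tiny per-model fine-tuning
        transferred.append(model)
    return transferred

class Member(nn.Module):                                   # toy member with an encoder
    def __init__(self):
        super().__init__()
        self.encoder = nn.Linear(8, 4)
        self.head = nn.Linear(4, 2)
    def forward(self, x):
        return self.head(torch.relu(self.encoder(x)))

ensemble = [Member() for _ in range(3)]
shift = torch.zeros(sum(p.numel() for p in ensemble[0].encoder.parameters()))
noop_fine_tune = lambda model, loader, epochs: None        # placeholder fine-tuning step
new_ensemble = transfer_ensemble(ensemble, shift, noop_fine_tune, target_loader=None)
```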
arXiv Detail & Related papers (2022-06-27T08:47:42Z)
- Investigating the Relationship Between Dropout Regularization and Model Complexity in Neural Networks [0.0]
Dropout Regularization serves to reduce variance in Deep Learning models.
We explore the relationship between the dropout rate and model complexity by training 2,000 neural networks.
We build neural networks that predict the optimal dropout rate given the number of hidden units in each dense layer.
arXiv Detail & Related papers (2021-08-14T23:49:33Z)
- Closed-form Continuous-Depth Models [99.40335716948101]
Continuous-depth neural models rely on advanced numerical differential equation solvers.
We present a new family of models, termed Closed-form Continuous-depth (CfC) networks, that are simple to describe and at least one order of magnitude faster.
arXiv Detail & Related papers (2021-06-25T22:08:51Z)
- AgEBO-Tabular: Joint Neural Architecture and Hyperparameter Search with Autotuned Data-Parallel Training for Tabular Data [11.552769149674544]
Development of high-performing predictive models for large data sets is a challenging task.
Automated machine learning (AutoML) is emerging as a promising approach to automate predictive model development.
We have developed AgEBO-Tabular, an approach that combines aging evolution (AgE), a parallel NAS method that searches over the neural architecture space, with autotuned data-parallel training.
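For reference, a generic aging-evolution loop looks roughly like the sketch below; this is the standard AgE scheme with placeholder `random_arch`, `mutate`, and `evaluate` hooks, not the AgEBO-Tabular implementation.

```python
# Generic aging evolution: sample a few members, mutate the best, retire the oldest.
import collections
import random

def aging_evolution(random_arch, mutate, evaluate,
                    population_size=20, sample_size=5, cycles=100):
    population = collections.deque()
    history = []
    for _ in range(population_size):                      # initialize the population
        arch = random_arch()
        population.append((arch, evaluate(arch)))
        history.append(population[-1])
    for _ in range(cycles):
        sample = random.sample(list(population), sample_size)
        parent = max(sample, key=lambda item: item[1])    # best of the random sample
        child = mutate(parent[0])
        population.append((child, evaluate(child)))
        history.append(population[-1])
        population.popleft()                              # age out the oldest member
    return max(history, key=lambda item: item[1])

# toy usage: "architecture" = an integer, fitness = -(x - 42)^2
best = aging_evolution(lambda: random.randint(0, 100),
                       lambda x: x + random.choice([-3, -1, 1, 3]),
                       lambda x: -(x - 42) ** 2)
print(best)
```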
arXiv Detail & Related papers (2020-10-30T16:28:48Z)
- Collegial Ensembles [11.64359837358763]
We show that collegial ensembles can be efficiently implemented in practical architectures using group convolutions and block diagonal layers.
We also show how our framework can be used to analytically derive optimal group convolution modules without having to train a single model.
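A minimal sketch of the grouped-convolution trick mentioned above: packing k small convolutional branches into a single Conv2d with groups=k so the whole ensemble-like block runs as one layer. The widths, depth, and head are arbitrary choices for illustration, not the paper's architecture.

```python
# k parallel convolutional branches implemented as one grouped convolution.
import torch
import torch.nn as nn

class GroupedEnsembleBlock(nn.Module):
    def __init__(self, k: int = 4, width: int = 16, n_classes: int = 10):
        super().__init__()
        self.stem = nn.Conv2d(3, k * width, kernel_size=3, padding=1)
        # groups=k keeps the k branches independent at this layer,
        # like k small CNNs running side by side
        self.branches = nn.Conv2d(k * width, k * width, kernel_size=3,
                                  padding=1, groups=k)
        self.head = nn.Linear(k * width, n_classes)       # aggregates all branches

    def forward(self, x):
        h = torch.relu(self.branches(torch.relu(self.stem(x))))
        h = h.mean(dim=(2, 3))                            # global average pooling
        return self.head(h)

logits = GroupedEnsembleBlock()(torch.randn(2, 3, 32, 32))
```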
arXiv Detail & Related papers (2020-06-13T16:40:26Z)
- Learning Gaussian Graphical Models via Multiplicative Weights [54.252053139374205]
We adapt an algorithm of Klivans and Meka based on the method of multiplicative weight updates.
The algorithm enjoys a sample complexity bound that is qualitatively similar to others in the literature.
It has a low runtime $O(mp^2)$ in the case of $m$ samples and $p$ nodes, and can trivially be implemented in an online manner.
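A toy multiplicative-weights (exponentiated-gradient) sketch in the spirit of the neighborhood-regression view of graphical model estimation; this is an illustration under simplifying assumptions, not the Klivans-Meka algorithm itself, and the scale, learning rate, and epoch count are arbitrary.

```python
# Exponentiated gradient over doubled features: a multiplicative-weights update
# that learns a sparse linear predictor of one variable from the others.
import numpy as np

def eg_neighborhood(X, y, scale=2.0, eta=0.05, epochs=5):
    m, p = X.shape
    w = np.ones(2 * p) / (2 * p)                   # simplex weights on [+x, -x] features
    Z = np.hstack([X, -X])                         # doubled features allow signed coefficients
    for _ in range(epochs):
        for t in range(m):
            pred = scale * w @ Z[t]
            grad = (pred - y[t]) * scale * Z[t]    # gradient of squared error / 2
            w = w * np.exp(-eta * grad)            # multiplicative update
            w = w / w.sum()                        # renormalize onto the simplex
    return scale * (w[:p] - w[p:])                 # signed coefficient estimate

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 8))
y = 0.8 * X[:, 2] - 0.6 * X[:, 5] + 0.1 * rng.normal(size=500)
print(np.round(eg_neighborhood(X, y), 2))          # large entries mark likely neighbors
```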
arXiv Detail & Related papers (2020-02-20T10:50:58Z)
- Model Fusion via Optimal Transport [64.13185244219353]
We present a layer-wise model fusion algorithm for neural networks.
We show that this can successfully yield "one-shot" knowledge transfer between neural networks trained on heterogeneous non-i.i.d. data.
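A simplified sketch of layer-wise fusion: align the units of one layer to the other's, here with a hard Hungarian assignment as a cheap stand-in for an optimal-transport coupling, then average the aligned weights. This illustrates the idea only, not the paper's algorithm.

```python
# Align the units of layer B to layer A, then average in the aligned space.
import numpy as np
from scipy.optimize import linear_sum_assignment

def fuse_layers(W_a, W_b):
    """W_a, W_b: (n_units, n_inputs) weight matrices of two same-shaped layers."""
    cost = np.linalg.norm(W_a[:, None, :] - W_b[None, :, :], axis=-1)  # pairwise unit distances
    rows, cols = linear_sum_assignment(cost)       # match units of A to units of B
    W_b_aligned = W_b[cols]                        # permute B's units into A's order
    return 0.5 * (W_a + W_b_aligned)               # average the aligned weights

rng = np.random.default_rng(0)
W_a = rng.normal(size=(6, 4))
W_b = W_a[rng.permutation(6)] + 0.01 * rng.normal(size=(6, 4))  # B = permuted, noisy A
print(np.round(fuse_layers(W_a, W_b) - W_a, 2))    # ~0: alignment undoes the permutation
```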
arXiv Detail & Related papers (2019-10-12T22:07:15Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences of its use.