Dataset Complexity Assessment Based on Cumulative Maximum Scaled Area
Under Laplacian Spectrum
- URL: http://arxiv.org/abs/2209.14743v1
- Date: Thu, 29 Sep 2022 13:02:04 GMT
- Title: Dataset Complexity Assessment Based on Cumulative Maximum Scaled Area
Under Laplacian Spectrum
- Authors: Guang Li, Ren Togo, Takahiro Ogawa, Miki Haseyama
- Abstract summary: It is meaningful to predict classification performance by assessing the complexity of datasets effectively before training DCNN models.
This paper proposes a novel method called cumulative maximum scaled Area Under Laplacian Spectrum (cmsAULS).
- Score: 38.65823547986758
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Dataset complexity assessment aims to predict classification performance on a
dataset with complexity calculation before training a classifier, which can
also be used for classifier selection and dataset reduction. The training
process of deep convolutional neural networks (DCNNs) is iterative and
time-consuming because of hyperparameter uncertainty and the domain shift
introduced by different datasets. Hence, it is meaningful to predict
classification performance by assessing the complexity of datasets effectively
before training DCNN models. This paper proposes a novel method called
cumulative maximum scaled Area Under Laplacian Spectrum (cmsAULS), which can
achieve state-of-the-art complexity assessment performance on six datasets.
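The paper's exact cmsAULS formulation is not reproduced here, but the general Laplacian-spectrum idea it builds on can be sketched as follows. This is an illustrative sketch only: the RBF similarity graph, the symmetric normalization, and the scaled-area formula are assumptions for exposition, not the authors' implementation.

```python
import numpy as np

def scaled_auls_sketch(X, sigma=1.0):
    """Illustrative sketch (not the paper's cmsAULS): scaled area under
    the cumulative spectrum of a normalized graph Laplacian built from
    pairwise RBF similarities of the sample X (n_points, n_features)."""
    sq_dists = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    W = np.exp(-sq_dists / (2.0 * sigma ** 2))   # RBF similarity graph
    np.fill_diagonal(W, 0.0)
    d = W.sum(axis=1)
    d_inv_sqrt = 1.0 / np.sqrt(np.maximum(d, 1e-12))
    # Symmetric normalized Laplacian: L = I - D^{-1/2} W D^{-1/2}
    L = np.eye(len(X)) - W * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]
    spectrum = np.sort(np.linalg.eigvalsh(L))[::-1]   # descending order
    cum = np.cumsum(spectrum)
    # Area under the cumulative spectrum curve, scaled to (0, 1]:
    # mass concentrated in the leading eigenvalues pushes the score up
    return float(cum.sum() / (len(cum) * cum[-1]))

rng = np.random.default_rng(0)
score = scaled_auls_sketch(rng.normal(size=(40, 2)))
```

Intuitively, how quickly the cumulative spectrum rises reflects the cluster structure of the similarity graph, which is what a spectrum-based complexity score tries to capture.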
Related papers
- Minimally Supervised Learning using Topological Projections in
Self-Organizing Maps [55.31182147885694]
We introduce a semi-supervised learning approach based on topological projections in self-organizing maps (SOMs).
Our proposed method first trains SOMs on unlabeled data and then assigns a minimal number of available labeled data points to key best matching units (BMUs).
Our results indicate that the proposed minimally supervised model significantly outperforms traditional regression techniques.
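A minimal sketch of the BMU-based labeling idea above, with SOM training itself omitted. The codebook values, the labeled points, and the nearest-labeled-unit rule are hypothetical placeholders, not the paper's method.

```python
import numpy as np

def bmu(codebook, x):
    """Index of the best matching unit: the prototype nearest to x."""
    return int(np.argmin(((codebook - x) ** 2).sum(axis=1)))

# Hypothetical pre-trained 1-D SOM codebook (training step omitted)
codebook = np.array([[0.0], [1.0], [2.0], [3.0]])

# Attach the few available labels to the BMUs of the labeled points
unit_labels = {}
for x, y in [(np.array([0.1]), "low"), (np.array([2.9]), "high")]:
    unit_labels[bmu(codebook, x)] = y

def predict(x):
    """Label a new point via the labeled unit closest on the map grid."""
    u = bmu(codebook, x)
    labeled_units = sorted(unit_labels)
    nearest = min(labeled_units, key=lambda v: abs(v - u))
    return unit_labels[nearest]
```

The key design point is that only the handful of labeled points ever touch a label; everything else is resolved through the map topology.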
arXiv Detail & Related papers (2024-01-12T22:51:48Z) - Iterative self-transfer learning: A general methodology for response
time-history prediction based on small dataset [0.0]
An iterative self-transfer learning method for training neural networks on small datasets is proposed in this study.
The results show that the proposed method can improve model performance by nearly an order of magnitude on small datasets.
arXiv Detail & Related papers (2023-06-14T18:48:04Z) - Treatment-RSPN: Recurrent Sum-Product Networks for Sequential Treatment
Regimes [3.7004311481324677]
Sum-product networks (SPNs) have emerged as a novel deep learning architecture enabling highly efficient probabilistic inference.
We propose a general framework for modelling sequential treatment decision-making behaviour and treatment response using RSPNs.
We evaluate our approach on a synthetic dataset as well as real-world data from the MIMIC-IV intensive care unit medical database.
arXiv Detail & Related papers (2022-11-14T00:18:44Z) - Do We Really Need a Learnable Classifier at the End of Deep Neural
Network? [118.18554882199676]
We study the potential of learning a neural network for classification with the classifier randomly initialized as an equiangular tight frame (ETF) and fixed during training.
Our experimental results show that our method achieves similar performance on image classification for balanced datasets.
arXiv Detail & Related papers (2022-03-17T04:34:28Z) - Rank-R FNN: A Tensor-Based Learning Model for High-Order Data
Classification [69.26747803963907]
Rank-R Feedforward Neural Network (FNN) is a tensor-based nonlinear learning model that imposes Canonical/Polyadic decomposition on its parameters.
It handles inputs as multilinear arrays, bypassing the need for vectorization, and can thus fully exploit the structural information along every data dimension.
We establish the universal approximation and learnability properties of Rank-R FNN, and we validate its performance on real-world hyperspectral datasets.
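A toy sketch of the CP-factorized weight idea for a 2-way (matrix) input. The shapes, the tanh activation, and the einsum layout are illustrative assumptions, not the authors' exact architecture.

```python
import numpy as np

def rank_r_hidden(X, U1, U2):
    """Hidden layer with CP-factorized weights: each of the H units uses
    R rank-1 components, so its pre-activation on the matrix input X is
    sum_r U1[:, r, h]^T X U2[:, r, h] -- no vectorization of X needed.
    Parameters per unit: R * (I1 + I2) instead of I1 * I2 for a full weight."""
    z = np.einsum("ij,irh,jrh->h", X, U1, U2)
    return np.tanh(z)

rng = np.random.default_rng(0)
I1, I2, R, H = 5, 4, 3, 2          # input mode sizes, CP rank, hidden units
X = rng.normal(size=(I1, I2))      # multilinear (matrix) input
U1 = rng.normal(size=(I1, R, H))   # mode-1 factor matrices
U2 = rng.normal(size=(I2, R, H))   # mode-2 factor matrices
h = rank_r_hidden(X, U1, U2)       # shape (H,)
```

For low rank R, the factored parameterization is far smaller than a full dense weight over the vectorized input, which is the source of the model's efficiency.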
arXiv Detail & Related papers (2021-04-11T16:37:32Z) - CDEvalSumm: An Empirical Study of Cross-Dataset Evaluation for Neural
Summarization Systems [121.78477833009671]
We investigate the performance of different summarization models under a cross-dataset setting.
A comprehensive study of 11 representative summarization systems on 5 datasets from different domains reveals the effect of model architectures and generation ways.
arXiv Detail & Related papers (2020-10-11T02:19:15Z) - SECODA: Segmentation- and Combination-Based Detection of Anomalies [0.0]
SECODA is an unsupervised non-parametric anomaly detection algorithm for datasets containing continuous and categorical attributes.
The algorithm has a low memory footprint, and its runtime performance scales linearly with the size of the dataset.
An evaluation with simulated and real-life datasets shows that this algorithm is able to identify many different types of anomalies.
arXiv Detail & Related papers (2020-08-16T10:03:14Z) - New advances in enumerative biclustering algorithms with online
partitioning [80.22629846165306]
This paper further extends RIn-Close_CVC, a biclustering algorithm capable of performing an efficient, complete, correct and non-redundant enumeration of maximal biclusters with constant values on columns in numerical datasets.
The improved algorithm, called RIn-Close_CVC3, retains the attractive properties of RIn-Close_CVC and is characterized by a drastic reduction in memory usage and a consistent gain in runtime.
arXiv Detail & Related papers (2020-03-07T14:54:26Z) - Tighter Bound Estimation of Sensitivity Analysis for Incremental and
Decremental Data Modification [39.62854914952284]
In large-scale classification problems, the dataset often faces frequent updates as part of the data is added to or removed from the original dataset.
We propose an algorithm to make rational inferences about the updated linear classifier without exactly updating the classifier.
Both theoretical analysis and experimental results show that the proposed approach is superior to existing methods in terms of the tightness of the coefficients' bounds and computational complexity.
arXiv Detail & Related papers (2020-03-06T18:28:26Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.