Classification Algorithm of Speech Data of Parkinson's Disease Based on
Convolution Sparse Kernel Transfer Learning with Optimal Kernel and Parallel
Sample/Feature Selection
- URL: http://arxiv.org/abs/2002.03716v1
- Date: Mon, 10 Feb 2020 13:20:21 GMT
- Title: Classification Algorithm of Speech Data of Parkinson's Disease Based on
Convolution Sparse Kernel Transfer Learning with Optimal Kernel and Parallel
Sample/Feature Selection
- Authors: Xiaoheng Zhang, Yongming Li, Pin Wang, Xiaoheng Tan, and Yuchuan Liu
- Abstract summary: A novel PD classification algorithm based on sparse kernel transfer learning is proposed.
Sparse transfer learning is used to extract structural information of PD speech features from public datasets.
The proposed algorithm achieves clear improvements in classification accuracy.
- Score: 14.1270098940551
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Labeled speech data from patients with Parkinson's disease (PD) are scarce,
and the statistical distributions of training and test data differ
significantly in the existing datasets. To solve these problems, dimensionality
reduction and sample augmentation must be considered. In this paper, a novel PD
classification algorithm based on sparse kernel transfer learning combined with
a parallel optimization of samples and features is proposed. Sparse transfer
learning is used to extract effective structural information of PD speech
features from public datasets as source domain data, and the fast ADMM
(alternating direction method of multipliers) iteration is improved to enhance
the information extraction performance. To
implement the parallel optimization, the potential relationships between
samples and features are considered to obtain high-quality combined features.
First, features are extracted from a specific public speech dataset to
construct a feature dataset as the source domain. Then, the PD target domain,
including the training and test datasets, is encoded by convolution sparse
coding, which can extract more in-depth information. Next, parallel
optimization is implemented. To further improve the classification performance,
a convolution kernel optimization mechanism is designed. Using two
representative public datasets and one self-constructed dataset, the
experiments compare over thirty relevant algorithms. The results show that when
taking the Sakar dataset, MaxLittle dataset and DNSH dataset as target domains,
the proposed algorithm achieves clear improvements in classification accuracy.
The study also found that the algorithms in this paper improve substantially on
non-transfer-learning approaches, demonstrating that transfer learning is both
more effective and incurs a more acceptable time cost.
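To make the central encoding step concrete: the pipeline above rests on convolution sparse coding, which represents a signal as a sum of filters convolved with sparse coefficient maps. The paper improves a fast ADMM solver for this problem; the sketch below is only a minimal illustration of the same objective using a plain proximal-gradient (ISTA) loop on 1-D signals in NumPy, with the filter sizes, step size, and 1-D setting all assumed for illustration rather than taken from the paper.

```python
import numpy as np

def soft_threshold(v, t):
    """Proximal operator of the l1 norm."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def conv_sparse_code(x, filters, lam=0.1, n_iter=300):
    """Sketch: encode a 1-D signal x as sum_k conv(d_k, z_k) with sparse z_k,
    i.e. minimize 0.5*||x - sum_k d_k * z_k||^2 + lam * sum_k ||z_k||_1
    by proximal gradient (ISTA). Filters are assumed odd-length so that
    'same'-mode correlation acts as the adjoint of 'same'-mode convolution.
    """
    codes = [np.zeros_like(x) for _ in filters]
    # Safe step size: the squared operator norm is bounded by sum_k ||d_k||_1^2.
    step = 1.0 / sum(np.abs(d).sum() ** 2 for d in filters)
    for _ in range(n_iter):
        recon = sum(np.convolve(z, d, mode="same") for z, d in zip(codes, filters))
        resid = recon - x
        for k, d in enumerate(filters):
            grad = np.correlate(resid, d, mode="same")  # adjoint of convolution
            codes[k] = soft_threshold(codes[k] - step * grad, step * lam)
    return codes

# Toy usage: a noisy spike train encoded with two random odd-length filters.
rng = np.random.default_rng(0)
x = np.zeros(256)
x[[40, 130, 200]] = 1.0
x += 0.01 * rng.standard_normal(256)
d1, d2 = rng.standard_normal(9), rng.standard_normal(9)
z1, z2 = conv_sparse_code(x, [d1, d2], lam=0.05)
```

An ADMM solver of the kind the paper improves would replace the gradient step with a frequency-domain linear solve plus a dual-variable update; ISTA is used here only because it fits in a few lines.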
Related papers
- Transfer Learning in $\ell_1$ Regularized Regression: Hyperparameter Selection Strategy based on Sharp Asymptotic Analysis [4.178980693837599]
Transfer learning techniques aim to leverage information from multiple related datasets to enhance prediction quality on a target dataset.
Several Lasso-based algorithms have been proposed, notably Trans-Lasso and Pretraining Lasso.
We conduct a thorough, precise study of the algorithm in a high-dimensional setting via an analysis using the replica method.
Our approach reveals a surprisingly simple behavior of the algorithm: Ignoring one of the two types of information transferred to the fine-tuning stage has little effect on generalization performance.
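As a rough sketch of how such two-stage transfer can look in code (an illustrative simplification, not the estimator analyzed in the paper; the function name and regularization weights are made up), one can fit a Lasso on the source data and then a sparse correction on the target residuals:

```python
import numpy as np
from sklearn.linear_model import Lasso

def two_stage_transfer_lasso(X_src, y_src, X_tgt, y_tgt,
                             alpha_src=0.1, alpha_tgt=0.1):
    # Stage 1: coarse estimate from the (typically larger) source dataset.
    w_src = Lasso(alpha=alpha_src).fit(X_src, y_src).coef_
    # Stage 2: sparse correction fitted on the target residuals, so only
    # the source/target difference has to be learned from scarce data.
    resid = y_tgt - X_tgt @ w_src
    delta = Lasso(alpha=alpha_tgt).fit(X_tgt, resid).coef_
    return w_src + delta
```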
arXiv Detail & Related papers (2024-09-26T10:20:59Z)
- Improved Distribution Matching for Dataset Condensation [91.55972945798531]
We propose a novel dataset condensation method based on distribution matching.
Our simple yet effective method outperforms most previous optimization-oriented methods with much fewer computational resources.
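A toy version of the distribution-matching idea (a deliberate simplification; the paper matches richer embedding statistics) moves a small synthetic set so that its mean embedding under a fixed random ReLU projection matches that of the real data:

```python
import numpy as np

def condense_by_distribution_matching(X_real, n_syn=10, n_proj=64,
                                      lr=0.5, n_iter=500, seed=0):
    """Toy distribution-matching condensation sketch, not the paper's method."""
    rng = np.random.default_rng(seed)
    d = X_real.shape[1]
    W = rng.standard_normal((d, n_proj)) / np.sqrt(d)   # fixed random embedding
    S = X_real[rng.choice(len(X_real), n_syn, replace=False)].copy()
    mu_real = np.maximum(X_real @ W, 0).mean(axis=0)
    for _ in range(n_iter):
        H = S @ W
        mu_syn = np.maximum(H, 0).mean(axis=0)
        diff = mu_syn - mu_real                          # embedding-mean gap
        # Chain rule through ReLU: dL/dS = (2/n_syn) * ((H>0) * diff) @ W.T
        S -= lr * (((H > 0) * diff) @ W.T) * (2.0 / n_syn)
    return S
```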
arXiv Detail & Related papers (2023-07-19T04:07:33Z)
- Exploring Data Redundancy in Real-world Image Classification through Data Selection [20.389636181891515]
Deep learning models often require large amounts of data for training, leading to increased costs.
We present two data valuation metrics based on Synaptic Intelligence and gradient norms, respectively, to study redundancy in real-world image data.
Online and offline data selection algorithms are then proposed via clustering and grouping based on the examined data values.
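As a loose illustration of the gradient-norm style of data valuation (the linear-model setting and top-k rule here are assumptions, not the paper's metrics), each sample can be scored by the norm of its per-sample loss gradient:

```python
import numpy as np

def gradient_norm_scores(X, y, w):
    """Per-sample gradient norms for logistic loss with a linear model w.
    The gradient for sample i is (p_i - y_i) * x_i, so its norm factorizes.
    Higher scores suggest more informative (less redundant) samples."""
    p = 1.0 / (1.0 + np.exp(-(X @ w)))       # predicted probabilities
    return np.abs(p - y) * np.linalg.norm(X, axis=1)

def select_top_fraction(X, y, w, keep=0.5):
    """Keep the highest-scoring fraction of samples (assumed rule)."""
    scores = gradient_norm_scores(X, y, w)
    k = max(1, int(keep * len(y)))
    return np.argsort(scores)[::-1][:k]      # indices of retained samples
```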
arXiv Detail & Related papers (2023-06-25T03:31:05Z)
- Joint Optimization of Class-Specific Training- and Test-Time Data Augmentation in Segmentation [35.41274775082237]
This paper presents an effective and general data augmentation framework for medical image segmentation.
We adopt a computationally efficient and data-efficient gradient-based meta-learning scheme to align the distribution of training and validation data.
We demonstrate the effectiveness of our method on four medical image segmentation tasks with two state-of-the-art segmentation models, DeepMedic and nnU-Net.
arXiv Detail & Related papers (2023-05-30T14:48:45Z)
- Personalized Decentralized Multi-Task Learning Over Dynamic Communication Graphs [59.96266198512243]
We propose a decentralized and federated learning algorithm for tasks that are positively and negatively correlated.
Our algorithm uses gradients to calculate the correlations among tasks automatically, and dynamically adjusts the communication graph to connect mutually beneficial tasks and isolate those that may negatively impact each other.
We conduct experiments on a synthetic Gaussian dataset and a large-scale celebrity attributes (CelebA) dataset.
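A minimal sketch of the gradient-based task-correlation idea (illustrative only; the cosine-similarity threshold rule is an assumption, not the paper's update): connect tasks whose gradients point in similar directions.

```python
import numpy as np

def task_affinity_graph(grads, threshold=0.0):
    """grads: array of shape (n_tasks, n_params), one gradient per task.
    Returns a boolean adjacency matrix linking tasks whose gradients have
    cosine similarity above 'threshold' (assumed rule, for illustration)."""
    g = grads / (np.linalg.norm(grads, axis=1, keepdims=True) + 1e-12)
    sim = g @ g.T                     # pairwise cosine similarities
    adj = sim > threshold
    np.fill_diagonal(adj, False)      # no self-edges in the communication graph
    return adj
```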
arXiv Detail & Related papers (2022-12-21T18:58:24Z)
- Dataset Complexity Assessment Based on Cumulative Maximum Scaled Area Under Laplacian Spectrum [38.65823547986758]
Assessing dataset complexity effectively before training DCNN models makes it possible to predict classification performance in advance.
This paper proposes a novel method called cumulative maximum scaled Area Under Laplacian Spectrum (cmsAULS).
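As a loose, generic illustration of spectrum-based complexity (not cmsAULS itself; the kNN graph construction and the mean-eigenvalue summary are assumptions), one can build a neighborhood graph over the data and summarize its normalized-Laplacian spectrum:

```python
import numpy as np
from sklearn.neighbors import kneighbors_graph
from scipy.sparse.csgraph import laplacian

def laplacian_spectrum_summary(X, k=10):
    """Crude dataset-complexity proxy: the sorted eigenvalue curve of the
    normalized Laplacian of a kNN graph, plus its mean as a scalar summary."""
    A = kneighbors_graph(X, n_neighbors=k, mode="connectivity")
    A = 0.5 * (A + A.T)                  # symmetrize the adjacency
    L = laplacian(A, normed=True)
    eig = np.sort(np.linalg.eigvalsh(L.toarray()))
    return eig, eig.mean()               # spectrum and a scaled-area-like scalar
```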
arXiv Detail & Related papers (2022-09-29T13:02:04Z)
- Towards Automated Imbalanced Learning with Deep Hierarchical Reinforcement Learning [57.163525407022966]
Imbalanced learning is a fundamental challenge in data mining, where there is a disproportionate ratio of training samples in each class.
Over-sampling is an effective technique to tackle imbalanced learning through generating synthetic samples for the minority class.
We propose AutoSMOTE, an automated over-sampling algorithm that can jointly optimize different levels of decisions.
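The interpolation step that SMOTE-style over-sampling relies on (AutoSMOTE automates the decisions around it; the sketch below is plain SMOTE, not AutoSMOTE) looks roughly like this:

```python
import numpy as np
from sklearn.neighbors import NearestNeighbors

def smote_oversample(X_min, n_new, k=5, seed=0):
    """Generate n_new synthetic minority samples by interpolating each chosen
    point toward one of its k nearest minority-class neighbors."""
    rng = np.random.default_rng(seed)
    nn = NearestNeighbors(n_neighbors=k + 1).fit(X_min)
    _, idx = nn.kneighbors(X_min)             # idx[:, 0] is the point itself
    base = rng.integers(0, len(X_min), n_new)
    neigh = idx[base, rng.integers(1, k + 1, n_new)]
    gap = rng.random((n_new, 1))              # interpolation coefficient in [0, 1)
    return X_min[base] + gap * (X_min[neigh] - X_min[base])
```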
arXiv Detail & Related papers (2022-08-26T04:28:01Z)
- Weakly Supervised Change Detection Using Guided Anisotropic Diffusion [97.43170678509478]
We propose original ideas that help us to leverage such datasets in the context of change detection.
First, we propose the guided anisotropic diffusion (GAD) algorithm, which improves semantic segmentation results.
We then show its potential in two weakly-supervised learning strategies tailored for change detection.
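For intuition, classical (unguided) Perona-Malik anisotropic diffusion, the operation GAD builds on, fits in a few lines; the guidance term from the paper is deliberately omitted, and the parameter values are arbitrary:

```python
import numpy as np

def perona_malik(img, n_iter=20, kappa=0.1, step=0.2):
    """Plain Perona-Malik diffusion: smooths within regions while preserving
    strong edges. np.roll gives periodic boundaries, acceptable for a sketch."""
    u = img.astype(float)
    for _ in range(n_iter):
        # Forward differences toward each 4-neighbor.
        dn = np.roll(u, -1, axis=0) - u
        ds = np.roll(u, 1, axis=0) - u
        de = np.roll(u, -1, axis=1) - u
        dw = np.roll(u, 1, axis=1) - u
        # Edge-stopping conductance: small across strong gradients.
        c = lambda d: np.exp(-(d / kappa) ** 2)
        u = u + step * (c(dn) * dn + c(ds) * ds + c(de) * de + c(dw) * dw)
    return u
```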
arXiv Detail & Related papers (2021-12-31T10:03:47Z)
- A Novel Multi-Stage Training Approach for Human Activity Recognition from Multimodal Wearable Sensor Data Using Deep Neural Network [11.946078871080836]
Deep neural networks are an effective choice for automatically recognizing human actions from data collected by various wearable sensors.
In this paper, we have proposed a novel multi-stage training approach that increases diversity in this feature extraction process.
arXiv Detail & Related papers (2021-01-03T20:48:56Z)
- Sparse PCA via $l_{2,p}$-Norm Regularization for Unsupervised Feature Selection [138.97647716793333]
We propose a simple and efficient unsupervised feature selection method by combining reconstruction error with $l_{2,p}$-norm regularization.
We present an efficient optimization algorithm to solve the proposed unsupervised model, and analyse the convergence and computational complexity of the algorithm theoretically.
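A hedged sketch of the reconstruction-plus-row-sparsity idea, restricted to the p = 1 case (an $l_{2,1}$ penalty) rather than the paper's general $l_{2,p}$ solver; the self-representation objective and ranking rule are assumptions for illustration:

```python
import numpy as np

def l21_feature_selection(X, lam=1.0, n_iter=200):
    """Illustrative row-sparse self-representation:
        min_W ||X - X W||_F^2 + lam * ||W||_{2,1}
    solved by proximal gradient; features are ranked by the row norms of W."""
    n, d = X.shape
    W = np.zeros((d, d))
    step = 1.0 / (2 * np.linalg.norm(X, 2) ** 2)    # 1 / Lipschitz constant
    for _ in range(n_iter):
        grad = 2 * X.T @ (X @ W - X)
        V = W - step * grad
        # Row-wise proximal step for the l_{2,1} norm (group soft-threshold).
        norms = np.linalg.norm(V, axis=1, keepdims=True)
        W = np.maximum(1 - step * lam / np.maximum(norms, 1e-12), 0) * V
    return np.argsort(-np.linalg.norm(W, axis=1))   # feature ranking
```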
arXiv Detail & Related papers (2020-12-29T04:08:38Z)
- New advances in enumerative biclustering algorithms with online partitioning [80.22629846165306]
This paper further extends RIn-Close_CVC, a biclustering algorithm capable of performing an efficient, complete, correct and non-redundant enumeration of maximal biclusters with constant values on columns in numerical datasets.
The improved algorithm, called RIn-Close_CVC3, keeps the attractive properties of RIn-Close_CVC and is characterized by a drastic reduction in memory usage and a consistent gain in runtime.
arXiv Detail & Related papers (2020-03-07T14:54:26Z)