On Accelerating Deep Neural Network Mutation Analysis by Neuron and Mutant Clustering
- URL: http://arxiv.org/abs/2501.12598v1
- Date: Wed, 22 Jan 2025 02:48:07 GMT
- Title: On Accelerating Deep Neural Network Mutation Analysis by Neuron and Mutant Clustering
- Authors: Lauren Lyons, Ali Ghanbari
- Abstract summary: Mutation analysis of deep neural networks (DNNs) is a promising method for effective evaluation of test data quality and model robustness. We present DEEPMAACC, a technique and a tool that speeds up mutation analysis through neuron and mutant clustering. Our results demonstrate that a trade-off can be made between mutation testing speed and mutation score error.
- Score: 1.7188280334580197
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Mutation analysis of deep neural networks (DNNs) is a promising method for effective evaluation of test data quality and model robustness, but it can be computationally expensive, especially for large models. To alleviate this, we present DEEPMAACC, a technique and a tool that speeds up DNN mutation analysis through neuron and mutant clustering. DEEPMAACC implements two methods: (1) neuron clustering to reduce the number of generated mutants and (2) mutant clustering to reduce the number of mutants to be tested by selecting representative mutants for testing. Both use hierarchical agglomerative clustering to group neurons and mutants with similar weights, with the goal of improving efficiency while maintaining mutation score. DEEPMAACC has been evaluated on 8 DNN models across 4 popular classification datasets and two DNN architectures. When compared to exhaustive, or vanilla, mutation analysis, the results provide empirical evidence that the neuron clustering approach, on average, accelerates mutation analysis by 69.77%, with an average -26.84% error in mutation score. Meanwhile, the mutant clustering approach, on average, accelerates mutation analysis by 35.31%, with an average 1.96% error in mutation score. Our results demonstrate that a trade-off can be made between mutation testing speed and mutation score error.
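The abstract describes grouping neurons with similar weights by hierarchical agglomerative clustering and then working with one representative per group. The following is a minimal sketch of that idea, not DEEPMAACC's actual implementation: the function names, the average-linkage and Euclidean-distance choices, and the "closest to the cluster mean" rule for picking a representative are all assumptions made for illustration.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster


def cluster_neurons(weights, n_clusters):
    """Group neurons by their incoming weight vectors.

    weights: array of shape (n_neurons, n_inputs), one row per neuron.
    Returns one integer cluster label per neuron.
    Assumption: average linkage with Euclidean distance.
    """
    tree = linkage(weights, method="average", metric="euclidean")
    # Cut the dendrogram so that at most n_clusters groups remain.
    return fcluster(tree, t=n_clusters, criterion="maxclust")


def pick_representatives(weights, labels):
    """Pick one neuron index per cluster (hypothetical selection rule):
    the member whose weight vector lies closest to the cluster mean."""
    representatives = []
    for cluster_id in np.unique(labels):
        members = np.flatnonzero(labels == cluster_id)
        centroid = weights[members].mean(axis=0)
        distances = np.linalg.norm(weights[members] - centroid, axis=1)
        representatives.append(int(members[np.argmin(distances)]))
    return representatives


def mutation_score(killed, total):
    """Fraction of mutants detected (killed) by the test data."""
    return killed / total
```

Under these assumptions, mutants would be generated only for the neurons returned by `pick_representatives`, and the speed-versus-error trade-off could be measured by comparing `mutation_score` on that reduced set against the score from the exhaustive mutant set.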
Related papers
- DeepVRegulome: DNABERT-based deep-learning framework for predicting the functional impact of short genomic variants on the human regulome [6.877744260030448]
DeepVRegulome is a deep-learning method for prediction and interpretation of functionally disruptive variants in the human regulome. We showcase its application on the TCGA glioblastoma WGS dataset in prioritizing survival-associated mutations and regulatory regions.
arXiv Detail & Related papers (2025-11-12T06:25:31Z) - Using Fourier Analysis and Mutant Clustering to Accelerate DNN Mutation Testing [0.9617606953987995]
Deep neural network (DNN) mutation analysis is a promising approach to evaluating test set adequacy. We present a technique, named DM#, for accelerating mutation testing using Fourier analysis. Our results provide empirical evidence on the effectiveness of DM# in accelerating mutation testing by 28.38%, on average, at the average cost of only 0.72% error in mutation score.
arXiv Detail & Related papers (2025-10-03T04:36:42Z) - Dynamicasome: a molecular dynamics-guided and AI-driven pathogenicity prediction catalogue for all genetic mutations [1.5071448753819772]
We show that integrating detailed conformational data extracted from molecular dynamics simulations into advanced AI-based models increases their predictive power. We carry out an exhaustive mutational analysis of the disease gene PMM2 and subject structural models of each variant to MDS. Our best performing model, a neuronal networks model, also predicts the pathogenicity of several PMM2 mutations currently considered of unknown significance.
arXiv Detail & Related papers (2025-09-23T17:33:05Z) - Improving statistical learning methods via features selection without replacement sampling and random projection [0.680740878601496]
Cancer is a genetic disease characterized by genetic and epigenetic alterations that disrupt normal gene expression. High-dimensional microarray datasets pose challenges for classification models due to the "small n, large p" problem. This study contributes to cancer biomarker discovery, offering a robust computational method for analyzing microarray data.
arXiv Detail & Related papers (2025-05-28T22:36:46Z) - Deep-Unrolling Multidimensional Harmonic Retrieval Algorithms on Neuromorphic Hardware [78.17783007774295]
This paper explores the potential of conversion-based neuromorphic algorithms for highly accurate and energy-efficient single-snapshot multidimensional harmonic retrieval. A novel method for converting the complex-valued convolutional layers and activations into spiking neural networks (SNNs) is developed. The converted SNNs achieve almost five-fold power efficiency at moderate performance loss compared to the original CNNs.
arXiv Detail & Related papers (2024-12-05T09:41:33Z) - Wafer Map Defect Classification Using Autoencoder-Based Data Augmentation and Convolutional Neural Network [4.8748194765816955]
This study proposes a novel method combining an autoencoder-based data augmentation technique with a convolutional neural network (CNN).
The proposed method achieves a classification accuracy of 98.56%, surpassing Random Forest, SVM, and Logistic Regression by 19%, 21%, and 27%, respectively.
arXiv Detail & Related papers (2024-11-17T10:19:54Z) - Effective Adaptive Mutation Rates for Program Synthesis [3.2228025627337864]
Problem-solving performance of evolutionary algorithms depends on mutation rates.
We propose an adaptive bandit-based mutation scheme that removes the need to specify a mutation rate.
Results on software synthesis and symbolic regression problems validate the effectiveness of our approach.
arXiv Detail & Related papers (2024-06-23T00:56:37Z) - An Empirical Evaluation of Manually Created Equivalent Mutants [54.02049952279685]
Fewer than 10% of manually created mutants are equivalent.
Surprisingly, our findings indicate that a significant portion of developers struggle to accurately identify equivalent mutants.
arXiv Detail & Related papers (2024-04-14T13:04:10Z) - Predicting loss-of-function impact of genetic mutations: a machine learning approach [0.0]
This paper aims to train machine learning models on the attributes of a genetic mutation to predict LoFtool scores.
These attributes included, but were not limited to, the position of a mutation on a chromosome, changes in amino acids, and changes in codons caused by the mutation.
Models were evaluated using five-fold cross-validated averages of r-squared, mean squared error, root mean squared error, mean absolute error, and explained variance.
arXiv Detail & Related papers (2024-01-26T19:27:38Z) - The effect of data augmentation and 3D-CNN depth on Alzheimer's Disease detection [51.697248252191265]
This work summarizes and strictly observes best practices regarding data handling, experimental design, and model evaluation.
We focus on Alzheimer's Disease (AD) detection, which serves as a paradigmatic example of a challenging problem in healthcare.
Within this framework, we train 15 predictive models, considering three different data augmentation strategies and five distinct 3D CNN architectures.
arXiv Detail & Related papers (2023-09-13T10:40:41Z) - Breast Ultrasound Tumor Classification Using a Hybrid Multitask CNN-Transformer Network [63.845552349914186]
Capturing global contextual information plays a critical role in breast ultrasound (BUS) image classification.
Vision Transformers have an improved capability of capturing global contextual information but may distort the local image patterns due to the tokenization operations.
In this study, we proposed a hybrid multitask deep neural network called Hybrid-MT-ESTAN, designed to perform BUS tumor classification and segmentation.
arXiv Detail & Related papers (2023-08-04T01:19:32Z) - Deep neural networks with controlled variable selection for the identification of putative causal genetic variants [0.43012765978447565]
We propose an interpretable neural network model, stabilized using ensembling, with controlled variable selection for genetic studies.
The merits of the proposed method include: (1) flexible modelling of the non-linear effect of genetic variants to improve statistical power; (2) multiple knockoffs in the input layer to rigorously control false discovery rate; (3) hierarchical layers to substantially reduce the number of weight parameters and activations to improve computational efficiency.
arXiv Detail & Related papers (2021-09-29T20:57:48Z) - High performing ensemble of convolutional neural networks for insect pest image detection [124.23179560022761]
Pest infestation is a major cause of crop damage and lost revenues worldwide.
We generate ensembles of CNNs based on different topologies.
Two new Adam algorithms for deep network optimization are proposed.
arXiv Detail & Related papers (2021-08-28T00:49:11Z) - DeepMutation: A Neural Mutation Tool [26.482720255691646]
DeepMutation is a tool wrapping our deep learning model into a fully automated tool chain.
It can generate, inject, and test mutants learned from real faults.
arXiv Detail & Related papers (2020-02-12T01:57:41Z) - Q-GADMM: Quantized Group ADMM for Communication Efficient Decentralized Machine Learning [66.18202188565922]
We propose a communication-efficient decentralized machine learning (ML) algorithm, coined Quantized Group ADMM (Q-GADMM). We develop a novel quantization method to adaptively adjust quantization levels and their probabilities, while proving the convergence of Q-GADMM for convex functions.
arXiv Detail & Related papers (2019-10-23T10:47:06Z)
This list is automatically generated from the titles and abstracts of the papers in this site.