Tailoring the Hyperparameters of a Wide-Kernel Convolutional Neural Network to Fit Different Bearing Fault Vibration Datasets
- URL: http://arxiv.org/abs/2411.15191v1
- Date: Tue, 19 Nov 2024 09:17:13 GMT
- Title: Tailoring the Hyperparameters of a Wide-Kernel Convolutional Neural Network to Fit Different Bearing Fault Vibration Datasets
- Authors: Dan Hudson, Jurgen van den Hoogen, Martin Atzmueller
- Abstract summary: State-of-the-art algorithms are reported to be almost perfect at distinguishing vibrations arising from healthy and damaged machine bearings.
But what about their application to new data?
In this paper, we are able to confirm that neural networks for bearing fault detection can be crippled by incorrect hyperparameterisation.
- Abstract: State-of-the-art algorithms are reported to be almost perfect at distinguishing the vibrations arising from healthy and damaged machine bearings, according to benchmark datasets at least. However, what about their application to new data? In this paper, we are able to confirm that neural networks for bearing fault detection can be crippled by incorrect hyperparameterisation, and also that the correct hyperparameter settings can actually change when transitioning to new data. The paper weaves together multiple methods to explain the behaviour of the hyperparameters of a wide-kernel convolutional neural network and how to set them. Since guidance already exists for generic hyperparameters like minibatch size, we focus on how to set architecture-specific hyperparameters such as the width of the convolutional kernels, a topic which might otherwise be obscure. We reflect different data properties by fusing information from seven different benchmark datasets, and our results show that the kernel size in the first layer in particular is sensitive to changes in the data. Looking deeper, we use manipulated copies of one dataset in an attempt to spot why the kernel size sometimes needs to change. The relevance of sampling rate is studied by using different levels of resampling, and spectral content is studied by increasingly filtering out high frequencies. At the end of our paper we conclude by stating clear guidance on how to set the hyperparameters of our neural network architecture.
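The architecture-specific hyperparameters discussed in the abstract, in particular the width of the first convolutional kernel, are easiest to see in code. Below is a minimal PyTorch sketch of a wide-kernel 1D CNN for vibration signals; the channel counts, strides and the `first_kernel` default are illustrative assumptions, not the authors' exact configuration.

```python
import torch
import torch.nn as nn

class WideKernelCNN(nn.Module):
    """Minimal wide-kernel 1D CNN for vibration-based fault classification.

    `first_kernel` is the architecture-specific hyperparameter the paper
    finds most sensitive to the data; all other sizes are placeholders.
    """

    def __init__(self, n_classes: int, first_kernel: int = 64):
        super().__init__()
        self.features = nn.Sequential(
            # Wide first kernel: acts like a learned filter bank over the raw signal.
            nn.Conv1d(1, 16, kernel_size=first_kernel, stride=8,
                      padding=first_kernel // 2),
            nn.BatchNorm1d(16),
            nn.ReLU(),
            # Later layers use conventional narrow kernels.
            nn.Conv1d(16, 32, kernel_size=3, padding=1),
            nn.BatchNorm1d(32),
            nn.ReLU(),
            nn.AdaptiveAvgPool1d(1),
        )
        self.classifier = nn.Linear(32, n_classes)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, 1, signal_length) raw vibration segments
        return self.classifier(self.features(x).squeeze(-1))

# Tuning the data-sensitive hyperparameter then amounts to a sweep, e.g.:
for k in (16, 32, 64, 128):
    model = WideKernelCNN(n_classes=10, first_kernel=k)
```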
Related papers
- NIDS Neural Networks Using Sliding Time Window Data Processing with Trainable Activations and its Generalization Capability
This paper presents neural networks for network intrusion detection systems (NIDS) that operate on flow data preprocessed with a time window.
It requires only eleven features, which do not rely on deep packet inspection, can be found in most NIDS datasets, and are easily obtained from conventional flow collectors.
The reported training accuracy exceeds 99% for the proposed method with as few as twenty input features.
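For context, the sliding-time-window preprocessing can be illustrated in a few lines. This is a generic NumPy sketch, not the paper's exact pipeline; the window and stride values are assumptions.

```python
import numpy as np

# flows: (n_records, n_features) array of per-flow features, time-ordered.
# The paper uses eleven flow-level features; the data here is dummy.
rng = np.random.default_rng(0)
flows = rng.normal(size=(1000, 11))

window, stride = 32, 8  # assumed values, not taken from the paper
n_windows = (len(flows) - window) // stride + 1
# Each training example is one time window of consecutive flow records.
windows = np.stack([flows[i * stride : i * stride + window]
                    for i in range(n_windows)])
print(windows.shape)  # (n_windows, 32, 11)
```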
arXiv Detail & Related papers (2024-10-24T11:36:19Z)
- Assessing Neural Network Representations During Training Using Noise-Resilient Diffusion Spectral Entropy
Entropy and mutual information in neural networks provide rich information on the learning process.
We leverage data geometry to access the underlying manifold and reliably compute these information-theoretic measures.
We show that they form noise-resistant measures of intrinsic dimensionality and relationship strength in high-dimensional simulated data.
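A rough sketch of the underlying idea, entropy over the spectrum of a data-driven diffusion operator, is shown below. The paper's estimator differs in details (e.g. bandwidth selection and normalisation), so treat this as a toy illustration only.

```python
import numpy as np

def diffusion_spectral_entropy(X, sigma=1.0, t=1):
    """Shannon entropy of the diffusion-matrix spectrum (simplified sketch)."""
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)  # pairwise squared distances
    K = np.exp(-d2 / (2 * sigma ** 2))                   # Gaussian affinities
    P = K / K.sum(axis=1, keepdims=True)                 # row-stochastic diffusion matrix
    eig = np.abs(np.linalg.eigvals(P)) ** t              # spectrum at diffusion time t
    p = eig / eig.sum()
    p = p[p > 0]
    return float(-(p * np.log(p)).sum())

X = np.random.default_rng(0).normal(size=(200, 10))  # e.g. hidden-layer representations
print(diffusion_spectral_entropy(X))
```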
arXiv Detail & Related papers (2023-12-04T01:32:42Z)
- Linking convolutional kernel size to generalization bias in face analysis CNNs
We present a causal framework for linking an architectural hyperparameter to out-of-distribution algorithmic bias.
In our experiments, we focused on measuring the causal relationship between convolutional kernel size and face analysis classification bias.
We show that modifying kernel size, even in one layer of a CNN, changes the frequency content of learned features significantly across data subgroups.
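One intuition for why kernel size governs frequency content: a length-k kernel has only k//2 + 1 frequency bins, so short kernels cannot express sharp band-pass behaviour. A small sketch with random stand-in kernels (not learned ones):

```python
import numpy as np

rng = np.random.default_rng(0)
for k in (3, 15, 63):
    kernel = rng.normal(size=k)             # stand-in for a learned conv kernel
    spectrum = np.abs(np.fft.rfft(kernel))  # kernel's frequency response
    print(f"kernel size {k:3d} -> {len(spectrum)} frequency bins")
```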
arXiv Detail & Related papers (2023-02-07T20:55:09Z)
- Learning to Learn with Generative Models of Neural Network Checkpoints
We construct a dataset of neural network checkpoints and train a generative model on the parameters.
We find that our approach successfully generates parameters for a wide range of loss prompts.
We apply our method to different neural network architectures and tasks in supervised and reinforcement learning.
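Constructing such a checkpoint dataset amounts to flattening each saved network into one parameter vector; a minimal sketch follows (the generative model itself is out of scope here, and the `Linear` stand-ins replace real checkpoints):

```python
import torch
from torch.nn.utils import parameters_to_vector

def checkpoint_to_vector(model: torch.nn.Module) -> torch.Tensor:
    # One checkpoint becomes one flat parameter vector; stacking many
    # such vectors yields the dataset the generative model is trained on.
    return parameters_to_vector(model.parameters()).detach()

nets = [torch.nn.Linear(8, 2) for _ in range(100)]  # stand-ins for saved checkpoints
dataset = torch.stack([checkpoint_to_vector(net) for net in nets])
print(dataset.shape)  # (100, n_parameters)
```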
arXiv Detail & Related papers (2022-09-26T17:59:58Z)
- AUTOMATA: Gradient Based Data Subset Selection for Compute-Efficient Hyper-parameter Tuning
We propose a gradient-based subset selection framework for hyper-parameter tuning.
We show that using gradient-based data subsets for hyper-parameter tuning achieves significantly faster turnaround times, with speedups of 3$\times$-30$\times$.
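The core idea, choosing a subset whose aggregate gradient approximates the full-data gradient, can be sketched with a simple greedy loop. This toy version is not the AUTOMATA algorithm; it only illustrates the gradient-matching principle.

```python
import numpy as np

def greedy_gradient_subset(grads: np.ndarray, budget: int) -> list:
    """Greedily pick samples whose mean gradient tracks the full mean gradient.

    grads: (n_samples, n_params) per-sample gradients (toy setting).
    """
    target = grads.mean(axis=0)
    chosen, running = [], np.zeros_like(target)
    for _ in range(budget):
        errors = np.array([np.linalg.norm(target - (running + g) / (len(chosen) + 1))
                           for g in grads])
        errors[chosen] = np.inf          # sample without replacement
        best = int(np.argmin(errors))
        chosen.append(best)
        running += grads[best]
    return chosen

grads = np.random.default_rng(0).normal(size=(500, 20))
subset = greedy_gradient_subset(grads, budget=25)
```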
arXiv Detail & Related papers (2022-03-15T19:25:01Z)
- Robust Scatterer Number Density Segmentation of Ultrasound Images
Quantitative UltraSound (QUS) aims to reveal information about the tissue microstructure using backscattered echo signals from clinical scanners.
Scatterer number density is an important property that can affect the estimation of other QUS parameters.
We employ a convolutional neural network (CNN) for the segmentation task and investigate the effect of domain shift when the network is tested on different datasets.
arXiv Detail & Related papers (2022-01-16T22:08:47Z)
- StreaMRAK a Streaming Multi-Resolution Adaptive Kernel Algorithm
Existing implementations of kernel ridge regression (KRR) require that all the data be stored in main memory.
We propose StreaMRAK - a streaming version of KRR.
We present a showcase study on two synthetic problems and the prediction of the trajectory of a double pendulum.
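The memory bottleneck is visible in the closed-form KRR solution, which materialises the full n-by-n kernel matrix; below is a minimal batch (non-streaming) sketch, which is exactly the formulation StreaMRAK is designed to avoid.

```python
import numpy as np

def krr_fit(X, y, lam=1e-3, gamma=1.0):
    # Full Gram matrix: O(n^2) memory, the cost a streaming method avoids.
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    K = np.exp(-gamma * d2)                              # Gaussian kernel
    return np.linalg.solve(K + lam * np.eye(len(X)), y)  # alpha coefficients

def krr_predict(X_train, alpha, X_new, gamma=1.0):
    d2 = ((X_new[:, None, :] - X_train[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * d2) @ alpha

rng = np.random.default_rng(0)
X, y = rng.normal(size=(300, 2)), rng.normal(size=300)
alpha = krr_fit(X, y)
preds = krr_predict(X, alpha, X[:5])
```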
arXiv Detail & Related papers (2021-08-23T21:03:09Z)
- Understanding Overparameterization in Generative Adversarial Networks
Training Generative Adversarial Networks (GANs) involves solving non-concave min-max optimization problems.
Recent theory has shown the importance of gradient descent (GD) dynamics for reaching globally optimal solutions.
We show that in an overparameterized GAN with a $1$-layer neural network generator and a linear discriminator, gradient descent ascent (GDA) converges to a global saddle point of the underlying non-concave min-max problem.
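Gradient descent ascent (GDA) is the training dynamic being analysed: simultaneously descend in the minimiser's variable and ascend in the maximiser's. A toy sketch on a simple quadratic saddle problem (not the GAN setting of the paper):

```python
# Toy min-max problem: min_x max_y f(x, y) = 0.5*x^2 + x*y - 0.5*y^2,
# whose global saddle point is the origin.
x, y, lr = 2.0, -1.5, 0.1
for _ in range(200):
    gx = x + y    # df/dx
    gy = x - y    # df/dy
    x -= lr * gx  # minimiser descends
    y += lr * gy  # maximiser ascends
print(x, y)  # both values approach 0, the saddle point
```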
arXiv Detail & Related papers (2021-04-12T16:23:37Z)
- Random Features for the Neural Tangent Kernel
We propose an efficient feature map construction for the Neural Tangent Kernel (NTK) of fully-connected ReLU networks.
We show that the dimension of the resulting features is much smaller than in other baseline feature map constructions, while achieving comparable error bounds in both theory and practice.
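For context, the (empirical) NTK is an inner product of parameter gradients; the naive feature map below is exact but as high-dimensional as the parameter count, which is precisely the cost the paper's construction reduces. This sketch shows the naive map, not the proposed method.

```python
import torch

net = torch.nn.Sequential(
    torch.nn.Linear(4, 32), torch.nn.ReLU(), torch.nn.Linear(32, 1)
)

def ntk_features(x: torch.Tensor) -> torch.Tensor:
    # Naive NTK feature map: gradient of the scalar output w.r.t. all parameters.
    out = net(x).squeeze()
    grads = torch.autograd.grad(out, list(net.parameters()))
    return torch.cat([g.flatten() for g in grads])

x1, x2 = torch.randn(1, 4), torch.randn(1, 4)
k12 = ntk_features(x1) @ ntk_features(x2)  # empirical NTK value K(x1, x2)
print(float(k12))
```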
arXiv Detail & Related papers (2021-04-03T09:08:12Z)
- Ensembled sparse-input hierarchical networks for high-dimensional datasets
We show that dense neural networks can be a practical data analysis tool in settings with small sample sizes.
The proposed method, EASIER-net, prunes the network structure by tuning only two L1-penalty parameters.
On a collection of real-world datasets with different sizes, EASIER-net selected network architectures in a data-adaptive manner and achieved higher prediction accuracy than off-the-shelf methods on average.
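The two-penalty idea can be sketched as a loss with separate L1 terms for the input layer (feature sparsity) and for the remaining weights (structure pruning). The layer split and penalty values below are illustrative, not the EASIER-net specifics.

```python
import torch
import torch.nn as nn

model = nn.Sequential(nn.Linear(100, 32), nn.ReLU(), nn.Linear(32, 1))
lam_input, lam_rest = 1e-3, 1e-4  # the two tuned L1 penalties (illustrative values)

def penalised_loss(pred, target):
    mse = nn.functional.mse_loss(pred, target)
    l1_input = model[0].weight.abs().sum()       # sparsifies which inputs are used
    l1_rest = sum(p.abs().sum()                  # prunes the rest of the network
                  for name, p in model.named_parameters()
                  if not name.startswith("0."))
    return mse + lam_input * l1_input + lam_rest * l1_rest
```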
arXiv Detail & Related papers (2020-05-11T02:08:53Z)
- A Hybrid Objective Function for Robustness of Artificial Neural Networks -- Estimation of Parameters in a Mechanical System
We consider the task of estimating parameters of a mechanical vehicle model based on acceleration profiles.
We introduce a convolutional neural network architecture that is capable of predicting the parameters for a family of vehicle models that differ in the unknown parameters.
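A parameter-regression CNN over acceleration profiles might look like the sketch below. The abstract does not spell out the components of the hybrid objective, so the loss here is only a schematic weighted blend of two error terms; the layer sizes are likewise assumptions.

```python
import torch
import torch.nn as nn

k_params = 4  # number of vehicle-model parameters to estimate (assumed)
model = nn.Sequential(
    nn.Conv1d(1, 8, kernel_size=9, padding=4), nn.ReLU(),
    nn.Conv1d(8, 16, kernel_size=9, padding=4), nn.ReLU(),
    nn.AdaptiveAvgPool1d(1), nn.Flatten(),
    nn.Linear(16, k_params),
)

def hybrid_loss(pred, target, w=0.5):
    # Schematic hybrid objective: weighted blend of two error terms
    # (the paper's actual components are not given in the abstract).
    return (w * nn.functional.mse_loss(pred, target)
            + (1 - w) * nn.functional.l1_loss(pred, target))

x = torch.randn(2, 1, 256)  # batch of acceleration profiles
loss = hybrid_loss(model(x), torch.randn(2, k_params))
```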
arXiv Detail & Related papers (2020-04-16T15:06:43Z)