Sound Event Classification in an Industrial Environment: Pipe Leakage
Detection Use Case
- URL: http://arxiv.org/abs/2205.02706v1
- Date: Thu, 5 May 2022 15:26:22 GMT
- Title: Sound Event Classification in an Industrial Environment: Pipe Leakage
Detection Use Case
- Authors: Ibrahim Shaer and Abdallah Shami
- Abstract summary: A multi-stage Machine Learning pipeline is proposed for pipe leakage detection in an industrial environment.
The proposed pipeline applies multiple steps, each addressing the environment's challenges.
The results show that the model produces excellent results with 99% accuracy and an F1-score of 0.93 and 0.9 for the respective datasets.
- Score: 3.9414768019101682
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: In this work, a multi-stage Machine Learning (ML) pipeline is proposed for
pipe leakage detection in an industrial environment. As opposed to other
industrial and urban environments, the environment under study includes many
interfering background noises, complicating the identification of leaks.
Furthermore, the harsh environmental conditions limit the amount of data
collected and impose the use of low-complexity algorithms. To address the
environment's constraints, the developed ML pipeline applies multiple steps,
each addressing the environment's challenges. The proposed ML pipeline first
reduces the data dimensionality by feature selection techniques and then
incorporates time correlations by extracting time-based features. The resultant
features are fed to a Support Vector Machine (SVM) of low-complexity that
generalizes well to a small amount of data. An extensive experimental procedure
was carried out on two datasets, one with background industrial noise and one
without, to evaluate the validity of the proposed pipeline. The SVM
hyper-parameters and parameters specific to the pipeline steps were tuned as
part of the experimental procedure. The best models obtained from the dataset
with industrial noise and leaks were applied to datasets without noise and with
and without leaks to test their generalizability. The results show that the
model produces excellent results with 99% accuracy and an F1-score of 0.93 and
0.9 for the respective datasets.
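The abstract's pipeline (feature selection for dimensionality reduction, time-based features to capture temporal correlations, then a low-complexity SVM) can be sketched as follows. This is an illustrative reconstruction using scikit-learn on synthetic data, not the authors' implementation: the feature-selection method, the rolling-mean time features, and all parameter values are assumptions.

```python
# Hedged sketch of a leak-detection pipeline of the kind described:
# (1) extract time-based features, (2) reduce dimensionality by feature
# selection, (3) classify with a low-complexity SVM. Data is synthetic.
import numpy as np
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 40))    # stand-in for per-frame acoustic features
y = rng.integers(0, 2, size=200)  # 1 = leak, 0 = no leak (synthetic labels)

def add_time_features(X, window=5):
    """Append a rolling mean over the previous `window` frames to each row,
    one plausible way to incorporate time correlations."""
    rolled = np.stack([
        X[max(0, i - window + 1): i + 1].mean(axis=0) for i in range(len(X))
    ])
    return np.hstack([X, rolled])

pipe = Pipeline([
    ("scale", StandardScaler()),
    ("select", SelectKBest(f_classif, k=10)),  # dimensionality reduction
    ("svm", SVC(kernel="rbf", C=1.0)),         # low-complexity classifier
])
Xt = add_time_features(X)
pipe.fit(Xt, y)
print(pipe.score(Xt, y))  # training-set accuracy on the synthetic data
```

A pipeline of this shape suits the paper's constraints: `SelectKBest` keeps only the most informative features so the SVM sees a small input, and an SVM generalizes reasonably well from the limited data the harsh environment allows.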
Related papers
- Learning with Noisy Foundation Models [95.50968225050012]
This paper is the first work to comprehensively understand and analyze the nature of noise in pre-training datasets.
We propose a tuning method (NMTune) to affine the feature space to mitigate the malignant effect of noise and improve generalization.
arXiv Detail & Related papers (2024-03-11T16:22:41Z)
- Let's Roll: Synthetic Dataset Analysis for Pedestrian Detection Across Different Shutter Types [7.0441427250832644]
This paper studies the impact of different shutter mechanisms on machine learning (ML) object detection models on a synthetic dataset.
In particular, we train and evaluate mainstream detection models with our synthetically-generated paired GS and RS datasets.
arXiv Detail & Related papers (2023-09-15T04:07:42Z)
- Can We Transfer Noise Patterns? A Multi-environment Spectrum Analysis Model Using Generated Cases [10.876490928902838]
Spectral data-based testing devices suffer from complex noise patterns when deployed in non-laboratory environments.
We propose a noise patterns transferring model, which takes the spectrum of standard water samples in different environments as cases and learns the differences in their noise patterns.
We generate a sample-to-sample case-base to exclude the interference of sample-level noise on dataset-level noise learning.
arXiv Detail & Related papers (2023-08-02T13:29:31Z)
- Multisample Flow Matching: Straightening Flows with Minibatch Couplings [38.82598694134521]
Simulation-free methods for training continuous-time generative models construct probability paths that go between noise distributions and individual data samples.
We propose Multisample Flow Matching, a more general framework that uses non-trivial couplings between data and noise samples.
We show that our proposed methods improve sample consistency on downsampled ImageNet data sets, and lead to better low-cost sample generation.
arXiv Detail & Related papers (2023-04-28T11:33:08Z)
- Improving the Robustness of Summarization Models by Detecting and Removing Input Noise [50.27105057899601]
We present a large empirical study quantifying the sometimes severe loss in performance from different types of input noise for a range of datasets and model sizes.
We propose a light-weight method for detecting and removing such noise in the input during model inference without requiring any training, auxiliary models, or even prior knowledge of the type of noise.
arXiv Detail & Related papers (2022-12-20T00:33:11Z)
- Decision Forest Based EMG Signal Classification with Low Volume Dataset Augmented with Random Variance Gaussian Noise [51.76329821186873]
We produce a model that can classify six different hand gestures with a limited number of samples that generalizes well to a wider audience.
We appeal to more elementary methods, such as the use of random bounds on a signal, but aim to show the power these methods can carry in an online setting.
arXiv Detail & Related papers (2022-06-29T23:22:18Z)
- Robust Meta-learning with Sampling Noise and Label Noise via Eigen-Reptile [78.1212767880785]
The meta-learner is prone to overfitting since there are only a few available samples.
When handling the data with noisy labels, the meta-learner could be extremely sensitive to label noise.
We present Eigen-Reptile (ER), which updates the meta-parameters with the main direction of historical task-specific parameters.
arXiv Detail & Related papers (2022-06-04T08:48:02Z)
- Noise-Aware Statistical Inference with Differentially Private Synthetic Data [0.0]
We show that simply analysing DP synthetic data as if it were real does not produce valid inferences of population-level quantities.
We tackle this problem by combining synthetic data analysis techniques from the field of multiple imputation, and synthetic data generation.
We develop a novel noise-aware synthetic data generation algorithm NAPSU-MQ using the principle of maximum entropy.
arXiv Detail & Related papers (2022-05-28T16:59:46Z)
- Noise-Resistant Deep Metric Learning with Probabilistic Instance Filtering [59.286567680389766]
Noisy labels are commonly found in real-world data, which cause performance degradation of deep neural networks.
We propose Probabilistic Ranking-based Instance Selection with Memory (PRISM) approach for DML.
PRISM calculates the probability of a label being clean, and filters out potentially noisy samples.
arXiv Detail & Related papers (2021-08-03T12:15:25Z)
- Bridging the Gap Between Clean Data Training and Real-World Inference for Spoken Language Understanding [76.89426311082927]
Existing models are trained on clean data, which causes a gap between clean data training and real-world inference.
We propose a method from the perspective of domain adaptation, by which both high- and low-quality samples are embedded into a similar vector space.
Experiments on the widely used Snips dataset and a large-scale in-house dataset (10 million training examples) demonstrate that this method not only outperforms the baseline models on a real-world (noisy) corpus but also enhances robustness, that is, it produces high-quality results in a noisy environment.
arXiv Detail & Related papers (2021-04-13T17:54:33Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences.