Sound Event Classification in an Industrial Environment: Pipe Leakage
Detection Use Case
- URL: http://arxiv.org/abs/2205.02706v1
- Date: Thu, 5 May 2022 15:26:22 GMT
- Title: Sound Event Classification in an Industrial Environment: Pipe Leakage
Detection Use Case
- Authors: Ibrahim Shaer and Abdallah Shami
- Abstract summary: A multi-stage Machine Learning pipeline is proposed for pipe leakage detection in an industrial environment.
The proposed pipeline applies multiple steps, each addressing the environment's challenges.
The results show that the model produces excellent results with 99% accuracy and an F1-score of 0.93 and 0.9 for the respective datasets.
- Score: 3.9414768019101682
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: In this work, a multi-stage Machine Learning (ML) pipeline is proposed for
pipe leakage detection in an industrial environment. As opposed to other
industrial and urban environments, the environment under study includes many
interfering background noises, complicating the identification of leaks.
Furthermore, the harsh environmental conditions limit the amount of data
collected and impose the use of low-complexity algorithms. To address the
environment's constraints, the developed ML pipeline applies multiple steps,
each addressing the environment's challenges. The proposed ML pipeline first
reduces the data dimensionality by feature selection techniques and then
incorporates time correlations by extracting time-based features. The resultant
features are fed to a low-complexity Support Vector Machine (SVM) that
generalizes well to a small amount of data. An extensive experimental procedure
was carried out on two datasets, one with background industrial noise and one
without, to evaluate the validity of the proposed pipeline. The SVM
hyper-parameters and parameters specific to the pipeline steps were tuned as
part of the experimental procedure. The best models obtained from the dataset
with industrial noise and leaks were applied to datasets without noise and with
and without leaks to test their generalizability. The results show that the
model produces excellent results with 99% accuracy and an F1-score of 0.93 and
0.9 for the respective datasets.
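The first two pipeline stages described in the abstract (dimensionality reduction by feature selection, then time-based feature extraction) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the abstract does not name the selection technique or the time-based features, so variance ranking and sliding-window mean and standard deviation are stand-ins chosen here, and the names `select_by_variance`, `window_features`, and the toy `frames` data are hypothetical.

```python
import statistics
from typing import List, Sequence


def select_by_variance(frames: Sequence[Sequence[float]], k: int) -> List[int]:
    """Return the indices of the k features with the highest variance.

    Assumption: variance ranking stands in for the paper's unspecified
    feature selection technique.
    """
    n_features = len(frames[0])
    variances = [
        statistics.pvariance([frame[j] for frame in frames])
        for j in range(n_features)
    ]
    ranked = sorted(range(n_features), key=lambda j: variances[j], reverse=True)
    return sorted(ranked[:k])


def window_features(
    frames: Sequence[Sequence[float]], keep: Sequence[int], window: int
) -> List[List[float]]:
    """Summarize each run of `window` consecutive frames.

    For every kept feature, the mean and population standard deviation over
    the window are emitted, so each output row carries short-term time context.
    Assumption: these statistics stand in for the paper's time-based features.
    """
    rows = []
    for start in range(len(frames) - window + 1):
        chunk = frames[start:start + window]
        row = []
        for j in keep:
            values = [frame[j] for frame in chunk]
            row.append(statistics.fmean(values))
            row.append(statistics.pstdev(values))
        rows.append(row)
    return rows


# Hypothetical toy data: 4 audio frames, 3 spectral features per frame.
frames = [
    [0.0, 1.0, 5.0],
    [0.0, 2.0, 5.1],
    [0.0, 3.0, 4.9],
    [0.0, 4.0, 5.0],
]

keep = select_by_variance(frames, k=2)          # drops the constant feature 0
rows = window_features(frames, keep, window=2)  # one row per 2-frame window

print("kept feature indices:", keep)
print("windowed feature rows:", len(rows), "x", len(rows[0]))
```

In a fuller version, the windowed rows would be passed to a low-complexity classifier such as scikit-learn's `sklearn.svm.SVC`, with `k`, `window`, and the SVM hyper-parameters tuned together, which mirrors the joint tuning the abstract describes.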
Related papers
- Flow Matching for Atmospheric Retrieval of Exoplanets: Where Reliability meets Adaptive Noise Levels [38.84835238599221]
Flow matching posterior estimation (FMPE) is a new machine learning approach to atmospheric retrieval.
FMPE trains about 3 times faster than neural posterior estimation (NPE) and yields higher IS efficiencies.
IS successfully corrects inaccurate ML results, identifies model failures via low efficiencies, and provides accurate estimates of the Bayesian evidence.
(2024-10-28T19:28:07Z)
- RecFlow: An Industrial Full Flow Recommendation Dataset [66.06445386541122]
Industrial recommendation systems rely on the multi-stage pipeline to balance effectiveness and efficiency when delivering items to users.
We introduce RecFlow, an industrial full flow recommendation dataset designed to bridge the gap between offline RS benchmarks and the real online environment.
Our dataset comprises 38M interactions from 42K users across nearly 9M items with additional 1.9B stage samples collected from 9.3M online requests over 37 days and spanning 6 stages.
(2024-10-28T09:36:03Z)
- SYNOSIS: Image synthesis pipeline for machine vision in metal surface inspection [1.1802456989915404]
We introduce a complete pipeline which describes in detail how to approach image synthesis for surface inspection.
The pipeline is evaluated in detail for milled and sandblasted aluminum surfaces.
(2024-10-18T19:46:12Z)
- Learning with Noisy Foundation Models [95.50968225050012]
This paper is the first work to comprehensively understand and analyze the nature of noise in pre-training datasets.
We propose a tuning method (NMTune) to affine the feature space to mitigate the malignant effect of noise and improve generalization.
(2024-03-11T16:22:41Z)
- Let's Roll: Synthetic Dataset Analysis for Pedestrian Detection Across Different Shutter Types [7.0441427250832644]
This paper studies the impact of different shutter mechanisms on machine learning (ML) object detection models on a synthetic dataset.
In particular, we train and evaluate mainstream detection models with our synthetically-generated paired GS and RS datasets.
(2023-09-15T04:07:42Z)
- Can We Transfer Noise Patterns? A Multi-environment Spectrum Analysis Model Using Generated Cases [10.876490928902838]
Spectral data-based testing devices suffer from complex noise patterns when deployed in non-laboratory environments.
We propose a noise patterns transferring model, which takes the spectrum of standard water samples in different environments as cases and learns the differences in their noise patterns.
We generate a sample-to-sample case-base to exclude the interference of sample-level noise on dataset-level noise learning.
(2023-08-02T13:29:31Z)
- Improving the Robustness of Summarization Models by Detecting and Removing Input Noise [50.27105057899601]
We present a large empirical study quantifying the sometimes severe loss in performance from different types of input noise for a range of datasets and model sizes.
We propose a light-weight method for detecting and removing such noise in the input during model inference without requiring any training, auxiliary models, or even prior knowledge of the type of noise.
(2022-12-20T00:33:11Z)
- Decision Forest Based EMG Signal Classification with Low Volume Dataset Augmented with Random Variance Gaussian Noise [51.76329821186873]
We produce a model that can classify six different hand gestures with a limited number of samples that generalizes well to a wider audience.
We appeal to a set of more elementary methods such as the use of random bounds on a signal, but desire to show the power these methods can carry in an online setting.
(2022-06-29T23:22:18Z)
- Robust Meta-learning with Sampling Noise and Label Noise via Eigen-Reptile [78.1212767880785]
The meta-learner is prone to overfitting since there are only a few available samples.
When handling the data with noisy labels, the meta-learner could be extremely sensitive to label noise.
We present Eigen-Reptile (ER) that updates the meta-parameters with the main direction of historical task-specific parameters.
(2022-06-04T08:48:02Z)
- Bridging the Gap Between Clean Data Training and Real-World Inference for Spoken Language Understanding [76.89426311082927]
Existing models are trained on clean data, which causes a gap between clean data training and real-world inference.
We propose a method from the perspective of domain adaptation, by which both high- and low-quality samples are embedded into a similar vector space.
Experiments on the widely-used dataset, Snips, and large scale in-house dataset (10 million training examples) demonstrate that this method not only outperforms the baseline models on real-world (noisy) corpus but also enhances the robustness, that is, it produces high-quality results under a noisy environment.
(2021-04-13T17:54:33Z)
This list is automatically generated from the titles and abstracts of the papers on this site.