Related papers: A Maritime Industry Experience for Vessel Operational Anomaly Detection: Utilizing Deep Learning Augmented with Lightweight Interpretable Models

URL: http://arxiv.org/abs/2401.00112v2
Date: Sat, 25 Jan 2025 00:34:06 GMT
Title: A Maritime Industry Experience for Vessel Operational Anomaly Detection: Utilizing Deep Learning Augmented with Lightweight Interpretable Models
Authors: Mahshid Helali Moghadam, Mateusz Rzymowski, Lukasz Kulas
Abstract summary: This study showcases a vessel operational anomaly detection approach that utilizes semi-supervised deep learning models augmented with lightweight interpretable surrogate models. We leverage standard and Long Short-Term Memory (LSTM) autoencoders trained on normal operational data and tested with real anomaly-revealing data.
Score: 0.19116784879310028
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: This study presents an industry experience showcasing a vessel operational anomaly detection approach that utilizes semi-supervised deep learning models augmented with lightweight interpretable surrogate models, applied to an industrial sensorized vessel, called TUCANA. We leverage standard and Long Short-Term Memory (LSTM) autoencoders trained on normal operational data and tested with real anomaly-revealing data. We then provide a projection of the inference results on a lower-dimension data map generated by t-distributed stochastic neighbor embedding (t-SNE), which serves as an unsupervised baseline and shows the distribution of the identified anomalies. We also develop lightweight surrogate models using random forest and decision tree to promote transparency and interpretability for the inference results of the deep learning models and assist the engineer with an agile assessment of the flagged anomalies. The approach is empirically evaluated using real data from TUCANA. The empirical results show higher performance of the LSTM autoencoder -- as the anomaly detection module with effective capturing of temporal dependencies in the data -- and demonstrate the practicality of the lightweight surrogate models in providing helpful interpretability, which leads to higher efficiency for the engineer's decision-making.
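
Illustrative sketch (not the authors' implementation): the pipeline described in the abstract, a reconstruction model trained only on normal data plus a small interpretable surrogate fitted to its flags, can be outlined as follows. Several things here are assumptions made for brevity: a PCA-based linear autoencoder stands in for the standard and LSTM autoencoders, a depth-1 decision stump stands in for the decision tree and random forest surrogates, the sensor data is synthetic, and every name (simulate, LOADINGS, fit_stump, the 99th-percentile threshold) is hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
LOADINGS = np.array([1.0, 0.8, -0.5])  # hypothetical coupling between three sensors


def simulate(n):
    """Synthetic stand-in for vessel data: three channels driven by one latent signal."""
    latent = rng.normal(size=(n, 1))
    return latent * LOADINGS + 0.05 * rng.normal(size=(n, 3))


def fit_linear_autoencoder(x_normal, n_components):
    """PCA-based linear autoencoder (assumption: replaces the deep autoencoders)."""
    mean = x_normal.mean(axis=0)
    # Rows of vt are principal directions; keep the leading ones as the code space.
    _, _, vt = np.linalg.svd(x_normal - mean, full_matrices=False)
    return mean, vt[:n_components]


def reconstruction_error(x, mean, components):
    """Mean squared reconstruction error per sample."""
    centered = x - mean
    reconstructed = centered @ components.T @ components
    return np.mean((centered - reconstructed) ** 2, axis=1)


def fit_stump(x, flags):
    """Depth-1 surrogate: the single threshold rule that best reproduces the flags."""
    best = None
    for j in range(x.shape[1]):
        for t in np.unique(x[:, j]):
            for greater in (True, False):
                pred = (x[:, j] > t) if greater else (x[:, j] <= t)
                agreement = float(np.mean(pred == flags))
                if best is None or agreement > best[3]:
                    best = (j, float(t), greater, agreement)
    return best


# Train on normal operation only, then set the alarm threshold from training error.
x_train = simulate(400)
mean, components = fit_linear_autoencoder(x_train, n_components=1)
threshold = np.percentile(reconstruction_error(x_train, mean, components), 99)

# Test set: 100 normal samples followed by 20 samples with a fault on sensor 2.
x_anomalous = simulate(20)
x_anomalous[:, 2] += 4.0
x_test = np.vstack([simulate(100), x_anomalous])
flags = reconstruction_error(x_test, mean, components) > threshold

# The surrogate explains the detector's flags with one human-readable rule.
feature, cut, greater, agreement = fit_stump(x_test, flags)
op = ">" if greater else "<="
print(f"flagged {int(flags.sum())} of {len(flags)} samples")
print(f"surrogate rule: sensor[{feature}] {op} {cut:.2f} (agreement {agreement:.2f})")
```

The surrogate is fitted to the detector's flags rather than to ground-truth labels, so its rule describes what the detector is reacting to, which is the role the abstract assigns to the lightweight interpretable models.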

Related papers

Strengthening Anomaly Awareness [0.0]
We present a refined version of the Anomaly Awareness framework for enhancing unsupervised anomaly detection. Our approach introduces minimal supervision into Variational Autoencoders (VAEs) through a two-stage training strategy.
arXiv Detail & Related papers (2025-04-15T16:52:22Z)
Meta Learning-Driven Iterative Refinement for Robust Anomaly Detection in Industrial Inspection [9.132399905884364]
We propose to leverage the adaptation ability of meta learning approaches to identify and reject noisy training data to improve the learning process. In our model, we employ Model Agnostic Meta Learning (MAML) and an iterative refinement process through an Inter-Quartile Range rejection scheme to enhance their adaptability and robustness.
arXiv Detail & Related papers (2025-03-03T14:11:41Z)
Deep evolving semi-supervised anomaly detection [14.027613461156864]
The aim of this paper is to formalise the task of continual semi-supervised anomaly detection (CSAD). The paper introduces a baseline model of a variational autoencoder (VAE) to work with semi-supervised data along with a continual learning method of deep generative replay with outlier rejection.
arXiv Detail & Related papers (2024-12-01T15:48:37Z)
Remaining Useful Life Prediction: A Study on Multidimensional Industrial Signal Processing and Efficient Transfer Learning Based on Large Language Models [6.118896920507198]
This paper introduces an innovative regression framework utilizing large language models (LLMs) for RUL prediction. Experiments on the Turbofan engine's RUL prediction task show that the proposed model surpasses state-of-the-art (SOTA) methods. With minimal target domain data for fine-tuning, the model outperforms SOTA methods trained on full target domain data.
arXiv Detail & Related papers (2024-10-04T04:21:53Z)
Anomaly Detection of Tabular Data Using LLMs [54.470648484612866]
We show that pre-trained large language models (LLMs) are zero-shot batch-level anomaly detectors. We propose an end-to-end fine-tuning strategy to bring out the potential of LLMs in detecting real anomalies.
arXiv Detail & Related papers (2024-06-24T04:17:03Z)
Low-rank finetuning for LLMs: A fairness perspective [54.13240282850982]
Low-rank approximation techniques have become the de facto standard for fine-tuning Large Language Models. This paper investigates the effectiveness of these methods in capturing the shift of fine-tuning datasets from the initial pre-trained data distribution. We show that low-rank fine-tuning inadvertently preserves undesirable biases and toxic behaviors.
arXiv Detail & Related papers (2024-05-28T20:43:53Z)
The Artificial Neural Twin -- Process Optimization and Continual Learning in Distributed Process Chains [3.79770624632814]
We propose the Artificial Neural Twin, which combines concepts from model predictive control, deep learning, and sensor networks. Our approach introduces differentiable data fusion to estimate the state of distributed process steps. By treating the interconnected process steps as a quasi neural-network, we can backpropagate loss gradients for process optimization or model fine-tuning to process parameters.
arXiv Detail & Related papers (2024-03-27T08:34:39Z)
Reliability in Semantic Segmentation: Can We Use Synthetic Data? [69.28268603137546]
We show for the first time how synthetic data can be specifically generated to assess comprehensively the real-world reliability of semantic segmentation models. This synthetic data is employed to evaluate the robustness of pretrained segmenters. We demonstrate how our approach can be utilized to enhance the calibration and OOD detection capabilities of segmenters.
arXiv Detail & Related papers (2023-12-14T18:56:07Z)
A PAC-Bayesian Perspective on the Interpolating Information Criterion [54.548058449535155]
We show how a PAC-Bayes bound is obtained for a general class of models, characterizing factors which influence performance in the interpolating regime. We quantify how the test error for overparameterized models achieving effectively zero training error depends on the quality of the implicit regularization imposed by e.g. the combination of model, parameter-initialization scheme.
arXiv Detail & Related papers (2023-11-13T01:48:08Z)
EdgeFD: An Edge-Friendly Drift-Aware Fault Diagnosis System for Industrial IoT [0.0]
We propose the Drift-Aware Weight Consolidation (DAWC) to mitigate the challenges posed by frequent data drift in the industrial Internet of Things (IIoT). DAWC efficiently manages multiple data drift scenarios, minimizing the need for constant model fine-tuning on edge devices. We have also developed a comprehensive diagnosis and visualization platform.
arXiv Detail & Related papers (2023-10-07T06:48:07Z)
Quality In / Quality Out: Data quality more relevant than model choice in anomaly detection with the UGR'16 [0.29998889086656577]
We show that relatively minor modifications on a benchmark dataset cause significantly more impact on model performance than the specific ML technique considered. We also show that the measured model performance is uncertain, as a result of labelling inaccuracies.
arXiv Detail & Related papers (2023-05-31T12:03:12Z)
End-to-End Meta-Bayesian Optimisation with Transformer Neural Processes [52.818579746354665]
This paper proposes the first end-to-end differentiable meta-BO framework that generalises neural processes to learn acquisition functions via transformer architectures. We enable this end-to-end framework with reinforcement learning (RL) to tackle the lack of labelled acquisition data.
arXiv Detail & Related papers (2023-05-25T10:58:46Z)
On Robust Numerical Solver for ODE via Self-Attention Mechanism [82.95493796476767]
We explore training efficient and robust AI-enhanced numerical solvers with a small data size by mitigating intrinsic noise disturbances. We first analyze the ability of the self-attention mechanism to regulate noise in supervised learning and then propose a simple-yet-effective numerical solver, Attr, which introduces an additive self-attention mechanism to the numerical solution of differential equations.
arXiv Detail & Related papers (2023-02-05T01:39:21Z)
How robust are pre-trained models to distribution shift? [82.08946007821184]
We show how spurious correlations affect the performance of popular self-supervised learning (SSL) and auto-encoder based models (AE). We develop a novel evaluation scheme with the linear head trained on out-of-distribution (OOD) data, to isolate the performance of the pre-trained models from a potential bias of the linear head used for evaluation.
arXiv Detail & Related papers (2022-06-17T16:18:28Z)
Improved Speech Emotion Recognition using Transfer Learning and Spectrogram Augmentation [56.264157127549446]
Speech emotion recognition (SER) is a challenging task that plays a crucial role in natural human-computer interaction. One of the main challenges in SER is data scarcity. We propose a transfer learning strategy combined with spectrogram augmentation.
arXiv Detail & Related papers (2021-08-05T10:39:39Z)
Real-World Anomaly Detection by using Digital Twin Systems and Weakly-Supervised Learning [3.0100975935933567]
We present novel weakly-supervised approaches to anomaly detection for industrial settings. The approaches make use of a Digital Twin to generate a training dataset which simulates the normal operation of the machinery. The performance of the proposed methods is compared against various state-of-the-art anomaly detection algorithms on an application to a real-world dataset.
arXiv Detail & Related papers (2020-11-12T10:15:56Z)
Unsupervised Multi-Modal Representation Learning for Affective Computing with Multi-Corpus Wearable Data [16.457778420360537]
We propose an unsupervised framework to reduce the reliance on human supervision. The proposed framework utilizes two stacked convolutional autoencoders to learn latent representations from wearable electrocardiogram (ECG) and electrodermal activity (EDA) signals. Our method outperforms current state-of-the-art results that have performed arousal detection on the same datasets.
arXiv Detail & Related papers (2020-08-24T22:01:55Z)
Towards Interpretable Deep Learning Models for Knowledge Tracing [62.75876617721375]
We propose to adopt the post-hoc method to tackle the interpretability issue for deep learning based knowledge tracing (DLKT) models. Specifically, we focus on applying the layer-wise relevance propagation (LRP) method to interpret an RNN-based DLKT model. Experiment results show the feasibility of using the LRP method for interpreting the DLKT model's predictions.
arXiv Detail & Related papers (2020-05-13T04:03:21Z)
Interpreting Rate-Distortion of Variational Autoencoder and Using Model Uncertainty for Anomaly Detection [5.491655566898372]
We build a scalable machine learning system for unsupervised anomaly detection via representation learning. We revisit VAE from the perspective of information theory to provide some theoretical foundations on using the reconstruction error. We show empirically the competitive performance of our approach on benchmark datasets.
arXiv Detail & Related papers (2020-05-05T00:03:48Z)
Unsupervised Anomaly Detection with Adversarial Mirrored AutoEncoders [51.691585766702744]
We propose a variant of Adversarial Autoencoder which uses a mirrored Wasserstein loss in the discriminator to enforce better semantic-level reconstruction. We put forward an alternative measure of anomaly score to replace the reconstruction-based metric. Our method outperforms the current state-of-the-art methods for anomaly detection on several OOD detection benchmarks.
arXiv Detail & Related papers (2020-03-24T08:26:58Z)
Data-Driven Symbol Detection via Model-Based Machine Learning [117.58188185409904]
We review a data-driven framework to symbol detection design which combines machine learning (ML) and model-based algorithms. In this hybrid approach, well-known channel-model-based algorithms are augmented with ML-based algorithms to remove their channel-model-dependence. Our results demonstrate that these techniques can yield near-optimal performance of model-based algorithms without knowing the exact channel input-output statistical relationship.
arXiv Detail & Related papers (2020-02-14T06:58:27Z)

This list is automatically generated from the titles and abstracts of the papers on this site.