Related papers: Self-Supervised Anomaly Detection in the Wild: Favor Joint Embeddings Methods

Self-Supervised Anomaly Detection in the Wild: Favor Joint Embeddings Methods

URL: http://arxiv.org/abs/2410.04289v1
Date: Sat, 5 Oct 2024 21:27:47 GMT
Title: Self-Supervised Anomaly Detection in the Wild: Favor Joint Embeddings Methods
Authors: Daniel Otero, Rafael Mateus, Randall Balestriero,
Abstract summary: Self-Supervised Learning (SSL) offers a promising approach by learning robust representations from unlabeled data. This paper provides a comprehensive evaluation of SSL methods for real-world anomaly detection, focusing on sewer infrastructure.
Score: 12.277762115388187
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Accurate anomaly detection is critical in vision-based infrastructure inspection, where it helps prevent costly failures and enhances safety. Self-Supervised Learning (SSL) offers a promising approach by learning robust representations from unlabeled data. However, its application in anomaly detection remains underexplored. This paper addresses this gap by providing a comprehensive evaluation of SSL methods for real-world anomaly detection, focusing on sewer infrastructure. Using the Sewer-ML dataset, we evaluate lightweight models such as ViT-Tiny and ResNet-18 across SSL frameworks, including BYOL, Barlow Twins, SimCLR, DINO, and MAE, under varying class imbalance levels. Through 250 experiments, we rigorously assess the performance of these SSL methods to ensure a robust and comprehensive evaluation. Our findings highlight the superiority of joint-embedding methods like SimCLR and Barlow Twins over reconstruction-based approaches such as MAE, which struggle to maintain performance under class imbalance. Furthermore, we find that the SSL model choice is more critical than the backbone architecture. Additionally, we emphasize the need for better label-free assessments of SSL representations, as current methods like RankMe fail to adequately evaluate representation quality, making cross-validation without labels infeasible. Despite the remaining performance gap between SSL and supervised models, these findings highlight the potential of SSL to enhance anomaly detection, paving the way for further research in this underexplored area of SSL applications.

Related papers

Revisiting semi-supervised learning in the era of foundation models [28.414667991336067]
Semi-supervised learning (SSL) leverages abundant unlabeled data alongside limited labeled data to enhance learning. We develop new SSL benchmark datasets where frozen vision foundation models (VFMs) underperform and systematically evaluate representative SSL methods. We make a surprising observation: parameter-efficient fine-tuning (PEFT) using only labeled data often matches SSL performance, even without leveraging unlabeled data. To overcome the notorious issue of noisy pseudo-labels, we propose ensembling multiple PEFT approaches and VFM backbones to produce more robust pseudo-labels.
arXiv Detail & Related papers (2025-03-12T18:01:10Z)
SeMi: When Imbalanced Semi-Supervised Learning Meets Mining Hard Examples [54.760757107700755]
Semi-Supervised Learning (SSL) can leverage abundant unlabeled data to boost model performance. The class-imbalanced data distribution in real-world scenarios poses great challenges to SSL, resulting in performance degradation. We propose a method that enhances the performance of Imbalanced Semi-Supervised Learning by Mining Hard Examples (SeMi)
arXiv Detail & Related papers (2025-01-10T14:35:16Z)
On the Discriminability of Self-Supervised Representation Learning [38.598160031349686]
Self-supervised learning (SSL) has recently achieved significant success in downstream visual tasks. A notable gap still exists between SSL and supervised learning (SL), especially in complex downstream tasks.
arXiv Detail & Related papers (2024-07-18T14:18:03Z)
A Closer Look at Benchmarking Self-Supervised Pre-training with Image Classification [51.35500308126506]
Self-supervised learning (SSL) is a machine learning approach where the data itself provides supervision, eliminating the need for external labels. We study how classification-based evaluation protocols for SSL correlate and how well they predict downstream performance on different dataset types.
arXiv Detail & Related papers (2024-07-16T23:17:36Z)
Reinforcement Learning-Guided Semi-Supervised Learning [20.599506122857328]
We propose a novel Reinforcement Learning Guided SSL method, RLGSSL, that formulates SSL as a one-armed bandit problem. RLGSSL incorporates a carefully designed reward function that balances the use of labeled and unlabeled data to enhance generalization performance. We demonstrate the effectiveness of RLGSSL through extensive experiments on several benchmark datasets and show that our approach achieves consistent superior performance compared to state-of-the-art SSL methods.
arXiv Detail & Related papers (2024-05-02T21:52:24Z)
Active Semi-Supervised Learning by Exploring Per-Sample Uncertainty and Consistency [30.94964727745347]
We propose a method called Active Semi-supervised Learning (ASSL) to improve accuracy of models at a lower cost. ASSL involves more dynamic model updates than Active Learning (AL) due to the use of unlabeled data. ASSL achieved about 5.3 times higher computational efficiency than Semi-supervised Learning (SSL) while achieving the same performance.
arXiv Detail & Related papers (2023-03-15T22:58:23Z)
Complementing Semi-Supervised Learning with Uncertainty Quantification [6.612035830987296]
We propose a novel unsupervised uncertainty-aware objective that relies on aleatoric and epistemic uncertainty quantification. Our results outperform the state-of-the-art results on complex datasets such as CIFAR-100 and Mini-ImageNet.
arXiv Detail & Related papers (2022-07-22T00:15:02Z)
OpenLDN: Learning to Discover Novel Classes for Open-World Semi-Supervised Learning [110.40285771431687]
Semi-supervised learning (SSL) is one of the dominant approaches to address the annotation bottleneck of supervised learning. Recent SSL methods can effectively leverage a large repository of unlabeled data to improve performance while relying on a small set of labeled data. This work introduces OpenLDN that utilizes a pairwise similarity loss to discover novel classes.
arXiv Detail & Related papers (2022-07-05T18:51:05Z)
Collaborative Intelligence Orchestration: Inconsistency-Based Fusion of Semi-Supervised Learning and Active Learning [60.26659373318915]
Active learning (AL) and semi-supervised learning (SSL) are two effective, but often isolated, means to alleviate the data-hungry problem. We propose an innovative Inconsistency-based virtual aDvErial algorithm to further investigate SSL-AL's potential superiority. Two real-world case studies visualize the practical industrial value of applying and deploying the proposed data sampling algorithm.
arXiv Detail & Related papers (2022-06-07T13:28:43Z)
Robust Deep Semi-Supervised Learning: A Brief Introduction [63.09703308309176]
Semi-supervised learning (SSL) aims to improve learning performance by leveraging unlabeled data when labels are insufficient. SSL with deep models has proven to be successful on standard benchmark tasks. However, they are still vulnerable to various robustness threats in real-world applications.
arXiv Detail & Related papers (2022-02-12T04:16:41Z)
Self-supervised Learning is More Robust to Dataset Imbalance [65.84339596595383]
We investigate self-supervised learning under dataset imbalance. Off-the-shelf self-supervised representations are already more robust to class imbalance than supervised representations. We devise a re-weighted regularization technique that consistently improves the SSL representation quality on imbalanced datasets.
arXiv Detail & Related papers (2021-10-11T06:29:56Z)
Trash to Treasure: Harvesting OOD Data with Cross-Modal Matching for Open-Set Semi-Supervised Learning [101.28281124670647]
Open-set semi-supervised learning (open-set SSL) investigates a challenging but practical scenario where out-of-distribution (OOD) samples are contained in the unlabeled data. We propose a novel training mechanism that could effectively exploit the presence of OOD data for enhanced feature learning. Our approach substantially lifts the performance on open-set SSL and outperforms the state-of-the-art by a large margin.
arXiv Detail & Related papers (2021-08-12T09:14:44Z)

This list is automatically generated from the titles and abstracts of the papers in this site.