A Survey on Visual Anomaly Detection: Challenge, Approach, and Prospect
- URL: http://arxiv.org/abs/2401.16402v1
- Date: Mon, 29 Jan 2024 18:41:21 GMT
- Title: A Survey on Visual Anomaly Detection: Challenge, Approach, and Prospect
- Authors: Yunkang Cao, Xiaohao Xu, Jiangning Zhang, Yuqi Cheng, Xiaonan Huang,
Guansong Pang, Weiming Shen
- Abstract summary: Visual Anomaly Detection (VAD) endeavors to pinpoint deviations from the concept of normality in visual data, widely applied across diverse domains, e.g., industrial defect inspection, and medical lesion detection.
This survey comprehensively examines recent advancements in VAD by identifying three primary challenges: 1) scarcity of training data, 2) diversity of visual modalities, and 3) complexity of hierarchical anomalies.
- Score: 29.006716009327032
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Visual Anomaly Detection (VAD) endeavors to pinpoint deviations from the
concept of normality in visual data, widely applied across diverse domains,
e.g., industrial defect inspection, and medical lesion detection. This survey
comprehensively examines recent advancements in VAD by identifying three
primary challenges: 1) scarcity of training data, 2) diversity of visual
modalities, and 3) complexity of hierarchical anomalies. Starting with a brief
overview of the VAD background and its generic concept definitions, we
progressively categorize, emphasize, and discuss the latest VAD progress from
the perspective of sample number, data modality, and anomaly hierarchy. Through
an in-depth analysis of the VAD field, we finally summarize future developments
for VAD and conclude the key findings and contributions of this survey.
Related papers
- Video Anomaly Detection in 10 Years: A Survey and Outlook [10.143205531474907]
Video anomaly detection (VAD) holds immense importance across diverse domains such as surveillance, healthcare, and environmental monitoring.
This survey explores deep learning-based VAD, expanding beyond traditional supervised training paradigms to encompass emerging weakly supervised, self-supervised, and unsupervised approaches.
arXiv Detail & Related papers (2024-05-29T17:56:31Z)
- RadGenome-Chest CT: A Grounded Vision-Language Dataset for Chest CT Analysis [56.57177181778517]
RadGenome-Chest CT is a large-scale, region-guided 3D chest CT interpretation dataset based on CT-RATE.
We leverage the latest powerful universal segmentation and large language models to extend the original datasets.
arXiv Detail & Related papers (2024-04-25T17:11:37Z)
- A Survey on Domain Generalization for Medical Image Analysis [9.410880477358942]
Domain Generalization for MedIA aims to address the domain shift challenge by generalizing effectively and performing robustly across unknown data distributions.
We provide a formal definition of domain shift and domain generalization in the medical field, and discuss several related settings.
We summarize the recent methods from three viewpoints: data manipulation level, feature representation level, and model training level, and present some algorithms in detail.
arXiv Detail & Related papers (2024-02-07T17:08:27Z)
- Deep Learning and Computer Vision for Glaucoma Detection: A Review [0.8379286663107844]
Glaucoma is the leading cause of irreversible blindness worldwide.
Recent advances in computer vision and deep learning have demonstrated the potential for automated assessment.
We survey recent studies on AI-based glaucoma diagnosis using fundus, optical coherence tomography, and visual field images.
arXiv Detail & Related papers (2023-07-31T09:49:51Z)
- VISION Datasets: A Benchmark for Vision-based InduStrial InspectiON [28.511625423590605]
VISION datasets are a diverse collection of 14 industrial inspection datasets.
With a total of 18k images encompassing 44 defect types, VISION strives to mirror a wide range of real-world production scenarios.
arXiv Detail & Related papers (2023-06-13T16:31:02Z)
- Explainable Anomaly Detection in Images and Videos: A Survey [49.07140708026425]
Anomaly detection and localization of visual data, including images and videos, are of great significance in machine learning academia and applied real-world scenarios.
Despite the rapid development of visual anomaly detection techniques in recent years, interpretations of these black-box models and reasonable explanations of why anomalies can be distinguished remain scarce.
This paper provides the first survey concentrated on explainable visual anomaly detection methods.
arXiv Detail & Related papers (2023-02-13T20:17:41Z)
- AlignTransformer: Hierarchical Alignment of Visual Regions and Disease Tags for Medical Report Generation [50.21065317817769]
We propose an AlignTransformer framework, which includes the Align Hierarchical Attention (AHA) and the Multi-Grained Transformer (MGT) modules.
Experiments on the public IU-Xray and MIMIC-CXR datasets show that the AlignTransformer can achieve results competitive with state-of-the-art methods on the two datasets.
arXiv Detail & Related papers (2022-03-18T13:43:53Z)
- Factored Attention and Embedding for Unstructured-view Topic-related Ultrasound Report Generation [70.7778938191405]
We propose a novel factored attention and embedding model (termed FAE-Gen) for the unstructured-view topic-related ultrasound report generation.
The proposed FAE-Gen mainly consists of two modules, i.e., view-guided factored attention and topic-oriented factored embedding, which capture the homogeneous and heterogeneous morphological characteristics across different views.
arXiv Detail & Related papers (2022-03-12T15:24:03Z)
- A Survey of Visual Sensory Anomaly Detection [53.23336329817023]
Visual sensory anomaly detection (AD) is an essential problem in computer vision.
We provide a comprehensive review of visual sensory AD and categorize it into three levels according to the form of anomalies.
arXiv Detail & Related papers (2022-02-14T19:50:03Z)
- Recent Advances in Monocular 2D and 3D Human Pose Estimation: A Deep Learning Perspective [69.44384540002358]
We provide a comprehensive and holistic 2D-to-3D perspective to tackle this problem.
We categorize the mainstream and milestone approaches since the year 2014 under unified frameworks.
We also summarize the pose representation styles, benchmarks, evaluation metrics, and the quantitative performance of popular approaches.
arXiv Detail & Related papers (2021-04-23T11:07:07Z)
This list is automatically generated from the titles and abstracts of the papers in this site.