ARTeFACT: Benchmarking Segmentation Models on Diverse Analogue Media Damage
- URL: http://arxiv.org/abs/2412.04580v1
- Date: Thu, 05 Dec 2024 19:52:25 GMT
- Title: ARTeFACT: Benchmarking Segmentation Models on Diverse Analogue Media Damage
- Authors: Daniela Ivanova, Marco Aversa, Paul Henderson, John Williamson,
- Abstract summary: ARTeFACT is a dataset for damage detection in diverse types of analogue media.
Over 11,000 annotations cover 15 kinds of damage across various subjects, media, and historical provenance.
We evaluate CNN, Transformer, diffusion-based segmentation models, and foundation vision models in zero-shot, supervised, unsupervised and text-guided settings.
- Abstract: Accurately detecting and classifying damage in analogue media such as paintings, photographs, textiles, mosaics, and frescoes is essential for cultural heritage preservation. While machine learning models excel at correcting degradation if the damage operator is known a priori, we show that they fail to robustly predict where the damage is, even after supervised training; thus, reliable damage detection remains a challenge. Motivated by this, we introduce ARTeFACT, a dataset for damage detection in diverse types of analogue media, with over 11,000 annotations covering 15 kinds of damage across various subjects, media, and historical provenance. Furthermore, we contribute human-verified text prompts describing the semantic contents of the images, and derive additional textual descriptions of the annotated damage. We evaluate CNN, Transformer, diffusion-based segmentation models, and foundation vision models in zero-shot, supervised, unsupervised, and text-guided settings, revealing their limitations in generalising across media types. Our dataset is available at https://daniela997.github.io/ARTeFACT/ as the first-of-its-kind benchmark for analogue media damage detection and restoration.
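Segmentation benchmarks of this kind are typically scored with per-class intersection-over-union (IoU) and its mean over classes. The paper does not publish its evaluation code here, so the following is only a minimal sketch of how per-class IoU might be computed over integer damage-label maps; the function name, array shapes, and the toy two-class example are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def per_class_iou(pred, target, num_classes):
    """Compute intersection-over-union for each class.

    pred, target: integer label maps of identical shape, where each
    pixel holds a class index in [0, num_classes).
    Returns an array of IoU scores, with NaN for classes absent from
    both the prediction and the ground truth.
    """
    ious = np.full(num_classes, np.nan)
    for c in range(num_classes):
        p = pred == c
        t = target == c
        union = np.logical_or(p, t).sum()
        if union > 0:
            ious[c] = np.logical_and(p, t).sum() / union
    return ious

# Toy example with two classes: 0 = clean, 1 = damage.
pred = np.array([[0, 1], [1, 1]])
target = np.array([[0, 1], [0, 1]])
ious = per_class_iou(pred, target, num_classes=2)
miou = np.nanmean(ious)  # mean IoU over classes present
```

Ignoring classes absent from both maps (the NaN entries) keeps the mean from being inflated on images that contain only a few of the 15 damage types.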
Related papers
- State-of-the-Art Fails in the Art of Damage Detection
We show that machine learning models fail to predict where damage is even after supervised training.
We introduce DamBench, a dataset for damage detection in diverse analogue media.
arXiv Detail & Related papers (2024-08-23T10:03:07Z)
- Visual Context-Aware Person Fall Detection
We present a segmentation pipeline to semi-automatically separate individuals and objects in images.
Background objects such as beds, chairs, or wheelchairs can challenge fall detection systems, leading to false positive alarms.
We demonstrate that object-specific contextual transformations during training effectively mitigate this challenge.
arXiv Detail & Related papers (2024-04-11T19:06:36Z)
- Counterfactual Image Generation for adversarially robust and interpretable Classifiers
We propose a unified framework leveraging image-to-image translation Generative Adversarial Networks (GANs) to produce counterfactual samples.
This is achieved by combining the classifier and discriminator into a single model that attributes real images to their respective classes and flags generated images as "fake".
We show that the model exhibits improved robustness to adversarial attacks, and that the discriminator's "fakeness" value serves as an uncertainty measure for its predictions.
arXiv Detail & Related papers (2023-10-01T18:50:29Z)
- Verifying the Robustness of Automatic Credibility Assessment
We show that meaning-preserving changes in input text can mislead the models.
We also introduce BODEGA: a benchmark for testing both victim models and attack methods on misinformation detection tasks.
Our experimental results show that modern large language models are often more vulnerable to attacks than previous, smaller solutions.
arXiv Detail & Related papers (2023-03-14T16:11:47Z)
- Simulating analogue film damage to analyse and improve artefact restoration on high-resolution scans
Digital scans of analogue photographic film typically contain artefacts such as dust and scratches.
Deep learning models have shown impressive results in general image inpainting and denoising, but film artefact removal is an understudied problem.
There are no publicly available high-quality datasets of real-world analogue film damage for training and evaluation.
We collect a dataset of 4K damaged analogue film scans paired with manually restored versions produced by a human expert.
We construct a larger synthetic dataset of damaged images with paired clean versions using a statistical model of artefact shape and occurrence learnt from real, heavily-damaged images.
arXiv Detail & Related papers (2023-02-20T14:24:18Z)
- Fighting Malicious Media Data: A Survey on Tampering Detection and Deepfake Detection
Recent advances in deep learning, particularly deep generative models, open the doors for producing perceptually convincing images and videos at a low cost.
This paper provides a comprehensive review of the current media tampering detection approaches, and discusses the challenges and trends in this field for future research.
arXiv Detail & Related papers (2022-12-12T02:54:08Z)
- A hierarchical semantic segmentation framework for computer vision-based bridge damage detection
Computer vision-based damage detection using remote cameras and unmanned aerial vehicles (UAVs) enables efficient and low-cost bridge health monitoring.
This paper introduces a semantic segmentation framework that enforces the hierarchical semantic relationship between component categories and damage types.
In this way, the damage detection model can focus on learning features from possibly damaged regions only, avoiding the influence of irrelevant regions.
arXiv Detail & Related papers (2022-07-18T18:42:54Z)
- A Comprehensive Study of Image Classification Model Sensitivity to Foregrounds, Backgrounds, and Visual Attributes
We introduce RIVAL10, a dataset consisting of roughly 26k instances over 10 classes.
We evaluate the sensitivity of a broad set of models to noise corruptions in foregrounds, backgrounds and attributes.
In our analysis, we consider diverse state-of-the-art architectures (ResNets, Transformers) and training procedures (CLIP, SimCLR, DeiT, Adversarial Training).
arXiv Detail & Related papers (2022-01-26T06:31:28Z)
- Stereopagnosia: Fooling Stereo Networks with Adversarial Perturbations
We show that imperceptible additive perturbations can significantly alter the disparity map.
We show that, when used for adversarial data augmentation, our perturbations result in trained models that are more robust.
arXiv Detail & Related papers (2020-09-21T19:20:09Z)
- MSNet: A Multilevel Instance Segmentation Network for Natural Disaster Damage Assessment in Aerial Videos
We study the problem of efficiently assessing building damage after natural disasters like hurricanes, floods or fires.
The first contribution is a new dataset, consisting of user-generated aerial videos from social media with annotations of instance-level building damage masks.
The second contribution is a new model, namely MSNet, which contains novel region proposal network designs.
arXiv Detail & Related papers (2020-06-30T02:23:05Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information it contains and is not responsible for any consequences of its use.