SINDER: Repairing the Singular Defects of DINOv2
- URL: http://arxiv.org/abs/2407.16826v1
- Date: Tue, 23 Jul 2024 20:34:23 GMT
- Title: SINDER: Repairing the Singular Defects of DINOv2
- Authors: Haoqi Wang, Tong Zhang, Mathieu Salzmann
- Abstract summary: Vision Transformer models trained on large-scale datasets often exhibit artifacts in the patch tokens they extract.
We propose a novel fine-tuning smooth regularization that rectifies structural deficiencies using only a small dataset.
- Score: 61.98878352956125
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Vision Transformer models trained on large-scale datasets, although effective, often exhibit artifacts in the patch tokens they extract. While such defects can be alleviated by re-training the entire model with additional classification tokens, the underlying reasons for the presence of these tokens remain unclear. In this paper, we conduct a thorough investigation of this phenomenon, combining theoretical analysis with empirical observations. Our findings reveal that these artifacts originate from the pre-trained network itself, specifically stemming from the leading left singular vector of the network's weights. Furthermore, to mitigate these defects, we propose a novel fine-tuning smooth regularization that rectifies structural deficiencies using only a small dataset, thereby avoiding the need for complete re-training. We validate our method on various downstream tasks, including unsupervised segmentation, classification, supervised segmentation, and depth estimation, demonstrating its effectiveness in improving model performance. Code and checkpoints are available at https://github.com/haoqiwang/sinder.
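The abstract attributes the artifacts to the leading left singular vector of the network's weights. As a rough illustration of what that vector is, the sketch below finds it by power iteration and measures how strongly an output aligns with it. It is not the authors' analysis or their regularizer: the matrix `W` is a hypothetical toy example, not DINOv2 weights, and the helper names are made up for this sketch.

```python
import math
import random


def matvec(m, v):
    """Multiply matrix m (a list of rows) by vector v."""
    return [sum(a * b for a, b in zip(row, v)) for row in m]


def transpose(m):
    """Return the transpose of matrix m as a list of rows."""
    return [list(col) for col in zip(*m)]


def normalize(v):
    """Scale v to unit Euclidean length."""
    n = math.sqrt(sum(x * x for x in v))
    return [x / n for x in v]


def leading_left_singular_vector(w, iters=200, seed=0):
    """Power iteration on W W^T; returns (u, sigma) for the top singular pair."""
    rng = random.Random(seed)
    wt = transpose(w)
    u = normalize([rng.gauss(0.0, 1.0) for _ in w])
    for _ in range(iters):
        u = normalize(matvec(w, matvec(wt, u)))
    # For a unit left singular vector u, ||W^T u|| equals its singular value.
    sigma = math.sqrt(sum(x * x for x in matvec(wt, u)))
    return u, sigma


def alignment(vec, u):
    """Absolute cosine similarity between vec and the unit vector u."""
    return abs(sum(a * b for a, b in zip(normalize(vec), u)))


# Hypothetical toy weights with one dominant singular direction (assumption).
W = [[10.0, 0.0, 0.0],
     [0.0, 1.0, 0.0],
     [0.0, 0.0, 0.5]]

u, sigma = leading_left_singular_vector(W)
output = matvec(W, [1.0, 1.0, 1.0])  # output for an unremarkable input
score = alignment(output, u)         # close to 1 when u dominates the output
```

In this toy case a single large singular value pulls the output of an ordinary input almost entirely onto `u`, which gives a loose intuition for how one dominant direction in the weights could surface as an artifact in the features.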
Related papers
- GeneralAD: Anomaly Detection Across Domains by Attending to Distorted Features [68.14842693208465]
GeneralAD is an anomaly detection framework designed to operate in semantic, near-distribution, and industrial settings.
We propose a novel self-supervised anomaly generation module that employs straightforward operations like noise addition and shuffling to patch features.
We extensively evaluated our approach on ten datasets, achieving state-of-the-art results in six and on-par performance in the remaining four.
arXiv Detail & Related papers (2024-07-17T09:27:41Z) - ProtoVAE: Prototypical Networks for Unsupervised Disentanglement [1.6114012813668934]
We introduce a novel deep generative VAE-based model, ProtoVAE, that leverages a deep metric learning Prototypical network trained using self-supervision.
Our model is completely unsupervised and requires no a priori knowledge of the dataset, including the number of factors.
We evaluate our proposed model on the benchmark dSprites, 3DShapes, and MPI3D disentanglement datasets.
arXiv Detail & Related papers (2023-03-12T02:49:19Z) - Informative regularization for a multi-layer perceptron RR Lyrae classifier under data shift [3.303002683812084]
We propose a scalable and easily adaptable approach based on an informative regularization and an ad-hoc training procedure to mitigate the shift problem.
Our method provides a new path to incorporate knowledge from characteristic features into artificial neural networks to manage the underlying data shift problem.
arXiv Detail & Related papers (2023-03-12T02:49:19Z) - Towards Practical Control of Singular Values of Convolutional Layers [65.25070864775793]
Convolutional neural networks (CNNs) are easy to train, but their essential properties, such as generalization error and adversarial robustness, are hard to control.
Recent research demonstrated that singular values of convolutional layers significantly affect such elusive properties.
We offer a principled approach to alleviating constraints of the prior art at the expense of an insignificant reduction in layer expressivity.
arXiv Detail & Related papers (2022-11-24T19:09:44Z) - Self-Supervised Training with Autoencoders for Visual Anomaly Detection [61.62861063776813]
We focus on a specific use case in anomaly detection where the distribution of normal samples is supported by a lower-dimensional manifold.
We adapt a self-supervised learning regime that exploits discriminative information during training but focuses on the submanifold of normal examples.
We achieve a new state-of-the-art result on the MVTec AD dataset -- a challenging benchmark for visual anomaly detection in the manufacturing domain.
arXiv Detail & Related papers (2022-06-23T14:16:30Z) - Test-time Adaptation with Slot-Centric Models [63.981055778098444]
Slot-TTA is a semi-supervised scene decomposition model that at test time is adapted per scene through gradient descent on reconstruction or cross-view synthesis objectives.
We show substantial out-of-distribution performance improvements against state-of-the-art supervised feed-forward detectors, and alternative test-time adaptation methods.
arXiv Detail & Related papers (2022-03-21T17:59:50Z) - Efficient and Robust Classification for Sparse Attacks [34.48667992227529]
We consider perturbations bounded by the $\ell_0$-norm, which have been shown to be effective attacks in the domains of image recognition, natural language processing, and malware detection.
We propose a novel defense method that consists of "truncation" and "adversarial training".
Motivated by the insights we obtain, we extend these components to neural network classifiers.
arXiv Detail & Related papers (2022-01-23T21:18:17Z) - CutPaste: Self-Supervised Learning for Anomaly Detection and Localization [59.719925639875036]
We propose a framework for building anomaly detectors using normal training data only.
We first learn self-supervised deep representations and then build a generative one-class classifier on learned representations.
Our empirical study on the MVTec anomaly detection dataset demonstrates that the proposed algorithm is general enough to detect various types of real-world defects.
arXiv Detail & Related papers (2021-04-08T19:04:55Z) - Analyzing Overfitting under Class Imbalance in Neural Networks for Image Segmentation [19.259574003403998]
In image segmentation, neural networks may overfit to the foreground samples from small structures.
In this study, we provide new insights on the problem of overfitting under class imbalance by inspecting the network behavior.
arXiv Detail & Related papers (2021-02-20T14:57:58Z)
This list is automatically generated from the titles and abstracts of the papers in this site.