Stylist: Style-Driven Feature Ranking for Robust Novelty Detection
- URL: http://arxiv.org/abs/2310.03738v1
- Date: Thu, 5 Oct 2023 17:58:32 GMT
- Title: Stylist: Style-Driven Feature Ranking for Robust Novelty Detection
- Authors: Stefan Smeu, Elena Burceanu, Emanuela Haller, Andrei Liviu Nicolicioiu
- Abstract summary: We propose a formalization that separates distribution shifts into semantic (content) changes, which are relevant to our task, and style changes, which are irrelevant.
Within this formalization, we define robust novelty detection as the task of finding semantic changes while remaining robust to style distribution shifts.
We show that our feature selection removes features responsible for spurious correlations and improves novelty detection performance.
- Score: 8.402607231390606
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Novelty detection aims at finding samples that differ in some form from the
distribution of seen samples. But not all changes are created equal. Data can
suffer a multitude of distribution shifts, and we might want to detect only
some types of relevant changes. Similar to works in out-of-distribution
generalization, we propose a formalization that separates changes into semantic
(content) changes, which are relevant to our task, and style changes, which are
irrelevant. Within this formalization, we define robust novelty detection
as the task of finding semantic changes while remaining robust to style
distribution shifts. Leveraging pretrained, large-scale model
representations, we introduce Stylist, a novel method that drops
environment-biased features. First, we compute a per-feature score based on the
distances between the feature's distributions across environments. Next, we show
that our selection removes features responsible for spurious correlations and
improves novelty detection performance. For evaluation, we adapt domain
generalization datasets to our task and analyze the methods' behaviors. We
additionally build a large synthetic dataset in which we control the degree of
spurious correlation. We show that our selection mechanism improves
novelty detection algorithms across multiple datasets containing both
style and content shifts.
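The ranking step described in the abstract can be sketched as follows. This is a minimal illustration, not the paper's implementation: it assumes the Wasserstein distance as the per-feature distribution distance (the paper's exact metric may differ), and the function names `rank_environment_biased_features` and `drop_top_features` are hypothetical.

```python
# Sketch of style-driven feature ranking: score each feature by how much its
# distribution shifts across training environments, then drop the most
# environment-biased features before running a novelty detector.
import numpy as np
from scipy.stats import wasserstein_distance


def rank_environment_biased_features(envs):
    """envs: list of (n_samples_i, n_features) arrays, one per environment.

    Returns feature indices sorted from most to least environment-biased.
    """
    n_features = envs[0].shape[1]
    scores = np.zeros(n_features)
    # Accumulate pairwise 1-D distribution distances per feature.
    for i in range(len(envs)):
        for j in range(i + 1, len(envs)):
            for f in range(n_features):
                scores[f] += wasserstein_distance(envs[i][:, f], envs[j][:, f])
    return np.argsort(scores)[::-1]


def drop_top_features(x, ranking, k):
    """Remove the k most environment-biased features, preserving column order."""
    keep = np.sort(ranking[k:])
    return x[:, keep]
```

In this sketch, a feature whose distribution is stable across environments (e.g. same mean and spread everywhere) accumulates a near-zero score and is kept, while a feature that tracks the environment (a "style" feature) accumulates a large score and is dropped first.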
Related papers
- Downstream-Pretext Domain Knowledge Traceback for Active Learning [138.02530777915362]
We propose a downstream-pretext domain knowledge traceback (DOKT) method that traces the data interactions of downstream knowledge and pre-training guidance.
DOKT consists of a traceback diversity indicator and a domain-based uncertainty estimator.
Experiments conducted on ten datasets show that our model outperforms other state-of-the-art methods.
arXiv Detail & Related papers (2024-07-20T01:34:13Z)
- Environment-biased Feature Ranking for Novelty Detection Robustness [8.402607231390606]
We tackle the problem of robust novelty detection, where we aim to detect novelties in terms of semantic content.
We propose a method that starts from a pretrained embedding and a multi-environment setup, and ranks the features based on their environment focus.
arXiv Detail & Related papers (2023-09-21T17:58:26Z)
- Self-Evolution Learning for Mixup: Enhance Data Augmentation on Few-Shot Text Classification Tasks [75.42002070547267]
We propose a self-evolution learning (SE) based mixup approach for data augmentation in text classification.
We introduce a novel instance-specific label smoothing approach, which linearly interpolates the model's output and the one-hot labels of the original samples to generate new soft labels for label mixing up.
arXiv Detail & Related papers (2023-05-22T23:43:23Z)
- Even Small Correlation and Diversity Shifts Pose Dataset-Bias Issues [19.4921353136871]
We study two types of distribution shifts: diversity shifts, which occur when test samples exhibit patterns unseen during training, and correlation shifts, which occur when test data present a different correlation between seen invariant and spurious features.
We propose an integrated protocol to analyze both types of shifts using datasets where they co-exist in a controllable manner.
arXiv Detail & Related papers (2023-05-09T23:40:23Z)
- Intra-class Adaptive Augmentation with Neighbor Correction for Deep Metric Learning [99.14132861655223]
We propose a novel intra-class adaptive augmentation (IAA) framework for deep metric learning.
We reasonably estimate intra-class variations for every class and generate adaptive synthetic samples to support hard samples mining.
Our method significantly outperforms state-of-the-art methods, improving retrieval performance by 3%-6%.
arXiv Detail & Related papers (2022-11-29T14:52:38Z)
- Fake It Till You Make It: Near-Distribution Novelty Detection by Score-Based Generative Models [54.182955830194445]
Existing models either fail or face a dramatic drop under the so-called "near-distribution" setting.
We propose to exploit a score-based generative model to produce synthetic near-distribution anomalous data.
Our method improves the near-distribution novelty detection by 6% and passes the state-of-the-art by 1% to 5% across nine novelty detection benchmarks.
arXiv Detail & Related papers (2022-05-28T02:02:53Z)
- Deep learning model solves change point detection for multiple change types [69.77452691994712]
Change point detection aims to catch abrupt disorders in data distributions.
We propose an approach that works in the multiple-distributions scenario.
arXiv Detail & Related papers (2022-04-15T09:44:21Z)
- Learning a Unified Sample Weighting Network for Object Detection [113.98404690619982]
Region sampling or weighting is critically important to the success of modern region-based object detectors.
We argue that sample weighting should be data-dependent and task-dependent.
We propose a unified sample weighting network to predict a sample's task weights.
arXiv Detail & Related papers (2020-06-11T16:19:16Z)
- Incremental Unsupervised Domain-Adversarial Training of Neural Networks [17.91571291302582]
In the context of supervised statistical learning, it is typically assumed that the training set comes from the same distribution from which the test samples are drawn.
Here we take a different avenue and approach the problem from an incremental point of view, where the model is adapted to the new domain iteratively.
Our results report a clear improvement with respect to the non-incremental case in several datasets, also outperforming other state-of-the-art domain adaptation algorithms.
arXiv Detail & Related papers (2020-01-13T09:54:35Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.