OLED: One-Class Learned Encoder-Decoder Network with Adversarial Context
Masking for Novelty Detection
- URL: http://arxiv.org/abs/2103.14953v1
- Date: Sat, 27 Mar 2021 17:59:40 GMT
- Title: OLED: One-Class Learned Encoder-Decoder Network with Adversarial Context
Masking for Novelty Detection
- Authors: John Taylor Jewell, Vahid Reza Khazaie, Yalda Mohsenzadeh
- Abstract summary: Novelty detection is the task of recognizing samples that do not belong to the distribution of the target class.
Deep autoencoders have been widely used as a base of many unsupervised novelty detection methods.
We have designed a framework consisting of two competing networks, a Mask Module and a Reconstructor.
- Score: 1.933681537640272
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Novelty detection is the task of recognizing samples that do not belong to
the distribution of the target class. During training, the novelty class is
absent, preventing the use of traditional classification approaches. Deep
autoencoders have been widely used as a base of many unsupervised novelty
detection methods. In particular, context autoencoders have been successful in
the novelty detection task because of the more effective representations they
learn by reconstructing original images from randomly masked images. However, a
significant drawback of context autoencoders is that random masking fails to
consistently cover important structures of the input image, leading to
suboptimal representations, especially for the novelty detection task. In this
paper, to optimize input masking, we have designed a framework consisting of
two competing networks, a Mask Module and a Reconstructor. The Mask Module is a
convolutional autoencoder that learns to generate optimal masks that cover the
most important parts of images. In contrast, the Reconstructor is a
convolutional encoder-decoder that aims to reconstruct unperturbed images from
masked images. The networks are trained in an adversarial manner in which the
Mask Module generates masks that are applied to images given to the
Reconstructor. In this way, the Mask Module seeks to maximize the
reconstruction error that the Reconstructor is minimizing. When applied to
novelty detection, the proposed approach learns semantically richer
representations than context autoencoders and enhances novelty detection at
test time through better-targeted masking. Novelty detection experiments on the
MNIST and CIFAR-10 image datasets demonstrate the proposed approach's
superiority over state-of-the-art methods. In a further experiment on the UCSD
video dataset for novelty detection, the proposed approach achieves
state-of-the-art results.
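To make the training dynamic concrete, here is a minimal PyTorch sketch of the two-network setup described above. It is an illustration, not the authors' code: the architectures, optimizer settings, the soft sigmoid masking, and the use of MSE for both the training loss and the test-time novelty score are all assumptions.

```python
# Minimal sketch of the adversarial context-masking setup from the abstract.
# Architectures, hyperparameters, and the soft sigmoid mask are illustrative
# assumptions, not the paper's actual implementation.
import torch
import torch.nn as nn
import torch.nn.functional as F

def small_encoder_decoder(channels: int = 1) -> nn.Sequential:
    # Toy convolutional encoder-decoder reused for both networks.
    return nn.Sequential(
        nn.Conv2d(channels, 32, 3, stride=2, padding=1), nn.ReLU(),
        nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
        nn.ConvTranspose2d(64, 32, 4, stride=2, padding=1), nn.ReLU(),
        nn.ConvTranspose2d(32, channels, 4, stride=2, padding=1),
    )

mask_module = small_encoder_decoder()    # learns where to mask
reconstructor = small_encoder_decoder()  # learns to undo the masking
opt_mask = torch.optim.Adam(mask_module.parameters(), lr=1e-4)
opt_rec = torch.optim.Adam(reconstructor.parameters(), lr=1e-4)

def train_step(images: torch.Tensor) -> None:
    # 1) Reconstructor update: minimize reconstruction error on inputs
    #    masked by a frozen Mask Module (mask == 1 means "covered").
    with torch.no_grad():
        mask = torch.sigmoid(mask_module(images))
    recon = reconstructor(images * (1.0 - mask))
    rec_loss = F.mse_loss(recon, images)
    opt_rec.zero_grad()
    rec_loss.backward()
    opt_rec.step()

    # 2) Mask Module update: maximize the same error (descent on the negated
    #    loss), pushing masks toward the most informative image regions.
    mask = torch.sigmoid(mask_module(images))
    recon = reconstructor(images * (1.0 - mask))
    adv_loss = -F.mse_loss(recon, images)
    opt_mask.zero_grad()
    adv_loss.backward()
    opt_mask.step()

def novelty_score(image: torch.Tensor) -> float:
    # At test time, a higher reconstruction error under the learned masking
    # suggests novelty: the Reconstructor only learned to inpaint the target class.
    with torch.no_grad():
        mask = torch.sigmoid(mask_module(image))
        recon = reconstructor(image * (1.0 - mask))
        return F.mse_loss(recon, image).item()

# Usage, e.g. on one MNIST-sized grayscale image:
score = novelty_score(torch.rand(1, 1, 28, 28))
```

Using the masked-input reconstruction error as the score mirrors the abstract's claim that adversarially learned masks widen the gap between in-distribution and novel samples; thresholding or ranking the score (e.g. for AUROC) then yields the detection decision.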
Related papers
- ColorMAE: Exploring data-independent masking strategies in Masked AutoEncoders [53.3185750528969]
Masked AutoEncoders (MAE) have emerged as a robust self-supervised framework.
We introduce a data-independent method, termed ColorMAE, which generates different binary mask patterns by filtering random noise.
We demonstrate our strategy's superiority in downstream tasks compared to random masking.
arXiv Detail & Related papers (2024-07-17T22:04:00Z)
- MaskCD: A Remote Sensing Change Detection Network Based on Mask Classification [29.15203530375882]
Change detection (CD) from remote sensing (RS) images using deep learning has been widely investigated in the literature.
We propose MaskCD to detect changed areas by adaptively generating categorized masks from input image pairs.
It reconstructs the desired changed objects by decoding the pixel-wise representations into learnable mask proposals.
arXiv Detail & Related papers (2024-04-18T11:05:15Z)
- On Mask-based Image Set Desensitization with Recognition Support [46.51027529020668]
We propose a mask-based image desensitization approach while supporting recognition.
We exploit an interpretation algorithm to maintain critical information for the recognition task.
In addition, we propose a feature selection masknet as the model adjustment method to improve the performance based on the masked images.
arXiv Detail & Related papers (2023-12-14T14:26:42Z)
- Neural Image Compression Using Masked Sparse Visual Representation [17.229601298529825]
We study neural image compression based on the Sparse Visual Representation (SVR), where images are embedded into a discrete latent space spanned by learned visual codebooks.
By sharing codebooks with the decoder, the encoder transfers codeword indices that are efficient and cross-platform robust.
We propose a Masked Adaptive Codebook learning (M-AdaCode) method that applies masks to the latent feature subspace to balance bitrate and reconstruction quality.
arXiv Detail & Related papers (2023-09-20T21:59:23Z)
- Not All Image Regions Matter: Masked Vector Quantization for Autoregressive Image Generation [78.13793505707952]
Existing autoregressive models follow the two-stage generation paradigm that first learns a codebook in the latent space for image reconstruction and then completes the image generation autoregressively based on the learned codebook.
We propose a novel two-stage framework, which consists of a Masked Quantization VAE (MQ-VAE) and a Stackformer, relieving the model from modeling redundancy.
arXiv Detail & Related papers (2023-05-23T02:15:53Z)
- Improving Masked Autoencoders by Learning Where to Mask [65.89510231743692]
Masked image modeling is a promising self-supervised learning method for visual data.
We present AutoMAE, a framework that uses Gumbel-Softmax to interlink an adversarially-trained mask generator and a mask-guided image modeling process.
In our experiments, AutoMAE is shown to provide effective pretraining models on standard self-supervised benchmarks and downstream tasks. (A toy Gumbel-Softmax sketch follows this list.)
arXiv Detail & Related papers (2023-03-12T05:28:55Z)
- MaskSketch: Unpaired Structure-guided Masked Image Generation [56.88038469743742]
MaskSketch is an image generation method that allows spatial conditioning of the generation result using a guiding sketch as an extra conditioning signal during sampling.
We show that intermediate self-attention maps of a masked generative transformer encode important structural information of the input image.
Our results show that MaskSketch achieves high image realism and fidelity to the guiding structure.
arXiv Detail & Related papers (2023-02-10T20:27:02Z)
- Context Autoencoder for Self-Supervised Representation Learning [64.63908944426224]
We pretrain an encoder by making predictions in the encoded representation space.
The network is an encoder-regressor-decoder architecture.
We demonstrate the effectiveness of our CAE through superior transfer performance in downstream tasks.
arXiv Detail & Related papers (2022-02-07T09:33:45Z)
- Contrastive Attention Network with Dense Field Estimation for Face Completion [11.631559190975034]
We propose a self-supervised Siamese inference network to improve the generalization and robustness of encoders.
To deal with geometric variations of face images, a dense correspondence field is integrated into the network.
This multi-scale architecture helps the decoder map discriminative representations learned by the encoders into images.
arXiv Detail & Related papers (2021-12-20T02:54:38Z)
- Adaptive Shrink-Mask for Text Detection [91.34459257409104]
Existing real-time text detectors reconstruct text contours directly from shrink-masks.
The dependence on predicted shrink-masks leads to unstable detection results.
A Super-pixel Window (SPW) is designed to supervise the network.
arXiv Detail & Related papers (2021-11-18T07:38:57Z)
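Since the AutoMAE entry above leans on the Gumbel-Softmax trick, here is a toy PyTorch illustration of why it matters for adversarially trained mask generators: it keeps a hard, per-patch keep/mask decision differentiable. This is a generic sketch of the trick, not AutoMAE's code; the patch count and the loss are placeholders.

```python
# Toy Gumbel-Softmax demo: discrete per-patch masking with usable gradients.
import torch
import torch.nn.functional as F

num_patches = 196                                          # e.g. a 14x14 ViT patch grid
logits = torch.randn(num_patches, 2, requires_grad=True)   # per-patch [keep, mask] scores

# hard=True emits one-hot samples in the forward pass while the backward pass
# uses the soft probabilities (straight-through estimator).
sample = F.gumbel_softmax(logits, tau=1.0, hard=True, dim=-1)
mask = sample[:, 1]                                        # 1.0 = patch masked, 0.0 = kept

loss = mask.sum()                                          # placeholder for a mask-guided modeling loss
loss.backward()
assert logits.grad is not None                             # gradients flow through the discrete choice
```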
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.