Unsupervised Discovery, Control, and Disentanglement of Semantic
Attributes with Applications to Anomaly Detection
- URL: http://arxiv.org/abs/2002.11169v4
- Date: Mon, 7 Jun 2021 15:50:10 GMT
- Title: Unsupervised Discovery, Control, and Disentanglement of Semantic
Attributes with Applications to Anomaly Detection
- Authors: William Paul, I-Jeng Wang, Fady Alajaji, Philippe Burlina
- Abstract summary: We focus on unsupervised generative representations that discover latent factors controlling image semantic attributes.
For (a), we propose a network architecture that combines multiscale generative models with mutual information (MI) maximization.
For (b), we derive an analytical result (Lemma 1) that brings clarity to two related but distinct concepts.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Our work focuses on unsupervised and generative methods that address the
following goals: (a) learning unsupervised generative representations that
discover latent factors controlling image semantic attributes, (b) studying how
this ability to control attributes formally relates to latent factor
disentanglement, clarifying related but distinct concepts that have previously
been conflated, and (c) developing anomaly detection methods that
leverage representations learned in (a). For (a), we propose a network
architecture that exploits the combination of multiscale generative models with
mutual information (MI) maximization. For (b), we derive an analytical result
(Lemma 1) that brings clarity to two related but distinct concepts: the ability
of generative networks to control semantic attributes of images they generate,
resulting from MI maximization, and the ability to disentangle latent space
representations, obtained via total correlation minimization. More
specifically, we demonstrate that maximizing semantic attribute control
encourages disentanglement of latent factors. Using Lemma 1 and adopting MI in
our loss function, we then show empirically that, for image generation tasks,
the proposed approach outperforms other state-of-the-art methods in the
quality/disentanglement trade-off, with quality assessed via the Fréchet
Inception Distance (FID) and disentanglement via the Mutual Information Gap
(MIG). For (c), we design several systems
for anomaly detection exploiting representations learned in (a), and
demonstrate their performance benefits when compared to state-of-the-art
generative and discriminative algorithms. The above contributions in
representation learning have potential applications in addressing other
important problems in computer vision, such as bias and privacy in AI.
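The disentanglement metric cited above, the Mutual Information Gap, rewards representations in which each ground-truth factor is captured by a single latent dimension: it averages, over factors, the gap between the two most informative latents, normalized by the factor's entropy. A minimal sketch of this metric follows, using discretized latents and plug-in MI estimates; the function names and toy data are our own illustration, not the paper's code:

```python
import numpy as np

def discrete_mi(x, y):
    """Plug-in estimate of I(X; Y) in nats for discrete arrays x, y."""
    xs, xi = np.unique(x, return_inverse=True)
    ys, yi = np.unique(y, return_inverse=True)
    joint = np.zeros((len(xs), len(ys)))
    np.add.at(joint, (xi, yi), 1)          # joint histogram
    pxy = joint / joint.sum()
    px = pxy.sum(axis=1, keepdims=True)    # marginals
    py = pxy.sum(axis=0, keepdims=True)
    nz = pxy > 0
    return float(np.sum(pxy[nz] * np.log(pxy[nz] / (px @ py)[nz])))

def entropy(y):
    """Plug-in entropy H(Y) in nats for a discrete array y."""
    _, counts = np.unique(y, return_counts=True)
    p = counts / counts.sum()
    return float(-np.sum(p * np.log(p)))

def mig(latents, factors):
    """Mutual Information Gap: mean over factors of the normalized gap
    between the two latent dimensions most informative about that factor.
    latents: (N, D) discretized latent codes; factors: (N, K) factor labels."""
    gaps = []
    for k in range(factors.shape[1]):
        v = factors[:, k]
        mis = sorted((discrete_mi(latents[:, d], v)
                      for d in range(latents.shape[1])), reverse=True)
        gaps.append((mis[0] - mis[1]) / entropy(v))
    return float(np.mean(gaps))
```

A perfectly disentangled code (one latent equal to the factor, the rest independent noise) scores near 1; a code in which no latent is informative scores near 0.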
Related papers
- Multi-Modal Prompt Learning on Blind Image Quality Assessment [65.0676908930946]
Image Quality Assessment (IQA) models benefit significantly from semantic information, which allows them to treat different types of objects distinctly.
Traditional methods, hindered by a lack of sufficiently annotated data, have employed the CLIP image-text pretraining model as their backbone to gain semantic awareness.
Recent approaches have attempted to address this mismatch using prompt technology, but these solutions have shortcomings.
This paper introduces an innovative multi-modal prompt-based methodology for IQA.
arXiv Detail & Related papers (2024-04-23T11:45:32Z)
- Improving Vision Anomaly Detection with the Guidance of Language Modality [64.53005837237754]
This paper tackles the challenges for vision modality from a multimodal point of view.
We propose Cross-modal Guidance (CMG) to tackle the redundant information issue and sparse space issue.
To learn a more compact latent space for the vision anomaly detector, CMLE learns a correlation structure matrix from the language modality.
arXiv Detail & Related papers (2023-10-04T13:44:56Z)
- Weakly-supervised Contrastive Learning for Unsupervised Object Discovery [52.696041556640516]
Unsupervised object discovery is promising due to its ability to discover objects in a generic manner.
We design a semantic-guided self-supervised learning model to extract high-level semantic features from images.
We introduce Principal Component Analysis (PCA) to localize object regions.
arXiv Detail & Related papers (2023-07-07T04:03:48Z)
- Learning Prompt-Enhanced Context Features for Weakly-Supervised Video Anomaly Detection [37.99031842449251]
Video anomaly detection under weak supervision presents significant challenges.
We present a weakly supervised anomaly detection framework that focuses on efficient context modeling and enhanced semantic discriminability.
Our approach significantly improves the detection accuracy of certain anomaly sub-classes, underscoring its practical value and efficacy.
arXiv Detail & Related papers (2023-06-26T06:45:16Z)
- Variational Distillation for Multi-View Learning [104.17551354374821]
We design several variational information bottlenecks to exploit two key characteristics for multi-view representation learning.
Under rigorous theoretical guarantees, our approach enables IB to capture the intrinsic correlation between observations and semantic labels.
arXiv Detail & Related papers (2022-06-20T03:09:46Z)
- Generative and Contrastive Self-Supervised Learning for Graph Anomaly Detection [14.631674952942207]
We propose a novel method, Self-Supervised Learning for Graph Anomaly Detection (SL-GAD).
Our method constructs different contextual subgraphs based on a target node and employs two modules, generative attribute regression and multi-view contrastive learning for anomaly detection.
We conduct extensive experiments on six benchmark datasets and the results demonstrate that our method outperforms state-of-the-art methods by a large margin.
arXiv Detail & Related papers (2021-08-23T02:15:21Z)
- Proactive Pseudo-Intervention: Causally Informed Contrastive Learning For Interpretable Vision Models [103.64435911083432]
We present a novel contrastive learning strategy called Proactive Pseudo-Intervention (PPI).
PPI leverages proactive interventions to guard against image features with no causal relevance.
We also devise a novel causally informed salience mapping module to identify key image pixels to intervene, and show it greatly facilitates model interpretability.
arXiv Detail & Related papers (2020-12-06T20:30:26Z)
- Two-Level Adversarial Visual-Semantic Coupling for Generalized Zero-shot Learning [21.89909688056478]
We propose a new two-level joint idea to augment the generative network with an inference network during training.
This provides strong cross-modal interaction for effective transfer of knowledge between visual and semantic domains.
We evaluate our approach on four benchmark datasets against several state-of-the-art methods and demonstrate its effectiveness.
arXiv Detail & Related papers (2020-07-15T15:34:09Z)
- DisCont: Self-Supervised Visual Attribute Disentanglement using Context Vectors [6.385006149689549]
We propose a self-supervised framework DisCont to disentangle multiple attributes by exploiting the structural inductive biases within images.
Motivated by the recent surge in contrastive learning paradigms, our model bridges the gap between self-supervised contrastive learning algorithms and unsupervised disentanglement.
arXiv Detail & Related papers (2020-06-10T15:29:20Z)
- Graph Representation Learning via Graphical Mutual Information Maximization [86.32278001019854]
We propose a novel concept, Graphical Mutual Information (GMI), to measure the correlation between input graphs and high-level hidden representations.
We develop an unsupervised learning model trained by maximizing GMI between the input and output of a graph neural encoder.
arXiv Detail & Related papers (2020-02-04T08:33:49Z)