Robust Concept Erasure via Kernelized Rate-Distortion Maximization
- URL: http://arxiv.org/abs/2312.00194v1
- Date: Thu, 30 Nov 2023 21:10:44 GMT
- Title: Robust Concept Erasure via Kernelized Rate-Distortion Maximization
- Authors: Somnath Basu Roy Chowdhury, Nicholas Monath, Avinava Dubey, Amr Ahmed,
Snigdha Chaturvedi
- Abstract summary: We propose a new distance metric learning-based objective, the Kernelized Rate-Distortion Maximizer (KRaM).
KRaM fits a transformation of representations to match a specified distance measure (defined by a labeled concept to erase) using a modified rate-distortion function.
We find that KRaM effectively erases various types of concepts: categorical, continuous, and vector-valued variables from data representations across diverse domains.
- Score: 38.19696482602788
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Distributed representations provide a vector space that captures meaningful
relationships between data instances. The distributed nature of these
representations, however, entangles multiple attributes or concepts of data
instances (e.g., the topic or sentiment of a text, or characteristics of the
author, such as age and gender). Recent work has proposed the task of concept
erasure, in which rather than making a concept predictable, the goal is to
remove an attribute from distributed representations while retaining other
information from the original representation space as much as possible. In this
paper, we propose a new distance metric learning-based objective, the
Kernelized Rate-Distortion Maximizer (KRaM), for performing concept erasure.
KRaM fits a transformation of representations to match a specified distance
measure (defined by a labeled concept to erase) using a modified
rate-distortion function. Specifically, KRaM's objective function aims to make
instances with similar concept labels dissimilar in the learned representation
space while retaining other information. We find that optimizing KRaM
effectively erases various types of concepts: categorical, continuous, and
vector-valued variables from data representations across diverse domains. We
also provide a theoretical analysis of several properties of KRaM's objective.
To assess the quality of the learned representations, we propose an alignment
score to evaluate their similarity with the original representation space.
Additionally, we conduct experiments to showcase KRaM's efficacy in various
settings, from erasing binary gender variables in word embeddings to
vector-valued variables in GPT-3 representations.
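To ground the objective described above, here is a minimal, hedged sketch in PyTorch of a kernelized rate-distortion erasure loss in the spirit of KRaM. Everything below is an illustrative assumption rather than the paper's exact formulation: the function names (`coding_rate`, `label_kernel`, `kernelized_coding_rate`, `erasure_loss`, `linear_cka`), the RBF kernel over concept labels, the fidelity term, and the use of linear CKA as a stand-in for the paper's proposed alignment score.

```python
import torch

def coding_rate(Z: torch.Tensor, eps: float = 0.5) -> torch.Tensor:
    # Standard rate-distortion (coding rate) of a batch of representations:
    # R(Z) = 1/2 * logdet(I_d + d / (n * eps^2) * Z^T Z), with Z of shape (n, d).
    n, d = Z.shape
    return 0.5 * torch.logdet(
        torch.eye(d, device=Z.device) + (d / (n * eps**2)) * (Z.T @ Z)
    )

def label_kernel(y: torch.Tensor, sigma: float = 1.0) -> torch.Tensor:
    # RBF kernel over concept labels: K[i, j] is large when labels i and j
    # are similar. Handles categorical (float-encoded), continuous, and
    # vector-valued labels, i.e. y of shape (n,) or (n, k).
    y = y.float().reshape(len(y), -1)
    return torch.exp(-torch.cdist(y, y) ** 2 / (2 * sigma**2))

def kernelized_coding_rate(Z, K, eps=0.5):
    # Kernel-weighted coding rate: 1/2 * logdet(I + d/(n eps^2) * Z^T K Z).
    # With K built from concept labels, maximizing this term spreads
    # same-concept instances apart in the learned space.
    n, d = Z.shape
    M = torch.eye(d, device=Z.device) + (d / (n * eps**2)) * (Z.T @ K @ Z)
    return 0.5 * torch.logdet(M)

def erasure_loss(Z, Z0, y, eps=0.5, lam=1.0):
    # Illustrative erasure objective (NOT the paper's exact formula):
    # make same-label points mutually dissimilar by maximizing the
    # kernel-weighted rate, while a crude fidelity term keeps the
    # transformed space close to the original representations Z0.
    K = label_kernel(y)
    return -kernelized_coding_rate(Z, K, eps) + lam * ((Z - Z0) ** 2).mean()

def linear_cka(X, Y):
    # Linear CKA between two representation matrices of shape (n, d):
    # a standard representation-similarity measure, used here only as an
    # illustrative stand-in for the paper's proposed alignment score.
    X = X - X.mean(dim=0, keepdim=True)
    Y = Y - Y.mean(dim=0, keepdim=True)
    return (X.T @ Y).norm() ** 2 / ((X.T @ X).norm() * (Y.T @ Y).norm())

# Toy usage: erase a binary concept from random 64-d representations by
# fitting a linear transformation W, then check alignment with the original.
Z0 = torch.randn(256, 64)
y = torch.randint(0, 2, (256,))
W = torch.eye(64, requires_grad=True)
opt = torch.optim.Adam([W], lr=1e-2)
for _ in range(200):
    opt.zero_grad()
    loss = erasure_loss(Z0 @ W, Z0, y)
    loss.backward()
    opt.step()
print("alignment (linear CKA):", float(linear_cka((Z0 @ W).detach(), Z0)))
```

The paper defines the transformation, kernel, and alignment score more carefully; this sketch only illustrates the core idea that a rate-distortion-style logdet term, weighted by a label-derived kernel, can push same-concept instances apart while a separate term preserves the rest of the geometry.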
Related papers
- Separating common from salient patterns with Contrastive Representation
Learning [2.250968907999846]
Contrastive Analysis aims at separating common factors of variation between two datasets.
Current models based on Variational Auto-Encoders have shown poor performance in learning semantically-expressive representations.
We propose to leverage the ability of Contrastive Learning to learn semantically expressive representations well adapted for Contrastive Analysis.
arXiv Detail & Related papers (2024-02-19T08:17:13Z)
- Dynamic Visual Semantic Sub-Embeddings and Fast Re-Ranking [0.5242869847419834]
We propose a Dynamic Visual Semantic Sub-Embeddings framework (DVSE) to reduce the information entropy.
To encourage the generated candidate embeddings to capture various semantic variations, we construct a mixed distribution.
We compare performance with existing set-based methods using four image feature encoders and two text feature encoders on three benchmark datasets.
arXiv Detail & Related papers (2023-09-15T04:39:11Z)
- KMF: Knowledge-Aware Multi-Faceted Representation Learning for Zero-Shot Node Classification [75.95647590619929]
Zero-Shot Node Classification (ZNC) is an emerging and crucial task in graph data analysis.
We propose a Knowledge-Aware Multi-Faceted framework (KMF) that enhances the richness of label semantics.
A novel geometric constraint is developed to alleviate the problem of prototype drift caused by node information aggregation.
arXiv Detail & Related papers (2023-08-15T02:38:08Z)
- An Integral Projection-based Semantic Autoencoder for Zero-Shot Learning [0.46644955105516456]
Zero-shot Learning (ZSL) classification categorizes or predicts classes (labels) that are not included in the training set (unseen classes).
Recent works proposed different semantic autoencoder (SAE) models where the encoder embeds a visual feature space into the semantic space and the decoder reconstructs the original visual feature space.
We propose an integral projection-based semantic autoencoder (IP-SAE) where an encoder projects a visual feature space concatenated with the semantic space into a latent representation space.
arXiv Detail & Related papers (2023-06-26T12:06:20Z)
- Reflection Invariance Learning for Few-shot Semantic Segmentation [53.20466630330429]
Few-shot semantic segmentation (FSS) aims to segment objects of unseen classes in query images with only a few annotated support images.
This paper proposes a fresh few-shot segmentation framework to mine the reflection invariance in a multi-view matching manner.
Experiments on both PASCAL-$5^i$ and COCO-$20^i$ datasets demonstrate the effectiveness of our approach.
arXiv Detail & Related papers (2023-06-01T15:14:58Z)
- Measuring the Interpretability of Unsupervised Representations via Quantized Reverse Probing [97.70862116338554]
We investigate the problem of measuring interpretability of self-supervised representations.
We formulate interpretability as estimating the mutual information between the representation and a space of manually labelled concepts.
We use our method to evaluate a large number of self-supervised representations, ranking them by interpretability.
arXiv Detail & Related papers (2022-09-07T16:18:50Z)
- VAE-CE: Visual Contrastive Explanation using Disentangled VAEs [3.5027291542274357]
We propose Variational Autoencoder-based Contrastive Explanation (VAE-CE).
We build the model using a disentangled VAE, extended with a new supervised method for disentangling individual dimensions.
An analysis on synthetic data and MNIST shows that the approaches to both disentanglement and explanation provide benefits over other methods.
arXiv Detail & Related papers (2021-08-20T13:15:24Z) - Instance-Level Relative Saliency Ranking with Graph Reasoning [126.09138829920627]
We present a novel unified model to segment salient instances and infer relative saliency rank order.
A novel loss function is also proposed to effectively train the saliency ranking branch.
Experimental results demonstrate that our proposed model is more effective than previous methods.
arXiv Detail & Related papers (2021-07-08T13:10:42Z) - Learning Disentangled Representations with Latent Variation
Predictability [102.4163768995288]
This paper defines the variation predictability of latent disentangled representations.
Within an adversarial generation process, we encourage variation predictability by maximizing the mutual information between latent variations and corresponding image pairs.
We develop an evaluation metric that does not rely on the ground-truth generative factors to measure the disentanglement of latent representations.
arXiv Detail & Related papers (2020-07-25T08:54:26Z)