Text Counterfactuals via Latent Optimization and Shapley-Guided Search
- URL: http://arxiv.org/abs/2110.11589v1
- Date: Fri, 22 Oct 2021 05:04:40 GMT
- Title: Text Counterfactuals via Latent Optimization and Shapley-Guided Search
- Authors: Quintin Pope, Xiaoli Z. Fern
- Abstract summary: We study the problem of generating counterfactual text for a classification model.
We aim to minimally alter the text to change the model's prediction.
White-box approaches have been successfully applied to similar problems in vision.
- Score: 15.919650185010491
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We study the problem of generating counterfactual text for a classifier as a
means for understanding and debugging classification. Given a textual input and
a classification model, we aim to minimally alter the text to change the
model's prediction. White-box approaches have been successfully applied to
similar problems in vision where one can directly optimize the continuous
input. Optimization-based approaches become difficult in the language domain
due to the discrete nature of text. We bypass this issue by directly optimizing
in the latent space and leveraging a language model to generate candidate
modifications from optimized latent representations. We additionally use
Shapley values to estimate the combinatorial effect of multiple changes. We then
use these estimates to guide a beam search for the final counterfactual text.
We achieve favorable performance compared to recent white-box and black-box
baselines using human and automatic evaluations. Ablation studies show that
both latent optimization and the use of Shapley values improve success rate and
the quality of the generated counterfactuals.
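As a reading aid, here is a minimal, self-contained Python sketch of how the two stages described in the abstract could interact. It assumes a toy flip-score function in place of the real classifier and a fixed pool of candidate edits in place of language-model proposals; the function names and beam-search details are illustrative, not the authors' implementation.
```python
import random

def flip_score(applied):
    """Stand-in for P(target class | text after applying `applied` edits).
    A real implementation would edit the input text and query the model."""
    base = {"great->bad": 0.5, "love->hate": 0.3, "fun->dull": 0.15}
    return min(sum(base[e] for e in applied), 1.0)

def shapley_estimates(edits, n_perms=200, seed=0):
    """Monte Carlo permutation estimate of each edit's Shapley value."""
    rng = random.Random(seed)
    phi = {e: 0.0 for e in edits}
    for _ in range(n_perms):
        perm = edits[:]
        rng.shuffle(perm)
        applied, prev = [], flip_score([])
        for e in perm:
            applied.append(e)
            cur = flip_score(applied)
            phi[e] += (cur - prev) / n_perms  # marginal contribution
            prev = cur
    return phi

def beam_search(edits, phi, beam_width=2, threshold=0.5):
    """Grow edit sets ranked by summed Shapley estimates until the
    flip score crosses the decision threshold."""
    beams = [()]
    while beams:
        expanded = [b + (e,) for b in beams for e in edits if e not in b]
        expanded.sort(key=lambda b: sum(phi[e] for e in b), reverse=True)
        beams = expanded[:beam_width]
        for b in beams:
            if flip_score(list(b)) >= threshold:
                return b  # small edit set that flips the prediction
    return None

edits = ["great->bad", "love->hate", "fun->dull"]
phi = shapley_estimates(edits)
print("Shapley estimates:", phi)
print("counterfactual edits:", beam_search(edits, phi))
```
One design point worth noting: because Shapley values capture an edit's average marginal contribution across contexts, ranking candidate sets by summed estimates gives the beam search a cheap proxy for the combinatorial effect of applying several edits at once.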
Related papers
- Predicting from Strings: Language Model Embeddings for Bayesian Optimization [21.370382766970877]
We propose Embed-then-Regress, a paradigm for applying in-context regression over string inputs.
By expressing all inputs as strings, we are able to perform general-purpose regression for optimization over various domains.
arXiv Detail & Related papers (2024-10-14T06:22:11Z)
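The one-line summary above leaves the mechanics implicit, so here is a hedged sketch of a generic embed-then-regress loop: it substitutes a hashed character-trigram embedder and ridge regression for the paper's language-model embeddings and in-context regressor, and every name in it is illustrative.
```python
import zlib
import numpy as np

def embed(s, dim=64):
    """Toy string embedding: hashed character-trigram counts."""
    v = np.zeros(dim)
    for i in range(len(s) - 2):
        v[zlib.crc32(s[i:i + 3].encode()) % dim] += 1.0
    return v

def fit_ridge(X, y, lam=1e-2):
    d = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ y)

# Observed (string, score) pairs from an expensive black-box objective.
history = [("config_lr=0.1", 0.42), ("config_lr=0.01", 0.71),
           ("config_lr=0.001", 0.55)]
X = np.stack([embed(s) for s, _ in history])
w = fit_ridge(X, np.array([v for _, v in history]))

# Rank unevaluated candidates by the surrogate's predicted score.
candidates = ["config_lr=0.02", "config_lr=0.005", "config_lr=0.3"]
best = max(candidates, key=lambda s: embed(s) @ w)
print("next candidate to evaluate:", best)
```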
- Confidence-aware Reward Optimization for Fine-tuning Text-to-Image Models [85.96013373385057]
Fine-tuning text-to-image models with reward functions trained on human feedback data has proven effective for aligning model behavior with human intent.
However, excessive optimization with such reward models, which serve as mere proxy objectives, can compromise the performance of fine-tuned models.
We propose TextNorm, a method that enhances alignment based on a measure of reward model confidence estimated across a set of semantically contrastive text prompts.
arXiv Detail & Related papers (2024-04-02T11:40:38Z)
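Based only on the summary above, the following speculative sketch shows one way reward-model confidence could be estimated across semantically contrastive prompts; the proxy_reward stub, the sigmoid weighting, and the prompts are all assumptions rather than the paper's TextNorm procedure.
```python
import math

def proxy_reward(image, prompt):
    """Stand-in for a learned reward model r(image, prompt)."""
    return {"a red cube": 0.9, "a blue cube": 0.4, "a red sphere": 0.5}[prompt]

def confidence_normalized_reward(image, prompt, contrastive_prompts):
    r = proxy_reward(image, prompt)
    # Margin over the best contrastive prompt: how clearly the true prompt wins.
    margin = r - max(proxy_reward(image, p) for p in contrastive_prompts)
    confidence = 1.0 / (1.0 + math.exp(-margin))
    return r * confidence  # shrink the reward when confidence is low

print(confidence_normalized_reward(
    image=None, prompt="a red cube",
    contrastive_prompts=["a blue cube", "a red sphere"]))
```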
- Data-driven Prior Learning for Bayesian Optimisation [5.199765487172328]
We validate the learned priors and compare to a breadth of transfer learning approaches.
We show that PLeBO and prior transfer find good inputs in fewer evaluations.
arXiv Detail & Related papers (2023-11-24T18:37:52Z)
- RegaVAE: A Retrieval-Augmented Gaussian Mixture Variational Auto-Encoder for Language Modeling [79.56442336234221]
We introduce RegaVAE, a retrieval-augmented language model built upon the variational auto-encoder (VAE).
It encodes the text corpus into a latent space, capturing current and future information from both source and target text.
Experimental results on various datasets demonstrate significant improvements in text generation quality and hallucination removal.
arXiv Detail & Related papers (2023-10-16T16:42:01Z)
- Language Model Decoding as Direct Metrics Optimization [87.68281625776282]
Current decoding methods struggle to generate texts that align with human texts across different aspects.
In this work, we frame decoding from a language model as an optimization problem with the goal of strictly matching the expected performance with human texts.
We prove that this induced distribution is guaranteed to improve the perplexity on human texts, which suggests a better approximation to the underlying distribution of human texts.
arXiv Detail & Related papers (2023-10-02T09:35:27Z)
- LRANet: Towards Accurate and Efficient Scene Text Detection with Low-Rank Approximation Network [63.554061288184165]
We propose a novel parameterized text shape method based on low-rank approximation.
By exploring the shape correlation among different text contours, our method achieves consistency, compactness, simplicity, and robustness in shape representation.
We implement an accurate and efficient arbitrary-shaped text detector named LRANet.
arXiv Detail & Related papers (2023-06-27T02:03:46Z)
- Adaptive Meta-learner via Gradient Similarity for Few-shot Text Classification [11.035878821365149]
We propose a novel Adaptive Meta-learner via Gradient Similarity (AMGS) to improve a model's ability to generalize to new tasks.
Experimental results on several benchmarks demonstrate that the proposed AMGS consistently improves few-shot text classification performance.
arXiv Detail & Related papers (2022-09-10T16:14:53Z)
- Fourier Representations for Black-Box Optimization over Categorical Variables [34.0277529502051]
We propose to use existing methods in conjunction with a surrogate model for the black-box evaluations over purely categorical variables.
To learn such representations, we consider two different settings to update our surrogate model.
Numerical experiments over synthetic benchmarks as well as real-world RNA sequence optimization and design problems demonstrate the representational power of the proposed methods.
arXiv Detail & Related papers (2022-02-08T08:14:58Z)
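To make the idea of a Fourier surrogate over categorical variables concrete, here is a small sketch for the binary case, assuming only the summary above: candidates live in {-1, +1}^n, low-order monomials serve as the Fourier (Walsh) basis, and least squares fits the surrogate. The toy objective and all names are illustrative.
```python
import itertools
import numpy as np

def fourier_features(x, max_order=2):
    """All monomials prod_{i in S} x_i with |S| <= max_order."""
    n = len(x)
    feats = [1.0]
    for k in range(1, max_order + 1):
        for S in itertools.combinations(range(n), k):
            feats.append(np.prod([x[i] for i in S]))
    return np.array(feats)

rng = np.random.default_rng(0)
def black_box(x):  # toy objective over {-1, +1}^4
    return x[0] * x[1] - 0.5 * x[2] + 0.1 * x[3] * x[0]

X = rng.choice([-1.0, 1.0], size=(30, 4))         # evaluated designs
y = np.array([black_box(x) for x in X])
F = np.stack([fourier_features(x) for x in X])
coef, *_ = np.linalg.lstsq(F, y, rcond=None)      # surrogate coefficients

x_new = np.array([1.0, 1.0, -1.0, 1.0])
print("surrogate prediction:", fourier_features(x_new) @ coef,
      "true value:", black_box(x_new))
```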
- Bridge the Gap Between CV and NLP! A Gradient-based Textual Adversarial Attack Framework [17.17479625646699]
We propose a unified framework to craft textual adversarial samples.
In this paper, we instantiate our framework with an attack algorithm named Textual Projected Gradient Descent (T-PGD).
arXiv Detail & Related papers (2021-10-28T17:31:51Z)
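The summary above names the algorithm but not its steps, so this illustrative sketch shows the generic pattern of a projected-gradient textual attack: ascend the loss in embedding space, project onto a norm ball, and snap back to the nearest vocabulary token. The toy vocabulary, linear classifier, and hyperparameters are assumptions, not the T-PGD algorithm itself.
```python
import numpy as np

rng = np.random.default_rng(0)
E = rng.normal(size=(20, 8))   # toy vocabulary embedding matrix
w = rng.normal(size=8)         # toy linear classifier: score = w @ e

def attack(token_id, steps=10, lr=0.5, eps=2.0):
    e0 = E[token_id]
    e = e0.copy()
    for _ in range(steps):
        e = e - lr * w                         # descend the class score
        delta = np.clip(e - e0, -eps, eps)     # project onto the L-inf ball
        e = e0 + delta
    # Snap back to the nearest real token (the discrete projection step).
    return int(np.argmin(np.linalg.norm(E - e, axis=1)))

adv = attack(token_id=3)
print("original score:", E[3] @ w, "adversarial score:", E[adv] @ w)
```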
- Local policy search with Bayesian optimization [73.0364959221845]
Reinforcement learning aims to find an optimal policy by interacting with an environment.
Policy gradients for local search are often obtained from random perturbations.
We develop an algorithm utilizing a probabilistic model of the objective function and its gradient.
arXiv Detail & Related papers (2021-06-22T16:07:02Z)
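As a rough illustration of the summary above, the sketch below estimates a local gradient from random perturbations with a probabilistic (Bayesian linear) surrogate and steps along the posterior mean; the paper's method is Gaussian-process based, so treat this as a simplified stand-in with illustrative names.
```python
import numpy as np

rng = np.random.default_rng(0)
def objective(theta):  # toy stand-in for a policy's expected return
    return -np.sum((theta - 1.0) ** 2)

theta = np.zeros(3)
for step in range(50):
    deltas = rng.normal(scale=0.1, size=(16, 3))   # random perturbations
    y = np.array([objective(theta + d) for d in deltas])
    # Posterior mean of a Bayesian linear model y ~ deltas @ g + noise:
    lam = 1e-3
    g = np.linalg.solve(deltas.T @ deltas + lam * np.eye(3),
                        deltas.T @ (y - y.mean()))
    theta = theta + 0.5 * g                        # ascend the estimated gradient
print("theta after local search:", theta, "objective:", objective(theta))
```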
- Stochastic Optimization Forests [60.523606291705214]
We show how to train forest decision policies by growing trees that choose splits to directly optimize the downstream decision quality, rather than splitting to improve prediction accuracy as in the standard random forest algorithm.
We show that our approximate splitting criteria can reduce running time hundredfold, while achieving performance close to forest algorithms that exactly re-optimize for every candidate split.
arXiv Detail & Related papers (2020-08-17T16:56:06Z)
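A toy sketch of the split criterion described above, under stated assumptions: each candidate split is scored by the value of the best downstream decision it enables in each leaf, rather than by prediction error. The data-generating process and the two-action setup are illustrative.
```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.uniform(size=100)  # one feature
# Reward of each of two actions depends on the feature regime.
rewards = np.stack(
    [np.where(X < 0.5, 1.0, 0.0) + rng.normal(0, 0.1, 100),
     np.where(X < 0.5, 0.0, 1.0) + rng.normal(0, 0.1, 100)], axis=1)

def decision_value(idx):
    """Value of the best single action for the points in idx."""
    return rewards[idx].mean(axis=0).max() * len(idx)

best = None
for s in np.linspace(0.1, 0.9, 17):  # candidate split thresholds
    left, right = np.where(X < s)[0], np.where(X >= s)[0]
    if len(left) == 0 or len(right) == 0:
        continue
    value = decision_value(left) + decision_value(right)
    if best is None or value > best[1]:
        best = (s, value)
print("decision-optimal split at x =", best[0])
```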
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information above and is not responsible for any consequences arising from its use.