Robust Embeddings Via Distributions
- URL: http://arxiv.org/abs/2104.08420v1
- Date: Sat, 17 Apr 2021 02:02:36 GMT
- Title: Robust Embeddings Via Distributions
- Authors: Kira A. Selby (1), Yinong Wang (1), Ruizhe Wang (1), Peyman Passban
(2), Ahmad Rashid (2), Mehdi Rezagholizadeh (2) and Pascal Poupart (1) ((1)
University of Waterloo, (2) Huawei Noah's Ark Lab)
- Abstract summary: We propose a novel probabilistic embedding-level method to improve the robustness of NLP models.
Our method, Robust Embeddings via Distributions (RED), incorporates information from both noisy tokens and surrounding context to obtain distributions over embedding vectors.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Despite recent monumental advances in the field, many Natural Language
Processing (NLP) models still struggle to perform adequately on noisy domains.
We propose a novel probabilistic embedding-level method to improve the
robustness of NLP models. Our method, Robust Embeddings via Distributions
(RED), incorporates information from both noisy tokens and surrounding context
to obtain distributions over embedding vectors that can express uncertainty in
semantic space more fully than any deterministic method. We evaluate our method
on a number of downstream tasks using existing state-of-the-art models in the
presence of both natural and synthetic noise, and demonstrate a clear
improvement over other embedding approaches to robustness from the literature.
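The abstract's core idea is to replace a single deterministic embedding vector with a distribution over vectors, whose uncertainty reflects token noise. A minimal sketch of that idea, assuming a toy Gaussian parameterization (the function `distributional_embedding`, the mixing weights, and the variance values are all hypothetical illustrations, not the RED method from the paper):

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy vocabulary of deterministic word embeddings (dimension 4).
vocab = {
    "good": rng.normal(size=4),
    "movie": rng.normal(size=4),
}

def distributional_embedding(token, context_vec, noise_scale=0.1):
    """Return a (mean, variance) Gaussian over embedding vectors.

    Hypothetical illustration: the mean mixes the token embedding with
    a context vector, and the variance is large for out-of-vocabulary
    (possibly misspelled/noisy) tokens, expressing uncertainty in
    semantic space instead of committing to one point.
    """
    base = vocab.get(token)
    if base is None:
        # Unknown / noisy token: fall back on context, high uncertainty.
        mean = context_vec.copy()
        var = np.full_like(context_vec, 1.0)
    else:
        mean = 0.5 * base + 0.5 * context_vec
        var = np.full_like(mean, noise_scale)
    return mean, var

# A misspelled token gets a context-anchored mean and wide variance;
# a downstream model could sample from this Gaussian instead of using
# a fixed vector.
context = vocab["good"]
mean, var = distributional_embedding("goood", context)
sample = mean + np.sqrt(var) * rng.normal(size=mean.shape)
```

The point of the sketch is only the interface: noisy tokens yield broader distributions than clean ones, which is the kind of uncertainty a deterministic embedding lookup cannot express.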
Related papers
- Language Rectified Flow: Advancing Diffusion Language Generation with Probabilistic Flows [53.31856123113228]
This paper proposes Language Rectified Flow.
Our method is based on the reformulation of the standard probabilistic flow models.
Experiments and ablation studies demonstrate that our method can be general, effective, and beneficial for many NLP tasks.
arXiv Detail & Related papers (2024-03-25T17:58:22Z)
- TaCo: Targeted Concept Removal in Output Embeddings for NLP via Information Theory and Explainability [4.2560452339165895]
Information theory indicates that a model should not be able to predict sensitive variables, such as gender, ethnicity, and age.
We present a novel approach that operates at the embedding level of an NLP model.
We show that the proposed post-hoc approach significantly reduces gender-related associations in NLP models.
arXiv Detail & Related papers (2023-12-11T16:22:37Z)
- Implicit Variational Inference for High-Dimensional Posteriors [7.924706533725115]
In variational inference, the benefits of Bayesian models rely on accurately capturing the true posterior distribution.
We propose using neural samplers that specify implicit distributions, which are well-suited for approximating complex multimodal and correlated posteriors.
Our approach introduces novel bounds for approximate inference using implicit distributions by locally linearising the neural sampler.
arXiv Detail & Related papers (2023-10-10T14:06:56Z)
- Observation-Guided Diffusion Probabilistic Models [41.749374023639156]
We propose a novel diffusion-based image generation method called the observation-guided diffusion probabilistic model (OGDM)
Our approach reestablishes the training objective by integrating the guidance of the observation process with the Markov chain.
We demonstrate the effectiveness of our training algorithm using diverse inference techniques on strong diffusion model baselines.
arXiv Detail & Related papers (2023-10-06T06:29:06Z)
- Distributionally Robust Model-based Reinforcement Learning with Large State Spaces [55.14361269378122]
Three major challenges in reinforcement learning are complex dynamical systems with large state spaces, costly data acquisition, and the deviation of real-world dynamics from the training environment at deployment.
We study distributionally robust Markov decision processes with continuous state spaces under the widely used Kullback-Leibler, chi-square, and total variation uncertainty sets.
We propose a model-based approach that utilizes Gaussian Processes and the maximum variance reduction algorithm to efficiently learn multi-output nominal transition dynamics.
arXiv Detail & Related papers (2023-09-05T13:42:11Z)
- A Cheaper and Better Diffusion Language Model with Soft-Masked Noise [62.719656543880596]
Masked-Diffuse LM is a novel diffusion model for language modeling, inspired by linguistic features in languages.
Specifically, we design a linguistically-informed forward process that corrupts the text through strategic soft-masking to better noise the textual data.
We demonstrate that our Masked-Diffuse LM can achieve better generation quality than the state-of-the-art diffusion models with better efficiency.
arXiv Detail & Related papers (2023-04-10T17:58:42Z)
- Tailoring Language Generation Models under Total Variation Distance [55.89964205594829]
The standard paradigm of neural language generation adopts maximum likelihood estimation (MLE) as the optimizing method.
We develop practical bounds to apply it to language generation.
We introduce the TaiLr objective that balances the tradeoff of estimating TVD.
arXiv Detail & Related papers (2023-02-26T16:32:52Z)
- Explaining text classifiers through progressive neighborhood approximation with realistic samples [19.26084350822197]
The importance of neighborhood construction in local explanation methods has been highlighted in the literature.
Several attempts have been made to improve neighborhood quality for high-dimensional data, for example, texts, by adopting generative models.
We propose a progressive approximation approach that refines the neighborhood of a to-be-explained decision in a careful two-stage process.
arXiv Detail & Related papers (2023-02-11T11:42:39Z)
- Multi-View Knowledge Distillation from Crowd Annotations for Out-of-Domain Generalization [53.24606510691877]
We propose new methods for acquiring soft-labels from crowd-annotations by aggregating the distributions produced by existing methods.
We demonstrate that these aggregation methods lead to the most consistent performance across four NLP tasks on out-of-domain test sets.
arXiv Detail & Related papers (2022-12-19T12:40:18Z)
- Obtaining Better Static Word Embeddings Using Contextual Embedding Models [53.86080627007695]
Our proposed distillation method is a simple extension of CBOW-based training.
As a side-effect, our approach also allows a fair comparison of both contextual and static embeddings.
arXiv Detail & Related papers (2021-06-08T12:59:32Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.