Robust Embeddings Via Distributions
- URL: http://arxiv.org/abs/2104.08420v1
- Date: Sat, 17 Apr 2021 02:02:36 GMT
- Title: Robust Embeddings Via Distributions
- Authors: Kira A. Selby (1), Yinong Wang (1), Ruizhe Wang (1), Peyman Passban
(2), Ahmad Rashid (2), Mehdi Rezagholizadeh (2) and Pascal Poupart (1) ((1)
University of Waterloo, (2) Huawei Noah's Ark Lab)
- Abstract summary: We propose a novel probabilistic embedding-level method to improve the robustness of NLP models.
Our method, Robust Embeddings via Distributions (RED), incorporates information from both noisy tokens and surrounding context to obtain distributions over embedding vectors.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Despite recent monumental advances in the field, many Natural Language
Processing (NLP) models still struggle to perform adequately on noisy domains.
We propose a novel probabilistic embedding-level method to improve the
robustness of NLP models. Our method, Robust Embeddings via Distributions
(RED), incorporates information from both noisy tokens and surrounding context
to obtain distributions over embedding vectors that can express uncertainty in
semantic space more fully than any deterministic method. We evaluate our method
on a number of downstream tasks using existing state-of-the-art models in the
presence of both natural and synthetic noise, and demonstrate a clear
improvement over other embedding approaches to robustness from the literature.
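To make the idea concrete, below is a minimal, hypothetical sketch of an embedding-level approach in this spirit: a module that combines a possibly noisy token embedding with a summary of its surrounding context and outputs a diagonal Gaussian over embedding space rather than a single vector, so uncertainty can be expressed and sampled. The module names, the Gaussian parameterisation, and the fusion architecture are illustrative assumptions, not the paper's exact formulation.

```python
# Hypothetical sketch of the core idea: predict a distribution over embedding
# space, conditioned on the (possibly noisy) token embedding and a context
# summary, instead of a single deterministic vector per token.
import torch
import torch.nn as nn


class DistributionalEmbedding(nn.Module):
    def __init__(self, embed_dim: int, context_dim: int, hidden_dim: int = 256):
        super().__init__()
        # Fuse the noisy token embedding with its context summary.
        self.fuse = nn.Sequential(
            nn.Linear(embed_dim + context_dim, hidden_dim),
            nn.ReLU(),
        )
        # Separate heads for the mean and log-variance of the embedding distribution.
        self.mean_head = nn.Linear(hidden_dim, embed_dim)
        self.logvar_head = nn.Linear(hidden_dim, embed_dim)

    def forward(self, token_emb: torch.Tensor, context: torch.Tensor):
        h = self.fuse(torch.cat([token_emb, context], dim=-1))
        mean = self.mean_head(h)
        std = torch.exp(0.5 * self.logvar_head(h))  # larger std = more uncertainty
        return torch.distributions.Normal(mean, std)


# Usage: sample an embedding (reparameterised, so gradients flow) or take the
# mean; a downstream model consumes the sample like a standard embedding.
if __name__ == "__main__":
    layer = DistributionalEmbedding(embed_dim=300, context_dim=300)
    noisy_token = torch.randn(8, 300)   # e.g. embedding of a misspelled token
    context_vec = torch.randn(8, 300)   # e.g. mean of neighbouring embeddings
    dist = layer(noisy_token, context_vec)
    emb_sample = dist.rsample()         # shape: (8, 300)
```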
Related papers
- Latent Guided Sampling for Combinatorial Optimization [3.636090511738153]
Recent Combinatorial Optimization methods leverage deep learning to learn solution strategies, trained via Supervised or Reinforcement Learning (RL).
While promising, these approaches often rely on task-specific augmentations, perform poorly on out-of-distribution instances, and lack robust inference mechanisms.
In this work, we propose LGS-Net, a novel latent space model that conditions on problem instances, and introduce an efficient neural inference method, Latent Guided Sampling (LGS).
arXiv Detail & Related papers (2025-06-04T08:02:59Z) - DIDiffGes: Decoupled Semi-Implicit Diffusion Models for Real-time Gesture Generation from Speech [42.663766380488205]
DIDiffGes can synthesize high-quality, expressive gestures from speech using only a few sampling steps.
Our method outperforms state-of-the-art approaches in human likeness, appropriateness, and style correctness.
arXiv Detail & Related papers (2025-03-21T11:23:39Z) - Refining Sentence Embedding Model through Ranking Sentences Generation with Large Language Models [60.00178316095646]
Sentence embedding is essential for many NLP tasks, with contrastive learning methods achieving strong performance using datasets like NLI.
Recent studies leverage large language models (LLMs) to generate sentence pairs, reducing annotation dependency.
We propose a method for controlling the generation direction of LLMs in the latent space. Unlike unconstrained generation, the controlled approach ensures meaningful semantic divergence.
Experiments on multiple benchmarks demonstrate that our method achieves new SOTA performance with a modest cost in ranking sentence synthesis.
arXiv Detail & Related papers (2025-02-19T12:07:53Z) - Uncertainty Quantification for LLMs through Minimum Bayes Risk: Bridging Confidence and Consistency [66.96286531087549]
Uncertainty quantification (UQ) methods for Large Language Models (LLMs) encompass a variety of approaches.
We propose a novel approach to integrating model confidence with output consistency, resulting in a family of efficient and robust UQ methods.
We evaluate our approach across various tasks such as question answering, abstractive summarization, and machine translation.
arXiv Detail & Related papers (2025-02-07T14:30:12Z) - An Effective Deployment of Diffusion LM for Data Augmentation in Low-Resource Sentiment Classification [2.0930389307057427]
Sentiment classification (SC) often suffers from low-resource challenges such as domain-specific contexts, imbalanced label distributions, and few-shot scenarios.
We propose Diffusion LM to capture in-domain knowledge and generate pseudo samples by reconstructing strong label-related tokens.
arXiv Detail & Related papers (2024-09-05T02:51:28Z) - Language Rectified Flow: Advancing Diffusion Language Generation with Probabilistic Flows [53.31856123113228]
This paper proposes Language Rectified Flow.
Our method is based on the reformulation of the standard probabilistic flow models.
Experiments and ablation studies demonstrate that our method can be general, effective, and beneficial for many NLP tasks.
arXiv Detail & Related papers (2024-03-25T17:58:22Z) - Implicit Variational Inference for High-Dimensional Posteriors [7.924706533725115]
In variational inference, the benefits of Bayesian models rely on accurately capturing the true posterior distribution.
We propose using neural samplers that specify implicit distributions, which are well-suited for approximating complex multimodal and correlated posteriors.
Our approach introduces novel bounds for approximate inference using implicit distributions by locally linearising the neural sampler.
arXiv Detail & Related papers (2023-10-10T14:06:56Z) - Observation-Guided Diffusion Probabilistic Models [41.749374023639156]
We propose a novel diffusion-based image generation method called the observation-guided diffusion probabilistic model (OGDM)
Our approach reestablishes the training objective by integrating the guidance of the observation process with the Markov chain.
We demonstrate the effectiveness of our training algorithm using diverse inference techniques on strong diffusion model baselines.
arXiv Detail & Related papers (2023-10-06T06:29:06Z) - Distributionally Robust Model-based Reinforcement Learning with Large
State Spaces [55.14361269378122]
Three major challenges in reinforcement learning are complex dynamical systems with large state spaces, costly data acquisition processes, and the deviation of real-world dynamics from the training environment at deployment time.
We study distributionally robust Markov decision processes with continuous state spaces under the widely used Kullback-Leibler, chi-square, and total variation uncertainty sets.
We propose a model-based approach that utilizes Gaussian Processes and the maximum variance reduction algorithm to efficiently learn multi-output nominal transition dynamics.
arXiv Detail & Related papers (2023-09-05T13:42:11Z) - A Cheaper and Better Diffusion Language Model with Soft-Masked Noise [62.719656543880596]
Masked-Diffuse LM is a novel diffusion model for language modeling, inspired by linguistic features in languages.
Specifically, we design a linguistic-informed forward process which adds corruptions to the text through strategically soft-masking to better noise the textual data.
We demonstrate that our Masked-Diffuse LM can achieve better generation quality than the state-of-the-art diffusion models with better efficiency.
arXiv Detail & Related papers (2023-04-10T17:58:42Z) - Tailoring Language Generation Models under Total Variation Distance [55.89964205594829]
The standard paradigm of neural language generation adopts maximum likelihood estimation (MLE) as the optimizing method.
We develop practical bounds to apply it to language generation.
We introduce the TaiLr objective that balances the tradeoff of estimating TVD.
arXiv Detail & Related papers (2023-02-26T16:32:52Z) - Explaining text classifiers through progressive neighborhood
approximation with realistic samples [19.26084350822197]
The importance of neighborhood construction in local explanation methods has been highlighted in the literature.
Several attempts have been made to improve neighborhood quality for high-dimensional data, for example, texts, by adopting generative models.
We propose a progressive approximation approach that refines the neighborhood of a to-be-explained decision with a careful two-stage approach.
arXiv Detail & Related papers (2023-02-11T11:42:39Z) - Multi-View Knowledge Distillation from Crowd Annotations for
Out-of-Domain Generalization [53.24606510691877]
We propose new methods for acquiring soft-labels from crowd-annotations by aggregating the distributions produced by existing methods.
We demonstrate that these aggregation methods lead to the most consistent performance across four NLP tasks on out-of-domain test sets.
arXiv Detail & Related papers (2022-12-19T12:40:18Z) - Obtaining Better Static Word Embeddings Using Contextual Embedding
Models [53.86080627007695]
Our proposed distillation method is a simple extension of CBOW-based training.
As a side-effect, our approach also allows a fair comparison of both contextual and static embeddings.
arXiv Detail & Related papers (2021-06-08T12:59:32Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.