Related papers: A Cluster-based Approach for Improving Isotropy in Contextual Embedding Space

URL: http://arxiv.org/abs/2106.01183v1
Date: Wed, 2 Jun 2021 14:26:37 GMT
Title: A Cluster-based Approach for Improving Isotropy in Contextual Embedding Space
Authors: Sara Rajaee and Mohammad Taher Pilehvar
Abstract summary: The representation degeneration problem in Contextual Word Representations (CWRs) hurts the expressiveness of the embedding space. We propose a local cluster-based method to address the degeneration issue in contextual embedding spaces. We show that removing dominant directions of verb representations can transform the space to better suit semantic applications.
Score: 18.490856440975996
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: The representation degeneration problem in Contextual Word Representations (CWRs) hurts the expressiveness of the embedding space by forming an anisotropic cone where even unrelated words have excessively positive correlations. Existing techniques for tackling this issue require a learning process to re-train models with additional objectives and mostly employ a global assessment to study isotropy. Our quantitative analysis over isotropy shows that a local assessment could be more accurate due to the clustered structure of CWRs. Based on this observation, we propose a local cluster-based method to address the degeneration issue in contextual embedding spaces. We show that in clusters including punctuations and stop words, local dominant directions encode structural information, removing which can improve CWRs performance on semantic tasks. Moreover, we find that tense information in verb representations dominates sense semantics. We show that removing dominant directions of verb representations can transform the space to better suit semantic applications. Our experiments demonstrate that the proposed cluster-based method can mitigate the degeneration problem on multiple tasks.
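The cluster-based idea in the abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation. It assumes k-means for clustering and an SVD-based (PCA-style) estimate of each cluster's dominant directions, neither of which is stated in the abstract. The function names, the number of clusters, and the number of removed directions are hypothetical choices.

```python
import numpy as np


def kmeans(X, k, n_iter=20, seed=0):
    """Plain Lloyd's k-means; returns one cluster label per row of X."""
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(n_iter):
        # Squared Euclidean distance from every point to every center.
        dists = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=-1)
        labels = dists.argmin(axis=1)
        for j in range(k):
            members = X[labels == j]
            if len(members):
                centers[j] = members.mean(axis=0)
    return labels


def remove_dominant_directions(X, labels, n_directions=1):
    """Per cluster: subtract the cluster mean, then project out the
    top principal directions (assumed stand-in for 'dominant directions')."""
    out = np.empty_like(X, dtype=float)
    for j in np.unique(labels):
        idx = labels == j
        centered = X[idx] - X[idx].mean(axis=0)
        # Rows of vt are principal directions, ordered by singular value.
        _, _, vt = np.linalg.svd(centered, full_matrices=False)
        top = vt[:n_directions]
        out[idx] = centered - centered @ top.T @ top
    return out


def mean_cosine(X):
    """Average pairwise cosine similarity, a rough proxy for anisotropy."""
    norms = np.linalg.norm(X, axis=1, keepdims=True)
    U = X / np.maximum(norms, 1e-12)
    sims = U @ U.T
    n = len(U)
    return (sims.sum() - np.trace(sims)) / (n * (n - 1))
```

On embeddings that share a large common offset, `mean_cosine` is close to 1 before the transformation and drops after it, which mirrors the "anisotropic cone" described above. Real contextual embeddings would replace the synthetic input.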

Related papers

Spatial Semantic Recurrent Mining for Referring Image Segmentation [63.34997546393106]
We propose S²RM to achieve high-quality cross-modality fusion. It follows a three-stage strategy: distributing language features, spatial semantic recurrent coparsing, and parsed-semantic balancing. Our proposed method performs favorably against other state-of-the-art algorithms.
arXiv Detail & Related papers (2024-05-15T00:17:48Z)
Distributional Reduction: Unifying Dimensionality Reduction and Clustering with Gromov-Wasserstein [56.62376364594194]
Unsupervised learning aims to capture the underlying structure of potentially large and high-dimensional datasets. In this work, we revisit these approaches under the lens of optimal transport and exhibit relationships with the Gromov-Wasserstein problem. This unveils a new general framework, called distributional reduction, that recovers dimensionality reduction (DR) and clustering as special cases and allows addressing them jointly within a single optimization problem.
arXiv Detail & Related papers (2024-02-03T19:00:19Z)
CPR++: Object Localization via Single Coarse Point Supervision [55.8671776333499]
Coarse point refinement (CPR) is the first attempt to alleviate semantic variance from an algorithmic perspective. CPR reduces semantic variance by selecting a semantic centre point in a neighbourhood region to replace the initial annotated point. CPR++ can obtain scale information and further reduce the semantic variance in a global region.
arXiv Detail & Related papers (2024-01-30T17:38:48Z)
Learning Disentangled Semantic Spaces of Explanations via Invertible Neural Networks [10.880057430629126]
Disentangled latent spaces usually have better semantic separability and geometrical properties, which leads to better interpretability and more controllable data generation. In this work, we focus on a more general form of sentence disentanglement, targeting the localised modification and control of more general sentence semantic features. We introduce a flow-based invertible neural network (INN) mechanism integrated with a transformer-based language Autoencoder (AE) in order to deliver latent spaces with better separability properties.
arXiv Detail & Related papers (2023-05-02T18:27:13Z)
Discovering Class-Specific GAN Controls for Semantic Image Synthesis [73.91655061467988]
We propose a novel method for finding spatially disentangled class-specific directions in the latent space of pretrained semantic image synthesis (SIS) models. We show that the latent directions found by our method can effectively control the local appearance of semantic classes.
arXiv Detail & Related papers (2022-12-02T21:39:26Z)
Entity-enhanced Adaptive Reconstruction Network for Weakly Supervised Referring Expression Grounding [214.8003571700285]
Weakly supervised Referring Expression Grounding (REG) aims to ground a particular target in an image described by a language expression. We design an entity-enhanced adaptive reconstruction network (EARN). EARN includes three modules: entity enhancement, adaptive grounding, and collaborative reconstruction.
arXiv Detail & Related papers (2022-07-18T05:30:45Z)
Contextualized Semantic Distance between Highly Overlapped Texts [85.1541170468617]
Overlapping frequently occurs in paired texts in natural language processing tasks like text editing and semantic similarity evaluation. This paper aims to address the issue with a mask-and-predict strategy. We take the words in the longest common sequence as neighboring words and use masked language modeling (MLM) to predict the distributions on their positions. Experiments on Semantic Textual Similarity show the proposed distance measure, NDD, to be more sensitive to various semantic differences, especially on highly overlapped paired texts.
arXiv Detail & Related papers (2021-10-04T03:59:15Z)
Contextual-Relation Consistent Domain Adaptation for Semantic Segmentation [44.19436340246248]
This paper presents an innovative local contextual-relation consistent domain adaptation technique. It aims to achieve local-level consistencies during the global-level alignment. Experiments demonstrate its superior segmentation performance as compared with state-of-the-art methods.
arXiv Detail & Related papers (2020-07-05T19:00:46Z)
Focus on Semantic Consistency for Cross-domain Crowd Understanding [34.560447389853614]
Some domain adaptation algorithms try to liberate it by training models with synthetic data. We found that a mass of estimation errors in the background areas impede the performance of the existing methods. In this paper, we propose a domain adaptation method to eliminate these errors.
arXiv Detail & Related papers (2020-02-20T08:51:05Z)

This list is automatically generated from the titles and abstracts of the papers on this site.