A Cluster-based Approach for Improving Isotropy in Contextual Embedding
Space
- URL: http://arxiv.org/abs/2106.01183v1
- Date: Wed, 2 Jun 2021 14:26:37 GMT
- Title: A Cluster-based Approach for Improving Isotropy in Contextual Embedding
Space
- Authors: Sara Rajaee and Mohammad Taher Pilehvar
- Abstract summary: The representation degeneration problem in Contextual Word Representations (CWRs) hurts the expressiveness of the embedding space.
We propose a local cluster-based method to address the degeneration issue in contextual embedding spaces.
We show that removing dominant directions of verb representations can transform the space to better suit semantic applications.
- Score: 18.490856440975996
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The representation degeneration problem in Contextual Word Representations
(CWRs) hurts the expressiveness of the embedding space by forming an
anisotropic cone where even unrelated words have excessively positive
correlations. Existing techniques for tackling this issue require a learning
process to re-train models with additional objectives and mostly employ a
global assessment to study isotropy. Our quantitative analysis over isotropy
shows that a local assessment could be more accurate due to the clustered
structure of CWRs. Based on this observation, we propose a local cluster-based
method to address the degeneration issue in contextual embedding spaces. We
show that in clusters including punctuations and stop words, local dominant
directions encode structural information, removing which can improve CWRs
performance on semantic tasks. Moreover, we find that tense information in verb
representations dominates sense semantics. We show that removing dominant
directions of verb representations can transform the space to better suit
semantic applications. Our experiments demonstrate that the proposed
cluster-based method can mitigate the degeneration problem on multiple tasks.
Related papers
- Spatial Semantic Recurrent Mining for Referring Image Segmentation [63.34997546393106]
We propose Stextsuperscript2RM to achieve high-quality cross-modality fusion.
It follows a working strategy of trilogy: distributing language feature, spatial semantic recurrent coparsing, and parsed-semantic balancing.
Our proposed method performs favorably against other state-of-the-art algorithms.
arXiv Detail & Related papers (2024-05-15T00:17:48Z) - Distributional Reduction: Unifying Dimensionality Reduction and Clustering with Gromov-Wasserstein [56.62376364594194]
Unsupervised learning aims to capture the underlying structure of potentially large and high-dimensional datasets.
In this work, we revisit these approaches under the lens of optimal transport and exhibit relationships with the Gromov-Wasserstein problem.
This unveils a new general framework, called distributional reduction, that recovers DR and clustering as special cases and allows addressing them jointly within a single optimization problem.
arXiv Detail & Related papers (2024-02-03T19:00:19Z) - CPR++: Object Localization via Single Coarse Point Supervision [55.8671776333499]
coarse point refinement (CPR) is first attempt to alleviate semantic variance from an algorithmic perspective.
CPR reduces semantic variance by selecting a semantic centre point in a neighbourhood region to replace the initial annotated point.
CPR++ can obtain scale information and further reduce the semantic variance in a global region.
arXiv Detail & Related papers (2024-01-30T17:38:48Z) - Learning Disentangled Semantic Spaces of Explanations via Invertible Neural Networks [10.880057430629126]
Disentangled latent spaces usually have better semantic separability and geometrical properties, which leads to better interpretability and more controllable data generation.
In this work, we focus on a more general form of sentence disentanglement, targeting the localised modification and control of more general sentence semantic features.
We introduce a flow-based invertible neural network (INN) mechanism integrated with a transformer-based language Autoencoder (AE) in order to deliver latent spaces with better separability properties.
arXiv Detail & Related papers (2023-05-02T18:27:13Z) - Discovering Class-Specific GAN Controls for Semantic Image Synthesis [73.91655061467988]
We propose a novel method for finding spatially disentangled class-specific directions in the latent space of pretrained SIS models.
We show that the latent directions found by our method can effectively control the local appearance of semantic classes.
arXiv Detail & Related papers (2022-12-02T21:39:26Z) - Entity-enhanced Adaptive Reconstruction Network for Weakly Supervised
Referring Expression Grounding [214.8003571700285]
Weakly supervised Referring Expression Grounding (REG) aims to ground a particular target in an image described by a language expression.
We design an entity-enhanced adaptive reconstruction network (EARN)
EARN includes three modules: entity enhancement, adaptive grounding, and collaborative reconstruction.
arXiv Detail & Related papers (2022-07-18T05:30:45Z) - Contextualized Semantic Distance between Highly Overlapped Texts [85.1541170468617]
Overlapping frequently occurs in paired texts in natural language processing tasks like text editing and semantic similarity evaluation.
This paper aims to address the issue with a mask-and-predict strategy.
We take the words in the longest common sequence as neighboring words and use masked language modeling (MLM) to predict the distributions on their positions.
Experiments on Semantic Textual Similarity show NDD to be more sensitive to various semantic differences, especially on highly overlapped paired texts.
arXiv Detail & Related papers (2021-10-04T03:59:15Z) - Contextual-Relation Consistent Domain Adaptation for Semantic
Segmentation [44.19436340246248]
This paper presents an innovative local contextual-relation consistent domain adaptation technique.
It aims to achieve local-level consistencies during the global-level alignment.
Experiments demonstrate its superior segmentation performance as compared with state-of-the-art methods.
arXiv Detail & Related papers (2020-07-05T19:00:46Z) - Focus on Semantic Consistency for Cross-domain Crowd Understanding [34.560447389853614]
Some domain adaptation algorithms try to liberate it by training models with synthetic data.
We found that a mass of estimation errors in the background areas impede the performance of the existing methods.
In this paper, we propose a domain adaptation method to eliminate it.
arXiv Detail & Related papers (2020-02-20T08:51:05Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.