GRI: Graph-based Relative Isomorphism of Word Embedding Spaces
- URL: http://arxiv.org/abs/2310.12360v1
- Date: Wed, 18 Oct 2023 22:10:47 GMT
- Title: GRI: Graph-based Relative Isomorphism of Word Embedding Spaces
- Authors: Muhammad Asif Ali, Yan Hu, Jianbin Qin, Di Wang
- Abstract summary: Automated construction of bilingual dictionaries using monolingual embedding spaces is a core challenge in machine translation.
Existing attempts aimed at controlling the relative isomorphism of different spaces fail to incorporate the impact of semantically related words in the training objective.
We propose GRI, which combines distributional training objectives with attentive graph convolutions to jointly account for the impact of semantically similar words.
- Score: 10.984134369344117
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Automated construction of bilingual dictionaries using monolingual embedding
spaces is a core challenge in machine translation. The end performance of these
dictionaries relies upon the geometric similarity of individual spaces, i.e.,
their degree of isomorphism. Existing attempts aimed at controlling the
relative isomorphism of different spaces fail to incorporate the impact of
semantically related words in the training objective. To address this, we
propose GRI, which combines distributional training objectives with attentive
graph convolutions to jointly account for the semantically similar words needed
to define and compute the relative isomorphism of multiple spaces.
Experimental evaluation shows that GRI outperforms existing approaches,
improving the average P@1 by a relative gain of up to 63.6%. We release the
code for GRI at https://github.com/asif6827/GRI.
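The released code is linked above; what follows is only a minimal, illustrative sketch of the kind of combined objective the abstract describes: a skip-gram negative-sampling loss plus an attention-weighted graph term over semantically similar words. The class and function names, the k-NN edge list, and the mixing weight `gamma` are assumptions for illustration, not the authors' implementation.
```python
# Minimal sketch only (not the authors' released code): a skip-gram
# negative-sampling objective combined with an attention-weighted graph
# smoothing term over semantically similar words. The edge list and the
# mixing weight `gamma` are illustrative assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F


class SkipGramSGNS(nn.Module):
    """Standard skip-gram with negative sampling (the distributional objective)."""

    def __init__(self, vocab_size: int, dim: int = 300):
        super().__init__()
        self.in_emb = nn.Embedding(vocab_size, dim)
        self.out_emb = nn.Embedding(vocab_size, dim)

    def forward(self, center, context, negatives):
        v = self.in_emb(center)                               # (B, d)
        u_pos = self.out_emb(context)                         # (B, d)
        u_neg = self.out_emb(negatives)                       # (B, K, d)
        pos = F.logsigmoid((v * u_pos).sum(-1))               # (B,)
        neg = F.logsigmoid(-(u_neg @ v.unsqueeze(-1)).squeeze(-1)).sum(-1)
        return -(pos + neg).mean()


def attentive_graph_loss(emb: nn.Embedding, edges: torch.Tensor) -> torch.Tensor:
    """Attention-weighted smoothing over a graph of semantically similar words.

    `edges` is an (N, 2) LongTensor of (word, neighbor) pairs, e.g. from a
    synonym or k-nearest-neighbour graph (an assumption in this sketch).
    """
    w = emb(edges[:, 0])                                      # (N, d)
    n = emb(edges[:, 1])                                      # (N, d)
    scores = (w * n).sum(-1) / w.size(-1) ** 0.5              # scaled dot product
    attn = torch.empty_like(scores)
    for node in edges[:, 0].unique():                         # softmax per source word
        m = edges[:, 0] == node
        attn[m] = F.softmax(scores[m], dim=0)
    # pull each word toward the neighbours it attends to most strongly
    return (attn * (w - n).pow(2).sum(-1)).mean()


# One illustrative training step on random placeholder data.
vocab, gamma = 10_000, 0.1
model = SkipGramSGNS(vocab)
opt = torch.optim.Adam(model.parameters(), lr=1e-3)

center = torch.randint(0, vocab, (64,))
context = torch.randint(0, vocab, (64,))
negatives = torch.randint(0, vocab, (64, 5))
semantic_edges = torch.randint(0, vocab, (256, 2))            # placeholder graph

loss = model(center, context, negatives) \
       + gamma * attentive_graph_loss(model.in_emb, semantic_edges)
loss.backward()
opt.step()
```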
Related papers
- Spatial Semantic Recurrent Mining for Referring Image Segmentation [63.34997546393106]
We propose S²RM to achieve high-quality cross-modality fusion.
It follows a three-stage strategy: distributing language features, spatial semantic recurrent co-parsing, and parsed-semantic balancing.
Our proposed method performs favorably against other state-of-the-art algorithms.
arXiv Detail & Related papers (2024-05-15T00:17:48Z)
- Homonym Sense Disambiguation in the Georgian Language [49.1574468325115]
This research proposes a novel approach to the Word Sense Disambiguation (WSD) task in the Georgian language.
It is based on supervised fine-tuning of a pre-trained Large Language Model (LLM) on a dataset formed by filtering the Georgian Common Crawls corpus.
arXiv Detail & Related papers (2024-04-24T21:48:43Z)
- GARI: Graph Attention for Relative Isomorphism of Arabic Word Embeddings [10.054788741823627]
Bilingual Lexical Induction (BLI) is a core challenge in NLP; it relies on the relative isomorphism of individual embedding spaces.
Existing attempts aimed at controlling the relative isomorphism of different embedding spaces fail to incorporate the impact of semantically related words.
We propose GARI that combines the distributional training objectives with multiple isomorphism losses guided by the graph attention network.
arXiv Detail & Related papers (2023-10-19T18:08:22Z)
- Leveraging multilingual transfer for unsupervised semantic acoustic word embeddings [23.822788597966646]
Acoustic word embeddings (AWEs) are fixed-dimensional vector representations of speech segments that encode phonetic content.
In this paper we explore semantic AWE modelling.
We show -- for the first time -- that AWEs can be used for downstream semantic query-by-example search.
arXiv Detail & Related papers (2023-07-05T07:46:54Z)
- IsoVec: Controlling the Relative Isomorphism of Word Embedding Spaces [24.256732557154486]
We address the root cause of faulty cross-lingual mapping: word embedding training leaves the underlying spaces non-isomorphic.
We incorporate global measures of isomorphism directly into the Skip-gram loss function, successfully increasing the relative isomorphism of the trained word embedding spaces (a sketch of one such measure appears after this list).
arXiv Detail & Related papers (2022-10-11T02:29:34Z)
- A Differentiable Relaxation of Graph Segmentation and Alignment for AMR Parsing [75.36126971685034]
We treat alignment and segmentation as latent variables in our model and induce them as part of end-to-end training.
Our method also approaches the performance of a model that relies on the segmentation rules of Lyu and Titov (2018), which were hand-crafted to handle individual AMR constructions.
arXiv Detail & Related papers (2020-10-23T21:22:50Z)
- Joint Semantic Analysis with Document-Level Cross-Task Coherence Rewards [13.753240692520098]
We present a neural network architecture for joint coreference resolution and semantic role labeling for English.
We use reinforcement learning to encourage global coherence over the document and between semantic annotations.
This leads to improvements on both tasks in multiple datasets from different domains.
arXiv Detail & Related papers (2020-10-12T09:36:24Z)
- Exploring the Hierarchy in Relation Labels for Scene Graph Generation [75.88758055269948]
Experiments show that the proposed simple yet effective method improves several state-of-the-art baselines by a large margin (up to 33% relative gain) in terms of Recall@50.
arXiv Detail & Related papers (2020-09-12T17:36:53Z)
- Simultaneous Semantic Alignment Network for Heterogeneous Domain Adaptation [67.37606333193357]
We propose a Simultaneous Semantic Alignment Network (SSAN) to simultaneously exploit correlations among categories and align the centroids for each category across domains.
By leveraging target pseudo-labels, a robust triplet-centroid alignment mechanism is explicitly applied to align feature representations for each category.
Experiments on various HDA tasks across text-to-image, image-to-image and text-to-text successfully validate the superiority of our SSAN against state-of-the-art HDA methods.
arXiv Detail & Related papers (2020-08-04T16:20:37Z)
- Are All Good Word Vector Spaces Isomorphic? [79.04509759167952]
We show that variance in performance across language pairs is not only due to typological differences, but can mostly be attributed to the size of the monolingual resources available.
arXiv Detail & Related papers (2020-04-08T15:49:19Z)
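As a companion to the IsoVec and isomorphism-analysis entries above, here is a minimal sketch of one commonly used proxy for the relative isomorphism of two embedding spaces: relational similarity, i.e. the Pearson correlation between the within-language cosine similarities of seed translation pairs. The function names and toy data below are assumptions for illustration, not code from any of the listed papers.
```python
# Minimal sketch (assumed names and toy data): relational similarity as a
# proxy for the relative isomorphism of two embedding spaces.
import numpy as np


def cosine_matrix(X: np.ndarray) -> np.ndarray:
    """Pairwise cosine similarities of the row vectors in X."""
    X = X / np.linalg.norm(X, axis=1, keepdims=True)
    return X @ X.T


def relational_similarity(src: np.ndarray, tgt: np.ndarray) -> float:
    """src[i] and tgt[i] are the embeddings of the i-th seed translation pair.

    Returns the Pearson correlation between the two languages'
    intra-lingual similarity distributions; higher means more isomorphic.
    """
    iu = np.triu_indices(len(src), k=1)          # unique word pairs only
    s = cosine_matrix(src)[iu]
    t = cosine_matrix(tgt)[iu]
    return float(np.corrcoef(s, t)[0, 1])


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    en = rng.normal(size=(500, 300))             # toy "source" embeddings
    de = en + 0.1 * rng.normal(size=(500, 300))  # near-isomorphic "target" space
    print(relational_similarity(en, de))         # close to 1.0 for similar geometry
```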