Zero-Shot Cross-Lingual Dependency Parsing through Contextual Embedding Transformation
- URL: http://arxiv.org/abs/2103.02212v1
- Date: Wed, 3 Mar 2021 06:50:43 GMT
- Title: Zero-Shot Cross-Lingual Dependency Parsing through Contextual Embedding Transformation
- Authors: Haoran Xu and Philipp Koehn
- Abstract summary: Cross-lingual embedding space mapping has usually been studied with static word-level embeddings.
We investigate a contextual embedding alignment approach that is sense-level and dictionary-free.
In experiments on zero-shot dependency parsing, the concept-shared space built by our embedding transformation substantially outperforms state-of-the-art methods using multilingual embeddings.
- Score: 7.615096161060399
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Linear embedding transformation has been shown to be effective for zero-shot cross-lingual transfer tasks and to achieve surprisingly promising results. However, cross-lingual embedding space mapping is usually studied with static word-level embeddings, where the space transformation is derived by aligning representations of translation pairs taken from dictionaries. We move beyond this line of work and investigate a contextual embedding alignment approach that is sense-level and dictionary-free. To enhance the quality of the mapping, we also provide a detailed analysis of the properties of contextual embeddings, i.e., the anisotropy problem and its solution. In experiments on zero-shot dependency parsing, the concept-shared space built by our embedding transformation substantially outperforms state-of-the-art methods that use multilingual embeddings.
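For readers unfamiliar with the linear-transformation setup the abstract refers to, the sketch below shows the standard static-embedding baseline: an orthogonal map is fit on aligned vector pairs (orthogonal Procrustes), with mean-centering and length normalization applied first as a simple, commonly used mitigation for anisotropy. This is a minimal NumPy illustration on hypothetical toy matrices, not the authors' exact procedure; all array names are invented for the example.

```python
import numpy as np

def center_and_normalize(X):
    """Simple anisotropy mitigation: remove the mean vector, then length-normalize.
    (A common post-processing step; the paper's actual treatment may differ.)"""
    X = X - X.mean(axis=0, keepdims=True)
    return X / np.linalg.norm(X, axis=1, keepdims=True)

def procrustes_map(src, tgt):
    """Orthogonal Procrustes: the orthogonal W minimizing ||src @ W - tgt||_F."""
    U, _, Vt = np.linalg.svd(src.T @ tgt)
    return U @ Vt

# Hypothetical data: 1,000 aligned pairs of 300-dimensional embeddings.
rng = np.random.default_rng(0)
src_vecs = rng.normal(size=(1000, 300))   # source-language vectors
tgt_vecs = rng.normal(size=(1000, 300))   # aligned target-language vectors

src_n, tgt_n = center_and_normalize(src_vecs), center_and_normalize(tgt_vecs)
W = procrustes_map(src_n, tgt_n)
mapped = src_n @ W   # source vectors expressed in the target space
```

In the dictionary-free, sense-level setting described in the abstract, the aligned pairs would come from contextual vectors of parallel text rather than from bilingual dictionary entries.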
Related papers
- Spatial Semantic Recurrent Mining for Referring Image Segmentation [63.34997546393106]
We propose S²RM to achieve high-quality cross-modality fusion.
It follows a three-stage strategy: language feature distribution, spatial semantic recurrent co-parsing, and parsed-semantic balancing.
Our proposed method performs favorably against other state-of-the-art algorithms.
arXiv Detail & Related papers (2024-05-15T00:17:48Z)
- Robust Unsupervised Cross-Lingual Word Embedding using Domain Flow Interpolation [48.32604585839687]
Previous adversarial approaches have shown promising results in inducing cross-lingual word embeddings without parallel data.
We propose to make use of a sequence of intermediate spaces for smooth bridging.
arXiv Detail & Related papers (2022-10-07T04:37:47Z)
- Multilingual Extraction and Categorization of Lexical Collocations with Graph-aware Transformers [86.64972552583941]
We put forward a BERT-based sequence tagging model enhanced with a graph-aware transformer architecture, which we evaluate on the task of collocation recognition in context.
Our results suggest that explicitly encoding syntactic dependencies in the model architecture is helpful, and provide insights on differences in collocation typification in English, Spanish and French.
arXiv Detail & Related papers (2022-05-23T16:47:37Z)
- Cross-Lingual BERT Contextual Embedding Space Mapping with Isotropic and Isometric Conditions [7.615096161060399]
We investigate a context-aware and dictionary-free mapping approach by leveraging parallel corpora.
Our findings reveal the tight relationship between isotropy, isometry, and isomorphism in normalized contextual embedding spaces (a minimal isotropy check is sketched after this list).
arXiv Detail & Related papers (2021-07-19T22:57:36Z)
- Unsupervised Word Translation Pairing using Refinement based Point Set Registration [8.568050813210823]
Cross-lingual alignment of word embeddings plays an important role in knowledge transfer across languages.
Current unsupervised approaches rely on similarities in the geometric structure of word embedding spaces across languages.
This paper proposes BioSpere, a novel framework for unsupervised mapping of bi-lingual word embeddings onto a shared vector space.
arXiv Detail & Related papers (2020-11-26T09:51:29Z)
- Unsupervised Distillation of Syntactic Information from Contextualized Word Representations [62.230491683411536]
We tackle the task of unsupervised disentanglement between semantics and structure in neural language representations.
To this end, we automatically generate groups of sentences which are structurally similar but semantically different.
We demonstrate that our transformation clusters vectors in space by structural properties, rather than by lexical semantics.
arXiv Detail & Related papers (2020-10-11T15:13:18Z)
- A Comparative Study on Structural and Semantic Properties of Sentence Embeddings [77.34726150561087]
We propose a set of experiments using a widely-used large-scale data set for relation extraction.
We show that different embedding spaces have different degrees of strength for the structural and semantic properties.
These results provide useful information for developing embedding-based relation extraction methods.
arXiv Detail & Related papers (2020-09-23T15:45:32Z)
- Refinement of Unsupervised Cross-Lingual Word Embeddings [2.4366811507669124]
Cross-lingual word embeddings aim to bridge the gap between high-resource and low-resource languages.
We propose a self-supervised method to refine the alignment of unsupervised bilingual word embeddings.
arXiv Detail & Related papers (2020-02-21T10:39:53Z)
- A Common Semantic Space for Monolingual and Cross-Lingual Meta-Embeddings [10.871587311621974]
This paper presents a new technique for creating monolingual and cross-lingual meta-embeddings.
Existing word vectors are projected to a common semantic space using linear transformations and averaging.
The resulting cross-lingual meta-embeddings also exhibit excellent cross-lingual transfer learning capabilities.
arXiv Detail & Related papers (2020-01-17T15:42:29Z)
- Robust Cross-lingual Embeddings from Parallel Sentences [65.85468628136927]
We propose a bilingual extension of the CBOW method which leverages sentence-aligned corpora to obtain robust cross-lingual word representations.
Our approach significantly improves cross-lingual sentence retrieval performance over all other approaches.
It also achieves parity with a deep RNN method on a zero-shot cross-lingual document classification task.
arXiv Detail & Related papers (2019-12-28T16:18:33Z)
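As a minimal illustration of the isotropy issue raised in the abstract and in the BERT space-mapping entry above, a common diagnostic is the average pairwise cosine similarity of a set of vectors: values well above zero indicate an anisotropic, cone-shaped space, and simple mean-centering already pushes the measure toward zero. The snippet below runs this check on hypothetical vectors; it is an assumption-level sketch, not evaluation code from any of the papers listed here.

```python
import numpy as np

def avg_cosine(X):
    """Average pairwise cosine similarity (self-pairs excluded).
    Near 0 suggests isotropy; values near 1 suggest a narrow cone (anisotropy)."""
    Xn = X / np.linalg.norm(X, axis=1, keepdims=True)
    sims = Xn @ Xn.T
    n = len(X)
    return (sims.sum() - n) / (n * (n - 1))

rng = np.random.default_rng(0)
# Hypothetical "contextual" vectors sharing a strong common component,
# mimicking the anisotropy often reported for BERT-style embeddings.
shared = rng.normal(size=(1, 768))
X = shared + 0.3 * rng.normal(size=(500, 768))

print(f"raw space:     {avg_cosine(X):.3f}")                   # close to 1
print(f"mean-centered: {avg_cosine(X - X.mean(axis=0)):.3f}")  # near 0
```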