Task-Specific Dependency-based Word Embedding Methods
- URL: http://arxiv.org/abs/2110.13376v1
- Date: Tue, 26 Oct 2021 03:09:41 GMT
- Title: Task-Specific Dependency-based Word Embedding Methods
- Authors: Chengwei Wei, Bin Wang, C.-C. Jay Kuo
- Abstract summary: Two task-specific dependency-based word embedding methods are proposed for text classification.
The first one, called the dependency-based word embedding (DWE), chooses keywords and neighbor words of a target word in the dependency parse tree as contexts to build the word-context matrix.
The second method, named class-enhanced dependency-based word embedding (CEDWE), learns from word-context as well as word-class co-occurrence statistics.
- Score: 32.75244210656976
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Two task-specific dependency-based word embedding methods are proposed for
text classification in this work. In contrast with universal word embedding
methods that work for generic tasks, we design task-specific word embedding
methods to offer better performance in a specific task. Our methods follow the
PPMI matrix factorization framework and derive word contexts from the
dependency parse tree. The first one, called the dependency-based word
embedding (DWE), chooses keywords and neighbor words of a target word in the
dependency parse tree as contexts to build the word-context matrix. The second
method, named class-enhanced dependency-based word embedding (CEDWE), learns
from word-context as well as word-class co-occurrence statistics. DWE and CEDWE
are evaluated on popular text classification datasets to demonstrate their
effectiveness. Experimental results show that they outperform several
state-of-the-art word embedding methods.
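To make the pipeline concrete, below is a minimal sketch of the PPMI-factorization framework the abstract describes: word-context counts are gathered from dependency-tree neighbors, converted to PPMI, and factorized with truncated SVD. The toy sentences, hand-written dependency edges, and variable names are illustrative assumptions, not the authors' DWE implementation (which additionally selects keywords as contexts); in practice the edges would come from a real parser such as spaCy or Stanza.

```python
import numpy as np
from collections import Counter

# Toy corpus: each item is (tokens, dependency edges), where an edge is a
# (head index, dependent index) pair. The edges are hand-written here for
# illustration; a real pipeline would take them from a dependency parser.
sentences = [
    (["the", "cat", "chased", "the", "mouse"],
     [(2, 1), (1, 0), (2, 4), (4, 3)]),
    (["the", "dog", "chased", "the", "cat"],
     [(2, 1), (1, 0), (2, 4), (4, 3)]),
]

# 1. Word-context counts: a word's contexts are its neighbors in the
#    dependency tree (its head and its dependents).
pair_counts = Counter()
for tokens, edges in sentences:
    for head, dep in edges:
        pair_counts[(tokens[head], tokens[dep])] += 1
        pair_counts[(tokens[dep], tokens[head])] += 1

vocab = sorted({w for pair in pair_counts for w in pair})
idx = {w: i for i, w in enumerate(vocab)}
M = np.zeros((len(vocab), len(vocab)))
for (w, c), n in pair_counts.items():
    M[idx[w], idx[c]] = n

# 2. PPMI: max(0, log[ p(w, c) / (p(w) p(c)) ]).
total = M.sum()
pw = M.sum(axis=1, keepdims=True) / total
pc = M.sum(axis=0, keepdims=True) / total
with np.errstate(divide="ignore", invalid="ignore"):
    pmi = np.log((M / total) / (pw * pc))
ppmi = np.where(np.isfinite(pmi) & (pmi > 0), pmi, 0.0)

# 3. Truncated SVD of the PPMI matrix yields dense word embeddings.
U, S, _ = np.linalg.svd(ppmi)
dim = 2
embeddings = U[:, :dim] * np.sqrt(S[:dim])
for w in vocab:
    print(w, embeddings[idx[w]].round(3))
```

CEDWE extends this scheme by appending word-class co-occurrence counts to the word-context matrix before factorization, so that class information from the target task shapes the embedding space.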
Related papers
- A Comprehensive Empirical Evaluation of Existing Word Embedding
Approaches [5.065947993017158]
We present the characteristics of existing word embedding approaches and analyze them with regard to many classification tasks.
Traditional approaches mostly use matrix factorization to produce word representations, and they are not able to capture the semantic and syntactic regularities of the language very well.
On the other hand, Neural-network-based approaches can capture sophisticated regularities of the language and preserve the word relationships in the generated word representations.
arXiv Detail & Related papers (2023-03-13T15:34:19Z) - Relational Sentence Embedding for Flexible Semantic Matching [86.21393054423355]
We present Relational Sentence Embedding (RSE), a new paradigm to further explore the potential of sentence embeddings.
RSE is effective and flexible in modeling sentence relations and outperforms a series of state-of-the-art embedding methods.
arXiv Detail & Related papers (2022-12-17T05:25:17Z) - LexSubCon: Integrating Knowledge from Lexical Resources into Contextual
Embeddings for Lexical Substitution [76.615287796753]
We introduce LexSubCon, an end-to-end lexical substitution framework based on contextual embedding models.
This is achieved by combining contextual information with knowledge from structured lexical resources.
Our experiments show that LexSubCon outperforms previous state-of-the-art methods on LS07 and CoInCo benchmark datasets.
arXiv Detail & Related papers (2021-07-11T21:25:56Z) - Deriving Word Vectors from Contextualized Language Models using
Topic-Aware Mention Selection [46.97185212695267]
We propose a method for learning word representations that follows this basic strategy.
We take advantage of contextualized language models (CLMs) rather than bags of word vectors to encode contexts.
We show that this simple strategy leads to high-quality word vectors that are more predictive of semantic properties than standard word embeddings and existing CLM-based strategies.
arXiv Detail & Related papers (2021-06-15T08:02:42Z) - SemGloVe: Semantic Co-occurrences for GloVe from BERT [55.420035541274444]
GloVe learns word embeddings by leveraging statistical information from word co-occurrence matrices (see the sketch after this list).
We propose SemGloVe, which distills semantic co-occurrences from BERT into static GloVe word embeddings.
arXiv Detail & Related papers (2020-12-30T15:38:26Z) - Cross-lingual Word Sense Disambiguation using mBERT Embeddings with
Syntactic Dependencies [0.0]
Cross-lingual word sense disambiguation (WSD) addresses the challenge of resolving ambiguous words across languages given their context.
The BERT embedding model has proven effective at capturing the contextual information of words.
This project investigates how syntactic information can be added to BERT embeddings to produce word embeddings that incorporate both semantics and syntax.
arXiv Detail & Related papers (2020-12-09T20:22:11Z) - A Comparative Study on Structural and Semantic Properties of Sentence
Embeddings [77.34726150561087]
We propose a set of experiments using a widely used, large-scale dataset for relation extraction.
We show that different embedding spaces have different degrees of strength for the structural and semantic properties.
These results provide useful information for developing embedding-based relation extraction methods.
arXiv Detail & Related papers (2020-09-23T15:45:32Z) - Comparative Analysis of Word Embeddings for Capturing Word Similarities [0.0]
Distributed language representation has become the most widely used technique for language representation in various natural language processing tasks.
Most natural language processing models based on deep learning techniques use pre-trained distributed word representations, commonly called word embeddings.
Selecting the appropriate word embeddings is a challenging task, since the projected embedding space is not intuitive to humans.
arXiv Detail & Related papers (2020-05-08T01:16:03Z) - Exploring the Combination of Contextual Word Embeddings and Knowledge
Graph Embeddings [0.0]
Embeddings of knowledge bases (KB) capture the explicit relations between entities denoted by words, but are not able to directly capture the syntagmatic properties of these words.
We propose a new approach using contextual and KB embeddings jointly at the same level.
arXiv Detail & Related papers (2020-04-17T17:49:45Z)
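Referring back to the SemGloVe entry above, here is a toy numpy sketch of the GloVe objective it starts from: fitting word and context vectors so that w_i · c_j + b_i + b_j ≈ log X_ij, weighted by f(X_ij) = min(1, (X_ij / x_max)^0.75). The co-occurrence counts, hyperparameters, and plain gradient descent are all illustrative assumptions; this is not SemGloVe itself, which replaces corpus counts with BERT-derived semantic co-occurrences.

```python
import numpy as np

rng = np.random.default_rng(0)
V, dim, x_max, lr = 5, 3, 10.0, 0.05

# A made-up symmetric co-occurrence matrix X for a 5-word vocabulary.
X = np.array([[0, 4, 2, 0, 1],
              [4, 0, 3, 1, 0],
              [2, 3, 0, 2, 2],
              [0, 1, 2, 0, 5],
              [1, 0, 2, 5, 0]], dtype=float)

W = rng.normal(scale=0.1, size=(V, dim))   # word vectors
C = rng.normal(scale=0.1, size=(V, dim))   # context vectors
bw, bc = np.zeros(V), np.zeros(V)          # word / context biases

nz = (X > 0).astype(float)                 # GloVe only fits nonzero counts
f = np.minimum(1.0, (X / x_max) ** 0.75) * nz   # weighting function f(X_ij)
logX = np.log(np.where(X > 0, X, 1.0))     # log X_ij (0 where X_ij = 0)

for step in range(500):
    err = (W @ C.T + bw[:, None] + bc[None, :] - logX) * nz
    g = f * err                 # weighted residuals (factor 2 folded into lr)
    gW, gC = g @ C, g.T @ W     # gradients w.r.t. W and C
    W -= lr * gW
    C -= lr * gC
    bw -= lr * g.sum(axis=1)
    bc -= lr * g.sum(axis=0)

print("final weighted loss:", float((f * err ** 2).sum()))
```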