Identifying Intensity of the Structure and Content in Tweets and the Discriminative Power of Attributes in Context with Referential Translation Machines
- URL: http://arxiv.org/abs/2407.05154v1
- Date: Sat, 6 Jul 2024 18:58:10 GMT
- Title: Identifying Intensity of the Structure and Content in Tweets and the Discriminative Power of Attributes in Context with Referential Translation Machines
- Authors: Ergun Biçici
- Abstract summary: We use referential translation machines (RTMs) to identify the similarity between an attribute and two words in English.
RTMs are also used to predict the intensity of the structure and content in tweets in English, Arabic, and Spanish.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We use referential translation machines (RTMs) to identify the similarity between an attribute and two words in English by casting the task as machine translation performance prediction (MTPP) between the words and the attribute word, together with the distance between their similarities, for Task 10 with stacked RTM models. RTMs are also used to predict the intensity of the structure and content in tweets in English, Arabic, and Spanish in Task 1, where MTPP is computed between the tweets and the set of words for the selected emotion drawn from the WordNet Affect emotion lists. Stacked RTM models obtain encouraging results in both tasks.
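The abstract does not spell out the RTM feature set or the stacking configuration, so the snippet below is only a minimal sketch of the general idea: a stacked regression model maps hypothetical MTPP-style features (e.g., overlap and distributional-similarity proxies between a tweet and an emotion word list) to a similarity or intensity score. All feature names, estimator choices, and the random placeholder data are assumptions for illustration, not the paper's actual RTM pipeline.

```python
# Illustrative sketch only: a stacked regressor predicting a similarity /
# intensity score from hypothetical MTPP-style features. The real RTM
# features and stacking setup are not described in this abstract.
import numpy as np
from sklearn.ensemble import StackingRegressor
from sklearn.linear_model import Ridge
from sklearn.svm import SVR

rng = np.random.default_rng(0)

# Placeholder features per (text, reference word set) pair, standing in for
# quantities such as n-gram overlap, length ratio, or embedding similarity.
X_train = rng.random((200, 5))
y_train = rng.random(200)        # gold similarity / intensity scores in [0, 1]
X_test = rng.random((20, 5))

# First-layer predictors whose outputs are combined by a final estimator,
# mirroring the idea of stacking several prediction models.
stacked_model = StackingRegressor(
    estimators=[("ridge", Ridge(alpha=1.0)), ("svr", SVR(kernel="rbf"))],
    final_estimator=Ridge(alpha=0.5),
)
stacked_model.fit(X_train, y_train)
predicted_scores = stacked_model.predict(X_test)
print(predicted_scores[:5])
```

The stacking step here simply feeds the base regressors' predictions into a second-level model, which is one common way to combine prediction models when the individual feature views are noisy.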
Related papers
- Tomato, Tomahto, Tomate: Measuring the Role of Shared Semantics among Subwords in Multilingual Language Models [88.07940818022468]
We take an initial step toward measuring the role of shared semantics among subwords in encoder-only multilingual language models (mLMs).
We form "semantic tokens" by merging the semantically similar subwords and their embeddings.
Inspections of the grouped subwords show that they exhibit a wide range of semantic similarities.
arXiv Detail & Related papers (2024-11-07T08:38:32Z) - Exploring State Space and Reasoning by Elimination in Tsetlin Machines [14.150011713654331]
The Tsetlin Machine (TM) has gained significant attention in Machine Learning (ML).
The TM is utilised to construct word embeddings and describe target words using clauses.
To enhance the descriptive capacity of these clauses, we study the concept of Reasoning by Elimination (RbE) in clause formulation.
arXiv Detail & Related papers (2024-07-12T10:58:01Z) - Predicting Word Similarity in Context with Referential Translation Machines [0.0]
We identify the similarity between two words in English by casting the task as machine translation performance prediction (MTPP)
We use referential translation machines (RTMs), which allow a common representation of training and test sets.
RTMs can achieve the top results in the Graded Word Similarity in Context (GWSC) task.
arXiv Detail & Related papers (2024-07-07T09:36:41Z) - Cross-lingual Contextualized Phrase Retrieval [63.80154430930898]
We propose a new task formulation of dense retrieval, cross-lingual contextualized phrase retrieval.
We train our Cross-lingual Contextualized Phrase Retriever (CCPR) using contrastive learning.
On the phrase retrieval task, CCPR surpasses baselines by a significant margin, achieving a top-1 accuracy that is at least 13 points higher.
arXiv Detail & Related papers (2024-03-25T14:46:51Z) - A General and Flexible Multi-concept Parsing Framework for Multilingual Semantic Matching [60.51839859852572]
We propose to decompose the text into multiple concepts for multilingual semantic matching, liberating the model from its reliance on NER models.
We conduct comprehensive experiments on the English datasets QQP and MRPC and on the Chinese dataset Medical-SM.
arXiv Detail & Related papers (2024-03-05T13:55:16Z) - Verifying Properties of Tsetlin Machines [18.870370171271126]
We present an exact encoding of TsMs into propositional logic and formally verify properties of TsMs using a SAT solver.
We consider notions of robustness and equivalence from the literature and adapt them for TsMs.
In our experiments, we employ the MNIST and IMDB datasets for (respectively) image and sentiment classification.
arXiv Detail & Related papers (2023-03-25T13:17:21Z) - Retrofitting Multilingual Sentence Embeddings with Abstract Meaning Representation [70.58243648754507]
We introduce a new method to improve existing multilingual sentence embeddings with Abstract Meaning Representation (AMR)
Compared with the original textual input, AMR is a structured semantic representation that presents the core concepts and relations in a sentence explicitly and unambiguously.
Experiment results show that retrofitting multilingual sentence embeddings with AMR leads to better state-of-the-art performance on both semantic similarity and transfer tasks.
arXiv Detail & Related papers (2022-10-18T11:37:36Z) - More Than Words: Collocation Tokenization for Latent Dirichlet Allocation Models [71.42030830910227]
We propose a new metric for measuring the clustering quality in settings where the models differ.
We show that topics trained with merged tokens result in topic keys that are clearer, more coherent, and more effective at distinguishing topics than those of unmerged models.
arXiv Detail & Related papers (2021-08-24T14:08:19Z) - mT6: Multilingual Pretrained Text-to-Text Transformer with Translation Pairs [51.67970832510462]
We improve the multilingual text-to-text transfer Transformer with translation pairs (mT6).
We explore three cross-lingual text-to-text pre-training tasks, namely, machine translation, translation pair span corruption, and translation span corruption.
Experimental results show that the proposed mT6 improves cross-lingual transferability over mT5.
arXiv Detail & Related papers (2021-04-18T03:24:07Z) - TMR: Evaluating NER Recall on Tough Mentions [1.2183405753834562]
We propose the Tough Mentions Recall (TMR) metrics to supplement traditional named entity recognition (NER) evaluation.
TMR metrics examine recall on specific subsets of "tough" mentions.
We demonstrate the usefulness of these metrics by evaluating corpora of English, Spanish, and Dutch using five recent neural architectures.
arXiv Detail & Related papers (2021-03-23T05:04:14Z) - Incorporate Semantic Structures into Machine Translation Evaluation via UCCA [9.064153799336536]
We define words carrying important semantic meanings in sentences as semantic core words.
We propose an MT evaluation approach named Semantically Weighted Sentence Similarity (SWSS)
arXiv Detail & Related papers (2020-10-17T06:47:58Z)