#REVAL: a semantic evaluation framework for hashtag recommendation
- URL: http://arxiv.org/abs/2305.18330v1
- Date: Wed, 24 May 2023 07:10:56 GMT
- Title: #REVAL: a semantic evaluation framework for hashtag recommendation
- Authors: Areej Alsini, Du Q. Huynh and Amitava Datta
- Abstract summary: We propose a novel semantic evaluation framework for hashtag recommendation, called #REval.
#REval includes an internal module referred to as BERTag, which automatically learns the hashtag embeddings.
Our experiments on three large datasets show that #REval gave more meaningful hashtag synonyms for hashtag recommendation evaluation.
- Score: 6.746400031322727
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Automatic evaluation of hashtag recommendation models is a fundamental task
in many online social network systems. In the traditional evaluation method,
the recommended hashtags from an algorithm are first compared with the ground
truth hashtags for exact correspondences. The number of exact matches is then
used to calculate the hit rate, hit ratio, precision, recall, or F1-score. This
way of evaluating hashtag similarities is inadequate as it ignores the semantic
correlation between the recommended and ground truth hashtags. To tackle this
problem, we propose a novel semantic evaluation framework for hashtag
recommendation, called #REval. This framework includes an internal module
referred to as BERTag, which automatically learns the hashtag embeddings. We
investigate how the #REval framework performs under different word embedding
methods and different numbers of synonyms and hashtags in the recommendation
using our proposed #REval-hit-ratio measure. Our experiments with the proposed
framework on three large datasets show that #REval gave more meaningful hashtag
synonyms for hashtag recommendation evaluation. Our analysis also highlights
the sensitivity of the framework to the word embedding technique, with #REval
based on BERTag proving superior to #REval based on FastText and Word2Vec.
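The contrast drawn in the abstract, exact matching versus matching against semantically similar hashtags, can be illustrated with a small sketch. This is not the paper's #REval-hit-ratio, whose definition is not given here. The toy two-dimensional vectors, the function names, and the rule "accept a recommendation if it is among the top-k cosine neighbours of a ground-truth hashtag" are all assumptions made for illustration; in #REval the embeddings would come from BERTag, FastText, or Word2Vec.

```python
import math

# Hypothetical toy embeddings; a real system would learn these vectors.
TOY_EMBEDDINGS = {
    "#ai": [1.0, 0.0],
    "#machinelearning": [0.9, 0.1],
    "#football": [0.0, 1.0],
    "#soccer": [0.1, 0.9],
}


def exact_match_scores(recommended, ground_truth):
    """Traditional evaluation: precision, recall, F1 from exact matches only."""
    rec, truth = set(recommended), set(ground_truth)
    hits = len(rec & truth)
    precision = hits / len(rec) if rec else 0.0
    recall = hits / len(truth) if truth else 0.0
    f1 = 2 * precision * recall / (precision + recall) if hits else 0.0
    return precision, recall, f1


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def top_k_synonyms(tag, embeddings, k):
    """Return the k hashtags whose vectors are closest to `tag` (assumed rule)."""
    if tag not in embeddings:
        return []
    scored = [
        (cosine(embeddings[tag], vec), other)
        for other, vec in embeddings.items()
        if other != tag
    ]
    scored.sort(reverse=True)
    return [other for _, other in scored[:k]]


def semantic_hit(recommended, ground_truth, embeddings, k):
    """True if any recommendation equals a ground-truth tag or one of its synonyms."""
    rec = set(recommended)
    for tag in ground_truth:
        accepted = {tag, *top_k_synonyms(tag, embeddings, k)}
        if rec & accepted:
            return True
    return False


if __name__ == "__main__":
    recommended, truth = ["#machinelearning"], ["#ai"]
    print(exact_match_scores(recommended, truth))
    print(semantic_hit(recommended, truth, TOY_EMBEDDINGS, k=1))
```

With these toy vectors, recommending `#machinelearning` for a post tagged `#ai` scores zero under exact matching but counts as a hit once the nearest neighbour of `#ai` is accepted as a synonym. Both the number of synonyms `k` and the choice of embedding change which recommendations are accepted, which is the sensitivity the abstract reports.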
Related papers
- RIGHT: Retrieval-augmented Generation for Mainstream Hashtag Recommendation [76.24205422163169]
We propose RetrIeval-augmented Generative Mainstream HashTag Recommender (RIGHT).
RIGHT consists of three components: 1) a retriever seeks relevant hashtags from the entire tweet-hashtags set; 2) a selector enhances mainstream identification by introducing global signals; and 3) a generator incorporates input tweets and selected hashtags to directly generate the desired hashtags.
Our method achieves significant improvements over state-of-the-art baselines. Moreover, RIGHT can be easily integrated into large language models, improving the performance of ChatGPT by more than 10%.
arXiv Detail & Related papers (2023-12-16T14:47:03Z)
- SemStamp: A Semantic Watermark with Paraphrastic Robustness for Text Generation [72.10931780019297]
Existing watermarking algorithms are vulnerable to paraphrase attacks because of their token-level design.
We propose SemStamp, a robust sentence-level semantic watermarking algorithm based on locality-sensitive hashing (LSH).
Experimental results show that our novel semantic watermark algorithm is not only more robust than the previous state-of-the-art method on both common and bigram paraphrase attacks, but also is better at preserving the quality of generation.
arXiv Detail & Related papers (2023-10-06T03:33:42Z)
- Exploring Structured Semantic Prior for Multi Label Recognition with Incomplete Labels [60.675714333081466]
Multi-label recognition (MLR) with incomplete labels is very challenging.
Recent works strive to explore the image-to-label correspondence in the vision-language model, i.e., CLIP, to compensate for insufficient annotations.
We advocate remedying the deficiency of label supervision for the MLR with incomplete labels by deriving a structured semantic prior.
arXiv Detail & Related papers (2023-03-23T12:39:20Z)
- Hashtag-Guided Low-Resource Tweet Classification [31.810562621519804]
We propose a novel Hashtag-guided Tweet Classification model (HashTation).
HashTation automatically generates meaningful hashtags for the input tweet to provide useful auxiliary signals for tweet classification.
Experiments show that HashTation achieves significant improvements on seven low-resource tweet classification tasks.
arXiv Detail & Related papers (2023-02-20T18:21:02Z)
- HashSet -- A Dataset For Hashtag Segmentation [19.016545782774003]
We argue that model performance should be assessed on a wider variety of hashtags.
We propose HashSet, a dataset comprising a) a 1.9k manually annotated set and b) a 3.3M loosely supervised set.
We show that the performance of SOTA models for hashtag segmentation drops substantially on the proposed dataset.
arXiv Detail & Related papers (2022-01-18T04:40:45Z)
- Semantic-Preserving Adversarial Text Attacks [85.32186121859321]
We propose a Bigram and Unigram based adaptive Semantic Preservation Optimization (BU-SPO) method to examine the vulnerability of deep models.
Our method achieves the highest attack success rates and semantics rates by changing the smallest number of words compared with existing methods.
arXiv Detail & Related papers (2021-08-23T09:05:18Z)
- Attend and Select: A Segment Attention based Selection Mechanism for Microblog Hashtag Generation [69.73215951112452]
A hashtag is formed by tokens or phrases that may originate from various fragmentary segments of the original text.
We propose an end-to-end Transformer-based generation model which consists of three phases: encoding, segments-selection, and decoding.
We introduce two large-scale hashtag generation datasets, which are newly collected from Chinese Weibo and English Twitter.
arXiv Detail & Related papers (2021-06-06T15:13:58Z)
- Text-to-hashtag Generation using Seq2seq Learning [0.0]
We studied whether models based on BiLSTM and BERT can generate hashtags in Brazilian Portuguese that can be used in websites.
We processed a corpus of reviews and titles of products as inputs and we generated hashtags as outputs.
arXiv Detail & Related papers (2021-02-01T15:28:27Z)
- Hit ratio: An Evaluation Metric for Hashtag Recommendation [6.746400031322727]
We propose a new metric which we call hit ratio for hashtag recommendation.
Most research in the area of hashtag recommendation has used classical metrics such as hit rate, precision, recall, and F1-score.
A comparison of hit ratio with the classical evaluation metrics reveals their limitations.
arXiv Detail & Related papers (2020-10-03T02:07:41Z)
- On Identifying Hashtags in Disaster Twitter Data [55.17975121160699]
We construct a unique dataset of disaster-related tweets annotated with hashtags useful for filtering actionable information.
Using this dataset, we investigate Long Short Term Memory-based models within a Multi-Task Learning framework.
The best performing model achieves an F1-score as high as 92.22%.
arXiv Detail & Related papers (2020-01-05T22:37:17Z)