A Fair and Comprehensive Comparison of Multimodal Tweet Sentiment
Analysis Methods
- URL: http://arxiv.org/abs/2106.08829v1
- Date: Wed, 16 Jun 2021 14:44:48 GMT
- Title: A Fair and Comprehensive Comparison of Multimodal Tweet Sentiment
Analysis Methods
- Authors: Gullal S. Cheema and Sherzod Hakimov and Eric Müller-Budack and
Ralph Ewerth
- Abstract summary: We present a comprehensive experimental evaluation and comparison of six state-of-the-art methods.
Results are presented for two different publicly available benchmark datasets of tweets and corresponding images.
- Score: 3.8142537449670963
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Opinion and sentiment analysis is a vital task to characterize subjective
information in social media posts. In this paper, we present a comprehensive
experimental evaluation and comparison of six state-of-the-art methods, one of
which we have re-implemented. In addition, we investigate different textual and
visual feature embeddings that cover different aspects of the content, as well as
the recently introduced multimodal CLIP embeddings. Experimental results are
presented for two different publicly available benchmark datasets of tweets and
corresponding images. In contrast to the evaluation methodology of previous work,
we introduce a reproducible and fair evaluation scheme to make results comparable.
Finally, we conduct an error analysis to outline the limitations of the methods
and possibilities for future work.
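
The investigated multimodal CLIP embeddings and the reproducible evaluation scheme can be illustrated with a minimal sketch. The snippet below assumes the Hugging Face transformers CLIP checkpoint `openai/clip-vit-base-patch32` and scikit-learn; the late-fusion classifier, input format, and split procedure are illustrative assumptions rather than the exact pipeline of the paper.

```python
# Minimal sketch (assumption): extract CLIP text/image embeddings for tweets
# and train a simple late-fusion classifier on a fixed, stratified split.
# Model choice, fusion strategy, and splits are illustrative, not the authors' exact setup.
import torch
from PIL import Image
from transformers import CLIPModel, CLIPProcessor
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import f1_score

model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32").eval()
processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")

def embed_tweet(text: str, image_path: str) -> torch.Tensor:
    """Return concatenated CLIP text and image embeddings (simple late fusion)."""
    image = Image.open(image_path).convert("RGB")
    inputs = processor(text=[text], images=image, return_tensors="pt",
                       padding=True, truncation=True)
    with torch.no_grad():
        out = model(**inputs)
    return torch.cat([out.text_embeds, out.image_embeds], dim=-1).squeeze(0)

# tweets: list of (text, image_path, sentiment_label) triples -- hypothetical input format.
def train_and_evaluate(tweets, seed: int = 0) -> float:
    X = torch.stack([embed_tweet(t, p) for t, p, _ in tweets]).numpy()
    y = [label for _, _, label in tweets]
    # A fixed seed and stratification keep the split reproducible and class-balanced.
    X_tr, X_te, y_tr, y_te = train_test_split(
        X, y, test_size=0.2, stratify=y, random_state=seed)
    clf = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)
    return f1_score(y_te, clf.predict(X_te), average="macro")
```

In a fair comparison of this kind, the same fixed seeds and stratified splits would be reused for every method and feature combination so that reported scores remain directly comparable.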
Related papers
- Beyond Coarse-Grained Matching in Video-Text Retrieval [50.799697216533914]
We introduce a new approach for fine-grained evaluation.
Our approach can be applied to existing datasets by automatically generating hard negative test captions.
Experiments on our fine-grained evaluations demonstrate that this approach enhances a model's ability to understand fine-grained differences.
arXiv Detail & Related papers (2024-10-16T09:42:29Z)
- Neural Multimodal Topic Modeling: A Comprehensive Evaluation [18.660262940980477]
This paper presents the first systematic and comprehensive evaluation of multimodal topic modeling.
We propose two novel topic modeling solutions and two novel evaluation metrics.
Overall, our evaluation on an unprecedentedly rich and diverse collection of datasets indicates that both of our models generate coherent and diverse topics.
arXiv Detail & Related papers (2024-03-26T01:29:46Z)
- A Large-Scale Empirical Study on Improving the Fairness of Image Classification Models [22.522156479335706]
This paper conducts the first large-scale empirical study to compare the performance of existing state-of-the-art fairness improving techniques.
Our findings reveal substantial variations in the performance of each method across different datasets and sensitive attributes.
Different fairness evaluation metrics, due to their distinct focuses, yield significantly different assessment results.
arXiv Detail & Related papers (2024-01-08T06:53:33Z)
- End-to-End Evaluation for Low-Latency Simultaneous Speech Translation [55.525125193856084]
We propose the first framework to perform and evaluate the various aspects of low-latency speech translation under realistic conditions.
This includes the segmentation of the audio as well as the run-time of the different components.
We also compare different approaches to low-latency speech translation using this framework.
arXiv Detail & Related papers (2023-08-07T09:06:20Z)
- A Call to Reflect on Evaluation Practices for Age Estimation: Comparative Analysis of the State-of-the-Art and a Unified Benchmark [2.156208381257605]
We offer an extensive comparative analysis for state-of-the-art facial age estimation methods.
We find that the performance differences between the methods are negligible compared to the effect of other factors.
We propose using FaRL as the backbone model and demonstrate its effectiveness on all public datasets.
arXiv Detail & Related papers (2023-07-10T14:02:31Z)
- Better Understanding Differences in Attribution Methods via Systematic Evaluations [57.35035463793008]
Post-hoc attribution methods have been proposed to identify image regions most influential to the models' decisions.
We propose three novel evaluation schemes to more reliably measure the faithfulness of those methods.
We use these evaluation schemes to study strengths and shortcomings of some widely used attribution methods over a wide range of models.
arXiv Detail & Related papers (2023-03-21T14:24:58Z)
- Towards Better Understanding Attribution Methods [77.1487219861185]
Post-hoc attribution methods have been proposed to identify image regions most influential to the models' decisions.
We propose three novel evaluation schemes to more reliably measure the faithfulness of those methods.
We also propose a post-processing smoothing step that significantly improves the performance of some attribution methods.
arXiv Detail & Related papers (2022-05-20T20:50:17Z)
- On the Faithfulness Measurements for Model Interpretations [100.2730234575114]
Post-hoc interpretations aim to uncover how natural language processing (NLP) models make predictions.
To tackle these issues, we start with three criteria: the removal-based criterion, the sensitivity of interpretations, and the stability of interpretations.
Motivated by the desideratum of these faithfulness notions, we introduce a new class of interpretation methods that adopt techniques from the adversarial domain.
arXiv Detail & Related papers (2021-04-18T09:19:44Z)
- Interpretable Multi-dataset Evaluation for Named Entity Recognition [110.64368106131062]
We present a general methodology for interpretable evaluation for the named entity recognition (NER) task.
The proposed evaluation method enables us to interpret the differences in models and datasets, as well as the interplay between them.
By making our analysis tool available, we make it easy for future researchers to run similar analyses and drive progress in this area.
arXiv Detail & Related papers (2020-11-13T10:53:27Z)
- On the Ambiguity of Rank-Based Evaluation of Entity Alignment or Link Prediction Methods [27.27230441498167]
We take a closer look at the evaluation of two families of methods for enriching information from knowledge graphs: Link Prediction and Entity Alignment.
In particular, we demonstrate that all existing scores can hardly be used to compare results across different datasets.
We show that this leads to various problems in the interpretation of results, which may support misleading conclusions.
arXiv Detail & Related papers (2020-02-17T12:26:14Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.