Related papers: Deep Learning Brasil -- NLP at SemEval-2020 Task 9: Overview of Sentiment Analysis of Code-Mixed Tweets

Related papers

LENS: Multi-level Evaluation of Multimodal Reasoning with Large Language Models [59.0256377330646]
Lens is a benchmark with 3.4K contemporary images and 60K+ human-authored questions covering eight tasks and 12 daily scenarios. The dataset intrinsically supports evaluating how MLLMs handle image-invariable prompts, from basic perception to compositional reasoning. We evaluate 15+ frontier MLLMs such as Qwen2.5-VL-72B, InternVL3-78B, GPT-4o and two reasoning models, QVQ-72B-preview and Kimi-VL.
arXiv Detail & Related papers (2025-05-21T15:06:59Z)
LecEval: An Automated Metric for Multimodal Knowledge Acquisition in Multimedia Learning [58.98865450345401]
We introduce LecEval, an automated metric grounded in Mayer's Cognitive Theory of Multimedia Learning. LecEval assesses effectiveness using four rubrics: Content Relevance (CR), Expressive Clarity (EC), Logical Structure (LS) and Audience Engagement (AE). We curate a large-scale dataset of over 2,000 slides from more than 50 online course videos, annotated with fine-grained human ratings.
arXiv Detail & Related papers (2025-05-04T12:06:47Z)
InfiMM-WebMath-40B: Advancing Multimodal Pre-Training for Enhanced Mathematical Reasoning [58.7966588457529]
InfiMM-WebMath-40B is a high-quality dataset of interleaved image-text documents. It comprises 24 million web pages, 85 million associated image URLs, and 40 billion text tokens, all meticulously extracted and filtered from CommonCrawl. Our evaluations on text-only benchmarks show that, despite utilizing only 40 billion tokens, our dataset significantly enhances the performance of our 1.3B model. Our models set a new state-of-the-art among open-source models on multi-modal math benchmarks such as MathVerse and We-Math.
arXiv Detail & Related papers (2024-09-19T08:41:21Z)
Ensemble Transfer Learning for Multilingual Coreference Resolution [60.409789753164944]
A problem that frequently occurs when working with a non-English language is the scarcity of annotated training data. We design a simple but effective ensemble-based framework that combines various transfer learning techniques. We also propose a low-cost TL method that bootstraps coreference resolution models by utilizing Wikipedia anchor texts.
arXiv Detail & Related papers (2023-01-22T18:22:55Z)
UrduFake@FIRE2020: Shared Track on Fake News Identification in Urdu [62.6928395368204]
This paper gives the overview of the first shared task at FIRE 2020 on fake news detection in the Urdu language. The goal is to identify fake news using a dataset composed of 900 annotated news articles for training and 400 news articles for testing. The dataset contains news in five domains: (i) Health, (ii) Sports, (iii) Showbiz, (iv) Technology, and (v) Business.
arXiv Detail & Related papers (2022-07-25T03:46:51Z)
Overview of the Shared Task on Fake News Detection in Urdu at FIRE 2020 [62.6928395368204]
The task was posed as a binary classification problem in which the goal is to differentiate between real and fake news. We provided a dataset divided into 900 annotated news articles for training and 400 news articles for testing. 42 teams from 6 different countries (India, China, Egypt, Germany, Pakistan, and the UK) registered for the task.
arXiv Detail & Related papers (2022-07-25T03:41:32Z)
Unifying Language Learning Paradigms [96.35981503087567]
We present a unified framework for pre-training models that are universally effective across datasets and setups. We show how different pre-training objectives can be cast as one another and how interpolating between different objectives can be effective. Our model also achieves strong results at in-context learning, outperforming 175B GPT-3 on zero-shot SuperGLUE and tripling the performance of T5-XXL on one-shot summarization.
arXiv Detail & Related papers (2022-05-10T19:32:20Z)
HFL at SemEval-2022 Task 8: A Linguistics-inspired Regression Model with Data Augmentation for Multilingual News Similarity [16.454545004093735]
This paper describes our system designed for SemEval-2022 Task 8: Multilingual News Article Similarity. We proposed a linguistics-inspired model trained with a few task-specific strategies. Our system ranked 1st on the leaderboard while achieving a Pearson's Correlation Coefficient of 0.818 on the official evaluation set.
arXiv Detail & Related papers (2022-04-11T03:08:37Z)
Sentiment Analysis of Code-Mixed Social Media Text (Hinglish) [4.081440927534578]
The stages involved in performing the sentiment analysis were data consolidation, data cleaning, data transformation and modelling. The models were created using various machine learning algorithms such as SVM, KNN, Decision Trees, Random Forests, Naive Bayes, Logistic Regression, and ensemble voting classifiers.
arXiv Detail & Related papers (2021-02-24T09:15:34Z)
Palomino-Ochoa at SemEval-2020 Task 9: Robust System based on Transformer for Code-Mixed Sentiment Classification [1.6244541005112747]
We present a transfer learning system to perform a mixed Spanish-English sentiment classification task. Our proposal uses the state-of-the-art language model BERT and embeds it within a ULMFiT transfer learning pipeline.
arXiv Detail & Related papers (2020-11-18T18:25:58Z)
Phonemer at WNUT-2020 Task 2: Sequence Classification Using COVID Twitter BERT and Bagging Ensemble Technique based on Plurality Voting [0.0]
We develop a system that automatically identifies whether an English Tweet related to the novel coronavirus (COVID-19) is informative or not. Our final approach achieved an F1-score of 0.9037 and we were ranked sixth overall with F1-score as the evaluation criteria.
arXiv Detail & Related papers (2020-10-01T10:54:54Z)
NLP-CIC at SemEval-2020 Task 9: Analysing sentiment in code-switching language using a simple deep-learning classifier [63.137661897716555]
Code-switching is a phenomenon in which two or more languages are used in the same message. We use a standard convolutional neural network model to predict the sentiment of tweets in a blend of Spanish and English languages.
arXiv Detail & Related papers (2020-09-07T19:57:09Z)
LIMSI_UPV at SemEval-2020 Task 9: Recurrent Convolutional Neural Network for Code-mixed Sentiment Analysis [8.8561720398658]
This paper describes the participation of the LIMSI_UPV team in SemEval-2020 Task 9: Sentiment Analysis for Code-Mixed Social Media Text. The proposed approach competed in the SentiMix Hindi-English subtask, which addresses the problem of predicting the sentiment of a given Hindi-English code-mixed tweet. We propose a Recurrent Convolutional Neural Network that combines both the recurrent neural network and the convolutional network to better capture the semantics of the text.
arXiv Detail & Related papers (2020-08-30T13:52:24Z)

This list is automatically generated from the titles and abstracts of the papers on this site.