Humor@IITK at SemEval-2021 Task 7: Large Language Models for Quantifying
Humor and Offensiveness
- URL: http://arxiv.org/abs/2104.00933v1
- Date: Fri, 2 Apr 2021 08:22:02 GMT
- Authors: Aishwarya Gupta, Avik Pal, Bholeshwar Khurana, Lakshay Tyagi, Ashutosh
Modi
- Abstract summary: This paper explores whether large neural models and their ensembles can capture the intricacies associated with humor/offense detection and rating.
Our experiments on the SemEval-2021 Task 7: HaHackathon show that we can develop reasonable humor and offense detection systems with such models.
- Score: 2.251416625953577
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Humor and Offense are highly subjective due to multiple word senses, cultural
knowledge, and pragmatic competence. Hence, accurately detecting humorous and
offensive texts has several compelling use cases in Recommendation Systems and
Personalized Content Moderation. However, due to the lack of an extensive
labeled dataset, most prior works in this domain haven't explored large neural
models for subjective humor understanding. This paper explores whether large
neural models and their ensembles can capture the intricacies associated with
humor/offense detection and rating. Our experiments on the SemEval-2021 Task 7:
HaHackathon show that we can develop reasonable humor and offense detection
systems with such models. Our models are ranked third in subtask 1b and
consistently ranked around the top 33% of the leaderboard for the remaining
subtasks.
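The ensembling the abstract alludes to can be sketched minimally. The snippet below is an illustrative assumption, not the authors' released code: it averages each model's predicted probability of the humorous class and thresholds the mean, a common way to combine fine-tuned transformer classifiers.

```python
def ensemble_predict(prob_lists, threshold=0.5):
    """Combine per-model humor probabilities by simple averaging.

    prob_lists: one list of P(humorous) per model, aligned by example.
    Returns (binary labels, averaged scores).
    """
    n_models = len(prob_lists)
    # Average each example's probability across models, then threshold.
    avg = [sum(ps) / n_models for ps in zip(*prob_lists)]
    return [int(p >= threshold) for p in avg], avg

# Two hypothetical models scoring three texts:
labels, scores = ensemble_predict([[0.9, 0.2, 0.6], [0.7, 0.4, 0.4]])
print(labels)  # [1, 0, 1]
```

The same averaged score can serve the rating subtasks directly, while the thresholded label serves detection.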
Related papers
- Is AI fun? HumorDB: a curated dataset and benchmark to investigate graphical humor [8.75275650545552]
HumorDB is an image-only dataset specifically designed to advance visual humor understanding.
The dataset enables evaluation through binary classification, range regression, and pairwise comparison tasks.
HumorDB shows potential as a valuable benchmark for powerful large multimodal models.
arXiv Detail & Related papers (2024-06-19T13:51:40Z)
- Getting Serious about Humor: Crafting Humor Datasets with Unfunny Large Language Models [27.936545041302377]
Large language models (LLMs) can generate synthetic data for humor detection via editing texts.
We benchmark LLMs on an existing human dataset and show that current LLMs display an impressive ability to 'unfun' jokes.
We extend our approach to a code-mixed English-Hindi humor dataset, where we find that GPT-4's synthetic data is highly rated by bilingual annotators.
arXiv Detail & Related papers (2024-02-23T02:58:12Z)
- ExPUNations: Augmenting Puns with Keywords and Explanations [88.58174386894913]
We augment an existing dataset of puns with detailed crowdsourced annotations of keywords.
This is the first humor dataset with such extensive and fine-grained annotations specifically for puns.
We propose two tasks: explanation generation to aid with pun classification and keyword-conditioned pun generation.
arXiv Detail & Related papers (2022-10-24T18:12:02Z)
- Towards Multimodal Prediction of Spontaneous Humour: A Novel Dataset and First Results [84.37263300062597]
Humor is a substantial element of human social behavior, affect, and cognition.
Current methods of humor detection have been exclusively based on staged data, making them inadequate for "real-world" applications.
We contribute to addressing this deficiency by introducing the novel Passau-Spontaneous Football Coach Humor dataset, comprising about 11 hours of recordings.
arXiv Detail & Related papers (2022-09-28T17:36:47Z)
- Do Androids Laugh at Electric Sheep? Humor "Understanding" Benchmarks from The New Yorker Caption Contest [70.40189243067857]
Large neural networks can now generate jokes, but do they really "understand" humor?
We challenge AI models with three tasks derived from the New Yorker Cartoon Caption Contest.
We find that both multimodal models and language-only models struggle at all three tasks.
arXiv Detail & Related papers (2022-09-13T20:54:00Z)
- RuMedBench: A Russian Medical Language Understanding Benchmark [58.99199480170909]
The paper describes the open Russian medical language understanding benchmark covering several task types.
We prepare unified-format labeling, data splits, and evaluation metrics for the new tasks.
A single-number metric expresses a model's ability to cope with the benchmark.
arXiv Detail & Related papers (2022-01-17T16:23:33Z)
- AES Systems Are Both Overstable And Oversensitive: Explaining Why And Proposing Defenses [66.49753193098356]
We investigate the reason behind the surprising adversarial brittleness of scoring models.
Our results indicate that autoscoring models, despite getting trained as "end-to-end" models, behave like bag-of-words models.
We propose detection-based protection models that identify oversensitivity- and overstability-causing samples with high accuracy.
arXiv Detail & Related papers (2021-09-24T03:49:38Z)
- MagicPai at SemEval-2021 Task 7: Method for Detecting and Rating Humor Based on Multi-Task Adversarial Training [4.691435917434472]
This paper describes MagicPai's system for SemEval 2021 Task 7, HaHackathon: Detecting and Rating Humor and Offense.
This task aims to detect whether the text is humorous and how humorous it is.
We mainly present our solution, a multi-task learning model based on adversarial examples.
arXiv Detail & Related papers (2021-04-21T03:23:02Z)
- Uncertainty and Surprisal Jointly Deliver the Punchline: Exploiting Incongruity-Based Features for Humor Recognition [0.6445605125467573]
We break down any joke into two distinct components: the set-up and the punchline.
Inspired by the incongruity theory of humor, we model the set-up as the part developing semantic uncertainty.
Leveraging increasingly powerful language models, we feed the set-up along with the punchline into the GPT-2 language model to extract these features.
arXiv Detail & Related papers (2020-12-22T13:48:09Z)
- Dutch Humor Detection by Generating Negative Examples [5.888646114353371]
Humor detection is usually modeled as a binary classification task, trained to predict if the given text is a joke or another type of text.
We propose using text generation algorithms for imitating the original joke dataset to increase the difficulty for the learning algorithm.
We compare the humor detection capabilities of classic neural network approaches with the state-of-the-art Dutch language model RobBERT.
arXiv Detail & Related papers (2020-10-26T15:15:10Z)
- CompGuessWhat?!: A Multi-task Evaluation Framework for Grounded Language Learning [78.3857991931479]
We present GROLLA, an evaluation framework for Grounded Language Learning with Attributes.
We also propose a new dataset CompGuessWhat?! as an instance of this framework for evaluating the quality of learned neural representations.
arXiv Detail & Related papers (2020-06-03T11:21:42Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality or accuracy of this information and is not responsible for any consequences of its use.