LOLgorithm: Integrating Semantic,Syntactic and Contextual Elements for Humor Classification
- URL: http://arxiv.org/abs/2408.06335v1
- Date: Mon, 12 Aug 2024 17:52:11 GMT
- Title: LOLgorithm: Integrating Semantic,Syntactic and Contextual Elements for Humor Classification
- Authors: Tanisha Khurana, Kaushik Pillalamarri, Vikram Pande, Munindar Singh,
- Abstract summary: We categorize features into syntactic, semantic, and contextual dimensions, including lexicons, structural statistics, Word2Vec, WordNet, and phonetic style.
Our proposed model, Colbert, utilizes BERT embeddings and parallel hidden layers to capture sentence congruity.
SHAP interpretations and decision trees identify influential features, revealing that a holistic approach improves humor detection accuracy on unseen data.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: This paper explores humor detection through a linguistic lens, prioritizing syntactic, semantic, and contextual features over computational methods in Natural Language Processing. We categorize features into syntactic, semantic, and contextual dimensions, including lexicons, structural statistics, Word2Vec, WordNet, and phonetic style. Our proposed model, Colbert, utilizes BERT embeddings and parallel hidden layers to capture sentence congruity. By combining syntactic, semantic, and contextual features, we train Colbert for humor detection. Feature engineering examines essential syntactic and semantic features alongside BERT embeddings. SHAP interpretations and decision trees identify influential features, revealing that a holistic approach improves humor detection accuracy on unseen data. Integrating linguistic cues from different dimensions enhances the model's ability to understand humor complexity beyond traditional computational methods.
Related papers
- Pixel Sentence Representation Learning [67.4775296225521]
In this work, we conceptualize the learning of sentence-level textual semantics as a visual representation learning process.
We employ visually-grounded text perturbation methods like typos and word order shuffling, resonating with human cognitive patterns, and enabling perturbation to be perceived as continuous.
Our approach is further bolstered by large-scale unsupervised topical alignment training and natural language inference supervision.
arXiv Detail & Related papers (2024-02-13T02:46:45Z) - Paralinguistics-Enhanced Large Language Modeling of Spoken Dialogue [71.15186328127409]
Paralinguistics-enhanced Generative Pretrained Transformer (ParalinGPT)
Model takes the conversational context of text, speech embeddings, and paralinguistic attributes as input prompts within a serialized multitasking framework.
We utilize the Switchboard-1 corpus, including its sentiment labels as the paralinguistic attribute, as our spoken dialogue dataset.
arXiv Detail & Related papers (2023-12-23T18:14:56Z) - Syntax and Semantics Meet in the "Middle": Probing the Syntax-Semantics
Interface of LMs Through Agentivity [68.8204255655161]
We present the semantic notion of agentivity as a case study for probing such interactions.
This suggests LMs may potentially serve as more useful tools for linguistic annotation, theory testing, and discovery.
arXiv Detail & Related papers (2023-05-29T16:24:01Z) - Textual Entailment Recognition with Semantic Features from Empirical
Text Representation [60.31047947815282]
A text entails a hypothesis if and only if the true value of the hypothesis follows the text.
In this paper, we propose a novel approach to identifying the textual entailment relationship between text and hypothesis.
We employ an element-wise Manhattan distance vector-based feature that can identify the semantic entailment relationship between the text-hypothesis pair.
arXiv Detail & Related papers (2022-10-18T10:03:51Z) - Corpus-Guided Contrast Sets for Morphosyntactic Feature Detection in
Low-Resource English Varieties [3.3536302616846734]
We present a human-in-the-loop approach to generate and filter effective contrast sets via corpus-guided edits.
We show that our approach improves feature detection for both Indian English and African American English, demonstrate how it can assist linguistic research, and release our fine-tuned models for use by other researchers.
arXiv Detail & Related papers (2022-09-15T21:19:31Z) - Compositional Generalization in Grounded Language Learning via Induced
Model Sparsity [81.38804205212425]
We consider simple language-conditioned navigation problems in a grid world environment with disentangled observations.
We design an agent that encourages sparse correlations between words in the instruction and attributes of objects, composing them together to find the goal.
Our agent maintains a high level of performance on goals containing novel combinations of properties even when learning from a handful of demonstrations.
arXiv Detail & Related papers (2022-07-06T08:46:27Z) - Contextualized Semantic Distance between Highly Overlapped Texts [85.1541170468617]
Overlapping frequently occurs in paired texts in natural language processing tasks like text editing and semantic similarity evaluation.
This paper aims to address the issue with a mask-and-predict strategy.
We take the words in the longest common sequence as neighboring words and use masked language modeling (MLM) to predict the distributions on their positions.
Experiments on Semantic Textual Similarity show NDD to be more sensitive to various semantic differences, especially on highly overlapped paired texts.
arXiv Detail & Related papers (2021-10-04T03:59:15Z) - TextEssence: A Tool for Interactive Analysis of Semantic Shifts Between
Corpora [14.844685568451833]
We introduce TextEssence, an interactive system designed to enable comparative analysis of corpora using embeddings.
TextEssence includes visual, neighbor-based, and similarity-based modes of embedding analysis in a lightweight, web-based interface.
arXiv Detail & Related papers (2021-03-19T21:26:28Z) - Advancing Humor-Focused Sentiment Analysis through Improved
Contextualized Embeddings and Model Architecture [0.0]
Humor allows us to express thoughts and feelings conveniently and effectively.
As language models become ubiquitous through virtual-assistants and IOT devices, the need to develop humor-aware models rises exponentially.
arXiv Detail & Related papers (2020-11-23T22:30:32Z) - A Common Semantic Space for Monolingual and Cross-Lingual
Meta-Embeddings [10.871587311621974]
This paper presents a new technique for creating monolingual and cross-lingual meta-embeddings.
Existing word vectors are projected to a common semantic space using linear transformations and averaging.
The resulting cross-lingual meta-embeddings also exhibit excellent cross-lingual transfer learning capabilities.
arXiv Detail & Related papers (2020-01-17T15:42:29Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.