Can Pre-trained Language Models Understand Chinese Humor?
- URL: http://arxiv.org/abs/2407.04105v1
- Date: Thu, 4 Jul 2024 18:13:38 GMT
- Title: Can Pre-trained Language Models Understand Chinese Humor?
- Authors: Yuyan Chen, Zhixu Li, Jiaqing Liang, Yanghua Xiao, Bang Liu, Yunwen Chen
- Abstract summary: This paper is the first work that systematically investigates the humor understanding ability of pre-trained language models (PLMs).
We construct a comprehensive Chinese humor dataset that fully meets the data requirements of the proposed evaluation framework.
Our empirical study on the Chinese humor dataset yields several valuable observations, which offer useful guidance for the future optimization of PLMs in humor understanding and generation.
- Score: 74.96509580592004
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Humor understanding is an important and challenging research problem in natural language processing. With the rising popularity of pre-trained language models (PLMs), some recent work has made preliminary attempts to adopt PLMs for humor recognition and generation. However, these simple attempts do not substantially answer the question: are PLMs capable of humor understanding? This paper is the first work that systematically investigates the humor understanding ability of PLMs. For this purpose, we design a comprehensive framework with three evaluation steps and four evaluation tasks. We also construct a comprehensive Chinese humor dataset that fully meets the data requirements of the proposed evaluation framework. Our empirical study on this dataset yields several valuable observations, which offer useful guidance for the future optimization of PLMs in humor understanding and generation.
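To make the evaluation setup concrete, here is a minimal sketch of what one evaluation task might look like if humor recognition is framed as binary classification with an off-the-shelf Chinese PLM. The model choice and example are illustrative assumptions; the paper's actual framework, tasks, and dataset are not reproduced here.

```python
# A minimal sketch, assuming humor recognition is framed as binary
# classification with an off-the-shelf Chinese PLM. The model choice and
# the example joke are illustrative, not taken from the paper.
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

MODEL_NAME = "bert-base-chinese"  # assumption: any Chinese PLM could stand in
tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForSequenceClassification.from_pretrained(MODEL_NAME, num_labels=2)
model.eval()

def humor_probability(text: str) -> float:
    """Return the model's probability that `text` is humorous (label 1)."""
    inputs = tokenizer(text, return_tensors="pt", truncation=True)
    with torch.no_grad():
        logits = model(**inputs).logits
    return torch.softmax(logits, dim=-1)[0, 1].item()

# An untuned classification head gives near-random scores; an evaluation
# would first fine-tune or probe the PLM on the humor dataset.
print(humor_probability("为什么程序员喜欢黑暗？因为光会产生 bug。"))
```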
Related papers
- Small But Funny: A Feedback-Driven Approach to Humor Distillation [19.498647865012426]
We study the effect of assigning a dual role to Large Language Models (LLMs): a "teacher" generating data and a "critic" evaluating the student's performance.
Our experiments on humor generation reveal that incorporating feedback significantly narrows the performance gap between Small Language Models (SLMs) and their larger counterparts.
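A hedged sketch of that dual-role setup follows: the same large model is queried first as a teacher (generating jokes for the student to imitate) and then as a critic (scoring the student's attempts). The `query_llm` helper is a hypothetical placeholder, not an API from the paper.

```python
# A hedged sketch of the dual-role setup: one LLM call acts as a "teacher"
# producing demonstrations, another as a "critic" scoring the student's
# attempts so the feedback can steer the next training round.

def query_llm(prompt: str) -> str:
    """Placeholder for a call to the large teacher/critic model."""
    raise NotImplementedError("plug in an LLM client here")

def build_training_round(topics, student_attempts):
    # Teacher role: generate one demonstration joke per topic.
    demonstrations = [
        (topic, query_llm(f"Write a short joke about {topic}."))
        for topic in topics
    ]
    # Critic role: rate each student attempt with a brief explanation.
    feedback = [
        query_llm(f"Rate this joke about '{topic}' from 1 to 5 and explain why:\n{attempt}")
        for topic, attempt in student_attempts
    ]
    # Feedback can filter or re-weight the demonstrations before the
    # small student model is fine-tuned on them in the next round.
    return demonstrations, feedback
```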
arXiv Detail & Related papers (2024-02-28T07:02:38Z)
- Getting Serious about Humor: Crafting Humor Datasets with Unfunny Large Language Models [27.936545041302377]
Large language models (LLMs) can generate synthetic data for humor detection by editing texts.
We benchmark LLMs on an existing human dataset and show that current LLMs display an impressive ability to 'unfun' jokes.
We extend our approach to a code-mixed English-Hindi humor dataset, where we find that GPT-4's synthetic data is highly rated by bilingual annotators.
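The editing pipeline could look roughly like the sketch below, where the prompt wording and the `call_llm` helper are assumptions rather than the paper's exact setup: each joke is minimally edited into a non-joke, yielding aligned positive/negative pairs for humor detection.

```python
# A sketch of the "unfun" idea under stated assumptions: ask an LLM to
# minimally edit a joke so it is no longer funny, producing aligned
# (funny, unfunny) pairs for humor detection.

UNFUN_PROMPT = (
    "Edit the following joke as little as possible so that it is no longer "
    "funny, keeping the topic and style intact:\n\n{joke}"
)

def call_llm(prompt: str) -> str:
    """Placeholder for an API call to a model such as GPT-4."""
    raise NotImplementedError("plug in an LLM client here")

def make_detection_pairs(jokes):
    pairs = []
    for joke in jokes:
        unfunny = call_llm(UNFUN_PROMPT.format(joke=joke))
        pairs.append({"text": joke, "label": 1})     # original, humorous
        pairs.append({"text": unfunny, "label": 0})  # minimally edited, unfunny
    return pairs
```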
arXiv Detail & Related papers (2024-02-23T02:58:12Z)
- Exploring In-Context Learning of Textless Speech Language Model for Speech Classification Tasks [98.5311231450689]
In-context learning (ICL) has played an essential role in utilizing large language models (LLMs).
This study is the first work exploring ICL for speech classification tasks with a textless speech LM.
arXiv Detail & Related papers (2023-10-19T05:31:45Z)
- An Empirical Study of Instruction-tuning Large Language Models in Chinese [32.5288378307064]
This paper makes an in-depth empirical study of instruction-tuning LLMs in Chinese, which can serve as a cookbook.
Specifically, we systematically explore the impact of LLM bases, parameter-efficient methods, and instruction data types.
We also conduct experiments to study the impact of other factors, e.g., chain-of-thought data and human-value alignment.
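As one hedged illustration of the parameter-efficient side of this design space, the following sketch applies LoRA to a causal-LM base with the peft library; the base model name is a placeholder, since the paper compares several bases and methods.

```python
# A minimal sketch of one parameter-efficient option: LoRA on a causal-LM
# base via the peft library. The base model name is a placeholder, not a
# model named in the paper.
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import LoraConfig, TaskType, get_peft_model

BASE = "your-chinese-llm-base"  # placeholder base model

tokenizer = AutoTokenizer.from_pretrained(BASE)
model = AutoModelForCausalLM.from_pretrained(BASE)

lora = LoraConfig(
    task_type=TaskType.CAUSAL_LM,
    r=8,            # rank of the low-rank update matrices
    lora_alpha=16,  # scaling factor for the update
    lora_dropout=0.05,
)
model = get_peft_model(model, lora)
model.print_trainable_parameters()  # only the small LoRA adapters train

# From here, a standard instruction-tuning loop (e.g. transformers.Trainer)
# over (instruction, response) pairs would update only the adapter weights.
```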
arXiv Detail & Related papers (2023-10-11T09:18:09Z)
- Towards Multimodal Prediction of Spontaneous Humour: A Novel Dataset and First Results [84.37263300062597]
Humor is a substantial element of human social behavior, affect, and cognition.
Current methods of humor detection have been exclusively based on staged data, making them inadequate for "real-world" applications.
We contribute to addressing this deficiency by introducing the novel Passau-Spontaneous Football Coach Humor dataset, comprising about 11 hours of recordings.
arXiv Detail & Related papers (2022-09-28T17:36:47Z)
- An Interpretability Evaluation Benchmark for Pre-trained Language Models [37.16893581395874]
We propose a novel evaluation benchmark that provides both English and Chinese annotated data.
It tests LMs' abilities in multiple dimensions, i.e., grammar, semantics, knowledge, reasoning, and computation.
It contains perturbed instances for each original instance, so that rationale consistency under perturbation serves as the metric for faithfulness.
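A toy version of that faithfulness metric might compute the overlap between the rationales extracted for an original instance and its perturbed counterpart; the Jaccard score below is illustrative, not necessarily the benchmark's exact formulation.

```python
# An illustrative toy faithfulness metric: a rationale is treated as a set
# of important tokens, and consistency between the rationales of an
# original/perturbed pair is measured by Jaccard overlap.

def rationale_consistency(orig_rationale: set, pert_rationale: set) -> float:
    """Jaccard overlap between rationales of an original/perturbed pair."""
    if not orig_rationale and not pert_rationale:
        return 1.0
    union = orig_rationale | pert_rationale
    return len(orig_rationale & pert_rationale) / len(union)

# Two of four distinct tokens are shared, so consistency is 0.5.
print(rationale_consistency({"cat", "sat", "mat"}, {"cat", "sat", "rug"}))
```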
arXiv Detail & Related papers (2022-07-28T08:28:09Z)
- Pre-Trained Language Models for Interactive Decision-Making [72.77825666035203]
We describe a framework for imitation learning in which goals and observations are represented as a sequence of embeddings.
We demonstrate that this framework enables effective generalization across different environments.
For test tasks involving novel goals or novel scenes, initializing policies with language models improves task completion rates by 43.6%.
arXiv Detail & Related papers (2022-02-03T18:55:52Z)
- Language Models as Zero-Shot Planners: Extracting Actionable Knowledge for Embodied Agents [111.33545170562337]
We investigate the possibility of grounding high-level tasks, expressed in natural language, to a chosen set of actionable steps.
We find that if pre-trained LMs are large enough and prompted appropriately, they can effectively decompose high-level tasks into low-level plans.
We propose a procedure that conditions on existing demonstrations and semantically translates the plans to admissible actions.
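That translation step can be sketched as nearest-neighbor matching in sentence-embedding space: each free-form step the LM generates is mapped onto the closest action in a fixed admissible set. The encoder and the action list below are illustrative assumptions.

```python
# A sketch of the semantic translation step, assuming sentence-embedding
# cosine similarity between generated plan steps and admissible actions.
from sentence_transformers import SentenceTransformer, util

encoder = SentenceTransformer("all-MiniLM-L6-v2")  # illustrative encoder

ADMISSIBLE = ["walk to kitchen", "open fridge", "grab milk", "close fridge"]
admissible_emb = encoder.encode(ADMISSIBLE, convert_to_tensor=True)

def translate(step: str) -> str:
    """Return the admissible action most similar to a generated plan step."""
    step_emb = encoder.encode(step, convert_to_tensor=True)
    scores = util.cos_sim(step_emb, admissible_emb)[0]
    return ADMISSIBLE[int(scores.argmax())]

print(translate("go over to the refrigerator and open it"))  # -> "open fridge"
```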
arXiv Detail & Related papers (2022-01-18T18:59:45Z)
- Intrinsic Knowledge Evaluation on Chinese Language Models [5.293979881130493]
This paper proposes four tasks on syntactic, semantic, commonsense, and factual knowledge, aggregating to a total of 39,308 questions.
Our probes and knowledge data prove to be a reliable benchmark for evaluating pre-trained Chinese LMs.
arXiv Detail & Related papers (2020-11-29T04:34:39Z)
- oLMpics -- On what Language Model Pre-training Captures [84.60594612120173]
We propose eight reasoning tasks, which require operations such as comparison, conjunction, and composition.
A fundamental challenge is to understand whether the performance of an LM on a task should be attributed to the pre-trained representations or to the process of fine-tuning on the task data.
arXiv Detail & Related papers (2019-12-31T12:11:35Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.