Fugu-MT Paper Translation (Abstract): lmfaoooo at SemEval-2026 Task 1: Humor Is an Audience. Preference Modeling for Constrained Humor Generation

Paper overview: lmfaoooo at SemEval-2026 Task 1: Humor Is an Audience. Preference Modeling for Constrained Humor Generation

arxiv url: http://arxiv.org/abs/2606.00022v1
Date: Tue, 14 Apr 2026 12:56:02 GMT
Status: Translation complete
System update date: 2026-06-15 07:09:36.580603
Title: lmfaoooo at SemEval-2026 Task 1: Humor Is an Audience. Preference Modeling for Constrained Humor Generation
Title (reference translation): lmfaoooo at SemEval-2026 Task 1: Humor Is an Audience. Preference Modeling for Constrained Humor Generation
Authors: Alexey Tikhonov, Alexey Ivanov
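Abstract summary: This paper describes a system for SemEval-2026 Task-1 (MWAHAHA), which focuses on humor generation under explicit constraints. The task evaluates submitted systems via human preference judgments in 1-on-1 arena-style comparisons. The system ranked 1st in the English and Chinese subtasks of MWAHAHA and 2nd in the Spanish subtask.
Reference score (attention score computed independently by the site): 5.924227288651974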
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Humor generation remains difficult not only because producing fluent, novel jokes is hard, but because "funny" is audience-dependent and supervision is noisy -- preferences vary with audience, context, and culture, and annotator agreement is often low. In this paper, we describe our system for the SemEval-2026 Task-1 (MWAHAHA), which focuses on humor generation under explicit constraints. The task evaluates submitted systems via human preference judgments in 1-on-1 arena-style comparisons. We adopt a "generate-many -> select-best" strategy. First, we generate a diverse pool of candidates per instance using multi-step prompting, model ensembling, and diversity-oriented decoding. Second, we select outputs using a preference model that approximates a "reader" by learning from human comparisons rather than absolute funniness scores. To support this approach, we release 2.5K human pairwise judgments collected through the Humor Arena prototype. We further propose an interpretable pipeline that converts labeled comparisons into a preference model. Across three preference datasets, our models consistently outperform baselines and show stronger cross-domain transfer. Finally, we apply the learned preference model to rank candidates for the MWAHAHA setting and release intermediate artifacts (candidate pools and rankings) to facilitate follow-up work. Our system ranked 1st in the English and Chinese subtasks of MWAHAHA and 2nd in the Spanish subtask.
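Abstract (reference translation): Humor generation remains difficult, not only because producing fluent, novel jokes is hard, but also because "funny" depends on the audience and supervision is noisy: preferences vary with audience, context, and culture, and annotator agreement is often low. This paper describes our system for SemEval-2026 Task-1 (MWAHAHA), which focuses on humor generation under explicit constraints. The task evaluates submitted systems via human preference judgments in 1-on-1 arena-style comparisons. We adopt a "generate-many -> select-best" strategy. First, we generate a diverse pool of candidates per instance using multi-step prompting, model ensembling, and diversity-oriented decoding. Second, we select outputs using a preference model that approximates a "reader" by learning from human comparisons rather than absolute funniness scores. To support this approach, we release 2.5K human pairwise judgments collected through the Humor Arena prototype. We further propose an interpretable pipeline that converts labeled comparisons into a preference model. Across three preference datasets, our models consistently outperform baselines and show stronger cross-domain transfer. Finally, we apply the learned preference model to rank candidates for the MWAHAHA setting and release intermediate artifacts (candidate pools and rankings) to facilitate follow-up work. Our system ranked 1st in the English and Chinese subtasks of MWAHAHA and 2nd in the Spanish subtask.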
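
The "generate-many -> select-best" strategy with a preference model learned from pairwise comparisons can be illustrated with a small sketch. The abstract does not specify the model family or features, so everything below is an assumption for illustration only: a linear Bradley-Terry-style scorer trained on (winner, loser) pairs, with a hypothetical `toy_featurize` function and made-up toy data. It is not the authors' pipeline.

```python
import math


def train_preference_model(pairs, featurize, n_features, epochs=200, lr=0.1):
    """Learn weights w so that sigmoid(w . (f(winner) - f(loser))) is high.

    pairs: list of (winner_text, loser_text) taken from human comparisons.
    featurize: hypothetical function mapping text -> list of n_features floats.
    """
    w = [0.0] * n_features
    for _ in range(epochs):
        for winner, loser in pairs:
            diff = [a - b for a, b in zip(featurize(winner), featurize(loser))]
            margin = sum(wi * di for wi, di in zip(w, diff))
            margin = max(-30.0, min(30.0, margin))  # clamp to avoid overflow in exp
            p_win = 1.0 / (1.0 + math.exp(-margin))
            # Gradient of log(p_win) with respect to w is (1 - p_win) * diff.
            w = [wi + lr * (1.0 - p_win) * di for wi, di in zip(w, diff)]
    return w


def score(text, w, featurize):
    """Scalar preference score of a single candidate."""
    return sum(wi * fi for wi, fi in zip(w, featurize(text)))


def select_best(candidates, w, featurize):
    """The select-best step: return the highest-scoring candidate in the pool."""
    return max(candidates, key=lambda c: score(c, w, featurize))


def toy_featurize(text):
    """Illustrative features only: word count and exclamation-mark count."""
    return [float(len(text.split())), float(text.count("!"))]


# Made-up comparisons in which the shorter text was preferred.
toy_pairs = [
    ("short joke", "a much longer and rambling joke text"),
    ("tiny", "this one goes on and on"),
]
weights = train_preference_model(toy_pairs, toy_featurize, n_features=2)
best = select_best(["one two three four", "one two"], weights, toy_featurize)
```

Training on comparisons rather than absolute scores means the model only needs to get the ordering within each pair right, which matches the arena-style evaluation described in the abstract. In practice the candidate pool would come from the generation stage and the features from a richer text representation.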
