Fugu-MT Paper Translation (Abstract): Your AI Travel Agent Would Book You a Bullfight: An Agentic Benchmark for Implicit Animal Welfare in Frontier AI Models

Paper abstract: Your AI Travel Agent Would Book You a Bullfight: An Agentic Benchmark for Implicit Animal Welfare in Frontier AI Models

arxiv url: http://arxiv.org/abs/2606.18142v2
Date: Wed, 17 Jun 2026 06:58:17 GMT
Status: Translation complete
Last updated in system: 2026-06-18 13:57:35.227023
Title: Your AI Travel Agent Would Book You a Bullfight: An Agentic Benchmark for Implicit Animal Welfare in Frontier AI Models
Authors: Jasmine Brazilek, Joel Christoph, Miles Tidmarsh, Carol Kline, Oliver Tullio, Arturs Kanepajs
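Abstract summary: We introduce TAC (Travel Agent Compassion), the first agentic benchmark measuring whether AI agents avoid options involving animal exploitation when acting on behalf of users. Every model scores below the chance level of 64%, with the best performer (Claude Opus 4.7) at 53%. We discuss implications for category-level variation across cultural domains, the limits of text-response welfare benchmarks, and the EU General-Purpose AI Code of Practice systemic risk framework.
Reference score (independently computed attention score): 0.030786914102688596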
License: http://creativecommons.org/licenses/by/4.0/
Abstract: AI agents are moving from advisors to actors, booking travel, planning menus, and running procurement on behalf of users. Existing benchmarks for AI and animal welfare evaluate model text responses to question-answer prompts, leaving open whether the welfare reasoning surfaced in those responses transfers to agentic deployment where the model must take actions with tools. We introduce TAC (Travel Agent Compassion), the first agentic benchmark measuring whether AI agents avoid options involving animal exploitation when acting on behalf of users. TAC presents an AI agent with twelve hand-authored travel booking scenarios across six categories of animal exploitation, augmented to forty-eight samples to control for price, rating, and position confounds. We evaluate seven frontier models from four labs. Every model scores below the chance level of sixty-four percent, with the best performer (Claude Opus 4.7) at fifty-three percent. A single welfare-aware sentence in the system prompt yields gains of forty-seven to sixty-three percentage points in Claude and GPT-5.5, twenty-six points in GPT-5.2, and under twelve points in DeepSeek and Gemini. An auxiliary Inspect Scout audit of 288 base-condition transcripts from the top two performers, using Gemini 2.5 Flash Lite as judge, flags zero transcripts for evaluation awareness, suggesting the below-chance rates do not stem from the models recognising the evaluation. We discuss implications for category-level variation across cultural domains, the limits of text-response welfare benchmarks, and the EU General-Purpose AI Code of Practice systemic risk framework.
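The abstract compares each model's score against a "chance level" but does not say how that baseline is computed. The sketch below shows one plausible reading, assuming the baseline is the expected welfare-safe rate of an agent that picks one option uniformly at random in each sample. The `Sample` class, the `harmful` labels, and both function names are hypothetical and are not taken from the TAC benchmark; the toy data is illustrative only and does not reproduce the paper's figures.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Sample:
    # Hypothetical labels: True where an option involves animal exploitation.
    harmful: List[bool]


def chance_level(samples: List[Sample]) -> float:
    """Expected welfare-safe rate of an agent choosing uniformly at random."""
    per_sample = [s.harmful.count(False) / len(s.harmful) for s in samples]
    return sum(per_sample) / len(per_sample)


def safe_rate(samples: List[Sample], choices: List[int]) -> float:
    """Fraction of samples in which the chosen option index is not harmful."""
    safe = sum(1 for s, c in zip(samples, choices) if not s.harmful[c])
    return safe / len(samples)


# Toy data, not from the paper.
samples = [
    Sample(harmful=[True, False, False]),        # 2 of 3 options are safe
    Sample(harmful=[True, False, True, False]),  # 2 of 4 options are safe
]
baseline = chance_level(samples)
observed = safe_rate(samples, choices=[0, 1])
print(f"chance level: {baseline:.3f}, observed safe rate: {observed:.3f}")
```

Under this reading, an observed rate below the baseline means the agent selects exploitative options more often than random selection would, which is the sense in which the abstract describes every model as scoring below chance.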
