Fugu-MT Paper Translation (Abstract): Beauty in the Eye of AI: Aligning LLMs and Vision Models with Human Aesthetics in Network Visualization

Paper abstract: Beauty in the Eye of AI: Aligning LLMs and Vision Models with Human Aesthetics in Network Visualization

arxiv url: http://arxiv.org/abs/2604.03417v1
Date: Fri, 03 Apr 2026 19:30:34 GMT
Status: Translation complete
Last updated in system: 2026-04-07 15:49:18.562649
Title: Beauty in the Eye of AI: Aligning LLMs and Vision Models with Human Aesthetics in Network Visualization
Authors: Peng Zhang, Xuefeng Li, Xiaoqi Wang, Han-Wei Shen, Yifan Hu
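Abstract summary: We explore large language models (LLMs) and vision models (VMs) as proxies for human judgment. We show that prompt engineering that combines diverse input formats, such as image embeddings, significantly improves LLM-human alignment. The results suggest that AI can feasibly serve as a scalable proxy for human labelers.
Reference score (attention score computed by this site): 47.53880889741314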
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Network visualization has traditionally relied on heuristic metrics, such as stress, under the assumption that optimizing them leads to aesthetic and informative layouts. However, no single metric consistently produces the most effective results. A data-driven alternative is to learn from human preferences, where annotators select their favored visualization among multiple layouts of the same graphs. These human-preference labels can then be used to train a generative model that approximates human aesthetic preferences. However, obtaining human labels at scale is costly and time-consuming. As a result, this generative approach has so far been tested only with machine-labeled data. In this paper, we explore the use of large language models (LLMs) and vision models (VMs) as proxies for human judgment. Through a carefully designed user study involving 27 participants, we curated a large set of human preference labels. We used this data both to better understand human preferences and to bootstrap LLM/VM labelers. We show that prompt engineering that combines few-shot examples and diverse input formats, such as image embeddings, significantly improves LLM-human alignment, and additional filtering by the confidence score of the LLM pushes the alignment to human-human levels. Furthermore, we demonstrate that carefully trained VMs can achieve VM-human alignment at a level comparable to that between human annotators. Our results suggest that AI can feasibly serve as a scalable proxy for human labelers.
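
The abstract reports that filtering LLM labels by the model's confidence score raises LLM-human alignment. The sketch below illustrates that idea under stated assumptions: alignment is taken to be the simple fraction of items on which the LLM and the human pick the same layout, and the function name `agreement_rate`, the `min_confidence` parameter, and the toy labels are all hypothetical. The abstract does not specify the paper's actual alignment measure or data.

```python
from typing import Sequence, Tuple


def agreement_rate(
    llm_choices: Sequence[str],
    human_choices: Sequence[str],
    confidences: Sequence[float],
    min_confidence: float = 0.0,
) -> Tuple[float, float]:
    """Return (agreement, coverage) for items with confidence >= min_confidence.

    agreement: fraction of kept items where the LLM and human choices match.
    coverage:  fraction of all items that survive the confidence filter.
    Assumption: plain match rate stands in for "alignment" here.
    """
    if not (len(llm_choices) == len(human_choices) == len(confidences)):
        raise ValueError("all inputs must have the same length")

    # Keep only the label pairs the LLM was sufficiently confident about.
    kept = [
        (llm, human)
        for llm, human, conf in zip(llm_choices, human_choices, confidences)
        if conf >= min_confidence
    ]
    coverage = len(kept) / len(llm_choices) if llm_choices else 0.0
    if not kept:
        return float("nan"), coverage

    agreement = sum(llm == human for llm, human in kept) / len(kept)
    return agreement, coverage


if __name__ == "__main__":
    # Made-up toy labels: each entry is the preferred layout for one graph.
    llm = ["A", "B", "A", "C", "B", "A"]
    human = ["A", "B", "C", "C", "A", "A"]
    conf = [0.9, 0.8, 0.4, 0.95, 0.3, 0.7]

    print(agreement_rate(llm, human, conf))                      # no filtering
    print(agreement_rate(llm, human, conf, min_confidence=0.6))  # filtered
```

The function returns coverage alongside agreement because a stricter threshold raises agreement only by discarding labels, so the two numbers are best read together.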

