Fugu-MT paper translation (abstract): A Geometric Profile of Semantic Information in Text: Frame-Conditional Uniqueness and a Trade-Off Triangle for Scalar Summaries

Paper abstract: A Geometric Profile of Semantic Information in Text: Frame-Conditional Uniqueness and a Trade-Off Triangle for Scalar Summaries

arxiv url: http://arxiv.org/abs/2606.11222v1
Date: Wed, 27 May 2026 04:37:26 GMT
Status: translation complete
Last updated in system: 2026-06-15 07:09:36.850473
Title: A Geometric Profile of Semantic Information in Text: Frame-Conditional Uniqueness and a Trade-Off Triangle for Scalar Summaries
Authors: Dmitriy Kompaneets
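Abstract summary: We develop a framework that measures semantic content from the structure of a text's sentence embeddings. The recommended rank-normalized configuration passes 25 of 28 ordinal checks as point estimates. A separate variational result connects the breadth coordinate to the log-determinant of a determinantal point process.
Reference score (independently computed attention level): 0.0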
License: http://creativecommons.org/licenses/by/4.0/
Abstract: How much meaning does a text carry? Shannon's theory measures uncertainty over symbols and is intentionally indifferent to meaning, while pairwise metrics such as BERTScore compare two texts rather than characterizing one. We develop a geometric framework that measures semantic content from the structure of a text's sentence embeddings. The framework has three parts. First, within a fixed embedding and baseline, six natural axioms uniquely determine a scalar measure up to scale, a frame-conditional uniqueness theorem. The resulting scalar is empirically too coarse, motivating a richer representation. Second, we propose a three-coordinate semantic profile capturing novelty (displacement from generic discourse), breadth (diversity of distinct ideas), and integration (connectedness among them), together with a discrete minimal unit (the semantic quantum) whose resolution is fixed by a clustering threshold $\tau$. Third, we prove a no-go theorem: no scalar summary of the profile can simultaneously satisfy analytic stability under paraphrase and concatenation, ordinal robustness across text scales, and cross-representation comparability. We exhibit two practical scalars, $S_{\mathrm{minmax}}$ and $S_{\mathrm{rank}}$, each occupying a distinct corner of this trade-off triangle. Validation across 23 synthetic categories, 5 Project Gutenberg novels, and 3 embedding models confirms the trade-off. The recommended rank-normalized configuration passes 25 of 28 ordinal checks as point estimates (21 of 28 after Benjamini-Hochberg correction), outperforming seven baselines including unigram entropy and a BERTScore-based novelty signal. A separate variational result connects the breadth coordinate to the log-determinant of a determinantal point process (Spearman $\rho = 0.985$ over 507 Gutenberg chapters), giving an optimization-theoretic foundation for breadth.
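The abstract links the breadth coordinate to the log-determinant of a determinantal point process. As a rough illustration of why a log-determinant can act as a diversity score, the sketch below computes it for a cosine-similarity kernel over sentence embeddings. The kernel choice, the `eps` regularizer, the function name `logdet_breadth`, and the toy vectors are all assumptions; the paper's actual construction is not given in the abstract.

```python
import numpy as np


def logdet_breadth(embeddings, eps=1e-6):
    """Log-determinant of a cosine-similarity kernel over the rows.

    Assumes every row is nonzero. Near-duplicate rows make the kernel
    nearly singular, which drives the log-determinant down.
    """
    X = np.asarray(embeddings, dtype=float)
    X = X / np.linalg.norm(X, axis=1, keepdims=True)   # unit-normalize rows
    L = X @ X.T + eps * np.eye(X.shape[0])             # regularized Gram kernel
    _, logdet = np.linalg.slogdet(L)                   # stable log|det L|
    return logdet


# Toy "sentence embeddings" (hypothetical): three orthogonal ideas
# versus three nearly parallel restatements of one idea.
diverse = np.eye(3)
redundant = np.array([[1.00, 0.0, 0.0],
                      [0.99, 0.1, 0.0],
                      [0.98, 0.0, 0.2]])

print(logdet_breadth(diverse) > logdet_breadth(redundant))
```

The determinant of the kernel equals the squared volume spanned by the embedding vectors, so a text whose sentences point in distinct directions scores higher than one that restates a single idea.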


Related papers