Fugu-MT Paper Translation (Abstract): CogScale: Scalable Benchmark for Sequence Processing

Paper overview: CogScale: Scalable Benchmark for Sequence Processing

arxiv url: http://arxiv.org/abs/2605.19758v1
Date: Tue, 19 May 2026 12:32:52 GMT
Status: Translation complete
Last updated in system: 2026-05-20 15:03:09.327627
Title: CogScale: Scalable Benchmark for Sequence Processing
Title (reference translation): CogScale: A Scalable Benchmark for Sequence Processing
Authors: Yannis Bendi-Ouis, Romain de Coudenhove, Xavier Hinaut
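Abstract summary: Testing a new architecture often requires scaling up to massive datasets and models. We propose CogScale, a benchmark of 14 scalable synthetic tasks designed to isolate and evaluate specific cognitive and memory abilities. The results show that while classical RNNs and Echo State Networks excel at basic retention within strict parameter budgets, only attention mechanisms and modern state-space models consistently maintain high performance.
Reference score (independently computed attention score): 1.8853398065417313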
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: The ability to maintain and manipulate information over time is a fundamental aspect of living beings and Artificial Intelligence. While modern models have achieved remarkable success in tasks like natural language processing, evaluating the capacity of novel architectures to process sequential information remains computationally expensive and time-consuming. Testing a new architecture often requires scaling up to massive datasets and models, leading to vast computational costs and slow iteration cycles. In this paper, we propose CogScale, a benchmark of 14 scalable synthetic tasks designed to isolate and evaluate specific cognitive and memory abilities at different parametrizable scales. By providing a standardized, lightweight framework, CogScale allows researchers to rapidly validate architectural innovations before committing to large-scale training. To establish a solid baseline, we evaluate seven distinct architectures: Gated Recurrent Unit (GRU), Long Short-Term Memory (LSTM), xLSTM, Echo State Network (ESN), Mamba, Transformer Decoder, and Transformer Encoder-Decoder. These evaluations are conducted under strict parameter budgets (1k, 10k, and 100k) and across different difficulty levels and scales. Our results show that while classical RNNs and Echo State Networks excel at basic retention within strict parameter budgets, only attention mechanisms and modern state-space models consistently maintain high performance as reasoning complexity and task difficulty scale.
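Abstract (reference translation): The ability to maintain and manipulate information over time is a fundamental aspect of living beings and artificial intelligence. While modern models have achieved remarkable success in tasks such as natural language processing, evaluating the capacity of novel architectures to process sequential information remains computationally expensive and time-consuming. Testing a new architecture often requires scaling up to massive datasets and models. In this paper, we propose CogScale, a benchmark of 14 scalable synthetic tasks. By providing a standardized, lightweight framework, CogScale allows architectural innovations to be validated rapidly before committing to large-scale training. We evaluate seven architectures: Gated Recurrent Unit (GRU), Long Short-Term Memory (LSTM), xLSTM, Echo State Network (ESN), Mamba, Transformer Decoder, and Transformer Encoder-Decoder. These evaluations are conducted under strict parameter budgets (1k, 10k, and 100k) and across different difficulty levels and scales. The results show that while classical RNNs and Echo State Networks excel at basic retention within strict parameter budgets, only attention mechanisms and modern state-space models consistently maintain high performance as reasoning complexity and task difficulty scale.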

Related papers