Fugu-MT Paper Translation (Abstract): Universal Smoothness via Bernstein Polynomials: A Constructive Approximation Approach for Activation Functions

Paper summary: Universal Smoothness via Bernstein Polynomials: A Constructive Approximation Approach for Activation Functions

arxiv url: http://arxiv.org/abs/2605.02591v1
Date: Mon, 04 May 2026 13:38:16 GMT
Status: Translation complete
System update date: 2026-05-05 20:33:50.308971
Title: Universal Smoothness via Bernstein Polynomials: A Constructive Approximation Approach for Activation Functions
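Title (reference translation): Universal Smoothness via Bernstein Polynomials: A Constructive Approximation Method for Activation Functions
Authors: Wentao Zhang, Yutong Zhang, Yifan Zhu, Wentao Mo
Abstract summary: The efficacy of deep neural networks depends heavily on the design of non-linear activation functions. The proposed method guarantees strictly continuous differentiability and a non-expansive Lipschitz constant of one. The approach consistently outperforms state-of-the-art baselines on standard image classification benchmarks.
Reference score (independently computed attention measure): 16.856453018275467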
License: http://creativecommons.org/licenses/by/4.0/
Abstract: The efficacy of deep neural networks is heavily reliant on the design of non-linear activation functions, yet existing approaches often struggle to balance optimization stability with computational efficiency. While piecewise linear functions offer inference speed, they suffer from optimization instability due to non-differentiability at the origin, whereas smooth counterparts typically incur significant computational overhead through their reliance on transcendental operations. To address these limitations, this paper proposes a general smoothing framework based on constructive approximation theory and introduces the Bernstein Linear Unit (BerLU). This novel activation function utilizes Bernstein polynomials to construct a differentiable quadratic transition region that effectively eliminates singularities while maintaining a piecewise linear structure. Theoretical analysis demonstrates that the proposed method guarantees strictly continuous differentiability and a non-expansive Lipschitz constant of one, which ensures stable gradient propagation and prevents the gradient explosion problems common in deep architectures. Comprehensive empirical evaluations across representative Vision Transformer and Convolutional Neural Network architectures confirm that this approach consistently outperforms state-of-the-art baselines on standard image classification benchmarks while delivering superior computational and memory efficiency.
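Abstract (reference translation): The efficacy of deep neural networks depends heavily on the design of non-linear activation functions, yet existing approaches often struggle to balance optimization stability against computational efficiency. Piecewise linear functions offer fast inference but suffer from optimization instability caused by non-differentiability at the origin, whereas smooth functions typically incur computational overhead because they rely on transcendental operations. To address these limitations, the paper proposes a general smoothing framework based on constructive approximation theory and introduces the Bernstein Linear Unit (BerLU). This new activation function uses Bernstein polynomials to construct a differentiable quadratic transition region that effectively eliminates singularities while preserving a piecewise linear structure. Theoretical analysis shows that the proposed method guarantees strict continuous differentiability and a non-expansive Lipschitz constant of one, which ensures stable gradient propagation and prevents the gradient explosion problems common in deep architectures. Comprehensive empirical evaluations across representative Vision Transformer and Convolutional Neural Network architectures confirm that the approach consistently outperforms state-of-the-art baselines on standard image classification benchmarks while delivering superior computational and memory efficiency.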
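
The abstract does not give the BerLU formula, so the following is only an illustrative sketch of the general idea it describes: a piecewise linear function whose kink is replaced by a quadratic Bernstein transition. It assumes the underlying function is ReLU, that the transition region is the symmetric interval [-delta, delta], and that the quadratic is the degree-2 Bernstein polynomial of ReLU sampled at -delta, 0, and +delta. The names `smoothed_relu`, `smoothed_relu_grad`, and `delta` are hypothetical and do not come from the paper.

```python
def smoothed_relu(x: float, delta: float = 1.0) -> float:
    """ReLU with a quadratic Bernstein transition on [-delta, delta].

    Illustrative assumption only, not the paper's BerLU definition.
    """
    if delta <= 0:
        raise ValueError("delta must be positive")
    if x <= -delta:
        return 0.0          # left linear piece (slope 0)
    if x >= delta:
        return x            # right linear piece (slope 1)
    # Map [-delta, delta] onto t in [0, 1].
    t = (x + delta) / (2.0 * delta)
    # Degree-2 Bernstein polynomial with node values
    # f(-delta) = 0, f(0) = 0, f(delta) = delta:
    #   0*(1-t)^2 + 0*2t(1-t) + delta*t^2 = delta * t^2
    return delta * t * t


def smoothed_relu_grad(x: float, delta: float = 1.0) -> float:
    """Derivative of smoothed_relu; it rises linearly from 0 to 1."""
    if delta <= 0:
        raise ValueError("delta must be positive")
    if x <= -delta:
        return 0.0
    if x >= delta:
        return 1.0
    return (x + delta) / (2.0 * delta)
```

Under these assumptions the value and the slope both match the linear pieces at x = -delta and x = delta, so the function is continuously differentiable, and the derivative stays within [0, 1], which is consistent with the Lipschitz constant of one claimed in the abstract. For example, `smoothed_relu(0.0, 1.0)` evaluates to 0.25, and no transcendental operations are involved.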
