Fugu-MT Paper Translation (Abstract): Once-for-All Channel Mixers (HYPERTINYPW): Generative Compression for TinyML

Paper abstract: Once-for-All Channel Mixers (HYPERTINYPW): Generative Compression for TinyML

arxiv url: http://arxiv.org/abs/2603.24916v1
Date: Thu, 26 Mar 2026 01:08:52 GMT
Status: Translation complete
System update date: 2026-03-27 20:52:48.036628
Title: Once-for-All Channel Mixers (HYPERTINYPW): Generative Compression for TinyML
Title (reference translation): Once-for-All Channel Mixers (HYPERTINYPW): Generative Compression for TinyML
Authors: Yassien Shaalan
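Abstract summary: The proposed HYPER-TINYPW is a compression-as-generation approach that replaces most stored PW weights with generated weights. A shared micro-MLP synthesizes PW kernels once at load time from tiny per-layer codes, caches them, and executes them with standard integer operators. This preserves commodity MCU runtimes and adds only a one-off synthesis cost.
Reference score (independently computed attention score): 0.0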
License: http://creativecommons.org/licenses/by-nc-sa/4.0/
Abstract: Deploying neural networks on microcontrollers is constrained by kilobytes of flash and SRAM, where 1x1 pointwise (PW) mixers often dominate memory even after INT8 quantization across vision, audio, and wearable sensing. We present HYPER-TINYPW, a compression-as-generation approach that replaces most stored PW weights with generated weights: a shared micro-MLP synthesizes PW kernels once at load time from tiny per-layer codes, caches them, and executes them with standard integer operators. This preserves commodity MCU runtimes and adds only a one-off synthesis cost; steady-state latency and energy match INT8 separable CNN baselines. Enforcing a shared latent basis across layers removes cross-layer redundancy, while keeping PW1 in INT8 stabilizes early, morphology-sensitive mixing. We contribute (i) TinyML-faithful packed-byte accounting covering generator, heads/factorization, codes, kept PW1, and backbone; (ii) a unified evaluation with validation-tuned t* and bootstrap confidence intervals; and (iii) a deployability analysis covering integer-only inference and boot versus lazy synthesis. On three ECG benchmarks (Apnea-ECG, PTB-XL, MIT-BIH), HYPER-TINYPW shifts the macro-F1 versus flash Pareto frontier: at about 225 kB it matches a roughly 1.4 MB CNN while being 6.31x smaller (84.15% fewer bytes), retaining at least 95% of large-model macro-F1. Under 32-64 kB budgets it sustains balanced detection where compact baselines degrade. The mechanism applies broadly to other 1D biosignals, on-device speech, and embedded sensing tasks where per-layer redundancy dominates, indicating a wider role for compression-as-generation in resource-constrained ML systems. Beyond ECG, HYPER-TINYPW transfers to TinyML audio: on Speech Commands it reaches 96.2% test accuracy (98.2% best validation), supporting broader applicability to embedded sensing workloads where repeated linear mixers dominate memory.
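The size figures quoted in the abstract can be cross-checked with a few lines of arithmetic. The sketch below assumes decimal units (1 MB = 1000 kB) and treats "about 225 kB" as exactly 225 kB; the function name `fraction_saved` is illustrative and not from the paper.

```python
def fraction_saved(compression_ratio):
    """Fraction of bytes removed when a model becomes `compression_ratio` times smaller."""
    return 1.0 - 1.0 / compression_ratio

RATIO = 6.31      # "6.31x smaller", from the abstract
SMALL_KB = 225.0  # "about 225 kB", treated as exact here (assumption)

# 1 - 1/6.31 is about 0.8415, consistent with "84.15% fewer bytes".
print(f"{100 * fraction_saved(RATIO):.2f}% fewer bytes")

# 225 kB * 6.31 is about 1420 kB, consistent with "roughly 1.4 MB".
print(f"implied large model: {SMALL_KB * RATIO / 1000:.2f} MB")
```

Both numbers agree with the abstract: a 6.31x reduction corresponds to 84.15% fewer bytes, and scaling 225 kB by 6.31 gives about 1.42 MB.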
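Abstract (reference translation): Deploying neural networks on microcontrollers is constrained by kilobytes of flash and SRAM, where 1x1 pointwise (PW) mixers often dominate memory even after INT8 quantization across vision, audio, and wearable sensing. A shared micro-MLP synthesizes PW kernels once at load time from tiny per-layer codes, caches them, and executes them with standard integer operators. This preserves commodity MCU runtimes and adds only a one-off synthesis cost; steady-state latency and energy match INT8 separable CNN baselines. Enforcing a shared latent basis across layers removes cross-layer redundancy, while keeping PW1 in INT8 stabilizes early, morphology-sensitive mixing. Contributions: (i) TinyML-faithful packed-byte accounting covering the generator, heads/factorization, codes, kept PW1, and backbone; (ii) a unified evaluation with validation-tuned t* and bootstrap confidence intervals; (iii) a deployability analysis covering integer-only inference and boot versus lazy synthesis. On the ECG benchmarks (Apnea-ECG, PTB-XL, MIT-BIH), HYPER-TINYPW shifts the macro-F1 versus flash Pareto frontier. Under 32-64 kB budgets it sustains balanced detection where compact baselines degrade. The mechanism applies broadly to other 1D biosignals, on-device speech, and embedded sensing tasks where per-layer redundancy dominates, indicating a wider role for compression-as-generation in resource-constrained ML systems. Beyond ECG, HYPER-TINYPW transfers to TinyML audio: on Speech Commands it reaches 96.2% test accuracy (98.2% best validation), supporting applicability to embedded sensing workloads where repeated linear mixers dominate memory.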
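The load-time mechanism described in the abstract (a shared micro-MLP turns a small per-layer code into a PW kernel once, the result is cached, and inference then uses ordinary integer arithmetic) can be sketched roughly as follows. This is a toy illustration, not the paper's implementation: the generator weights are random placeholders where a trained generator would be used, every generated layer is assumed to have the same shape, quantization is a simple symmetric per-tensor scheme, and all names (`make_generator`, `KernelCache`, `pointwise_int`) are hypothetical.

```python
import random

def make_generator(code_dim, hidden_dim, out_dim, seed=0):
    """Shared two-layer MLP as nested lists; random weights stand in for trained ones."""
    rng = random.Random(seed)
    w1 = [[rng.uniform(-0.5, 0.5) for _ in range(code_dim)] for _ in range(hidden_dim)]
    w2 = [[rng.uniform(-0.5, 0.5) for _ in range(hidden_dim)] for _ in range(out_dim)]
    return w1, w2

def synthesize_kernel(generator, code, c_out, c_in):
    """Map one per-layer code to a float (c_out x c_in) pointwise kernel."""
    w1, w2 = generator
    hidden = [max(0.0, sum(w * z for w, z in zip(row, code))) for row in w1]  # ReLU
    flat = [sum(w * h for w, h in zip(row, hidden)) for row in w2]
    return [flat[o * c_in:(o + 1) * c_in] for o in range(c_out)]

def quantize_int8(kernel):
    """Symmetric per-tensor quantization to integers in [-127, 127] (assumed scheme)."""
    max_abs = max(abs(v) for row in kernel for v in row) or 1.0
    scale = max_abs / 127.0
    q = [[max(-127, min(127, round(v / scale))) for v in row] for row in kernel]
    return q, scale

class KernelCache:
    """Synthesizes each layer's kernel at most once, then serves the cached copy."""
    def __init__(self, generator, codes, c_out, c_in):
        self.generator, self.codes = generator, codes
        self.c_out, self.c_in = c_out, c_in
        self._cache = {}
        self.synth_calls = 0

    def get(self, layer):
        if layer not in self._cache:  # lazy synthesis on first use
            kernel = synthesize_kernel(self.generator, self.codes[layer],
                                       self.c_out, self.c_in)
            self._cache[layer] = quantize_int8(kernel)
            self.synth_calls += 1
        return self._cache[layer]

    def warm_up(self):
        """Boot-time variant: synthesize every layer up front."""
        for layer in range(len(self.codes)):
            self.get(layer)

def pointwise_int(q_kernel, x_int):
    """1x1 channel mixing of one integer activation vector with integer accumulation."""
    return [sum(w * x for w, x in zip(row, x_int)) for row in q_kernel]

# Toy usage: three generated layers, each 3 input channels -> 4 output channels.
C_OUT, C_IN, CODE_DIM = 4, 3, 2
gen = make_generator(CODE_DIM, hidden_dim=8, out_dim=C_OUT * C_IN)
codes = [[0.3, -0.7], [1.0, 0.2], [-0.4, 0.9]]
cache = KernelCache(gen, codes, C_OUT, C_IN)
q_kernel, scale = cache.get(0)           # synthesized and cached on first access
y = pointwise_int(q_kernel, [12, -5, 7]) # steady state touches only integer weights
```

In this sketch `get` corresponds to lazy synthesis and `warm_up` to boot-time synthesis; either way, only the shared generator and the short codes need to be stored, and repeated calls reuse the cached integer kernel.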


Related papers