Paper summary: Learning and Adaptation in Millimeter-Wave: a Dual Timescale Variational
- arxiv url: http://arxiv.org/abs/2107.05466v1
- Date: Sun, 27 Jun 2021 19:04:18 GMT
- Status: Processing complete
- Last updated in system: 2021-07-18 12:20:50.988669
- Title: Learning and Adaptation in Millimeter-Wave: a Dual Timescale Variational
- Title (reference translation): Learning and Adaptation in Millimeter-Wave: a Dual-Timescale Variational Framework
- Authors: Muddassar Hussain, Nicolo Michelusi
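- Abstract summary: Millimeter-wave vehicular networks incur enormous beam-training overhead to enable narrow-beam communications.
- Reference score (attention measure computed independently by this site): 4.162663632560141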
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Millimeter-wave vehicular networks incur enormous beam-training overhead to
enable narrow-beam communications. This paper proposes a learning and
adaptation framework in which the dynamics of the communication beams are
learned and then exploited to design adaptive beam-training with low overhead:
on a long-timescale, a deep recurrent variational autoencoder (DR-VAE) uses
noisy beam-training observations to learn a probabilistic model of beam
dynamics; on a short-timescale, an adaptive beam-training procedure is
formulated as a partially observable (PO-) Markov decision process (MDP) and
optimized via point-based value iteration (PBVI) by leveraging beam-training
feedback and a probabilistic prediction of the strongest beam pair provided by
the DR-VAE. In turn, beam-training observations are used to refine the DR-VAE
via stochastic gradient ascent in a continuous process of learning and
adaptation. The proposed DR-VAE mobility learning framework learns accurate
beam dynamics: it reduces the Kullback-Leibler divergence between the ground
truth and the learned beam dynamics model by 86% over the Baum-Welch algorithm
and by 92% over a naive mobility learning approach that neglects feedback
errors. The proposed dual-timescale approach yields a negligible loss of
spectral efficiency compared to a genie-aided scheme operating under error-free
feedback and foreknown mobility model. Finally, a low-complexity policy is
proposed by reducing the POMDP to an error-robust MDP. It is shown that the
PBVI- and error-robust MDP-based policies improve the spectral efficiency by
85% and 67%, respectively, over a policy that scans exhaustively over the
dominant beam pairs, and by 16% and 7%, respectively, over a state-of-the-art
POMDP policy.
- Abstract (reference translation): Millimeter-wave vehicular networks incur enormous beam-training overhead to enable narrow-beam communications.
This paper proposes a learning and adaptation framework in which the dynamics of the communication beams are learned and then exploited to design adaptive beam-training with low overhead: on a long-timescale, a deep recurrent variational autoencoder (DR-VAE) uses noisy beam-training observations to learn a probabilistic model of beam dynamics; on a short-timescale, an adaptive beam-training procedure is formulated as a partially observable (PO-) Markov decision process (MDP) and optimized via point-based value iteration (PBVI) by leveraging beam-training feedback and a probabilistic prediction of the strongest beam pair provided by the DR-VAE.
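The proposed DR-VAE mobility learning framework learns accurate beam dynamics: it reduces the Kullback-Leibler divergence between the ground truth and the learned beam dynamics model by 86% relative to the Baum-Welch algorithm and by 92% relative to a naive mobility learning approach that neglects feedback errors.
Finally, a low-complexity policy is proposed by reducing the POMDP to an error-robust MDP.
The PBVI- and error-robust MDP-based policies improve the spectral efficiency by 85% and 67%, respectively, over a policy that scans exhaustively over the dominant beam pairs, and by 16% and 7%, respectively, over a state-of-the-art POMDP policy.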
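
The short-timescale procedure described in the abstract treats the strongest beam pair as a hidden state that is tracked through error-prone beam-training feedback. The sketch below illustrates only the generic belief-tracking step that any POMDP formulation of this kind relies on: a prediction through a Markov transition model followed by a Bayes update. It is not the authors' implementation. The binary detection feedback, the misdetection and false-alarm probabilities `p_md` and `p_fa`, the four-state transition matrix, and the function names `predict` and `update` are all assumptions made for illustration; in the paper the probabilistic beam-dynamics model would come from the DR-VAE rather than from a hand-written matrix.

```python
def predict(belief, transition):
    """Propagate a belief over beam pairs one step through a
    row-stochastic transition matrix (assumed Markov beam dynamics)."""
    n = len(belief)
    return [sum(belief[i] * transition[i][j] for i in range(n)) for j in range(n)]


def update(belief, scanned, detected, p_md=0.1, p_fa=0.05):
    """Bayes update after scanning one beam pair.

    Assumed feedback model (hypothetical, not taken from the paper):
    - if the scanned pair is the strongest one, it is detected
      with probability 1 - p_md (p_md = misdetection probability);
    - otherwise a detection is reported with probability p_fa (false alarm).
    """
    posterior = []
    for state, prior in enumerate(belief):
        if state == scanned:
            likelihood = (1.0 - p_md) if detected else p_md
        else:
            likelihood = p_fa if detected else (1.0 - p_fa)
        posterior.append(likelihood * prior)
    total = sum(posterior)
    return [p / total for p in posterior]


if __name__ == "__main__":
    # Hypothetical 4-state beam-pair dynamics: the strongest pair mostly
    # stays put and occasionally drifts to a neighbouring pair.
    transition = [
        [0.8, 0.2, 0.0, 0.0],
        [0.1, 0.8, 0.1, 0.0],
        [0.0, 0.1, 0.8, 0.1],
        [0.0, 0.0, 0.2, 0.8],
    ]
    belief = [0.25, 0.25, 0.25, 0.25]          # uniform prior
    belief = predict(belief, transition)       # mobility step
    belief = update(belief, scanned=2, detected=True)  # noisy feedback step
    print(belief)
```

A policy such as PBVI would choose which beam pair to scan next as a function of this belief; a naive approach that ignores feedback errors corresponds to setting `p_md` and `p_fa` to zero, which is the kind of simplification the abstract reports as harmful to learning accuracy.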
