Related papers: Learning and Adaptation in Millimeter-Wave: a Dual Timescale Variational Framework

URL: http://arxiv.org/abs/2107.05466v1
Date: Sun, 27 Jun 2021 19:04:18 GMT
Title: Learning and Adaptation in Millimeter-Wave: a Dual Timescale Variational Framework
Authors: Muddassar Hussain, Nicolo Michelusi
Abstract summary: Millimeter-wave vehicular networks incur enormous beam-training overhead to enable narrow-beam communications. This paper proposes a learning and adaptation framework in which the dynamics of the communication beams are learned and then exploited to design adaptive beam-training with low overhead.
Score: 4.162663632560141
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Millimeter-wave vehicular networks incur enormous beam-training overhead to enable narrow-beam communications. This paper proposes a learning and adaptation framework in which the dynamics of the communication beams are learned and then exploited to design adaptive beam-training with low overhead: on a long timescale, a deep recurrent variational autoencoder (DR-VAE) uses noisy beam-training observations to learn a probabilistic model of beam dynamics; on a short timescale, an adaptive beam-training procedure is formulated as a partially observable (PO-) Markov decision process (MDP) and optimized via point-based value iteration (PBVI) by leveraging beam-training feedback and a probabilistic prediction of the strongest beam pair provided by the DR-VAE. In turn, beam-training observations are used to refine the DR-VAE via stochastic gradient ascent in a continuous process of learning and adaptation. The proposed DR-VAE mobility learning framework learns accurate beam dynamics: it reduces the Kullback-Leibler divergence between the ground truth and the learned beam dynamics model by 86% over the Baum-Welch algorithm and by 92% over a naive mobility learning approach that neglects feedback errors. The proposed dual-timescale approach yields a negligible loss of spectral efficiency compared to a genie-aided scheme operating under error-free feedback and a foreknown mobility model. Finally, a low-complexity policy is proposed by reducing the POMDP to an error-robust MDP. It is shown that the PBVI- and error-robust MDP-based policies improve the spectral efficiency by 85% and 67%, respectively, over a policy that scans exhaustively over the dominant beam pairs, and by 16% and 7%, respectively, over a state-of-the-art POMDP policy.
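
Illustrative sketch (not the authors' implementation): the abstract describes tracking the strongest beam pair from noisy feedback and scoring a learned beam-dynamics model by its Kullback-Leibler divergence from the ground truth. The Python below shows one simple way such a belief could be maintained. It assumes a single beam is scanned per step, binary detection feedback, a hand-specified transition matrix standing in for the learned dynamics, and hypothetical false-alarm and misdetection probabilities p_fa and p_md. All function names are made up for illustration; the DR-VAE and PBVI components of the paper are not reproduced here.

```python
from math import log


def normalize(weights):
    """Scale non-negative weights so that they sum to one."""
    total = sum(weights)
    if total <= 0.0:
        raise ValueError("weights must have positive total mass")
    return [w / total for w in weights]


def correct(belief, scanned_beam, detected, p_fa=0.05, p_md=0.1):
    """Bayes correction of the belief after scanning one beam.

    belief[s] is the probability that beam s is currently the strongest.
    p_fa (false alarm) and p_md (misdetection) are assumed values that
    model erroneous feedback; they are not taken from the paper.
    """
    posterior = []
    for state, prior in enumerate(belief):
        if state == scanned_beam:
            likelihood = (1.0 - p_md) if detected else p_md
        else:
            likelihood = p_fa if detected else (1.0 - p_fa)
        posterior.append(likelihood * prior)
    return normalize(posterior)


def predict(belief, transition):
    """Propagate the belief one step through the beam-dynamics model.

    transition[i][j] is the assumed probability that the strongest beam
    moves from i to j between two consecutive steps.
    """
    n = len(belief)
    return [sum(belief[i] * transition[i][j] for i in range(n)) for j in range(n)]


def kl_divergence(p, q):
    """KL divergence D(p || q); assumes q[i] > 0 wherever p[i] > 0."""
    return sum(pi * log(pi / qi) for pi, qi in zip(p, q) if pi > 0.0)


if __name__ == "__main__":
    # Hypothetical three-beam example with a mostly "sticky" mobility model.
    transition = [
        [0.8, 0.2, 0.0],
        [0.1, 0.8, 0.1],
        [0.0, 0.2, 0.8],
    ]
    belief = [1.0 / 3.0] * 3
    belief = correct(belief, scanned_beam=1, detected=True)
    belief = predict(belief, transition)
    print("belief after one scan and one prediction step:", belief)
    print("KL between rows 0 and 1 of the model:",
          kl_divergence(transition[1], [0.2, 0.6, 0.2]))
```

In this toy setting, a positive detection on the scanned beam concentrates the belief on that beam, and the prediction step then spreads part of that mass to neighbouring beams according to the assumed mobility model.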

Related papers

Knowledge Distillation for mmWave Beam Prediction Using Sub-6 GHz Channels [18.712418156283437]
We propose a framework for sub-6 GHz channel-mmWave beam mapping based on the knowledge distillation (KD) technique. We show that the proposed student models achieve the teacher's beam prediction accuracy and spectral efficiency while reducing trainable parameters and computational complexity by 99%.
arXiv Detail & Related papers (2026-02-04T16:15:32Z)
RapidUn: Influence-Driven Parameter Reweighting for Efficient Large Language Model Unlearning [5.265976319881303]
We introduce RapidUn, an influence-driven and parameter-efficient unlearning framework. It first estimates per-sample influence through a fast estimation module, then maps these scores into adaptive update weights. On Mistral-7B and Llama-3-8B across Dolly-15k and Alpaca-57k, RapidUn achieves up to 100 times higher efficiency than full retraining.
arXiv Detail & Related papers (2025-12-04T05:00:52Z)
Token-Level Inference-Time Alignment for Vision-Language Models [58.41370989069588]
Vision-Language Models (VLMs) have become essential backbones of modern multimodal intelligence. We present TITA, a lightweight framework that freezes the base VLM and instead trains a reward model to approximate its distribution. During inference, implicit preference signals are extracted as log-probability ratios between the reward model and the target VLM, yielding dense autoregressive feedback.
arXiv Detail & Related papers (2025-10-20T09:58:03Z)
EDFFDNet: Towards Accurate and Efficient Unsupervised Multi-Grid Image Registration [17.190325630307097]
We propose an Exponential-Decay Free-Form Deformation Network (EDFFDNet), which employs free-form deformation with an exponential-decay basis function. By transforming dense interactions into sparse ones, ASMA reduces parameters and improves accuracy. Experiments demonstrate that EDFFDNet reduces parameters, memory, and total runtime by 70.5%, 32.6%, and 33.7%, respectively. EDFFDNet-2 further improves PSNR by 1.06 dB while maintaining lower computational costs.
arXiv Detail & Related papers (2025-09-09T12:30:51Z)
Digital Twin-Assisted Explainable AI for Robust Beam Prediction in mmWave MIMO Systems [18.49800990388549]
This paper proposes a robust and explainable deep learning (DL)-based beam alignment engine for mmWave systems. Experimental results show that the proposed framework reduces real-world data needs by 70% and beam training overhead by 62%, and improves outlier detection by up to 8.5x.
arXiv Detail & Related papers (2025-07-12T09:56:20Z)
Reinforce LLM Reasoning through Multi-Agent Reflection [8.088795955922656]
We introduce DPSDP, a reinforcement learning algorithm that trains an actor-critic LLM system to iteratively refine answers via direct preference learning on self-generated data. Theoretically, DPSDP can match the performance of any policy within the training distribution. For example, on benchmark MATH 500, majority voting over five refinement steps increases first-turn accuracy from 58.2% to 63.2% with Ministral-based models.
arXiv Detail & Related papers (2025-06-10T02:43:47Z)
Accelerating Learned Image Compression Through Modeling Neural Training Dynamics [11.729071258457138]
This paper takes a step forward in accelerating the training of LIC methods by modeling the neural training dynamics. We first propose a Sensitivity-aware True and Dummy Embedding Training mechanism (STDET) that clusters LIC model parameters into few separate modes. By further utilizing the stable intra-mode correlations throughout training and parameter sensitivities, we gradually embed non-reference parameters, reducing the number of trainable parameters.
arXiv Detail & Related papers (2025-05-23T17:03:13Z)
Joint Transmit and Pinching Beamforming for Pinching Antenna Systems (PASS): Optimization-Based or Learning-Based? [89.05848771674773]
A novel pinching antenna system (PASS)-enabled downlink multi-user multiple-input single-output (MISO) framework is proposed. It consists of multiple waveguides equipped with numerous low-cost antennas, named pinching antennas (PAs). The positions of the PAs can be reconfigured to span both large-scale path and space.
arXiv Detail & Related papers (2025-02-12T18:54:10Z)
Efficient Remote Photoplethysmography with Temporal Derivative Modules and Time-Shift Invariant Loss [6.381149074212898]
We present a lightweight neural model for remote heart rate estimation. We focus on the efficient-temporal learning of facial photoplethysmography. Compared to existing models, our approach shows competitive accuracy with a much lower number of parameters and lower computational cost.
arXiv Detail & Related papers (2022-03-21T11:08:06Z)
Unit-Modulus Wireless Federated Learning Via Penalty Alternating Minimization [64.76619508293966]
Wireless federated learning (FL) is an emerging machine learning paradigm that trains a global parametric model from distributed datasets via wireless communications. This paper proposes a wireless FL framework, which uploads local model parameters and computes global model parameters via wireless communications.
arXiv Detail & Related papers (2021-08-31T08:19:54Z)
Discriminator Augmented Model-Based Reinforcement Learning [47.094522301093775]
It is common in practice for the learned model to be inaccurate, impairing planning and leading to poor performance. This paper aims to improve planning with an importance sampling framework that accounts for discrepancy between the true and learned dynamics.
arXiv Detail & Related papers (2021-03-24T06:01:55Z)
Adaptive Gradient Method with Resilience and Momentum [120.83046824742455]
We propose an Adaptive Gradient Method with Resilience and Momentum (AdaRem). AdaRem adjusts the parameter-wise learning rate according to whether the direction in which a parameter changed in the past is aligned with the direction of the current gradient. Our method outperforms previous adaptive learning rate-based algorithms in terms of training speed and test error.
arXiv Detail & Related papers (2020-10-21T14:49:00Z)
Extrapolation for Large-batch Training in Deep Learning [72.61259487233214]
We show that a host of variations can be covered in a unified framework that we propose. We prove the convergence of this novel scheme and rigorously evaluate its empirical performance on ResNet, LSTM, and Transformer.
arXiv Detail & Related papers (2020-06-10T08:22:41Z)
Optimization-driven Deep Reinforcement Learning for Robust Beamforming in IRS-assisted Wireless Communications [54.610318402371185]
Intelligent reflecting surface (IRS) is a promising technology to assist downlink information transmissions from a multi-antenna access point (AP) to a receiver. We minimize the AP's transmit power by a joint optimization of the AP's active beamforming and the IRS's passive beamforming. We propose a deep reinforcement learning (DRL) approach that can adapt the beamforming strategies from past experiences.
arXiv Detail & Related papers (2020-05-25T01:42:55Z)
Mixed Reinforcement Learning with Additive Stochastic Uncertainty [19.229447330293546]
Reinforcement learning (RL) methods often rely on massive exploration data to search optimal policies, and suffer from poor sampling efficiency. This paper presents a mixed RL algorithm by simultaneously using dual representations of environmental dynamics to search the optimal policy. The effectiveness of the mixed RL is demonstrated by a typical optimal control problem of non-affine nonlinear systems.
arXiv Detail & Related papers (2020-02-28T08:02:34Z)
Millimeter Wave Communications with an Intelligent Reflector: Performance Optimization and Distributional Reinforcement Learning [119.97450366894718]
A novel framework is proposed to optimize the downlink multi-user communication of a millimeter wave base station. A channel estimation approach is developed to measure the channel state information (CSI) in real-time. A distributional reinforcement learning (DRL) approach is proposed to learn the optimal IR reflection and maximize the expectation of downlink capacity.
arXiv Detail & Related papers (2020-02-24T22:18:54Z)
Learnable Bernoulli Dropout for Bayesian Deep Learning [53.79615543862426]
Learnable Bernoulli dropout (LBD) is a new model-agnostic dropout scheme that considers the dropout rates as parameters jointly optimized with other model parameters. LBD leads to improved accuracy and uncertainty estimates in image classification and semantic segmentation.
arXiv Detail & Related papers (2020-02-12T18:57:14Z)

This list is automatically generated from the titles and abstracts of the papers on this site.