Abstract: In this letter, we investigate hybrid beamforming based on deep
reinforcement learning (DRL) for millimeter-wave (mmWave) multi-user (MU)
multiple-input single-output (MISO) systems. A multi-agent DRL method is
proposed to address the low exploration efficiency of DRL. In the proposed
method, a prioritized replay buffer and a more informative reward are applied to
accelerate convergence. Simulation results show that the proposed
architecture achieves higher spectral efficiency and lower time consumption than
the benchmarks, and is thus more suitable for practical applications.
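
For readers unfamiliar with the prioritized replay buffer mentioned above, the following is a minimal sketch of one common variant, proportional prioritization with importance-sampling weights. The abstract does not state which variant the letter uses, so the class name, the parameters alpha, beta and eps, and their default values are all assumptions made for illustration, not the authors' implementation.

```python
import random


class PrioritizedReplayBuffer:
    """Proportional prioritized replay (hypothetical sketch, not the letter's code)."""

    def __init__(self, capacity, alpha=0.6, eps=1e-6):
        self.capacity = capacity  # maximum number of stored transitions
        self.alpha = alpha        # 0 = uniform sampling, 1 = fully proportional
        self.eps = eps            # keeps every priority strictly positive
        self.data = []
        self.priorities = []
        self.pos = 0              # next slot to overwrite once the buffer is full

    def add(self, transition):
        # New transitions receive the current maximum priority so that each
        # one is likely to be replayed at least once.
        p = max(self.priorities, default=1.0)
        if len(self.data) < self.capacity:
            self.data.append(transition)
            self.priorities.append(p)
        else:
            self.data[self.pos] = transition
            self.priorities[self.pos] = p
        self.pos = (self.pos + 1) % self.capacity

    def sample(self, batch_size, beta=0.4):
        # Sampling probability: P(i) = p_i^alpha / sum_k p_k^alpha
        scaled = [p ** self.alpha for p in self.priorities]
        total = sum(scaled)
        probs = [s / total for s in scaled]
        idx = random.choices(range(len(self.data)), weights=probs, k=batch_size)
        # Importance-sampling weights compensate for non-uniform sampling;
        # they are normalized by the largest weight in the batch.
        n = len(self.data)
        raw = [(n * probs[i]) ** (-beta) for i in idx]
        w_max = max(raw)
        weights = [w / w_max for w in raw]
        return idx, [self.data[i] for i in idx], weights

    def update_priorities(self, idx, td_errors):
        # Priority is the magnitude of the temporal-difference error.
        for i, err in zip(idx, td_errors):
            self.priorities[i] = abs(err) + self.eps
```

In a training loop under these assumptions, each agent would call add after every environment step, draw a batch with sample, scale its per-sample loss by the returned weights, and then pass the new temporal-difference errors to update_priorities.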