Abstract: Deep learning-based techniques for approximating secure encoding functions have attracted considerable interest in wireless communications, owing to the impressive results obtained for general coding and decoding tasks in wireless communication systems. Of particular importance is the development of model-free techniques that work without knowledge of the underlying channel. Such techniques employ, for example, generative adversarial networks to estimate and model the conditional channel distribution, mutual information estimation as a reward function, or reinforcement learning. In this paper, the reinforcement learning approach is studied; in particular, the policy gradient method is investigated for model-free, neural network-based secure encoding. Previously developed techniques for enforcing a certain coset structure on the encoding process can be combined with recent reinforcement learning approaches. This new approach is evaluated through extensive simulations, which demonstrate that the decoding performance of an eavesdropper is capped at a certain error level.
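The model-free policy-gradient idea mentioned in the abstract can be illustrated with a minimal REINFORCE sketch. This is not the paper's actual system: the message/symbol sizes, the toy channel, and all function names below are illustrative assumptions. The point it demonstrates is the core of the approach: the encoder is treated as a stochastic policy over transmit symbols and is updated using only scalar reward feedback, so no gradient through (or model of) the channel is required.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy sizes (illustrative assumptions, not from the paper):
NUM_MSG, NUM_SYM = 2, 4
theta = np.zeros((NUM_MSG, NUM_SYM))  # encoder logits: one policy per message

def policy(m):
    """Softmax distribution over transmit symbols for message m."""
    z = theta[m] - theta[m].max()
    p = np.exp(z)
    return p / p.sum()

def channel_feedback(m, a):
    """Black-box scalar feedback. Under the model-free assumption the
    transmitter never sees this function; it only receives the reward.
    Here we simulate a noisy channel in which symbol 2*m is the 'good'
    codeword for message m: reward 1 if the noisy observation still
    rounds to it, else 0."""
    noisy = a + rng.normal(0.0, 0.3)
    return 1.0 if round(noisy) % NUM_SYM == (2 * m) % NUM_SYM else 0.0

LR = 0.5
rewards = []
for step in range(3000):
    m = int(rng.integers(NUM_MSG))     # message to encode
    p = policy(m)
    a = int(rng.choice(NUM_SYM, p=p))  # sample a transmit symbol from the policy
    r = channel_feedback(m, a)         # scalar reward only -- no channel gradient
    rewards.append(r)
    grad_logp = -p                     # REINFORCE: d log pi(a|m) / d theta[m]
    grad_logp[a] += 1.0
    theta[m] += LR * r * grad_logp

# Greedy encoding learned from reward feedback alone.
learned = [int(np.argmax(policy(m))) for m in range(NUM_MSG)]
```

In the paper's setting, the reward would additionally encode the secrecy objective (e.g. via the coset structure on the encoder), whereas this sketch rewards only reliable decoding by the legitimate receiver.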