The authors present a model-free, policy-based reinforcement learning
method that introduces perturbations on the pattern of a metasurface.
The objective is to learn a policy that changes the size of the
patches, and therefore the impedance at the sides of an artificially structured
material. The proposed iterative model assigns the highest reward
when the patch sizes allow transmission along a constrained path
and a penalty when the patch sizes make the surface wave radiate to
the sides of the metamaterial. After convergence, the proposed
model learns an optimal patch pattern that achieves lateral confinement
along the metasurface. Simulation results show that the proposed
learned pattern can effectively guide the electromagnetic wave
through a metasurface, maintaining its instantaneous eigenstate when
the homogeneity is perturbed. Moreover, the model learned to
prevent reflections by changing the patch sizes adiabatically. The
transmission coefficient S12 shows that most of the power is transferred
from the source to the destination with the proposed design.
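
The reward scheme summarized above can be illustrated with a minimal policy-gradient (REINFORCE) sketch. It is not the authors' implementation: the grid size, the set of discrete patch sizes, the channel position, and the function toy_reward are all hypothetical, and toy_reward is only a stand-in for the electromagnetic simulation that would score a real pattern. It rewards the channel cell for taking an assumed "guiding" patch size and penalizes side cells that take the same size, which mimics a wave leaking sideways.

```python
import math
import random

N_CELLS = 7      # hypothetical number of patches across the surface width
N_SIZES = 3      # hypothetical number of discrete patch sizes
CHANNEL = 3      # hypothetical index of the constrained path (centre cell)
GUIDE_SIZE = 2   # assumed patch size that supports the guided surface wave


def toy_reward(sizes):
    """Surrogate score standing in for an electromagnetic simulation (assumption)."""
    reward = 0.0
    for i, size in enumerate(sizes):
        if i == CHANNEL:
            # Reward transmission along the constrained path.
            reward += 1.0 if size == GUIDE_SIZE else -1.0
        elif size == GUIDE_SIZE:
            # Penalize side cells that would let the wave radiate sideways.
            reward -= 1.0
    return reward


def softmax(logits):
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def train(episodes=5000, lr=0.1, seed=0):
    """REINFORCE with a running-average baseline over per-cell softmax policies."""
    rng = random.Random(seed)
    logits = [[0.0] * N_SIZES for _ in range(N_CELLS)]
    baseline = 0.0
    for _ in range(episodes):
        probs = [softmax(row) for row in logits]
        # Sample one patch size per cell from the current policy.
        sizes = [rng.choices(range(N_SIZES), weights=p)[0] for p in probs]
        reward = toy_reward(sizes)
        advantage = reward - baseline
        baseline += 0.05 * (reward - baseline)
        for i, chosen in enumerate(sizes):
            for k in range(N_SIZES):
                # Gradient of log softmax: indicator minus probability.
                grad = (1.0 if k == chosen else 0.0) - probs[i][k]
                logits[i][k] += lr * advantage * grad
    # Return the greedy pattern: most likely patch size for each cell.
    return [max(range(N_SIZES), key=lambda k: row[k]) for row in logits]


if __name__ == "__main__":
    pattern = train()
    print("learned pattern:", pattern, "reward:", toy_reward(pattern))
```

In this toy setting the policy is model-free in the sense that it only sees the scalar reward, never the rule inside toy_reward. Replacing toy_reward with the output of a full-wave solver would be the natural next step, at a much higher cost per episode.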