Patent attributes
Hybrid use of dual policies is provided to improve a communication system. In a multiple access scenario, when an inactive user equipment (UE) transitions to an active state, it may become a burden to the radio cell on which it was previously camping. In some embodiments, hybrid load balancing is provided using a hierarchical machine learning paradigm based on reinforcement learning, in which a long short-term memory (LSTM) network generates a goal for one policy that influences cell reselection, so that another policy that influences handover of active UEs can be assisted. The communication system, as influenced by the policies, is modeled as a Markov decision process (MDP). The policies controlling the active UEs and the inactive UEs are coupled, and measurable system characteristics are improved. In some embodiments, policy actions depend at least in part on energy saving.
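The coupling between the two policies can be illustrated with a toy sketch. It is not the claimed implementation: an exponential moving average stands in for the LSTM, the policies are fixed rules rather than learned ones, and every name (GoalGenerator, reselection_policy, handover_policy), the decay value, and the load figures are assumptions made for illustration. Loads are taken to be fractions of cell capacity between 0 and 1.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class GoalGenerator:
    """Hypothetical stand-in for the LSTM: a moving average of active load per cell."""
    num_cells: int
    decay: float = 0.5  # assumed smoothing factor, not taken from the source

    def __post_init__(self) -> None:
        self.state = [0.0] * self.num_cells

    def step(self, active_load: List[float]) -> List[float]:
        # Update the recurrent state from the latest per-cell active load.
        self.state = [self.decay * s + (1.0 - self.decay) * x
                      for s, x in zip(self.state, active_load)]
        # Goal: target share of inactive UEs per cell, larger where headroom is larger.
        headroom = [max(1.0 - s, 0.0) for s in self.state]
        total = sum(headroom)
        if total == 0.0:
            return [1.0 / self.num_cells] * self.num_cells
        return [h / total for h in headroom]


def reselection_policy(goal: List[float], idle_share: List[float]) -> List[float]:
    """Policy for inactive UEs: per-cell reselection offsets.

    A positive offset makes a cell more attractive to camp on.
    """
    return [g - s for g, s in zip(goal, idle_share)]


def handover_policy(active_load: List[float]) -> Tuple[int, int]:
    """Policy for active UEs: suggest a handover from the most to the least loaded cell."""
    cells = range(len(active_load))
    src = max(cells, key=active_load.__getitem__)
    dst = min(cells, key=active_load.__getitem__)
    return src, dst


# One decision step for two cells (load values are made up).
generator = GoalGenerator(num_cells=2)
active_load = [0.8, 0.2]
goal = generator.step(active_load)
offsets = reselection_policy(goal, idle_share=[0.5, 0.5])
handover = handover_policy(active_load)
```

In this sketch the goal steers inactive UEs toward the less loaded cell before they become active, which reduces the work left for the handover policy. A learned system would replace each rule with a trained policy over the MDP.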